
Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 145.662567, "speed_pp": 14.059893, "t_tg": 0.000000, "speed_tg": nan, "t": 145.662567, "speed": 14.059893}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_0

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_0  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 5575 tid 5575 thread 0 bound to OS proc set {0}
OMP: pid 5575 tid 5642 thread 1 bound to OS proc set {32}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 72.881805, "speed_pp": 28.100292, "t_tg": 0.000000, "speed_tg": nan, "t": 72.881805, "speed": 28.100292}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_1

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_1  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 5713 tid 5713 thread 0 bound to OS proc set {0}
OMP: pid 5713 tid 5781 thread 2 bound to OS proc set {32}
OMP: pid 5713 tid 5780 thread 1 bound to OS proc set {16}
OMP: pid 5713 tid 5782 thread 3 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 36.590981, "speed_pp": 55.970078, "t_tg": 0.000000, "speed_tg": nan, "t": 36.590981, "speed": 55.970078}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_2

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_2  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 5802 tid 5802 thread 0 bound to OS proc set {0}
OMP: pid 5802 tid 5871 thread 3 bound to OS proc set {24}
OMP: pid 5802 tid 5870 thread 2 bound to OS proc set {16}
OMP: pid 5802 tid 5869 thread 1 bound to OS proc set {8}
OMP: pid 5802 tid 5872 thread 4 bound to OS proc set {32}
OMP: pid 5802 tid 5874 thread 6 bound to OS proc set {48}
OMP: pid 5802 tid 5873 thread 5 bound to OS proc set {40}
OMP: pid 5802 tid 5875 thread 7 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 18.337395, "speed_pp": 111.684349, "t_tg": 0.000000, "speed_tg": nan, "t": 18.337395, "speed": 111.684349}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_3

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_3  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 5944 tid 5944 thread 0 bound to OS proc set {0}
OMP: pid 5944 tid 6013 thread 3 bound to OS proc set {12}
OMP: pid 5944 tid 6012 thread 2 bound to OS proc set {8}
OMP: pid 5944 tid 6022 thread 12 bound to OS proc set {48}
OMP: pid 5944 tid 6011 thread 1 bound to OS proc set {4}
OMP: pid 5944 tid 6024 thread 14 bound to OS proc set {56}
OMP: pid 5944 tid 6021 thread 11 bound to OS proc set {44}
OMP: pid 5944 tid 6023 thread 13 bound to OS proc set {52}
OMP: pid 5944 tid 6018 thread 8 bound to OS proc set {32}
OMP: pid 5944 tid 6017 thread 7 bound to OS proc set {28}
OMP: pid 5944 tid 6020 thread 10 bound to OS proc set {40}
OMP: pid 5944 tid 6016 thread 6 bound to OS proc set {24}
OMP: pid 5944 tid 6015 thread 5 bound to OS proc set {20}
OMP: pid 5944 tid 6019 thread 9 bound to OS proc set {36}
OMP: pid 5944 tid 6014 thread 4 bound to OS proc set {16}
OMP: pid 5944 tid 6025 thread 15 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 9.262382, "speed_pp": 221.109436, "t_tg": 0.000000, "speed_tg": nan, "t": 9.262382, "speed": 221.109436}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_4

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_4  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 6045 tid 6045 thread 0 bound to OS proc set {0}
OMP: pid 6045 tid 6114 thread 3 bound to OS proc set {8}
OMP: pid 6045 tid 6118 thread 7 bound to OS proc set {18}
OMP: pid 6045 tid 6112 thread 1 bound to OS proc set {2}
OMP: pid 6045 tid 6123 thread 12 bound to OS proc set {32}
OMP: pid 6045 tid 6115 thread 4 bound to OS proc set {10}
OMP: pid 6045 tid 6126 thread 15 bound to OS proc set {40}
OMP: pid 6045 tid 6127 thread 16 bound to OS proc set {43}
OMP: pid 6045 tid 6124 thread 13 bound to OS proc set {35}
OMP: pid 6045 tid 6117 thread 6 bound to OS proc set {16}
OMP: pid 6045 tid 6130 thread 19 bound to OS proc set {51}
OMP: pid 6045 tid 6122 thread 11 bound to OS proc set {29}
OMP: pid 6045 tid 6125 thread 14 bound to OS proc set {37}
OMP: pid 6045 tid 6129 thread 18 bound to OS proc set {48}
OMP: pid 6045 tid 6131 thread 20 bound to OS proc set {54}
OMP: pid 6045 tid 6120 thread 9 bound to OS proc set {24}
OMP: pid 6045 tid 6119 thread 8 bound to OS proc set {21}
OMP: pid 6045 tid 6128 thread 17 bound to OS proc set {46}
OMP: pid 6045 tid 6113 thread 2 bound to OS proc set {5}
OMP: pid 6045 tid 6116 thread 5 bound to OS proc set {13}
OMP: pid 6045 tid 6133 thread 22 bound to OS proc set {59}
OMP: pid 6045 tid 6121 thread 10 bound to OS proc set {27}
OMP: pid 6045 tid 6132 thread 21 bound to OS proc set {56}
OMP: pid 6045 tid 6134 thread 23 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 6.959575, "speed_pp": 294.270844, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 6.959576, "speed": 294.270782}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_5

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_5  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 6156 tid 6156 thread 0 bound to OS proc set {0}
OMP: pid 6156 tid 6233 thread 11 bound to OS proc set {22}
OMP: pid 6156 tid 6223 thread 1 bound to OS proc set {2}
OMP: pid 6156 tid 6229 thread 7 bound to OS proc set {14}
OMP: pid 6156 tid 6224 thread 2 bound to OS proc set {4}
OMP: pid 6156 tid 6230 thread 8 bound to OS proc set {16}
OMP: pid 6156 tid 6226 thread 4 bound to OS proc set {8}
OMP: pid 6156 tid 6231 thread 9 bound to OS proc set {18}
OMP: pid 6156 tid 6227 thread 5 bound to OS proc set {10}
OMP: pid 6156 tid 6236 thread 14 bound to OS proc set {28}
OMP: pid 6156 tid 6234 thread 12 bound to OS proc set {24}
OMP: pid 6156 tid 6225 thread 3 bound to OS proc set {6}
OMP: pid 6156 tid 6237 thread 15 bound to OS proc set {30}
OMP: pid 6156 tid 6228 thread 6 bound to OS proc set {12}
OMP: pid 6156 tid 6232 thread 10 bound to OS proc set {20}
OMP: pid 6156 tid 6250 thread 28 bound to OS proc set {56}
OMP: pid 6156 tid 6241 thread 19 bound to OS proc set {38}
OMP: pid 6156 tid 6246 thread 24 bound to OS proc set {48}
OMP: pid 6156 tid 6251 thread 29 bound to OS proc set {58}
OMP: pid 6156 tid 6235 thread 13 bound to OS proc set {26}
OMP: pid 6156 tid 6240 thread 18 bound to OS proc set {36}
OMP: pid 6156 tid 6249 thread 27 bound to OS proc set {54}
OMP: pid 6156 tid 6239 thread 17 bound to OS proc set {34}
OMP: pid 6156 tid 6252 thread 30 bound to OS proc set {60}
OMP: pid 6156 tid 6248 thread 26 bound to OS proc set {52}
OMP: pid 6156 tid 6245 thread 23 bound to OS proc set {46}
OMP: pid 6156 tid 6253 thread 31 bound to OS proc set {62}
OMP: pid 6156 tid 6242 thread 20 bound to OS proc set {40}
OMP: pid 6156 tid 6247 thread 25 bound to OS proc set {50}
OMP: pid 6156 tid 6238 thread 16 bound to OS proc set {32}
OMP: pid 6156 tid 6243 thread 21 bound to OS proc set {42}
OMP: pid 6156 tid 6244 thread 22 bound to OS proc set {44}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 5.558049, "speed_pp": 368.474609, "t_tg": 0.000000, "speed_tg": nan, "t": 5.558049, "speed": 368.474609}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_6

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_6  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 6364 tid 6364 thread 0 bound to OS proc set {0}
OMP: pid 6364 tid 6433 thread 3 bound to OS proc set {4}
OMP: pid 6364 tid 6431 thread 1 bound to OS proc set {1}
OMP: pid 6364 tid 6438 thread 8 bound to OS proc set {13}
OMP: pid 6364 tid 6440 thread 10 bound to OS proc set {16}
OMP: pid 6364 tid 6437 thread 7 bound to OS proc set {11}
OMP: pid 6364 tid 6441 thread 11 bound to OS proc set {17}
OMP: pid 6364 tid 6444 thread 14 bound to OS proc set {22}
OMP: pid 6364 tid 6442 thread 12 bound to OS proc set {19}
OMP: pid 6364 tid 6465 thread 35 bound to OS proc set {56}
OMP: pid 6364 tid 6449 thread 19 bound to OS proc set {30}
OMP: pid 6364 tid 6436 thread 6 bound to OS proc set {9}
OMP: pid 6364 tid 6434 thread 4 bound to OS proc set {6}
OMP: pid 6364 tid 6432 thread 2 bound to OS proc set {3}
OMP: pid 6364 tid 6462 thread 32 bound to OS proc set {52}
OMP: pid 6364 tid 6461 thread 31 bound to OS proc set {50}
OMP: pid 6364 tid 6439 thread 9 bound to OS proc set {14}
OMP: pid 6364 tid 6464 thread 34 bound to OS proc set {55}
OMP: pid 6364 tid 6460 thread 30 bound to OS proc set {48}
OMP: pid 6364 tid 6466 thread 36 bound to OS proc set {58}
OMP: pid 6364 tid 6445 thread 15 bound to OS proc set {24}
OMP: pid 6364 tid 6469 thread 39 bound to OS proc set {63}
OMP: pid 6364 tid 6458 thread 28 bound to OS proc set {45}
OMP: pid 6364 tid 6454 thread 24 bound to OS proc set {39}
OMP: pid 6364 tid 6450 thread 20 bound to OS proc set {32}
OMP: pid 6364 tid 6443 thread 13 bound to OS proc set {21}
OMP: pid 6364 tid 6448 thread 18 bound to OS proc set {29}
OMP: pid 6364 tid 6468 thread 38 bound to OS proc set {61}
OMP: pid 6364 tid 6452 thread 22 bound to OS proc set {35}
OMP: pid 6364 tid 6453 thread 23 bound to OS proc set {37}
OMP: pid 6364 tid 6463 thread 33 bound to OS proc set {53}
OMP: pid 6364 tid 6459 thread 29 bound to OS proc set {47}
OMP: pid 6364 tid 6467 thread 37 bound to OS proc set {60}
OMP: pid 6364 tid 6435 thread 5 bound to OS proc set {8}
OMP: pid 6364 tid 6447 thread 17 bound to OS proc set {27}
OMP: pid 6364 tid 6456 thread 26 bound to OS proc set {42}
OMP: pid 6364 tid 6455 thread 25 bound to OS proc set {40}
OMP: pid 6364 tid 6457 thread 27 bound to OS proc set {43}
OMP: pid 6364 tid 6446 thread 16 bound to OS proc set {26}
OMP: pid 6364 tid 6451 thread 21 bound to OS proc set {34}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 4.697811, "speed_pp": 435.947723, "t_tg": 0.000000, "speed_tg": nan, "t": 4.697811, "speed": 435.947723}
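
The eight result records above can be reduced to a strong-scaling summary. The following is a minimal sketch, not part of the MAQAO or llama.cpp tooling: the names `RECORDS`, `parse_record` and `scaling_table` are hypothetical, and the records are copied from the log above but trimmed to the fields used here. It assumes `t_pp` is the elapsed prompt-processing time and takes the 1-thread run as the baseline. The benchmark prints a bare `nan`, which Python's `json.loads` rejects, so the sketch rewrites it to `NaN` before parsing.

```python
import json
import re

# Result records copied from the log above, trimmed to the fields used here.
RECORDS = [
    '{"n_threads": 1, "t_pp": 145.662567, "speed_pp": 14.059893, "speed_tg": nan}',
    '{"n_threads": 2, "t_pp": 72.881805, "speed_pp": 28.100292, "speed_tg": nan}',
    '{"n_threads": 4, "t_pp": 36.590981, "speed_pp": 55.970078, "speed_tg": nan}',
    '{"n_threads": 8, "t_pp": 18.337395, "speed_pp": 111.684349, "speed_tg": nan}',
    '{"n_threads": 16, "t_pp": 9.262382, "speed_pp": 221.109436, "speed_tg": nan}',
    '{"n_threads": 24, "t_pp": 6.959575, "speed_pp": 294.270844, "speed_tg": 0.000000}',
    '{"n_threads": 32, "t_pp": 5.558049, "speed_pp": 368.474609, "speed_tg": nan}',
    '{"n_threads": 40, "t_pp": 4.697811, "speed_pp": 435.947723, "speed_tg": nan}',
]


def parse_record(line):
    """Parse one result line, tolerating the bare `nan` token."""
    # json.loads accepts NaN but not lowercase nan, so normalise it first.
    fixed = re.sub(r"\bnan\b", "NaN", line)
    return json.loads(fixed)


def scaling_table(lines):
    """Return (threads, t_pp, speedup, efficiency) rows, sorted by thread count."""
    recs = sorted((parse_record(l) for l in lines), key=lambda r: r["n_threads"])
    base = recs[0]  # run with the fewest threads is the baseline
    rows = []
    for r in recs:
        speedup = base["t_pp"] / r["t_pp"]
        # Efficiency: measured speedup divided by the ideal (linear) speedup.
        efficiency = speedup / (r["n_threads"] / base["n_threads"])
        rows.append((r["n_threads"], r["t_pp"], speedup, efficiency))
    return rows


for n, t, s, e in scaling_table(RECORDS):
    print(f"{n:>3} threads  t_pp={t:10.3f}  speedup={s:6.2f}  efficiency={e:6.1%}")
```

Efficiency near 100% means the run scales almost linearly with thread count; a falling value at higher thread counts marks where added threads stop paying off.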





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_7

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_7  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 6493 tid 6493 thread 0 bound to OS proc set {0}
OMP: pid 6493 tid 6567 thread 3 bound to OS proc set {4}
OMP: pid 6493 tid 6568 thread 4 bound to OS proc set {5}
OMP: pid 6493 tid 6576 thread 12 bound to OS proc set {16}
OMP: pid 6493 tid 6569 thread 5 bound to OS proc set {6}
OMP: pid 6493 tid 6575 thread 11 bound to OS proc set {14}
OMP: pid 6493 tid 6571 thread 7 bound to OS proc set {9}
OMP: pid 6493 tid 6577 thread 13 bound to OS proc set {17}
OMP: pid 6493 tid 6574 thread 10 bound to OS proc set {13}
OMP: pid 6493 tid 6566 thread 2 bound to OS proc set {2}
OMP: pid 6493 tid 6579 thread 15 bound to OS proc set {20}
OMP: pid 6493 tid 6595 thread 31 bound to OS proc set {41}
OMP: pid 6493 tid 6593 thread 29 bound to OS proc set {39}
OMP: pid 6493 tid 6591 thread 27 bound to OS proc set {36}
OMP: pid 6493 tid 6578 thread 14 bound to OS proc set {18}
OMP: pid 6493 tid 6580 thread 16 bound to OS proc set {21}
OMP: pid 6493 tid 6570 thread 6 bound to OS proc set {8}
OMP: pid 6493 tid 6599 thread 35 bound to OS proc set {47}
OMP: pid 6493 tid 6597 thread 33 bound to OS proc set {44}
OMP: pid 6493 tid 6565 thread 1 bound to OS proc set {1}
OMP: pid 6493 tid 6592 thread 28 bound to OS proc set {37}
OMP: pid 6493 tid 6608 thread 44 bound to OS proc set {59}
OMP: pid 6493 tid 6581 thread 17 bound to OS proc set {23}
OMP: pid 6493 tid 6611 thread 47 bound to OS proc set {63}
OMP: pid 6493 tid 6573 thread 9 bound to OS proc set {12}
OMP: pid 6493 tid 6572 thread 8 bound to OS proc set {10}
OMP: pid 6493 tid 6582 thread 18 bound to OS proc set {24}
OMP: pid 6493 tid 6596 thread 32 bound to OS proc set {43}
OMP: pid 6493 tid 6590 thread 26 bound to OS proc set {35}
OMP: pid 6493 tid 6588 thread 24 bound to OS proc set {32}
OMP: pid 6493 tid 6594 thread 30 bound to OS proc set {40}
OMP: pid 6493 tid 6602 thread 38 bound to OS proc set {51}
OMP: pid 6493 tid 6598 thread 34 bound to OS proc set {46}
OMP: pid 6493 tid 6600 thread 36 bound to OS proc set {48}
OMP: pid 6493 tid 6610 thread 46 bound to OS proc set {62}
OMP: pid 6493 tid 6583 thread 19 bound to OS proc set {25}
OMP: pid 6493 tid 6585 thread 21 bound to OS proc set {28}
OMP: pid 6493 tid 6584 thread 20 bound to OS proc set {27}
OMP: pid 6493 tid 6604 thread 40 bound to OS proc set {54}
OMP: pid 6493 tid 6607 thread 43 bound to OS proc set {58}
OMP: pid 6493 tid 6586 thread 22 bound to OS proc set {29}
OMP: pid 6493 tid 6589 thread 25 bound to OS proc set {33}
OMP: pid 6493 tid 6609 thread 45 bound to OS proc set {60}
OMP: pid 6493 tid 6605 thread 41 bound to OS proc set {55}
OMP: pid 6493 tid 6603 thread 39 bound to OS proc set {52}
OMP: pid 6493 tid 6606 thread 42 bound to OS proc set {56}
OMP: pid 6493 tid 6601 thread 37 bound to OS proc set {50}
OMP: pid 6493 tid 6587 thread 23 bound to OS proc set {31}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 4.082723, "speed_pp": 501.625977, "t_tg": 0.000000, "speed_tg": nan, "t": 4.082723, "speed": 501.625977}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_8

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_8  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 6637 tid 6637 thread 0 bound to OS proc set {0}
OMP: pid 6637 tid 6706 thread 3 bound to OS proc set {3}
OMP: pid 6637 tid 6715 thread 12 bound to OS proc set {13}
OMP: pid 6637 tid 6705 thread 2 bound to OS proc set {2}
OMP: pid 6637 tid 6714 thread 11 bound to OS proc set {12}
OMP: pid 6637 tid 6704 thread 1 bound to OS proc set {1}
OMP: pid 6637 tid 6707 thread 4 bound to OS proc set {4}
OMP: pid 6637 tid 6709 thread 6 bound to OS proc set {6}
OMP: pid 6637 tid 6708 thread 5 bound to OS proc set {5}
OMP: pid 6637 tid 6735 thread 32 bound to OS proc set {37}
OMP: pid 6637 tid 6717 thread 14 bound to OS proc set {16}
OMP: pid 6637 tid 6750 thread 47 bound to OS proc set {54}
OMP: pid 6637 tid 6716 thread 13 bound to OS proc set {15}
OMP: pid 6637 tid 6758 thread 55 bound to OS proc set {63}
OMP: pid 6637 tid 6747 thread 44 bound to OS proc set {51}
OMP: pid 6637 tid 6754 thread 51 bound to OS proc set {59}
OMP: pid 6637 tid 6751 thread 48 bound to OS proc set {55}
OMP: pid 6637 tid 6722 thread 19 bound to OS proc set {22}
OMP: pid 6637 tid 6749 thread 46 bound to OS proc set {53}
OMP: pid 6637 tid 6753 thread 50 bound to OS proc set {58}
OMP: pid 6637 tid 6743 thread 40 bound to OS proc set {46}
OMP: pid 6637 tid 6755 thread 52 bound to OS proc set {60}
OMP: pid 6637 tid 6721 thread 18 bound to OS proc set {20}
OMP: pid 6637 tid 6752 thread 49 bound to OS proc set {56}
OMP: pid 6637 tid 6710 thread 7 bound to OS proc set {8}
OMP: pid 6637 tid 6711 thread 8 bound to OS proc set {9}
OMP: pid 6637 tid 6742 thread 39 bound to OS proc set {45}
OMP: pid 6637 tid 6731 thread 28 bound to OS proc set {32}
OMP: pid 6637 tid 6746 thread 43 bound to OS proc set {49}
OMP: pid 6637 tid 6720 thread 17 bound to OS proc set {19}
OMP: pid 6637 tid 6719 thread 16 bound to OS proc set {18}
OMP: pid 6637 tid 6718 thread 15 bound to OS proc set {17}
OMP: pid 6637 tid 6744 thread 41 bound to OS proc set {47}
OMP: pid 6637 tid 6748 thread 45 bound to OS proc set {52}
OMP: pid 6637 tid 6734 thread 31 bound to OS proc set {35}
OMP: pid 6637 tid 6733 thread 30 bound to OS proc set {34}
OMP: pid 6637 tid 6713 thread 10 bound to OS proc set {11}
OMP: pid 6637 tid 6745 thread 42 bound to OS proc set {48}
OMP: pid 6637 tid 6736 thread 33 bound to OS proc set {38}
OMP: pid 6637 tid 6712 thread 9 bound to OS proc set {10}
OMP: pid 6637 tid 6739 thread 36 bound to OS proc set {41}
OMP: pid 6637 tid 6730 thread 27 bound to OS proc set {31}
OMP: pid 6637 tid 6727 thread 24 bound to OS proc set {27}
OMP: pid 6637 tid 6729 thread 26 bound to OS proc set {30}
OMP: pid 6637 tid 6740 thread 37 bound to OS proc set {42}
OMP: pid 6637 tid 6737 thread 34 bound to OS proc set {39}
OMP: pid 6637 tid 6738 thread 35 bound to OS proc set {40}
OMP: pid 6637 tid 6757 thread 54 bound to OS proc set {62}
OMP: pid 6637 tid 6756 thread 53 bound to OS proc set {61}
OMP: pid 6637 tid 6723 thread 20 bound to OS proc set {23}
OMP: pid 6637 tid 6741 thread 38 bound to OS proc set {44}
OMP: pid 6637 tid 6726 thread 23 bound to OS proc set {26}
OMP: pid 6637 tid 6732 thread 29 bound to OS proc set {33}
OMP: pid 6637 tid 6725 thread 22 bound to OS proc set {25}
OMP: pid 6637 tid 6724 thread 21 bound to OS proc set {24}
OMP: pid 6637 tid 6728 thread 25 bound to OS proc set {29}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 3.597667, "speed_pp": 569.257812, "t_tg": 0.000000, "speed_tg": nan, "t": 3.597667, "speed": 569.257812}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_9

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_9  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 10864 tid 10864 thread 0 bound to OS proc set {0}
OMP: pid 10864 tid 11466 thread 15 bound to OS proc set {15}
OMP: pid 10864 tid 11454 thread 3 bound to OS proc set {3}
OMP: pid 10864 tid 11463 thread 12 bound to OS proc set {12}
OMP: pid 10864 tid 11465 thread 14 bound to OS proc set {14}
OMP: pid 10864 tid 11462 thread 11 bound to OS proc set {11}
OMP: pid 10864 tid 11464 thread 13 bound to OS proc set {13}
OMP: pid 10864 tid 11459 thread 8 bound to OS proc set {8}
OMP: pid 10864 tid 11461 thread 10 bound to OS proc set {10}
OMP: pid 10864 tid 11458 thread 7 bound to OS proc set {7}
OMP: pid 10864 tid 11467 thread 16 bound to OS proc set {16}
OMP: pid 10864 tid 11455 thread 4 bound to OS proc set {4}
OMP: pid 10864 tid 11460 thread 9 bound to OS proc set {9}
OMP: pid 10864 tid 11457 thread 6 bound to OS proc set {6}
OMP: pid 10864 tid 11456 thread 5 bound to OS proc set {5}
OMP: pid 10864 tid 11452 thread 1 bound to OS proc set {1}
OMP: pid 10864 tid 11453 thread 2 bound to OS proc set {2}
OMP: pid 10864 tid 11502 thread 51 bound to OS proc set {51}
OMP: pid 10864 tid 11495 thread 44 bound to OS proc set {44}
OMP: pid 10864 tid 11501 thread 50 bound to OS proc set {50}
OMP: pid 10864 tid 11470 thread 19 bound to OS proc set {19}
OMP: pid 10864 tid 11483 thread 32 bound to OS proc set {32}
OMP: pid 10864 tid 11514 thread 63 bound to OS proc set {63}
OMP: pid 10864 tid 11511 thread 60 bound to OS proc set {60}
OMP: pid 10864 tid 11512 thread 61 bound to OS proc set {61}
OMP: pid 10864 tid 11498 thread 47 bound to OS proc set {47}
OMP: pid 10864 tid 11509 thread 58 bound to OS proc set {58}
OMP: pid 10864 tid 11513 thread 62 bound to OS proc set {62}
OMP: pid 10864 tid 11479 thread 28 bound to OS proc set {28}
OMP: pid 10864 tid 11475 thread 24 bound to OS proc set {24}
OMP: pid 10864 tid 11500 thread 49 bound to OS proc set {49}
OMP: pid 10864 tid 11510 thread 59 bound to OS proc set {59}
OMP: pid 10864 tid 11482 thread 31 bound to OS proc set {31}
OMP: pid 10864 tid 11469 thread 18 bound to OS proc set {18}
OMP: pid 10864 tid 11506 thread 55 bound to OS proc set {55}
OMP: pid 10864 tid 11478 thread 27 bound to OS proc set {27}
OMP: pid 10864 tid 11505 thread 54 bound to OS proc set {54}
OMP: pid 10864 tid 11486 thread 35 bound to OS proc set {35}
OMP: pid 10864 tid 11508 thread 57 bound to OS proc set {57}
OMP: pid 10864 tid 11474 thread 23 bound to OS proc set {23}
OMP: pid 10864 tid 11481 thread 30 bound to OS proc set {30}
OMP: pid 10864 tid 11499 thread 48 bound to OS proc set {48}
OMP: pid 10864 tid 11485 thread 34 bound to OS proc set {34}
OMP: pid 10864 tid 11484 thread 33 bound to OS proc set {33}
OMP: pid 10864 tid 11497 thread 46 bound to OS proc set {46}
OMP: pid 10864 tid 11473 thread 22 bound to OS proc set {22}
OMP: pid 10864 tid 11480 thread 29 bound to OS proc set {29}
OMP: pid 10864 tid 11472 thread 21 bound to OS proc set {21}
OMP: pid 10864 tid 11491 thread 40 bound to OS proc set {40}
OMP: pid 10864 tid 11476 thread 25 bound to OS proc set {25}
OMP: pid 10864 tid 11487 thread 36 bound to OS proc set {36}
OMP: pid 10864 tid 11489 thread 38 bound to OS proc set {38}
OMP: pid 10864 tid 11477 thread 26 bound to OS proc set {26}
OMP: pid 10864 tid 11488 thread 37 bound to OS proc set {37}
OMP: pid 10864 tid 11494 thread 43 bound to OS proc set {43}
OMP: pid 10864 tid 11493 thread 42 bound to OS proc set {42}
OMP: pid 10864 tid 11496 thread 45 bound to OS proc set {45}
OMP: pid 10864 tid 11490 thread 39 bound to OS proc set {39}
OMP: pid 10864 tid 11507 thread 56 bound to OS proc set {56}
OMP: pid 10864 tid 11492 thread 41 bound to OS proc set {41}
OMP: pid 10864 tid 11468 thread 17 bound to OS proc set {17}
OMP: pid 10864 tid 11471 thread 20 bound to OS proc set {20}
OMP: pid 10864 tid 11504 thread 53 bound to OS proc set {53}
OMP: pid 10864 tid 11503 thread 52 bound to OS proc set {52}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 3.191428, "speed_pp": 641.718994, "t_tg": 0.000000, "speed_tg": nan, "t": 3.191428, "speed": 641.718994}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_10

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-398-0068/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-08-25/tools/lprof_npsu_run_10  #
########################################################################################################################################################################################################################################
