Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 36.531586, "speed_pp": 14.015269, "t_tg": 0.000000, "speed_tg": nan, "t": 36.531586, "speed": 14.015269}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_0

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_0  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1161909 tid 1161909 thread 0 bound to OS proc set {0}
OMP: pid 1161909 tid 1161976 thread 1 bound to OS proc set {32}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 18.295237, "speed_pp": 27.985426, "t_tg": 0.000000, "speed_tg": nan, "t": 18.295237, "speed": 27.985426}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_1

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_1  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1161996 tid 1161996 thread 0 bound to OS proc set {0}
OMP: pid 1161996 tid 1162064 thread 2 bound to OS proc set {32}
OMP: pid 1161996 tid 1162063 thread 1 bound to OS proc set {16}
OMP: pid 1161996 tid 1162065 thread 3 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 9.168004, "speed_pp": 55.846397, "t_tg": 0.000000, "speed_tg": nan, "t": 9.168004, "speed": 55.846397}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_2

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_2  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1162133 tid 1162133 thread 0 bound to OS proc set {0}
OMP: pid 1162133 tid 1162202 thread 3 bound to OS proc set {24}
OMP: pid 1162133 tid 1162201 thread 2 bound to OS proc set {16}
OMP: pid 1162133 tid 1162203 thread 4 bound to OS proc set {32}
OMP: pid 1162133 tid 1162200 thread 1 bound to OS proc set {8}
OMP: pid 1162133 tid 1162205 thread 6 bound to OS proc set {48}
OMP: pid 1162133 tid 1162204 thread 5 bound to OS proc set {40}
OMP: pid 1162133 tid 1162206 thread 7 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 4.597633, "speed_pp": 111.361656, "t_tg": 0.000000, "speed_tg": nan, "t": 4.597633, "speed": 111.361656}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_3

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_3  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1162227 tid 1162227 thread 0 bound to OS proc set {0}
OMP: pid 1162227 tid 1162296 thread 3 bound to OS proc set {12}
OMP: pid 1162227 tid 1162295 thread 2 bound to OS proc set {8}
OMP: pid 1162227 tid 1162305 thread 12 bound to OS proc set {48}
OMP: pid 1162227 tid 1162304 thread 11 bound to OS proc set {44}
OMP: pid 1162227 tid 1162294 thread 1 bound to OS proc set {4}
OMP: pid 1162227 tid 1162306 thread 13 bound to OS proc set {52}
OMP: pid 1162227 tid 1162307 thread 14 bound to OS proc set {56}
OMP: pid 1162227 tid 1162301 thread 8 bound to OS proc set {32}
OMP: pid 1162227 tid 1162300 thread 7 bound to OS proc set {28}
OMP: pid 1162227 tid 1162299 thread 6 bound to OS proc set {24}
OMP: pid 1162227 tid 1162297 thread 4 bound to OS proc set {16}
OMP: pid 1162227 tid 1162303 thread 10 bound to OS proc set {40}
OMP: pid 1162227 tid 1162302 thread 9 bound to OS proc set {36}
OMP: pid 1162227 tid 1162298 thread 5 bound to OS proc set {20}
OMP: pid 1162227 tid 1162308 thread 15 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 2.323118, "speed_pp": 220.393463, "t_tg": 0.000000, "speed_tg": nan, "t": 2.323118, "speed": 220.393463}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_4

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_4  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1162328 tid 1162328 thread 0 bound to OS proc set {0}
OMP: pid 1162328 tid 1162396 thread 1 bound to OS proc set {2}
OMP: pid 1162328 tid 1162398 thread 3 bound to OS proc set {8}
OMP: pid 1162328 tid 1162410 thread 15 bound to OS proc set {40}
OMP: pid 1162328 tid 1162401 thread 6 bound to OS proc set {16}
OMP: pid 1162328 tid 1162404 thread 9 bound to OS proc set {24}
OMP: pid 1162328 tid 1162407 thread 12 bound to OS proc set {32}
OMP: pid 1162328 tid 1162406 thread 11 bound to OS proc set {29}
OMP: pid 1162328 tid 1162399 thread 4 bound to OS proc set {10}
OMP: pid 1162328 tid 1162409 thread 14 bound to OS proc set {37}
OMP: pid 1162328 tid 1162397 thread 2 bound to OS proc set {5}
OMP: pid 1162328 tid 1162414 thread 19 bound to OS proc set {51}
OMP: pid 1162328 tid 1162400 thread 5 bound to OS proc set {13}
OMP: pid 1162328 tid 1162402 thread 7 bound to OS proc set {18}
OMP: pid 1162328 tid 1162403 thread 8 bound to OS proc set {21}
OMP: pid 1162328 tid 1162408 thread 13 bound to OS proc set {35}
OMP: pid 1162328 tid 1162411 thread 16 bound to OS proc set {43}
OMP: pid 1162328 tid 1162418 thread 23 bound to OS proc set {62}
OMP: pid 1162328 tid 1162413 thread 18 bound to OS proc set {48}
OMP: pid 1162328 tid 1162412 thread 17 bound to OS proc set {46}
OMP: pid 1162328 tid 1162415 thread 20 bound to OS proc set {54}
OMP: pid 1162328 tid 1162405 thread 10 bound to OS proc set {27}
OMP: pid 1162328 tid 1162417 thread 22 bound to OS proc set {59}
OMP: pid 1162328 tid 1162416 thread 21 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 1.747881, "speed_pp": 292.926117, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 1.747882, "speed": 292.925964}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_5

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_5  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1162438 tid 1162438 thread 0 bound to OS proc set {0}
OMP: pid 1162438 tid 1162511 thread 7 bound to OS proc set {14}
OMP: pid 1162438 tid 1162505 thread 1 bound to OS proc set {2}
OMP: pid 1162438 tid 1162508 thread 4 bound to OS proc set {8}
OMP: pid 1162438 tid 1162506 thread 2 bound to OS proc set {4}
OMP: pid 1162438 tid 1162515 thread 11 bound to OS proc set {22}
OMP: pid 1162438 tid 1162507 thread 3 bound to OS proc set {6}
OMP: pid 1162438 tid 1162518 thread 14 bound to OS proc set {28}
OMP: pid 1162438 tid 1162513 thread 9 bound to OS proc set {18}
OMP: pid 1162438 tid 1162510 thread 6 bound to OS proc set {12}
OMP: pid 1162438 tid 1162519 thread 15 bound to OS proc set {30}
OMP: pid 1162438 tid 1162512 thread 8 bound to OS proc set {16}
OMP: pid 1162438 tid 1162516 thread 12 bound to OS proc set {24}
OMP: pid 1162438 tid 1162532 thread 28 bound to OS proc set {56}
OMP: pid 1162438 tid 1162528 thread 24 bound to OS proc set {48}
OMP: pid 1162438 tid 1162522 thread 18 bound to OS proc set {36}
OMP: pid 1162438 tid 1162523 thread 19 bound to OS proc set {38}
OMP: pid 1162438 tid 1162531 thread 27 bound to OS proc set {54}
OMP: pid 1162438 tid 1162534 thread 30 bound to OS proc set {60}
OMP: pid 1162438 tid 1162514 thread 10 bound to OS proc set {20}
OMP: pid 1162438 tid 1162517 thread 13 bound to OS proc set {26}
OMP: pid 1162438 tid 1162520 thread 16 bound to OS proc set {32}
OMP: pid 1162438 tid 1162535 thread 31 bound to OS proc set {62}
OMP: pid 1162438 tid 1162533 thread 29 bound to OS proc set {58}
OMP: pid 1162438 tid 1162521 thread 17 bound to OS proc set {34}
OMP: pid 1162438 tid 1162527 thread 23 bound to OS proc set {46}
OMP: pid 1162438 tid 1162530 thread 26 bound to OS proc set {52}
OMP: pid 1162438 tid 1162524 thread 20 bound to OS proc set {40}
OMP: pid 1162438 tid 1162529 thread 25 bound to OS proc set {50}
OMP: pid 1162438 tid 1162525 thread 21 bound to OS proc set {42}
OMP: pid 1162438 tid 1162509 thread 5 bound to OS proc set {10}
OMP: pid 1162438 tid 1162526 thread 22 bound to OS proc set {44}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 1.397234, "speed_pp": 366.438263, "t_tg": 0.000000, "speed_tg": nan, "t": 1.397234, "speed": 366.438263}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_6

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_6  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1162605 tid 1162605 thread 0 bound to OS proc set {0}
OMP: pid 1162605 tid 1162673 thread 1 bound to OS proc set {1}
OMP: pid 1162605 tid 1162674 thread 2 bound to OS proc set {3}
OMP: pid 1162605 tid 1162686 thread 14 bound to OS proc set {22}
OMP: pid 1162605 tid 1162675 thread 3 bound to OS proc set {4}
OMP: pid 1162605 tid 1162683 thread 11 bound to OS proc set {17}
OMP: pid 1162605 tid 1162680 thread 8 bound to OS proc set {13}
OMP: pid 1162605 tid 1162684 thread 12 bound to OS proc set {19}
OMP: pid 1162605 tid 1162682 thread 10 bound to OS proc set {16}
OMP: pid 1162605 tid 1162676 thread 4 bound to OS proc set {6}
OMP: pid 1162605 tid 1162677 thread 5 bound to OS proc set {8}
OMP: pid 1162605 tid 1162687 thread 15 bound to OS proc set {24}
OMP: pid 1162605 tid 1162707 thread 35 bound to OS proc set {56}
OMP: pid 1162605 tid 1162704 thread 32 bound to OS proc set {52}
OMP: pid 1162605 tid 1162690 thread 18 bound to OS proc set {29}
OMP: pid 1162605 tid 1162678 thread 6 bound to OS proc set {9}
OMP: pid 1162605 tid 1162706 thread 34 bound to OS proc set {55}
OMP: pid 1162605 tid 1162679 thread 7 bound to OS proc set {11}
OMP: pid 1162605 tid 1162696 thread 24 bound to OS proc set {39}
OMP: pid 1162605 tid 1162703 thread 31 bound to OS proc set {50}
OMP: pid 1162605 tid 1162695 thread 23 bound to OS proc set {37}
OMP: pid 1162605 tid 1162699 thread 27 bound to OS proc set {43}
OMP: pid 1162605 tid 1162700 thread 28 bound to OS proc set {45}
OMP: pid 1162605 tid 1162685 thread 13 bound to OS proc set {21}
OMP: pid 1162605 tid 1162694 thread 22 bound to OS proc set {35}
OMP: pid 1162605 tid 1162705 thread 33 bound to OS proc set {53}
OMP: pid 1162605 tid 1162711 thread 39 bound to OS proc set {63}
OMP: pid 1162605 tid 1162702 thread 30 bound to OS proc set {48}
OMP: pid 1162605 tid 1162681 thread 9 bound to OS proc set {14}
OMP: pid 1162605 tid 1162697 thread 25 bound to OS proc set {40}
OMP: pid 1162605 tid 1162692 thread 20 bound to OS proc set {32}
OMP: pid 1162605 tid 1162689 thread 17 bound to OS proc set {27}
OMP: pid 1162605 tid 1162688 thread 16 bound to OS proc set {26}
OMP: pid 1162605 tid 1162691 thread 19 bound to OS proc set {30}
OMP: pid 1162605 tid 1162708 thread 36 bound to OS proc set {58}
OMP: pid 1162605 tid 1162701 thread 29 bound to OS proc set {47}
OMP: pid 1162605 tid 1162698 thread 26 bound to OS proc set {42}
OMP: pid 1162605 tid 1162709 thread 37 bound to OS proc set {60}
OMP: pid 1162605 tid 1162710 thread 38 bound to OS proc set {61}
OMP: pid 1162605 tid 1162693 thread 21 bound to OS proc set {34}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 1.180460, "speed_pp": 433.729248, "t_tg": 0.000000, "speed_tg": nan, "t": 1.180460, "speed": 433.729248}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_7

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_7  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1162731 tid 1162731 thread 0 bound to OS proc set {0}
OMP: pid 1162731 tid 1162799 thread 2 bound to OS proc set {2}
OMP: pid 1162731 tid 1162798 thread 1 bound to OS proc set {1}
OMP: pid 1162731 tid 1162808 thread 11 bound to OS proc set {14}
OMP: pid 1162731 tid 1162800 thread 3 bound to OS proc set {4}
OMP: pid 1162731 tid 1162805 thread 8 bound to OS proc set {10}
OMP: pid 1162731 tid 1162804 thread 7 bound to OS proc set {9}
OMP: pid 1162731 tid 1162811 thread 14 bound to OS proc set {18}
OMP: pid 1162731 tid 1162809 thread 12 bound to OS proc set {16}
OMP: pid 1162731 tid 1162803 thread 6 bound to OS proc set {8}
OMP: pid 1162731 tid 1162841 thread 44 bound to OS proc set {59}
OMP: pid 1162731 tid 1162810 thread 13 bound to OS proc set {17}
OMP: pid 1162731 tid 1162831 thread 34 bound to OS proc set {46}
OMP: pid 1162731 tid 1162806 thread 9 bound to OS proc set {12}
OMP: pid 1162731 tid 1162830 thread 33 bound to OS proc set {44}
OMP: pid 1162731 tid 1162807 thread 10 bound to OS proc set {13}
OMP: pid 1162731 tid 1162832 thread 35 bound to OS proc set {47}
OMP: pid 1162731 tid 1162801 thread 4 bound to OS proc set {5}
OMP: pid 1162731 tid 1162812 thread 15 bound to OS proc set {20}
OMP: pid 1162731 tid 1162821 thread 24 bound to OS proc set {32}
OMP: pid 1162731 tid 1162817 thread 20 bound to OS proc set {27}
OMP: pid 1162731 tid 1162815 thread 18 bound to OS proc set {24}
OMP: pid 1162731 tid 1162829 thread 32 bound to OS proc set {43}
OMP: pid 1162731 tid 1162827 thread 30 bound to OS proc set {40}
OMP: pid 1162731 tid 1162824 thread 27 bound to OS proc set {36}
OMP: pid 1162731 tid 1162823 thread 26 bound to OS proc set {35}
OMP: pid 1162731 tid 1162833 thread 36 bound to OS proc set {48}
OMP: pid 1162731 tid 1162822 thread 25 bound to OS proc set {33}
OMP: pid 1162731 tid 1162844 thread 47 bound to OS proc set {63}
OMP: pid 1162731 tid 1162826 thread 29 bound to OS proc set {39}
OMP: pid 1162731 tid 1162820 thread 23 bound to OS proc set {31}
OMP: pid 1162731 tid 1162814 thread 17 bound to OS proc set {23}
OMP: pid 1162731 tid 1162825 thread 28 bound to OS proc set {37}
OMP: pid 1162731 tid 1162802 thread 5 bound to OS proc set {6}
OMP: pid 1162731 tid 1162816 thread 19 bound to OS proc set {25}
OMP: pid 1162731 tid 1162813 thread 16 bound to OS proc set {21}
OMP: pid 1162731 tid 1162828 thread 31 bound to OS proc set {41}
OMP: pid 1162731 tid 1162843 thread 46 bound to OS proc set {62}
OMP: pid 1162731 tid 1162842 thread 45 bound to OS proc set {60}
OMP: pid 1162731 tid 1162840 thread 43 bound to OS proc set {58}
OMP: pid 1162731 tid 1162819 thread 22 bound to OS proc set {29}
OMP: pid 1162731 tid 1162818 thread 21 bound to OS proc set {28}
OMP: pid 1162731 tid 1162835 thread 38 bound to OS proc set {51}
OMP: pid 1162731 tid 1162834 thread 37 bound to OS proc set {50}
OMP: pid 1162731 tid 1162836 thread 39 bound to OS proc set {52}
OMP: pid 1162731 tid 1162839 thread 42 bound to OS proc set {56}
OMP: pid 1162731 tid 1162838 thread 41 bound to OS proc set {55}
OMP: pid 1162731 tid 1162837 thread 40 bound to OS proc set {54}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 1.030961, "speed_pp": 496.623993, "t_tg": 0.000000, "speed_tg": nan, "t": 1.030961, "speed": 496.623993}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_8

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_8  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1162865 tid 1162865 thread 0 bound to OS proc set {0}
OMP: pid 1162865 tid 1162934 thread 3 bound to OS proc set {3}
OMP: pid 1162865 tid 1162933 thread 2 bound to OS proc set {2}
OMP: pid 1162865 tid 1162932 thread 1 bound to OS proc set {1}
OMP: pid 1162865 tid 1162935 thread 4 bound to OS proc set {4}
OMP: pid 1162865 tid 1162937 thread 6 bound to OS proc set {6}
OMP: pid 1162865 tid 1162936 thread 5 bound to OS proc set {5}
OMP: pid 1162865 tid 1162938 thread 7 bound to OS proc set {8}
OMP: pid 1162865 tid 1162942 thread 11 bound to OS proc set {12}
OMP: pid 1162865 tid 1162962 thread 31 bound to OS proc set {35}
OMP: pid 1162865 tid 1162943 thread 12 bound to OS proc set {13}
OMP: pid 1162865 tid 1162945 thread 14 bound to OS proc set {16}
OMP: pid 1162865 tid 1162982 thread 51 bound to OS proc set {59}
OMP: pid 1162865 tid 1162986 thread 55 bound to OS proc set {63}
OMP: pid 1162865 tid 1162947 thread 16 bound to OS proc set {18}
OMP: pid 1162865 tid 1162941 thread 10 bound to OS proc set {11}
OMP: pid 1162865 tid 1162980 thread 49 bound to OS proc set {56}
OMP: pid 1162865 tid 1162981 thread 50 bound to OS proc set {58}
OMP: pid 1162865 tid 1162958 thread 27 bound to OS proc set {31}
OMP: pid 1162865 tid 1162961 thread 30 bound to OS proc set {34}
OMP: pid 1162865 tid 1162959 thread 28 bound to OS proc set {32}
OMP: pid 1162865 tid 1162979 thread 48 bound to OS proc set {55}
OMP: pid 1162865 tid 1162955 thread 24 bound to OS proc set {27}
OMP: pid 1162865 tid 1162975 thread 44 bound to OS proc set {51}
OMP: pid 1162865 tid 1162983 thread 52 bound to OS proc set {60}
OMP: pid 1162865 tid 1162946 thread 15 bound to OS proc set {17}
OMP: pid 1162865 tid 1162960 thread 29 bound to OS proc set {33}
OMP: pid 1162865 tid 1162956 thread 25 bound to OS proc set {29}
OMP: pid 1162865 tid 1162939 thread 8 bound to OS proc set {9}
OMP: pid 1162865 tid 1162978 thread 47 bound to OS proc set {54}
OMP: pid 1162865 tid 1162948 thread 17 bound to OS proc set {19}
OMP: pid 1162865 tid 1162976 thread 45 bound to OS proc set {52}
OMP: pid 1162865 tid 1162950 thread 19 bound to OS proc set {22}
OMP: pid 1162865 tid 1162965 thread 34 bound to OS proc set {39}
OMP: pid 1162865 tid 1162977 thread 46 bound to OS proc set {53}
OMP: pid 1162865 tid 1162953 thread 22 bound to OS proc set {25}
OMP: pid 1162865 tid 1162949 thread 18 bound to OS proc set {20}
OMP: pid 1162865 tid 1162940 thread 9 bound to OS proc set {10}
OMP: pid 1162865 tid 1162944 thread 13 bound to OS proc set {15}
OMP: pid 1162865 tid 1162957 thread 26 bound to OS proc set {30}
OMP: pid 1162865 tid 1162954 thread 23 bound to OS proc set {26}
OMP: pid 1162865 tid 1162985 thread 54 bound to OS proc set {62}
OMP: pid 1162865 tid 1162974 thread 43 bound to OS proc set {49}
OMP: pid 1162865 tid 1162970 thread 39 bound to OS proc set {45}
OMP: pid 1162865 tid 1162969 thread 38 bound to OS proc set {44}
OMP: pid 1162865 tid 1162966 thread 35 bound to OS proc set {40}
OMP: pid 1162865 tid 1162964 thread 33 bound to OS proc set {38}
OMP: pid 1162865 tid 1162968 thread 37 bound to OS proc set {42}
OMP: pid 1162865 tid 1162963 thread 32 bound to OS proc set {37}
OMP: pid 1162865 tid 1162952 thread 21 bound to OS proc set {24}
OMP: pid 1162865 tid 1162973 thread 42 bound to OS proc set {48}
OMP: pid 1162865 tid 1162967 thread 36 bound to OS proc set {41}
OMP: pid 1162865 tid 1162971 thread 40 bound to OS proc set {46}
OMP: pid 1162865 tid 1162972 thread 41 bound to OS proc set {47}
OMP: pid 1162865 tid 1162984 thread 53 bound to OS proc set {61}
OMP: pid 1162865 tid 1162951 thread 20 bound to OS proc set {23}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 0.914123, "speed_pp": 560.099670, "t_tg": 0.000000, "speed_tg": nan, "t": 0.914123, "speed": 560.099670}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_9

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_9  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1163006 tid 1163006 thread 0 bound to OS proc set {0}
OMP: pid 1163006 tid 1163075 thread 3 bound to OS proc set {3}
OMP: pid 1163006 tid 1163080 thread 8 bound to OS proc set {8}
OMP: pid 1163006 tid 1163073 thread 1 bound to OS proc set {1}
OMP: pid 1163006 tid 1163074 thread 2 bound to OS proc set {2}
OMP: pid 1163006 tid 1163076 thread 4 bound to OS proc set {4}
OMP: pid 1163006 tid 1163082 thread 10 bound to OS proc set {10}
OMP: pid 1163006 tid 1163079 thread 7 bound to OS proc set {7}
OMP: pid 1163006 tid 1163078 thread 6 bound to OS proc set {6}
OMP: pid 1163006 tid 1163081 thread 9 bound to OS proc set {9}
OMP: pid 1163006 tid 1163077 thread 5 bound to OS proc set {5}
OMP: pid 1163006 tid 1163084 thread 12 bound to OS proc set {12}
OMP: pid 1163006 tid 1163086 thread 14 bound to OS proc set {14}
OMP: pid 1163006 tid 1163085 thread 13 bound to OS proc set {13}
OMP: pid 1163006 tid 1163087 thread 15 bound to OS proc set {15}
OMP: pid 1163006 tid 1163102 thread 30 bound to OS proc set {30}
OMP: pid 1163006 tid 1163132 thread 60 bound to OS proc set {60}
OMP: pid 1163006 tid 1163088 thread 16 bound to OS proc set {16}
OMP: pid 1163006 tid 1163091 thread 19 bound to OS proc set {19}
OMP: pid 1163006 tid 1163092 thread 20 bound to OS proc set {20}
OMP: pid 1163006 tid 1163090 thread 18 bound to OS proc set {18}
OMP: pid 1163006 tid 1163089 thread 17 bound to OS proc set {17}
OMP: pid 1163006 tid 1163118 thread 46 bound to OS proc set {46}
OMP: pid 1163006 tid 1163103 thread 31 bound to OS proc set {31}
OMP: pid 1163006 tid 1163101 thread 29 bound to OS proc set {29}
OMP: pid 1163006 tid 1163100 thread 28 bound to OS proc set {28}
OMP: pid 1163006 tid 1163099 thread 27 bound to OS proc set {27}
OMP: pid 1163006 tid 1163083 thread 11 bound to OS proc set {11}
OMP: pid 1163006 tid 1163131 thread 59 bound to OS proc set {59}
OMP: pid 1163006 tid 1163134 thread 62 bound to OS proc set {62}
OMP: pid 1163006 tid 1163119 thread 47 bound to OS proc set {47}
OMP: pid 1163006 tid 1163096 thread 24 bound to OS proc set {24}
OMP: pid 1163006 tid 1163104 thread 32 bound to OS proc set {32}
OMP: pid 1163006 tid 1163117 thread 45 bound to OS proc set {45}
OMP: pid 1163006 tid 1163106 thread 34 bound to OS proc set {34}
OMP: pid 1163006 tid 1163098 thread 26 bound to OS proc set {26}
OMP: pid 1163006 tid 1163095 thread 23 bound to OS proc set {23}
OMP: pid 1163006 tid 1163105 thread 33 bound to OS proc set {33}
OMP: pid 1163006 tid 1163107 thread 35 bound to OS proc set {35}
OMP: pid 1163006 tid 1163123 thread 51 bound to OS proc set {51}
OMP: pid 1163006 tid 1163113 thread 41 bound to OS proc set {41}
OMP: pid 1163006 tid 1163116 thread 44 bound to OS proc set {44}
OMP: pid 1163006 tid 1163130 thread 58 bound to OS proc set {58}
OMP: pid 1163006 tid 1163110 thread 38 bound to OS proc set {38}
OMP: pid 1163006 tid 1163120 thread 48 bound to OS proc set {48}
OMP: pid 1163006 tid 1163133 thread 61 bound to OS proc set {61}
OMP: pid 1163006 tid 1163122 thread 50 bound to OS proc set {50}
OMP: pid 1163006 tid 1163115 thread 43 bound to OS proc set {43}
OMP: pid 1163006 tid 1163112 thread 40 bound to OS proc set {40}
OMP: pid 1163006 tid 1163127 thread 55 bound to OS proc set {55}
OMP: pid 1163006 tid 1163124 thread 52 bound to OS proc set {52}
OMP: pid 1163006 tid 1163121 thread 49 bound to OS proc set {49}
OMP: pid 1163006 tid 1163126 thread 54 bound to OS proc set {54}
OMP: pid 1163006 tid 1163114 thread 42 bound to OS proc set {42}
OMP: pid 1163006 tid 1163108 thread 36 bound to OS proc set {36}
OMP: pid 1163006 tid 1163093 thread 21 bound to OS proc set {21}
OMP: pid 1163006 tid 1163094 thread 22 bound to OS proc set {22}
OMP: pid 1163006 tid 1163125 thread 53 bound to OS proc set {53}
OMP: pid 1163006 tid 1163109 thread 37 bound to OS proc set {37}
OMP: pid 1163006 tid 1163111 thread 39 bound to OS proc set {39}
OMP: pid 1163006 tid 1163128 thread 56 bound to OS proc set {56}
OMP: pid 1163006 tid 1163129 thread 57 bound to OS proc set {57}
OMP: pid 1163006 tid 1163097 thread 25 bound to OS proc set {25}
OMP: pid 1163006 tid 1163135 thread 63 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 0.814694, "speed_pp": 628.456848, "t_tg": 0.000000, "speed_tg": nan, "t": 0.814694, "speed": 628.456848}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_10

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-7699/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-55/tools/lprof_npsu_run_10  #
########################################################################################################################################################################################################################################
