Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 62.421513, "speed_pp": 16.404600, "t_tg": 0.000000, "speed_tg": nan, "t": 62.421513, "speed": 16.404600}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_0

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_0  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 841834 tid 841834 thread 0 bound to OS proc set {0}
OMP: pid 841834 tid 841934 thread 1 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 31.129410, "speed_pp": 32.894939, "t_tg": 0.000000, "speed_tg": nan, "t": 31.129410, "speed": 32.894939}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_1

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_1  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 842004 tid 842004 thread 0 bound to OS proc set {0}
OMP: pid 842004 tid 842104 thread 2 bound to OS proc set {48}
OMP: pid 842004 tid 842103 thread 1 bound to OS proc set {24}
OMP: pid 842004 tid 842105 thread 3 bound to OS proc set {72}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 15.585934, "speed_pp": 65.700264, "t_tg": 0.000000, "speed_tg": nan, "t": 15.585934, "speed": 65.700264}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_2

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_2  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 842127 tid 842127 thread 0 bound to OS proc set {0}
OMP: pid 842127 tid 842227 thread 2 bound to OS proc set {24}
OMP: pid 842127 tid 842228 thread 3 bound to OS proc set {36}
OMP: pid 842127 tid 842229 thread 4 bound to OS proc set {48}
OMP: pid 842127 tid 842226 thread 1 bound to OS proc set {12}
OMP: pid 842127 tid 842231 thread 6 bound to OS proc set {72}
OMP: pid 842127 tid 842230 thread 5 bound to OS proc set {60}
OMP: pid 842127 tid 842232 thread 7 bound to OS proc set {84}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 7.823085, "speed_pp": 130.894653, "t_tg": 0.000000, "speed_tg": nan, "t": 7.823085, "speed": 130.894653}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_3

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_3  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 842253 tid 842253 thread 0 bound to OS proc set {0}
OMP: pid 842253 tid 842353 thread 2 bound to OS proc set {12}
OMP: pid 842253 tid 842354 thread 3 bound to OS proc set {18}
OMP: pid 842253 tid 842355 thread 4 bound to OS proc set {24}
OMP: pid 842253 tid 842363 thread 12 bound to OS proc set {72}
OMP: pid 842253 tid 842352 thread 1 bound to OS proc set {6}
OMP: pid 842253 tid 842365 thread 14 bound to OS proc set {84}
OMP: pid 842253 tid 842358 thread 7 bound to OS proc set {42}
OMP: pid 842253 tid 842359 thread 8 bound to OS proc set {48}
OMP: pid 842253 tid 842364 thread 13 bound to OS proc set {78}
OMP: pid 842253 tid 842357 thread 6 bound to OS proc set {36}
OMP: pid 842253 tid 842361 thread 10 bound to OS proc set {60}
OMP: pid 842253 tid 842356 thread 5 bound to OS proc set {30}
OMP: pid 842253 tid 842362 thread 11 bound to OS proc set {66}
OMP: pid 842253 tid 842360 thread 9 bound to OS proc set {54}
OMP: pid 842253 tid 842366 thread 15 bound to OS proc set {90}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 3.971555, "speed_pp": 257.833527, "t_tg": 0.000000, "speed_tg": nan, "t": 3.971555, "speed": 257.833527}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_4

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_4  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 842434 tid 842434 thread 0 bound to OS proc set {0}
OMP: pid 842434 tid 842535 thread 3 bound to OS proc set {12}
OMP: pid 842434 tid 842536 thread 4 bound to OS proc set {16}
OMP: pid 842434 tid 842534 thread 2 bound to OS proc set {8}
OMP: pid 842434 tid 842533 thread 1 bound to OS proc set {4}
OMP: pid 842434 tid 842547 thread 15 bound to OS proc set {60}
OMP: pid 842434 tid 842539 thread 7 bound to OS proc set {28}
OMP: pid 842434 tid 842548 thread 16 bound to OS proc set {64}
OMP: pid 842434 tid 842544 thread 12 bound to OS proc set {48}
OMP: pid 842434 tid 842546 thread 14 bound to OS proc set {56}
OMP: pid 842434 tid 842537 thread 5 bound to OS proc set {20}
OMP: pid 842434 tid 842538 thread 6 bound to OS proc set {24}
OMP: pid 842434 tid 842551 thread 19 bound to OS proc set {76}
OMP: pid 842434 tid 842540 thread 8 bound to OS proc set {32}
OMP: pid 842434 tid 842545 thread 13 bound to OS proc set {52}
OMP: pid 842434 tid 842543 thread 11 bound to OS proc set {44}
OMP: pid 842434 tid 842542 thread 10 bound to OS proc set {40}
OMP: pid 842434 tid 842541 thread 9 bound to OS proc set {36}
OMP: pid 842434 tid 842552 thread 20 bound to OS proc set {80}
OMP: pid 842434 tid 842549 thread 17 bound to OS proc set {68}
OMP: pid 842434 tid 842550 thread 18 bound to OS proc set {72}
OMP: pid 842434 tid 842553 thread 21 bound to OS proc set {84}
OMP: pid 842434 tid 842554 thread 22 bound to OS proc set {88}
OMP: pid 842434 tid 842555 thread 23 bound to OS proc set {92}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 2.902076, "speed_pp": 352.850861, "t_tg": 0.000000, "speed_tg": nan, "t": 2.902076, "speed": 352.850861}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_5

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_5  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 842575 tid 842575 thread 0 bound to OS proc set {0}
OMP: pid 842575 tid 842677 thread 4 bound to OS proc set {12}
OMP: pid 842575 tid 842674 thread 1 bound to OS proc set {3}
OMP: pid 842575 tid 842676 thread 3 bound to OS proc set {9}
OMP: pid 842575 tid 842687 thread 14 bound to OS proc set {42}
OMP: pid 842575 tid 842688 thread 15 bound to OS proc set {45}
OMP: pid 842575 tid 842679 thread 6 bound to OS proc set {18}
OMP: pid 842575 tid 842675 thread 2 bound to OS proc set {6}
OMP: pid 842575 tid 842684 thread 11 bound to OS proc set {33}
OMP: pid 842575 tid 842685 thread 12 bound to OS proc set {36}
OMP: pid 842575 tid 842686 thread 13 bound to OS proc set {39}
OMP: pid 842575 tid 842701 thread 28 bound to OS proc set {84}
OMP: pid 842575 tid 842678 thread 5 bound to OS proc set {15}
OMP: pid 842575 tid 842683 thread 10 bound to OS proc set {30}
OMP: pid 842575 tid 842703 thread 30 bound to OS proc set {90}
OMP: pid 842575 tid 842692 thread 19 bound to OS proc set {57}
OMP: pid 842575 tid 842689 thread 16 bound to OS proc set {48}
OMP: pid 842575 tid 842697 thread 24 bound to OS proc set {72}
OMP: pid 842575 tid 842680 thread 7 bound to OS proc set {21}
OMP: pid 842575 tid 842681 thread 8 bound to OS proc set {24}
OMP: pid 842575 tid 842691 thread 18 bound to OS proc set {54}
OMP: pid 842575 tid 842690 thread 17 bound to OS proc set {51}
OMP: pid 842575 tid 842700 thread 27 bound to OS proc set {81}
OMP: pid 842575 tid 842702 thread 29 bound to OS proc set {87}
OMP: pid 842575 tid 842699 thread 26 bound to OS proc set {78}
OMP: pid 842575 tid 842682 thread 9 bound to OS proc set {27}
OMP: pid 842575 tid 842693 thread 20 bound to OS proc set {60}
OMP: pid 842575 tid 842704 thread 31 bound to OS proc set {93}
OMP: pid 842575 tid 842695 thread 22 bound to OS proc set {66}
OMP: pid 842575 tid 842698 thread 25 bound to OS proc set {75}
OMP: pid 842575 tid 842696 thread 23 bound to OS proc set {69}
OMP: pid 842575 tid 842694 thread 21 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 2.303083, "speed_pp": 444.621429, "t_tg": 0.000000, "speed_tg": nan, "t": 2.303083, "speed": 444.621429}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_6

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_6  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 842724 tid 842724 thread 0 bound to OS proc set {0}
OMP: pid 842724 tid 842823 thread 1 bound to OS proc set {2}
OMP: pid 842724 tid 842837 thread 15 bound to OS proc set {36}
OMP: pid 842724 tid 842825 thread 3 bound to OS proc set {7}
OMP: pid 842724 tid 842854 thread 32 bound to OS proc set {77}
OMP: pid 842724 tid 842824 thread 2 bound to OS proc set {4}
OMP: pid 842724 tid 842829 thread 7 bound to OS proc set {16}
OMP: pid 842724 tid 842830 thread 8 bound to OS proc set {19}
OMP: pid 842724 tid 842836 thread 14 bound to OS proc set {33}
OMP: pid 842724 tid 842861 thread 39 bound to OS proc set {94}
OMP: pid 842724 tid 842857 thread 35 bound to OS proc set {84}
OMP: pid 842724 tid 842856 thread 34 bound to OS proc set {82}
OMP: pid 842724 tid 842827 thread 5 bound to OS proc set {12}
OMP: pid 842724 tid 842855 thread 33 bound to OS proc set {80}
OMP: pid 842724 tid 842838 thread 16 bound to OS proc set {38}
OMP: pid 842724 tid 842826 thread 4 bound to OS proc set {9}
OMP: pid 842724 tid 842834 thread 12 bound to OS proc set {29}
OMP: pid 842724 tid 842832 thread 10 bound to OS proc set {24}
OMP: pid 842724 tid 842828 thread 6 bound to OS proc set {14}
OMP: pid 842724 tid 842860 thread 38 bound to OS proc set {92}
OMP: pid 842724 tid 842839 thread 17 bound to OS proc set {41}
OMP: pid 842724 tid 842846 thread 24 bound to OS proc set {58}
OMP: pid 842724 tid 842835 thread 13 bound to OS proc set {31}
OMP: pid 842724 tid 842853 thread 31 bound to OS proc set {75}
OMP: pid 842724 tid 842841 thread 19 bound to OS proc set {46}
OMP: pid 842724 tid 842833 thread 11 bound to OS proc set {26}
OMP: pid 842724 tid 842850 thread 28 bound to OS proc set {67}
OMP: pid 842724 tid 842848 thread 26 bound to OS proc set {63}
OMP: pid 842724 tid 842840 thread 18 bound to OS proc set {43}
OMP: pid 842724 tid 842849 thread 27 bound to OS proc set {65}
OMP: pid 842724 tid 842847 thread 25 bound to OS proc set {60}
OMP: pid 842724 tid 842858 thread 36 bound to OS proc set {87}
OMP: pid 842724 tid 842859 thread 37 bound to OS proc set {89}
OMP: pid 842724 tid 842852 thread 30 bound to OS proc set {72}
OMP: pid 842724 tid 842851 thread 29 bound to OS proc set {70}
OMP: pid 842724 tid 842831 thread 9 bound to OS proc set {21}
OMP: pid 842724 tid 842845 thread 23 bound to OS proc set {55}
OMP: pid 842724 tid 842842 thread 20 bound to OS proc set {48}
OMP: pid 842724 tid 842844 thread 22 bound to OS proc set {53}
OMP: pid 842724 tid 842843 thread 21 bound to OS proc set {50}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 1.940788, "speed_pp": 527.620728, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 1.940789, "speed": 527.620483}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_7

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_7  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 842881 tid 842881 thread 0 bound to OS proc set {0}
OMP: pid 842881 tid 842987 thread 8 bound to OS proc set {16}
OMP: pid 842881 tid 842990 thread 11 bound to OS proc set {22}
OMP: pid 842881 tid 842993 thread 14 bound to OS proc set {28}
OMP: pid 842881 tid 842995 thread 16 bound to OS proc set {32}
OMP: pid 842881 tid 842982 thread 3 bound to OS proc set {6}
OMP: pid 842881 tid 842997 thread 18 bound to OS proc set {36}
OMP: pid 842881 tid 842988 thread 9 bound to OS proc set {18}
OMP: pid 842881 tid 842985 thread 6 bound to OS proc set {12}
OMP: pid 842881 tid 842981 thread 2 bound to OS proc set {4}
OMP: pid 842881 tid 842986 thread 7 bound to OS proc set {14}
OMP: pid 842881 tid 843023 thread 44 bound to OS proc set {88}
OMP: pid 842881 tid 842980 thread 1 bound to OS proc set {2}
OMP: pid 842881 tid 842991 thread 12 bound to OS proc set {24}
OMP: pid 842881 tid 842984 thread 5 bound to OS proc set {10}
OMP: pid 842881 tid 843011 thread 32 bound to OS proc set {64}
OMP: pid 842881 tid 843014 thread 35 bound to OS proc set {70}
OMP: pid 842881 tid 843003 thread 24 bound to OS proc set {48}
OMP: pid 842881 tid 843010 thread 31 bound to OS proc set {62}
OMP: pid 842881 tid 842983 thread 4 bound to OS proc set {8}
OMP: pid 842881 tid 843007 thread 28 bound to OS proc set {56}
OMP: pid 842881 tid 842992 thread 13 bound to OS proc set {26}
OMP: pid 842881 tid 842996 thread 17 bound to OS proc set {34}
OMP: pid 842881 tid 843002 thread 23 bound to OS proc set {46}
OMP: pid 842881 tid 843006 thread 27 bound to OS proc set {54}
OMP: pid 842881 tid 843009 thread 30 bound to OS proc set {60}
OMP: pid 842881 tid 843000 thread 21 bound to OS proc set {42}
OMP: pid 842881 tid 843024 thread 45 bound to OS proc set {90}
OMP: pid 842881 tid 843013 thread 34 bound to OS proc set {68}
OMP: pid 842881 tid 843025 thread 46 bound to OS proc set {92}
OMP: pid 842881 tid 843005 thread 26 bound to OS proc set {52}
OMP: pid 842881 tid 843004 thread 25 bound to OS proc set {50}
OMP: pid 842881 tid 843026 thread 47 bound to OS proc set {94}
OMP: pid 842881 tid 842994 thread 15 bound to OS proc set {30}
OMP: pid 842881 tid 843017 thread 38 bound to OS proc set {76}
OMP: pid 842881 tid 842999 thread 20 bound to OS proc set {40}
OMP: pid 842881 tid 843015 thread 36 bound to OS proc set {72}
OMP: pid 842881 tid 842989 thread 10 bound to OS proc set {20}
OMP: pid 842881 tid 843022 thread 43 bound to OS proc set {86}
OMP: pid 842881 tid 843018 thread 39 bound to OS proc set {78}
OMP: pid 842881 tid 843012 thread 33 bound to OS proc set {66}
OMP: pid 842881 tid 842998 thread 19 bound to OS proc set {38}
OMP: pid 842881 tid 843019 thread 40 bound to OS proc set {80}
OMP: pid 842881 tid 843008 thread 29 bound to OS proc set {58}
OMP: pid 842881 tid 843016 thread 37 bound to OS proc set {74}
OMP: pid 842881 tid 843021 thread 42 bound to OS proc set {84}
OMP: pid 842881 tid 843020 thread 41 bound to OS proc set {82}
OMP: pid 842881 tid 843001 thread 22 bound to OS proc set {44}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 1.688849, "speed_pp": 606.330139, "t_tg": 0.000000, "speed_tg": nan, "t": 1.688849, "speed": 606.330139}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_8

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_8  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 843046 tid 843046 thread 0 bound to OS proc set {0}
OMP: pid 843046 tid 843145 thread 1 bound to OS proc set {1}
OMP: pid 843046 tid 843146 thread 2 bound to OS proc set {3}
OMP: pid 843046 tid 843159 thread 15 bound to OS proc set {25}
OMP: pid 843046 tid 843158 thread 14 bound to OS proc set {24}
OMP: pid 843046 tid 843176 thread 32 bound to OS proc set {55}
OMP: pid 843046 tid 843151 thread 7 bound to OS proc set {12}
OMP: pid 843046 tid 843153 thread 9 bound to OS proc set {15}
OMP: pid 843046 tid 843195 thread 51 bound to OS proc set {88}
OMP: pid 843046 tid 843152 thread 8 bound to OS proc set {13}
OMP: pid 843046 tid 843199 thread 55 bound to OS proc set {95}
OMP: pid 843046 tid 843154 thread 10 bound to OS proc set {17}
OMP: pid 843046 tid 843175 thread 31 bound to OS proc set {53}
OMP: pid 843046 tid 843191 thread 47 bound to OS proc set {81}
OMP: pid 843046 tid 843162 thread 18 bound to OS proc set {31}
OMP: pid 843046 tid 843155 thread 11 bound to OS proc set {19}
OMP: pid 843046 tid 843171 thread 27 bound to OS proc set {46}
OMP: pid 843046 tid 843196 thread 52 bound to OS proc set {90}
OMP: pid 843046 tid 843157 thread 13 bound to OS proc set {22}
OMP: pid 843046 tid 843178 thread 34 bound to OS proc set {58}
OMP: pid 843046 tid 843188 thread 44 bound to OS proc set {76}
OMP: pid 843046 tid 843148 thread 4 bound to OS proc set {6}
OMP: pid 843046 tid 843174 thread 30 bound to OS proc set {51}
OMP: pid 843046 tid 843167 thread 23 bound to OS proc set {39}
OMP: pid 843046 tid 843179 thread 35 bound to OS proc set {60}
OMP: pid 843046 tid 843150 thread 6 bound to OS proc set {10}
OMP: pid 843046 tid 843156 thread 12 bound to OS proc set {20}
OMP: pid 843046 tid 843198 thread 54 bound to OS proc set {93}
OMP: pid 843046 tid 843164 thread 20 bound to OS proc set {34}
OMP: pid 843046 tid 843172 thread 28 bound to OS proc set {48}
OMP: pid 843046 tid 843194 thread 50 bound to OS proc set {86}
OMP: pid 843046 tid 843147 thread 3 bound to OS proc set {5}
OMP: pid 843046 tid 843163 thread 19 bound to OS proc set {32}
OMP: pid 843046 tid 843149 thread 5 bound to OS proc set {8}
OMP: pid 843046 tid 843173 thread 29 bound to OS proc set {50}
OMP: pid 843046 tid 843168 thread 24 bound to OS proc set {41}
OMP: pid 843046 tid 843169 thread 25 bound to OS proc set {43}
OMP: pid 843046 tid 843180 thread 36 bound to OS proc set {62}
OMP: pid 843046 tid 843160 thread 16 bound to OS proc set {27}
OMP: pid 843046 tid 843170 thread 26 bound to OS proc set {45}
OMP: pid 843046 tid 843187 thread 43 bound to OS proc set {74}
OMP: pid 843046 tid 843182 thread 38 bound to OS proc set {65}
OMP: pid 843046 tid 843190 thread 46 bound to OS proc set {79}
OMP: pid 843046 tid 843165 thread 21 bound to OS proc set {36}
OMP: pid 843046 tid 843181 thread 37 bound to OS proc set {64}
OMP: pid 843046 tid 843184 thread 40 bound to OS proc set {69}
OMP: pid 843046 tid 843161 thread 17 bound to OS proc set {29}
OMP: pid 843046 tid 843197 thread 53 bound to OS proc set {91}
OMP: pid 843046 tid 843193 thread 49 bound to OS proc set {84}
OMP: pid 843046 tid 843189 thread 45 bound to OS proc set {77}
OMP: pid 843046 tid 843166 thread 22 bound to OS proc set {38}
OMP: pid 843046 tid 843183 thread 39 bound to OS proc set {67}
OMP: pid 843046 tid 843186 thread 42 bound to OS proc set {72}
OMP: pid 843046 tid 843177 thread 33 bound to OS proc set {57}
OMP: pid 843046 tid 843192 thread 48 bound to OS proc set {83}
OMP: pid 843046 tid 843185 thread 41 bound to OS proc set {71}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 1.493653, "speed_pp": 685.567505, "t_tg": 0.000000, "speed_tg": nan, "t": 1.493653, "speed": 685.567505}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_9

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_9  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 843268 tid 843268 thread 0 bound to OS proc set {0}
OMP: pid 843268 tid 843367 thread 1 bound to OS proc set {1}
OMP: pid 843268 tid 843381 thread 15 bound to OS proc set {22}
OMP: pid 843268 tid 843368 thread 2 bound to OS proc set {3}
OMP: pid 843268 tid 843374 thread 8 bound to OS proc set {12}
OMP: pid 843268 tid 843398 thread 32 bound to OS proc set {48}
OMP: pid 843268 tid 843400 thread 34 bound to OS proc set {51}
OMP: pid 843268 tid 843406 thread 40 bound to OS proc set {60}
OMP: pid 843268 tid 843377 thread 11 bound to OS proc set {16}
OMP: pid 843268 tid 843396 thread 30 bound to OS proc set {45}
OMP: pid 843268 tid 843417 thread 51 bound to OS proc set {77}
OMP: pid 843268 tid 843413 thread 47 bound to OS proc set {71}
OMP: pid 843268 tid 843408 thread 42 bound to OS proc set {63}
OMP: pid 843268 tid 843369 thread 3 bound to OS proc set {4}
OMP: pid 843268 tid 843395 thread 29 bound to OS proc set {43}
OMP: pid 843268 tid 843370 thread 4 bound to OS proc set {6}
OMP: pid 843268 tid 843399 thread 33 bound to OS proc set {50}
OMP: pid 843268 tid 843390 thread 24 bound to OS proc set {36}
OMP: pid 843268 tid 843429 thread 63 bound to OS proc set {95}
OMP: pid 843268 tid 843414 thread 48 bound to OS proc set {72}
OMP: pid 843268 tid 843379 thread 13 bound to OS proc set {19}
OMP: pid 843268 tid 843373 thread 7 bound to OS proc set {10}
OMP: pid 843268 tid 843376 thread 10 bound to OS proc set {15}
OMP: pid 843268 tid 843401 thread 35 bound to OS proc set {53}
OMP: pid 843268 tid 843409 thread 43 bound to OS proc set {65}
OMP: pid 843268 tid 843410 thread 44 bound to OS proc set {66}
OMP: pid 843268 tid 843372 thread 6 bound to OS proc set {9}
OMP: pid 843268 tid 843392 thread 26 bound to OS proc set {39}
OMP: pid 843268 tid 843412 thread 46 bound to OS proc set {69}
OMP: pid 843268 tid 843394 thread 28 bound to OS proc set {42}
OMP: pid 843268 tid 843384 thread 18 bound to OS proc set {27}
OMP: pid 843268 tid 843382 thread 16 bound to OS proc set {24}
OMP: pid 843268 tid 843391 thread 25 bound to OS proc set {37}
OMP: pid 843268 tid 843416 thread 50 bound to OS proc set {75}
OMP: pid 843268 tid 843404 thread 38 bound to OS proc set {57}
OMP: pid 843268 tid 843385 thread 19 bound to OS proc set {28}
OMP: pid 843268 tid 843397 thread 31 bound to OS proc set {46}
OMP: pid 843268 tid 843426 thread 60 bound to OS proc set {90}
OMP: pid 843268 tid 843393 thread 27 bound to OS proc set {40}
OMP: pid 843268 tid 843378 thread 12 bound to OS proc set {18}
OMP: pid 843268 tid 843371 thread 5 bound to OS proc set {7}
OMP: pid 843268 tid 843375 thread 9 bound to OS proc set {13}
OMP: pid 843268 tid 843407 thread 41 bound to OS proc set {62}
OMP: pid 843268 tid 843380 thread 14 bound to OS proc set {21}
OMP: pid 843268 tid 843403 thread 37 bound to OS proc set {56}
OMP: pid 843268 tid 843402 thread 36 bound to OS proc set {54}
OMP: pid 843268 tid 843428 thread 62 bound to OS proc set {93}
OMP: pid 843268 tid 843405 thread 39 bound to OS proc set {59}
OMP: pid 843268 tid 843425 thread 59 bound to OS proc set {89}
OMP: pid 843268 tid 843386 thread 20 bound to OS proc set {30}
OMP: pid 843268 tid 843389 thread 23 bound to OS proc set {34}
OMP: pid 843268 tid 843383 thread 17 bound to OS proc set {25}
OMP: pid 843268 tid 843388 thread 22 bound to OS proc set {33}
OMP: pid 843268 tid 843415 thread 49 bound to OS proc set {74}
OMP: pid 843268 tid 843422 thread 56 bound to OS proc set {84}
OMP: pid 843268 tid 843418 thread 52 bound to OS proc set {78}
OMP: pid 843268 tid 843420 thread 54 bound to OS proc set {81}
OMP: pid 843268 tid 843427 thread 61 bound to OS proc set {92}
OMP: pid 843268 tid 843421 thread 55 bound to OS proc set {83}
OMP: pid 843268 tid 843423 thread 57 bound to OS proc set {86}
OMP: pid 843268 tid 843411 thread 45 bound to OS proc set {68}
OMP: pid 843268 tid 843419 thread 53 bound to OS proc set {80}
OMP: pid 843268 tid 843424 thread 58 bound to OS proc set {87}
OMP: pid 843268 tid 843387 thread 21 bound to OS proc set {31}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 1.331329, "speed_pp": 769.156250, "t_tg": 0.000000, "speed_tg": nan, "t": 1.331329, "speed": 769.156250}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_10

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_10  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 843449 tid 843449 thread 0 bound to OS proc set {0}
OMP: pid 843449 tid 843549 thread 2 bound to OS proc set {2}
OMP: pid 843449 tid 843548 thread 1 bound to OS proc set {1}
OMP: pid 843449 tid 843559 thread 12 bound to OS proc set {16}
OMP: pid 843449 tid 843614 thread 67 bound to OS proc set {90}
OMP: pid 843449 tid 843611 thread 64 bound to OS proc set {86}
OMP: pid 843449 tid 843598 thread 51 bound to OS proc set {68}
OMP: pid 843449 tid 843613 thread 66 bound to OS proc set {88}
OMP: pid 843449 tid 843612 thread 65 bound to OS proc set {87}
OMP: pid 843449 tid 843582 thread 35 bound to OS proc set {47}
OMP: pid 843449 tid 843595 thread 48 bound to OS proc set {64}
OMP: pid 843449 tid 843618 thread 71 bound to OS proc set {95}
OMP: pid 843449 tid 843593 thread 46 bound to OS proc set {61}
OMP: pid 843449 tid 843554 thread 7 bound to OS proc set {9}
OMP: pid 843449 tid 843581 thread 34 bound to OS proc set {45}
OMP: pid 843449 tid 843617 thread 70 bound to OS proc set {94}
OMP: pid 843449 tid 843555 thread 8 bound to OS proc set {10}
OMP: pid 843449 tid 843615 thread 68 bound to OS proc set {91}
OMP: pid 843449 tid 843556 thread 9 bound to OS proc set {12}
OMP: pid 843449 tid 843558 thread 11 bound to OS proc set {14}
OMP: pid 843449 tid 843597 thread 50 bound to OS proc set {67}
OMP: pid 843449 tid 843577 thread 30 bound to OS proc set {40}
OMP: pid 843449 tid 843550 thread 3 bound to OS proc set {4}
OMP: pid 843449 tid 843592 thread 45 bound to OS proc set {60}
OMP: pid 843449 tid 843594 thread 47 bound to OS proc set {63}
OMP: pid 843449 tid 843578 thread 31 bound to OS proc set {41}
OMP: pid 843449 tid 843557 thread 10 bound to OS proc set {13}
OMP: pid 843449 tid 843562 thread 15 bound to OS proc set {20}
OMP: pid 843449 tid 843576 thread 29 bound to OS proc set {39}
OMP: pid 843449 tid 843590 thread 43 bound to OS proc set {57}
OMP: pid 843449 tid 843579 thread 32 bound to OS proc set {43}
OMP: pid 843449 tid 843596 thread 49 bound to OS proc set {66}
OMP: pid 843449 tid 843553 thread 6 bound to OS proc set {8}
OMP: pid 843449 tid 843560 thread 13 bound to OS proc set {17}
OMP: pid 843449 tid 843571 thread 24 bound to OS proc set {32}
OMP: pid 843449 tid 843607 thread 60 bound to OS proc set {80}
OMP: pid 843449 tid 843586 thread 39 bound to OS proc set {52}
OMP: pid 843449 tid 843591 thread 44 bound to OS proc set {59}
OMP: pid 843449 tid 843610 thread 63 bound to OS proc set {84}
OMP: pid 843449 tid 843573 thread 26 bound to OS proc set {35}
OMP: pid 843449 tid 843572 thread 25 bound to OS proc set {33}
OMP: pid 843449 tid 843566 thread 19 bound to OS proc set {25}
OMP: pid 843449 tid 843588 thread 41 bound to OS proc set {55}
OMP: pid 843449 tid 843574 thread 27 bound to OS proc set {36}
OMP: pid 843449 tid 843583 thread 36 bound to OS proc set {48}
OMP: pid 843449 tid 843580 thread 33 bound to OS proc set {44}
OMP: pid 843449 tid 843589 thread 42 bound to OS proc set {56}
OMP: pid 843449 tid 843575 thread 28 bound to OS proc set {37}
OMP: pid 843449 tid 843601 thread 54 bound to OS proc set {72}
OMP: pid 843449 tid 843587 thread 40 bound to OS proc set {53}
OMP: pid 843449 tid 843584 thread 37 bound to OS proc set {49}
OMP: pid 843449 tid 843606 thread 59 bound to OS proc set {79}
OMP: pid 843449 tid 843565 thread 18 bound to OS proc set {24}
OMP: pid 843449 tid 843616 thread 69 bound to OS proc set {92}
OMP: pid 843449 tid 843600 thread 53 bound to OS proc set {71}
OMP: pid 843449 tid 843603 thread 56 bound to OS proc set {75}
OMP: pid 843449 tid 843609 thread 62 bound to OS proc set {83}
OMP: pid 843449 tid 843585 thread 38 bound to OS proc set {51}
OMP: pid 843449 tid 843599 thread 52 bound to OS proc set {70}
OMP: pid 843449 tid 843608 thread 61 bound to OS proc set {82}
OMP: pid 843449 tid 843568 thread 21 bound to OS proc set {28}
OMP: pid 843449 tid 843551 thread 4 bound to OS proc set {5}
OMP: pid 843449 tid 843564 thread 17 bound to OS proc set {22}
OMP: pid 843449 tid 843561 thread 14 bound to OS proc set {18}
OMP: pid 843449 tid 843563 thread 16 bound to OS proc set {21}
OMP: pid 843449 tid 843605 thread 58 bound to OS proc set {78}
OMP: pid 843449 tid 843567 thread 20 bound to OS proc set {26}
OMP: pid 843449 tid 843552 thread 5 bound to OS proc set {6}
OMP: pid 843449 tid 843570 thread 23 bound to OS proc set {30}
OMP: pid 843449 tid 843602 thread 55 bound to OS proc set {74}
OMP: pid 843449 tid 843604 thread 57 bound to OS proc set {76}
OMP: pid 843449 tid 843569 thread 22 bound to OS proc set {29}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 72, "n_threads_batch": 72, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 1.249428, "speed_pp": 819.575012, "t_tg": 0.000000, "speed_tg": nan, "t": 1.249428, "speed": 819.575012}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_11

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_11      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_11  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_11  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_11  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_11      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_11  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_11  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_11  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 843638 tid 843638 thread 0 bound to OS proc set {0}
OMP: pid 843638 tid 843739 thread 3 bound to OS proc set {3}
OMP: pid 843638 tid 843738 thread 2 bound to OS proc set {2}
OMP: pid 843638 tid 843737 thread 1 bound to OS proc set {1}
OMP: pid 843638 tid 843740 thread 4 bound to OS proc set {4}
OMP: pid 843638 tid 843744 thread 8 bound to OS proc set {9}
OMP: pid 843638 tid 843743 thread 7 bound to OS proc set {8}
OMP: pid 843638 tid 843742 thread 6 bound to OS proc set {7}
OMP: pid 843638 tid 843741 thread 5 bound to OS proc set {6}
OMP: pid 843638 tid 843745 thread 9 bound to OS proc set {10}
OMP: pid 843638 tid 843747 thread 11 bound to OS proc set {13}
OMP: pid 843638 tid 843750 thread 14 bound to OS proc set {16}
OMP: pid 843638 tid 843770 thread 34 bound to OS proc set {41}
OMP: pid 843638 tid 843787 thread 51 bound to OS proc set {61}
OMP: pid 843638 tid 843776 thread 40 bound to OS proc set {48}
OMP: pid 843638 tid 843778 thread 42 bound to OS proc set {50}
OMP: pid 843638 tid 843763 thread 27 bound to OS proc set {32}
OMP: pid 843638 tid 843775 thread 39 bound to OS proc set {47}
OMP: pid 843638 tid 843779 thread 43 bound to OS proc set {52}
OMP: pid 843638 tid 843769 thread 33 bound to OS proc set {40}
OMP: pid 843638 tid 843815 thread 79 bound to OS proc set {95}
OMP: pid 843638 tid 843766 thread 30 bound to OS proc set {36}
OMP: pid 843638 tid 843786 thread 50 bound to OS proc set {60}
OMP: pid 843638 tid 843748 thread 12 bound to OS proc set {14}
OMP: pid 843638 tid 843749 thread 13 bound to OS proc set {15}
OMP: pid 843638 tid 843752 thread 16 bound to OS proc set {19}
OMP: pid 843638 tid 843800 thread 64 bound to OS proc set {77}
OMP: pid 843638 tid 843799 thread 63 bound to OS proc set {76}
OMP: pid 843638 tid 843760 thread 24 bound to OS proc set {29}
OMP: pid 843638 tid 843764 thread 28 bound to OS proc set {33}
OMP: pid 843638 tid 843762 thread 26 bound to OS proc set {31}
OMP: pid 843638 tid 843785 thread 49 bound to OS proc set {59}
OMP: pid 843638 tid 843783 thread 47 bound to OS proc set {56}
OMP: pid 843638 tid 843767 thread 31 bound to OS proc set {37}
OMP: pid 843638 tid 843759 thread 23 bound to OS proc set {27}
OMP: pid 843638 tid 843784 thread 48 bound to OS proc set {58}
OMP: pid 843638 tid 843761 thread 25 bound to OS proc set {30}
OMP: pid 843638 tid 843792 thread 56 bound to OS proc set {67}
OMP: pid 843638 tid 843782 thread 46 bound to OS proc set {55}
OMP: pid 843638 tid 843765 thread 29 bound to OS proc set {35}
OMP: pid 843638 tid 843755 thread 19 bound to OS proc set {23}
OMP: pid 843638 tid 843774 thread 38 bound to OS proc set {46}
OMP: pid 843638 tid 843772 thread 36 bound to OS proc set {43}
OMP: pid 843638 tid 843780 thread 44 bound to OS proc set {53}
OMP: pid 843638 tid 843802 thread 66 bound to OS proc set {80}
OMP: pid 843638 tid 843777 thread 41 bound to OS proc set {49}
OMP: pid 843638 tid 843758 thread 22 bound to OS proc set {26}
OMP: pid 843638 tid 843814 thread 78 bound to OS proc set {94}
OMP: pid 843638 tid 843751 thread 15 bound to OS proc set {18}
OMP: pid 843638 tid 843771 thread 35 bound to OS proc set {42}
OMP: pid 843638 tid 843797 thread 61 bound to OS proc set {73}
OMP: pid 843638 tid 843798 thread 62 bound to OS proc set {75}
OMP: pid 843638 tid 843803 thread 67 bound to OS proc set {81}
OMP: pid 843638 tid 843811 thread 75 bound to OS proc set {90}
OMP: pid 843638 tid 843796 thread 60 bound to OS proc set {72}
OMP: pid 843638 tid 843781 thread 45 bound to OS proc set {54}
OMP: pid 843638 tid 843788 thread 52 bound to OS proc set {63}
OMP: pid 843638 tid 843753 thread 17 bound to OS proc set {20}
OMP: pid 843638 tid 843793 thread 57 bound to OS proc set {69}
OMP: pid 843638 tid 843757 thread 21 bound to OS proc set {25}
OMP: pid 843638 tid 843768 thread 32 bound to OS proc set {38}
OMP: pid 843638 tid 843773 thread 37 bound to OS proc set {44}
OMP: pid 843638 tid 843794 thread 58 bound to OS proc set {70}
OMP: pid 843638 tid 843812 thread 76 bound to OS proc set {92}
OMP: pid 843638 tid 843789 thread 53 bound to OS proc set {64}
OMP: pid 843638 tid 843756 thread 20 bound to OS proc set {24}
OMP: pid 843638 tid 843795 thread 59 bound to OS proc set {71}
OMP: pid 843638 tid 843801 thread 65 bound to OS proc set {78}
OMP: pid 843638 tid 843754 thread 18 bound to OS proc set {21}
OMP: pid 843638 tid 843790 thread 54 bound to OS proc set {65}
OMP: pid 843638 tid 843791 thread 55 bound to OS proc set {66}
OMP: pid 843638 tid 843808 thread 72 bound to OS proc set {87}
OMP: pid 843638 tid 843813 thread 77 bound to OS proc set {93}
OMP: pid 843638 tid 843807 thread 71 bound to OS proc set {86}
OMP: pid 843638 tid 843809 thread 73 bound to OS proc set {88}
OMP: pid 843638 tid 843810 thread 74 bound to OS proc set {89}
OMP: pid 843638 tid 843804 thread 68 bound to OS proc set {82}
OMP: pid 843638 tid 843805 thread 69 bound to OS proc set {83}
OMP: pid 843638 tid 843806 thread 70 bound to OS proc set {84}
OMP: pid 843638 tid 843746 thread 10 bound to OS proc set {12}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 80, "n_threads_batch": 80, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 1.144300, "speed_pp": 894.870239, "t_tg": 0.000000, "speed_tg": nan, "t": 1.144300, "speed": 894.870239}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_12

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_12      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_12  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_12  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_12  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_12      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_12  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_12  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_12  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 843835 tid 843835 thread 0 bound to OS proc set {0}
OMP: pid 843835 tid 843937 thread 3 bound to OS proc set {3}
OMP: pid 843835 tid 843936 thread 2 bound to OS proc set {2}
OMP: pid 843835 tid 843969 thread 35 bound to OS proc set {38}
OMP: pid 843835 tid 843942 thread 8 bound to OS proc set {8}
OMP: pid 843835 tid 843935 thread 1 bound to OS proc set {1}
OMP: pid 843835 tid 843941 thread 7 bound to OS proc set {7}
OMP: pid 843835 tid 843938 thread 4 bound to OS proc set {4}
OMP: pid 843835 tid 843968 thread 34 bound to OS proc set {37}
OMP: pid 843835 tid 843940 thread 6 bound to OS proc set {6}
OMP: pid 843835 tid 843943 thread 9 bound to OS proc set {9}
OMP: pid 843835 tid 843973 thread 39 bound to OS proc set {42}
OMP: pid 843835 tid 843948 thread 14 bound to OS proc set {15}
OMP: pid 843835 tid 843970 thread 36 bound to OS proc set {39}
OMP: pid 843835 tid 843939 thread 5 bound to OS proc set {5}
OMP: pid 843835 tid 843972 thread 38 bound to OS proc set {41}
OMP: pid 843835 tid 843985 thread 51 bound to OS proc set {56}
OMP: pid 843835 tid 843949 thread 15 bound to OS proc set {16}
OMP: pid 843835 tid 843971 thread 37 bound to OS proc set {40}
OMP: pid 843835 tid 843977 thread 43 bound to OS proc set {47}
OMP: pid 843835 tid 843994 thread 60 bound to OS proc set {66}
OMP: pid 843835 tid 843944 thread 10 bound to OS proc set {11}
OMP: pid 843835 tid 843952 thread 18 bound to OS proc set {19}
OMP: pid 843835 tid 843965 thread 31 bound to OS proc set {34}
OMP: pid 843835 tid 843998 thread 64 bound to OS proc set {70}
OMP: pid 843835 tid 843946 thread 12 bound to OS proc set {13}
OMP: pid 843835 tid 844017 thread 83 bound to OS proc set {91}
OMP: pid 843835 tid 844008 thread 74 bound to OS proc set {81}
OMP: pid 843835 tid 843981 thread 47 bound to OS proc set {51}
OMP: pid 843835 tid 843947 thread 13 bound to OS proc set {14}
OMP: pid 843835 tid 843997 thread 63 bound to OS proc set {69}
OMP: pid 843835 tid 843961 thread 27 bound to OS proc set {29}
OMP: pid 843835 tid 844010 thread 76 bound to OS proc set {83}
OMP: pid 843835 tid 843993 thread 59 bound to OS proc set {65}
OMP: pid 843835 tid 843960 thread 26 bound to OS proc set {28}
OMP: pid 843835 tid 844012 thread 78 bound to OS proc set {85}
OMP: pid 843835 tid 844009 thread 75 bound to OS proc set {82}
OMP: pid 843835 tid 843980 thread 46 bound to OS proc set {50}
OMP: pid 843835 tid 843966 thread 32 bound to OS proc set {35}
OMP: pid 843835 tid 843945 thread 11 bound to OS proc set {12}
OMP: pid 843835 tid 843950 thread 16 bound to OS proc set {17}
OMP: pid 843835 tid 843982 thread 48 bound to OS proc set {52}
OMP: pid 843835 tid 843990 thread 56 bound to OS proc set {61}
OMP: pid 843835 tid 844006 thread 72 bound to OS proc set {79}
OMP: pid 843835 tid 844011 thread 77 bound to OS proc set {84}
OMP: pid 843835 tid 843957 thread 23 bound to OS proc set {25}
OMP: pid 843835 tid 843953 thread 19 bound to OS proc set {20}
OMP: pid 843835 tid 843984 thread 50 bound to OS proc set {55}
OMP: pid 843835 tid 843958 thread 24 bound to OS proc set {26}
OMP: pid 843835 tid 843995 thread 61 bound to OS proc set {67}
OMP: pid 843835 tid 843996 thread 62 bound to OS proc set {68}
OMP: pid 843835 tid 843959 thread 25 bound to OS proc set {27}
OMP: pid 843835 tid 843989 thread 55 bound to OS proc set {60}
OMP: pid 843835 tid 843986 thread 52 bound to OS proc set {57}
OMP: pid 843835 tid 844016 thread 82 bound to OS proc set {90}
OMP: pid 843835 tid 843974 thread 40 bound to OS proc set {44}
OMP: pid 843835 tid 843978 thread 44 bound to OS proc set {48}
OMP: pid 843835 tid 844007 thread 73 bound to OS proc set {80}
OMP: pid 843835 tid 843991 thread 57 bound to OS proc set {62}
OMP: pid 843835 tid 843963 thread 29 bound to OS proc set {31}
OMP: pid 843835 tid 843962 thread 28 bound to OS proc set {30}
OMP: pid 843835 tid 843956 thread 22 bound to OS proc set {24}
OMP: pid 843835 tid 844013 thread 79 bound to OS proc set {87}
OMP: pid 843835 tid 843999 thread 65 bound to OS proc set {71}
OMP: pid 843835 tid 844005 thread 71 bound to OS proc set {78}
OMP: pid 843835 tid 843967 thread 33 bound to OS proc set {36}
OMP: pid 843835 tid 843976 thread 42 bound to OS proc set {46}
OMP: pid 843835 tid 843979 thread 45 bound to OS proc set {49}
OMP: pid 843835 tid 843988 thread 54 bound to OS proc set {59}
OMP: pid 843835 tid 843992 thread 58 bound to OS proc set {63}
OMP: pid 843835 tid 844014 thread 80 bound to OS proc set {88}
OMP: pid 843835 tid 844015 thread 81 bound to OS proc set {89}
OMP: pid 843835 tid 843951 thread 17 bound to OS proc set {18}
OMP: pid 843835 tid 843975 thread 41 bound to OS proc set {45}
OMP: pid 843835 tid 843983 thread 49 bound to OS proc set {54}
OMP: pid 843835 tid 844001 thread 67 bound to OS proc set {73}
OMP: pid 843835 tid 844002 thread 68 bound to OS proc set {74}
OMP: pid 843835 tid 843955 thread 21 bound to OS proc set {23}
OMP: pid 843835 tid 844021 thread 87 bound to OS proc set {95}
OMP: pid 843835 tid 843987 thread 53 bound to OS proc set {58}
OMP: pid 843835 tid 844004 thread 70 bound to OS proc set {77}
OMP: pid 843835 tid 844018 thread 84 bound to OS proc set {92}
OMP: pid 843835 tid 844020 thread 86 bound to OS proc set {94}
OMP: pid 843835 tid 843964 thread 30 bound to OS proc set {33}
OMP: pid 843835 tid 844000 thread 66 bound to OS proc set {72}
OMP: pid 843835 tid 844003 thread 69 bound to OS proc set {76}
OMP: pid 843835 tid 844019 thread 85 bound to OS proc set {93}
OMP: pid 843835 tid 843954 thread 20 bound to OS proc set {22}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 88, "n_threads_batch": 88, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 1.072665, "speed_pp": 954.631714, "t_tg": 0.000000, "speed_tg": nan, "t": 1.072665, "speed": 954.631714}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_13

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_13      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_13  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_13  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_13  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_13      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_13  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_13  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_13  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 844041 tid 844041 thread 0 bound to OS proc set {0}
OMP: pid 844041 tid 844142 thread 3 bound to OS proc set {3}
OMP: pid 844041 tid 844154 thread 15 bound to OS proc set {15}
OMP: pid 844041 tid 844151 thread 12 bound to OS proc set {12}
OMP: pid 844041 tid 844202 thread 63 bound to OS proc set {63}
OMP: pid 844041 tid 844141 thread 2 bound to OS proc set {2}
OMP: pid 844041 tid 844150 thread 11 bound to OS proc set {11}
OMP: pid 844041 tid 844190 thread 51 bound to OS proc set {51}
OMP: pid 844041 tid 844153 thread 14 bound to OS proc set {14}
OMP: pid 844041 tid 844170 thread 31 bound to OS proc set {31}
OMP: pid 844041 tid 844147 thread 8 bound to OS proc set {8}
OMP: pid 844041 tid 844140 thread 1 bound to OS proc set {1}
OMP: pid 844041 tid 844186 thread 47 bound to OS proc set {47}
OMP: pid 844041 tid 844187 thread 48 bound to OS proc set {48}
OMP: pid 844041 tid 844146 thread 7 bound to OS proc set {7}
OMP: pid 844041 tid 844167 thread 28 bound to OS proc set {28}
OMP: pid 844041 tid 844201 thread 62 bound to OS proc set {62}
OMP: pid 844041 tid 844171 thread 32 bound to OS proc set {32}
OMP: pid 844041 tid 844152 thread 13 bound to OS proc set {13}
OMP: pid 844041 tid 844189 thread 50 bound to OS proc set {50}
OMP: pid 844041 tid 844158 thread 19 bound to OS proc set {19}
OMP: pid 844041 tid 844166 thread 27 bound to OS proc set {27}
OMP: pid 844041 tid 844143 thread 4 bound to OS proc set {4}
OMP: pid 844041 tid 844169 thread 30 bound to OS proc set {30}
OMP: pid 844041 tid 844199 thread 60 bound to OS proc set {60}
OMP: pid 844041 tid 844174 thread 35 bound to OS proc set {35}
OMP: pid 844041 tid 844206 thread 67 bound to OS proc set {67}
OMP: pid 844041 tid 844149 thread 10 bound to OS proc set {10}
OMP: pid 844041 tid 844163 thread 24 bound to OS proc set {24}
OMP: pid 844041 tid 844198 thread 59 bound to OS proc set {59}
OMP: pid 844041 tid 844148 thread 9 bound to OS proc set {9}
OMP: pid 844041 tid 844165 thread 26 bound to OS proc set {26}
OMP: pid 844041 tid 844185 thread 46 bound to OS proc set {46}
OMP: pid 844041 tid 844191 thread 52 bound to OS proc set {52}
OMP: pid 844041 tid 844155 thread 16 bound to OS proc set {16}
OMP: pid 844041 tid 844205 thread 66 bound to OS proc set {66}
OMP: pid 844041 tid 844200 thread 61 bound to OS proc set {61}
OMP: pid 844041 tid 844210 thread 71 bound to OS proc set {71}
OMP: pid 844041 tid 844203 thread 64 bound to OS proc set {64}
OMP: pid 844041 tid 844145 thread 6 bound to OS proc set {6}
OMP: pid 844041 tid 844144 thread 5 bound to OS proc set {5}
OMP: pid 844041 tid 844162 thread 23 bound to OS proc set {23}
OMP: pid 844041 tid 844183 thread 44 bound to OS proc set {44}
OMP: pid 844041 tid 844157 thread 18 bound to OS proc set {18}
OMP: pid 844041 tid 844168 thread 29 bound to OS proc set {29}
OMP: pid 844041 tid 844197 thread 58 bound to OS proc set {58}
OMP: pid 844041 tid 844188 thread 49 bound to OS proc set {49}
OMP: pid 844041 tid 844195 thread 56 bound to OS proc set {56}
OMP: pid 844041 tid 844156 thread 17 bound to OS proc set {17}
OMP: pid 844041 tid 844182 thread 43 bound to OS proc set {43}
OMP: pid 844041 tid 844159 thread 20 bound to OS proc set {20}
OMP: pid 844041 tid 844184 thread 45 bound to OS proc set {45}
OMP: pid 844041 tid 844194 thread 55 bound to OS proc set {55}
OMP: pid 844041 tid 844193 thread 54 bound to OS proc set {54}
OMP: pid 844041 tid 844178 thread 39 bound to OS proc set {39}
OMP: pid 844041 tid 844179 thread 40 bound to OS proc set {40}
OMP: pid 844041 tid 844207 thread 68 bound to OS proc set {68}
OMP: pid 844041 tid 844181 thread 42 bound to OS proc set {42}
OMP: pid 844041 tid 844164 thread 25 bound to OS proc set {25}
OMP: pid 844041 tid 844204 thread 65 bound to OS proc set {65}
OMP: pid 844041 tid 844196 thread 57 bound to OS proc set {57}
OMP: pid 844041 tid 844209 thread 70 bound to OS proc set {70}
OMP: pid 844041 tid 844192 thread 53 bound to OS proc set {53}
OMP: pid 844041 tid 844161 thread 22 bound to OS proc set {22}
OMP: pid 844041 tid 844173 thread 34 bound to OS proc set {34}
OMP: pid 844041 tid 844180 thread 41 bound to OS proc set {41}
OMP: pid 844041 tid 844172 thread 33 bound to OS proc set {33}
OMP: pid 844041 tid 844218 thread 79 bound to OS proc set {79}
OMP: pid 844041 tid 844175 thread 36 bound to OS proc set {36}
OMP: pid 844041 tid 844215 thread 76 bound to OS proc set {76}
OMP: pid 844041 tid 844177 thread 38 bound to OS proc set {38}
OMP: pid 844041 tid 844208 thread 69 bound to OS proc set {69}
OMP: pid 844041 tid 844160 thread 21 bound to OS proc set {21}
OMP: pid 844041 tid 844176 thread 37 bound to OS proc set {37}
OMP: pid 844041 tid 844214 thread 75 bound to OS proc set {75}
OMP: pid 844041 tid 844216 thread 77 bound to OS proc set {77}
OMP: pid 844041 tid 844231 thread 92 bound to OS proc set {92}
OMP: pid 844041 tid 844217 thread 78 bound to OS proc set {78}
OMP: pid 844041 tid 844234 thread 95 bound to OS proc set {95}
OMP: pid 844041 tid 844213 thread 74 bound to OS proc set {74}
OMP: pid 844041 tid 844233 thread 94 bound to OS proc set {94}
OMP: pid 844041 tid 844222 thread 83 bound to OS proc set {83}
OMP: pid 844041 tid 844211 thread 72 bound to OS proc set {72}
OMP: pid 844041 tid 844230 thread 91 bound to OS proc set {91}
OMP: pid 844041 tid 844228 thread 89 bound to OS proc set {89}
OMP: pid 844041 tid 844220 thread 81 bound to OS proc set {81}
OMP: pid 844041 tid 844227 thread 88 bound to OS proc set {88}
OMP: pid 844041 tid 844229 thread 90 bound to OS proc set {90}
OMP: pid 844041 tid 844219 thread 80 bound to OS proc set {80}
OMP: pid 844041 tid 844221 thread 82 bound to OS proc set {82}
OMP: pid 844041 tid 844232 thread 93 bound to OS proc set {93}
OMP: pid 844041 tid 844224 thread 85 bound to OS proc set {85}
OMP: pid 844041 tid 844225 thread 86 bound to OS proc set {86}
OMP: pid 844041 tid 844226 thread 87 bound to OS proc set {87}
OMP: pid 844041 tid 844223 thread 84 bound to OS proc set {84}
OMP: pid 844041 tid 844212 thread 73 bound to OS proc set {73}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 96, "n_threads_batch": 96, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 1.023567, "speed_pp": 1000.423096, "t_tg": 0.000000, "speed_tg": nan, "t": 1.023567, "speed": 1000.423096}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_14

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_14      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_14  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_14  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_14  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_14      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_14  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_14  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/PP128_B8_Q4/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_13-54-58/tools/lprof_npsu_run_14  #
########################################################################################################################################################################################################################################
