Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 104.151031, "speed_tg": 9.831876, "t": 104.151031, "speed": 9.831876}
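Note on the result line above: it is almost JSON, but `speed_pp` is printed as a bare `nan` (presumably 0 prompt tokens divided by a prompt time of 0), which strict JSON parsers reject. The sketch below is one way to read such a line in Python; the helper name `parse_result` is hypothetical, and the sample line is a shortened copy of the single-thread result. It also checks that `speed_tg` agrees with `pl * tg / t_tg`, which holds for the values logged here.

```python
import json
import math
import re

# Shortened copy of the single-thread result line (some fields omitted).
line = (
    '{"n_threads": 1, "pp": 0, "tg": 128, "pl": 8, '
    '"t_pp": 0.000000, "speed_pp": nan, '
    '"t_tg": 104.151031, "speed_tg": 9.831876}'
)

def parse_result(text):
    """Parse one result line, tolerating a bare lowercase `nan` value."""
    # Python's json module accepts the spelling `NaN`, not `nan`,
    # so rewrite the value before decoding.
    fixed = re.sub(r":\s*nan\b", ": NaN", text)
    return json.loads(fixed)

result = parse_result(line)

# Tokens generated = parallel sequences (pl) * tokens per sequence (tg).
generated = result["pl"] * result["tg"]
derived_speed = generated / result["t_tg"]

print("speed_pp is NaN:", math.isnan(result["speed_pp"]))
print(f"derived speed_tg: {derived_speed:.3f} tokens/s")
```

Mapping `nan` to `None` instead would also work if downstream code should treat the prompt-processing speed as missing rather than as a float.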





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_0

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_0  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 608800 tid 608800 thread 0 bound to OS proc set {0}
OMP: pid 608800 tid 608899 thread 1 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 52.378143, "speed_tg": 19.550138, "t": 52.378143, "speed": 19.550138}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_1

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_1  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 608971 tid 608971 thread 0 bound to OS proc set {0}
OMP: pid 608971 tid 609071 thread 2 bound to OS proc set {48}
OMP: pid 608971 tid 609070 thread 1 bound to OS proc set {24}
OMP: pid 608971 tid 609072 thread 3 bound to OS proc set {72}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 26.642584, "speed_tg": 38.434711, "t": 26.642584, "speed": 38.434711}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_2

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_2  #
########################################################################################################################################################################################################################################
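Note on the `OMP:` binding lines: they are printed in the order the threads report in, not in thread-number order, so the placement is easier to read once sorted. The sketch below parses the four lines of the 4-thread run above; the regular expression and the name `parse_bindings` are assumptions made for illustration, and the pattern only handles proc sets that contain a single id, as in this log.

```python
import re

# Binding lines copied from the 4-thread run above.
log = """\
OMP: pid 608971 tid 608971 thread 0 bound to OS proc set {0}
OMP: pid 608971 tid 609071 thread 2 bound to OS proc set {48}
OMP: pid 608971 tid 609070 thread 1 bound to OS proc set {24}
OMP: pid 608971 tid 609072 thread 3 bound to OS proc set {72}
"""

# Matches only single-id proc sets such as {48}.
PATTERN = re.compile(
    r"OMP: pid (\d+) tid (\d+) thread (\d+) bound to OS proc set \{(\d+)\}"
)

def parse_bindings(text):
    """Return {thread number: OS proc id} for every binding line found."""
    bindings = {}
    for match in PATTERN.finditer(text):
        bindings[int(match.group(3))] = int(match.group(4))
    return bindings

bindings = parse_bindings(log)
for thread in sorted(bindings):
    print(f"thread {thread} -> OS proc {bindings[thread]}")

# Distance between the procs of consecutive threads.
procs = [bindings[t] for t in sorted(bindings)]
strides = [b - a for a, b in zip(procs, procs[1:])]
print("strides:", strides)
```

For this run the threads sit a constant 24 procs apart; the 40-thread run further down mixes strides of 2 and 3, so a constant stride should not be assumed in general.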


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 609092 tid 609092 thread 0 bound to OS proc set {0}
OMP: pid 609092 tid 609192 thread 2 bound to OS proc set {24}
OMP: pid 609092 tid 609194 thread 4 bound to OS proc set {48}
OMP: pid 609092 tid 609191 thread 1 bound to OS proc set {12}
OMP: pid 609092 tid 609196 thread 6 bound to OS proc set {72}
OMP: pid 609092 tid 609195 thread 5 bound to OS proc set {60}
OMP: pid 609092 tid 609193 thread 3 bound to OS proc set {36}
OMP: pid 609092 tid 609197 thread 7 bound to OS proc set {84}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 14.028880, "speed_tg": 72.992287, "t": 14.028880, "speed": 72.992287}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_3

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_3  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 609267 tid 609267 thread 0 bound to OS proc set {0}
OMP: pid 609267 tid 609367 thread 2 bound to OS proc set {12}
OMP: pid 609267 tid 609368 thread 3 bound to OS proc set {18}
OMP: pid 609267 tid 609377 thread 12 bound to OS proc set {72}
OMP: pid 609267 tid 609378 thread 13 bound to OS proc set {78}
OMP: pid 609267 tid 609379 thread 14 bound to OS proc set {84}
OMP: pid 609267 tid 609369 thread 4 bound to OS proc set {24}
OMP: pid 609267 tid 609376 thread 11 bound to OS proc set {66}
OMP: pid 609267 tid 609372 thread 7 bound to OS proc set {42}
OMP: pid 609267 tid 609375 thread 10 bound to OS proc set {60}
OMP: pid 609267 tid 609366 thread 1 bound to OS proc set {6}
OMP: pid 609267 tid 609373 thread 8 bound to OS proc set {48}
OMP: pid 609267 tid 609370 thread 5 bound to OS proc set {30}
OMP: pid 609267 tid 609374 thread 9 bound to OS proc set {54}
OMP: pid 609267 tid 609371 thread 6 bound to OS proc set {36}
OMP: pid 609267 tid 609380 thread 15 bound to OS proc set {90}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 7.953114, "speed_tg": 128.754593, "t": 7.953114, "speed": 128.754593}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_4

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_4  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 609401 tid 609401 thread 0 bound to OS proc set {0}
OMP: pid 609401 tid 609500 thread 1 bound to OS proc set {4}
OMP: pid 609401 tid 609503 thread 4 bound to OS proc set {16}
OMP: pid 609401 tid 609502 thread 3 bound to OS proc set {12}
OMP: pid 609401 tid 609511 thread 12 bound to OS proc set {48}
OMP: pid 609401 tid 609501 thread 2 bound to OS proc set {8}
OMP: pid 609401 tid 609514 thread 15 bound to OS proc set {60}
OMP: pid 609401 tid 609510 thread 11 bound to OS proc set {44}
OMP: pid 609401 tid 609507 thread 8 bound to OS proc set {32}
OMP: pid 609401 tid 609515 thread 16 bound to OS proc set {64}
OMP: pid 609401 tid 609517 thread 18 bound to OS proc set {72}
OMP: pid 609401 tid 609518 thread 19 bound to OS proc set {76}
OMP: pid 609401 tid 609504 thread 5 bound to OS proc set {20}
OMP: pid 609401 tid 609505 thread 6 bound to OS proc set {24}
OMP: pid 609401 tid 609506 thread 7 bound to OS proc set {28}
OMP: pid 609401 tid 609513 thread 14 bound to OS proc set {56}
OMP: pid 609401 tid 609512 thread 13 bound to OS proc set {52}
OMP: pid 609401 tid 609519 thread 20 bound to OS proc set {80}
OMP: pid 609401 tid 609509 thread 10 bound to OS proc set {40}
OMP: pid 609401 tid 609516 thread 17 bound to OS proc set {68}
OMP: pid 609401 tid 609508 thread 9 bound to OS proc set {36}
OMP: pid 609401 tid 609520 thread 21 bound to OS proc set {84}
OMP: pid 609401 tid 609521 thread 22 bound to OS proc set {88}
OMP: pid 609401 tid 609522 thread 23 bound to OS proc set {92}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 6.335299, "speed_tg": 161.634048, "t": 6.335299, "speed": 161.634048}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_5

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_5  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 609542 tid 609542 thread 0 bound to OS proc set {0}
OMP: pid 609542 tid 609641 thread 1 bound to OS proc set {3}
OMP: pid 609542 tid 609644 thread 4 bound to OS proc set {12}
OMP: pid 609542 tid 609652 thread 12 bound to OS proc set {36}
OMP: pid 609542 tid 609655 thread 15 bound to OS proc set {45}
OMP: pid 609542 tid 609645 thread 5 bound to OS proc set {15}
OMP: pid 609542 tid 609642 thread 2 bound to OS proc set {6}
OMP: pid 609542 tid 609653 thread 13 bound to OS proc set {39}
OMP: pid 609542 tid 609654 thread 14 bound to OS proc set {42}
OMP: pid 609542 tid 609648 thread 8 bound to OS proc set {24}
OMP: pid 609542 tid 609651 thread 11 bound to OS proc set {33}
OMP: pid 609542 tid 609647 thread 7 bound to OS proc set {21}
OMP: pid 609542 tid 609668 thread 28 bound to OS proc set {84}
OMP: pid 609542 tid 609646 thread 6 bound to OS proc set {18}
OMP: pid 609542 tid 609643 thread 3 bound to OS proc set {9}
OMP: pid 609542 tid 609656 thread 16 bound to OS proc set {48}
OMP: pid 609542 tid 609649 thread 9 bound to OS proc set {27}
OMP: pid 609542 tid 609659 thread 19 bound to OS proc set {57}
OMP: pid 609542 tid 609667 thread 27 bound to OS proc set {81}
OMP: pid 609542 tid 609664 thread 24 bound to OS proc set {72}
OMP: pid 609542 tid 609650 thread 10 bound to OS proc set {30}
OMP: pid 609542 tid 609670 thread 30 bound to OS proc set {90}
OMP: pid 609542 tid 609658 thread 18 bound to OS proc set {54}
OMP: pid 609542 tid 609666 thread 26 bound to OS proc set {78}
OMP: pid 609542 tid 609660 thread 20 bound to OS proc set {60}
OMP: pid 609542 tid 609669 thread 29 bound to OS proc set {87}
OMP: pid 609542 tid 609665 thread 25 bound to OS proc set {75}
OMP: pid 609542 tid 609671 thread 31 bound to OS proc set {93}
OMP: pid 609542 tid 609657 thread 17 bound to OS proc set {51}
OMP: pid 609542 tid 609663 thread 23 bound to OS proc set {69}
OMP: pid 609542 tid 609662 thread 22 bound to OS proc set {66}
OMP: pid 609542 tid 609661 thread 21 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.420687, "speed_tg": 188.905945, "t": 5.420687, "speed": 188.905945}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_6

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_6  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 609691 tid 609691 thread 0 bound to OS proc set {0}
OMP: pid 609691 tid 609804 thread 15 bound to OS proc set {36}
OMP: pid 609691 tid 609791 thread 2 bound to OS proc set {4}
OMP: pid 609691 tid 609824 thread 35 bound to OS proc set {84}
OMP: pid 609691 tid 609790 thread 1 bound to OS proc set {2}
OMP: pid 609691 tid 609821 thread 32 bound to OS proc set {77}
OMP: pid 609691 tid 609803 thread 14 bound to OS proc set {33}
OMP: pid 609691 tid 609795 thread 6 bound to OS proc set {14}
OMP: pid 609691 tid 609792 thread 3 bound to OS proc set {7}
OMP: pid 609691 tid 609801 thread 12 bound to OS proc set {29}
OMP: pid 609691 tid 609797 thread 8 bound to OS proc set {19}
OMP: pid 609691 tid 609793 thread 4 bound to OS proc set {9}
OMP: pid 609691 tid 609796 thread 7 bound to OS proc set {16}
OMP: pid 609691 tid 609802 thread 13 bound to OS proc set {31}
OMP: pid 609691 tid 609823 thread 34 bound to OS proc set {82}
OMP: pid 609691 tid 609825 thread 36 bound to OS proc set {87}
OMP: pid 609691 tid 609799 thread 10 bound to OS proc set {24}
OMP: pid 609691 tid 609828 thread 39 bound to OS proc set {94}
OMP: pid 609691 tid 609794 thread 5 bound to OS proc set {12}
OMP: pid 609691 tid 609820 thread 31 bound to OS proc set {75}
OMP: pid 609691 tid 609827 thread 38 bound to OS proc set {92}
OMP: pid 609691 tid 609805 thread 16 bound to OS proc set {38}
OMP: pid 609691 tid 609822 thread 33 bound to OS proc set {80}
OMP: pid 609691 tid 609819 thread 30 bound to OS proc set {72}
OMP: pid 609691 tid 609807 thread 18 bound to OS proc set {43}
OMP: pid 609691 tid 609813 thread 24 bound to OS proc set {58}
OMP: pid 609691 tid 609808 thread 19 bound to OS proc set {46}
OMP: pid 609691 tid 609816 thread 27 bound to OS proc set {65}
OMP: pid 609691 tid 609800 thread 11 bound to OS proc set {26}
OMP: pid 609691 tid 609798 thread 9 bound to OS proc set {21}
OMP: pid 609691 tid 609815 thread 26 bound to OS proc set {63}
OMP: pid 609691 tid 609818 thread 29 bound to OS proc set {70}
OMP: pid 609691 tid 609806 thread 17 bound to OS proc set {41}
OMP: pid 609691 tid 609812 thread 23 bound to OS proc set {55}
OMP: pid 609691 tid 609826 thread 37 bound to OS proc set {89}
OMP: pid 609691 tid 609814 thread 25 bound to OS proc set {60}
OMP: pid 609691 tid 609817 thread 28 bound to OS proc set {67}
OMP: pid 609691 tid 609811 thread 22 bound to OS proc set {53}
OMP: pid 609691 tid 609810 thread 21 bound to OS proc set {50}
OMP: pid 609691 tid 609809 thread 20 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 4.950255, "speed_tg": 206.858032, "t": 4.950255, "speed": 206.858032}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_7

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_7  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 609896 tid 609896 thread 0 bound to OS proc set {0}
OMP: pid 609896 tid 610014 thread 19 bound to OS proc set {38}
OMP: pid 609896 tid 610005 thread 10 bound to OS proc set {20}
OMP: pid 609896 tid 610002 thread 7 bound to OS proc set {14}
OMP: pid 609896 tid 610006 thread 11 bound to OS proc set {22}
OMP: pid 609896 tid 610004 thread 9 bound to OS proc set {18}
OMP: pid 609896 tid 609996 thread 1 bound to OS proc set {2}
OMP: pid 609896 tid 610003 thread 8 bound to OS proc set {16}
OMP: pid 609896 tid 610011 thread 16 bound to OS proc set {32}
OMP: pid 609896 tid 610019 thread 24 bound to OS proc set {48}
OMP: pid 609896 tid 609997 thread 2 bound to OS proc set {4}
OMP: pid 609896 tid 610039 thread 44 bound to OS proc set {88}
OMP: pid 609896 tid 610023 thread 28 bound to OS proc set {56}
OMP: pid 609896 tid 609998 thread 3 bound to OS proc set {6}
OMP: pid 609896 tid 610001 thread 6 bound to OS proc set {12}
OMP: pid 609896 tid 610015 thread 20 bound to OS proc set {40}
OMP: pid 609896 tid 610027 thread 32 bound to OS proc set {64}
OMP: pid 609896 tid 610026 thread 31 bound to OS proc set {62}
OMP: pid 609896 tid 610025 thread 30 bound to OS proc set {60}
OMP: pid 609896 tid 610013 thread 18 bound to OS proc set {36}
OMP: pid 609896 tid 610010 thread 15 bound to OS proc set {30}
OMP: pid 609896 tid 610009 thread 14 bound to OS proc set {28}
OMP: pid 609896 tid 610007 thread 12 bound to OS proc set {24}
OMP: pid 609896 tid 610000 thread 5 bound to OS proc set {10}
OMP: pid 609896 tid 610021 thread 26 bound to OS proc set {52}
OMP: pid 609896 tid 610012 thread 17 bound to OS proc set {34}
OMP: pid 609896 tid 610008 thread 13 bound to OS proc set {26}
OMP: pid 609896 tid 610029 thread 34 bound to OS proc set {68}
OMP: pid 609896 tid 609999 thread 4 bound to OS proc set {8}
OMP: pid 609896 tid 610024 thread 29 bound to OS proc set {58}
OMP: pid 609896 tid 610041 thread 46 bound to OS proc set {92}
OMP: pid 609896 tid 610042 thread 47 bound to OS proc set {94}
OMP: pid 609896 tid 610022 thread 27 bound to OS proc set {54}
OMP: pid 609896 tid 610018 thread 23 bound to OS proc set {46}
OMP: pid 609896 tid 610030 thread 35 bound to OS proc set {70}
OMP: pid 609896 tid 610034 thread 39 bound to OS proc set {78}
OMP: pid 609896 tid 610017 thread 22 bound to OS proc set {44}
OMP: pid 609896 tid 610038 thread 43 bound to OS proc set {86}
OMP: pid 609896 tid 610035 thread 40 bound to OS proc set {80}
OMP: pid 609896 tid 610020 thread 25 bound to OS proc set {50}
OMP: pid 609896 tid 610028 thread 33 bound to OS proc set {66}
OMP: pid 609896 tid 610033 thread 38 bound to OS proc set {76}
OMP: pid 609896 tid 610040 thread 45 bound to OS proc set {90}
OMP: pid 609896 tid 610031 thread 36 bound to OS proc set {72}
OMP: pid 609896 tid 610036 thread 41 bound to OS proc set {82}
OMP: pid 609896 tid 610037 thread 42 bound to OS proc set {84}
OMP: pid 609896 tid 610016 thread 21 bound to OS proc set {42}
OMP: pid 609896 tid 610032 thread 37 bound to OS proc set {74}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 4.691680, "speed_tg": 218.258713, "t": 4.691680, "speed": 218.258713}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_8

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_8  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 610062 tid 610062 thread 0 bound to OS proc set {0}
OMP: pid 610062 tid 610161 thread 1 bound to OS proc set {1}
OMP: pid 610062 tid 610168 thread 8 bound to OS proc set {13}
OMP: pid 610062 tid 610191 thread 31 bound to OS proc set {53}
OMP: pid 610062 tid 610167 thread 7 bound to OS proc set {12}
OMP: pid 610062 tid 610192 thread 32 bound to OS proc set {55}
OMP: pid 610062 tid 610187 thread 27 bound to OS proc set {46}
OMP: pid 610062 tid 610211 thread 51 bound to OS proc set {88}
OMP: pid 610062 tid 610208 thread 48 bound to OS proc set {83}
OMP: pid 610062 tid 610210 thread 50 bound to OS proc set {86}
OMP: pid 610062 tid 610212 thread 52 bound to OS proc set {90}
OMP: pid 610062 tid 610169 thread 9 bound to OS proc set {15}
OMP: pid 610062 tid 610209 thread 49 bound to OS proc set {84}
OMP: pid 610062 tid 610162 thread 2 bound to OS proc set {3}
OMP: pid 610062 tid 610175 thread 15 bound to OS proc set {25}
OMP: pid 610062 tid 610214 thread 54 bound to OS proc set {93}
OMP: pid 610062 tid 610179 thread 19 bound to OS proc set {32}
OMP: pid 610062 tid 610215 thread 55 bound to OS proc set {95}
OMP: pid 610062 tid 610171 thread 11 bound to OS proc set {19}
OMP: pid 610062 tid 610213 thread 53 bound to OS proc set {91}
OMP: pid 610062 tid 610164 thread 4 bound to OS proc set {6}
OMP: pid 610062 tid 610189 thread 29 bound to OS proc set {50}
OMP: pid 610062 tid 610190 thread 30 bound to OS proc set {51}
OMP: pid 610062 tid 610170 thread 10 bound to OS proc set {17}
OMP: pid 610062 tid 610178 thread 18 bound to OS proc set {31}
OMP: pid 610062 tid 610184 thread 24 bound to OS proc set {41}
OMP: pid 610062 tid 610172 thread 12 bound to OS proc set {20}
OMP: pid 610062 tid 610188 thread 28 bound to OS proc set {48}
OMP: pid 610062 tid 610195 thread 35 bound to OS proc set {60}
OMP: pid 610062 tid 610194 thread 34 bound to OS proc set {58}
OMP: pid 610062 tid 610193 thread 33 bound to OS proc set {57}
OMP: pid 610062 tid 610174 thread 14 bound to OS proc set {24}
OMP: pid 610062 tid 610182 thread 22 bound to OS proc set {38}
OMP: pid 610062 tid 610185 thread 25 bound to OS proc set {43}
OMP: pid 610062 tid 610180 thread 20 bound to OS proc set {34}
OMP: pid 610062 tid 610186 thread 26 bound to OS proc set {45}
OMP: pid 610062 tid 610165 thread 5 bound to OS proc set {8}
OMP: pid 610062 tid 610204 thread 44 bound to OS proc set {76}
OMP: pid 610062 tid 610176 thread 16 bound to OS proc set {27}
OMP: pid 610062 tid 610177 thread 17 bound to OS proc set {29}
OMP: pid 610062 tid 610163 thread 3 bound to OS proc set {5}
OMP: pid 610062 tid 610207 thread 47 bound to OS proc set {81}
OMP: pid 610062 tid 610166 thread 6 bound to OS proc set {10}
OMP: pid 610062 tid 610203 thread 43 bound to OS proc set {74}
OMP: pid 610062 tid 610181 thread 21 bound to OS proc set {36}
OMP: pid 610062 tid 610200 thread 40 bound to OS proc set {69}
OMP: pid 610062 tid 610183 thread 23 bound to OS proc set {39}
OMP: pid 610062 tid 610173 thread 13 bound to OS proc set {22}
OMP: pid 610062 tid 610198 thread 38 bound to OS proc set {65}
OMP: pid 610062 tid 610199 thread 39 bound to OS proc set {67}
OMP: pid 610062 tid 610196 thread 36 bound to OS proc set {62}
OMP: pid 610062 tid 610202 thread 42 bound to OS proc set {72}
OMP: pid 610062 tid 610206 thread 46 bound to OS proc set {79}
OMP: pid 610062 tid 610197 thread 37 bound to OS proc set {64}
OMP: pid 610062 tid 610205 thread 45 bound to OS proc set {77}
OMP: pid 610062 tid 610201 thread 41 bound to OS proc set {71}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 4.537907, "speed_tg": 225.654678, "t": 4.537907, "speed": 225.654678}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_9

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_9  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 610236 tid 610236 thread 0 bound to OS proc set {0}
OMP: pid 610236 tid 610335 thread 1 bound to OS proc set {1}
OMP: pid 610236 tid 610365 thread 31 bound to OS proc set {46}
OMP: pid 610236 tid 610385 thread 51 bound to OS proc set {77}
OMP: pid 610236 tid 610382 thread 48 bound to OS proc set {72}
OMP: pid 610236 tid 610397 thread 63 bound to OS proc set {95}
OMP: pid 610236 tid 610341 thread 7 bound to OS proc set {10}
OMP: pid 610236 tid 610337 thread 3 bound to OS proc set {4}
OMP: pid 610236 tid 610364 thread 30 bound to OS proc set {45}
OMP: pid 610236 tid 610346 thread 12 bound to OS proc set {18}
OMP: pid 610236 tid 610336 thread 2 bound to OS proc set {3}
OMP: pid 610236 tid 610368 thread 34 bound to OS proc set {51}
OMP: pid 610236 tid 610348 thread 14 bound to OS proc set {21}
OMP: pid 610236 tid 610381 thread 47 bound to OS proc set {71}
OMP: pid 610236 tid 610369 thread 35 bound to OS proc set {53}
OMP: pid 610236 tid 610362 thread 28 bound to OS proc set {42}
OMP: pid 610236 tid 610345 thread 11 bound to OS proc set {16}
OMP: pid 610236 tid 610374 thread 40 bound to OS proc set {60}
OMP: pid 610236 tid 610367 thread 33 bound to OS proc set {50}
OMP: pid 610236 tid 610358 thread 24 bound to OS proc set {36}
OMP: pid 610236 tid 610343 thread 9 bound to OS proc set {13}
OMP: pid 610236 tid 610377 thread 43 bound to OS proc set {65}
OMP: pid 610236 tid 610353 thread 19 bound to OS proc set {28}
OMP: pid 610236 tid 610363 thread 29 bound to OS proc set {43}
OMP: pid 610236 tid 610378 thread 44 bound to OS proc set {66}
OMP: pid 610236 tid 610393 thread 59 bound to OS proc set {89}
OMP: pid 610236 tid 610344 thread 10 bound to OS proc set {15}
OMP: pid 610236 tid 610390 thread 56 bound to OS proc set {84}
OMP: pid 610236 tid 610376 thread 42 bound to OS proc set {63}
OMP: pid 610236 tid 610338 thread 4 bound to OS proc set {6}
OMP: pid 610236 tid 610342 thread 8 bound to OS proc set {12}
OMP: pid 610236 tid 610375 thread 41 bound to OS proc set {62}
OMP: pid 610236 tid 610340 thread 6 bound to OS proc set {9}
OMP: pid 610236 tid 610361 thread 27 bound to OS proc set {40}
OMP: pid 610236 tid 610349 thread 15 bound to OS proc set {22}
OMP: pid 610236 tid 610351 thread 17 bound to OS proc set {25}
OMP: pid 610236 tid 610373 thread 39 bound to OS proc set {59}
OMP: pid 610236 tid 610347 thread 13 bound to OS proc set {19}
OMP: pid 610236 tid 610357 thread 23 bound to OS proc set {34}
OMP: pid 610236 tid 610366 thread 32 bound to OS proc set {48}
OMP: pid 610236 tid 610383 thread 49 bound to OS proc set {74}
OMP: pid 610236 tid 610384 thread 50 bound to OS proc set {75}
OMP: pid 610236 tid 610396 thread 62 bound to OS proc set {93}
OMP: pid 610236 tid 610354 thread 20 bound to OS proc set {30}
OMP: pid 610236 tid 610359 thread 25 bound to OS proc set {37}
OMP: pid 610236 tid 610350 thread 16 bound to OS proc set {24}
OMP: pid 610236 tid 610371 thread 37 bound to OS proc set {56}
OMP: pid 610236 tid 610372 thread 38 bound to OS proc set {57}
OMP: pid 610236 tid 610339 thread 5 bound to OS proc set {7}
OMP: pid 610236 tid 610360 thread 26 bound to OS proc set {39}
OMP: pid 610236 tid 610389 thread 55 bound to OS proc set {83}
OMP: pid 610236 tid 610352 thread 18 bound to OS proc set {27}
OMP: pid 610236 tid 610386 thread 52 bound to OS proc set {78}
OMP: pid 610236 tid 610356 thread 22 bound to OS proc set {33}
OMP: pid 610236 tid 610355 thread 21 bound to OS proc set {31}
OMP: pid 610236 tid 610392 thread 58 bound to OS proc set {87}
OMP: pid 610236 tid 610394 thread 60 bound to OS proc set {90}
OMP: pid 610236 tid 610388 thread 54 bound to OS proc set {81}
OMP: pid 610236 tid 610395 thread 61 bound to OS proc set {92}
OMP: pid 610236 tid 610387 thread 53 bound to OS proc set {80}
OMP: pid 610236 tid 610391 thread 57 bound to OS proc set {86}
OMP: pid 610236 tid 610380 thread 46 bound to OS proc set {69}
OMP: pid 610236 tid 610379 thread 45 bound to OS proc set {68}
OMP: pid 610236 tid 610370 thread 36 bound to OS proc set {54}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 4.414943, "speed_tg": 231.939560, "t": 4.414943, "speed": 231.939560}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_10

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_10  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 610417 tid 610417 thread 0 bound to OS proc set {0}
OMP: pid 610417 tid 610517 thread 2 bound to OS proc set {2}
OMP: pid 610417 tid 610516 thread 1 bound to OS proc set {1}
OMP: pid 610417 tid 610527 thread 12 bound to OS proc set {16}
OMP: pid 610417 tid 610582 thread 67 bound to OS proc set {90}
OMP: pid 610417 tid 610579 thread 64 bound to OS proc set {86}
OMP: pid 610417 tid 610565 thread 50 bound to OS proc set {67}
OMP: pid 610417 tid 610526 thread 11 bound to OS proc set {14}
OMP: pid 610417 tid 610581 thread 66 bound to OS proc set {88}
OMP: pid 610417 tid 610563 thread 48 bound to OS proc set {64}
OMP: pid 610417 tid 610566 thread 51 bound to OS proc set {68}
OMP: pid 610417 tid 610525 thread 10 bound to OS proc set {13}
OMP: pid 610417 tid 610528 thread 13 bound to OS proc set {17}
OMP: pid 610417 tid 610580 thread 65 bound to OS proc set {87}
OMP: pid 610417 tid 610523 thread 8 bound to OS proc set {10}
OMP: pid 610417 tid 610578 thread 63 bound to OS proc set {84}
OMP: pid 610417 tid 610586 thread 71 bound to OS proc set {95}
OMP: pid 610417 tid 610539 thread 24 bound to OS proc set {32}
OMP: pid 610417 tid 610585 thread 70 bound to OS proc set {94}
OMP: pid 610417 tid 610575 thread 60 bound to OS proc set {80}
OMP: pid 610417 tid 610522 thread 7 bound to OS proc set {9}
OMP: pid 610417 tid 610574 thread 59 bound to OS proc set {79}
OMP: pid 610417 tid 610583 thread 68 bound to OS proc set {91}
OMP: pid 610417 tid 610584 thread 69 bound to OS proc set {92}
OMP: pid 610417 tid 610577 thread 62 bound to OS proc set {83}
OMP: pid 610417 tid 610529 thread 14 bound to OS proc set {18}
OMP: pid 610417 tid 610543 thread 28 bound to OS proc set {37}
OMP: pid 610417 tid 610561 thread 46 bound to OS proc set {61}
OMP: pid 610417 tid 610542 thread 27 bound to OS proc set {36}
OMP: pid 610417 tid 610521 thread 6 bound to OS proc set {8}
OMP: pid 610417 tid 610564 thread 49 bound to OS proc set {66}
OMP: pid 610417 tid 610555 thread 40 bound to OS proc set {53}
OMP: pid 610417 tid 610531 thread 16 bound to OS proc set {21}
OMP: pid 610417 tid 610549 thread 34 bound to OS proc set {45}
OMP: pid 610417 tid 610546 thread 31 bound to OS proc set {41}
OMP: pid 610417 tid 610562 thread 47 bound to OS proc set {63}
OMP: pid 610417 tid 610550 thread 35 bound to OS proc set {47}
OMP: pid 610417 tid 610530 thread 15 bound to OS proc set {20}
OMP: pid 610417 tid 610554 thread 39 bound to OS proc set {52}
OMP: pid 610417 tid 610558 thread 43 bound to OS proc set {57}
OMP: pid 610417 tid 610547 thread 32 bound to OS proc set {43}
OMP: pid 610417 tid 610560 thread 45 bound to OS proc set {60}
OMP: pid 610417 tid 610534 thread 19 bound to OS proc set {25}
OMP: pid 610417 tid 610518 thread 3 bound to OS proc set {4}
OMP: pid 610417 tid 610570 thread 55 bound to OS proc set {74}
OMP: pid 610417 tid 610538 thread 23 bound to OS proc set {30}
OMP: pid 610417 tid 610541 thread 26 bound to OS proc set {35}
OMP: pid 610417 tid 610540 thread 25 bound to OS proc set {33}
OMP: pid 610417 tid 610532 thread 17 bound to OS proc set {22}
OMP: pid 610417 tid 610545 thread 30 bound to OS proc set {40}
OMP: pid 610417 tid 610519 thread 4 bound to OS proc set {5}
OMP: pid 610417 tid 610524 thread 9 bound to OS proc set {12}
OMP: pid 610417 tid 610559 thread 44 bound to OS proc set {59}
OMP: pid 610417 tid 610544 thread 29 bound to OS proc set {39}
OMP: pid 610417 tid 610557 thread 42 bound to OS proc set {56}
OMP: pid 610417 tid 610567 thread 52 bound to OS proc set {70}
OMP: pid 610417 tid 610535 thread 20 bound to OS proc set {26}
OMP: pid 610417 tid 610548 thread 33 bound to OS proc set {44}
OMP: pid 610417 tid 610571 thread 56 bound to OS proc set {75}
OMP: pid 610417 tid 610568 thread 53 bound to OS proc set {71}
OMP: pid 610417 tid 610536 thread 21 bound to OS proc set {28}
OMP: pid 610417 tid 610551 thread 36 bound to OS proc set {48}
OMP: pid 610417 tid 610573 thread 58 bound to OS proc set {78}
OMP: pid 610417 tid 610569 thread 54 bound to OS proc set {72}
OMP: pid 610417 tid 610552 thread 37 bound to OS proc set {49}
OMP: pid 610417 tid 610553 thread 38 bound to OS proc set {51}
OMP: pid 610417 tid 610576 thread 61 bound to OS proc set {82}
OMP: pid 610417 tid 610533 thread 18 bound to OS proc set {24}
OMP: pid 610417 tid 610556 thread 41 bound to OS proc set {55}
OMP: pid 610417 tid 610537 thread 22 bound to OS proc set {29}
OMP: pid 610417 tid 610520 thread 5 bound to OS proc set {6}
OMP: pid 610417 tid 610572 thread 57 bound to OS proc set {76}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 72, "n_threads_batch": 72, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 4.410836, "speed_tg": 232.155533, "t": 4.410836, "speed": 232.155533}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_11

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_11      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_11  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_11  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_11  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_11      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_11  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_11  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_11  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 610656 tid 610656 thread 0 bound to OS proc set {0}
OMP: pid 610656 tid 610757 thread 3 bound to OS proc set {3}
OMP: pid 610656 tid 610756 thread 2 bound to OS proc set {2}
OMP: pid 610656 tid 610755 thread 1 bound to OS proc set {1}
OMP: pid 610656 tid 610758 thread 4 bound to OS proc set {4}
OMP: pid 610656 tid 610769 thread 15 bound to OS proc set {18}
OMP: pid 610656 tid 610768 thread 14 bound to OS proc set {16}
OMP: pid 610656 tid 610833 thread 79 bound to OS proc set {95}
OMP: pid 610656 tid 610761 thread 7 bound to OS proc set {8}
OMP: pid 610656 tid 610764 thread 10 bound to OS proc set {12}
OMP: pid 610656 tid 610820 thread 66 bound to OS proc set {80}
OMP: pid 610656 tid 610782 thread 28 bound to OS proc set {33}
OMP: pid 610656 tid 610766 thread 12 bound to OS proc set {14}
OMP: pid 610656 tid 610784 thread 30 bound to OS proc set {36}
OMP: pid 610656 tid 610804 thread 50 bound to OS proc set {60}
OMP: pid 610656 tid 610788 thread 34 bound to OS proc set {41}
OMP: pid 610656 tid 610830 thread 76 bound to OS proc set {92}
OMP: pid 610656 tid 610781 thread 27 bound to OS proc set {32}
OMP: pid 610656 tid 610765 thread 11 bound to OS proc set {13}
OMP: pid 610656 tid 610821 thread 67 bound to OS proc set {81}
OMP: pid 610656 tid 610801 thread 47 bound to OS proc set {56}
OMP: pid 610656 tid 610794 thread 40 bound to OS proc set {48}
OMP: pid 610656 tid 610762 thread 8 bound to OS proc set {9}
OMP: pid 610656 tid 610810 thread 56 bound to OS proc set {67}
OMP: pid 610656 tid 610803 thread 49 bound to OS proc set {59}
OMP: pid 610656 tid 610778 thread 24 bound to OS proc set {29}
OMP: pid 610656 tid 610767 thread 13 bound to OS proc set {15}
OMP: pid 610656 tid 610770 thread 16 bound to OS proc set {19}
OMP: pid 610656 tid 610793 thread 39 bound to OS proc set {47}
OMP: pid 610656 tid 610805 thread 51 bound to OS proc set {61}
OMP: pid 610656 tid 610809 thread 55 bound to OS proc set {66}
OMP: pid 610656 tid 610779 thread 25 bound to OS proc set {30}
OMP: pid 610656 tid 610796 thread 42 bound to OS proc set {50}
OMP: pid 610656 tid 610832 thread 78 bound to OS proc set {94}
OMP: pid 610656 tid 610819 thread 65 bound to OS proc set {78}
OMP: pid 610656 tid 610760 thread 6 bound to OS proc set {7}
OMP: pid 610656 tid 610802 thread 48 bound to OS proc set {58}
OMP: pid 610656 tid 610789 thread 35 bound to OS proc set {42}
OMP: pid 610656 tid 610777 thread 23 bound to OS proc set {27}
OMP: pid 610656 tid 610785 thread 31 bound to OS proc set {37}
OMP: pid 610656 tid 610798 thread 44 bound to OS proc set {53}
OMP: pid 610656 tid 610773 thread 19 bound to OS proc set {23}
OMP: pid 610656 tid 610817 thread 63 bound to OS proc set {76}
OMP: pid 610656 tid 610797 thread 43 bound to OS proc set {52}
OMP: pid 610656 tid 610763 thread 9 bound to OS proc set {10}
OMP: pid 610656 tid 610771 thread 17 bound to OS proc set {20}
OMP: pid 610656 tid 610792 thread 38 bound to OS proc set {46}
OMP: pid 610656 tid 610776 thread 22 bound to OS proc set {26}
OMP: pid 610656 tid 610818 thread 64 bound to OS proc set {77}
OMP: pid 610656 tid 610780 thread 26 bound to OS proc set {31}
OMP: pid 610656 tid 610783 thread 29 bound to OS proc set {35}
OMP: pid 610656 tid 610772 thread 18 bound to OS proc set {21}
OMP: pid 610656 tid 610815 thread 61 bound to OS proc set {73}
OMP: pid 610656 tid 610786 thread 32 bound to OS proc set {38}
OMP: pid 610656 tid 610800 thread 46 bound to OS proc set {55}
OMP: pid 610656 tid 610811 thread 57 bound to OS proc set {69}
OMP: pid 610656 tid 610814 thread 60 bound to OS proc set {72}
OMP: pid 610656 tid 610816 thread 62 bound to OS proc set {75}
OMP: pid 610656 tid 610787 thread 33 bound to OS proc set {40}
OMP: pid 610656 tid 610795 thread 41 bound to OS proc set {49}
OMP: pid 610656 tid 610808 thread 54 bound to OS proc set {65}
OMP: pid 610656 tid 610806 thread 52 bound to OS proc set {63}
OMP: pid 610656 tid 610831 thread 77 bound to OS proc set {93}
OMP: pid 610656 tid 610790 thread 36 bound to OS proc set {43}
OMP: pid 610656 tid 610813 thread 59 bound to OS proc set {71}
OMP: pid 610656 tid 610826 thread 72 bound to OS proc set {87}
OMP: pid 610656 tid 610791 thread 37 bound to OS proc set {44}
OMP: pid 610656 tid 610812 thread 58 bound to OS proc set {70}
OMP: pid 610656 tid 610774 thread 20 bound to OS proc set {24}
OMP: pid 610656 tid 610759 thread 5 bound to OS proc set {6}
OMP: pid 610656 tid 610822 thread 68 bound to OS proc set {82}
OMP: pid 610656 tid 610799 thread 45 bound to OS proc set {54}
OMP: pid 610656 tid 610829 thread 75 bound to OS proc set {90}
OMP: pid 610656 tid 610828 thread 74 bound to OS proc set {89}
OMP: pid 610656 tid 610823 thread 69 bound to OS proc set {83}
OMP: pid 610656 tid 610825 thread 71 bound to OS proc set {86}
OMP: pid 610656 tid 610807 thread 53 bound to OS proc set {64}
OMP: pid 610656 tid 610824 thread 70 bound to OS proc set {84}
OMP: pid 610656 tid 610775 thread 21 bound to OS proc set {25}
OMP: pid 610656 tid 610827 thread 73 bound to OS proc set {88}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 80, "n_threads_batch": 80, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 4.411217, "speed_tg": 232.135468, "t": 4.411217, "speed": 232.135468}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_12

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_12      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_12  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_12  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_12  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_12      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_12  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_12  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_12  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 610853 tid 610853 thread 0 bound to OS proc set {0}
OMP: pid 610853 tid 610954 thread 3 bound to OS proc set {3}
OMP: pid 610853 tid 610953 thread 2 bound to OS proc set {2}
OMP: pid 610853 tid 610978 thread 27 bound to OS proc set {29}
OMP: pid 610853 tid 610959 thread 8 bound to OS proc set {8}
OMP: pid 610853 tid 610979 thread 28 bound to OS proc set {30}
OMP: pid 610853 tid 610958 thread 7 bound to OS proc set {7}
OMP: pid 610853 tid 610952 thread 1 bound to OS proc set {1}
OMP: pid 610853 tid 610955 thread 4 bound to OS proc set {4}
OMP: pid 610853 tid 610975 thread 24 bound to OS proc set {26}
OMP: pid 610853 tid 610957 thread 6 bound to OS proc set {6}
OMP: pid 610853 tid 610977 thread 26 bound to OS proc set {28}
OMP: pid 610853 tid 610980 thread 29 bound to OS proc set {31}
OMP: pid 610853 tid 610960 thread 9 bound to OS proc set {9}
OMP: pid 610853 tid 610956 thread 5 bound to OS proc set {5}
OMP: pid 610853 tid 610976 thread 25 bound to OS proc set {27}
OMP: pid 610853 tid 610965 thread 14 bound to OS proc set {15}
OMP: pid 610853 tid 611010 thread 59 bound to OS proc set {65}
OMP: pid 610853 tid 610999 thread 48 bound to OS proc set {52}
OMP: pid 610853 tid 610962 thread 11 bound to OS proc set {12}
OMP: pid 610853 tid 610981 thread 30 bound to OS proc set {33}
OMP: pid 610853 tid 610985 thread 34 bound to OS proc set {37}
OMP: pid 610853 tid 611014 thread 63 bound to OS proc set {69}
OMP: pid 610853 tid 610998 thread 47 bound to OS proc set {51}
OMP: pid 610853 tid 610966 thread 15 bound to OS proc set {16}
OMP: pid 610853 tid 610982 thread 31 bound to OS proc set {34}
OMP: pid 610853 tid 610963 thread 12 bound to OS proc set {13}
OMP: pid 610853 tid 611006 thread 55 bound to OS proc set {60}
OMP: pid 610853 tid 611009 thread 58 bound to OS proc set {63}
OMP: pid 610853 tid 610969 thread 18 bound to OS proc set {19}
OMP: pid 610853 tid 611011 thread 60 bound to OS proc set {66}
OMP: pid 610853 tid 610997 thread 46 bound to OS proc set {50}
OMP: pid 610853 tid 610990 thread 39 bound to OS proc set {42}
OMP: pid 610853 tid 611027 thread 76 bound to OS proc set {83}
OMP: pid 610853 tid 610995 thread 44 bound to OS proc set {48}
OMP: pid 610853 tid 610964 thread 13 bound to OS proc set {14}
OMP: pid 610853 tid 611002 thread 51 bound to OS proc set {56}
OMP: pid 610853 tid 610961 thread 10 bound to OS proc set {11}
OMP: pid 610853 tid 611000 thread 49 bound to OS proc set {54}
OMP: pid 610853 tid 611028 thread 77 bound to OS proc set {84}
OMP: pid 610853 tid 610986 thread 35 bound to OS proc set {38}
OMP: pid 610853 tid 611001 thread 50 bound to OS proc set {55}
OMP: pid 610853 tid 611007 thread 56 bound to OS proc set {61}
OMP: pid 610853 tid 611005 thread 54 bound to OS proc set {59}
OMP: pid 610853 tid 610993 thread 42 bound to OS proc set {46}
OMP: pid 610853 tid 611029 thread 78 bound to OS proc set {85}
OMP: pid 610853 tid 611008 thread 57 bound to OS proc set {62}
OMP: pid 610853 tid 610983 thread 32 bound to OS proc set {35}
OMP: pid 610853 tid 610994 thread 43 bound to OS proc set {47}
OMP: pid 610853 tid 611013 thread 62 bound to OS proc set {68}
OMP: pid 610853 tid 611004 thread 53 bound to OS proc set {58}
OMP: pid 610853 tid 610984 thread 33 bound to OS proc set {36}
OMP: pid 610853 tid 611033 thread 82 bound to OS proc set {90}
OMP: pid 610853 tid 611012 thread 61 bound to OS proc set {67}
OMP: pid 610853 tid 610970 thread 19 bound to OS proc set {20}
OMP: pid 610853 tid 610992 thread 41 bound to OS proc set {45}
OMP: pid 610853 tid 611018 thread 67 bound to OS proc set {73}
OMP: pid 610853 tid 611003 thread 52 bound to OS proc set {57}
OMP: pid 610853 tid 610987 thread 36 bound to OS proc set {39}
OMP: pid 610853 tid 610996 thread 45 bound to OS proc set {49}
OMP: pid 610853 tid 610974 thread 23 bound to OS proc set {25}
OMP: pid 610853 tid 611025 thread 74 bound to OS proc set {81}
OMP: pid 610853 tid 610989 thread 38 bound to OS proc set {41}
OMP: pid 610853 tid 610991 thread 40 bound to OS proc set {44}
OMP: pid 610853 tid 610967 thread 16 bound to OS proc set {17}
OMP: pid 610853 tid 610971 thread 20 bound to OS proc set {22}
OMP: pid 610853 tid 611019 thread 68 bound to OS proc set {74}
OMP: pid 610853 tid 611026 thread 75 bound to OS proc set {82}
OMP: pid 610853 tid 610972 thread 21 bound to OS proc set {23}
OMP: pid 610853 tid 611017 thread 66 bound to OS proc set {72}
OMP: pid 610853 tid 610973 thread 22 bound to OS proc set {24}
OMP: pid 610853 tid 611034 thread 83 bound to OS proc set {91}
OMP: pid 610853 tid 611015 thread 64 bound to OS proc set {70}
OMP: pid 610853 tid 610988 thread 37 bound to OS proc set {40}
OMP: pid 610853 tid 611021 thread 70 bound to OS proc set {77}
OMP: pid 610853 tid 611024 thread 73 bound to OS proc set {80}
OMP: pid 610853 tid 611030 thread 79 bound to OS proc set {87}
OMP: pid 610853 tid 610968 thread 17 bound to OS proc set {18}
OMP: pid 610853 tid 611016 thread 65 bound to OS proc set {71}
OMP: pid 610853 tid 611020 thread 69 bound to OS proc set {76}
OMP: pid 610853 tid 611022 thread 71 bound to OS proc set {78}
OMP: pid 610853 tid 611023 thread 72 bound to OS proc set {79}
OMP: pid 610853 tid 611031 thread 80 bound to OS proc set {88}
OMP: pid 610853 tid 611032 thread 81 bound to OS proc set {89}
OMP: pid 610853 tid 611038 thread 87 bound to OS proc set {95}
OMP: pid 610853 tid 611035 thread 84 bound to OS proc set {92}
OMP: pid 610853 tid 611036 thread 85 bound to OS proc set {93}
OMP: pid 610853 tid 611037 thread 86 bound to OS proc set {94}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 88, "n_threads_batch": 88, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 4.424495, "speed_tg": 231.438828, "t": 4.424495, "speed": 231.438828}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_13

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_13      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_13  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_13  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_13  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_13      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_13  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_13  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_13  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 611058 tid 611058 thread 0 bound to OS proc set {0}
OMP: pid 611058 tid 611159 thread 3 bound to OS proc set {3}
OMP: pid 611058 tid 611171 thread 15 bound to OS proc set {15}
OMP: pid 611058 tid 611167 thread 11 bound to OS proc set {11}
OMP: pid 611058 tid 611158 thread 2 bound to OS proc set {2}
OMP: pid 611058 tid 611168 thread 12 bound to OS proc set {12}
OMP: pid 611058 tid 611219 thread 63 bound to OS proc set {63}
OMP: pid 611058 tid 611164 thread 8 bound to OS proc set {8}
OMP: pid 611058 tid 611207 thread 51 bound to OS proc set {51}
OMP: pid 611058 tid 611215 thread 59 bound to OS proc set {59}
OMP: pid 611058 tid 611170 thread 14 bound to OS proc set {14}
OMP: pid 611058 tid 611187 thread 31 bound to OS proc set {31}
OMP: pid 611058 tid 611188 thread 32 bound to OS proc set {32}
OMP: pid 611058 tid 611184 thread 28 bound to OS proc set {28}
OMP: pid 611058 tid 611218 thread 62 bound to OS proc set {62}
OMP: pid 611058 tid 611206 thread 50 bound to OS proc set {50}
OMP: pid 611058 tid 611157 thread 1 bound to OS proc set {1}
OMP: pid 611058 tid 611216 thread 60 bound to OS proc set {60}
OMP: pid 611058 tid 611203 thread 47 bound to OS proc set {47}
OMP: pid 611058 tid 611163 thread 7 bound to OS proc set {7}
OMP: pid 611058 tid 611169 thread 13 bound to OS proc set {13}
OMP: pid 611058 tid 611175 thread 19 bound to OS proc set {19}
OMP: pid 611058 tid 611204 thread 48 bound to OS proc set {48}
OMP: pid 611058 tid 611166 thread 10 bound to OS proc set {10}
OMP: pid 611058 tid 611160 thread 4 bound to OS proc set {4}
OMP: pid 611058 tid 611172 thread 16 bound to OS proc set {16}
OMP: pid 611058 tid 611191 thread 35 bound to OS proc set {35}
OMP: pid 611058 tid 611165 thread 9 bound to OS proc set {9}
OMP: pid 611058 tid 611186 thread 30 bound to OS proc set {30}
OMP: pid 611058 tid 611212 thread 56 bound to OS proc set {56}
OMP: pid 611058 tid 611162 thread 6 bound to OS proc set {6}
OMP: pid 611058 tid 611199 thread 43 bound to OS proc set {43}
OMP: pid 611058 tid 611200 thread 44 bound to OS proc set {44}
OMP: pid 611058 tid 611174 thread 18 bound to OS proc set {18}
OMP: pid 611058 tid 611211 thread 55 bound to OS proc set {55}
OMP: pid 611058 tid 611183 thread 27 bound to OS proc set {27}
OMP: pid 611058 tid 611217 thread 61 bound to OS proc set {61}
OMP: pid 611058 tid 611190 thread 34 bound to OS proc set {34}
OMP: pid 611058 tid 611161 thread 5 bound to OS proc set {5}
OMP: pid 611058 tid 611180 thread 24 bound to OS proc set {24}
OMP: pid 611058 tid 611196 thread 40 bound to OS proc set {40}
OMP: pid 611058 tid 611195 thread 39 bound to OS proc set {39}
OMP: pid 611058 tid 611205 thread 49 bound to OS proc set {49}
OMP: pid 611058 tid 611223 thread 67 bound to OS proc set {67}
OMP: pid 611058 tid 611185 thread 29 bound to OS proc set {29}
OMP: pid 611058 tid 611202 thread 46 bound to OS proc set {46}
OMP: pid 611058 tid 611182 thread 26 bound to OS proc set {26}
OMP: pid 611058 tid 611179 thread 23 bound to OS proc set {23}
OMP: pid 611058 tid 611181 thread 25 bound to OS proc set {25}
OMP: pid 611058 tid 611198 thread 42 bound to OS proc set {42}
OMP: pid 611058 tid 611220 thread 64 bound to OS proc set {64}
OMP: pid 611058 tid 611214 thread 58 bound to OS proc set {58}
OMP: pid 611058 tid 611173 thread 17 bound to OS proc set {17}
OMP: pid 611058 tid 611208 thread 52 bound to OS proc set {52}
OMP: pid 611058 tid 611192 thread 36 bound to OS proc set {36}
OMP: pid 611058 tid 611210 thread 54 bound to OS proc set {54}
OMP: pid 611058 tid 611189 thread 33 bound to OS proc set {33}
OMP: pid 611058 tid 611194 thread 38 bound to OS proc set {38}
OMP: pid 611058 tid 611176 thread 20 bound to OS proc set {20}
OMP: pid 611058 tid 611213 thread 57 bound to OS proc set {57}
OMP: pid 611058 tid 611197 thread 41 bound to OS proc set {41}
OMP: pid 611058 tid 611222 thread 66 bound to OS proc set {66}
OMP: pid 611058 tid 611209 thread 53 bound to OS proc set {53}
OMP: pid 611058 tid 611201 thread 45 bound to OS proc set {45}
OMP: pid 611058 tid 611193 thread 37 bound to OS proc set {37}
OMP: pid 611058 tid 611251 thread 95 bound to OS proc set {95}
OMP: pid 611058 tid 611221 thread 65 bound to OS proc set {65}
OMP: pid 611058 tid 611177 thread 21 bound to OS proc set {21}
OMP: pid 611058 tid 611234 thread 78 bound to OS proc set {78}
OMP: pid 611058 tid 611232 thread 76 bound to OS proc set {76}
OMP: pid 611058 tid 611178 thread 22 bound to OS proc set {22}
OMP: pid 611058 tid 611230 thread 74 bound to OS proc set {74}
OMP: pid 611058 tid 611235 thread 79 bound to OS proc set {79}
OMP: pid 611058 tid 611238 thread 82 bound to OS proc set {82}
OMP: pid 611058 tid 611250 thread 94 bound to OS proc set {94}
OMP: pid 611058 tid 611224 thread 68 bound to OS proc set {68}
OMP: pid 611058 tid 611246 thread 90 bound to OS proc set {90}
OMP: pid 611058 tid 611248 thread 92 bound to OS proc set {92}
OMP: pid 611058 tid 611228 thread 72 bound to OS proc set {72}
OMP: pid 611058 tid 611236 thread 80 bound to OS proc set {80}
OMP: pid 611058 tid 611231 thread 75 bound to OS proc set {75}
OMP: pid 611058 tid 611239 thread 83 bound to OS proc set {83}
OMP: pid 611058 tid 611240 thread 84 bound to OS proc set {84}
OMP: pid 611058 tid 611229 thread 73 bound to OS proc set {73}
OMP: pid 611058 tid 611227 thread 71 bound to OS proc set {71}
OMP: pid 611058 tid 611225 thread 69 bound to OS proc set {69}
OMP: pid 611058 tid 611233 thread 77 bound to OS proc set {77}
OMP: pid 611058 tid 611244 thread 88 bound to OS proc set {88}
OMP: pid 611058 tid 611242 thread 86 bound to OS proc set {86}
OMP: pid 611058 tid 611245 thread 89 bound to OS proc set {89}
OMP: pid 611058 tid 611241 thread 85 bound to OS proc set {85}
OMP: pid 611058 tid 611243 thread 87 bound to OS proc set {87}
OMP: pid 611058 tid 611247 thread 91 bound to OS proc set {91}
OMP: pid 611058 tid 611249 thread 93 bound to OS proc set {93}
OMP: pid 611058 tid 611226 thread 70 bound to OS proc set {70}
OMP: pid 611058 tid 611237 thread 81 bound to OS proc set {81}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 96, "n_threads_batch": 96, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 4.530800, "speed_tg": 226.008652, "t": 4.530800, "speed": 226.008652}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_14

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_14      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_14  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_14  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_14  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_14      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_14  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_14  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-6718/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-55-49/tools/lprof_npsu_run_14  #
#########################################################################################################################################################################################################################################
