Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 9.157262, "speed_pp": 13.977978, "t_tg": 0.000000, "speed_tg": nan, "t": 9.157262, "speed": 13.977978}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_0

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_0  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1139977 tid 1139977 thread 0 bound to OS proc set {0}
OMP: pid 1139977 tid 1140044 thread 1 bound to OS proc set {32}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 4.602892, "speed_pp": 27.808605, "t_tg": 0.000000, "speed_tg": nan, "t": 4.602892, "speed": 27.808605}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_1

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_1  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1140064 tid 1140064 thread 0 bound to OS proc set {0}
OMP: pid 1140064 tid 1140132 thread 1 bound to OS proc set {16}
OMP: pid 1140064 tid 1140133 thread 2 bound to OS proc set {32}
OMP: pid 1140064 tid 1140134 thread 3 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 2.299452, "speed_pp": 55.665436, "t_tg": 0.000000, "speed_tg": nan, "t": 2.299452, "speed": 55.665436}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_2

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_2  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1140154 tid 1140154 thread 0 bound to OS proc set {0}
OMP: pid 1140154 tid 1140222 thread 2 bound to OS proc set {16}
OMP: pid 1140154 tid 1140223 thread 3 bound to OS proc set {24}
OMP: pid 1140154 tid 1140221 thread 1 bound to OS proc set {8}
OMP: pid 1140154 tid 1140226 thread 6 bound to OS proc set {48}
OMP: pid 1140154 tid 1140224 thread 4 bound to OS proc set {32}
OMP: pid 1140154 tid 1140225 thread 5 bound to OS proc set {40}
OMP: pid 1140154 tid 1140227 thread 7 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 1.158677, "speed_pp": 110.470825, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 1.158678, "speed": 110.470734}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_3

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_3  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1140248 tid 1140248 thread 0 bound to OS proc set {0}
OMP: pid 1140248 tid 1140317 thread 3 bound to OS proc set {12}
OMP: pid 1140248 tid 1140326 thread 12 bound to OS proc set {48}
OMP: pid 1140248 tid 1140316 thread 2 bound to OS proc set {8}
OMP: pid 1140248 tid 1140315 thread 1 bound to OS proc set {4}
OMP: pid 1140248 tid 1140322 thread 8 bound to OS proc set {32}
OMP: pid 1140248 tid 1140328 thread 14 bound to OS proc set {56}
OMP: pid 1140248 tid 1140327 thread 13 bound to OS proc set {52}
OMP: pid 1140248 tid 1140321 thread 7 bound to OS proc set {28}
OMP: pid 1140248 tid 1140324 thread 10 bound to OS proc set {40}
OMP: pid 1140248 tid 1140318 thread 4 bound to OS proc set {16}
OMP: pid 1140248 tid 1140325 thread 11 bound to OS proc set {44}
OMP: pid 1140248 tid 1140320 thread 6 bound to OS proc set {24}
OMP: pid 1140248 tid 1140323 thread 9 bound to OS proc set {36}
OMP: pid 1140248 tid 1140319 thread 5 bound to OS proc set {20}
OMP: pid 1140248 tid 1140329 thread 15 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.594316, "speed_pp": 215.373642, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 0.594317, "speed": 215.373276}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_4

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_4  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1140397 tid 1140397 thread 0 bound to OS proc set {0}
OMP: pid 1140397 tid 1140466 thread 3 bound to OS proc set {8}
OMP: pid 1140397 tid 1140470 thread 7 bound to OS proc set {18}
OMP: pid 1140397 tid 1140479 thread 16 bound to OS proc set {43}
OMP: pid 1140397 tid 1140475 thread 12 bound to OS proc set {32}
OMP: pid 1140397 tid 1140464 thread 1 bound to OS proc set {2}
OMP: pid 1140397 tid 1140467 thread 4 bound to OS proc set {10}
OMP: pid 1140397 tid 1140478 thread 15 bound to OS proc set {40}
OMP: pid 1140397 tid 1140477 thread 14 bound to OS proc set {37}
OMP: pid 1140397 tid 1140482 thread 19 bound to OS proc set {51}
OMP: pid 1140397 tid 1140474 thread 11 bound to OS proc set {29}
OMP: pid 1140397 tid 1140469 thread 6 bound to OS proc set {16}
OMP: pid 1140397 tid 1140481 thread 18 bound to OS proc set {48}
OMP: pid 1140397 tid 1140483 thread 20 bound to OS proc set {54}
OMP: pid 1140397 tid 1140472 thread 9 bound to OS proc set {24}
OMP: pid 1140397 tid 1140476 thread 13 bound to OS proc set {35}
OMP: pid 1140397 tid 1140480 thread 17 bound to OS proc set {46}
OMP: pid 1140397 tid 1140473 thread 10 bound to OS proc set {27}
OMP: pid 1140397 tid 1140471 thread 8 bound to OS proc set {21}
OMP: pid 1140397 tid 1140468 thread 5 bound to OS proc set {13}
OMP: pid 1140397 tid 1140465 thread 2 bound to OS proc set {5}
OMP: pid 1140397 tid 1140485 thread 22 bound to OS proc set {59}
OMP: pid 1140397 tid 1140486 thread 23 bound to OS proc set {62}
OMP: pid 1140397 tid 1140484 thread 21 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.451361, "speed_pp": 283.586761, "t_tg": 0.000000, "speed_tg": nan, "t": 0.451361, "speed": 283.586761}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_5

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_5  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1140507 tid 1140507 thread 0 bound to OS proc set {0}
OMP: pid 1140507 tid 1140584 thread 11 bound to OS proc set {22}
OMP: pid 1140507 tid 1140580 thread 7 bound to OS proc set {14}
OMP: pid 1140507 tid 1140574 thread 1 bound to OS proc set {2}
OMP: pid 1140507 tid 1140576 thread 3 bound to OS proc set {6}
OMP: pid 1140507 tid 1140575 thread 2 bound to OS proc set {4}
OMP: pid 1140507 tid 1140581 thread 8 bound to OS proc set {16}
OMP: pid 1140507 tid 1140587 thread 14 bound to OS proc set {28}
OMP: pid 1140507 tid 1140579 thread 6 bound to OS proc set {12}
OMP: pid 1140507 tid 1140582 thread 9 bound to OS proc set {18}
OMP: pid 1140507 tid 1140577 thread 4 bound to OS proc set {8}
OMP: pid 1140507 tid 1140585 thread 12 bound to OS proc set {24}
OMP: pid 1140507 tid 1140588 thread 15 bound to OS proc set {30}
OMP: pid 1140507 tid 1140578 thread 5 bound to OS proc set {10}
OMP: pid 1140507 tid 1140601 thread 28 bound to OS proc set {56}
OMP: pid 1140507 tid 1140592 thread 19 bound to OS proc set {38}
OMP: pid 1140507 tid 1140600 thread 27 bound to OS proc set {54}
OMP: pid 1140507 tid 1140583 thread 10 bound to OS proc set {20}
OMP: pid 1140507 tid 1140602 thread 29 bound to OS proc set {58}
OMP: pid 1140507 tid 1140591 thread 18 bound to OS proc set {36}
OMP: pid 1140507 tid 1140596 thread 23 bound to OS proc set {46}
OMP: pid 1140507 tid 1140586 thread 13 bound to OS proc set {26}
OMP: pid 1140507 tid 1140603 thread 30 bound to OS proc set {60}
OMP: pid 1140507 tid 1140590 thread 17 bound to OS proc set {34}
OMP: pid 1140507 tid 1140597 thread 24 bound to OS proc set {48}
OMP: pid 1140507 tid 1140604 thread 31 bound to OS proc set {62}
OMP: pid 1140507 tid 1140598 thread 25 bound to OS proc set {50}
OMP: pid 1140507 tid 1140599 thread 26 bound to OS proc set {52}
OMP: pid 1140507 tid 1140593 thread 20 bound to OS proc set {40}
OMP: pid 1140507 tid 1140594 thread 21 bound to OS proc set {42}
OMP: pid 1140507 tid 1140589 thread 16 bound to OS proc set {32}
OMP: pid 1140507 tid 1140595 thread 22 bound to OS proc set {44}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.363678, "speed_pp": 351.959686, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 0.363679, "speed": 351.958710}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_6

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_6  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1140624 tid 1140624 thread 0 bound to OS proc set {0}
OMP: pid 1140624 tid 1140691 thread 1 bound to OS proc set {1}
OMP: pid 1140624 tid 1140692 thread 2 bound to OS proc set {3}
OMP: pid 1140624 tid 1140700 thread 10 bound to OS proc set {16}
OMP: pid 1140624 tid 1140697 thread 7 bound to OS proc set {11}
OMP: pid 1140624 tid 1140702 thread 12 bound to OS proc set {19}
OMP: pid 1140624 tid 1140698 thread 8 bound to OS proc set {13}
OMP: pid 1140624 tid 1140696 thread 6 bound to OS proc set {9}
OMP: pid 1140624 tid 1140693 thread 3 bound to OS proc set {4}
OMP: pid 1140624 tid 1140699 thread 9 bound to OS proc set {14}
OMP: pid 1140624 tid 1140694 thread 4 bound to OS proc set {6}
OMP: pid 1140624 tid 1140705 thread 15 bound to OS proc set {24}
OMP: pid 1140624 tid 1140701 thread 11 bound to OS proc set {17}
OMP: pid 1140624 tid 1140707 thread 17 bound to OS proc set {27}
OMP: pid 1140624 tid 1140718 thread 28 bound to OS proc set {45}
OMP: pid 1140624 tid 1140725 thread 35 bound to OS proc set {56}
OMP: pid 1140624 tid 1140695 thread 5 bound to OS proc set {8}
OMP: pid 1140624 tid 1140710 thread 20 bound to OS proc set {32}
OMP: pid 1140624 tid 1140722 thread 32 bound to OS proc set {52}
OMP: pid 1140624 tid 1140721 thread 31 bound to OS proc set {50}
OMP: pid 1140624 tid 1140714 thread 24 bound to OS proc set {39}
OMP: pid 1140624 tid 1140726 thread 36 bound to OS proc set {58}
OMP: pid 1140624 tid 1140704 thread 14 bound to OS proc set {22}
OMP: pid 1140624 tid 1140709 thread 19 bound to OS proc set {30}
OMP: pid 1140624 tid 1140717 thread 27 bound to OS proc set {43}
OMP: pid 1140624 tid 1140712 thread 22 bound to OS proc set {35}
OMP: pid 1140624 tid 1140708 thread 18 bound to OS proc set {29}
OMP: pid 1140624 tid 1140706 thread 16 bound to OS proc set {26}
OMP: pid 1140624 tid 1140703 thread 13 bound to OS proc set {21}
OMP: pid 1140624 tid 1140724 thread 34 bound to OS proc set {55}
OMP: pid 1140624 tid 1140713 thread 23 bound to OS proc set {37}
OMP: pid 1140624 tid 1140728 thread 38 bound to OS proc set {61}
OMP: pid 1140624 tid 1140720 thread 30 bound to OS proc set {48}
OMP: pid 1140624 tid 1140729 thread 39 bound to OS proc set {63}
OMP: pid 1140624 tid 1140719 thread 29 bound to OS proc set {47}
OMP: pid 1140624 tid 1140723 thread 33 bound to OS proc set {53}
OMP: pid 1140624 tid 1140727 thread 37 bound to OS proc set {60}
OMP: pid 1140624 tid 1140711 thread 21 bound to OS proc set {34}
OMP: pid 1140624 tid 1140715 thread 25 bound to OS proc set {40}
OMP: pid 1140624 tid 1140716 thread 26 bound to OS proc set {42}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.312842, "speed_pp": 409.152222, "t_tg": 0.000000, "speed_tg": nan, "t": 0.312842, "speed": 409.152222}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_7

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_7  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1140750 tid 1140750 thread 0 bound to OS proc set {0}
OMP: pid 1140750 tid 1140818 thread 2 bound to OS proc set {2}
OMP: pid 1140750 tid 1140817 thread 1 bound to OS proc set {1}
OMP: pid 1140750 tid 1140827 thread 11 bound to OS proc set {14}
OMP: pid 1140750 tid 1140828 thread 12 bound to OS proc set {16}
OMP: pid 1140750 tid 1140822 thread 6 bound to OS proc set {8}
OMP: pid 1140750 tid 1140819 thread 3 bound to OS proc set {4}
OMP: pid 1140750 tid 1140826 thread 10 bound to OS proc set {13}
OMP: pid 1140750 tid 1140824 thread 8 bound to OS proc set {10}
OMP: pid 1140750 tid 1140830 thread 14 bound to OS proc set {18}
OMP: pid 1140750 tid 1140834 thread 18 bound to OS proc set {24}
OMP: pid 1140750 tid 1140829 thread 13 bound to OS proc set {17}
OMP: pid 1140750 tid 1140825 thread 9 bound to OS proc set {12}
OMP: pid 1140750 tid 1140820 thread 4 bound to OS proc set {5}
OMP: pid 1140750 tid 1140821 thread 5 bound to OS proc set {6}
OMP: pid 1140750 tid 1140844 thread 28 bound to OS proc set {37}
OMP: pid 1140750 tid 1140823 thread 7 bound to OS proc set {9}
OMP: pid 1140750 tid 1140833 thread 17 bound to OS proc set {23}
OMP: pid 1140750 tid 1140831 thread 15 bound to OS proc set {20}
OMP: pid 1140750 tid 1140839 thread 23 bound to OS proc set {31}
OMP: pid 1140750 tid 1140851 thread 35 bound to OS proc set {47}
OMP: pid 1140750 tid 1140838 thread 22 bound to OS proc set {29}
OMP: pid 1140750 tid 1140840 thread 24 bound to OS proc set {32}
OMP: pid 1140750 tid 1140847 thread 31 bound to OS proc set {41}
OMP: pid 1140750 tid 1140850 thread 34 bound to OS proc set {46}
OMP: pid 1140750 tid 1140842 thread 26 bound to OS proc set {35}
OMP: pid 1140750 tid 1140843 thread 27 bound to OS proc set {36}
OMP: pid 1140750 tid 1140849 thread 33 bound to OS proc set {44}
OMP: pid 1140750 tid 1140852 thread 36 bound to OS proc set {48}
OMP: pid 1140750 tid 1140836 thread 20 bound to OS proc set {27}
OMP: pid 1140750 tid 1140845 thread 29 bound to OS proc set {39}
OMP: pid 1140750 tid 1140859 thread 43 bound to OS proc set {58}
OMP: pid 1140750 tid 1140848 thread 32 bound to OS proc set {43}
OMP: pid 1140750 tid 1140846 thread 30 bound to OS proc set {40}
OMP: pid 1140750 tid 1140841 thread 25 bound to OS proc set {33}
OMP: pid 1140750 tid 1140863 thread 47 bound to OS proc set {63}
OMP: pid 1140750 tid 1140835 thread 19 bound to OS proc set {25}
OMP: pid 1140750 tid 1140862 thread 46 bound to OS proc set {62}
OMP: pid 1140750 tid 1140853 thread 37 bound to OS proc set {50}
OMP: pid 1140750 tid 1140832 thread 16 bound to OS proc set {21}
OMP: pid 1140750 tid 1140837 thread 21 bound to OS proc set {28}
OMP: pid 1140750 tid 1140860 thread 44 bound to OS proc set {59}
OMP: pid 1140750 tid 1140856 thread 40 bound to OS proc set {54}
OMP: pid 1140750 tid 1140858 thread 42 bound to OS proc set {56}
OMP: pid 1140750 tid 1140861 thread 45 bound to OS proc set {60}
OMP: pid 1140750 tid 1140855 thread 39 bound to OS proc set {52}
OMP: pid 1140750 tid 1140857 thread 41 bound to OS proc set {55}
OMP: pid 1140750 tid 1140854 thread 38 bound to OS proc set {51}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.277087, "speed_pp": 461.948761, "t_tg": 0.000000, "speed_tg": nan, "t": 0.277087, "speed": 461.948761}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_8

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_8  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1140883 tid 1140883 thread 0 bound to OS proc set {0}
OMP: pid 1140883 tid 1140952 thread 3 bound to OS proc set {3}
OMP: pid 1140883 tid 1140951 thread 2 bound to OS proc set {2}
OMP: pid 1140883 tid 1140950 thread 1 bound to OS proc set {1}
OMP: pid 1140883 tid 1140953 thread 4 bound to OS proc set {4}
OMP: pid 1140883 tid 1140955 thread 6 bound to OS proc set {6}
OMP: pid 1140883 tid 1140997 thread 48 bound to OS proc set {55}
OMP: pid 1140883 tid 1140954 thread 5 bound to OS proc set {5}
OMP: pid 1140883 tid 1140960 thread 11 bound to OS proc set {12}
OMP: pid 1140883 tid 1140959 thread 10 bound to OS proc set {11}
OMP: pid 1140883 tid 1140980 thread 31 bound to OS proc set {35}
OMP: pid 1140883 tid 1140956 thread 7 bound to OS proc set {8}
OMP: pid 1140883 tid 1140977 thread 28 bound to OS proc set {32}
OMP: pid 1140883 tid 1141000 thread 51 bound to OS proc set {59}
OMP: pid 1140883 tid 1140961 thread 12 bound to OS proc set {13}
OMP: pid 1140883 tid 1140963 thread 14 bound to OS proc set {16}
OMP: pid 1140883 tid 1140999 thread 50 bound to OS proc set {58}
OMP: pid 1140883 tid 1140962 thread 13 bound to OS proc set {15}
OMP: pid 1140883 tid 1140957 thread 8 bound to OS proc set {9}
OMP: pid 1140883 tid 1140965 thread 16 bound to OS proc set {18}
OMP: pid 1140883 tid 1140964 thread 15 bound to OS proc set {17}
OMP: pid 1140883 tid 1140993 thread 44 bound to OS proc set {51}
OMP: pid 1140883 tid 1140979 thread 30 bound to OS proc set {34}
OMP: pid 1140883 tid 1140996 thread 47 bound to OS proc set {54}
OMP: pid 1140883 tid 1140958 thread 9 bound to OS proc set {10}
OMP: pid 1140883 tid 1140966 thread 17 bound to OS proc set {19}
OMP: pid 1140883 tid 1140973 thread 24 bound to OS proc set {27}
OMP: pid 1140883 tid 1140998 thread 49 bound to OS proc set {56}
OMP: pid 1140883 tid 1140981 thread 32 bound to OS proc set {37}
OMP: pid 1140883 tid 1140978 thread 29 bound to OS proc set {33}
OMP: pid 1140883 tid 1140976 thread 27 bound to OS proc set {31}
OMP: pid 1140883 tid 1140995 thread 46 bound to OS proc set {53}
OMP: pid 1140883 tid 1140967 thread 18 bound to OS proc set {20}
OMP: pid 1140883 tid 1141003 thread 54 bound to OS proc set {62}
OMP: pid 1140883 tid 1141004 thread 55 bound to OS proc set {63}
OMP: pid 1140883 tid 1140968 thread 19 bound to OS proc set {22}
OMP: pid 1140883 tid 1140969 thread 20 bound to OS proc set {23}
OMP: pid 1140883 tid 1140985 thread 36 bound to OS proc set {41}
OMP: pid 1140883 tid 1140984 thread 35 bound to OS proc set {40}
OMP: pid 1140883 tid 1140972 thread 23 bound to OS proc set {26}
OMP: pid 1140883 tid 1140974 thread 25 bound to OS proc set {29}
OMP: pid 1140883 tid 1140975 thread 26 bound to OS proc set {30}
OMP: pid 1140883 tid 1140991 thread 42 bound to OS proc set {48}
OMP: pid 1140883 tid 1140988 thread 39 bound to OS proc set {45}
OMP: pid 1140883 tid 1140970 thread 21 bound to OS proc set {24}
OMP: pid 1140883 tid 1140983 thread 34 bound to OS proc set {39}
OMP: pid 1140883 tid 1140971 thread 22 bound to OS proc set {25}
OMP: pid 1140883 tid 1140987 thread 38 bound to OS proc set {44}
OMP: pid 1140883 tid 1140982 thread 33 bound to OS proc set {38}
OMP: pid 1140883 tid 1140994 thread 45 bound to OS proc set {52}
OMP: pid 1140883 tid 1140992 thread 43 bound to OS proc set {49}
OMP: pid 1140883 tid 1141001 thread 52 bound to OS proc set {60}
OMP: pid 1140883 tid 1141002 thread 53 bound to OS proc set {61}
OMP: pid 1140883 tid 1140986 thread 37 bound to OS proc set {42}
OMP: pid 1140883 tid 1140989 thread 40 bound to OS proc set {46}
OMP: pid 1140883 tid 1140990 thread 41 bound to OS proc set {47}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.249481, "speed_pp": 513.065125, "t_tg": 0.000000, "speed_tg": nan, "t": 0.249481, "speed": 513.065125}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_9

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_9  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1141073 tid 1141073 thread 0 bound to OS proc set {0}
OMP: pid 1141073 tid 1141142 thread 3 bound to OS proc set {3}
OMP: pid 1141073 tid 1141154 thread 15 bound to OS proc set {15}
OMP: pid 1141073 tid 1141150 thread 11 bound to OS proc set {11}
OMP: pid 1141073 tid 1141151 thread 12 bound to OS proc set {12}
OMP: pid 1141073 tid 1141141 thread 2 bound to OS proc set {2}
OMP: pid 1141073 tid 1141147 thread 8 bound to OS proc set {8}
OMP: pid 1141073 tid 1141149 thread 10 bound to OS proc set {10}
OMP: pid 1141073 tid 1141153 thread 14 bound to OS proc set {14}
OMP: pid 1141073 tid 1141152 thread 13 bound to OS proc set {13}
OMP: pid 1141073 tid 1141146 thread 7 bound to OS proc set {7}
OMP: pid 1141073 tid 1141158 thread 19 bound to OS proc set {19}
OMP: pid 1141073 tid 1141143 thread 4 bound to OS proc set {4}
OMP: pid 1141073 tid 1141148 thread 9 bound to OS proc set {9}
OMP: pid 1141073 tid 1141155 thread 16 bound to OS proc set {16}
OMP: pid 1141073 tid 1141145 thread 6 bound to OS proc set {6}
OMP: pid 1141073 tid 1141163 thread 24 bound to OS proc set {24}
OMP: pid 1141073 tid 1141157 thread 18 bound to OS proc set {18}
OMP: pid 1141073 tid 1141156 thread 17 bound to OS proc set {17}
OMP: pid 1141073 tid 1141162 thread 23 bound to OS proc set {23}
OMP: pid 1141073 tid 1141159 thread 20 bound to OS proc set {20}
OMP: pid 1141073 tid 1141161 thread 22 bound to OS proc set {22}
OMP: pid 1141073 tid 1141167 thread 28 bound to OS proc set {28}
OMP: pid 1141073 tid 1141160 thread 21 bound to OS proc set {21}
OMP: pid 1141073 tid 1141190 thread 51 bound to OS proc set {51}
OMP: pid 1141073 tid 1141189 thread 50 bound to OS proc set {50}
OMP: pid 1141073 tid 1141202 thread 63 bound to OS proc set {63}
OMP: pid 1141073 tid 1141171 thread 32 bound to OS proc set {32}
OMP: pid 1141073 tid 1141188 thread 49 bound to OS proc set {49}
OMP: pid 1141073 tid 1141140 thread 1 bound to OS proc set {1}
OMP: pid 1141073 tid 1141182 thread 43 bound to OS proc set {43}
OMP: pid 1141073 tid 1141199 thread 60 bound to OS proc set {60}
OMP: pid 1141073 tid 1141169 thread 30 bound to OS proc set {30}
OMP: pid 1141073 tid 1141185 thread 46 bound to OS proc set {46}
OMP: pid 1141073 tid 1141186 thread 47 bound to OS proc set {47}
OMP: pid 1141073 tid 1141166 thread 27 bound to OS proc set {27}
OMP: pid 1141073 tid 1141173 thread 34 bound to OS proc set {34}
OMP: pid 1141073 tid 1141201 thread 62 bound to OS proc set {62}
OMP: pid 1141073 tid 1141170 thread 31 bound to OS proc set {31}
OMP: pid 1141073 tid 1141175 thread 36 bound to OS proc set {36}
OMP: pid 1141073 tid 1141194 thread 55 bound to OS proc set {55}
OMP: pid 1141073 tid 1141198 thread 59 bound to OS proc set {59}
OMP: pid 1141073 tid 1141165 thread 26 bound to OS proc set {26}
OMP: pid 1141073 tid 1141197 thread 58 bound to OS proc set {58}
OMP: pid 1141073 tid 1141184 thread 45 bound to OS proc set {45}
OMP: pid 1141073 tid 1141174 thread 35 bound to OS proc set {35}
OMP: pid 1141073 tid 1141172 thread 33 bound to OS proc set {33}
OMP: pid 1141073 tid 1141187 thread 48 bound to OS proc set {48}
OMP: pid 1141073 tid 1141164 thread 25 bound to OS proc set {25}
OMP: pid 1141073 tid 1141181 thread 42 bound to OS proc set {42}
OMP: pid 1141073 tid 1141177 thread 38 bound to OS proc set {38}
OMP: pid 1141073 tid 1141196 thread 57 bound to OS proc set {57}
OMP: pid 1141073 tid 1141193 thread 54 bound to OS proc set {54}
OMP: pid 1141073 tid 1141200 thread 61 bound to OS proc set {61}
OMP: pid 1141073 tid 1141183 thread 44 bound to OS proc set {44}
OMP: pid 1141073 tid 1141178 thread 39 bound to OS proc set {39}
OMP: pid 1141073 tid 1141180 thread 41 bound to OS proc set {41}
OMP: pid 1141073 tid 1141191 thread 52 bound to OS proc set {52}
OMP: pid 1141073 tid 1141179 thread 40 bound to OS proc set {40}
OMP: pid 1141073 tid 1141192 thread 53 bound to OS proc set {53}
OMP: pid 1141073 tid 1141195 thread 56 bound to OS proc set {56}
OMP: pid 1141073 tid 1141168 thread 29 bound to OS proc set {29}
OMP: pid 1141073 tid 1141176 thread 37 bound to OS proc set {37}
OMP: pid 1141073 tid 1141144 thread 5 bound to OS proc set {5}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.230421, "speed_pp": 555.504883, "t_tg": 0.000000, "speed_tg": nan, "t": 0.230421, "speed": 555.504883}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_10

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-4104/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-34-35/tools/lprof_npsu_run_10  #
########################################################################################################################################################################################################################################
