Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 370.551605, "speed_pp": 5.526896, "t_tg": 0.000000, "speed_tg": nan, "t": 370.551605, "speed": 5.526896}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_0

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_0  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3670738 tid 3670738 thread 0 bound to OS proc set {0}
OMP: pid 3670738 tid 3670805 thread 1 bound to OS proc set {32}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 188.658615, "speed_pp": 10.855587, "t_tg": 0.000000, "speed_tg": nan, "t": 188.658615, "speed": 10.855587}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_1

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_1  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3671809 tid 3671809 thread 0 bound to OS proc set {0}
OMP: pid 3671809 tid 3671876 thread 1 bound to OS proc set {16}
OMP: pid 3671809 tid 3671877 thread 2 bound to OS proc set {32}
OMP: pid 3671809 tid 3671878 thread 3 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 98.215858, "speed_pp": 20.852030, "t_tg": 0.000000, "speed_tg": nan, "t": 98.215858, "speed": 20.852030}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_2

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_2  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3674602 tid 3674602 thread 0 bound to OS proc set {0}
OMP: pid 3674602 tid 3674671 thread 3 bound to OS proc set {24}
OMP: pid 3674602 tid 3674670 thread 2 bound to OS proc set {16}
OMP: pid 3674602 tid 3674672 thread 4 bound to OS proc set {32}
OMP: pid 3674602 tid 3674669 thread 1 bound to OS proc set {8}
OMP: pid 3674602 tid 3674674 thread 6 bound to OS proc set {48}
OMP: pid 3674602 tid 3674673 thread 5 bound to OS proc set {40}
OMP: pid 3674602 tid 3674675 thread 7 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 59.578121, "speed_pp": 34.375034, "t_tg": 0.000000, "speed_tg": nan, "t": 59.578121, "speed": 34.375034}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_3

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_3  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3680934 tid 3680934 thread 0 bound to OS proc set {0}
OMP: pid 3680934 tid 3681002 thread 2 bound to OS proc set {8}
OMP: pid 3680934 tid 3681001 thread 1 bound to OS proc set {4}
OMP: pid 3680934 tid 3681012 thread 12 bound to OS proc set {48}
OMP: pid 3680934 tid 3681003 thread 3 bound to OS proc set {12}
OMP: pid 3680934 tid 3681014 thread 14 bound to OS proc set {56}
OMP: pid 3680934 tid 3681013 thread 13 bound to OS proc set {52}
OMP: pid 3680934 tid 3681004 thread 4 bound to OS proc set {16}
OMP: pid 3680934 tid 3681008 thread 8 bound to OS proc set {32}
OMP: pid 3680934 tid 3681011 thread 11 bound to OS proc set {44}
OMP: pid 3680934 tid 3681007 thread 7 bound to OS proc set {28}
OMP: pid 3680934 tid 3681010 thread 10 bound to OS proc set {40}
OMP: pid 3680934 tid 3681006 thread 6 bound to OS proc set {24}
OMP: pid 3680934 tid 3681005 thread 5 bound to OS proc set {20}
OMP: pid 3680934 tid 3681009 thread 9 bound to OS proc set {36}
OMP: pid 3680934 tid 3681015 thread 15 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 41.766655, "speed_pp": 49.034332, "t_tg": 0.000000, "speed_tg": nan, "t": 41.766655, "speed": 49.034332}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_4

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_4  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3694296 tid 3694296 thread 0 bound to OS proc set {0}
OMP: pid 3694296 tid 3694363 thread 1 bound to OS proc set {2}
OMP: pid 3694296 tid 3694370 thread 8 bound to OS proc set {21}
OMP: pid 3694296 tid 3694365 thread 3 bound to OS proc set {8}
OMP: pid 3694296 tid 3694378 thread 16 bound to OS proc set {43}
OMP: pid 3694296 tid 3694377 thread 15 bound to OS proc set {40}
OMP: pid 3694296 tid 3694374 thread 12 bound to OS proc set {32}
OMP: pid 3694296 tid 3694371 thread 9 bound to OS proc set {24}
OMP: pid 3694296 tid 3694366 thread 4 bound to OS proc set {10}
OMP: pid 3694296 tid 3694368 thread 6 bound to OS proc set {16}
OMP: pid 3694296 tid 3694381 thread 19 bound to OS proc set {51}
OMP: pid 3694296 tid 3694379 thread 17 bound to OS proc set {46}
OMP: pid 3694296 tid 3694380 thread 18 bound to OS proc set {48}
OMP: pid 3694296 tid 3694369 thread 7 bound to OS proc set {18}
OMP: pid 3694296 tid 3694376 thread 14 bound to OS proc set {37}
OMP: pid 3694296 tid 3694382 thread 20 bound to OS proc set {54}
OMP: pid 3694296 tid 3694373 thread 11 bound to OS proc set {29}
OMP: pid 3694296 tid 3694367 thread 5 bound to OS proc set {13}
OMP: pid 3694296 tid 3694364 thread 2 bound to OS proc set {5}
OMP: pid 3694296 tid 3694372 thread 10 bound to OS proc set {27}
OMP: pid 3694296 tid 3694383 thread 21 bound to OS proc set {56}
OMP: pid 3694296 tid 3694375 thread 13 bound to OS proc set {35}
OMP: pid 3694296 tid 3694384 thread 22 bound to OS proc set {59}
OMP: pid 3694296 tid 3694385 thread 23 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 35.585953, "speed_pp": 57.550800, "t_tg": 0.000000, "speed_tg": nan, "t": 35.585953, "speed": 57.550800}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_5

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_5  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3714790 tid 3714790 thread 0 bound to OS proc set {0}
OMP: pid 3714790 tid 3714857 thread 1 bound to OS proc set {2}
OMP: pid 3714790 tid 3714858 thread 2 bound to OS proc set {4}
OMP: pid 3714790 tid 3714865 thread 9 bound to OS proc set {18}
OMP: pid 3714790 tid 3714863 thread 7 bound to OS proc set {14}
OMP: pid 3714790 tid 3714859 thread 3 bound to OS proc set {6}
OMP: pid 3714790 tid 3714860 thread 4 bound to OS proc set {8}
OMP: pid 3714790 tid 3714869 thread 13 bound to OS proc set {26}
OMP: pid 3714790 tid 3714862 thread 6 bound to OS proc set {12}
OMP: pid 3714790 tid 3714864 thread 8 bound to OS proc set {16}
OMP: pid 3714790 tid 3714866 thread 10 bound to OS proc set {20}
OMP: pid 3714790 tid 3714884 thread 28 bound to OS proc set {56}
OMP: pid 3714790 tid 3714861 thread 5 bound to OS proc set {10}
OMP: pid 3714790 tid 3714874 thread 18 bound to OS proc set {36}
OMP: pid 3714790 tid 3714871 thread 15 bound to OS proc set {30}
OMP: pid 3714790 tid 3714870 thread 14 bound to OS proc set {28}
OMP: pid 3714790 tid 3714872 thread 16 bound to OS proc set {32}
OMP: pid 3714790 tid 3714886 thread 30 bound to OS proc set {60}
OMP: pid 3714790 tid 3714875 thread 19 bound to OS proc set {38}
OMP: pid 3714790 tid 3714868 thread 12 bound to OS proc set {24}
OMP: pid 3714790 tid 3714885 thread 29 bound to OS proc set {58}
OMP: pid 3714790 tid 3714867 thread 11 bound to OS proc set {22}
OMP: pid 3714790 tid 3714873 thread 17 bound to OS proc set {34}
OMP: pid 3714790 tid 3714887 thread 31 bound to OS proc set {62}
OMP: pid 3714790 tid 3714880 thread 24 bound to OS proc set {48}
OMP: pid 3714790 tid 3714883 thread 27 bound to OS proc set {54}
OMP: pid 3714790 tid 3714882 thread 26 bound to OS proc set {52}
OMP: pid 3714790 tid 3714876 thread 20 bound to OS proc set {40}
OMP: pid 3714790 tid 3714879 thread 23 bound to OS proc set {46}
OMP: pid 3714790 tid 3714878 thread 22 bound to OS proc set {44}
OMP: pid 3714790 tid 3714877 thread 21 bound to OS proc set {42}
OMP: pid 3714790 tid 3714881 thread 25 bound to OS proc set {50}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 33.017967, "speed_pp": 62.026836, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 33.017967, "speed": 62.026836}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_6

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_6  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3742313 tid 3742313 thread 0 bound to OS proc set {0}
OMP: pid 3742313 tid 3742380 thread 1 bound to OS proc set {1}
OMP: pid 3742313 tid 3742381 thread 2 bound to OS proc set {3}
OMP: pid 3742313 tid 3742386 thread 7 bound to OS proc set {11}
OMP: pid 3742313 tid 3742382 thread 3 bound to OS proc set {4}
OMP: pid 3742313 tid 3742393 thread 14 bound to OS proc set {22}
OMP: pid 3742313 tid 3742391 thread 12 bound to OS proc set {19}
OMP: pid 3742313 tid 3742383 thread 4 bound to OS proc set {6}
OMP: pid 3742313 tid 3742411 thread 32 bound to OS proc set {52}
OMP: pid 3742313 tid 3742389 thread 10 bound to OS proc set {16}
OMP: pid 3742313 tid 3742385 thread 6 bound to OS proc set {9}
OMP: pid 3742313 tid 3742414 thread 35 bound to OS proc set {56}
OMP: pid 3742313 tid 3742387 thread 8 bound to OS proc set {13}
OMP: pid 3742313 tid 3742397 thread 18 bound to OS proc set {29}
OMP: pid 3742313 tid 3742415 thread 36 bound to OS proc set {58}
OMP: pid 3742313 tid 3742403 thread 24 bound to OS proc set {39}
OMP: pid 3742313 tid 3742390 thread 11 bound to OS proc set {17}
OMP: pid 3742313 tid 3742388 thread 9 bound to OS proc set {14}
OMP: pid 3742313 tid 3742413 thread 34 bound to OS proc set {55}
OMP: pid 3742313 tid 3742384 thread 5 bound to OS proc set {8}
OMP: pid 3742313 tid 3742417 thread 38 bound to OS proc set {61}
OMP: pid 3742313 tid 3742418 thread 39 bound to OS proc set {63}
OMP: pid 3742313 tid 3742410 thread 31 bound to OS proc set {50}
OMP: pid 3742313 tid 3742412 thread 33 bound to OS proc set {53}
OMP: pid 3742313 tid 3742394 thread 15 bound to OS proc set {24}
OMP: pid 3742313 tid 3742402 thread 23 bound to OS proc set {37}
OMP: pid 3742313 tid 3742406 thread 27 bound to OS proc set {43}
OMP: pid 3742313 tid 3742398 thread 19 bound to OS proc set {30}
OMP: pid 3742313 tid 3742392 thread 13 bound to OS proc set {21}
OMP: pid 3742313 tid 3742399 thread 20 bound to OS proc set {32}
OMP: pid 3742313 tid 3742407 thread 28 bound to OS proc set {45}
OMP: pid 3742313 tid 3742401 thread 22 bound to OS proc set {35}
OMP: pid 3742313 tid 3742396 thread 17 bound to OS proc set {27}
OMP: pid 3742313 tid 3742409 thread 30 bound to OS proc set {48}
OMP: pid 3742313 tid 3742416 thread 37 bound to OS proc set {60}
OMP: pid 3742313 tid 3742395 thread 16 bound to OS proc set {26}
OMP: pid 3742313 tid 3742408 thread 29 bound to OS proc set {47}
OMP: pid 3742313 tid 3742405 thread 26 bound to OS proc set {42}
OMP: pid 3742313 tid 3742404 thread 25 bound to OS proc set {40}
OMP: pid 3742313 tid 3742400 thread 21 bound to OS proc set {34}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 32.862473, "speed_pp": 62.320328, "t_tg": 0.000000, "speed_tg": nan, "t": 32.862473, "speed": 62.320328}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_7

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_7  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3776966 tid 3776966 thread 0 bound to OS proc set {0}
OMP: pid 3776966 tid 3777034 thread 2 bound to OS proc set {2}
OMP: pid 3776966 tid 3777033 thread 1 bound to OS proc set {1}
OMP: pid 3776966 tid 3777044 thread 12 bound to OS proc set {16}
OMP: pid 3776966 tid 3777042 thread 10 bound to OS proc set {13}
OMP: pid 3776966 tid 3777043 thread 11 bound to OS proc set {14}
OMP: pid 3776966 tid 3777035 thread 3 bound to OS proc set {4}
OMP: pid 3776966 tid 3777060 thread 28 bound to OS proc set {37}
OMP: pid 3776966 tid 3777045 thread 13 bound to OS proc set {17}
OMP: pid 3776966 tid 3777038 thread 6 bound to OS proc set {8}
OMP: pid 3776966 tid 3777047 thread 15 bound to OS proc set {20}
OMP: pid 3776966 tid 3777051 thread 19 bound to OS proc set {25}
OMP: pid 3776966 tid 3777056 thread 24 bound to OS proc set {32}
OMP: pid 3776966 tid 3777058 thread 26 bound to OS proc set {35}
OMP: pid 3776966 tid 3777039 thread 7 bound to OS proc set {9}
OMP: pid 3776966 tid 3777076 thread 44 bound to OS proc set {59}
OMP: pid 3776966 tid 3777041 thread 9 bound to OS proc set {12}
OMP: pid 3776966 tid 3777037 thread 5 bound to OS proc set {6}
OMP: pid 3776966 tid 3777036 thread 4 bound to OS proc set {5}
OMP: pid 3776966 tid 3777079 thread 47 bound to OS proc set {63}
OMP: pid 3776966 tid 3777040 thread 8 bound to OS proc set {10}
OMP: pid 3776966 tid 3777046 thread 14 bound to OS proc set {18}
OMP: pid 3776966 tid 3777065 thread 33 bound to OS proc set {44}
OMP: pid 3776966 tid 3777066 thread 34 bound to OS proc set {46}
OMP: pid 3776966 tid 3777059 thread 27 bound to OS proc set {36}
OMP: pid 3776966 tid 3777057 thread 25 bound to OS proc set {33}
OMP: pid 3776966 tid 3777072 thread 40 bound to OS proc set {54}
OMP: pid 3776966 tid 3777055 thread 23 bound to OS proc set {31}
OMP: pid 3776966 tid 3777064 thread 32 bound to OS proc set {43}
OMP: pid 3776966 tid 3777067 thread 35 bound to OS proc set {47}
OMP: pid 3776966 tid 3777075 thread 43 bound to OS proc set {58}
OMP: pid 3776966 tid 3777048 thread 16 bound to OS proc set {21}
OMP: pid 3776966 tid 3777052 thread 20 bound to OS proc set {27}
OMP: pid 3776966 tid 3777049 thread 17 bound to OS proc set {23}
OMP: pid 3776966 tid 3777062 thread 30 bound to OS proc set {40}
OMP: pid 3776966 tid 3777050 thread 18 bound to OS proc set {24}
OMP: pid 3776966 tid 3777068 thread 36 bound to OS proc set {48}
OMP: pid 3776966 tid 3777063 thread 31 bound to OS proc set {41}
OMP: pid 3776966 tid 3777054 thread 22 bound to OS proc set {29}
OMP: pid 3776966 tid 3777078 thread 46 bound to OS proc set {62}
OMP: pid 3776966 tid 3777053 thread 21 bound to OS proc set {28}
OMP: pid 3776966 tid 3777061 thread 29 bound to OS proc set {39}
OMP: pid 3776966 tid 3777074 thread 42 bound to OS proc set {56}
OMP: pid 3776966 tid 3777071 thread 39 bound to OS proc set {52}
OMP: pid 3776966 tid 3777070 thread 38 bound to OS proc set {51}
OMP: pid 3776966 tid 3777077 thread 45 bound to OS proc set {60}
OMP: pid 3776966 tid 3777073 thread 41 bound to OS proc set {55}
OMP: pid 3776966 tid 3777069 thread 37 bound to OS proc set {50}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 33.199364, "speed_pp": 61.687931, "t_tg": 0.000000, "speed_tg": nan, "t": 33.199364, "speed": 61.687931}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_8

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_8  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3818650 tid 3818650 thread 0 bound to OS proc set {0}
OMP: pid 3818650 tid 3818718 thread 2 bound to OS proc set {2}
OMP: pid 3818650 tid 3818719 thread 3 bound to OS proc set {3}
OMP: pid 3818650 tid 3818717 thread 1 bound to OS proc set {1}
OMP: pid 3818650 tid 3818720 thread 4 bound to OS proc set {4}
OMP: pid 3818650 tid 3818722 thread 6 bound to OS proc set {6}
OMP: pid 3818650 tid 3818721 thread 5 bound to OS proc set {5}
OMP: pid 3818650 tid 3818727 thread 11 bound to OS proc set {12}
OMP: pid 3818650 tid 3818726 thread 10 bound to OS proc set {11}
OMP: pid 3818650 tid 3818723 thread 7 bound to OS proc set {8}
OMP: pid 3818650 tid 3818771 thread 55 bound to OS proc set {63}
OMP: pid 3818650 tid 3818747 thread 31 bound to OS proc set {35}
OMP: pid 3818650 tid 3818767 thread 51 bound to OS proc set {59}
OMP: pid 3818650 tid 3818765 thread 49 bound to OS proc set {56}
OMP: pid 3818650 tid 3818730 thread 14 bound to OS proc set {16}
OMP: pid 3818650 tid 3818732 thread 16 bound to OS proc set {18}
OMP: pid 3818650 tid 3818766 thread 50 bound to OS proc set {58}
OMP: pid 3818650 tid 3818763 thread 47 bound to OS proc set {54}
OMP: pid 3818650 tid 3818744 thread 28 bound to OS proc set {32}
OMP: pid 3818650 tid 3818728 thread 12 bound to OS proc set {13}
OMP: pid 3818650 tid 3818768 thread 52 bound to OS proc set {60}
OMP: pid 3818650 tid 3818729 thread 13 bound to OS proc set {15}
OMP: pid 3818650 tid 3818725 thread 9 bound to OS proc set {10}
OMP: pid 3818650 tid 3818746 thread 30 bound to OS proc set {34}
OMP: pid 3818650 tid 3818724 thread 8 bound to OS proc set {9}
OMP: pid 3818650 tid 3818758 thread 42 bound to OS proc set {48}
OMP: pid 3818650 tid 3818764 thread 48 bound to OS proc set {55}
OMP: pid 3818650 tid 3818734 thread 18 bound to OS proc set {20}
OMP: pid 3818650 tid 3818751 thread 35 bound to OS proc set {40}
OMP: pid 3818650 tid 3818760 thread 44 bound to OS proc set {51}
OMP: pid 3818650 tid 3818743 thread 27 bound to OS proc set {31}
OMP: pid 3818650 tid 3818745 thread 29 bound to OS proc set {33}
OMP: pid 3818650 tid 3818731 thread 15 bound to OS proc set {17}
OMP: pid 3818650 tid 3818748 thread 32 bound to OS proc set {37}
OMP: pid 3818650 tid 3818752 thread 36 bound to OS proc set {41}
OMP: pid 3818650 tid 3818762 thread 46 bound to OS proc set {53}
OMP: pid 3818650 tid 3818742 thread 26 bound to OS proc set {30}
OMP: pid 3818650 tid 3818750 thread 34 bound to OS proc set {39}
OMP: pid 3818650 tid 3818754 thread 38 bound to OS proc set {44}
OMP: pid 3818650 tid 3818740 thread 24 bound to OS proc set {27}
OMP: pid 3818650 tid 3818770 thread 54 bound to OS proc set {62}
OMP: pid 3818650 tid 3818757 thread 41 bound to OS proc set {47}
OMP: pid 3818650 tid 3818759 thread 43 bound to OS proc set {49}
OMP: pid 3818650 tid 3818749 thread 33 bound to OS proc set {38}
OMP: pid 3818650 tid 3818769 thread 53 bound to OS proc set {61}
OMP: pid 3818650 tid 3818761 thread 45 bound to OS proc set {52}
OMP: pid 3818650 tid 3818756 thread 40 bound to OS proc set {46}
OMP: pid 3818650 tid 3818755 thread 39 bound to OS proc set {45}
OMP: pid 3818650 tid 3818753 thread 37 bound to OS proc set {42}
OMP: pid 3818650 tid 3818741 thread 25 bound to OS proc set {29}
OMP: pid 3818650 tid 3818736 thread 20 bound to OS proc set {23}
OMP: pid 3818650 tid 3818733 thread 17 bound to OS proc set {19}
OMP: pid 3818650 tid 3818738 thread 22 bound to OS proc set {25}
OMP: pid 3818650 tid 3818739 thread 23 bound to OS proc set {26}
OMP: pid 3818650 tid 3818735 thread 19 bound to OS proc set {22}
OMP: pid 3818650 tid 3818737 thread 21 bound to OS proc set {24}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 33.227486, "speed_pp": 61.635719, "t_tg": 0.000000, "speed_tg": nan, "t": 33.227486, "speed": 61.635719}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_9

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_9  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3867475 tid 3867475 thread 0 bound to OS proc set {0}
OMP: pid 3867475 tid 3867556 thread 15 bound to OS proc set {15}
OMP: pid 3867475 tid 3867553 thread 12 bound to OS proc set {12}
OMP: pid 3867475 tid 3867544 thread 3 bound to OS proc set {3}
OMP: pid 3867475 tid 3867543 thread 2 bound to OS proc set {2}
OMP: pid 3867475 tid 3867555 thread 14 bound to OS proc set {14}
OMP: pid 3867475 tid 3867549 thread 8 bound to OS proc set {8}
OMP: pid 3867475 tid 3867552 thread 11 bound to OS proc set {11}
OMP: pid 3867475 tid 3867554 thread 13 bound to OS proc set {13}
OMP: pid 3867475 tid 3867548 thread 7 bound to OS proc set {7}
OMP: pid 3867475 tid 3867551 thread 10 bound to OS proc set {10}
OMP: pid 3867475 tid 3867557 thread 16 bound to OS proc set {16}
OMP: pid 3867475 tid 3867560 thread 19 bound to OS proc set {19}
OMP: pid 3867475 tid 3867572 thread 31 bound to OS proc set {31}
OMP: pid 3867475 tid 3867569 thread 28 bound to OS proc set {28}
OMP: pid 3867475 tid 3867545 thread 4 bound to OS proc set {4}
OMP: pid 3867475 tid 3867547 thread 6 bound to OS proc set {6}
OMP: pid 3867475 tid 3867550 thread 9 bound to OS proc set {9}
OMP: pid 3867475 tid 3867559 thread 18 bound to OS proc set {18}
OMP: pid 3867475 tid 3867568 thread 27 bound to OS proc set {27}
OMP: pid 3867475 tid 3867571 thread 30 bound to OS proc set {30}
OMP: pid 3867475 tid 3867558 thread 17 bound to OS proc set {17}
OMP: pid 3867475 tid 3867565 thread 24 bound to OS proc set {24}
OMP: pid 3867475 tid 3867570 thread 29 bound to OS proc set {29}
OMP: pid 3867475 tid 3867567 thread 26 bound to OS proc set {26}
OMP: pid 3867475 tid 3867564 thread 23 bound to OS proc set {23}
OMP: pid 3867475 tid 3867566 thread 25 bound to OS proc set {25}
OMP: pid 3867475 tid 3867563 thread 22 bound to OS proc set {22}
OMP: pid 3867475 tid 3867561 thread 20 bound to OS proc set {20}
OMP: pid 3867475 tid 3867576 thread 35 bound to OS proc set {35}
OMP: pid 3867475 tid 3867573 thread 32 bound to OS proc set {32}
OMP: pid 3867475 tid 3867591 thread 50 bound to OS proc set {50}
OMP: pid 3867475 tid 3867542 thread 1 bound to OS proc set {1}
OMP: pid 3867475 tid 3867603 thread 62 bound to OS proc set {62}
OMP: pid 3867475 tid 3867604 thread 63 bound to OS proc set {63}
OMP: pid 3867475 tid 3867602 thread 61 bound to OS proc set {61}
OMP: pid 3867475 tid 3867593 thread 52 bound to OS proc set {52}
OMP: pid 3867475 tid 3867590 thread 49 bound to OS proc set {49}
OMP: pid 3867475 tid 3867574 thread 33 bound to OS proc set {33}
OMP: pid 3867475 tid 3867589 thread 48 bound to OS proc set {48}
OMP: pid 3867475 tid 3867592 thread 51 bound to OS proc set {51}
OMP: pid 3867475 tid 3867595 thread 54 bound to OS proc set {54}
OMP: pid 3867475 tid 3867575 thread 34 bound to OS proc set {34}
OMP: pid 3867475 tid 3867579 thread 38 bound to OS proc set {38}
OMP: pid 3867475 tid 3867581 thread 40 bound to OS proc set {40}
OMP: pid 3867475 tid 3867588 thread 47 bound to OS proc set {47}
OMP: pid 3867475 tid 3867585 thread 44 bound to OS proc set {44}
OMP: pid 3867475 tid 3867601 thread 60 bound to OS proc set {60}
OMP: pid 3867475 tid 3867597 thread 56 bound to OS proc set {56}
OMP: pid 3867475 tid 3867583 thread 42 bound to OS proc set {42}
OMP: pid 3867475 tid 3867599 thread 58 bound to OS proc set {58}
OMP: pid 3867475 tid 3867580 thread 39 bound to OS proc set {39}
OMP: pid 3867475 tid 3867596 thread 55 bound to OS proc set {55}
OMP: pid 3867475 tid 3867586 thread 45 bound to OS proc set {45}
OMP: pid 3867475 tid 3867578 thread 37 bound to OS proc set {37}
OMP: pid 3867475 tid 3867594 thread 53 bound to OS proc set {53}
OMP: pid 3867475 tid 3867582 thread 41 bound to OS proc set {41}
OMP: pid 3867475 tid 3867600 thread 59 bound to OS proc set {59}
OMP: pid 3867475 tid 3867598 thread 57 bound to OS proc set {57}
OMP: pid 3867475 tid 3867584 thread 43 bound to OS proc set {43}
OMP: pid 3867475 tid 3867546 thread 5 bound to OS proc set {5}
OMP: pid 3867475 tid 3867562 thread 21 bound to OS proc set {21}
OMP: pid 3867475 tid 3867577 thread 36 bound to OS proc set {36}
OMP: pid 3867475 tid 3867587 thread 46 bound to OS proc set {46}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 43.654484, "speed_pp": 46.913853, "t_tg": 0.000000, "speed_tg": nan, "t": 43.654484, "speed": 46.913853}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_10

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-415-7045/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_14-57-23/tools/lprof_npsu_run_10  #
########################################################################################################################################################################################################################################
