Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 207.327927, "speed_tg": 9.878071, "t": 207.327927, "speed": 9.878071}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_0

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_0  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 316251 tid 316251 thread 0 bound to OS proc set {0}
OMP: pid 316251 tid 316318 thread 1 bound to OS proc set {32}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 104.098969, "speed_tg": 19.673586, "t": 104.098969, "speed": 19.673586}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_1

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_1  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 316389 tid 316389 thread 0 bound to OS proc set {0}
OMP: pid 316389 tid 316458 thread 2 bound to OS proc set {32}
OMP: pid 316389 tid 316457 thread 1 bound to OS proc set {16}
OMP: pid 316389 tid 316459 thread 3 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 52.332424, "speed_tg": 39.134438, "t": 52.332424, "speed": 39.134438}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_2

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_2  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 316529 tid 316529 thread 0 bound to OS proc set {0}
OMP: pid 316529 tid 316596 thread 1 bound to OS proc set {8}
OMP: pid 316529 tid 316598 thread 3 bound to OS proc set {24}
OMP: pid 316529 tid 316597 thread 2 bound to OS proc set {16}
OMP: pid 316529 tid 316599 thread 4 bound to OS proc set {32}
OMP: pid 316529 tid 316601 thread 6 bound to OS proc set {48}
OMP: pid 316529 tid 316600 thread 5 bound to OS proc set {40}
OMP: pid 316529 tid 316602 thread 7 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 26.706421, "speed_tg": 76.685677, "t": 26.706423, "speed": 76.685677}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_3

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_3  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 316623 tid 316623 thread 0 bound to OS proc set {0}
OMP: pid 316623 tid 316693 thread 4 bound to OS proc set {16}
OMP: pid 316623 tid 316690 thread 1 bound to OS proc set {4}
OMP: pid 316623 tid 316701 thread 12 bound to OS proc set {48}
OMP: pid 316623 tid 316691 thread 2 bound to OS proc set {8}
OMP: pid 316623 tid 316692 thread 3 bound to OS proc set {12}
OMP: pid 316623 tid 316703 thread 14 bound to OS proc set {56}
OMP: pid 316623 tid 316697 thread 8 bound to OS proc set {32}
OMP: pid 316623 tid 316700 thread 11 bound to OS proc set {44}
OMP: pid 316623 tid 316696 thread 7 bound to OS proc set {28}
OMP: pid 316623 tid 316695 thread 6 bound to OS proc set {24}
OMP: pid 316623 tid 316702 thread 13 bound to OS proc set {52}
OMP: pid 316623 tid 316694 thread 5 bound to OS proc set {20}
OMP: pid 316623 tid 316699 thread 10 bound to OS proc set {40}
OMP: pid 316623 tid 316698 thread 9 bound to OS proc set {36}
OMP: pid 316623 tid 316704 thread 15 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 14.112069, "speed_tg": 145.124008, "t": 14.112069, "speed": 145.124008}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_4

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_4  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 316772 tid 316772 thread 0 bound to OS proc set {0}
OMP: pid 316772 tid 316845 thread 7 bound to OS proc set {18}
OMP: pid 316772 tid 316840 thread 2 bound to OS proc set {5}
OMP: pid 316772 tid 316839 thread 1 bound to OS proc set {2}
OMP: pid 316772 tid 316846 thread 8 bound to OS proc set {21}
OMP: pid 316772 tid 316841 thread 3 bound to OS proc set {8}
OMP: pid 316772 tid 316842 thread 4 bound to OS proc set {10}
OMP: pid 316772 tid 316844 thread 6 bound to OS proc set {16}
OMP: pid 316772 tid 316843 thread 5 bound to OS proc set {13}
OMP: pid 316772 tid 316854 thread 16 bound to OS proc set {43}
OMP: pid 316772 tid 316856 thread 18 bound to OS proc set {48}
OMP: pid 316772 tid 316847 thread 9 bound to OS proc set {24}
OMP: pid 316772 tid 316857 thread 19 bound to OS proc set {51}
OMP: pid 316772 tid 316850 thread 12 bound to OS proc set {32}
OMP: pid 316772 tid 316853 thread 15 bound to OS proc set {40}
OMP: pid 316772 tid 316852 thread 14 bound to OS proc set {37}
OMP: pid 316772 tid 316849 thread 11 bound to OS proc set {29}
OMP: pid 316772 tid 316855 thread 17 bound to OS proc set {46}
OMP: pid 316772 tid 316858 thread 20 bound to OS proc set {54}
OMP: pid 316772 tid 316851 thread 13 bound to OS proc set {35}
OMP: pid 316772 tid 316848 thread 10 bound to OS proc set {27}
OMP: pid 316772 tid 316859 thread 21 bound to OS proc set {56}
OMP: pid 316772 tid 316860 thread 22 bound to OS proc set {59}
OMP: pid 316772 tid 316861 thread 23 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 10.871305, "speed_tg": 188.385834, "t": 10.871305, "speed": 188.385834}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_5

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_5  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 316882 tid 316882 thread 0 bound to OS proc set {0}
OMP: pid 316882 tid 316960 thread 12 bound to OS proc set {24}
OMP: pid 316882 tid 316951 thread 3 bound to OS proc set {6}
OMP: pid 316882 tid 316949 thread 1 bound to OS proc set {2}
OMP: pid 316882 tid 316955 thread 7 bound to OS proc set {14}
OMP: pid 316882 tid 316956 thread 8 bound to OS proc set {16}
OMP: pid 316882 tid 316950 thread 2 bound to OS proc set {4}
OMP: pid 316882 tid 316958 thread 10 bound to OS proc set {20}
OMP: pid 316882 tid 316963 thread 15 bound to OS proc set {30}
OMP: pid 316882 tid 316954 thread 6 bound to OS proc set {12}
OMP: pid 316882 tid 316964 thread 16 bound to OS proc set {32}
OMP: pid 316882 tid 316952 thread 4 bound to OS proc set {8}
OMP: pid 316882 tid 316957 thread 9 bound to OS proc set {18}
OMP: pid 316882 tid 316967 thread 19 bound to OS proc set {38}
OMP: pid 316882 tid 316959 thread 11 bound to OS proc set {22}
OMP: pid 316882 tid 316976 thread 28 bound to OS proc set {56}
OMP: pid 316882 tid 316975 thread 27 bound to OS proc set {54}
OMP: pid 316882 tid 316978 thread 30 bound to OS proc set {60}
OMP: pid 316882 tid 316966 thread 18 bound to OS proc set {36}
OMP: pid 316882 tid 316953 thread 5 bound to OS proc set {10}
OMP: pid 316882 tid 316965 thread 17 bound to OS proc set {34}
OMP: pid 316882 tid 316961 thread 13 bound to OS proc set {26}
OMP: pid 316882 tid 316962 thread 14 bound to OS proc set {28}
OMP: pid 316882 tid 316968 thread 20 bound to OS proc set {40}
OMP: pid 316882 tid 316977 thread 29 bound to OS proc set {58}
OMP: pid 316882 tid 316971 thread 23 bound to OS proc set {46}
OMP: pid 316882 tid 316972 thread 24 bound to OS proc set {48}
OMP: pid 316882 tid 316974 thread 26 bound to OS proc set {52}
OMP: pid 316882 tid 316973 thread 25 bound to OS proc set {50}
OMP: pid 316882 tid 316969 thread 21 bound to OS proc set {42}
OMP: pid 316882 tid 316979 thread 31 bound to OS proc set {62}
OMP: pid 316882 tid 316970 thread 22 bound to OS proc set {44}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 8.962382, "speed_tg": 228.510666, "t": 8.962382, "speed": 228.510666}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_6

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_6  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 316999 tid 316999 thread 0 bound to OS proc set {0}
OMP: pid 316999 tid 317066 thread 1 bound to OS proc set {1}
OMP: pid 316999 tid 317067 thread 2 bound to OS proc set {3}
OMP: pid 316999 tid 317080 thread 15 bound to OS proc set {24}
OMP: pid 316999 tid 317097 thread 32 bound to OS proc set {52}
OMP: pid 316999 tid 317068 thread 3 bound to OS proc set {4}
OMP: pid 316999 tid 317079 thread 14 bound to OS proc set {22}
OMP: pid 316999 tid 317070 thread 5 bound to OS proc set {8}
OMP: pid 316999 tid 317071 thread 6 bound to OS proc set {9}
OMP: pid 316999 tid 317100 thread 35 bound to OS proc set {56}
OMP: pid 316999 tid 317076 thread 11 bound to OS proc set {17}
OMP: pid 316999 tid 317069 thread 4 bound to OS proc set {6}
OMP: pid 316999 tid 317099 thread 34 bound to OS proc set {55}
OMP: pid 316999 tid 317074 thread 9 bound to OS proc set {14}
OMP: pid 316999 tid 317098 thread 33 bound to OS proc set {53}
OMP: pid 316999 tid 317103 thread 38 bound to OS proc set {61}
OMP: pid 316999 tid 317077 thread 12 bound to OS proc set {19}
OMP: pid 316999 tid 317075 thread 10 bound to OS proc set {16}
OMP: pid 316999 tid 317093 thread 28 bound to OS proc set {45}
OMP: pid 316999 tid 317072 thread 7 bound to OS proc set {11}
OMP: pid 316999 tid 317104 thread 39 bound to OS proc set {63}
OMP: pid 316999 tid 317078 thread 13 bound to OS proc set {21}
OMP: pid 316999 tid 317089 thread 24 bound to OS proc set {39}
OMP: pid 316999 tid 317083 thread 18 bound to OS proc set {29}
OMP: pid 316999 tid 317095 thread 30 bound to OS proc set {48}
OMP: pid 316999 tid 317096 thread 31 bound to OS proc set {50}
OMP: pid 316999 tid 317102 thread 37 bound to OS proc set {60}
OMP: pid 316999 tid 317085 thread 20 bound to OS proc set {32}
OMP: pid 316999 tid 317084 thread 19 bound to OS proc set {30}
OMP: pid 316999 tid 317101 thread 36 bound to OS proc set {58}
OMP: pid 316999 tid 317092 thread 27 bound to OS proc set {43}
OMP: pid 316999 tid 317082 thread 17 bound to OS proc set {27}
OMP: pid 316999 tid 317086 thread 21 bound to OS proc set {34}
OMP: pid 316999 tid 317081 thread 16 bound to OS proc set {26}
OMP: pid 316999 tid 317094 thread 29 bound to OS proc set {47}
OMP: pid 316999 tid 317088 thread 23 bound to OS proc set {37}
OMP: pid 316999 tid 317087 thread 22 bound to OS proc set {35}
OMP: pid 316999 tid 317073 thread 8 bound to OS proc set {13}
OMP: pid 316999 tid 317090 thread 25 bound to OS proc set {40}
OMP: pid 316999 tid 317091 thread 26 bound to OS proc set {42}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 7.922470, "speed_tg": 258.505249, "t": 7.922471, "speed": 258.505219}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_7

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_7  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 317174 tid 317174 thread 0 bound to OS proc set {0}
OMP: pid 317174 tid 317242 thread 2 bound to OS proc set {2}
OMP: pid 317174 tid 317241 thread 1 bound to OS proc set {1}
OMP: pid 317174 tid 317252 thread 12 bound to OS proc set {16}
OMP: pid 317174 tid 317251 thread 11 bound to OS proc set {14}
OMP: pid 317174 tid 317243 thread 3 bound to OS proc set {4}
OMP: pid 317174 tid 317244 thread 4 bound to OS proc set {5}
OMP: pid 317174 tid 317250 thread 10 bound to OS proc set {13}
OMP: pid 317174 tid 317275 thread 35 bound to OS proc set {47}
OMP: pid 317174 tid 317247 thread 7 bound to OS proc set {9}
OMP: pid 317174 tid 317255 thread 15 bound to OS proc set {20}
OMP: pid 317174 tid 317246 thread 6 bound to OS proc set {8}
OMP: pid 317174 tid 317253 thread 13 bound to OS proc set {17}
OMP: pid 317174 tid 317249 thread 9 bound to OS proc set {12}
OMP: pid 317174 tid 317248 thread 8 bound to OS proc set {10}
OMP: pid 317174 tid 317273 thread 33 bound to OS proc set {44}
OMP: pid 317174 tid 317274 thread 34 bound to OS proc set {46}
OMP: pid 317174 tid 317270 thread 30 bound to OS proc set {40}
OMP: pid 317174 tid 317254 thread 14 bound to OS proc set {18}
OMP: pid 317174 tid 317245 thread 5 bound to OS proc set {6}
OMP: pid 317174 tid 317271 thread 31 bound to OS proc set {41}
OMP: pid 317174 tid 317284 thread 44 bound to OS proc set {59}
OMP: pid 317174 tid 317286 thread 46 bound to OS proc set {62}
OMP: pid 317174 tid 317263 thread 23 bound to OS proc set {31}
OMP: pid 317174 tid 317258 thread 18 bound to OS proc set {24}
OMP: pid 317174 tid 317256 thread 16 bound to OS proc set {21}
OMP: pid 317174 tid 317287 thread 47 bound to OS proc set {63}
OMP: pid 317174 tid 317283 thread 43 bound to OS proc set {58}
OMP: pid 317174 tid 317262 thread 22 bound to OS proc set {29}
OMP: pid 317174 tid 317272 thread 32 bound to OS proc set {43}
OMP: pid 317174 tid 317267 thread 27 bound to OS proc set {36}
OMP: pid 317174 tid 317257 thread 17 bound to OS proc set {23}
OMP: pid 317174 tid 317268 thread 28 bound to OS proc set {37}
OMP: pid 317174 tid 317266 thread 26 bound to OS proc set {35}
OMP: pid 317174 tid 317276 thread 36 bound to OS proc set {48}
OMP: pid 317174 tid 317269 thread 29 bound to OS proc set {39}
OMP: pid 317174 tid 317285 thread 45 bound to OS proc set {60}
OMP: pid 317174 tid 317265 thread 25 bound to OS proc set {33}
OMP: pid 317174 tid 317264 thread 24 bound to OS proc set {32}
OMP: pid 317174 tid 317261 thread 21 bound to OS proc set {28}
OMP: pid 317174 tid 317260 thread 20 bound to OS proc set {27}
OMP: pid 317174 tid 317279 thread 39 bound to OS proc set {52}
OMP: pid 317174 tid 317259 thread 19 bound to OS proc set {25}
OMP: pid 317174 tid 317280 thread 40 bound to OS proc set {54}
OMP: pid 317174 tid 317281 thread 41 bound to OS proc set {55}
OMP: pid 317174 tid 317278 thread 38 bound to OS proc set {51}
OMP: pid 317174 tid 317282 thread 42 bound to OS proc set {56}
OMP: pid 317174 tid 317277 thread 37 bound to OS proc set {50}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 7.256312, "speed_tg": 282.237030, "t": 7.256312, "speed": 282.237030}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_8

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_8  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 317308 tid 317308 thread 0 bound to OS proc set {0}
OMP: pid 317308 tid 317389 thread 15 bound to OS proc set {17}
OMP: pid 317308 tid 317388 thread 14 bound to OS proc set {16}
OMP: pid 317308 tid 317375 thread 1 bound to OS proc set {1}
OMP: pid 317308 tid 317387 thread 13 bound to OS proc set {15}
OMP: pid 317308 tid 317390 thread 16 bound to OS proc set {18}
OMP: pid 317308 tid 317392 thread 18 bound to OS proc set {20}
OMP: pid 317308 tid 317391 thread 17 bound to OS proc set {19}
OMP: pid 317308 tid 317377 thread 3 bound to OS proc set {3}
OMP: pid 317308 tid 317423 thread 49 bound to OS proc set {56}
OMP: pid 317308 tid 317386 thread 12 bound to OS proc set {13}
OMP: pid 317308 tid 317378 thread 4 bound to OS proc set {4}
OMP: pid 317308 tid 317385 thread 11 bound to OS proc set {12}
OMP: pid 317308 tid 317422 thread 48 bound to OS proc set {55}
OMP: pid 317308 tid 317382 thread 8 bound to OS proc set {9}
OMP: pid 317308 tid 317376 thread 2 bound to OS proc set {2}
OMP: pid 317308 tid 317424 thread 50 bound to OS proc set {58}
OMP: pid 317308 tid 317421 thread 47 bound to OS proc set {54}
OMP: pid 317308 tid 317426 thread 52 bound to OS proc set {60}
OMP: pid 317308 tid 317379 thread 5 bound to OS proc set {5}
OMP: pid 317308 tid 317425 thread 51 bound to OS proc set {59}
OMP: pid 317308 tid 317418 thread 44 bound to OS proc set {51}
OMP: pid 317308 tid 317406 thread 32 bound to OS proc set {37}
OMP: pid 317308 tid 317420 thread 46 bound to OS proc set {53}
OMP: pid 317308 tid 317381 thread 7 bound to OS proc set {8}
OMP: pid 317308 tid 317414 thread 40 bound to OS proc set {46}
OMP: pid 317308 tid 317416 thread 42 bound to OS proc set {48}
OMP: pid 317308 tid 317427 thread 53 bound to OS proc set {61}
OMP: pid 317308 tid 317405 thread 31 bound to OS proc set {35}
OMP: pid 317308 tid 317417 thread 43 bound to OS proc set {49}
OMP: pid 317308 tid 317428 thread 54 bound to OS proc set {62}
OMP: pid 317308 tid 317384 thread 10 bound to OS proc set {11}
OMP: pid 317308 tid 317419 thread 45 bound to OS proc set {52}
OMP: pid 317308 tid 317407 thread 33 bound to OS proc set {38}
OMP: pid 317308 tid 317383 thread 9 bound to OS proc set {10}
OMP: pid 317308 tid 317413 thread 39 bound to OS proc set {45}
OMP: pid 317308 tid 317402 thread 28 bound to OS proc set {32}
OMP: pid 317308 tid 317397 thread 23 bound to OS proc set {26}
OMP: pid 317308 tid 317400 thread 26 bound to OS proc set {30}
OMP: pid 317308 tid 317410 thread 36 bound to OS proc set {41}
OMP: pid 317308 tid 317415 thread 41 bound to OS proc set {47}
OMP: pid 317308 tid 317380 thread 6 bound to OS proc set {6}
OMP: pid 317308 tid 317404 thread 30 bound to OS proc set {34}
OMP: pid 317308 tid 317401 thread 27 bound to OS proc set {31}
OMP: pid 317308 tid 317409 thread 35 bound to OS proc set {40}
OMP: pid 317308 tid 317403 thread 29 bound to OS proc set {33}
OMP: pid 317308 tid 317412 thread 38 bound to OS proc set {44}
OMP: pid 317308 tid 317393 thread 19 bound to OS proc set {22}
OMP: pid 317308 tid 317396 thread 22 bound to OS proc set {25}
OMP: pid 317308 tid 317408 thread 34 bound to OS proc set {39}
OMP: pid 317308 tid 317394 thread 20 bound to OS proc set {23}
OMP: pid 317308 tid 317398 thread 24 bound to OS proc set {27}
OMP: pid 317308 tid 317411 thread 37 bound to OS proc set {42}
OMP: pid 317308 tid 317395 thread 21 bound to OS proc set {24}
OMP: pid 317308 tid 317399 thread 25 bound to OS proc set {29}
OMP: pid 317308 tid 317429 thread 55 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 6.779281, "speed_tg": 302.096924, "t": 6.779281, "speed": 302.096924}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_9

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_9  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 317449 tid 317449 thread 0 bound to OS proc set {0}
OMP: pid 317449 tid 317518 thread 3 bound to OS proc set {3}
OMP: pid 317449 tid 317517 thread 2 bound to OS proc set {2}
OMP: pid 317449 tid 317516 thread 1 bound to OS proc set {1}
OMP: pid 317449 tid 317519 thread 4 bound to OS proc set {4}
OMP: pid 317449 tid 317527 thread 12 bound to OS proc set {12}
OMP: pid 317449 tid 317547 thread 32 bound to OS proc set {32}
OMP: pid 317449 tid 317530 thread 15 bound to OS proc set {15}
OMP: pid 317449 tid 317520 thread 5 bound to OS proc set {5}
OMP: pid 317449 tid 317523 thread 8 bound to OS proc set {8}
OMP: pid 317449 tid 317529 thread 14 bound to OS proc set {14}
OMP: pid 317449 tid 317528 thread 13 bound to OS proc set {13}
OMP: pid 317449 tid 317566 thread 51 bound to OS proc set {51}
OMP: pid 317449 tid 317522 thread 7 bound to OS proc set {7}
OMP: pid 317449 tid 317534 thread 19 bound to OS proc set {19}
OMP: pid 317449 tid 317578 thread 63 bound to OS proc set {63}
OMP: pid 317449 tid 317526 thread 11 bound to OS proc set {11}
OMP: pid 317449 tid 317521 thread 6 bound to OS proc set {6}
OMP: pid 317449 tid 317563 thread 48 bound to OS proc set {48}
OMP: pid 317449 tid 317575 thread 60 bound to OS proc set {60}
OMP: pid 317449 tid 317546 thread 31 bound to OS proc set {31}
OMP: pid 317449 tid 317524 thread 9 bound to OS proc set {9}
OMP: pid 317449 tid 317550 thread 35 bound to OS proc set {35}
OMP: pid 317449 tid 317539 thread 24 bound to OS proc set {24}
OMP: pid 317449 tid 317531 thread 16 bound to OS proc set {16}
OMP: pid 317449 tid 317525 thread 10 bound to OS proc set {10}
OMP: pid 317449 tid 317543 thread 28 bound to OS proc set {28}
OMP: pid 317449 tid 317562 thread 47 bound to OS proc set {47}
OMP: pid 317449 tid 317559 thread 44 bound to OS proc set {44}
OMP: pid 317449 tid 317571 thread 56 bound to OS proc set {56}
OMP: pid 317449 tid 317564 thread 49 bound to OS proc set {49}
OMP: pid 317449 tid 317533 thread 18 bound to OS proc set {18}
OMP: pid 317449 tid 317542 thread 27 bound to OS proc set {27}
OMP: pid 317449 tid 317532 thread 17 bound to OS proc set {17}
OMP: pid 317449 tid 317551 thread 36 bound to OS proc set {36}
OMP: pid 317449 tid 317548 thread 33 bound to OS proc set {33}
OMP: pid 317449 tid 317545 thread 30 bound to OS proc set {30}
OMP: pid 317449 tid 317574 thread 59 bound to OS proc set {59}
OMP: pid 317449 tid 317549 thread 34 bound to OS proc set {34}
OMP: pid 317449 tid 317570 thread 55 bound to OS proc set {55}
OMP: pid 317449 tid 317552 thread 37 bound to OS proc set {37}
OMP: pid 317449 tid 317565 thread 50 bound to OS proc set {50}
OMP: pid 317449 tid 317555 thread 40 bound to OS proc set {40}
OMP: pid 317449 tid 317538 thread 23 bound to OS proc set {23}
OMP: pid 317449 tid 317558 thread 43 bound to OS proc set {43}
OMP: pid 317449 tid 317553 thread 38 bound to OS proc set {38}
OMP: pid 317449 tid 317536 thread 21 bound to OS proc set {21}
OMP: pid 317449 tid 317576 thread 61 bound to OS proc set {61}
OMP: pid 317449 tid 317537 thread 22 bound to OS proc set {22}
OMP: pid 317449 tid 317567 thread 52 bound to OS proc set {52}
OMP: pid 317449 tid 317540 thread 25 bound to OS proc set {25}
OMP: pid 317449 tid 317541 thread 26 bound to OS proc set {26}
OMP: pid 317449 tid 317554 thread 39 bound to OS proc set {39}
OMP: pid 317449 tid 317544 thread 29 bound to OS proc set {29}
OMP: pid 317449 tid 317577 thread 62 bound to OS proc set {62}
OMP: pid 317449 tid 317569 thread 54 bound to OS proc set {54}
OMP: pid 317449 tid 317568 thread 53 bound to OS proc set {53}
OMP: pid 317449 tid 317573 thread 58 bound to OS proc set {58}
OMP: pid 317449 tid 317535 thread 20 bound to OS proc set {20}
OMP: pid 317449 tid 317572 thread 57 bound to OS proc set {57}
OMP: pid 317449 tid 317557 thread 42 bound to OS proc set {42}
OMP: pid 317449 tid 317561 thread 46 bound to OS proc set {46}
OMP: pid 317449 tid 317560 thread 45 bound to OS proc set {45}
OMP: pid 317449 tid 317556 thread 41 bound to OS proc set {41}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 6.466614, "speed_tg": 316.703613, "t": 6.466614, "speed": 316.703613}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_10

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-8838/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-41-09/tools/lprof_npsu_run_10  #
########################################################################################################################################################################################################################################
