Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 119.496696, "speed_tg": 8.569275, "t": 119.496696, "speed": 8.569275}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_0

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_0  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 304484 tid 304484 thread 0 bound to OS proc set {0}
OMP: pid 304484 tid 304551 thread 1 bound to OS proc set {32}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 60.440556, "speed_tg": 16.942266, "t": 60.440556, "speed": 16.942266}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_1

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_1  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 304622 tid 304622 thread 0 bound to OS proc set {0}
OMP: pid 304622 tid 304690 thread 2 bound to OS proc set {32}
OMP: pid 304622 tid 304689 thread 1 bound to OS proc set {16}
OMP: pid 304622 tid 304691 thread 3 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 30.800768, "speed_tg": 33.245926, "t": 30.800770, "speed": 33.245922}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_2

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_2  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 304718 tid 304718 thread 0 bound to OS proc set {0}
OMP: pid 304718 tid 304787 thread 3 bound to OS proc set {24}
OMP: pid 304718 tid 304786 thread 2 bound to OS proc set {16}
OMP: pid 304718 tid 304788 thread 4 bound to OS proc set {32}
OMP: pid 304718 tid 304790 thread 6 bound to OS proc set {48}
OMP: pid 304718 tid 304789 thread 5 bound to OS proc set {40}
OMP: pid 304718 tid 304785 thread 1 bound to OS proc set {8}
OMP: pid 304718 tid 304791 thread 7 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 16.044230, "speed_tg": 63.823570, "t": 16.044231, "speed": 63.823563}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_3

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_3  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 304860 tid 304860 thread 0 bound to OS proc set {0}
OMP: pid 304860 tid 304927 thread 1 bound to OS proc set {4}
OMP: pid 304860 tid 304938 thread 12 bound to OS proc set {48}
OMP: pid 304860 tid 304929 thread 3 bound to OS proc set {12}
OMP: pid 304860 tid 304928 thread 2 bound to OS proc set {8}
OMP: pid 304860 tid 304940 thread 14 bound to OS proc set {56}
OMP: pid 304860 tid 304934 thread 8 bound to OS proc set {32}
OMP: pid 304860 tid 304939 thread 13 bound to OS proc set {52}
OMP: pid 304860 tid 304937 thread 11 bound to OS proc set {44}
OMP: pid 304860 tid 304936 thread 10 bound to OS proc set {40}
OMP: pid 304860 tid 304933 thread 7 bound to OS proc set {28}
OMP: pid 304860 tid 304935 thread 9 bound to OS proc set {36}
OMP: pid 304860 tid 304932 thread 6 bound to OS proc set {24}
OMP: pid 304860 tid 304931 thread 5 bound to OS proc set {20}
OMP: pid 304860 tid 304930 thread 4 bound to OS proc set {16}
OMP: pid 304860 tid 304941 thread 15 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 9.005549, "speed_tg": 113.707664, "t": 9.005549, "speed": 113.707664}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_4

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_4  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 304961 tid 304961 thread 0 bound to OS proc set {0}
OMP: pid 304961 tid 305028 thread 1 bound to OS proc set {2}
OMP: pid 304961 tid 305029 thread 2 bound to OS proc set {5}
OMP: pid 304961 tid 305035 thread 8 bound to OS proc set {21}
OMP: pid 304961 tid 305034 thread 7 bound to OS proc set {18}
OMP: pid 304961 tid 305031 thread 4 bound to OS proc set {10}
OMP: pid 304961 tid 305036 thread 9 bound to OS proc set {24}
OMP: pid 304961 tid 305042 thread 15 bound to OS proc set {40}
OMP: pid 304961 tid 305043 thread 16 bound to OS proc set {43}
OMP: pid 304961 tid 305038 thread 11 bound to OS proc set {29}
OMP: pid 304961 tid 305046 thread 19 bound to OS proc set {51}
OMP: pid 304961 tid 305041 thread 14 bound to OS proc set {37}
OMP: pid 304961 tid 305032 thread 5 bound to OS proc set {13}
OMP: pid 304961 tid 305030 thread 3 bound to OS proc set {8}
OMP: pid 304961 tid 305033 thread 6 bound to OS proc set {16}
OMP: pid 304961 tid 305039 thread 12 bound to OS proc set {32}
OMP: pid 304961 tid 305037 thread 10 bound to OS proc set {27}
OMP: pid 304961 tid 305045 thread 18 bound to OS proc set {48}
OMP: pid 304961 tid 305040 thread 13 bound to OS proc set {35}
OMP: pid 304961 tid 305047 thread 20 bound to OS proc set {54}
OMP: pid 304961 tid 305044 thread 17 bound to OS proc set {46}
OMP: pid 304961 tid 305049 thread 22 bound to OS proc set {59}
OMP: pid 304961 tid 305048 thread 21 bound to OS proc set {56}
OMP: pid 304961 tid 305050 thread 23 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 7.351552, "speed_tg": 139.290314, "t": 7.351552, "speed": 139.290314}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_5

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_5  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 305070 tid 305070 thread 0 bound to OS proc set {0}
OMP: pid 305070 tid 305140 thread 4 bound to OS proc set {8}
OMP: pid 305070 tid 305146 thread 10 bound to OS proc set {20}
OMP: pid 305070 tid 305144 thread 8 bound to OS proc set {16}
OMP: pid 305070 tid 305137 thread 1 bound to OS proc set {2}
OMP: pid 305070 tid 305138 thread 2 bound to OS proc set {4}
OMP: pid 305070 tid 305143 thread 7 bound to OS proc set {14}
OMP: pid 305070 tid 305142 thread 6 bound to OS proc set {12}
OMP: pid 305070 tid 305164 thread 28 bound to OS proc set {56}
OMP: pid 305070 tid 305145 thread 9 bound to OS proc set {18}
OMP: pid 305070 tid 305150 thread 14 bound to OS proc set {28}
OMP: pid 305070 tid 305141 thread 5 bound to OS proc set {10}
OMP: pid 305070 tid 305139 thread 3 bound to OS proc set {6}
OMP: pid 305070 tid 305148 thread 12 bound to OS proc set {24}
OMP: pid 305070 tid 305155 thread 19 bound to OS proc set {38}
OMP: pid 305070 tid 305154 thread 18 bound to OS proc set {36}
OMP: pid 305070 tid 305151 thread 15 bound to OS proc set {30}
OMP: pid 305070 tid 305166 thread 30 bound to OS proc set {60}
OMP: pid 305070 tid 305160 thread 24 bound to OS proc set {48}
OMP: pid 305070 tid 305163 thread 27 bound to OS proc set {54}
OMP: pid 305070 tid 305147 thread 11 bound to OS proc set {22}
OMP: pid 305070 tid 305162 thread 26 bound to OS proc set {52}
OMP: pid 305070 tid 305152 thread 16 bound to OS proc set {32}
OMP: pid 305070 tid 305149 thread 13 bound to OS proc set {26}
OMP: pid 305070 tid 305156 thread 20 bound to OS proc set {40}
OMP: pid 305070 tid 305153 thread 17 bound to OS proc set {34}
OMP: pid 305070 tid 305165 thread 29 bound to OS proc set {58}
OMP: pid 305070 tid 305158 thread 22 bound to OS proc set {44}
OMP: pid 305070 tid 305159 thread 23 bound to OS proc set {46}
OMP: pid 305070 tid 305157 thread 21 bound to OS proc set {42}
OMP: pid 305070 tid 305161 thread 25 bound to OS proc set {50}
OMP: pid 305070 tid 305167 thread 31 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 6.515601, "speed_tg": 157.161240, "t": 6.515601, "speed": 157.161240}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_6

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_6  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 305237 tid 305237 thread 0 bound to OS proc set {0}
OMP: pid 305237 tid 305304 thread 1 bound to OS proc set {1}
OMP: pid 305237 tid 305318 thread 15 bound to OS proc set {24}
OMP: pid 305237 tid 305305 thread 2 bound to OS proc set {3}
OMP: pid 305237 tid 305311 thread 8 bound to OS proc set {13}
OMP: pid 305237 tid 305306 thread 3 bound to OS proc set {4}
OMP: pid 305237 tid 305317 thread 14 bound to OS proc set {22}
OMP: pid 305237 tid 305309 thread 6 bound to OS proc set {9}
OMP: pid 305237 tid 305314 thread 11 bound to OS proc set {17}
OMP: pid 305237 tid 305315 thread 12 bound to OS proc set {19}
OMP: pid 305237 tid 305342 thread 39 bound to OS proc set {63}
OMP: pid 305237 tid 305313 thread 10 bound to OS proc set {16}
OMP: pid 305237 tid 305327 thread 24 bound to OS proc set {39}
OMP: pid 305237 tid 305321 thread 18 bound to OS proc set {29}
OMP: pid 305237 tid 305310 thread 7 bound to OS proc set {11}
OMP: pid 305237 tid 305312 thread 9 bound to OS proc set {14}
OMP: pid 305237 tid 305335 thread 32 bound to OS proc set {52}
OMP: pid 305237 tid 305308 thread 5 bound to OS proc set {8}
OMP: pid 305237 tid 305338 thread 35 bound to OS proc set {56}
OMP: pid 305237 tid 305307 thread 4 bound to OS proc set {6}
OMP: pid 305237 tid 305331 thread 28 bound to OS proc set {45}
OMP: pid 305237 tid 305334 thread 31 bound to OS proc set {50}
OMP: pid 305237 tid 305339 thread 36 bound to OS proc set {58}
OMP: pid 305237 tid 305330 thread 27 bound to OS proc set {43}
OMP: pid 305237 tid 305337 thread 34 bound to OS proc set {55}
OMP: pid 305237 tid 305322 thread 19 bound to OS proc set {30}
OMP: pid 305237 tid 305341 thread 38 bound to OS proc set {61}
OMP: pid 305237 tid 305336 thread 33 bound to OS proc set {53}
OMP: pid 305237 tid 305316 thread 13 bound to OS proc set {21}
OMP: pid 305237 tid 305333 thread 30 bound to OS proc set {48}
OMP: pid 305237 tid 305319 thread 16 bound to OS proc set {26}
OMP: pid 305237 tid 305323 thread 20 bound to OS proc set {32}
OMP: pid 305237 tid 305325 thread 22 bound to OS proc set {35}
OMP: pid 305237 tid 305340 thread 37 bound to OS proc set {60}
OMP: pid 305237 tid 305332 thread 29 bound to OS proc set {47}
OMP: pid 305237 tid 305320 thread 17 bound to OS proc set {27}
OMP: pid 305237 tid 305326 thread 23 bound to OS proc set {37}
OMP: pid 305237 tid 305328 thread 25 bound to OS proc set {40}
OMP: pid 305237 tid 305329 thread 26 bound to OS proc set {42}
OMP: pid 305237 tid 305324 thread 21 bound to OS proc set {34}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 6.033006, "speed_tg": 169.732956, "t": 6.033006, "speed": 169.732956}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_7

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_7  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 305362 tid 305362 thread 0 bound to OS proc set {0}
OMP: pid 305362 tid 305430 thread 2 bound to OS proc set {2}
OMP: pid 305362 tid 305429 thread 1 bound to OS proc set {1}
OMP: pid 305362 tid 305440 thread 12 bound to OS proc set {16}
OMP: pid 305362 tid 305441 thread 13 bound to OS proc set {17}
OMP: pid 305362 tid 305439 thread 11 bound to OS proc set {14}
OMP: pid 305362 tid 305443 thread 15 bound to OS proc set {20}
OMP: pid 305362 tid 305436 thread 8 bound to OS proc set {10}
OMP: pid 305362 tid 305435 thread 7 bound to OS proc set {9}
OMP: pid 305362 tid 305437 thread 9 bound to OS proc set {12}
OMP: pid 305362 tid 305431 thread 3 bound to OS proc set {4}
OMP: pid 305362 tid 305434 thread 6 bound to OS proc set {8}
OMP: pid 305362 tid 305472 thread 44 bound to OS proc set {59}
OMP: pid 305362 tid 305442 thread 14 bound to OS proc set {18}
OMP: pid 305362 tid 305438 thread 10 bound to OS proc set {13}
OMP: pid 305362 tid 305459 thread 31 bound to OS proc set {41}
OMP: pid 305362 tid 305433 thread 5 bound to OS proc set {6}
OMP: pid 305362 tid 305432 thread 4 bound to OS proc set {5}
OMP: pid 305362 tid 305474 thread 46 bound to OS proc set {62}
OMP: pid 305362 tid 305446 thread 18 bound to OS proc set {24}
OMP: pid 305362 tid 305451 thread 23 bound to OS proc set {31}
OMP: pid 305362 tid 305455 thread 27 bound to OS proc set {36}
OMP: pid 305362 tid 305475 thread 47 bound to OS proc set {63}
OMP: pid 305362 tid 305460 thread 32 bound to OS proc set {43}
OMP: pid 305362 tid 305463 thread 35 bound to OS proc set {47}
OMP: pid 305362 tid 305448 thread 20 bound to OS proc set {27}
OMP: pid 305362 tid 305444 thread 16 bound to OS proc set {21}
OMP: pid 305362 tid 305447 thread 19 bound to OS proc set {25}
OMP: pid 305362 tid 305454 thread 26 bound to OS proc set {35}
OMP: pid 305362 tid 305458 thread 30 bound to OS proc set {40}
OMP: pid 305362 tid 305471 thread 43 bound to OS proc set {58}
OMP: pid 305362 tid 305462 thread 34 bound to OS proc set {46}
OMP: pid 305362 tid 305461 thread 33 bound to OS proc set {44}
OMP: pid 305362 tid 305456 thread 28 bound to OS proc set {37}
OMP: pid 305362 tid 305452 thread 24 bound to OS proc set {32}
OMP: pid 305362 tid 305453 thread 25 bound to OS proc set {33}
OMP: pid 305362 tid 305445 thread 17 bound to OS proc set {23}
OMP: pid 305362 tid 305473 thread 45 bound to OS proc set {60}
OMP: pid 305362 tid 305464 thread 36 bound to OS proc set {48}
OMP: pid 305362 tid 305450 thread 22 bound to OS proc set {29}
OMP: pid 305362 tid 305457 thread 29 bound to OS proc set {39}
OMP: pid 305362 tid 305449 thread 21 bound to OS proc set {28}
OMP: pid 305362 tid 305468 thread 40 bound to OS proc set {54}
OMP: pid 305362 tid 305470 thread 42 bound to OS proc set {56}
OMP: pid 305362 tid 305466 thread 38 bound to OS proc set {51}
OMP: pid 305362 tid 305469 thread 41 bound to OS proc set {55}
OMP: pid 305362 tid 305467 thread 39 bound to OS proc set {52}
OMP: pid 305362 tid 305465 thread 37 bound to OS proc set {50}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.766068, "speed_tg": 177.590698, "t": 5.766068, "speed": 177.590698}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_8

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_8  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 305495 tid 305495 thread 0 bound to OS proc set {0}
OMP: pid 305495 tid 305576 thread 15 bound to OS proc set {17}
OMP: pid 305495 tid 305572 thread 11 bound to OS proc set {12}
OMP: pid 305495 tid 305569 thread 8 bound to OS proc set {9}
OMP: pid 305495 tid 305563 thread 2 bound to OS proc set {2}
OMP: pid 305495 tid 305575 thread 14 bound to OS proc set {16}
OMP: pid 305495 tid 305571 thread 10 bound to OS proc set {11}
OMP: pid 305495 tid 305574 thread 13 bound to OS proc set {15}
OMP: pid 305495 tid 305568 thread 7 bound to OS proc set {8}
OMP: pid 305495 tid 305570 thread 9 bound to OS proc set {10}
OMP: pid 305495 tid 305562 thread 1 bound to OS proc set {1}
OMP: pid 305495 tid 305577 thread 16 bound to OS proc set {18}
OMP: pid 305495 tid 305579 thread 18 bound to OS proc set {20}
OMP: pid 305495 tid 305578 thread 17 bound to OS proc set {19}
OMP: pid 305495 tid 305593 thread 32 bound to OS proc set {37}
OMP: pid 305495 tid 305564 thread 3 bound to OS proc set {3}
OMP: pid 305495 tid 305612 thread 51 bound to OS proc set {59}
OMP: pid 305495 tid 305573 thread 12 bound to OS proc set {13}
OMP: pid 305495 tid 305610 thread 49 bound to OS proc set {56}
OMP: pid 305495 tid 305608 thread 47 bound to OS proc set {54}
OMP: pid 305495 tid 305611 thread 50 bound to OS proc set {58}
OMP: pid 305495 tid 305565 thread 4 bound to OS proc set {4}
OMP: pid 305495 tid 305609 thread 48 bound to OS proc set {55}
OMP: pid 305495 tid 305607 thread 46 bound to OS proc set {53}
OMP: pid 305495 tid 305604 thread 43 bound to OS proc set {49}
OMP: pid 305495 tid 305605 thread 44 bound to OS proc set {51}
OMP: pid 305495 tid 305603 thread 42 bound to OS proc set {48}
OMP: pid 305495 tid 305600 thread 39 bound to OS proc set {45}
OMP: pid 305495 tid 305613 thread 52 bound to OS proc set {60}
OMP: pid 305495 tid 305616 thread 55 bound to OS proc set {63}
OMP: pid 305495 tid 305601 thread 40 bound to OS proc set {46}
OMP: pid 305495 tid 305615 thread 54 bound to OS proc set {62}
OMP: pid 305495 tid 305584 thread 23 bound to OS proc set {26}
OMP: pid 305495 tid 305591 thread 30 bound to OS proc set {34}
OMP: pid 305495 tid 305589 thread 28 bound to OS proc set {32}
OMP: pid 305495 tid 305592 thread 31 bound to OS proc set {35}
OMP: pid 305495 tid 305567 thread 6 bound to OS proc set {6}
OMP: pid 305495 tid 305566 thread 5 bound to OS proc set {5}
OMP: pid 305495 tid 305594 thread 33 bound to OS proc set {38}
OMP: pid 305495 tid 305583 thread 22 bound to OS proc set {25}
OMP: pid 305495 tid 305580 thread 19 bound to OS proc set {22}
OMP: pid 305495 tid 305606 thread 45 bound to OS proc set {52}
OMP: pid 305495 tid 305588 thread 27 bound to OS proc set {31}
OMP: pid 305495 tid 305590 thread 29 bound to OS proc set {33}
OMP: pid 305495 tid 305585 thread 24 bound to OS proc set {27}
OMP: pid 305495 tid 305602 thread 41 bound to OS proc set {47}
OMP: pid 305495 tid 305596 thread 35 bound to OS proc set {40}
OMP: pid 305495 tid 305597 thread 36 bound to OS proc set {41}
OMP: pid 305495 tid 305599 thread 38 bound to OS proc set {44}
OMP: pid 305495 tid 305595 thread 34 bound to OS proc set {39}
OMP: pid 305495 tid 305581 thread 20 bound to OS proc set {23}
OMP: pid 305495 tid 305598 thread 37 bound to OS proc set {42}
OMP: pid 305495 tid 305586 thread 25 bound to OS proc set {29}
OMP: pid 305495 tid 305582 thread 21 bound to OS proc set {24}
OMP: pid 305495 tid 305587 thread 26 bound to OS proc set {30}
OMP: pid 305495 tid 305614 thread 53 bound to OS proc set {61}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.599392, "speed_tg": 182.876999, "t": 5.599392, "speed": 182.876999}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_9

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_9  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 305636 tid 305636 thread 0 bound to OS proc set {0}
OMP: pid 305636 tid 305705 thread 3 bound to OS proc set {3}
OMP: pid 305636 tid 305704 thread 2 bound to OS proc set {2}
OMP: pid 305636 tid 305714 thread 12 bound to OS proc set {12}
OMP: pid 305636 tid 305710 thread 8 bound to OS proc set {8}
OMP: pid 305636 tid 305713 thread 11 bound to OS proc set {11}
OMP: pid 305636 tid 305712 thread 10 bound to OS proc set {10}
OMP: pid 305636 tid 305706 thread 4 bound to OS proc set {4}
OMP: pid 305636 tid 305715 thread 13 bound to OS proc set {13}
OMP: pid 305636 tid 305709 thread 7 bound to OS proc set {7}
OMP: pid 305636 tid 305711 thread 9 bound to OS proc set {9}
OMP: pid 305636 tid 305708 thread 6 bound to OS proc set {6}
OMP: pid 305636 tid 305764 thread 62 bound to OS proc set {62}
OMP: pid 305636 tid 305752 thread 50 bound to OS proc set {50}
OMP: pid 305636 tid 305717 thread 15 bound to OS proc set {15}
OMP: pid 305636 tid 305730 thread 28 bound to OS proc set {28}
OMP: pid 305636 tid 305718 thread 16 bound to OS proc set {16}
OMP: pid 305636 tid 305703 thread 1 bound to OS proc set {1}
OMP: pid 305636 tid 305736 thread 34 bound to OS proc set {34}
OMP: pid 305636 tid 305716 thread 14 bound to OS proc set {14}
OMP: pid 305636 tid 305721 thread 19 bound to OS proc set {19}
OMP: pid 305636 tid 305763 thread 61 bound to OS proc set {61}
OMP: pid 305636 tid 305753 thread 51 bound to OS proc set {51}
OMP: pid 305636 tid 305765 thread 63 bound to OS proc set {63}
OMP: pid 305636 tid 305751 thread 49 bound to OS proc set {49}
OMP: pid 305636 tid 305750 thread 48 bound to OS proc set {48}
OMP: pid 305636 tid 305762 thread 60 bound to OS proc set {60}
OMP: pid 305636 tid 305760 thread 58 bound to OS proc set {58}
OMP: pid 305636 tid 305729 thread 27 bound to OS proc set {27}
OMP: pid 305636 tid 305748 thread 46 bound to OS proc set {46}
OMP: pid 305636 tid 305719 thread 17 bound to OS proc set {17}
OMP: pid 305636 tid 305734 thread 32 bound to OS proc set {32}
OMP: pid 305636 tid 305737 thread 35 bound to OS proc set {35}
OMP: pid 305636 tid 305758 thread 56 bound to OS proc set {56}
OMP: pid 305636 tid 305749 thread 47 bound to OS proc set {47}
OMP: pid 305636 tid 305728 thread 26 bound to OS proc set {26}
OMP: pid 305636 tid 305756 thread 54 bound to OS proc set {54}
OMP: pid 305636 tid 305733 thread 31 bound to OS proc set {31}
OMP: pid 305636 tid 305746 thread 44 bound to OS proc set {44}
OMP: pid 305636 tid 305720 thread 18 bound to OS proc set {18}
OMP: pid 305636 tid 305726 thread 24 bound to OS proc set {24}
OMP: pid 305636 tid 305742 thread 40 bound to OS proc set {40}
OMP: pid 305636 tid 305740 thread 38 bound to OS proc set {38}
OMP: pid 305636 tid 305761 thread 59 bound to OS proc set {59}
OMP: pid 305636 tid 305722 thread 20 bound to OS proc set {20}
OMP: pid 305636 tid 305727 thread 25 bound to OS proc set {25}
OMP: pid 305636 tid 305739 thread 37 bound to OS proc set {37}
OMP: pid 305636 tid 305747 thread 45 bound to OS proc set {45}
OMP: pid 305636 tid 305743 thread 41 bound to OS proc set {41}
OMP: pid 305636 tid 305741 thread 39 bound to OS proc set {39}
OMP: pid 305636 tid 305732 thread 30 bound to OS proc set {30}
OMP: pid 305636 tid 305725 thread 23 bound to OS proc set {23}
OMP: pid 305636 tid 305757 thread 55 bound to OS proc set {55}
OMP: pid 305636 tid 305745 thread 43 bound to OS proc set {43}
OMP: pid 305636 tid 305731 thread 29 bound to OS proc set {29}
OMP: pid 305636 tid 305755 thread 53 bound to OS proc set {53}
OMP: pid 305636 tid 305759 thread 57 bound to OS proc set {57}
OMP: pid 305636 tid 305744 thread 42 bound to OS proc set {42}
OMP: pid 305636 tid 305723 thread 21 bound to OS proc set {21}
OMP: pid 305636 tid 305738 thread 36 bound to OS proc set {36}
OMP: pid 305636 tid 305735 thread 33 bound to OS proc set {33}
OMP: pid 305636 tid 305707 thread 5 bound to OS proc set {5}
OMP: pid 305636 tid 305724 thread 22 bound to OS proc set {22}
OMP: pid 305636 tid 305754 thread 52 bound to OS proc set {52}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 0, "tg": 128, "pl": 8, "n_kv": 1024, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.591608, "speed_tg": 183.131577, "t": 5.591608, "speed": 183.131577}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_10

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-6681/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-56-22/tools/lprof_npsu_run_10  #
########################################################################################################################################################################################################################################
