Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 18.274853, "speed_pp": 14.008321, "t_tg": 0.000000, "speed_tg": nan, "t": 18.274853, "speed": 14.008321}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_0

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_0  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1150778 tid 1150778 thread 0 bound to OS proc set {0}
OMP: pid 1150778 tid 1150845 thread 1 bound to OS proc set {32}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 9.151305, "speed_pp": 27.974152, "t_tg": 0.000000, "speed_tg": nan, "t": 9.151305, "speed": 27.974152}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_1

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_1  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1150865 tid 1150865 thread 0 bound to OS proc set {0}
OMP: pid 1150865 tid 1150933 thread 2 bound to OS proc set {32}
OMP: pid 1150865 tid 1150932 thread 1 bound to OS proc set {16}
OMP: pid 1150865 tid 1150934 thread 3 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 4.590272, "speed_pp": 55.770115, "t_tg": 0.000000, "speed_tg": nan, "t": 4.590272, "speed": 55.770115}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_2

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_2  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1151004 tid 1151004 thread 0 bound to OS proc set {0}
OMP: pid 1151004 tid 1151073 thread 3 bound to OS proc set {24}
OMP: pid 1151004 tid 1151072 thread 2 bound to OS proc set {16}
OMP: pid 1151004 tid 1151071 thread 1 bound to OS proc set {8}
OMP: pid 1151004 tid 1151074 thread 4 bound to OS proc set {32}
OMP: pid 1151004 tid 1151076 thread 6 bound to OS proc set {48}
OMP: pid 1151004 tid 1151075 thread 5 bound to OS proc set {40}
OMP: pid 1151004 tid 1151077 thread 7 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 2.305183, "speed_pp": 111.054092, "t_tg": 0.000000, "speed_tg": nan, "t": 2.305183, "speed": 111.054092}
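The `OMP:` binding lines are printed in the order the threads started, not in thread-id order, which makes the placement pattern hard to read. A minimal Python sketch that sorts them by thread id is given here. It assumes every proc set holds a single CPU id, as in this log; the names `BINDING` and `thread_to_cpu` are illustrative and not part of MAQAO or the OpenMP runtime.

```python
import re

# Binding lines copied from the 8-thread run above.
LOG = """\
OMP: pid 1151004 tid 1151004 thread 0 bound to OS proc set {0}
OMP: pid 1151004 tid 1151073 thread 3 bound to OS proc set {24}
OMP: pid 1151004 tid 1151072 thread 2 bound to OS proc set {16}
OMP: pid 1151004 tid 1151071 thread 1 bound to OS proc set {8}
OMP: pid 1151004 tid 1151074 thread 4 bound to OS proc set {32}
OMP: pid 1151004 tid 1151076 thread 6 bound to OS proc set {48}
OMP: pid 1151004 tid 1151075 thread 5 bound to OS proc set {40}
OMP: pid 1151004 tid 1151077 thread 7 bound to OS proc set {56}
"""

# Assumption: one CPU id per proc set; sets with several ids would not match.
BINDING = re.compile(
    r"OMP: pid (\d+) tid (\d+) thread (\d+) bound to OS proc set \{(\d+)\}"
)


def thread_to_cpu(log):
    """Map OpenMP thread id -> OS processor id from single-CPU binding lines."""
    return {int(m.group(3)): int(m.group(4)) for m in BINDING.finditer(log)}


placement = thread_to_cpu(LOG)
cpus = [placement[t] for t in sorted(placement)]       # CPU ids in thread-id order
strides = {b - a for a, b in zip(cpus, cpus[1:])}      # gaps between neighbours

print("CPU per thread:", cpus)
print("strides:", sorted(strides))
```

For this run the sorted placement is evenly spaced with a stride of 8. The same function can be applied to the binding lines of the other runs to compare how the spacing changes with the thread count.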





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_3

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_3  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1151097 tid 1151097 thread 0 bound to OS proc set {0}
OMP: pid 1151097 tid 1151176 thread 12 bound to OS proc set {48}
OMP: pid 1151097 tid 1151167 thread 3 bound to OS proc set {12}
OMP: pid 1151097 tid 1151178 thread 14 bound to OS proc set {56}
OMP: pid 1151097 tid 1151166 thread 2 bound to OS proc set {8}
OMP: pid 1151097 tid 1151172 thread 8 bound to OS proc set {32}
OMP: pid 1151097 tid 1151177 thread 13 bound to OS proc set {52}
OMP: pid 1151097 tid 1151165 thread 1 bound to OS proc set {4}
OMP: pid 1151097 tid 1151175 thread 11 bound to OS proc set {44}
OMP: pid 1151097 tid 1151174 thread 10 bound to OS proc set {40}
OMP: pid 1151097 tid 1151170 thread 6 bound to OS proc set {24}
OMP: pid 1151097 tid 1151171 thread 7 bound to OS proc set {28}
OMP: pid 1151097 tid 1151168 thread 4 bound to OS proc set {16}
OMP: pid 1151097 tid 1151169 thread 5 bound to OS proc set {20}
OMP: pid 1151097 tid 1151173 thread 9 bound to OS proc set {36}
OMP: pid 1151097 tid 1151179 thread 15 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 1.169632, "speed_pp": 218.872269, "t_tg": 0.000000, "speed_tg": nan, "t": 1.169632, "speed": 218.872269}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_4

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_4  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1151199 tid 1151199 thread 0 bound to OS proc set {0}
OMP: pid 1151199 tid 1151268 thread 3 bound to OS proc set {8}
OMP: pid 1151199 tid 1151266 thread 1 bound to OS proc set {2}
OMP: pid 1151199 tid 1151281 thread 16 bound to OS proc set {43}
OMP: pid 1151199 tid 1151277 thread 12 bound to OS proc set {32}
OMP: pid 1151199 tid 1151269 thread 4 bound to OS proc set {10}
OMP: pid 1151199 tid 1151271 thread 6 bound to OS proc set {16}
OMP: pid 1151199 tid 1151279 thread 14 bound to OS proc set {37}
OMP: pid 1151199 tid 1151272 thread 7 bound to OS proc set {18}
OMP: pid 1151199 tid 1151280 thread 15 bound to OS proc set {40}
OMP: pid 1151199 tid 1151276 thread 11 bound to OS proc set {29}
OMP: pid 1151199 tid 1151284 thread 19 bound to OS proc set {51}
OMP: pid 1151199 tid 1151282 thread 17 bound to OS proc set {46}
OMP: pid 1151199 tid 1151283 thread 18 bound to OS proc set {48}
OMP: pid 1151199 tid 1151285 thread 20 bound to OS proc set {54}
OMP: pid 1151199 tid 1151273 thread 8 bound to OS proc set {21}
OMP: pid 1151199 tid 1151278 thread 13 bound to OS proc set {35}
OMP: pid 1151199 tid 1151274 thread 9 bound to OS proc set {24}
OMP: pid 1151199 tid 1151275 thread 10 bound to OS proc set {27}
OMP: pid 1151199 tid 1151287 thread 22 bound to OS proc set {59}
OMP: pid 1151199 tid 1151270 thread 5 bound to OS proc set {13}
OMP: pid 1151199 tid 1151267 thread 2 bound to OS proc set {5}
OMP: pid 1151199 tid 1151286 thread 21 bound to OS proc set {56}
OMP: pid 1151199 tid 1151288 thread 23 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 0.881757, "speed_pp": 290.329407, "t_tg": 0.000000, "speed_tg": nan, "t": 0.881757, "speed": 290.329407}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_5

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_5  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1151308 tid 1151308 thread 0 bound to OS proc set {0}
OMP: pid 1151308 tid 1151381 thread 7 bound to OS proc set {14}
OMP: pid 1151308 tid 1151375 thread 1 bound to OS proc set {2}
OMP: pid 1151308 tid 1151376 thread 2 bound to OS proc set {4}
OMP: pid 1151308 tid 1151382 thread 8 bound to OS proc set {16}
OMP: pid 1151308 tid 1151388 thread 14 bound to OS proc set {28}
OMP: pid 1151308 tid 1151379 thread 5 bound to OS proc set {10}
OMP: pid 1151308 tid 1151378 thread 4 bound to OS proc set {8}
OMP: pid 1151308 tid 1151386 thread 12 bound to OS proc set {24}
OMP: pid 1151308 tid 1151389 thread 15 bound to OS proc set {30}
OMP: pid 1151308 tid 1151398 thread 24 bound to OS proc set {48}
OMP: pid 1151308 tid 1151377 thread 3 bound to OS proc set {6}
OMP: pid 1151308 tid 1151380 thread 6 bound to OS proc set {12}
OMP: pid 1151308 tid 1151383 thread 9 bound to OS proc set {18}
OMP: pid 1151308 tid 1151393 thread 19 bound to OS proc set {38}
OMP: pid 1151308 tid 1151402 thread 28 bound to OS proc set {56}
OMP: pid 1151308 tid 1151384 thread 10 bound to OS proc set {20}
OMP: pid 1151308 tid 1151390 thread 16 bound to OS proc set {32}
OMP: pid 1151308 tid 1151404 thread 30 bound to OS proc set {60}
OMP: pid 1151308 tid 1151397 thread 23 bound to OS proc set {46}
OMP: pid 1151308 tid 1151401 thread 27 bound to OS proc set {54}
OMP: pid 1151308 tid 1151385 thread 11 bound to OS proc set {22}
OMP: pid 1151308 tid 1151387 thread 13 bound to OS proc set {26}
OMP: pid 1151308 tid 1151405 thread 31 bound to OS proc set {62}
OMP: pid 1151308 tid 1151391 thread 17 bound to OS proc set {34}
OMP: pid 1151308 tid 1151392 thread 18 bound to OS proc set {36}
OMP: pid 1151308 tid 1151403 thread 29 bound to OS proc set {58}
OMP: pid 1151308 tid 1151394 thread 20 bound to OS proc set {40}
OMP: pid 1151308 tid 1151399 thread 25 bound to OS proc set {50}
OMP: pid 1151308 tid 1151395 thread 21 bound to OS proc set {42}
OMP: pid 1151308 tid 1151400 thread 26 bound to OS proc set {52}
OMP: pid 1151308 tid 1151396 thread 22 bound to OS proc set {44}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 0.708078, "speed_pp": 361.542084, "t_tg": 0.000000, "speed_tg": nan, "t": 0.708078, "speed": 361.542084}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_6

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_6  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1151425 tid 1151425 thread 0 bound to OS proc set {0}
OMP: pid 1151425 tid 1151492 thread 1 bound to OS proc set {1}
OMP: pid 1151425 tid 1151493 thread 2 bound to OS proc set {3}
OMP: pid 1151425 tid 1151505 thread 14 bound to OS proc set {22}
OMP: pid 1151425 tid 1151494 thread 3 bound to OS proc set {4}
OMP: pid 1151425 tid 1151498 thread 7 bound to OS proc set {11}
OMP: pid 1151425 tid 1151499 thread 8 bound to OS proc set {13}
OMP: pid 1151425 tid 1151503 thread 12 bound to OS proc set {19}
OMP: pid 1151425 tid 1151501 thread 10 bound to OS proc set {16}
OMP: pid 1151425 tid 1151504 thread 13 bound to OS proc set {21}
OMP: pid 1151425 tid 1151523 thread 32 bound to OS proc set {52}
OMP: pid 1151425 tid 1151526 thread 35 bound to OS proc set {56}
OMP: pid 1151425 tid 1151506 thread 15 bound to OS proc set {24}
OMP: pid 1151425 tid 1151502 thread 11 bound to OS proc set {17}
OMP: pid 1151425 tid 1151527 thread 36 bound to OS proc set {58}
OMP: pid 1151425 tid 1151525 thread 34 bound to OS proc set {55}
OMP: pid 1151425 tid 1151509 thread 18 bound to OS proc set {29}
OMP: pid 1151425 tid 1151519 thread 28 bound to OS proc set {45}
OMP: pid 1151425 tid 1151508 thread 17 bound to OS proc set {27}
OMP: pid 1151425 tid 1151515 thread 24 bound to OS proc set {39}
OMP: pid 1151425 tid 1151500 thread 9 bound to OS proc set {14}
OMP: pid 1151425 tid 1151495 thread 4 bound to OS proc set {6}
OMP: pid 1151425 tid 1151497 thread 6 bound to OS proc set {9}
OMP: pid 1151425 tid 1151496 thread 5 bound to OS proc set {8}
OMP: pid 1151425 tid 1151510 thread 19 bound to OS proc set {30}
OMP: pid 1151425 tid 1151514 thread 23 bound to OS proc set {37}
OMP: pid 1151425 tid 1151524 thread 33 bound to OS proc set {53}
OMP: pid 1151425 tid 1151522 thread 31 bound to OS proc set {50}
OMP: pid 1151425 tid 1151518 thread 27 bound to OS proc set {43}
OMP: pid 1151425 tid 1151511 thread 20 bound to OS proc set {32}
OMP: pid 1151425 tid 1151507 thread 16 bound to OS proc set {26}
OMP: pid 1151425 tid 1151530 thread 39 bound to OS proc set {63}
OMP: pid 1151425 tid 1151529 thread 38 bound to OS proc set {61}
OMP: pid 1151425 tid 1151521 thread 30 bound to OS proc set {48}
OMP: pid 1151425 tid 1151528 thread 37 bound to OS proc set {60}
OMP: pid 1151425 tid 1151517 thread 26 bound to OS proc set {42}
OMP: pid 1151425 tid 1151520 thread 29 bound to OS proc set {47}
OMP: pid 1151425 tid 1151513 thread 22 bound to OS proc set {35}
OMP: pid 1151425 tid 1151516 thread 25 bound to OS proc set {40}
OMP: pid 1151425 tid 1151512 thread 21 bound to OS proc set {34}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 0.606171, "speed_pp": 422.323059, "t_tg": 0.000000, "speed_tg": nan, "t": 0.606171, "speed": 422.323059}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_7

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_7  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1151598 tid 1151598 thread 0 bound to OS proc set {0}
OMP: pid 1151598 tid 1151666 thread 2 bound to OS proc set {2}
OMP: pid 1151598 tid 1151665 thread 1 bound to OS proc set {1}
OMP: pid 1151598 tid 1151675 thread 11 bound to OS proc set {14}
OMP: pid 1151598 tid 1151676 thread 12 bound to OS proc set {16}
OMP: pid 1151598 tid 1151672 thread 8 bound to OS proc set {10}
OMP: pid 1151598 tid 1151678 thread 14 bound to OS proc set {18}
OMP: pid 1151598 tid 1151673 thread 9 bound to OS proc set {12}
OMP: pid 1151598 tid 1151667 thread 3 bound to OS proc set {4}
OMP: pid 1151598 tid 1151671 thread 7 bound to OS proc set {9}
OMP: pid 1151598 tid 1151674 thread 10 bound to OS proc set {13}
OMP: pid 1151598 tid 1151708 thread 44 bound to OS proc set {59}
OMP: pid 1151598 tid 1151697 thread 33 bound to OS proc set {44}
OMP: pid 1151598 tid 1151682 thread 18 bound to OS proc set {24}
OMP: pid 1151598 tid 1151677 thread 13 bound to OS proc set {17}
OMP: pid 1151598 tid 1151698 thread 34 bound to OS proc set {46}
OMP: pid 1151598 tid 1151681 thread 17 bound to OS proc set {23}
OMP: pid 1151598 tid 1151687 thread 23 bound to OS proc set {31}
OMP: pid 1151598 tid 1151694 thread 30 bound to OS proc set {40}
OMP: pid 1151598 tid 1151699 thread 35 bound to OS proc set {47}
OMP: pid 1151598 tid 1151670 thread 6 bound to OS proc set {8}
OMP: pid 1151598 tid 1151710 thread 46 bound to OS proc set {62}
OMP: pid 1151598 tid 1151679 thread 15 bound to OS proc set {20}
OMP: pid 1151598 tid 1151688 thread 24 bound to OS proc set {32}
OMP: pid 1151598 tid 1151693 thread 29 bound to OS proc set {39}
OMP: pid 1151598 tid 1151686 thread 22 bound to OS proc set {29}
OMP: pid 1151598 tid 1151690 thread 26 bound to OS proc set {35}
OMP: pid 1151598 tid 1151696 thread 32 bound to OS proc set {43}
OMP: pid 1151598 tid 1151684 thread 20 bound to OS proc set {27}
OMP: pid 1151598 tid 1151700 thread 36 bound to OS proc set {48}
OMP: pid 1151598 tid 1151691 thread 27 bound to OS proc set {36}
OMP: pid 1151598 tid 1151669 thread 5 bound to OS proc set {6}
OMP: pid 1151598 tid 1151689 thread 25 bound to OS proc set {33}
OMP: pid 1151598 tid 1151695 thread 31 bound to OS proc set {41}
OMP: pid 1151598 tid 1151668 thread 4 bound to OS proc set {5}
OMP: pid 1151598 tid 1151683 thread 19 bound to OS proc set {25}
OMP: pid 1151598 tid 1151692 thread 28 bound to OS proc set {37}
OMP: pid 1151598 tid 1151707 thread 43 bound to OS proc set {58}
OMP: pid 1151598 tid 1151680 thread 16 bound to OS proc set {21}
OMP: pid 1151598 tid 1151685 thread 21 bound to OS proc set {28}
OMP: pid 1151598 tid 1151711 thread 47 bound to OS proc set {63}
OMP: pid 1151598 tid 1151709 thread 45 bound to OS proc set {60}
OMP: pid 1151598 tid 1151704 thread 40 bound to OS proc set {54}
OMP: pid 1151598 tid 1151703 thread 39 bound to OS proc set {52}
OMP: pid 1151598 tid 1151701 thread 37 bound to OS proc set {50}
OMP: pid 1151598 tid 1151702 thread 38 bound to OS proc set {51}
OMP: pid 1151598 tid 1151705 thread 41 bound to OS proc set {55}
OMP: pid 1151598 tid 1151706 thread 42 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 0.533168, "speed_pp": 480.148834, "t_tg": 0.000000, "speed_tg": nan, "t": 0.533168, "speed": 480.148834}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_8

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_8  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1151731 tid 1151731 thread 0 bound to OS proc set {0}
OMP: pid 1151731 tid 1151809 thread 12 bound to OS proc set {13}
OMP: pid 1151731 tid 1151808 thread 11 bound to OS proc set {12}
OMP: pid 1151731 tid 1151799 thread 2 bound to OS proc set {2}
OMP: pid 1151731 tid 1151805 thread 8 bound to OS proc set {9}
OMP: pid 1151731 tid 1151798 thread 1 bound to OS proc set {1}
OMP: pid 1151731 tid 1151807 thread 10 bound to OS proc set {11}
OMP: pid 1151731 tid 1151804 thread 7 bound to OS proc set {8}
OMP: pid 1151731 tid 1151806 thread 9 bound to OS proc set {10}
OMP: pid 1151731 tid 1151848 thread 51 bound to OS proc set {59}
OMP: pid 1151731 tid 1151845 thread 48 bound to OS proc set {55}
OMP: pid 1151731 tid 1151811 thread 14 bound to OS proc set {16}
OMP: pid 1151731 tid 1151810 thread 13 bound to OS proc set {15}
OMP: pid 1151731 tid 1151846 thread 49 bound to OS proc set {56}
OMP: pid 1151731 tid 1151812 thread 15 bound to OS proc set {17}
OMP: pid 1151731 tid 1151801 thread 4 bound to OS proc set {4}
OMP: pid 1151731 tid 1151852 thread 55 bound to OS proc set {63}
OMP: pid 1151731 tid 1151825 thread 28 bound to OS proc set {32}
OMP: pid 1151731 tid 1151800 thread 3 bound to OS proc set {3}
OMP: pid 1151731 tid 1151841 thread 44 bound to OS proc set {51}
OMP: pid 1151731 tid 1151829 thread 32 bound to OS proc set {37}
OMP: pid 1151731 tid 1151847 thread 50 bound to OS proc set {58}
OMP: pid 1151731 tid 1151844 thread 47 bound to OS proc set {54}
OMP: pid 1151731 tid 1151815 thread 18 bound to OS proc set {20}
OMP: pid 1151731 tid 1151843 thread 46 bound to OS proc set {53}
OMP: pid 1151731 tid 1151849 thread 52 bound to OS proc set {60}
OMP: pid 1151731 tid 1151828 thread 31 bound to OS proc set {35}
OMP: pid 1151731 tid 1151802 thread 5 bound to OS proc set {5}
OMP: pid 1151731 tid 1151824 thread 27 bound to OS proc set {31}
OMP: pid 1151731 tid 1151827 thread 30 bound to OS proc set {34}
OMP: pid 1151731 tid 1151803 thread 6 bound to OS proc set {6}
OMP: pid 1151731 tid 1151813 thread 16 bound to OS proc set {18}
OMP: pid 1151731 tid 1151821 thread 24 bound to OS proc set {27}
OMP: pid 1151731 tid 1151814 thread 17 bound to OS proc set {19}
OMP: pid 1151731 tid 1151817 thread 20 bound to OS proc set {23}
OMP: pid 1151731 tid 1151836 thread 39 bound to OS proc set {45}
OMP: pid 1151731 tid 1151839 thread 42 bound to OS proc set {48}
OMP: pid 1151731 tid 1151823 thread 26 bound to OS proc set {30}
OMP: pid 1151731 tid 1151851 thread 54 bound to OS proc set {62}
OMP: pid 1151731 tid 1151826 thread 29 bound to OS proc set {33}
OMP: pid 1151731 tid 1151822 thread 25 bound to OS proc set {29}
OMP: pid 1151731 tid 1151840 thread 43 bound to OS proc set {49}
OMP: pid 1151731 tid 1151842 thread 45 bound to OS proc set {52}
OMP: pid 1151731 tid 1151833 thread 36 bound to OS proc set {41}
OMP: pid 1151731 tid 1151830 thread 33 bound to OS proc set {38}
OMP: pid 1151731 tid 1151818 thread 21 bound to OS proc set {24}
OMP: pid 1151731 tid 1151819 thread 22 bound to OS proc set {25}
OMP: pid 1151731 tid 1151831 thread 34 bound to OS proc set {39}
OMP: pid 1151731 tid 1151835 thread 38 bound to OS proc set {44}
OMP: pid 1151731 tid 1151816 thread 19 bound to OS proc set {22}
OMP: pid 1151731 tid 1151832 thread 35 bound to OS proc set {40}
OMP: pid 1151731 tid 1151838 thread 41 bound to OS proc set {47}
OMP: pid 1151731 tid 1151850 thread 53 bound to OS proc set {61}
OMP: pid 1151731 tid 1151837 thread 40 bound to OS proc set {46}
OMP: pid 1151731 tid 1151820 thread 23 bound to OS proc set {26}
OMP: pid 1151731 tid 1151834 thread 37 bound to OS proc set {42}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 0.479007, "speed_pp": 534.438965, "t_tg": 0.000000, "speed_tg": nan, "t": 0.479007, "speed": 534.438965}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_9

To display your profiling results:
#######################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                               COMMAND                                                                                                #
#######################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_9  #
#######################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 1151872 tid 1151872 thread 0 bound to OS proc set {0}
OMP: pid 1151872 tid 1151953 thread 15 bound to OS proc set {15}
OMP: pid 1151872 tid 1151950 thread 12 bound to OS proc set {12}
OMP: pid 1151872 tid 1151941 thread 3 bound to OS proc set {3}
OMP: pid 1151872 tid 1151940 thread 2 bound to OS proc set {2}
OMP: pid 1151872 tid 1151970 thread 32 bound to OS proc set {32}
OMP: pid 1151872 tid 1151946 thread 8 bound to OS proc set {8}
OMP: pid 1151872 tid 1151949 thread 11 bound to OS proc set {11}
OMP: pid 1151872 tid 1151973 thread 35 bound to OS proc set {35}
OMP: pid 1151872 tid 1151989 thread 51 bound to OS proc set {51}
OMP: pid 1151872 tid 1151951 thread 13 bound to OS proc set {13}
OMP: pid 1151872 tid 1151985 thread 47 bound to OS proc set {47}
OMP: pid 1151872 tid 1151952 thread 14 bound to OS proc set {14}
OMP: pid 1151872 tid 1151942 thread 4 bound to OS proc set {4}
OMP: pid 1151872 tid 1151969 thread 31 bound to OS proc set {31}
OMP: pid 1151872 tid 1151986 thread 48 bound to OS proc set {48}
OMP: pid 1151872 tid 1152001 thread 63 bound to OS proc set {63}
OMP: pid 1151872 tid 1151948 thread 10 bound to OS proc set {10}
OMP: pid 1151872 tid 1151966 thread 28 bound to OS proc set {28}
OMP: pid 1151872 tid 1151997 thread 59 bound to OS proc set {59}
OMP: pid 1151872 tid 1151945 thread 7 bound to OS proc set {7}
OMP: pid 1151872 tid 1151957 thread 19 bound to OS proc set {19}
OMP: pid 1151872 tid 1151939 thread 1 bound to OS proc set {1}
OMP: pid 1151872 tid 1151954 thread 16 bound to OS proc set {16}
OMP: pid 1151872 tid 1151972 thread 34 bound to OS proc set {34}
OMP: pid 1151872 tid 1151982 thread 44 bound to OS proc set {44}
OMP: pid 1151872 tid 1151988 thread 50 bound to OS proc set {50}
OMP: pid 1151872 tid 1151994 thread 56 bound to OS proc set {56}
OMP: pid 1151872 tid 1151981 thread 43 bound to OS proc set {43}
OMP: pid 1151872 tid 1151944 thread 6 bound to OS proc set {6}
OMP: pid 1151872 tid 1151996 thread 58 bound to OS proc set {58}
OMP: pid 1151872 tid 1151956 thread 18 bound to OS proc set {18}
OMP: pid 1151872 tid 1151947 thread 9 bound to OS proc set {9}
OMP: pid 1151872 tid 1151984 thread 46 bound to OS proc set {46}
OMP: pid 1151872 tid 1151987 thread 49 bound to OS proc set {49}
OMP: pid 1151872 tid 1151965 thread 27 bound to OS proc set {27}
OMP: pid 1151872 tid 1151993 thread 55 bound to OS proc set {55}
OMP: pid 1151872 tid 1151978 thread 40 bound to OS proc set {40}
OMP: pid 1151872 tid 1151962 thread 24 bound to OS proc set {24}
OMP: pid 1151872 tid 1151971 thread 33 bound to OS proc set {33}
OMP: pid 1151872 tid 1151968 thread 30 bound to OS proc set {30}
OMP: pid 1151872 tid 1151983 thread 45 bound to OS proc set {45}
OMP: pid 1151872 tid 1151967 thread 29 bound to OS proc set {29}
OMP: pid 1151872 tid 1151955 thread 17 bound to OS proc set {17}
OMP: pid 1151872 tid 1151964 thread 26 bound to OS proc set {26}
OMP: pid 1151872 tid 1151943 thread 5 bound to OS proc set {5}
OMP: pid 1151872 tid 1151977 thread 39 bound to OS proc set {39}
OMP: pid 1151872 tid 1151974 thread 36 bound to OS proc set {36}
OMP: pid 1151872 tid 1151963 thread 25 bound to OS proc set {25}
OMP: pid 1151872 tid 1151976 thread 38 bound to OS proc set {38}
OMP: pid 1151872 tid 1151961 thread 23 bound to OS proc set {23}
OMP: pid 1151872 tid 1151980 thread 42 bound to OS proc set {42}
OMP: pid 1151872 tid 1151990 thread 52 bound to OS proc set {52}
OMP: pid 1151872 tid 1151958 thread 20 bound to OS proc set {20}
OMP: pid 1151872 tid 1151992 thread 54 bound to OS proc set {54}
OMP: pid 1151872 tid 1151979 thread 41 bound to OS proc set {41}
OMP: pid 1151872 tid 1151975 thread 37 bound to OS proc set {37}
OMP: pid 1151872 tid 1151995 thread 57 bound to OS proc set {57}
OMP: pid 1151872 tid 1151960 thread 22 bound to OS proc set {22}
OMP: pid 1151872 tid 1152000 thread 62 bound to OS proc set {62}
OMP: pid 1151872 tid 1151959 thread 21 bound to OS proc set {21}
OMP: pid 1151872 tid 1151991 thread 53 bound to OS proc set {53}
OMP: pid 1151872 tid 1151999 thread 61 bound to OS proc set {61}
OMP: pid 1151872 tid 1151998 thread 60 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 0.429845, "speed_pp": 595.563538, "t_tg": 0.000000, "speed_tg": nan, "t": 0.429845, "speed": 595.563538}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_10

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-399-5586/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-00-33/tools/lprof_npsu_run_10  #
########################################################################################################################################################################################################################################
