Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 52.562206, "speed_tg": 9.740839, "t": 52.562206, "speed": 9.740839}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_0

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_0  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 595571 tid 595571 thread 0 bound to OS proc set {0}
OMP: pid 595571 tid 595670 thread 1 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 26.454468, "speed_tg": 19.354010, "t": 26.454470, "speed": 19.354008}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_1

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_1  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 595740 tid 595740 thread 0 bound to OS proc set {0}
OMP: pid 595740 tid 595839 thread 1 bound to OS proc set {24}
OMP: pid 595740 tid 595840 thread 2 bound to OS proc set {48}
OMP: pid 595740 tid 595841 thread 3 bound to OS proc set {72}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 13.464477, "speed_tg": 38.025986, "t": 13.464477, "speed": 38.025986}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_2

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_2  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 595863 tid 595863 thread 0 bound to OS proc set {0}
OMP: pid 595863 tid 595964 thread 3 bound to OS proc set {36}
OMP: pid 595863 tid 595963 thread 2 bound to OS proc set {24}
OMP: pid 595863 tid 595965 thread 4 bound to OS proc set {48}
OMP: pid 595863 tid 595967 thread 6 bound to OS proc set {72}
OMP: pid 595863 tid 595962 thread 1 bound to OS proc set {12}
OMP: pid 595863 tid 595966 thread 5 bound to OS proc set {60}
OMP: pid 595863 tid 595968 thread 7 bound to OS proc set {84}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 7.259862, "speed_tg": 70.524757, "t": 7.259862, "speed": 70.524757}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_3

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_3  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 595988 tid 595988 thread 0 bound to OS proc set {0}
OMP: pid 595988 tid 596089 thread 3 bound to OS proc set {18}
OMP: pid 595988 tid 596088 thread 2 bound to OS proc set {12}
OMP: pid 595988 tid 596087 thread 1 bound to OS proc set {6}
OMP: pid 595988 tid 596094 thread 8 bound to OS proc set {48}
OMP: pid 595988 tid 596093 thread 7 bound to OS proc set {42}
OMP: pid 595988 tid 596098 thread 12 bound to OS proc set {72}
OMP: pid 595988 tid 596090 thread 4 bound to OS proc set {24}
OMP: pid 595988 tid 596100 thread 14 bound to OS proc set {84}
OMP: pid 595988 tid 596096 thread 10 bound to OS proc set {60}
OMP: pid 595988 tid 596092 thread 6 bound to OS proc set {36}
OMP: pid 595988 tid 596099 thread 13 bound to OS proc set {78}
OMP: pid 595988 tid 596095 thread 9 bound to OS proc set {54}
OMP: pid 595988 tid 596097 thread 11 bound to OS proc set {66}
OMP: pid 595988 tid 596091 thread 5 bound to OS proc set {30}
OMP: pid 595988 tid 596101 thread 15 bound to OS proc set {90}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 4.310293, "speed_tg": 118.785423, "t": 4.310293, "speed": 118.785423}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_4

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_4  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 596121 tid 596121 thread 0 bound to OS proc set {0}
OMP: pid 596121 tid 596222 thread 2 bound to OS proc set {8}
OMP: pid 596121 tid 596224 thread 4 bound to OS proc set {16}
OMP: pid 596121 tid 596221 thread 1 bound to OS proc set {4}
OMP: pid 596121 tid 596223 thread 3 bound to OS proc set {12}
OMP: pid 596121 tid 596235 thread 15 bound to OS proc set {60}
OMP: pid 596121 tid 596228 thread 8 bound to OS proc set {32}
OMP: pid 596121 tid 596225 thread 5 bound to OS proc set {20}
OMP: pid 596121 tid 596239 thread 19 bound to OS proc set {76}
OMP: pid 596121 tid 596236 thread 16 bound to OS proc set {64}
OMP: pid 596121 tid 596230 thread 10 bound to OS proc set {40}
OMP: pid 596121 tid 596232 thread 12 bound to OS proc set {48}
OMP: pid 596121 tid 596231 thread 11 bound to OS proc set {44}
OMP: pid 596121 tid 596229 thread 9 bound to OS proc set {36}
OMP: pid 596121 tid 596227 thread 7 bound to OS proc set {28}
OMP: pid 596121 tid 596226 thread 6 bound to OS proc set {24}
OMP: pid 596121 tid 596240 thread 20 bound to OS proc set {80}
OMP: pid 596121 tid 596234 thread 14 bound to OS proc set {56}
OMP: pid 596121 tid 596237 thread 17 bound to OS proc set {68}
OMP: pid 596121 tid 596238 thread 18 bound to OS proc set {72}
OMP: pid 596121 tid 596233 thread 13 bound to OS proc set {52}
OMP: pid 596121 tid 596242 thread 22 bound to OS proc set {88}
OMP: pid 596121 tid 596243 thread 23 bound to OS proc set {92}
OMP: pid 596121 tid 596241 thread 21 bound to OS proc set {84}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 3.600169, "speed_tg": 142.215546, "t": 3.600170, "speed": 142.215515}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_5

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_5  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 596311 tid 596311 thread 0 bound to OS proc set {0}
OMP: pid 596311 tid 596419 thread 10 bound to OS proc set {30}
OMP: pid 596311 tid 596413 thread 4 bound to OS proc set {12}
OMP: pid 596311 tid 596417 thread 8 bound to OS proc set {24}
OMP: pid 596311 tid 596410 thread 1 bound to OS proc set {3}
OMP: pid 596311 tid 596412 thread 3 bound to OS proc set {9}
OMP: pid 596311 tid 596414 thread 5 bound to OS proc set {15}
OMP: pid 596311 tid 596424 thread 15 bound to OS proc set {45}
OMP: pid 596311 tid 596415 thread 6 bound to OS proc set {18}
OMP: pid 596311 tid 596421 thread 12 bound to OS proc set {36}
OMP: pid 596311 tid 596411 thread 2 bound to OS proc set {6}
OMP: pid 596311 tid 596420 thread 11 bound to OS proc set {33}
OMP: pid 596311 tid 596416 thread 7 bound to OS proc set {21}
OMP: pid 596311 tid 596437 thread 28 bound to OS proc set {84}
OMP: pid 596311 tid 596425 thread 16 bound to OS proc set {48}
OMP: pid 596311 tid 596423 thread 14 bound to OS proc set {42}
OMP: pid 596311 tid 596439 thread 30 bound to OS proc set {90}
OMP: pid 596311 tid 596440 thread 31 bound to OS proc set {93}
OMP: pid 596311 tid 596433 thread 24 bound to OS proc set {72}
OMP: pid 596311 tid 596438 thread 29 bound to OS proc set {87}
OMP: pid 596311 tid 596422 thread 13 bound to OS proc set {39}
OMP: pid 596311 tid 596418 thread 9 bound to OS proc set {27}
OMP: pid 596311 tid 596436 thread 27 bound to OS proc set {81}
OMP: pid 596311 tid 596435 thread 26 bound to OS proc set {78}
OMP: pid 596311 tid 596427 thread 18 bound to OS proc set {54}
OMP: pid 596311 tid 596426 thread 17 bound to OS proc set {51}
OMP: pid 596311 tid 596428 thread 19 bound to OS proc set {57}
OMP: pid 596311 tid 596432 thread 23 bound to OS proc set {69}
OMP: pid 596311 tid 596430 thread 21 bound to OS proc set {63}
OMP: pid 596311 tid 596429 thread 20 bound to OS proc set {60}
OMP: pid 596311 tid 596434 thread 25 bound to OS proc set {75}
OMP: pid 596311 tid 596431 thread 22 bound to OS proc set {66}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.231429, "speed_tg": 158.443832, "t": 3.231429, "speed": 158.443832}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_6

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_6  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 596461 tid 596461 thread 0 bound to OS proc set {0}
OMP: pid 596461 tid 596574 thread 15 bound to OS proc set {36}
OMP: pid 596461 tid 596567 thread 8 bound to OS proc set {19}
OMP: pid 596461 tid 596573 thread 14 bound to OS proc set {33}
OMP: pid 596461 tid 596591 thread 32 bound to OS proc set {77}
OMP: pid 596461 tid 596571 thread 12 bound to OS proc set {29}
OMP: pid 596461 tid 596594 thread 35 bound to OS proc set {84}
OMP: pid 596461 tid 596598 thread 39 bound to OS proc set {94}
OMP: pid 596461 tid 596595 thread 36 bound to OS proc set {87}
OMP: pid 596461 tid 596590 thread 31 bound to OS proc set {75}
OMP: pid 596461 tid 596578 thread 19 bound to OS proc set {46}
OMP: pid 596461 tid 596593 thread 34 bound to OS proc set {82}
OMP: pid 596461 tid 596577 thread 18 bound to OS proc set {43}
OMP: pid 596461 tid 596572 thread 13 bound to OS proc set {31}
OMP: pid 596461 tid 596592 thread 33 bound to OS proc set {80}
OMP: pid 596461 tid 596597 thread 38 bound to OS proc set {92}
OMP: pid 596461 tid 596562 thread 3 bound to OS proc set {7}
OMP: pid 596461 tid 596566 thread 7 bound to OS proc set {16}
OMP: pid 596461 tid 596561 thread 2 bound to OS proc set {4}
OMP: pid 596461 tid 596587 thread 28 bound to OS proc set {67}
OMP: pid 596461 tid 596586 thread 27 bound to OS proc set {65}
OMP: pid 596461 tid 596575 thread 16 bound to OS proc set {38}
OMP: pid 596461 tid 596564 thread 5 bound to OS proc set {12}
OMP: pid 596461 tid 596569 thread 10 bound to OS proc set {24}
OMP: pid 596461 tid 596583 thread 24 bound to OS proc set {58}
OMP: pid 596461 tid 596563 thread 4 bound to OS proc set {9}
OMP: pid 596461 tid 596589 thread 30 bound to OS proc set {72}
OMP: pid 596461 tid 596585 thread 26 bound to OS proc set {63}
OMP: pid 596461 tid 596568 thread 9 bound to OS proc set {21}
OMP: pid 596461 tid 596560 thread 1 bound to OS proc set {2}
OMP: pid 596461 tid 596565 thread 6 bound to OS proc set {14}
OMP: pid 596461 tid 596576 thread 17 bound to OS proc set {41}
OMP: pid 596461 tid 596588 thread 29 bound to OS proc set {70}
OMP: pid 596461 tid 596570 thread 11 bound to OS proc set {26}
OMP: pid 596461 tid 596584 thread 25 bound to OS proc set {60}
OMP: pid 596461 tid 596579 thread 20 bound to OS proc set {48}
OMP: pid 596461 tid 596596 thread 37 bound to OS proc set {89}
OMP: pid 596461 tid 596582 thread 23 bound to OS proc set {55}
OMP: pid 596461 tid 596580 thread 21 bound to OS proc set {50}
OMP: pid 596461 tid 596581 thread 22 bound to OS proc set {53}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.106558, "speed_tg": 164.812622, "t": 3.106558, "speed": 164.812622}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_7

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_7  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 596618 tid 596618 thread 0 bound to OS proc set {0}
OMP: pid 596618 tid 596724 thread 8 bound to OS proc set {16}
OMP: pid 596618 tid 596731 thread 15 bound to OS proc set {30}
OMP: pid 596618 tid 596717 thread 1 bound to OS proc set {2}
OMP: pid 596618 tid 596732 thread 16 bound to OS proc set {32}
OMP: pid 596618 tid 596720 thread 4 bound to OS proc set {8}
OMP: pid 596618 tid 596718 thread 2 bound to OS proc set {4}
OMP: pid 596618 tid 596740 thread 24 bound to OS proc set {48}
OMP: pid 596618 tid 596719 thread 3 bound to OS proc set {6}
OMP: pid 596618 tid 596760 thread 44 bound to OS proc set {88}
OMP: pid 596618 tid 596744 thread 28 bound to OS proc set {56}
OMP: pid 596618 tid 596726 thread 10 bound to OS proc set {20}
OMP: pid 596618 tid 596734 thread 18 bound to OS proc set {36}
OMP: pid 596618 tid 596747 thread 31 bound to OS proc set {62}
OMP: pid 596618 tid 596751 thread 35 bound to OS proc set {70}
OMP: pid 596618 tid 596725 thread 9 bound to OS proc set {18}
OMP: pid 596618 tid 596729 thread 13 bound to OS proc set {26}
OMP: pid 596618 tid 596733 thread 17 bound to OS proc set {34}
OMP: pid 596618 tid 596748 thread 32 bound to OS proc set {64}
OMP: pid 596618 tid 596723 thread 7 bound to OS proc set {14}
OMP: pid 596618 tid 596746 thread 30 bound to OS proc set {60}
OMP: pid 596618 tid 596762 thread 46 bound to OS proc set {92}
OMP: pid 596618 tid 596727 thread 11 bound to OS proc set {22}
OMP: pid 596618 tid 596739 thread 23 bound to OS proc set {46}
OMP: pid 596618 tid 596728 thread 12 bound to OS proc set {24}
OMP: pid 596618 tid 596730 thread 14 bound to OS proc set {28}
OMP: pid 596618 tid 596738 thread 22 bound to OS proc set {44}
OMP: pid 596618 tid 596721 thread 5 bound to OS proc set {10}
OMP: pid 596618 tid 596722 thread 6 bound to OS proc set {12}
OMP: pid 596618 tid 596745 thread 29 bound to OS proc set {58}
OMP: pid 596618 tid 596743 thread 27 bound to OS proc set {54}
OMP: pid 596618 tid 596737 thread 21 bound to OS proc set {42}
OMP: pid 596618 tid 596736 thread 20 bound to OS proc set {40}
OMP: pid 596618 tid 596750 thread 34 bound to OS proc set {68}
OMP: pid 596618 tid 596735 thread 19 bound to OS proc set {38}
OMP: pid 596618 tid 596749 thread 33 bound to OS proc set {66}
OMP: pid 596618 tid 596742 thread 26 bound to OS proc set {52}
OMP: pid 596618 tid 596741 thread 25 bound to OS proc set {50}
OMP: pid 596618 tid 596752 thread 36 bound to OS proc set {72}
OMP: pid 596618 tid 596756 thread 40 bound to OS proc set {80}
OMP: pid 596618 tid 596759 thread 43 bound to OS proc set {86}
OMP: pid 596618 tid 596763 thread 47 bound to OS proc set {94}
OMP: pid 596618 tid 596754 thread 38 bound to OS proc set {76}
OMP: pid 596618 tid 596753 thread 37 bound to OS proc set {74}
OMP: pid 596618 tid 596758 thread 42 bound to OS proc set {84}
OMP: pid 596618 tid 596761 thread 45 bound to OS proc set {90}
OMP: pid 596618 tid 596755 thread 39 bound to OS proc set {78}
OMP: pid 596618 tid 596757 thread 41 bound to OS proc set {82}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.086853, "speed_tg": 165.864716, "t": 3.086853, "speed": 165.864716}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_8

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_8  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 596783 tid 596783 thread 0 bound to OS proc set {0}
OMP: pid 596783 tid 596883 thread 1 bound to OS proc set {1}
OMP: pid 596783 tid 596884 thread 2 bound to OS proc set {3}
OMP: pid 596783 tid 596889 thread 7 bound to OS proc set {12}
OMP: pid 596783 tid 596913 thread 31 bound to OS proc set {53}
OMP: pid 596783 tid 596901 thread 19 bound to OS proc set {32}
OMP: pid 596783 tid 596894 thread 12 bound to OS proc set {20}
OMP: pid 596783 tid 596933 thread 51 bound to OS proc set {88}
OMP: pid 596783 tid 596890 thread 8 bound to OS proc set {13}
OMP: pid 596783 tid 596930 thread 48 bound to OS proc set {83}
OMP: pid 596783 tid 596897 thread 15 bound to OS proc set {25}
OMP: pid 596783 tid 596893 thread 11 bound to OS proc set {19}
OMP: pid 596783 tid 596932 thread 50 bound to OS proc set {86}
OMP: pid 596783 tid 596891 thread 9 bound to OS proc set {15}
OMP: pid 596783 tid 596931 thread 49 bound to OS proc set {84}
OMP: pid 596783 tid 596886 thread 4 bound to OS proc set {6}
OMP: pid 596783 tid 596900 thread 18 bound to OS proc set {31}
OMP: pid 596783 tid 596910 thread 28 bound to OS proc set {48}
OMP: pid 596783 tid 596906 thread 24 bound to OS proc set {41}
OMP: pid 596783 tid 596937 thread 55 bound to OS proc set {95}
OMP: pid 596783 tid 596914 thread 32 bound to OS proc set {55}
OMP: pid 596783 tid 596911 thread 29 bound to OS proc set {50}
OMP: pid 596783 tid 596917 thread 35 bound to OS proc set {60}
OMP: pid 596783 tid 596896 thread 14 bound to OS proc set {24}
OMP: pid 596783 tid 596912 thread 30 bound to OS proc set {51}
OMP: pid 596783 tid 596916 thread 34 bound to OS proc set {58}
OMP: pid 596783 tid 596907 thread 25 bound to OS proc set {43}
OMP: pid 596783 tid 596934 thread 52 bound to OS proc set {90}
OMP: pid 596783 tid 596892 thread 10 bound to OS proc set {17}
OMP: pid 596783 tid 596908 thread 26 bound to OS proc set {45}
OMP: pid 596783 tid 596909 thread 27 bound to OS proc set {46}
OMP: pid 596783 tid 596926 thread 44 bound to OS proc set {76}
OMP: pid 596783 tid 596898 thread 16 bound to OS proc set {27}
OMP: pid 596783 tid 596885 thread 3 bound to OS proc set {5}
OMP: pid 596783 tid 596902 thread 20 bound to OS proc set {34}
OMP: pid 596783 tid 596888 thread 6 bound to OS proc set {10}
OMP: pid 596783 tid 596929 thread 47 bound to OS proc set {81}
OMP: pid 596783 tid 596887 thread 5 bound to OS proc set {8}
OMP: pid 596783 tid 596905 thread 23 bound to OS proc set {39}
OMP: pid 596783 tid 596928 thread 46 bound to OS proc set {79}
OMP: pid 596783 tid 596922 thread 40 bound to OS proc set {69}
OMP: pid 596783 tid 596925 thread 43 bound to OS proc set {74}
OMP: pid 596783 tid 596895 thread 13 bound to OS proc set {22}
OMP: pid 596783 tid 596921 thread 39 bound to OS proc set {67}
OMP: pid 596783 tid 596903 thread 21 bound to OS proc set {36}
OMP: pid 596783 tid 596936 thread 54 bound to OS proc set {93}
OMP: pid 596783 tid 596915 thread 33 bound to OS proc set {57}
OMP: pid 596783 tid 596918 thread 36 bound to OS proc set {62}
OMP: pid 596783 tid 596920 thread 38 bound to OS proc set {65}
OMP: pid 596783 tid 596924 thread 42 bound to OS proc set {72}
OMP: pid 596783 tid 596904 thread 22 bound to OS proc set {38}
OMP: pid 596783 tid 596919 thread 37 bound to OS proc set {64}
OMP: pid 596783 tid 596927 thread 45 bound to OS proc set {77}
OMP: pid 596783 tid 596923 thread 41 bound to OS proc set {71}
OMP: pid 596783 tid 596935 thread 53 bound to OS proc set {91}
OMP: pid 596783 tid 596899 thread 17 bound to OS proc set {29}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.100794, "speed_tg": 165.118988, "t": 3.100794, "speed": 165.118988}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_9

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_9  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 596957 tid 596957 thread 0 bound to OS proc set {0}
OMP: pid 596957 tid 597056 thread 1 bound to OS proc set {1}
OMP: pid 596957 tid 597063 thread 8 bound to OS proc set {12}
OMP: pid 596957 tid 597057 thread 2 bound to OS proc set {3}
OMP: pid 596957 tid 597062 thread 7 bound to OS proc set {10}
OMP: pid 596957 tid 597066 thread 11 bound to OS proc set {16}
OMP: pid 596957 tid 597118 thread 63 bound to OS proc set {95}
OMP: pid 596957 tid 597058 thread 3 bound to OS proc set {4}
OMP: pid 596957 tid 597090 thread 35 bound to OS proc set {53}
OMP: pid 596957 tid 597106 thread 51 bound to OS proc set {77}
OMP: pid 596957 tid 597103 thread 48 bound to OS proc set {72}
OMP: pid 596957 tid 597065 thread 10 bound to OS proc set {15}
OMP: pid 596957 tid 597064 thread 9 bound to OS proc set {13}
OMP: pid 596957 tid 597079 thread 24 bound to OS proc set {36}
OMP: pid 596957 tid 597105 thread 50 bound to OS proc set {75}
OMP: pid 596957 tid 597069 thread 14 bound to OS proc set {21}
OMP: pid 596957 tid 597085 thread 30 bound to OS proc set {45}
OMP: pid 596957 tid 597086 thread 31 bound to OS proc set {46}
OMP: pid 596957 tid 597098 thread 43 bound to OS proc set {65}
OMP: pid 596957 tid 597089 thread 34 bound to OS proc set {51}
OMP: pid 596957 tid 597059 thread 4 bound to OS proc set {6}
OMP: pid 596957 tid 597083 thread 28 bound to OS proc set {42}
OMP: pid 596957 tid 597067 thread 12 bound to OS proc set {18}
OMP: pid 596957 tid 597091 thread 36 bound to OS proc set {54}
OMP: pid 596957 tid 597095 thread 40 bound to OS proc set {60}
OMP: pid 596957 tid 597087 thread 32 bound to OS proc set {48}
OMP: pid 596957 tid 597060 thread 5 bound to OS proc set {7}
OMP: pid 596957 tid 597088 thread 33 bound to OS proc set {50}
OMP: pid 596957 tid 597102 thread 47 bound to OS proc set {71}
OMP: pid 596957 tid 597074 thread 19 bound to OS proc set {28}
OMP: pid 596957 tid 597072 thread 17 bound to OS proc set {25}
OMP: pid 596957 tid 597061 thread 6 bound to OS proc set {9}
OMP: pid 596957 tid 597117 thread 62 bound to OS proc set {93}
OMP: pid 596957 tid 597068 thread 13 bound to OS proc set {19}
OMP: pid 596957 tid 597073 thread 18 bound to OS proc set {27}
OMP: pid 596957 tid 597110 thread 55 bound to OS proc set {83}
OMP: pid 596957 tid 597070 thread 15 bound to OS proc set {22}
OMP: pid 596957 tid 597080 thread 25 bound to OS proc set {37}
OMP: pid 596957 tid 597099 thread 44 bound to OS proc set {66}
OMP: pid 596957 tid 597071 thread 16 bound to OS proc set {24}
OMP: pid 596957 tid 597082 thread 27 bound to OS proc set {40}
OMP: pid 596957 tid 597075 thread 20 bound to OS proc set {30}
OMP: pid 596957 tid 597114 thread 59 bound to OS proc set {89}
OMP: pid 596957 tid 597078 thread 23 bound to OS proc set {34}
OMP: pid 596957 tid 597084 thread 29 bound to OS proc set {43}
OMP: pid 596957 tid 597104 thread 49 bound to OS proc set {74}
OMP: pid 596957 tid 597111 thread 56 bound to OS proc set {84}
OMP: pid 596957 tid 597081 thread 26 bound to OS proc set {39}
OMP: pid 596957 tid 597097 thread 42 bound to OS proc set {63}
OMP: pid 596957 tid 597094 thread 39 bound to OS proc set {59}
OMP: pid 596957 tid 597093 thread 38 bound to OS proc set {57}
OMP: pid 596957 tid 597077 thread 22 bound to OS proc set {33}
OMP: pid 596957 tid 597101 thread 46 bound to OS proc set {69}
OMP: pid 596957 tid 597113 thread 58 bound to OS proc set {87}
OMP: pid 596957 tid 597092 thread 37 bound to OS proc set {56}
OMP: pid 596957 tid 597100 thread 45 bound to OS proc set {68}
OMP: pid 596957 tid 597107 thread 52 bound to OS proc set {78}
OMP: pid 596957 tid 597076 thread 21 bound to OS proc set {31}
OMP: pid 596957 tid 597109 thread 54 bound to OS proc set {81}
OMP: pid 596957 tid 597115 thread 60 bound to OS proc set {90}
OMP: pid 596957 tid 597116 thread 61 bound to OS proc set {92}
OMP: pid 596957 tid 597112 thread 57 bound to OS proc set {86}
OMP: pid 596957 tid 597108 thread 53 bound to OS proc set {80}
OMP: pid 596957 tid 597096 thread 41 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.110477, "speed_tg": 164.604980, "t": 3.110477, "speed": 164.604980}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_10

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_10  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 597187 tid 597187 thread 0 bound to OS proc set {0}
OMP: pid 597187 tid 597287 thread 2 bound to OS proc set {2}
OMP: pid 597187 tid 597286 thread 1 bound to OS proc set {1}
OMP: pid 597187 tid 597288 thread 3 bound to OS proc set {4}
OMP: pid 597187 tid 597296 thread 11 bound to OS proc set {14}
OMP: pid 597187 tid 597293 thread 8 bound to OS proc set {10}
OMP: pid 597187 tid 597352 thread 67 bound to OS proc set {90}
OMP: pid 597187 tid 597356 thread 71 bound to OS proc set {95}
OMP: pid 597187 tid 597349 thread 64 bound to OS proc set {86}
OMP: pid 597187 tid 597333 thread 48 bound to OS proc set {64}
OMP: pid 597187 tid 597320 thread 35 bound to OS proc set {47}
OMP: pid 597187 tid 597312 thread 27 bound to OS proc set {36}
OMP: pid 597187 tid 597299 thread 14 bound to OS proc set {18}
OMP: pid 597187 tid 597353 thread 68 bound to OS proc set {91}
OMP: pid 597187 tid 597336 thread 51 bound to OS proc set {68}
OMP: pid 597187 tid 597311 thread 26 bound to OS proc set {35}
OMP: pid 597187 tid 597295 thread 10 bound to OS proc set {13}
OMP: pid 597187 tid 597304 thread 19 bound to OS proc set {25}
OMP: pid 597187 tid 597315 thread 30 bound to OS proc set {40}
OMP: pid 597187 tid 597355 thread 70 bound to OS proc set {94}
OMP: pid 597187 tid 597297 thread 12 bound to OS proc set {16}
OMP: pid 597187 tid 597309 thread 24 bound to OS proc set {32}
OMP: pid 597187 tid 597292 thread 7 bound to OS proc set {9}
OMP: pid 597187 tid 597300 thread 15 bound to OS proc set {20}
OMP: pid 597187 tid 597335 thread 50 bound to OS proc set {67}
OMP: pid 597187 tid 597303 thread 18 bound to OS proc set {24}
OMP: pid 597187 tid 597330 thread 45 bound to OS proc set {60}
OMP: pid 597187 tid 597319 thread 34 bound to OS proc set {45}
OMP: pid 597187 tid 597331 thread 46 bound to OS proc set {61}
OMP: pid 597187 tid 597325 thread 40 bound to OS proc set {53}
OMP: pid 597187 tid 597289 thread 4 bound to OS proc set {5}
OMP: pid 597187 tid 597316 thread 31 bound to OS proc set {41}
OMP: pid 597187 tid 597348 thread 63 bound to OS proc set {84}
OMP: pid 597187 tid 597291 thread 6 bound to OS proc set {8}
OMP: pid 597187 tid 597337 thread 52 bound to OS proc set {70}
OMP: pid 597187 tid 597332 thread 47 bound to OS proc set {63}
OMP: pid 597187 tid 597308 thread 23 bound to OS proc set {30}
OMP: pid 597187 tid 597318 thread 33 bound to OS proc set {44}
OMP: pid 597187 tid 597350 thread 65 bound to OS proc set {87}
OMP: pid 597187 tid 597294 thread 9 bound to OS proc set {12}
OMP: pid 597187 tid 597328 thread 43 bound to OS proc set {57}
OMP: pid 597187 tid 597324 thread 39 bound to OS proc set {52}
OMP: pid 597187 tid 597351 thread 66 bound to OS proc set {88}
OMP: pid 597187 tid 597314 thread 29 bound to OS proc set {39}
OMP: pid 597187 tid 597313 thread 28 bound to OS proc set {37}
OMP: pid 597187 tid 597323 thread 38 bound to OS proc set {51}
OMP: pid 597187 tid 597298 thread 13 bound to OS proc set {17}
OMP: pid 597187 tid 597317 thread 32 bound to OS proc set {43}
OMP: pid 597187 tid 597329 thread 44 bound to OS proc set {59}
OMP: pid 597187 tid 597290 thread 5 bound to OS proc set {6}
OMP: pid 597187 tid 597310 thread 25 bound to OS proc set {33}
OMP: pid 597187 tid 597322 thread 37 bound to OS proc set {49}
OMP: pid 597187 tid 597305 thread 20 bound to OS proc set {26}
OMP: pid 597187 tid 597301 thread 16 bound to OS proc set {21}
OMP: pid 597187 tid 597338 thread 53 bound to OS proc set {71}
OMP: pid 597187 tid 597340 thread 55 bound to OS proc set {74}
OMP: pid 597187 tid 597321 thread 36 bound to OS proc set {48}
OMP: pid 597187 tid 597344 thread 59 bound to OS proc set {79}
OMP: pid 597187 tid 597339 thread 54 bound to OS proc set {72}
OMP: pid 597187 tid 597307 thread 22 bound to OS proc set {29}
OMP: pid 597187 tid 597326 thread 41 bound to OS proc set {55}
OMP: pid 597187 tid 597306 thread 21 bound to OS proc set {28}
OMP: pid 597187 tid 597302 thread 17 bound to OS proc set {22}
OMP: pid 597187 tid 597345 thread 60 bound to OS proc set {80}
OMP: pid 597187 tid 597327 thread 42 bound to OS proc set {56}
OMP: pid 597187 tid 597334 thread 49 bound to OS proc set {66}
OMP: pid 597187 tid 597354 thread 69 bound to OS proc set {92}
OMP: pid 597187 tid 597347 thread 62 bound to OS proc set {83}
OMP: pid 597187 tid 597341 thread 56 bound to OS proc set {75}
OMP: pid 597187 tid 597346 thread 61 bound to OS proc set {82}
OMP: pid 597187 tid 597343 thread 58 bound to OS proc set {78}
OMP: pid 597187 tid 597342 thread 57 bound to OS proc set {76}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 72, "n_threads_batch": 72, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.166415, "speed_tg": 161.697067, "t": 3.166415, "speed": 161.697067}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_11

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_11      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_11  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_11  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_11  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_11      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_11  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_11  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_11  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 597376 tid 597376 thread 0 bound to OS proc set {0}
OMP: pid 597376 tid 597477 thread 3 bound to OS proc set {3}
OMP: pid 597376 tid 597476 thread 2 bound to OS proc set {2}
OMP: pid 597376 tid 597478 thread 4 bound to OS proc set {4}
OMP: pid 597376 tid 597475 thread 1 bound to OS proc set {1}
OMP: pid 597376 tid 597540 thread 66 bound to OS proc set {80}
OMP: pid 597376 tid 597501 thread 27 bound to OS proc set {32}
OMP: pid 597376 tid 597481 thread 7 bound to OS proc set {8}
OMP: pid 597376 tid 597484 thread 10 bound to OS proc set {12}
OMP: pid 597376 tid 597537 thread 63 bound to OS proc set {76}
OMP: pid 597376 tid 597541 thread 67 bound to OS proc set {81}
OMP: pid 597376 tid 597488 thread 14 bound to OS proc set {16}
OMP: pid 597376 tid 597486 thread 12 bound to OS proc set {14}
OMP: pid 597376 tid 597553 thread 79 bound to OS proc set {95}
OMP: pid 597376 tid 597534 thread 60 bound to OS proc set {72}
OMP: pid 597376 tid 597490 thread 16 bound to OS proc set {19}
OMP: pid 597376 tid 597480 thread 6 bound to OS proc set {7}
OMP: pid 597376 tid 597502 thread 28 bound to OS proc set {33}
OMP: pid 597376 tid 597524 thread 50 bound to OS proc set {60}
OMP: pid 597376 tid 597507 thread 33 bound to OS proc set {40}
OMP: pid 597376 tid 597489 thread 15 bound to OS proc set {18}
OMP: pid 597376 tid 597508 thread 34 bound to OS proc set {41}
OMP: pid 597376 tid 597525 thread 51 bound to OS proc set {61}
OMP: pid 597376 tid 597498 thread 24 bound to OS proc set {29}
OMP: pid 597376 tid 597530 thread 56 bound to OS proc set {67}
OMP: pid 597376 tid 597503 thread 29 bound to OS proc set {35}
OMP: pid 597376 tid 597535 thread 61 bound to OS proc set {73}
OMP: pid 597376 tid 597517 thread 43 bound to OS proc set {52}
OMP: pid 597376 tid 597504 thread 30 bound to OS proc set {36}
OMP: pid 597376 tid 597499 thread 25 bound to OS proc set {30}
OMP: pid 597376 tid 597485 thread 11 bound to OS proc set {13}
OMP: pid 597376 tid 597487 thread 13 bound to OS proc set {15}
OMP: pid 597376 tid 597482 thread 8 bound to OS proc set {9}
OMP: pid 597376 tid 597521 thread 47 bound to OS proc set {56}
OMP: pid 597376 tid 597529 thread 55 bound to OS proc set {66}
OMP: pid 597376 tid 597523 thread 49 bound to OS proc set {59}
OMP: pid 597376 tid 597545 thread 71 bound to OS proc set {86}
OMP: pid 597376 tid 597510 thread 36 bound to OS proc set {43}
OMP: pid 597376 tid 597497 thread 23 bound to OS proc set {27}
OMP: pid 597376 tid 597532 thread 58 bound to OS proc set {70}
OMP: pid 597376 tid 597505 thread 31 bound to OS proc set {37}
OMP: pid 597376 tid 597549 thread 75 bound to OS proc set {90}
OMP: pid 597376 tid 597538 thread 64 bound to OS proc set {77}
OMP: pid 597376 tid 597479 thread 5 bound to OS proc set {6}
OMP: pid 597376 tid 597500 thread 26 bound to OS proc set {31}
OMP: pid 597376 tid 597552 thread 78 bound to OS proc set {94}
OMP: pid 597376 tid 597518 thread 44 bound to OS proc set {53}
OMP: pid 597376 tid 597536 thread 62 bound to OS proc set {75}
OMP: pid 597376 tid 597539 thread 65 bound to OS proc set {78}
OMP: pid 597376 tid 597512 thread 38 bound to OS proc set {46}
OMP: pid 597376 tid 597546 thread 72 bound to OS proc set {87}
OMP: pid 597376 tid 597533 thread 59 bound to OS proc set {71}
OMP: pid 597376 tid 597528 thread 54 bound to OS proc set {65}
OMP: pid 597376 tid 597526 thread 52 bound to OS proc set {63}
OMP: pid 597376 tid 597522 thread 48 bound to OS proc set {58}
OMP: pid 597376 tid 597520 thread 46 bound to OS proc set {55}
OMP: pid 597376 tid 597514 thread 40 bound to OS proc set {48}
OMP: pid 597376 tid 597483 thread 9 bound to OS proc set {10}
OMP: pid 597376 tid 597493 thread 19 bound to OS proc set {23}
OMP: pid 597376 tid 597516 thread 42 bound to OS proc set {50}
OMP: pid 597376 tid 597509 thread 35 bound to OS proc set {42}
OMP: pid 597376 tid 597542 thread 68 bound to OS proc set {82}
OMP: pid 597376 tid 597495 thread 21 bound to OS proc set {25}
OMP: pid 597376 tid 597511 thread 37 bound to OS proc set {44}
OMP: pid 597376 tid 597519 thread 45 bound to OS proc set {54}
OMP: pid 597376 tid 597506 thread 32 bound to OS proc set {38}
OMP: pid 597376 tid 597494 thread 20 bound to OS proc set {24}
OMP: pid 597376 tid 597492 thread 18 bound to OS proc set {21}
OMP: pid 597376 tid 597513 thread 39 bound to OS proc set {47}
OMP: pid 597376 tid 597496 thread 22 bound to OS proc set {26}
OMP: pid 597376 tid 597527 thread 53 bound to OS proc set {64}
OMP: pid 597376 tid 597491 thread 17 bound to OS proc set {20}
OMP: pid 597376 tid 597531 thread 57 bound to OS proc set {69}
OMP: pid 597376 tid 597515 thread 41 bound to OS proc set {49}
OMP: pid 597376 tid 597550 thread 76 bound to OS proc set {92}
OMP: pid 597376 tid 597551 thread 77 bound to OS proc set {93}
OMP: pid 597376 tid 597547 thread 73 bound to OS proc set {88}
OMP: pid 597376 tid 597543 thread 69 bound to OS proc set {83}
OMP: pid 597376 tid 597544 thread 70 bound to OS proc set {84}
OMP: pid 597376 tid 597548 thread 74 bound to OS proc set {89}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 80, "n_threads_batch": 80, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.252963, "speed_tg": 157.394958, "t": 3.252963, "speed": 157.394958}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_12

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_12      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_12  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_12  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_12  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_12      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_12  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_12  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_12  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 597573 tid 597573 thread 0 bound to OS proc set {0}
OMP: pid 597573 tid 597674 thread 3 bound to OS proc set {3}
OMP: pid 597573 tid 597673 thread 2 bound to OS proc set {2}
OMP: pid 597573 tid 597679 thread 8 bound to OS proc set {8}
OMP: pid 597573 tid 597675 thread 4 bound to OS proc set {4}
OMP: pid 597573 tid 597678 thread 7 bound to OS proc set {7}
OMP: pid 597573 tid 597672 thread 1 bound to OS proc set {1}
OMP: pid 597573 tid 597699 thread 28 bound to OS proc set {30}
OMP: pid 597573 tid 597690 thread 19 bound to OS proc set {20}
OMP: pid 597573 tid 597677 thread 6 bound to OS proc set {6}
OMP: pid 597573 tid 597698 thread 27 bound to OS proc set {29}
OMP: pid 597573 tid 597680 thread 9 bound to OS proc set {9}
OMP: pid 597573 tid 597676 thread 5 bound to OS proc set {5}
OMP: pid 597573 tid 597700 thread 29 bound to OS proc set {31}
OMP: pid 597573 tid 597697 thread 26 bound to OS proc set {28}
OMP: pid 597573 tid 597730 thread 59 bound to OS proc set {65}
OMP: pid 597573 tid 597702 thread 31 bound to OS proc set {34}
OMP: pid 597573 tid 597719 thread 48 bound to OS proc set {52}
OMP: pid 597573 tid 597681 thread 10 bound to OS proc set {11}
OMP: pid 597573 tid 597718 thread 47 bound to OS proc set {51}
OMP: pid 597573 tid 597734 thread 63 bound to OS proc set {69}
OMP: pid 597573 tid 597683 thread 12 bound to OS proc set {13}
OMP: pid 597573 tid 597685 thread 14 bound to OS proc set {15}
OMP: pid 597573 tid 597727 thread 56 bound to OS proc set {61}
OMP: pid 597573 tid 597720 thread 49 bound to OS proc set {54}
OMP: pid 597573 tid 597686 thread 15 bound to OS proc set {16}
OMP: pid 597573 tid 597729 thread 58 bound to OS proc set {63}
OMP: pid 597573 tid 597726 thread 55 bound to OS proc set {60}
OMP: pid 597573 tid 597701 thread 30 bound to OS proc set {33}
OMP: pid 597573 tid 597746 thread 75 bound to OS proc set {82}
OMP: pid 597573 tid 597723 thread 52 bound to OS proc set {57}
OMP: pid 597573 tid 597705 thread 34 bound to OS proc set {37}
OMP: pid 597573 tid 597682 thread 11 bound to OS proc set {12}
OMP: pid 597573 tid 597735 thread 64 bound to OS proc set {70}
OMP: pid 597573 tid 597710 thread 39 bound to OS proc set {42}
OMP: pid 597573 tid 597742 thread 71 bound to OS proc set {78}
OMP: pid 597573 tid 597709 thread 38 bound to OS proc set {41}
OMP: pid 597573 tid 597715 thread 44 bound to OS proc set {48}
OMP: pid 597573 tid 597747 thread 76 bound to OS proc set {83}
OMP: pid 597573 tid 597717 thread 46 bound to OS proc set {50}
OMP: pid 597573 tid 597722 thread 51 bound to OS proc set {56}
OMP: pid 597573 tid 597706 thread 35 bound to OS proc set {38}
OMP: pid 597573 tid 597703 thread 32 bound to OS proc set {35}
OMP: pid 597573 tid 597731 thread 60 bound to OS proc set {66}
OMP: pid 597573 tid 597743 thread 72 bound to OS proc set {79}
OMP: pid 597573 tid 597732 thread 61 bound to OS proc set {67}
OMP: pid 597573 tid 597725 thread 54 bound to OS proc set {59}
OMP: pid 597573 tid 597711 thread 40 bound to OS proc set {44}
OMP: pid 597573 tid 597704 thread 33 bound to OS proc set {36}
OMP: pid 597573 tid 597694 thread 23 bound to OS proc set {25}
OMP: pid 597573 tid 597754 thread 83 bound to OS proc set {91}
OMP: pid 597573 tid 597728 thread 57 bound to OS proc set {62}
OMP: pid 597573 tid 597714 thread 43 bound to OS proc set {47}
OMP: pid 597573 tid 597695 thread 24 bound to OS proc set {26}
OMP: pid 597573 tid 597733 thread 62 bound to OS proc set {68}
OMP: pid 597573 tid 597753 thread 82 bound to OS proc set {90}
OMP: pid 597573 tid 597712 thread 41 bound to OS proc set {45}
OMP: pid 597573 tid 597749 thread 78 bound to OS proc set {85}
OMP: pid 597573 tid 597684 thread 13 bound to OS proc set {14}
OMP: pid 597573 tid 597736 thread 65 bound to OS proc set {71}
OMP: pid 597573 tid 597721 thread 50 bound to OS proc set {55}
OMP: pid 597573 tid 597707 thread 36 bound to OS proc set {39}
OMP: pid 597573 tid 597687 thread 16 bound to OS proc set {17}
OMP: pid 597573 tid 597745 thread 74 bound to OS proc set {81}
OMP: pid 597573 tid 597688 thread 17 bound to OS proc set {18}
OMP: pid 597573 tid 597752 thread 81 bound to OS proc set {89}
OMP: pid 597573 tid 597708 thread 37 bound to OS proc set {40}
OMP: pid 597573 tid 597750 thread 79 bound to OS proc set {87}
OMP: pid 597573 tid 597716 thread 45 bound to OS proc set {49}
OMP: pid 597573 tid 597739 thread 68 bound to OS proc set {74}
OMP: pid 597573 tid 597744 thread 73 bound to OS proc set {80}
OMP: pid 597573 tid 597738 thread 67 bound to OS proc set {73}
OMP: pid 597573 tid 597740 thread 69 bound to OS proc set {76}
OMP: pid 597573 tid 597748 thread 77 bound to OS proc set {84}
OMP: pid 597573 tid 597758 thread 87 bound to OS proc set {95}
OMP: pid 597573 tid 597693 thread 22 bound to OS proc set {24}
OMP: pid 597573 tid 597696 thread 25 bound to OS proc set {27}
OMP: pid 597573 tid 597751 thread 80 bound to OS proc set {88}
OMP: pid 597573 tid 597755 thread 84 bound to OS proc set {92}
OMP: pid 597573 tid 597757 thread 86 bound to OS proc set {94}
OMP: pid 597573 tid 597691 thread 20 bound to OS proc set {22}
OMP: pid 597573 tid 597692 thread 21 bound to OS proc set {23}
OMP: pid 597573 tid 597724 thread 53 bound to OS proc set {58}
OMP: pid 597573 tid 597737 thread 66 bound to OS proc set {72}
OMP: pid 597573 tid 597741 thread 70 bound to OS proc set {77}
OMP: pid 597573 tid 597756 thread 85 bound to OS proc set {93}
OMP: pid 597573 tid 597689 thread 18 bound to OS proc set {19}
OMP: pid 597573 tid 597713 thread 42 bound to OS proc set {46}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 88, "n_threads_batch": 88, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.373354, "speed_tg": 151.777725, "t": 3.373354, "speed": 151.777725}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_13

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_13      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_13  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_13  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_13  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_13      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_13  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_13  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_13  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 597778 tid 597778 thread 0 bound to OS proc set {0}
OMP: pid 597778 tid 597891 thread 15 bound to OS proc set {15}
OMP: pid 597778 tid 597879 thread 3 bound to OS proc set {3}
OMP: pid 597778 tid 597939 thread 63 bound to OS proc set {63}
OMP: pid 597778 tid 597888 thread 12 bound to OS proc set {12}
OMP: pid 597778 tid 597927 thread 51 bound to OS proc set {51}
OMP: pid 597778 tid 597878 thread 2 bound to OS proc set {2}
OMP: pid 597778 tid 597887 thread 11 bound to OS proc set {11}
OMP: pid 597778 tid 597890 thread 14 bound to OS proc set {14}
OMP: pid 597778 tid 597936 thread 60 bound to OS proc set {60}
OMP: pid 597778 tid 597907 thread 31 bound to OS proc set {31}
OMP: pid 597778 tid 597884 thread 8 bound to OS proc set {8}
OMP: pid 597778 tid 597883 thread 7 bound to OS proc set {7}
OMP: pid 597778 tid 597935 thread 59 bound to OS proc set {59}
OMP: pid 597778 tid 597938 thread 62 bound to OS proc set {62}
OMP: pid 597778 tid 597904 thread 28 bound to OS proc set {28}
OMP: pid 597778 tid 597877 thread 1 bound to OS proc set {1}
OMP: pid 597778 tid 597924 thread 48 bound to OS proc set {48}
OMP: pid 597778 tid 597880 thread 4 bound to OS proc set {4}
OMP: pid 597778 tid 597889 thread 13 bound to OS proc set {13}
OMP: pid 597778 tid 597886 thread 10 bound to OS proc set {10}
OMP: pid 597778 tid 597906 thread 30 bound to OS proc set {30}
OMP: pid 597778 tid 597911 thread 35 bound to OS proc set {35}
OMP: pid 597778 tid 597903 thread 27 bound to OS proc set {27}
OMP: pid 597778 tid 597908 thread 32 bound to OS proc set {32}
OMP: pid 597778 tid 597900 thread 24 bound to OS proc set {24}
OMP: pid 597778 tid 597923 thread 47 bound to OS proc set {47}
OMP: pid 597778 tid 597926 thread 50 bound to OS proc set {50}
OMP: pid 597778 tid 597895 thread 19 bound to OS proc set {19}
OMP: pid 597778 tid 597905 thread 29 bound to OS proc set {29}
OMP: pid 597778 tid 597955 thread 79 bound to OS proc set {79}
OMP: pid 597778 tid 597931 thread 55 bound to OS proc set {55}
OMP: pid 597778 tid 597902 thread 26 bound to OS proc set {26}
OMP: pid 597778 tid 597892 thread 16 bound to OS proc set {16}
OMP: pid 597778 tid 597910 thread 34 bound to OS proc set {34}
OMP: pid 597778 tid 597937 thread 61 bound to OS proc set {61}
OMP: pid 597778 tid 597932 thread 56 bound to OS proc set {56}
OMP: pid 597778 tid 597901 thread 25 bound to OS proc set {25}
OMP: pid 597778 tid 597922 thread 46 bound to OS proc set {46}
OMP: pid 597778 tid 597943 thread 67 bound to OS proc set {67}
OMP: pid 597778 tid 597920 thread 44 bound to OS proc set {44}
OMP: pid 597778 tid 597882 thread 6 bound to OS proc set {6}
OMP: pid 597778 tid 597919 thread 43 bound to OS proc set {43}
OMP: pid 597778 tid 597951 thread 75 bound to OS proc set {75}
OMP: pid 597778 tid 597948 thread 72 bound to OS proc set {72}
OMP: pid 597778 tid 597925 thread 49 bound to OS proc set {49}
OMP: pid 597778 tid 597885 thread 9 bound to OS proc set {9}
OMP: pid 597778 tid 597940 thread 64 bound to OS proc set {64}
OMP: pid 597778 tid 597952 thread 76 bound to OS proc set {76}
OMP: pid 597778 tid 597881 thread 5 bound to OS proc set {5}
OMP: pid 597778 tid 597942 thread 66 bound to OS proc set {66}
OMP: pid 597778 tid 597930 thread 54 bound to OS proc set {54}
OMP: pid 597778 tid 597916 thread 40 bound to OS proc set {40}
OMP: pid 597778 tid 597894 thread 18 bound to OS proc set {18}
OMP: pid 597778 tid 597950 thread 74 bound to OS proc set {74}
OMP: pid 597778 tid 597954 thread 78 bound to OS proc set {78}
OMP: pid 597778 tid 597899 thread 23 bound to OS proc set {23}
OMP: pid 597778 tid 597934 thread 58 bound to OS proc set {58}
OMP: pid 597778 tid 597947 thread 71 bound to OS proc set {71}
OMP: pid 597778 tid 597893 thread 17 bound to OS proc set {17}
OMP: pid 597778 tid 597898 thread 22 bound to OS proc set {22}
OMP: pid 597778 tid 597915 thread 39 bound to OS proc set {39}
OMP: pid 597778 tid 597921 thread 45 bound to OS proc set {45}
OMP: pid 597778 tid 597933 thread 57 bound to OS proc set {57}
OMP: pid 597778 tid 597941 thread 65 bound to OS proc set {65}
OMP: pid 597778 tid 597944 thread 68 bound to OS proc set {68}
OMP: pid 597778 tid 597949 thread 73 bound to OS proc set {73}
OMP: pid 597778 tid 597953 thread 77 bound to OS proc set {77}
OMP: pid 597778 tid 597896 thread 20 bound to OS proc set {20}
OMP: pid 597778 tid 597912 thread 36 bound to OS proc set {36}
OMP: pid 597778 tid 597909 thread 33 bound to OS proc set {33}
OMP: pid 597778 tid 597928 thread 52 bound to OS proc set {52}
OMP: pid 597778 tid 597929 thread 53 bound to OS proc set {53}
OMP: pid 597778 tid 597946 thread 70 bound to OS proc set {70}
OMP: pid 597778 tid 597956 thread 80 bound to OS proc set {80}
OMP: pid 597778 tid 597958 thread 82 bound to OS proc set {82}
OMP: pid 597778 tid 597897 thread 21 bound to OS proc set {21}
OMP: pid 597778 tid 597917 thread 41 bound to OS proc set {41}
OMP: pid 597778 tid 597945 thread 69 bound to OS proc set {69}
OMP: pid 597778 tid 597918 thread 42 bound to OS proc set {42}
OMP: pid 597778 tid 597913 thread 37 bound to OS proc set {37}
OMP: pid 597778 tid 597914 thread 38 bound to OS proc set {38}
OMP: pid 597778 tid 597957 thread 81 bound to OS proc set {81}
OMP: pid 597778 tid 597968 thread 92 bound to OS proc set {92}
OMP: pid 597778 tid 597967 thread 91 bound to OS proc set {91}
OMP: pid 597778 tid 597971 thread 95 bound to OS proc set {95}
OMP: pid 597778 tid 597964 thread 88 bound to OS proc set {88}
OMP: pid 597778 tid 597966 thread 90 bound to OS proc set {90}
OMP: pid 597778 tid 597965 thread 89 bound to OS proc set {89}
OMP: pid 597778 tid 597963 thread 87 bound to OS proc set {87}
OMP: pid 597778 tid 597960 thread 84 bound to OS proc set {84}
OMP: pid 597778 tid 597962 thread 86 bound to OS proc set {86}
OMP: pid 597778 tid 597961 thread 85 bound to OS proc set {85}
OMP: pid 597778 tid 597969 thread 93 bound to OS proc set {93}
OMP: pid 597778 tid 597970 thread 94 bound to OS proc set {94}
OMP: pid 597778 tid 597959 thread 83 bound to OS proc set {83}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 96, "n_threads_batch": 96, "pp": 0, "tg": 128, "pl": 4, "n_kv": 512, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.437525, "speed_tg": 148.944366, "t": 3.437525, "speed": 148.944366}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_14

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_14      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_14  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_14  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_14  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_14      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_14  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_14  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-4820/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_10-20-44/tools/lprof_npsu_run_14  #
#########################################################################################################################################################################################################################################
