Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 172.026230, "speed_tg": 11.905161, "t": 172.026230, "speed": 11.905161}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_0

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_0  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 622419 tid 622419 thread 0 bound to OS proc set {0}
OMP: pid 622419 tid 622518 thread 1 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 86.400230, "speed_tg": 23.703640, "t": 86.400230, "speed": 23.703640}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_1

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_1  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 622590 tid 622590 thread 0 bound to OS proc set {0}
OMP: pid 622590 tid 622689 thread 1 bound to OS proc set {24}
OMP: pid 622590 tid 622690 thread 2 bound to OS proc set {48}
OMP: pid 622590 tid 622691 thread 3 bound to OS proc set {72}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 43.401627, "speed_tg": 47.187172, "t": 43.401627, "speed": 47.187172}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_2

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_2  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 622759 tid 622759 thread 0 bound to OS proc set {0}
OMP: pid 622759 tid 622862 thread 4 bound to OS proc set {48}
OMP: pid 622759 tid 622860 thread 2 bound to OS proc set {24}
OMP: pid 622759 tid 622859 thread 1 bound to OS proc set {12}
OMP: pid 622759 tid 622864 thread 6 bound to OS proc set {72}
OMP: pid 622759 tid 622861 thread 3 bound to OS proc set {36}
OMP: pid 622759 tid 622863 thread 5 bound to OS proc set {60}
OMP: pid 622759 tid 622865 thread 7 bound to OS proc set {84}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 22.067469, "speed_tg": 92.806297, "t": 22.067469, "speed": 92.806297}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_3

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_3  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 622885 tid 622885 thread 0 bound to OS proc set {0}
OMP: pid 622885 tid 622985 thread 2 bound to OS proc set {12}
OMP: pid 622885 tid 622986 thread 3 bound to OS proc set {18}
OMP: pid 622885 tid 622995 thread 12 bound to OS proc set {72}
OMP: pid 622885 tid 622984 thread 1 bound to OS proc set {6}
OMP: pid 622885 tid 622997 thread 14 bound to OS proc set {84}
OMP: pid 622885 tid 622994 thread 11 bound to OS proc set {66}
OMP: pid 622885 tid 622987 thread 4 bound to OS proc set {24}
OMP: pid 622885 tid 622991 thread 8 bound to OS proc set {48}
OMP: pid 622885 tid 622988 thread 5 bound to OS proc set {30}
OMP: pid 622885 tid 622990 thread 7 bound to OS proc set {42}
OMP: pid 622885 tid 622996 thread 13 bound to OS proc set {78}
OMP: pid 622885 tid 622989 thread 6 bound to OS proc set {36}
OMP: pid 622885 tid 622993 thread 10 bound to OS proc set {60}
OMP: pid 622885 tid 622992 thread 9 bound to OS proc set {54}
OMP: pid 622885 tid 622998 thread 15 bound to OS proc set {90}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 11.695828, "speed_tg": 175.105164, "t": 11.695828, "speed": 175.105164}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_4

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_4  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 623018 tid 623018 thread 0 bound to OS proc set {0}
OMP: pid 623018 tid 623119 thread 3 bound to OS proc set {12}
OMP: pid 623018 tid 623120 thread 4 bound to OS proc set {16}
OMP: pid 623018 tid 623118 thread 2 bound to OS proc set {8}
OMP: pid 623018 tid 623117 thread 1 bound to OS proc set {4}
OMP: pid 623018 tid 623124 thread 8 bound to OS proc set {32}
OMP: pid 623018 tid 623131 thread 15 bound to OS proc set {60}
OMP: pid 623018 tid 623128 thread 12 bound to OS proc set {48}
OMP: pid 623018 tid 623121 thread 5 bound to OS proc set {20}
OMP: pid 623018 tid 623127 thread 11 bound to OS proc set {44}
OMP: pid 623018 tid 623132 thread 16 bound to OS proc set {64}
OMP: pid 623018 tid 623130 thread 14 bound to OS proc set {56}
OMP: pid 623018 tid 623134 thread 18 bound to OS proc set {72}
OMP: pid 623018 tid 623135 thread 19 bound to OS proc set {76}
OMP: pid 623018 tid 623123 thread 7 bound to OS proc set {28}
OMP: pid 623018 tid 623126 thread 10 bound to OS proc set {40}
OMP: pid 623018 tid 623129 thread 13 bound to OS proc set {52}
OMP: pid 623018 tid 623122 thread 6 bound to OS proc set {24}
OMP: pid 623018 tid 623125 thread 9 bound to OS proc set {36}
OMP: pid 623018 tid 623133 thread 17 bound to OS proc set {68}
OMP: pid 623018 tid 623137 thread 21 bound to OS proc set {84}
OMP: pid 623018 tid 623136 thread 20 bound to OS proc set {80}
OMP: pid 623018 tid 623138 thread 22 bound to OS proc set {88}
OMP: pid 623018 tid 623139 thread 23 bound to OS proc set {92}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 9.081829, "speed_tg": 225.505234, "t": 9.081829, "speed": 225.505234}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_5

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_5  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 623208 tid 623208 thread 0 bound to OS proc set {0}
OMP: pid 623208 tid 623316 thread 10 bound to OS proc set {30}
OMP: pid 623208 tid 623313 thread 7 bound to OS proc set {21}
OMP: pid 623208 tid 623307 thread 1 bound to OS proc set {3}
OMP: pid 623208 tid 623310 thread 4 bound to OS proc set {12}
OMP: pid 623208 tid 623309 thread 3 bound to OS proc set {9}
OMP: pid 623208 tid 623318 thread 12 bound to OS proc set {36}
OMP: pid 623208 tid 623321 thread 15 bound to OS proc set {45}
OMP: pid 623208 tid 623308 thread 2 bound to OS proc set {6}
OMP: pid 623208 tid 623320 thread 14 bound to OS proc set {42}
OMP: pid 623208 tid 623334 thread 28 bound to OS proc set {84}
OMP: pid 623208 tid 623311 thread 5 bound to OS proc set {15}
OMP: pid 623208 tid 623317 thread 11 bound to OS proc set {33}
OMP: pid 623208 tid 623314 thread 8 bound to OS proc set {24}
OMP: pid 623208 tid 623322 thread 16 bound to OS proc set {48}
OMP: pid 623208 tid 623319 thread 13 bound to OS proc set {39}
OMP: pid 623208 tid 623336 thread 30 bound to OS proc set {90}
OMP: pid 623208 tid 623315 thread 9 bound to OS proc set {27}
OMP: pid 623208 tid 623324 thread 18 bound to OS proc set {54}
OMP: pid 623208 tid 623335 thread 29 bound to OS proc set {87}
OMP: pid 623208 tid 623325 thread 19 bound to OS proc set {57}
OMP: pid 623208 tid 623333 thread 27 bound to OS proc set {81}
OMP: pid 623208 tid 623323 thread 17 bound to OS proc set {51}
OMP: pid 623208 tid 623330 thread 24 bound to OS proc set {72}
OMP: pid 623208 tid 623337 thread 31 bound to OS proc set {93}
OMP: pid 623208 tid 623312 thread 6 bound to OS proc set {18}
OMP: pid 623208 tid 623326 thread 20 bound to OS proc set {60}
OMP: pid 623208 tid 623331 thread 25 bound to OS proc set {75}
OMP: pid 623208 tid 623332 thread 26 bound to OS proc set {78}
OMP: pid 623208 tid 623328 thread 22 bound to OS proc set {66}
OMP: pid 623208 tid 623329 thread 23 bound to OS proc set {69}
OMP: pid 623208 tid 623327 thread 21 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 7.647170, "speed_tg": 267.811493, "t": 7.647170, "speed": 267.811493}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_6

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_6  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 623357 tid 623357 thread 0 bound to OS proc set {0}
OMP: pid 623357 tid 623470 thread 15 bound to OS proc set {36}
OMP: pid 623357 tid 623456 thread 1 bound to OS proc set {2}
OMP: pid 623357 tid 623458 thread 3 bound to OS proc set {7}
OMP: pid 623357 tid 623462 thread 7 bound to OS proc set {16}
OMP: pid 623357 tid 623471 thread 16 bound to OS proc set {38}
OMP: pid 623357 tid 623457 thread 2 bound to OS proc set {4}
OMP: pid 623357 tid 623459 thread 4 bound to OS proc set {9}
OMP: pid 623357 tid 623463 thread 8 bound to OS proc set {19}
OMP: pid 623357 tid 623467 thread 12 bound to OS proc set {29}
OMP: pid 623357 tid 623461 thread 6 bound to OS proc set {14}
OMP: pid 623357 tid 623487 thread 32 bound to OS proc set {77}
OMP: pid 623357 tid 623468 thread 13 bound to OS proc set {31}
OMP: pid 623357 tid 623460 thread 5 bound to OS proc set {12}
OMP: pid 623357 tid 623469 thread 14 bound to OS proc set {33}
OMP: pid 623357 tid 623491 thread 36 bound to OS proc set {87}
OMP: pid 623357 tid 623493 thread 38 bound to OS proc set {92}
OMP: pid 623357 tid 623483 thread 28 bound to OS proc set {67}
OMP: pid 623357 tid 623482 thread 27 bound to OS proc set {65}
OMP: pid 623357 tid 623488 thread 33 bound to OS proc set {80}
OMP: pid 623357 tid 623465 thread 10 bound to OS proc set {24}
OMP: pid 623357 tid 623474 thread 19 bound to OS proc set {46}
OMP: pid 623357 tid 623473 thread 18 bound to OS proc set {43}
OMP: pid 623357 tid 623464 thread 9 bound to OS proc set {21}
OMP: pid 623357 tid 623486 thread 31 bound to OS proc set {75}
OMP: pid 623357 tid 623485 thread 30 bound to OS proc set {72}
OMP: pid 623357 tid 623490 thread 35 bound to OS proc set {84}
OMP: pid 623357 tid 623492 thread 37 bound to OS proc set {89}
OMP: pid 623357 tid 623484 thread 29 bound to OS proc set {70}
OMP: pid 623357 tid 623466 thread 11 bound to OS proc set {26}
OMP: pid 623357 tid 623479 thread 24 bound to OS proc set {58}
OMP: pid 623357 tid 623475 thread 20 bound to OS proc set {48}
OMP: pid 623357 tid 623480 thread 25 bound to OS proc set {60}
OMP: pid 623357 tid 623478 thread 23 bound to OS proc set {55}
OMP: pid 623357 tid 623472 thread 17 bound to OS proc set {41}
OMP: pid 623357 tid 623481 thread 26 bound to OS proc set {63}
OMP: pid 623357 tid 623476 thread 21 bound to OS proc set {50}
OMP: pid 623357 tid 623477 thread 22 bound to OS proc set {53}
OMP: pid 623357 tid 623489 thread 34 bound to OS proc set {82}
OMP: pid 623357 tid 623494 thread 39 bound to OS proc set {94}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 6.755369, "speed_tg": 303.166260, "t": 6.755369, "speed": 303.166260}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_7

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_7  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 623514 tid 623514 thread 0 bound to OS proc set {0}
OMP: pid 623514 tid 623627 thread 15 bound to OS proc set {30}
OMP: pid 623514 tid 623619 thread 7 bound to OS proc set {14}
OMP: pid 623514 tid 623616 thread 4 bound to OS proc set {8}
OMP: pid 623514 tid 623614 thread 2 bound to OS proc set {4}
OMP: pid 623514 tid 623620 thread 8 bound to OS proc set {16}
OMP: pid 623514 tid 623613 thread 1 bound to OS proc set {2}
OMP: pid 623514 tid 623615 thread 3 bound to OS proc set {6}
OMP: pid 623514 tid 623621 thread 9 bound to OS proc set {18}
OMP: pid 623514 tid 623659 thread 47 bound to OS proc set {94}
OMP: pid 623514 tid 623658 thread 46 bound to OS proc set {92}
OMP: pid 623514 tid 623626 thread 14 bound to OS proc set {28}
OMP: pid 623514 tid 623628 thread 16 bound to OS proc set {32}
OMP: pid 623514 tid 623656 thread 44 bound to OS proc set {88}
OMP: pid 623514 tid 623647 thread 35 bound to OS proc set {70}
OMP: pid 623514 tid 623618 thread 6 bound to OS proc set {12}
OMP: pid 623514 tid 623629 thread 17 bound to OS proc set {34}
OMP: pid 623514 tid 623624 thread 12 bound to OS proc set {24}
OMP: pid 623514 tid 623640 thread 28 bound to OS proc set {56}
OMP: pid 623514 tid 623617 thread 5 bound to OS proc set {10}
OMP: pid 623514 tid 623623 thread 11 bound to OS proc set {22}
OMP: pid 623514 tid 623622 thread 10 bound to OS proc set {20}
OMP: pid 623514 tid 623630 thread 18 bound to OS proc set {36}
OMP: pid 623514 tid 623636 thread 24 bound to OS proc set {48}
OMP: pid 623514 tid 623633 thread 21 bound to OS proc set {42}
OMP: pid 623514 tid 623642 thread 30 bound to OS proc set {60}
OMP: pid 623514 tid 623635 thread 23 bound to OS proc set {46}
OMP: pid 623514 tid 623634 thread 22 bound to OS proc set {44}
OMP: pid 623514 tid 623625 thread 13 bound to OS proc set {26}
OMP: pid 623514 tid 623643 thread 31 bound to OS proc set {62}
OMP: pid 623514 tid 623644 thread 32 bound to OS proc set {64}
OMP: pid 623514 tid 623639 thread 27 bound to OS proc set {54}
OMP: pid 623514 tid 623631 thread 19 bound to OS proc set {38}
OMP: pid 623514 tid 623652 thread 40 bound to OS proc set {80}
OMP: pid 623514 tid 623632 thread 20 bound to OS proc set {40}
OMP: pid 623514 tid 623638 thread 26 bound to OS proc set {52}
OMP: pid 623514 tid 623645 thread 33 bound to OS proc set {66}
OMP: pid 623514 tid 623637 thread 25 bound to OS proc set {50}
OMP: pid 623514 tid 623651 thread 39 bound to OS proc set {78}
OMP: pid 623514 tid 623646 thread 34 bound to OS proc set {68}
OMP: pid 623514 tid 623657 thread 45 bound to OS proc set {90}
OMP: pid 623514 tid 623650 thread 38 bound to OS proc set {76}
OMP: pid 623514 tid 623655 thread 43 bound to OS proc set {86}
OMP: pid 623514 tid 623649 thread 37 bound to OS proc set {74}
OMP: pid 623514 tid 623648 thread 36 bound to OS proc set {72}
OMP: pid 623514 tid 623641 thread 29 bound to OS proc set {58}
OMP: pid 623514 tid 623654 thread 42 bound to OS proc set {84}
OMP: pid 623514 tid 623653 thread 41 bound to OS proc set {82}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 6.227937, "speed_tg": 328.840820, "t": 6.227937, "speed": 328.840820}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_8

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_8  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 623679 tid 623679 thread 0 bound to OS proc set {0}
OMP: pid 623679 tid 623778 thread 1 bound to OS proc set {1}
OMP: pid 623679 tid 623792 thread 15 bound to OS proc set {25}
OMP: pid 623679 tid 623808 thread 31 bound to OS proc set {53}
OMP: pid 623679 tid 623785 thread 8 bound to OS proc set {13}
OMP: pid 623679 tid 623788 thread 11 bound to OS proc set {19}
OMP: pid 623679 tid 623784 thread 7 bound to OS proc set {12}
OMP: pid 623679 tid 623806 thread 29 bound to OS proc set {50}
OMP: pid 623679 tid 623828 thread 51 bound to OS proc set {88}
OMP: pid 623679 tid 623829 thread 52 bound to OS proc set {90}
OMP: pid 623679 tid 623796 thread 19 bound to OS proc set {32}
OMP: pid 623679 tid 623809 thread 32 bound to OS proc set {55}
OMP: pid 623679 tid 623781 thread 4 bound to OS proc set {6}
OMP: pid 623679 tid 623786 thread 9 bound to OS proc set {15}
OMP: pid 623679 tid 623831 thread 54 bound to OS proc set {93}
OMP: pid 623679 tid 623795 thread 18 bound to OS proc set {31}
OMP: pid 623679 tid 623801 thread 24 bound to OS proc set {41}
OMP: pid 623679 tid 623832 thread 55 bound to OS proc set {95}
OMP: pid 623679 tid 623803 thread 26 bound to OS proc set {45}
OMP: pid 623679 tid 623779 thread 2 bound to OS proc set {3}
OMP: pid 623679 tid 623804 thread 27 bound to OS proc set {46}
OMP: pid 623679 tid 623791 thread 14 bound to OS proc set {24}
OMP: pid 623679 tid 623824 thread 47 bound to OS proc set {81}
OMP: pid 623679 tid 623787 thread 10 bound to OS proc set {17}
OMP: pid 623679 tid 623807 thread 30 bound to OS proc set {51}
OMP: pid 623679 tid 623805 thread 28 bound to OS proc set {48}
OMP: pid 623679 tid 623789 thread 12 bound to OS proc set {20}
OMP: pid 623679 tid 623783 thread 6 bound to OS proc set {10}
OMP: pid 623679 tid 623827 thread 50 bound to OS proc set {86}
OMP: pid 623679 tid 623794 thread 17 bound to OS proc set {29}
OMP: pid 623679 tid 623812 thread 35 bound to OS proc set {60}
OMP: pid 623679 tid 623810 thread 33 bound to OS proc set {57}
OMP: pid 623679 tid 623790 thread 13 bound to OS proc set {22}
OMP: pid 623679 tid 623799 thread 22 bound to OS proc set {38}
OMP: pid 623679 tid 623780 thread 3 bound to OS proc set {5}
OMP: pid 623679 tid 623797 thread 20 bound to OS proc set {34}
OMP: pid 623679 tid 623830 thread 53 bound to OS proc set {91}
OMP: pid 623679 tid 623825 thread 48 bound to OS proc set {83}
OMP: pid 623679 tid 623820 thread 43 bound to OS proc set {74}
OMP: pid 623679 tid 623782 thread 5 bound to OS proc set {8}
OMP: pid 623679 tid 623802 thread 25 bound to OS proc set {43}
OMP: pid 623679 tid 623800 thread 23 bound to OS proc set {39}
OMP: pid 623679 tid 623816 thread 39 bound to OS proc set {67}
OMP: pid 623679 tid 623815 thread 38 bound to OS proc set {65}
OMP: pid 623679 tid 623798 thread 21 bound to OS proc set {36}
OMP: pid 623679 tid 623811 thread 34 bound to OS proc set {58}
OMP: pid 623679 tid 623817 thread 40 bound to OS proc set {69}
OMP: pid 623679 tid 623822 thread 45 bound to OS proc set {77}
OMP: pid 623679 tid 623819 thread 42 bound to OS proc set {72}
OMP: pid 623679 tid 623814 thread 37 bound to OS proc set {64}
OMP: pid 623679 tid 623813 thread 36 bound to OS proc set {62}
OMP: pid 623679 tid 623821 thread 44 bound to OS proc set {76}
OMP: pid 623679 tid 623826 thread 49 bound to OS proc set {84}
OMP: pid 623679 tid 623793 thread 16 bound to OS proc set {27}
OMP: pid 623679 tid 623823 thread 46 bound to OS proc set {79}
OMP: pid 623679 tid 623818 thread 41 bound to OS proc set {71}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.808696, "speed_tg": 352.574829, "t": 5.808696, "speed": 352.574829}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_9

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_9  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 623901 tid 623901 thread 0 bound to OS proc set {0}
OMP: pid 623901 tid 624000 thread 1 bound to OS proc set {1}
OMP: pid 623901 tid 624014 thread 15 bound to OS proc set {22}
OMP: pid 623901 tid 624012 thread 13 bound to OS proc set {19}
OMP: pid 623901 tid 624059 thread 60 bound to OS proc set {90}
OMP: pid 623901 tid 624034 thread 35 bound to OS proc set {53}
OMP: pid 623901 tid 624010 thread 11 bound to OS proc set {16}
OMP: pid 623901 tid 624050 thread 51 bound to OS proc set {77}
OMP: pid 623901 tid 624013 thread 14 bound to OS proc set {21}
OMP: pid 623901 tid 624061 thread 62 bound to OS proc set {93}
OMP: pid 623901 tid 624006 thread 7 bound to OS proc set {10}
OMP: pid 623901 tid 624002 thread 3 bound to OS proc set {4}
OMP: pid 623901 tid 624001 thread 2 bound to OS proc set {3}
OMP: pid 623901 tid 624009 thread 10 bound to OS proc set {15}
OMP: pid 623901 tid 624062 thread 63 bound to OS proc set {95}
OMP: pid 623901 tid 624047 thread 48 bound to OS proc set {72}
OMP: pid 623901 tid 624043 thread 44 bound to OS proc set {66}
OMP: pid 623901 tid 624011 thread 12 bound to OS proc set {18}
OMP: pid 623901 tid 624058 thread 59 bound to OS proc set {89}
OMP: pid 623901 tid 624046 thread 47 bound to OS proc set {71}
OMP: pid 623901 tid 624005 thread 6 bound to OS proc set {9}
OMP: pid 623901 tid 624008 thread 9 bound to OS proc set {13}
OMP: pid 623901 tid 624003 thread 4 bound to OS proc set {6}
OMP: pid 623901 tid 624004 thread 5 bound to OS proc set {7}
OMP: pid 623901 tid 624030 thread 31 bound to OS proc set {46}
OMP: pid 623901 tid 624028 thread 29 bound to OS proc set {43}
OMP: pid 623901 tid 624018 thread 19 bound to OS proc set {28}
OMP: pid 623901 tid 624060 thread 61 bound to OS proc set {92}
OMP: pid 623901 tid 624019 thread 20 bound to OS proc set {30}
OMP: pid 623901 tid 624007 thread 8 bound to OS proc set {12}
OMP: pid 623901 tid 624021 thread 22 bound to OS proc set {33}
OMP: pid 623901 tid 624032 thread 33 bound to OS proc set {50}
OMP: pid 623901 tid 624015 thread 16 bound to OS proc set {24}
OMP: pid 623901 tid 624031 thread 32 bound to OS proc set {48}
OMP: pid 623901 tid 624017 thread 18 bound to OS proc set {27}
OMP: pid 623901 tid 624049 thread 50 bound to OS proc set {75}
OMP: pid 623901 tid 624033 thread 34 bound to OS proc set {51}
OMP: pid 623901 tid 624022 thread 23 bound to OS proc set {34}
OMP: pid 623901 tid 624027 thread 28 bound to OS proc set {42}
OMP: pid 623901 tid 624029 thread 30 bound to OS proc set {45}
OMP: pid 623901 tid 624055 thread 56 bound to OS proc set {84}
OMP: pid 623901 tid 624025 thread 26 bound to OS proc set {39}
OMP: pid 623901 tid 624039 thread 40 bound to OS proc set {60}
OMP: pid 623901 tid 624057 thread 58 bound to OS proc set {87}
OMP: pid 623901 tid 624048 thread 49 bound to OS proc set {74}
OMP: pid 623901 tid 624037 thread 38 bound to OS proc set {57}
OMP: pid 623901 tid 624041 thread 42 bound to OS proc set {63}
OMP: pid 623901 tid 624035 thread 36 bound to OS proc set {54}
OMP: pid 623901 tid 624040 thread 41 bound to OS proc set {62}
OMP: pid 623901 tid 624023 thread 24 bound to OS proc set {36}
OMP: pid 623901 tid 624026 thread 27 bound to OS proc set {40}
OMP: pid 623901 tid 624036 thread 37 bound to OS proc set {56}
OMP: pid 623901 tid 624042 thread 43 bound to OS proc set {65}
OMP: pid 623901 tid 624053 thread 54 bound to OS proc set {81}
OMP: pid 623901 tid 624016 thread 17 bound to OS proc set {25}
OMP: pid 623901 tid 624054 thread 55 bound to OS proc set {83}
OMP: pid 623901 tid 624038 thread 39 bound to OS proc set {59}
OMP: pid 623901 tid 624051 thread 52 bound to OS proc set {78}
OMP: pid 623901 tid 624045 thread 46 bound to OS proc set {69}
OMP: pid 623901 tid 624024 thread 25 bound to OS proc set {37}
OMP: pid 623901 tid 624056 thread 57 bound to OS proc set {86}
OMP: pid 623901 tid 624044 thread 45 bound to OS proc set {68}
OMP: pid 623901 tid 624052 thread 53 bound to OS proc set {80}
OMP: pid 623901 tid 624020 thread 21 bound to OS proc set {31}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.524943, "speed_tg": 370.682556, "t": 5.524943, "speed": 370.682556}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_10

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_10  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 624082 tid 624082 thread 0 bound to OS proc set {0}
OMP: pid 624082 tid 624182 thread 2 bound to OS proc set {2}
OMP: pid 624082 tid 624181 thread 1 bound to OS proc set {1}
OMP: pid 624082 tid 624231 thread 51 bound to OS proc set {68}
OMP: pid 624082 tid 624191 thread 11 bound to OS proc set {14}
OMP: pid 624082 tid 624215 thread 35 bound to OS proc set {47}
OMP: pid 624082 tid 624247 thread 67 bound to OS proc set {90}
OMP: pid 624082 tid 624208 thread 28 bound to OS proc set {37}
OMP: pid 624082 tid 624244 thread 64 bound to OS proc set {86}
OMP: pid 624082 tid 624183 thread 3 bound to OS proc set {4}
OMP: pid 624082 tid 624248 thread 68 bound to OS proc set {91}
OMP: pid 624082 tid 624251 thread 71 bound to OS proc set {95}
OMP: pid 624082 tid 624246 thread 66 bound to OS proc set {88}
OMP: pid 624082 tid 624245 thread 65 bound to OS proc set {87}
OMP: pid 624082 tid 624230 thread 50 bound to OS proc set {67}
OMP: pid 624082 tid 624224 thread 44 bound to OS proc set {59}
OMP: pid 624082 tid 624227 thread 47 bound to OS proc set {63}
OMP: pid 624082 tid 624243 thread 63 bound to OS proc set {84}
OMP: pid 624082 tid 624188 thread 8 bound to OS proc set {10}
OMP: pid 624082 tid 624219 thread 39 bound to OS proc set {52}
OMP: pid 624082 tid 624239 thread 59 bound to OS proc set {79}
OMP: pid 624082 tid 624232 thread 52 bound to OS proc set {70}
OMP: pid 624082 tid 624184 thread 4 bound to OS proc set {5}
OMP: pid 624082 tid 624185 thread 5 bound to OS proc set {6}
OMP: pid 624082 tid 624186 thread 6 bound to OS proc set {8}
OMP: pid 624082 tid 624189 thread 9 bound to OS proc set {12}
OMP: pid 624082 tid 624194 thread 14 bound to OS proc set {18}
OMP: pid 624082 tid 624187 thread 7 bound to OS proc set {9}
OMP: pid 624082 tid 624207 thread 27 bound to OS proc set {36}
OMP: pid 624082 tid 624212 thread 32 bound to OS proc set {43}
OMP: pid 624082 tid 624206 thread 26 bound to OS proc set {35}
OMP: pid 624082 tid 624192 thread 12 bound to OS proc set {16}
OMP: pid 624082 tid 624249 thread 69 bound to OS proc set {92}
OMP: pid 624082 tid 624225 thread 45 bound to OS proc set {60}
OMP: pid 624082 tid 624209 thread 29 bound to OS proc set {39}
OMP: pid 624082 tid 624197 thread 17 bound to OS proc set {22}
OMP: pid 624082 tid 624226 thread 46 bound to OS proc set {61}
OMP: pid 624082 tid 624195 thread 15 bound to OS proc set {20}
OMP: pid 624082 tid 624203 thread 23 bound to OS proc set {30}
OMP: pid 624082 tid 624211 thread 31 bound to OS proc set {41}
OMP: pid 624082 tid 624202 thread 22 bound to OS proc set {29}
OMP: pid 624082 tid 624204 thread 24 bound to OS proc set {32}
OMP: pid 624082 tid 624190 thread 10 bound to OS proc set {13}
OMP: pid 624082 tid 624229 thread 49 bound to OS proc set {66}
OMP: pid 624082 tid 624250 thread 70 bound to OS proc set {94}
OMP: pid 624082 tid 624235 thread 55 bound to OS proc set {74}
OMP: pid 624082 tid 624193 thread 13 bound to OS proc set {17}
OMP: pid 624082 tid 624214 thread 34 bound to OS proc set {45}
OMP: pid 624082 tid 624198 thread 18 bound to OS proc set {24}
OMP: pid 624082 tid 624205 thread 25 bound to OS proc set {33}
OMP: pid 624082 tid 624200 thread 20 bound to OS proc set {26}
OMP: pid 624082 tid 624216 thread 36 bound to OS proc set {48}
OMP: pid 624082 tid 624210 thread 30 bound to OS proc set {40}
OMP: pid 624082 tid 624199 thread 19 bound to OS proc set {25}
OMP: pid 624082 tid 624201 thread 21 bound to OS proc set {28}
OMP: pid 624082 tid 624242 thread 62 bound to OS proc set {83}
OMP: pid 624082 tid 624218 thread 38 bound to OS proc set {51}
OMP: pid 624082 tid 624217 thread 37 bound to OS proc set {49}
OMP: pid 624082 tid 624228 thread 48 bound to OS proc set {64}
OMP: pid 624082 tid 624234 thread 54 bound to OS proc set {72}
OMP: pid 624082 tid 624220 thread 40 bound to OS proc set {53}
OMP: pid 624082 tid 624237 thread 57 bound to OS proc set {76}
OMP: pid 624082 tid 624238 thread 58 bound to OS proc set {78}
OMP: pid 624082 tid 624223 thread 43 bound to OS proc set {57}
OMP: pid 624082 tid 624240 thread 60 bound to OS proc set {80}
OMP: pid 624082 tid 624196 thread 16 bound to OS proc set {21}
OMP: pid 624082 tid 624222 thread 42 bound to OS proc set {56}
OMP: pid 624082 tid 624241 thread 61 bound to OS proc set {82}
OMP: pid 624082 tid 624221 thread 41 bound to OS proc set {55}
OMP: pid 624082 tid 624233 thread 53 bound to OS proc set {71}
OMP: pid 624082 tid 624236 thread 56 bound to OS proc set {75}
OMP: pid 624082 tid 624213 thread 33 bound to OS proc set {44}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 72, "n_threads_batch": 72, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.356637, "speed_tg": 382.329437, "t": 5.356637, "speed": 382.329437}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_11

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_11      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_11  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_11  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_11  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_11      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_11  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_11  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_11  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 624271 tid 624271 thread 0 bound to OS proc set {0}
OMP: pid 624271 tid 624372 thread 3 bound to OS proc set {3}
OMP: pid 624271 tid 624371 thread 2 bound to OS proc set {2}
OMP: pid 624271 tid 624373 thread 4 bound to OS proc set {4}
OMP: pid 624271 tid 624370 thread 1 bound to OS proc set {1}
OMP: pid 624271 tid 624376 thread 7 bound to OS proc set {8}
OMP: pid 624271 tid 624381 thread 12 bound to OS proc set {14}
OMP: pid 624271 tid 624432 thread 63 bound to OS proc set {76}
OMP: pid 624271 tid 624380 thread 11 bound to OS proc set {13}
OMP: pid 624271 tid 624448 thread 79 bound to OS proc set {95}
OMP: pid 624271 tid 624436 thread 67 bound to OS proc set {81}
OMP: pid 624271 tid 624382 thread 13 bound to OS proc set {15}
OMP: pid 624271 tid 624385 thread 16 bound to OS proc set {19}
OMP: pid 624271 tid 624383 thread 14 bound to OS proc set {16}
OMP: pid 624271 tid 624396 thread 27 bound to OS proc set {32}
OMP: pid 624271 tid 624397 thread 28 bound to OS proc set {33}
OMP: pid 624271 tid 624379 thread 10 bound to OS proc set {12}
OMP: pid 624271 tid 624374 thread 5 bound to OS proc set {6}
OMP: pid 624271 tid 624398 thread 29 bound to OS proc set {35}
OMP: pid 624271 tid 624384 thread 15 bound to OS proc set {18}
OMP: pid 624271 tid 624418 thread 49 bound to OS proc set {59}
OMP: pid 624271 tid 624434 thread 65 bound to OS proc set {78}
OMP: pid 624271 tid 624415 thread 46 bound to OS proc set {55}
OMP: pid 624271 tid 624421 thread 52 bound to OS proc set {63}
OMP: pid 624271 tid 624377 thread 8 bound to OS proc set {9}
OMP: pid 624271 tid 624386 thread 17 bound to OS proc set {20}
OMP: pid 624271 tid 624393 thread 24 bound to OS proc set {29}
OMP: pid 624271 tid 624375 thread 6 bound to OS proc set {7}
OMP: pid 624271 tid 624417 thread 48 bound to OS proc set {58}
OMP: pid 624271 tid 624389 thread 20 bound to OS proc set {24}
OMP: pid 624271 tid 624425 thread 56 bound to OS proc set {67}
OMP: pid 624271 tid 624412 thread 43 bound to OS proc set {52}
OMP: pid 624271 tid 624447 thread 78 bound to OS proc set {94}
OMP: pid 624271 tid 624440 thread 71 bound to OS proc set {86}
OMP: pid 624271 tid 624420 thread 51 bound to OS proc set {61}
OMP: pid 624271 tid 624428 thread 59 bound to OS proc set {71}
OMP: pid 624271 tid 624427 thread 58 bound to OS proc set {70}
OMP: pid 624271 tid 624445 thread 76 bound to OS proc set {92}
OMP: pid 624271 tid 624378 thread 9 bound to OS proc set {10}
OMP: pid 624271 tid 624423 thread 54 bound to OS proc set {65}
OMP: pid 624271 tid 624419 thread 50 bound to OS proc set {60}
OMP: pid 624271 tid 624435 thread 66 bound to OS proc set {80}
OMP: pid 624271 tid 624437 thread 68 bound to OS proc set {82}
OMP: pid 624271 tid 624394 thread 25 bound to OS proc set {30}
OMP: pid 624271 tid 624430 thread 61 bound to OS proc set {73}
OMP: pid 624271 tid 624424 thread 55 bound to OS proc set {66}
OMP: pid 624271 tid 624404 thread 35 bound to OS proc set {42}
OMP: pid 624271 tid 624400 thread 31 bound to OS proc set {37}
OMP: pid 624271 tid 624388 thread 19 bound to OS proc set {23}
OMP: pid 624271 tid 624414 thread 45 bound to OS proc set {54}
OMP: pid 624271 tid 624399 thread 30 bound to OS proc set {36}
OMP: pid 624271 tid 624391 thread 22 bound to OS proc set {26}
OMP: pid 624271 tid 624444 thread 75 bound to OS proc set {90}
OMP: pid 624271 tid 624413 thread 44 bound to OS proc set {53}
OMP: pid 624271 tid 624426 thread 57 bound to OS proc set {69}
OMP: pid 624271 tid 624409 thread 40 bound to OS proc set {48}
OMP: pid 624271 tid 624429 thread 60 bound to OS proc set {72}
OMP: pid 624271 tid 624387 thread 18 bound to OS proc set {21}
OMP: pid 624271 tid 624416 thread 47 bound to OS proc set {56}
OMP: pid 624271 tid 624408 thread 39 bound to OS proc set {47}
OMP: pid 624271 tid 624405 thread 36 bound to OS proc set {43}
OMP: pid 624271 tid 624403 thread 34 bound to OS proc set {41}
OMP: pid 624271 tid 624407 thread 38 bound to OS proc set {46}
OMP: pid 624271 tid 624431 thread 62 bound to OS proc set {75}
OMP: pid 624271 tid 624433 thread 64 bound to OS proc set {77}
OMP: pid 624271 tid 624406 thread 37 bound to OS proc set {44}
OMP: pid 624271 tid 624411 thread 42 bound to OS proc set {50}
OMP: pid 624271 tid 624390 thread 21 bound to OS proc set {25}
OMP: pid 624271 tid 624395 thread 26 bound to OS proc set {31}
OMP: pid 624271 tid 624410 thread 41 bound to OS proc set {49}
OMP: pid 624271 tid 624439 thread 70 bound to OS proc set {84}
OMP: pid 624271 tid 624392 thread 23 bound to OS proc set {27}
OMP: pid 624271 tid 624422 thread 53 bound to OS proc set {64}
OMP: pid 624271 tid 624438 thread 69 bound to OS proc set {83}
OMP: pid 624271 tid 624441 thread 72 bound to OS proc set {87}
OMP: pid 624271 tid 624443 thread 74 bound to OS proc set {89}
OMP: pid 624271 tid 624446 thread 77 bound to OS proc set {93}
OMP: pid 624271 tid 624442 thread 73 bound to OS proc set {88}
OMP: pid 624271 tid 624401 thread 32 bound to OS proc set {38}
OMP: pid 624271 tid 624402 thread 33 bound to OS proc set {40}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 80, "n_threads_batch": 80, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.215218, "speed_tg": 392.696899, "t": 5.215218, "speed": 392.696899}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_12

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_12      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_12  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_12  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_12  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_12      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_12  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_12  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_12  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 624469 tid 624469 thread 0 bound to OS proc set {0}
OMP: pid 624469 tid 624570 thread 3 bound to OS proc set {3}
OMP: pid 624469 tid 624569 thread 2 bound to OS proc set {2}
OMP: pid 624469 tid 624575 thread 8 bound to OS proc set {8}
OMP: pid 624469 tid 624574 thread 7 bound to OS proc set {7}
OMP: pid 624469 tid 624571 thread 4 bound to OS proc set {4}
OMP: pid 624469 tid 624595 thread 28 bound to OS proc set {30}
OMP: pid 624469 tid 624568 thread 1 bound to OS proc set {1}
OMP: pid 624469 tid 624576 thread 9 bound to OS proc set {9}
OMP: pid 624469 tid 624594 thread 27 bound to OS proc set {29}
OMP: pid 624469 tid 624591 thread 24 bound to OS proc set {26}
OMP: pid 624469 tid 624573 thread 6 bound to OS proc set {6}
OMP: pid 624469 tid 624572 thread 5 bound to OS proc set {5}
OMP: pid 624469 tid 624596 thread 29 bound to OS proc set {31}
OMP: pid 624469 tid 624593 thread 26 bound to OS proc set {28}
OMP: pid 624469 tid 624590 thread 23 bound to OS proc set {25}
OMP: pid 624469 tid 624592 thread 25 bound to OS proc set {27}
OMP: pid 624469 tid 624587 thread 20 bound to OS proc set {22}
OMP: pid 624469 tid 624589 thread 22 bound to OS proc set {24}
OMP: pid 624469 tid 624588 thread 21 bound to OS proc set {23}
OMP: pid 624469 tid 624578 thread 11 bound to OS proc set {12}
OMP: pid 624469 tid 624577 thread 10 bound to OS proc set {11}
OMP: pid 624469 tid 624581 thread 14 bound to OS proc set {15}
OMP: pid 624469 tid 624615 thread 48 bound to OS proc set {52}
OMP: pid 624469 tid 624582 thread 15 bound to OS proc set {16}
OMP: pid 624469 tid 624627 thread 60 bound to OS proc set {66}
OMP: pid 624469 tid 624611 thread 44 bound to OS proc set {48}
OMP: pid 624469 tid 624598 thread 31 bound to OS proc set {34}
OMP: pid 624469 tid 624607 thread 40 bound to OS proc set {44}
OMP: pid 624469 tid 624597 thread 30 bound to OS proc set {33}
OMP: pid 624469 tid 624586 thread 19 bound to OS proc set {20}
OMP: pid 624469 tid 624614 thread 47 bound to OS proc set {51}
OMP: pid 624469 tid 624623 thread 56 bound to OS proc set {61}
OMP: pid 624469 tid 624579 thread 12 bound to OS proc set {13}
OMP: pid 624469 tid 624625 thread 58 bound to OS proc set {63}
OMP: pid 624469 tid 624618 thread 51 bound to OS proc set {56}
OMP: pid 624469 tid 624617 thread 50 bound to OS proc set {55}
OMP: pid 624469 tid 624599 thread 32 bound to OS proc set {35}
OMP: pid 624469 tid 624621 thread 54 bound to OS proc set {59}
OMP: pid 624469 tid 624622 thread 55 bound to OS proc set {60}
OMP: pid 624469 tid 624600 thread 33 bound to OS proc set {36}
OMP: pid 624469 tid 624610 thread 43 bound to OS proc set {47}
OMP: pid 624469 tid 624613 thread 46 bound to OS proc set {50}
OMP: pid 624469 tid 624580 thread 13 bound to OS proc set {14}
OMP: pid 624469 tid 624584 thread 17 bound to OS proc set {18}
OMP: pid 624469 tid 624585 thread 18 bound to OS proc set {19}
OMP: pid 624469 tid 624609 thread 42 bound to OS proc set {46}
OMP: pid 624469 tid 624608 thread 41 bound to OS proc set {45}
OMP: pid 624469 tid 624619 thread 52 bound to OS proc set {57}
OMP: pid 624469 tid 624583 thread 16 bound to OS proc set {17}
OMP: pid 624469 tid 624612 thread 45 bound to OS proc set {49}
OMP: pid 624469 tid 624606 thread 39 bound to OS proc set {42}
OMP: pid 624469 tid 624629 thread 62 bound to OS proc set {68}
OMP: pid 624469 tid 624601 thread 34 bound to OS proc set {37}
OMP: pid 624469 tid 624643 thread 76 bound to OS proc set {83}
OMP: pid 624469 tid 624650 thread 83 bound to OS proc set {91}
OMP: pid 624469 tid 624642 thread 75 bound to OS proc set {82}
OMP: pid 624469 tid 624616 thread 49 bound to OS proc set {54}
OMP: pid 624469 tid 624605 thread 38 bound to OS proc set {41}
OMP: pid 624469 tid 624603 thread 36 bound to OS proc set {39}
OMP: pid 624469 tid 624602 thread 35 bound to OS proc set {38}
OMP: pid 624469 tid 624628 thread 61 bound to OS proc set {67}
OMP: pid 624469 tid 624626 thread 59 bound to OS proc set {65}
OMP: pid 624469 tid 624620 thread 53 bound to OS proc set {58}
OMP: pid 624469 tid 624604 thread 37 bound to OS proc set {40}
OMP: pid 624469 tid 624633 thread 66 bound to OS proc set {72}
OMP: pid 624469 tid 624647 thread 80 bound to OS proc set {88}
OMP: pid 624469 tid 624638 thread 71 bound to OS proc set {78}
OMP: pid 624469 tid 624641 thread 74 bound to OS proc set {81}
OMP: pid 624469 tid 624624 thread 57 bound to OS proc set {62}
OMP: pid 624469 tid 624630 thread 63 bound to OS proc set {69}
OMP: pid 624469 tid 624634 thread 67 bound to OS proc set {73}
OMP: pid 624469 tid 624644 thread 77 bound to OS proc set {84}
OMP: pid 624469 tid 624646 thread 79 bound to OS proc set {87}
OMP: pid 624469 tid 624640 thread 73 bound to OS proc set {80}
OMP: pid 624469 tid 624649 thread 82 bound to OS proc set {90}
OMP: pid 624469 tid 624631 thread 64 bound to OS proc set {70}
OMP: pid 624469 tid 624635 thread 68 bound to OS proc set {74}
OMP: pid 624469 tid 624637 thread 70 bound to OS proc set {77}
OMP: pid 624469 tid 624645 thread 78 bound to OS proc set {85}
OMP: pid 624469 tid 624639 thread 72 bound to OS proc set {79}
OMP: pid 624469 tid 624654 thread 87 bound to OS proc set {95}
OMP: pid 624469 tid 624653 thread 86 bound to OS proc set {94}
OMP: pid 624469 tid 624651 thread 84 bound to OS proc set {92}
OMP: pid 624469 tid 624652 thread 85 bound to OS proc set {93}
OMP: pid 624469 tid 624632 thread 65 bound to OS proc set {71}
OMP: pid 624469 tid 624636 thread 69 bound to OS proc set {76}
OMP: pid 624469 tid 624648 thread 81 bound to OS proc set {89}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 88, "n_threads_batch": 88, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.232604, "speed_tg": 391.392120, "t": 5.232604, "speed": 391.392120}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_13

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_13      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_13  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_13  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_13  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_13      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_13  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_13  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_13  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 624724 tid 624724 thread 0 bound to OS proc set {0}
OMP: pid 624724 tid 624826 thread 3 bound to OS proc set {3}
OMP: pid 624724 tid 624838 thread 15 bound to OS proc set {15}
OMP: pid 624724 tid 624835 thread 12 bound to OS proc set {12}
OMP: pid 624724 tid 624825 thread 2 bound to OS proc set {2}
OMP: pid 624724 tid 624831 thread 8 bound to OS proc set {8}
OMP: pid 624724 tid 624837 thread 14 bound to OS proc set {14}
OMP: pid 624724 tid 624824 thread 1 bound to OS proc set {1}
OMP: pid 624724 tid 624854 thread 31 bound to OS proc set {31}
OMP: pid 624724 tid 624833 thread 10 bound to OS proc set {10}
OMP: pid 624724 tid 624834 thread 11 bound to OS proc set {11}
OMP: pid 624724 tid 624851 thread 28 bound to OS proc set {28}
OMP: pid 624724 tid 624842 thread 19 bound to OS proc set {19}
OMP: pid 624724 tid 624839 thread 16 bound to OS proc set {16}
OMP: pid 624724 tid 624836 thread 13 bound to OS proc set {13}
OMP: pid 624724 tid 624827 thread 4 bound to OS proc set {4}
OMP: pid 624724 tid 624870 thread 47 bound to OS proc set {47}
OMP: pid 624724 tid 624853 thread 30 bound to OS proc set {30}
OMP: pid 624724 tid 624858 thread 35 bound to OS proc set {35}
OMP: pid 624724 tid 624855 thread 32 bound to OS proc set {32}
OMP: pid 624724 tid 624832 thread 9 bound to OS proc set {9}
OMP: pid 624724 tid 624850 thread 27 bound to OS proc set {27}
OMP: pid 624724 tid 624883 thread 60 bound to OS proc set {60}
OMP: pid 624724 tid 624874 thread 51 bound to OS proc set {51}
OMP: pid 624724 tid 624841 thread 18 bound to OS proc set {18}
OMP: pid 624724 tid 624849 thread 26 bound to OS proc set {26}
OMP: pid 624724 tid 624847 thread 24 bound to OS proc set {24}
OMP: pid 624724 tid 624830 thread 7 bound to OS proc set {7}
OMP: pid 624724 tid 624867 thread 44 bound to OS proc set {44}
OMP: pid 624724 tid 624852 thread 29 bound to OS proc set {29}
OMP: pid 624724 tid 624866 thread 43 bound to OS proc set {43}
OMP: pid 624724 tid 624871 thread 48 bound to OS proc set {48}
OMP: pid 624724 tid 624869 thread 46 bound to OS proc set {46}
OMP: pid 624724 tid 624829 thread 6 bound to OS proc set {6}
OMP: pid 624724 tid 624873 thread 50 bound to OS proc set {50}
OMP: pid 624724 tid 624840 thread 17 bound to OS proc set {17}
OMP: pid 624724 tid 624828 thread 5 bound to OS proc set {5}
OMP: pid 624724 tid 624882 thread 59 bound to OS proc set {59}
OMP: pid 624724 tid 624846 thread 23 bound to OS proc set {23}
OMP: pid 624724 tid 624863 thread 40 bound to OS proc set {40}
OMP: pid 624724 tid 624865 thread 42 bound to OS proc set {42}
OMP: pid 624724 tid 624857 thread 34 bound to OS proc set {34}
OMP: pid 624724 tid 624862 thread 39 bound to OS proc set {39}
OMP: pid 624724 tid 624843 thread 20 bound to OS proc set {20}
OMP: pid 624724 tid 624879 thread 56 bound to OS proc set {56}
OMP: pid 624724 tid 624868 thread 45 bound to OS proc set {45}
OMP: pid 624724 tid 624845 thread 22 bound to OS proc set {22}
OMP: pid 624724 tid 624878 thread 55 bound to OS proc set {55}
OMP: pid 624724 tid 624856 thread 33 bound to OS proc set {33}
OMP: pid 624724 tid 624887 thread 64 bound to OS proc set {64}
OMP: pid 624724 tid 624859 thread 36 bound to OS proc set {36}
OMP: pid 624724 tid 624881 thread 58 bound to OS proc set {58}
OMP: pid 624724 tid 624848 thread 25 bound to OS proc set {25}
OMP: pid 624724 tid 624872 thread 49 bound to OS proc set {49}
OMP: pid 624724 tid 624861 thread 38 bound to OS proc set {38}
OMP: pid 624724 tid 624875 thread 52 bound to OS proc set {52}
OMP: pid 624724 tid 624864 thread 41 bound to OS proc set {41}
OMP: pid 624724 tid 624886 thread 63 bound to OS proc set {63}
OMP: pid 624724 tid 624877 thread 54 bound to OS proc set {54}
OMP: pid 624724 tid 624860 thread 37 bound to OS proc set {37}
OMP: pid 624724 tid 624899 thread 76 bound to OS proc set {76}
OMP: pid 624724 tid 624880 thread 57 bound to OS proc set {57}
OMP: pid 624724 tid 624902 thread 79 bound to OS proc set {79}
OMP: pid 624724 tid 624890 thread 67 bound to OS proc set {67}
OMP: pid 624724 tid 624844 thread 21 bound to OS proc set {21}
OMP: pid 624724 tid 624901 thread 78 bound to OS proc set {78}
OMP: pid 624724 tid 624889 thread 66 bound to OS proc set {66}
OMP: pid 624724 tid 624900 thread 77 bound to OS proc set {77}
OMP: pid 624724 tid 624885 thread 62 bound to OS proc set {62}
OMP: pid 624724 tid 624895 thread 72 bound to OS proc set {72}
OMP: pid 624724 tid 624894 thread 71 bound to OS proc set {71}
OMP: pid 624724 tid 624876 thread 53 bound to OS proc set {53}
OMP: pid 624724 tid 624898 thread 75 bound to OS proc set {75}
OMP: pid 624724 tid 624891 thread 68 bound to OS proc set {68}
OMP: pid 624724 tid 624897 thread 74 bound to OS proc set {74}
OMP: pid 624724 tid 624896 thread 73 bound to OS proc set {73}
OMP: pid 624724 tid 624918 thread 95 bound to OS proc set {95}
OMP: pid 624724 tid 624917 thread 94 bound to OS proc set {94}
OMP: pid 624724 tid 624893 thread 70 bound to OS proc set {70}
OMP: pid 624724 tid 624892 thread 69 bound to OS proc set {69}
OMP: pid 624724 tid 624906 thread 83 bound to OS proc set {83}
OMP: pid 624724 tid 624915 thread 92 bound to OS proc set {92}
OMP: pid 624724 tid 624911 thread 88 bound to OS proc set {88}
OMP: pid 624724 tid 624914 thread 91 bound to OS proc set {91}
OMP: pid 624724 tid 624913 thread 90 bound to OS proc set {90}
OMP: pid 624724 tid 624905 thread 82 bound to OS proc set {82}
OMP: pid 624724 tid 624904 thread 81 bound to OS proc set {81}
OMP: pid 624724 tid 624907 thread 84 bound to OS proc set {84}
OMP: pid 624724 tid 624912 thread 89 bound to OS proc set {89}
OMP: pid 624724 tid 624916 thread 93 bound to OS proc set {93}
OMP: pid 624724 tid 624910 thread 87 bound to OS proc set {87}
OMP: pid 624724 tid 624903 thread 80 bound to OS proc set {80}
OMP: pid 624724 tid 624884 thread 61 bound to OS proc set {61}
OMP: pid 624724 tid 624908 thread 85 bound to OS proc set {85}
OMP: pid 624724 tid 624909 thread 86 bound to OS proc set {86}
OMP: pid 624724 tid 624888 thread 65 bound to OS proc set {65}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 96, "n_threads_batch": 96, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.247973, "speed_tg": 390.245911, "t": 5.247973, "speed": 390.245911}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_14

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_14      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_14  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_14  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_14  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_14      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_14  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_14  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-8865/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_11-40-31/tools/lprof_npsu_run_14  #
#########################################################################################################################################################################################################################################
