* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 881508 tid 881508 thread 0 bound to OS proc set {0}
OMP: pid 881508 tid 881607 thread 1 bound to OS proc set {24}
OMP: pid 881508 tid 881608 thread 2 bound to OS proc set {48}
OMP: pid 881508 tid 881609 thread 3 bound to OS proc set {72}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 7.799399, "speed_pp": 65.646088, "t_tg": 0.000000, "speed_tg": nan, "t": 7.799399, "speed": 65.646088}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_2
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_2 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 881629 tid 881629 thread 0 bound to OS proc set {0}
OMP: pid 881629 tid 881729 thread 2 bound to OS proc set {24}
OMP: pid 881629 tid 881730 thread 3 bound to OS proc set {36}
OMP: pid 881629 tid 881731 thread 4 bound to OS proc set {48}
OMP: pid 881629 tid 881728 thread 1 bound to OS proc set {12}
OMP: pid 881629 tid 881733 thread 6 bound to OS proc set {72}
OMP: pid 881629 tid 881732 thread 5 bound to OS proc set {60}
OMP: pid 881629 tid 881734 thread 7 bound to OS proc set {84}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 3.913739, "speed_pp": 130.821198, "t_tg": 0.000000, "speed_tg": nan, "t": 3.913739, "speed": 130.821198}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_3
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_3 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 881803 tid 881803 thread 0 bound to OS proc set {0}
OMP: pid 881803 tid 881903 thread 2 bound to OS proc set {12}
OMP: pid 881803 tid 881913 thread 12 bound to OS proc set {72}
OMP: pid 881803 tid 881904 thread 3 bound to OS proc set {18}
OMP: pid 881803 tid 881915 thread 14 bound to OS proc set {84}
OMP: pid 881803 tid 881914 thread 13 bound to OS proc set {78}
OMP: pid 881803 tid 881905 thread 4 bound to OS proc set {24}
OMP: pid 881803 tid 881902 thread 1 bound to OS proc set {6}
OMP: pid 881803 tid 881909 thread 8 bound to OS proc set {48}
OMP: pid 881803 tid 881908 thread 7 bound to OS proc set {42}
OMP: pid 881803 tid 881912 thread 11 bound to OS proc set {66}
OMP: pid 881803 tid 881911 thread 10 bound to OS proc set {60}
OMP: pid 881803 tid 881906 thread 5 bound to OS proc set {30}
OMP: pid 881803 tid 881910 thread 9 bound to OS proc set {54}
OMP: pid 881803 tid 881907 thread 6 bound to OS proc set {36}
OMP: pid 881803 tid 881916 thread 15 bound to OS proc set {90}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 1.979277, "speed_pp": 258.680328, "t_tg": 0.000000, "speed_tg": nan, "t": 1.979277, "speed": 258.680328}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_4
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_4 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 881936 tid 881936 thread 0 bound to OS proc set {0}
OMP: pid 881936 tid 882037 thread 3 bound to OS proc set {12}
OMP: pid 881936 tid 882036 thread 2 bound to OS proc set {8}
OMP: pid 881936 tid 882035 thread 1 bound to OS proc set {4}
OMP: pid 881936 tid 882049 thread 15 bound to OS proc set {60}
OMP: pid 881936 tid 882042 thread 8 bound to OS proc set {32}
OMP: pid 881936 tid 882046 thread 12 bound to OS proc set {48}
OMP: pid 881936 tid 882038 thread 4 bound to OS proc set {16}
OMP: pid 881936 tid 882039 thread 5 bound to OS proc set {20}
OMP: pid 881936 tid 882048 thread 14 bound to OS proc set {56}
OMP: pid 881936 tid 882050 thread 16 bound to OS proc set {64}
OMP: pid 881936 tid 882045 thread 11 bound to OS proc set {44}
OMP: pid 881936 tid 882047 thread 13 bound to OS proc set {52}
OMP: pid 881936 tid 882053 thread 19 bound to OS proc set {76}
OMP: pid 881936 tid 882041 thread 7 bound to OS proc set {28}
OMP: pid 881936 tid 882040 thread 6 bound to OS proc set {24}
OMP: pid 881936 tid 882043 thread 9 bound to OS proc set {36}
OMP: pid 881936 tid 882052 thread 18 bound to OS proc set {72}
OMP: pid 881936 tid 882044 thread 10 bound to OS proc set {40}
OMP: pid 881936 tid 882054 thread 20 bound to OS proc set {80}
OMP: pid 881936 tid 882051 thread 17 bound to OS proc set {68}
OMP: pid 881936 tid 882055 thread 21 bound to OS proc set {84}
OMP: pid 881936 tid 882056 thread 22 bound to OS proc set {88}
OMP: pid 881936 tid 882057 thread 23 bound to OS proc set {92}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 1.453403, "speed_pp": 352.276703, "t_tg": 0.000000, "speed_tg": nan, "t": 1.453403, "speed": 352.276703}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_5
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_5 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 882077 tid 882077 thread 0 bound to OS proc set {0}
OMP: pid 882077 tid 882182 thread 7 bound to OS proc set {21}
OMP: pid 882077 tid 882179 thread 4 bound to OS proc set {12}
OMP: pid 882077 tid 882190 thread 15 bound to OS proc set {45}
OMP: pid 882077 tid 882181 thread 6 bound to OS proc set {18}
OMP: pid 882077 tid 882177 thread 2 bound to OS proc set {6}
OMP: pid 882077 tid 882189 thread 14 bound to OS proc set {42}
OMP: pid 882077 tid 882187 thread 12 bound to OS proc set {36}
OMP: pid 882077 tid 882185 thread 10 bound to OS proc set {30}
OMP: pid 882077 tid 882203 thread 28 bound to OS proc set {84}
OMP: pid 882077 tid 882188 thread 13 bound to OS proc set {39}
OMP: pid 882077 tid 882176 thread 1 bound to OS proc set {3}
OMP: pid 882077 tid 882186 thread 11 bound to OS proc set {33}
OMP: pid 882077 tid 882191 thread 16 bound to OS proc set {48}
OMP: pid 882077 tid 882180 thread 5 bound to OS proc set {15}
OMP: pid 882077 tid 882205 thread 30 bound to OS proc set {90}
OMP: pid 882077 tid 882178 thread 3 bound to OS proc set {9}
OMP: pid 882077 tid 882184 thread 9 bound to OS proc set {27}
OMP: pid 882077 tid 882194 thread 19 bound to OS proc set {57}
OMP: pid 882077 tid 882193 thread 18 bound to OS proc set {54}
OMP: pid 882077 tid 882183 thread 8 bound to OS proc set {24}
OMP: pid 882077 tid 882206 thread 31 bound to OS proc set {93}
OMP: pid 882077 tid 882202 thread 27 bound to OS proc set {81}
OMP: pid 882077 tid 882199 thread 24 bound to OS proc set {72}
OMP: pid 882077 tid 882198 thread 23 bound to OS proc set {69}
OMP: pid 882077 tid 882192 thread 17 bound to OS proc set {51}
OMP: pid 882077 tid 882201 thread 26 bound to OS proc set {78}
OMP: pid 882077 tid 882197 thread 22 bound to OS proc set {66}
OMP: pid 882077 tid 882195 thread 20 bound to OS proc set {60}
OMP: pid 882077 tid 882204 thread 29 bound to OS proc set {87}
OMP: pid 882077 tid 882200 thread 25 bound to OS proc set {75}
OMP: pid 882077 tid 882196 thread 21 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 1.159339, "speed_pp": 441.630981, "t_tg": 0.000000, "speed_tg": nan, "t": 1.159339, "speed": 441.630981}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_6
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_6 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 882226 tid 882226 thread 0 bound to OS proc set {0}
OMP: pid 882226 tid 882338 thread 14 bound to OS proc set {33}
OMP: pid 882226 tid 882339 thread 15 bound to OS proc set {36}
OMP: pid 882226 tid 882337 thread 13 bound to OS proc set {31}
OMP: pid 882226 tid 882335 thread 11 bound to OS proc set {26}
OMP: pid 882226 tid 882331 thread 7 bound to OS proc set {16}
OMP: pid 882226 tid 882327 thread 3 bound to OS proc set {7}
OMP: pid 882226 tid 882325 thread 1 bound to OS proc set {2}
OMP: pid 882226 tid 882356 thread 32 bound to OS proc set {77}
OMP: pid 882226 tid 882359 thread 35 bound to OS proc set {84}
OMP: pid 882226 tid 882332 thread 8 bound to OS proc set {19}
OMP: pid 882226 tid 882330 thread 6 bound to OS proc set {14}
OMP: pid 882226 tid 882326 thread 2 bound to OS proc set {4}
OMP: pid 882226 tid 882355 thread 31 bound to OS proc set {75}
OMP: pid 882226 tid 882329 thread 5 bound to OS proc set {12}
OMP: pid 882226 tid 882358 thread 34 bound to OS proc set {82}
OMP: pid 882226 tid 882362 thread 38 bound to OS proc set {92}
OMP: pid 882226 tid 882336 thread 12 bound to OS proc set {29}
OMP: pid 882226 tid 882352 thread 28 bound to OS proc set {67}
OMP: pid 882226 tid 882357 thread 33 bound to OS proc set {80}
OMP: pid 882226 tid 882354 thread 30 bound to OS proc set {72}
OMP: pid 882226 tid 882342 thread 18 bound to OS proc set {43}
OMP: pid 882226 tid 882360 thread 36 bound to OS proc set {87}
OMP: pid 882226 tid 882343 thread 19 bound to OS proc set {46}
OMP: pid 882226 tid 882353 thread 29 bound to OS proc set {70}
OMP: pid 882226 tid 882340 thread 16 bound to OS proc set {38}
OMP: pid 882226 tid 882363 thread 39 bound to OS proc set {94}
OMP: pid 882226 tid 882351 thread 27 bound to OS proc set {65}
OMP: pid 882226 tid 882341 thread 17 bound to OS proc set {41}
OMP: pid 882226 tid 882328 thread 4 bound to OS proc set {9}
OMP: pid 882226 tid 882334 thread 10 bound to OS proc set {24}
OMP: pid 882226 tid 882347 thread 23 bound to OS proc set {55}
OMP: pid 882226 tid 882333 thread 9 bound to OS proc set {21}
OMP: pid 882226 tid 882348 thread 24 bound to OS proc set {58}
OMP: pid 882226 tid 882344 thread 20 bound to OS proc set {48}
OMP: pid 882226 tid 882349 thread 25 bound to OS proc set {60}
OMP: pid 882226 tid 882346 thread 22 bound to OS proc set {53}
OMP: pid 882226 tid 882350 thread 26 bound to OS proc set {63}
OMP: pid 882226 tid 882361 thread 37 bound to OS proc set {89}
OMP: pid 882226 tid 882345 thread 21 bound to OS proc set {50}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 0.973571, "speed_pp": 525.898987, "t_tg": 0.000000, "speed_tg": nan, "t": 0.973571, "speed": 525.898987}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_7
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_7 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 882383 tid 882383 thread 0 bound to OS proc set {0}
OMP: pid 882383 tid 882490 thread 9 bound to OS proc set {18}
OMP: pid 882383 tid 882489 thread 8 bound to OS proc set {16}
OMP: pid 882383 tid 882499 thread 18 bound to OS proc set {36}
OMP: pid 882383 tid 882482 thread 1 bound to OS proc set {2}
OMP: pid 882383 tid 882484 thread 3 bound to OS proc set {6}
OMP: pid 882383 tid 882487 thread 6 bound to OS proc set {12}
OMP: pid 882383 tid 882483 thread 2 bound to OS proc set {4}
OMP: pid 882383 tid 882516 thread 35 bound to OS proc set {70}
OMP: pid 882383 tid 882525 thread 44 bound to OS proc set {88}
OMP: pid 882383 tid 882497 thread 16 bound to OS proc set {32}
OMP: pid 882383 tid 882513 thread 32 bound to OS proc set {64}
OMP: pid 882383 tid 882488 thread 7 bound to OS proc set {14}
OMP: pid 882383 tid 882524 thread 43 bound to OS proc set {86}
OMP: pid 882383 tid 882495 thread 14 bound to OS proc set {28}
OMP: pid 882383 tid 882504 thread 23 bound to OS proc set {46}
OMP: pid 882383 tid 882496 thread 15 bound to OS proc set {30}
OMP: pid 882383 tid 882486 thread 5 bound to OS proc set {10}
OMP: pid 882383 tid 882485 thread 4 bound to OS proc set {8}
OMP: pid 882383 tid 882505 thread 24 bound to OS proc set {48}
OMP: pid 882383 tid 882501 thread 20 bound to OS proc set {40}
OMP: pid 882383 tid 882512 thread 31 bound to OS proc set {62}
OMP: pid 882383 tid 882521 thread 40 bound to OS proc set {80}
OMP: pid 882383 tid 882503 thread 22 bound to OS proc set {44}
OMP: pid 882383 tid 882493 thread 12 bound to OS proc set {24}
OMP: pid 882383 tid 882491 thread 10 bound to OS proc set {20}
OMP: pid 882383 tid 882492 thread 11 bound to OS proc set {22}
OMP: pid 882383 tid 882527 thread 46 bound to OS proc set {92}
OMP: pid 882383 tid 882515 thread 34 bound to OS proc set {68}
OMP: pid 882383 tid 882502 thread 21 bound to OS proc set {42}
OMP: pid 882383 tid 882511 thread 30 bound to OS proc set {60}
OMP: pid 882383 tid 882526 thread 45 bound to OS proc set {90}
OMP: pid 882383 tid 882509 thread 28 bound to OS proc set {56}
OMP: pid 882383 tid 882510 thread 29 bound to OS proc set {58}
OMP: pid 882383 tid 882517 thread 36 bound to OS proc set {72}
OMP: pid 882383 tid 882508 thread 27 bound to OS proc set {54}
OMP: pid 882383 tid 882500 thread 19 bound to OS proc set {38}
OMP: pid 882383 tid 882519 thread 38 bound to OS proc set {76}
OMP: pid 882383 tid 882498 thread 17 bound to OS proc set {34}
OMP: pid 882383 tid 882523 thread 42 bound to OS proc set {84}
OMP: pid 882383 tid 882528 thread 47 bound to OS proc set {94}
OMP: pid 882383 tid 882514 thread 33 bound to OS proc set {66}
OMP: pid 882383 tid 882506 thread 25 bound to OS proc set {50}
OMP: pid 882383 tid 882518 thread 37 bound to OS proc set {74}
OMP: pid 882383 tid 882507 thread 26 bound to OS proc set {52}
OMP: pid 882383 tid 882522 thread 41 bound to OS proc set {82}
OMP: pid 882383 tid 882520 thread 39 bound to OS proc set {78}
OMP: pid 882383 tid 882494 thread 13 bound to OS proc set {26}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 0.849662, "speed_pp": 602.592529, "t_tg": 0.000000, "speed_tg": nan, "t": 0.849662, "speed": 602.592529}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_8
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_8 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_8 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_8 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_8 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_8 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_8 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_8 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_8 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 882596 tid 882596 thread 0 bound to OS proc set {0}
OMP: pid 882596 tid 882695 thread 1 bound to OS proc set {1}
OMP: pid 882596 tid 882709 thread 15 bound to OS proc set {25}
OMP: pid 882596 tid 882702 thread 8 bound to OS proc set {13}
OMP: pid 882596 tid 882725 thread 31 bound to OS proc set {53}
OMP: pid 882596 tid 882726 thread 32 bound to OS proc set {55}
OMP: pid 882596 tid 882705 thread 11 bound to OS proc set {19}
OMP: pid 882596 tid 882749 thread 55 bound to OS proc set {95}
OMP: pid 882596 tid 882742 thread 48 bound to OS proc set {83}
OMP: pid 882596 tid 882745 thread 51 bound to OS proc set {88}
OMP: pid 882596 tid 882701 thread 7 bound to OS proc set {12}
OMP: pid 882596 tid 882744 thread 50 bound to OS proc set {86}
OMP: pid 882596 tid 882746 thread 52 bound to OS proc set {90}
OMP: pid 882596 tid 882696 thread 2 bound to OS proc set {3}
OMP: pid 882596 tid 882722 thread 28 bound to OS proc set {48}
OMP: pid 882596 tid 882724 thread 30 bound to OS proc set {51}
OMP: pid 882596 tid 882703 thread 9 bound to OS proc set {15}
OMP: pid 882596 tid 882741 thread 47 bound to OS proc set {81}
OMP: pid 882596 tid 882748 thread 54 bound to OS proc set {93}
OMP: pid 882596 tid 882743 thread 49 bound to OS proc set {84}
OMP: pid 882596 tid 882723 thread 29 bound to OS proc set {50}
OMP: pid 882596 tid 882698 thread 4 bound to OS proc set {6}
OMP: pid 882596 tid 882712 thread 18 bound to OS proc set {31}
OMP: pid 882596 tid 882711 thread 17 bound to OS proc set {29}
OMP: pid 882596 tid 882704 thread 10 bound to OS proc set {17}
OMP: pid 882596 tid 882729 thread 35 bound to OS proc set {60}
OMP: pid 882596 tid 882700 thread 6 bound to OS proc set {10}
OMP: pid 882596 tid 882697 thread 3 bound to OS proc set {5}
OMP: pid 882596 tid 882706 thread 12 bound to OS proc set {20}
OMP: pid 882596 tid 882699 thread 5 bound to OS proc set {8}
OMP: pid 882596 tid 882720 thread 26 bound to OS proc set {45}
OMP: pid 882596 tid 882708 thread 14 bound to OS proc set {24}
OMP: pid 882596 tid 882718 thread 24 bound to OS proc set {41}
OMP: pid 882596 tid 882721 thread 27 bound to OS proc set {46}
OMP: pid 882596 tid 882738 thread 44 bound to OS proc set {76}
OMP: pid 882596 tid 882714 thread 20 bound to OS proc set {34}
OMP: pid 882596 tid 882710 thread 16 bound to OS proc set {27}
OMP: pid 882596 tid 882734 thread 40 bound to OS proc set {69}
OMP: pid 882596 tid 882707 thread 13 bound to OS proc set {22}
OMP: pid 882596 tid 882728 thread 34 bound to OS proc set {58}
OMP: pid 882596 tid 882715 thread 21 bound to OS proc set {36}
OMP: pid 882596 tid 882740 thread 46 bound to OS proc set {79}
OMP: pid 882596 tid 882716 thread 22 bound to OS proc set {38}
OMP: pid 882596 tid 882730 thread 36 bound to OS proc set {62}
OMP: pid 882596 tid 882737 thread 43 bound to OS proc set {74}
OMP: pid 882596 tid 882717 thread 23 bound to OS proc set {39}
OMP: pid 882596 tid 882736 thread 42 bound to OS proc set {72}
OMP: pid 882596 tid 882733 thread 39 bound to OS proc set {67}
OMP: pid 882596 tid 882727 thread 33 bound to OS proc set {57}
OMP: pid 882596 tid 882739 thread 45 bound to OS proc set {77}
OMP: pid 882596 tid 882732 thread 38 bound to OS proc set {65}
OMP: pid 882596 tid 882731 thread 37 bound to OS proc set {64}
OMP: pid 882596 tid 882747 thread 53 bound to OS proc set {91}
OMP: pid 882596 tid 882719 thread 25 bound to OS proc set {43}
OMP: pid 882596 tid 882735 thread 41 bound to OS proc set {71}
OMP: pid 882596 tid 882713 thread 19 bound to OS proc set {32}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 0.755635, "speed_pp": 677.575806, "t_tg": 0.000000, "speed_tg": nan, "t": 0.755635, "speed": 677.575806}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_9
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_9 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_9 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_9 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_9 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_9 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_9 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_9 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_9 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 882769 tid 882769 thread 0 bound to OS proc set {0}
OMP: pid 882769 tid 882868 thread 1 bound to OS proc set {1}
OMP: pid 882769 tid 882869 thread 2 bound to OS proc set {3}
OMP: pid 882769 tid 882882 thread 15 bound to OS proc set {22}
OMP: pid 882769 tid 882875 thread 8 bound to OS proc set {12}
OMP: pid 882769 tid 882881 thread 14 bound to OS proc set {21}
OMP: pid 882769 tid 882930 thread 63 bound to OS proc set {95}
OMP: pid 882769 tid 882870 thread 3 bound to OS proc set {4}
OMP: pid 882769 tid 882927 thread 60 bound to OS proc set {90}
OMP: pid 882769 tid 882878 thread 11 bound to OS proc set {16}
OMP: pid 882769 tid 882929 thread 62 bound to OS proc set {93}
OMP: pid 882769 tid 882918 thread 51 bound to OS proc set {77}
OMP: pid 882769 tid 882880 thread 13 bound to OS proc set {19}
OMP: pid 882769 tid 882871 thread 4 bound to OS proc set {6}
OMP: pid 882769 tid 882879 thread 12 bound to OS proc set {18}
OMP: pid 882769 tid 882899 thread 32 bound to OS proc set {48}
OMP: pid 882769 tid 882915 thread 48 bound to OS proc set {72}
OMP: pid 882769 tid 882885 thread 18 bound to OS proc set {27}
OMP: pid 882769 tid 882874 thread 7 bound to OS proc set {10}
OMP: pid 882769 tid 882898 thread 31 bound to OS proc set {46}
OMP: pid 882769 tid 882886 thread 19 bound to OS proc set {28}
OMP: pid 882769 tid 882877 thread 10 bound to OS proc set {15}
OMP: pid 882769 tid 882901 thread 34 bound to OS proc set {51}
OMP: pid 882769 tid 882897 thread 30 bound to OS proc set {45}
OMP: pid 882769 tid 882883 thread 16 bound to OS proc set {24}
OMP: pid 882769 tid 882902 thread 35 bound to OS proc set {53}
OMP: pid 882769 tid 882873 thread 6 bound to OS proc set {9}
OMP: pid 882769 tid 882895 thread 28 bound to OS proc set {42}
OMP: pid 882769 tid 882876 thread 9 bound to OS proc set {13}
OMP: pid 882769 tid 882914 thread 47 bound to OS proc set {71}
OMP: pid 882769 tid 882891 thread 24 bound to OS proc set {36}
OMP: pid 882769 tid 882910 thread 43 bound to OS proc set {65}
OMP: pid 882769 tid 882887 thread 20 bound to OS proc set {30}
OMP: pid 882769 tid 882917 thread 50 bound to OS proc set {75}
OMP: pid 882769 tid 882926 thread 59 bound to OS proc set {89}
OMP: pid 882769 tid 882909 thread 42 bound to OS proc set {63}
OMP: pid 882769 tid 882907 thread 40 bound to OS proc set {60}
OMP: pid 882769 tid 882896 thread 29 bound to OS proc set {43}
OMP: pid 882769 tid 882903 thread 36 bound to OS proc set {54}
OMP: pid 882769 tid 882894 thread 27 bound to OS proc set {40}
OMP: pid 882769 tid 882893 thread 26 bound to OS proc set {39}
OMP: pid 882769 tid 882928 thread 61 bound to OS proc set {92}
OMP: pid 882769 tid 882872 thread 5 bound to OS proc set {7}
OMP: pid 882769 tid 882905 thread 38 bound to OS proc set {57}
OMP: pid 882769 tid 882904 thread 37 bound to OS proc set {56}
OMP: pid 882769 tid 882906 thread 39 bound to OS proc set {59}
OMP: pid 882769 tid 882892 thread 25 bound to OS proc set {37}
OMP: pid 882769 tid 882925 thread 58 bound to OS proc set {87}
OMP: pid 882769 tid 882923 thread 56 bound to OS proc set {84}
OMP: pid 882769 tid 882913 thread 46 bound to OS proc set {69}
OMP: pid 882769 tid 882916 thread 49 bound to OS proc set {74}
OMP: pid 882769 tid 882922 thread 55 bound to OS proc set {83}
OMP: pid 882769 tid 882884 thread 17 bound to OS proc set {25}
OMP: pid 882769 tid 882911 thread 44 bound to OS proc set {66}
OMP: pid 882769 tid 882888 thread 21 bound to OS proc set {31}
OMP: pid 882769 tid 882912 thread 45 bound to OS proc set {68}
OMP: pid 882769 tid 882921 thread 54 bound to OS proc set {81}
OMP: pid 882769 tid 882919 thread 52 bound to OS proc set {78}
OMP: pid 882769 tid 882890 thread 23 bound to OS proc set {34}
OMP: pid 882769 tid 882924 thread 57 bound to OS proc set {86}
OMP: pid 882769 tid 882920 thread 53 bound to OS proc set {80}
OMP: pid 882769 tid 882900 thread 33 bound to OS proc set {50}
OMP: pid 882769 tid 882889 thread 22 bound to OS proc set {33}
OMP: pid 882769 tid 882908 thread 41 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 0.670398, "speed_pp": 763.725403, "t_tg": 0.000000, "speed_tg": nan, "t": 0.670398, "speed": 763.725403}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_10
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_10 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_10 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_10 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_10 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_10 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_10 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_10 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_10 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 882950 tid 882950 thread 0 bound to OS proc set {0}
OMP: pid 882950 tid 883050 thread 2 bound to OS proc set {2}
OMP: pid 882950 tid 883049 thread 1 bound to OS proc set {1}
OMP: pid 882950 tid 883059 thread 11 bound to OS proc set {14}
OMP: pid 882950 tid 883115 thread 67 bound to OS proc set {90}
OMP: pid 882950 tid 883096 thread 48 bound to OS proc set {64}
OMP: pid 882950 tid 883094 thread 46 bound to OS proc set {61}
OMP: pid 882950 tid 883051 thread 3 bound to OS proc set {4}
OMP: pid 882950 tid 883083 thread 35 bound to OS proc set {47}
OMP: pid 882950 tid 883114 thread 66 bound to OS proc set {88}
OMP: pid 882950 tid 883119 thread 71 bound to OS proc set {95}
OMP: pid 882950 tid 883113 thread 65 bound to OS proc set {87}
OMP: pid 882950 tid 883112 thread 64 bound to OS proc set {86}
OMP: pid 882950 tid 883054 thread 6 bound to OS proc set {8}
OMP: pid 882950 tid 883116 thread 68 bound to OS proc set {91}
OMP: pid 882950 tid 883111 thread 63 bound to OS proc set {84}
OMP: pid 882950 tid 883098 thread 50 bound to OS proc set {67}
OMP: pid 882950 tid 883060 thread 12 bound to OS proc set {16}
OMP: pid 882950 tid 883093 thread 45 bound to OS proc set {60}
OMP: pid 882950 tid 883079 thread 31 bound to OS proc set {41}
OMP: pid 882950 tid 883075 thread 27 bound to OS proc set {36}
OMP: pid 882950 tid 883080 thread 32 bound to OS proc set {43}
OMP: pid 882950 tid 883088 thread 40 bound to OS proc set {53}
OMP: pid 882950 tid 883062 thread 14 bound to OS proc set {18}
OMP: pid 882950 tid 883090 thread 42 bound to OS proc set {56}
OMP: pid 882950 tid 883108 thread 60 bound to OS proc set {80}
OMP: pid 882950 tid 883081 thread 33 bound to OS proc set {44}
OMP: pid 882950 tid 883099 thread 51 bound to OS proc set {68}
OMP: pid 882950 tid 883082 thread 34 bound to OS proc set {45}
OMP: pid 882950 tid 883056 thread 8 bound to OS proc set {10}
OMP: pid 882950 tid 883091 thread 43 bound to OS proc set {57}
OMP: pid 882950 tid 883072 thread 24 bound to OS proc set {32}
OMP: pid 882950 tid 883055 thread 7 bound to OS proc set {9}
OMP: pid 882950 tid 883095 thread 47 bound to OS proc set {63}
OMP: pid 882950 tid 883092 thread 44 bound to OS proc set {59}
OMP: pid 882950 tid 883058 thread 10 bound to OS proc set {13}
OMP: pid 882950 tid 883107 thread 59 bound to OS proc set {79}
OMP: pid 882950 tid 883052 thread 4 bound to OS proc set {5}
OMP: pid 882950 tid 883097 thread 49 bound to OS proc set {66}
OMP: pid 882950 tid 883100 thread 52 bound to OS proc set {70}
OMP: pid 882950 tid 883074 thread 26 bound to OS proc set {35}
OMP: pid 882950 tid 883109 thread 61 bound to OS proc set {82}
OMP: pid 882950 tid 883078 thread 30 bound to OS proc set {40}
OMP: pid 882950 tid 883066 thread 18 bound to OS proc set {24}
OMP: pid 882950 tid 883071 thread 23 bound to OS proc set {30}
OMP: pid 882950 tid 883061 thread 13 bound to OS proc set {17}
OMP: pid 882950 tid 883063 thread 15 bound to OS proc set {20}
OMP: pid 882950 tid 883057 thread 9 bound to OS proc set {12}
OMP: pid 882950 tid 883067 thread 19 bound to OS proc set {25}
OMP: pid 882950 tid 883070 thread 22 bound to OS proc set {29}
OMP: pid 882950 tid 883118 thread 70 bound to OS proc set {94}
OMP: pid 882950 tid 883102 thread 54 bound to OS proc set {72}
OMP: pid 882950 tid 883064 thread 16 bound to OS proc set {21}
OMP: pid 882950 tid 883065 thread 17 bound to OS proc set {22}
OMP: pid 882950 tid 883076 thread 28 bound to OS proc set {37}
OMP: pid 882950 tid 883085 thread 37 bound to OS proc set {49}
OMP: pid 882950 tid 883084 thread 36 bound to OS proc set {48}
OMP: pid 882950 tid 883068 thread 20 bound to OS proc set {26}
OMP: pid 882950 tid 883077 thread 29 bound to OS proc set {39}
OMP: pid 882950 tid 883103 thread 55 bound to OS proc set {74}
OMP: pid 882950 tid 883101 thread 53 bound to OS proc set {71}
OMP: pid 882950 tid 883069 thread 21 bound to OS proc set {28}
OMP: pid 882950 tid 883106 thread 58 bound to OS proc set {78}
OMP: pid 882950 tid 883053 thread 5 bound to OS proc set {6}
OMP: pid 882950 tid 883105 thread 57 bound to OS proc set {76}
OMP: pid 882950 tid 883087 thread 39 bound to OS proc set {52}
OMP: pid 882950 tid 883086 thread 38 bound to OS proc set {51}
OMP: pid 882950 tid 883110 thread 62 bound to OS proc set {83}
OMP: pid 882950 tid 883117 thread 69 bound to OS proc set {92}
OMP: pid 882950 tid 883089 thread 41 bound to OS proc set {55}
OMP: pid 882950 tid 883104 thread 56 bound to OS proc set {75}
OMP: pid 882950 tid 883073 thread 25 bound to OS proc set {33}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 72, "n_threads_batch": 72, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 0.628193, "speed_pp": 815.036133, "t_tg": 0.000000, "speed_tg": nan, "t": 0.628193, "speed": 815.036133}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_11
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_11 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_11 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_11 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_11 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_11 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_11 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_11 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_11 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 883139 tid 883139 thread 0 bound to OS proc set {0}
OMP: pid 883139 tid 883240 thread 3 bound to OS proc set {3}
OMP: pid 883139 tid 883239 thread 2 bound to OS proc set {2}
OMP: pid 883139 tid 883238 thread 1 bound to OS proc set {1}
OMP: pid 883139 tid 883241 thread 4 bound to OS proc set {4}
OMP: pid 883139 tid 883249 thread 12 bound to OS proc set {14}
OMP: pid 883139 tid 883244 thread 7 bound to OS proc set {8}
OMP: pid 883139 tid 883243 thread 6 bound to OS proc set {7}
OMP: pid 883139 tid 883248 thread 11 bound to OS proc set {13}
OMP: pid 883139 tid 883287 thread 50 bound to OS proc set {60}
OMP: pid 883139 tid 883250 thread 13 bound to OS proc set {15}
OMP: pid 883139 tid 883242 thread 5 bound to OS proc set {6}
OMP: pid 883139 tid 883247 thread 10 bound to OS proc set {12}
OMP: pid 883139 tid 883251 thread 14 bound to OS proc set {16}
OMP: pid 883139 tid 883304 thread 67 bound to OS proc set {81}
OMP: pid 883139 tid 883302 thread 65 bound to OS proc set {78}
OMP: pid 883139 tid 883245 thread 8 bound to OS proc set {9}
OMP: pid 883139 tid 883252 thread 15 bound to OS proc set {18}
OMP: pid 883139 tid 883277 thread 40 bound to OS proc set {48}
OMP: pid 883139 tid 883299 thread 62 bound to OS proc set {75}
OMP: pid 883139 tid 883297 thread 60 bound to OS proc set {72}
OMP: pid 883139 tid 883284 thread 47 bound to OS proc set {56}
OMP: pid 883139 tid 883316 thread 79 bound to OS proc set {95}
OMP: pid 883139 tid 883300 thread 63 bound to OS proc set {76}
OMP: pid 883139 tid 883285 thread 48 bound to OS proc set {58}
OMP: pid 883139 tid 883261 thread 24 bound to OS proc set {29}
OMP: pid 883139 tid 883293 thread 56 bound to OS proc set {67}
OMP: pid 883139 tid 883272 thread 35 bound to OS proc set {42}
OMP: pid 883139 tid 883288 thread 51 bound to OS proc set {61}
OMP: pid 883139 tid 883262 thread 25 bound to OS proc set {30}
OMP: pid 883139 tid 883253 thread 16 bound to OS proc set {19}
OMP: pid 883139 tid 883301 thread 64 bound to OS proc set {77}
OMP: pid 883139 tid 883267 thread 30 bound to OS proc set {36}
OMP: pid 883139 tid 883257 thread 20 bound to OS proc set {24}
OMP: pid 883139 tid 883294 thread 57 bound to OS proc set {69}
OMP: pid 883139 tid 883264 thread 27 bound to OS proc set {32}
OMP: pid 883139 tid 883271 thread 34 bound to OS proc set {41}
OMP: pid 883139 tid 883305 thread 68 bound to OS proc set {82}
OMP: pid 883139 tid 883269 thread 32 bound to OS proc set {38}
OMP: pid 883139 tid 883315 thread 78 bound to OS proc set {94}
OMP: pid 883139 tid 883283 thread 46 bound to OS proc set {55}
OMP: pid 883139 tid 883276 thread 39 bound to OS proc set {47}
OMP: pid 883139 tid 883259 thread 22 bound to OS proc set {26}
OMP: pid 883139 tid 883273 thread 36 bound to OS proc set {43}
OMP: pid 883139 tid 883280 thread 43 bound to OS proc set {52}
OMP: pid 883139 tid 883254 thread 17 bound to OS proc set {20}
OMP: pid 883139 tid 883289 thread 52 bound to OS proc set {63}
OMP: pid 883139 tid 883275 thread 38 bound to OS proc set {46}
OMP: pid 883139 tid 883256 thread 19 bound to OS proc set {23}
OMP: pid 883139 tid 883312 thread 75 bound to OS proc set {90}
OMP: pid 883139 tid 883295 thread 58 bound to OS proc set {70}
OMP: pid 883139 tid 883296 thread 59 bound to OS proc set {71}
OMP: pid 883139 tid 883268 thread 31 bound to OS proc set {37}
OMP: pid 883139 tid 883313 thread 76 bound to OS proc set {92}
OMP: pid 883139 tid 883292 thread 55 bound to OS proc set {66}
OMP: pid 883139 tid 883246 thread 9 bound to OS proc set {10}
OMP: pid 883139 tid 883286 thread 49 bound to OS proc set {59}
OMP: pid 883139 tid 883281 thread 44 bound to OS proc set {53}
OMP: pid 883139 tid 883279 thread 42 bound to OS proc set {50}
OMP: pid 883139 tid 883282 thread 45 bound to OS proc set {54}
OMP: pid 883139 tid 883290 thread 53 bound to OS proc set {64}
OMP: pid 883139 tid 883266 thread 29 bound to OS proc set {35}
OMP: pid 883139 tid 883274 thread 37 bound to OS proc set {44}
OMP: pid 883139 tid 883309 thread 72 bound to OS proc set {87}
OMP: pid 883139 tid 883270 thread 33 bound to OS proc set {40}
OMP: pid 883139 tid 883291 thread 54 bound to OS proc set {65}
OMP: pid 883139 tid 883255 thread 18 bound to OS proc set {21}
OMP: pid 883139 tid 883263 thread 26 bound to OS proc set {31}
OMP: pid 883139 tid 883265 thread 28 bound to OS proc set {33}
OMP: pid 883139 tid 883303 thread 66 bound to OS proc set {80}
OMP: pid 883139 tid 883258 thread 21 bound to OS proc set {25}
OMP: pid 883139 tid 883298 thread 61 bound to OS proc set {73}
OMP: pid 883139 tid 883260 thread 23 bound to OS proc set {27}
OMP: pid 883139 tid 883314 thread 77 bound to OS proc set {93}
OMP: pid 883139 tid 883307 thread 70 bound to OS proc set {84}
OMP: pid 883139 tid 883311 thread 74 bound to OS proc set {89}
OMP: pid 883139 tid 883308 thread 71 bound to OS proc set {86}
OMP: pid 883139 tid 883310 thread 73 bound to OS proc set {88}
OMP: pid 883139 tid 883278 thread 41 bound to OS proc set {49}
OMP: pid 883139 tid 883306 thread 69 bound to OS proc set {83}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 80, "n_threads_batch": 80, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 0.577249, "speed_pp": 886.965576, "t_tg": 0.000000, "speed_tg": nan, "t": 0.577249, "speed": 886.965576}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_12
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_12 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_12 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_12 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_12 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_12 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_12 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_12 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_12 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 883336 tid 883336 thread 0 bound to OS proc set {0}
OMP: pid 883336 tid 883449 thread 15 bound to OS proc set {16}
OMP: pid 883336 tid 883436 thread 2 bound to OS proc set {2}
OMP: pid 883336 tid 883437 thread 3 bound to OS proc set {3}
OMP: pid 883336 tid 883445 thread 11 bound to OS proc set {12}
OMP: pid 883336 tid 883446 thread 12 bound to OS proc set {13}
OMP: pid 883336 tid 883442 thread 8 bound to OS proc set {8}
OMP: pid 883336 tid 883448 thread 14 bound to OS proc set {15}
OMP: pid 883336 tid 883441 thread 7 bound to OS proc set {7}
OMP: pid 883336 tid 883435 thread 1 bound to OS proc set {1}
OMP: pid 883336 tid 883447 thread 13 bound to OS proc set {14}
OMP: pid 883336 tid 883438 thread 4 bound to OS proc set {4}
OMP: pid 883336 tid 883440 thread 6 bound to OS proc set {6}
OMP: pid 883336 tid 883444 thread 10 bound to OS proc set {11}
OMP: pid 883336 tid 883443 thread 9 bound to OS proc set {9}
OMP: pid 883336 tid 883453 thread 19 bound to OS proc set {20}
OMP: pid 883336 tid 883462 thread 28 bound to OS proc set {30}
OMP: pid 883336 tid 883461 thread 27 bound to OS proc set {29}
OMP: pid 883336 tid 883450 thread 16 bound to OS proc set {17}
OMP: pid 883336 tid 883439 thread 5 bound to OS proc set {5}
OMP: pid 883336 tid 883452 thread 18 bound to OS proc set {19}
OMP: pid 883336 tid 883463 thread 29 bound to OS proc set {31}
OMP: pid 883336 tid 883460 thread 26 bound to OS proc set {28}
OMP: pid 883336 tid 883451 thread 17 bound to OS proc set {18}
OMP: pid 883336 tid 883465 thread 31 bound to OS proc set {34}
OMP: pid 883336 tid 883481 thread 47 bound to OS proc set {51}
OMP: pid 883336 tid 883494 thread 60 bound to OS proc set {66}
OMP: pid 883336 tid 883493 thread 59 bound to OS proc set {65}
OMP: pid 883336 tid 883497 thread 63 bound to OS proc set {69}
OMP: pid 883336 tid 883501 thread 67 bound to OS proc set {73}
OMP: pid 883336 tid 883495 thread 61 bound to OS proc set {67}
OMP: pid 883336 tid 883473 thread 39 bound to OS proc set {42}
OMP: pid 883336 tid 883488 thread 54 bound to OS proc set {59}
OMP: pid 883336 tid 883480 thread 46 bound to OS proc set {50}
OMP: pid 883336 tid 883498 thread 64 bound to OS proc set {70}
OMP: pid 883336 tid 883478 thread 44 bound to OS proc set {48}
OMP: pid 883336 tid 883482 thread 48 bound to OS proc set {52}
OMP: pid 883336 tid 883506 thread 72 bound to OS proc set {79}
OMP: pid 883336 tid 883483 thread 49 bound to OS proc set {54}
OMP: pid 883336 tid 883457 thread 23 bound to OS proc set {25}
OMP: pid 883336 tid 883505 thread 71 bound to OS proc set {78}
OMP: pid 883336 tid 883469 thread 35 bound to OS proc set {38}
OMP: pid 883336 tid 883468 thread 34 bound to OS proc set {37}
OMP: pid 883336 tid 883490 thread 56 bound to OS proc set {61}
OMP: pid 883336 tid 883474 thread 40 bound to OS proc set {44}
OMP: pid 883336 tid 883486 thread 52 bound to OS proc set {57}
OMP: pid 883336 tid 883484 thread 50 bound to OS proc set {55}
OMP: pid 883336 tid 883466 thread 32 bound to OS proc set {35}
OMP: pid 883336 tid 883492 thread 58 bound to OS proc set {63}
OMP: pid 883336 tid 883477 thread 43 bound to OS proc set {47}
OMP: pid 883336 tid 883475 thread 41 bound to OS proc set {45}
OMP: pid 883336 tid 883476 thread 42 bound to OS proc set {46}
OMP: pid 883336 tid 883510 thread 76 bound to OS proc set {83}
OMP: pid 883336 tid 883464 thread 30 bound to OS proc set {33}
OMP: pid 883336 tid 883513 thread 79 bound to OS proc set {87}
OMP: pid 883336 tid 883472 thread 38 bound to OS proc set {41}
OMP: pid 883336 tid 883489 thread 55 bound to OS proc set {60}
OMP: pid 883336 tid 883479 thread 45 bound to OS proc set {49}
OMP: pid 883336 tid 883499 thread 65 bound to OS proc set {71}
OMP: pid 883336 tid 883496 thread 62 bound to OS proc set {68}
OMP: pid 883336 tid 883458 thread 24 bound to OS proc set {26}
OMP: pid 883336 tid 883454 thread 20 bound to OS proc set {22}
OMP: pid 883336 tid 883455 thread 21 bound to OS proc set {23}
OMP: pid 883336 tid 883470 thread 36 bound to OS proc set {39}
OMP: pid 883336 tid 883485 thread 51 bound to OS proc set {56}
OMP: pid 883336 tid 883508 thread 74 bound to OS proc set {81}
OMP: pid 883336 tid 883467 thread 33 bound to OS proc set {36}
OMP: pid 883336 tid 883487 thread 53 bound to OS proc set {58}
OMP: pid 883336 tid 883491 thread 57 bound to OS proc set {62}
OMP: pid 883336 tid 883456 thread 22 bound to OS proc set {24}
OMP: pid 883336 tid 883471 thread 37 bound to OS proc set {40}
OMP: pid 883336 tid 883504 thread 70 bound to OS proc set {77}
OMP: pid 883336 tid 883509 thread 75 bound to OS proc set {82}
OMP: pid 883336 tid 883500 thread 66 bound to OS proc set {72}
OMP: pid 883336 tid 883502 thread 68 bound to OS proc set {74}
OMP: pid 883336 tid 883511 thread 77 bound to OS proc set {84}
OMP: pid 883336 tid 883516 thread 82 bound to OS proc set {90}
OMP: pid 883336 tid 883517 thread 83 bound to OS proc set {91}
OMP: pid 883336 tid 883459 thread 25 bound to OS proc set {27}
OMP: pid 883336 tid 883512 thread 78 bound to OS proc set {85}
OMP: pid 883336 tid 883514 thread 80 bound to OS proc set {88}
OMP: pid 883336 tid 883503 thread 69 bound to OS proc set {76}
OMP: pid 883336 tid 883507 thread 73 bound to OS proc set {80}
OMP: pid 883336 tid 883521 thread 87 bound to OS proc set {95}
OMP: pid 883336 tid 883519 thread 85 bound to OS proc set {93}
OMP: pid 883336 tid 883520 thread 86 bound to OS proc set {94}
OMP: pid 883336 tid 883515 thread 81 bound to OS proc set {89}
OMP: pid 883336 tid 883518 thread 84 bound to OS proc set {92}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 88, "n_threads_batch": 88, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 0.543030, "speed_pp": 942.857605, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 0.543031, "speed": 942.855896}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_13
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_13 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_13 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_13 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_13 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_13 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_13 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_13 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_13 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 883590 tid 883590 thread 0 bound to OS proc set {0}
OMP: pid 883590 tid 883703 thread 15 bound to OS proc set {15}
OMP: pid 883590 tid 883700 thread 12 bound to OS proc set {12}
OMP: pid 883590 tid 883696 thread 8 bound to OS proc set {8}
OMP: pid 883590 tid 883691 thread 3 bound to OS proc set {3}
OMP: pid 883590 tid 883690 thread 2 bound to OS proc set {2}
OMP: pid 883590 tid 883702 thread 14 bound to OS proc set {14}
OMP: pid 883590 tid 883699 thread 11 bound to OS proc set {11}
OMP: pid 883590 tid 883701 thread 13 bound to OS proc set {13}
OMP: pid 883590 tid 883698 thread 10 bound to OS proc set {10}
OMP: pid 883590 tid 883707 thread 19 bound to OS proc set {19}
OMP: pid 883590 tid 883695 thread 7 bound to OS proc set {7}
OMP: pid 883590 tid 883704 thread 16 bound to OS proc set {16}
OMP: pid 883590 tid 883689 thread 1 bound to OS proc set {1}
OMP: pid 883590 tid 883692 thread 4 bound to OS proc set {4}
OMP: pid 883590 tid 883697 thread 9 bound to OS proc set {9}
OMP: pid 883590 tid 883706 thread 18 bound to OS proc set {18}
OMP: pid 883590 tid 883694 thread 6 bound to OS proc set {6}
OMP: pid 883590 tid 883736 thread 48 bound to OS proc set {48}
OMP: pid 883590 tid 883711 thread 23 bound to OS proc set {23}
OMP: pid 883590 tid 883739 thread 51 bound to OS proc set {51}
OMP: pid 883590 tid 883732 thread 44 bound to OS proc set {44}
OMP: pid 883590 tid 883712 thread 24 bound to OS proc set {24}
OMP: pid 883590 tid 883750 thread 62 bound to OS proc set {62}
OMP: pid 883590 tid 883708 thread 20 bound to OS proc set {20}
OMP: pid 883590 tid 883705 thread 17 bound to OS proc set {17}
OMP: pid 883590 tid 883714 thread 26 bound to OS proc set {26}
OMP: pid 883590 tid 883748 thread 60 bound to OS proc set {60}
OMP: pid 883590 tid 883751 thread 63 bound to OS proc set {63}
OMP: pid 883590 tid 883746 thread 58 bound to OS proc set {58}
OMP: pid 883590 tid 883738 thread 50 bound to OS proc set {50}
OMP: pid 883590 tid 883716 thread 28 bound to OS proc set {28}
OMP: pid 883590 tid 883693 thread 5 bound to OS proc set {5}
OMP: pid 883590 tid 883735 thread 47 bound to OS proc set {47}
OMP: pid 883590 tid 883720 thread 32 bound to OS proc set {32}
OMP: pid 883590 tid 883723 thread 35 bound to OS proc set {35}
OMP: pid 883590 tid 883737 thread 49 bound to OS proc set {49}
OMP: pid 883590 tid 883719 thread 31 bound to OS proc set {31}
OMP: pid 883590 tid 883710 thread 22 bound to OS proc set {22}
OMP: pid 883590 tid 883718 thread 30 bound to OS proc set {30}
OMP: pid 883590 tid 883713 thread 25 bound to OS proc set {25}
OMP: pid 883590 tid 883709 thread 21 bound to OS proc set {21}
OMP: pid 883590 tid 883734 thread 46 bound to OS proc set {46}
OMP: pid 883590 tid 883733 thread 45 bound to OS proc set {45}
OMP: pid 883590 tid 883717 thread 29 bound to OS proc set {29}
OMP: pid 883590 tid 883755 thread 67 bound to OS proc set {67}
OMP: pid 883590 tid 883752 thread 64 bound to OS proc set {64}
OMP: pid 883590 tid 883722 thread 34 bound to OS proc set {34}
OMP: pid 883590 tid 883766 thread 78 bound to OS proc set {78}
OMP: pid 883590 tid 883749 thread 61 bound to OS proc set {61}
OMP: pid 883590 tid 883767 thread 79 bound to OS proc set {79}
OMP: pid 883590 tid 883731 thread 43 bound to OS proc set {43}
OMP: pid 883590 tid 883744 thread 56 bound to OS proc set {56}
OMP: pid 883590 tid 883728 thread 40 bound to OS proc set {40}
OMP: pid 883590 tid 883747 thread 59 bound to OS proc set {59}
OMP: pid 883590 tid 883724 thread 36 bound to OS proc set {36}
OMP: pid 883590 tid 883730 thread 42 bound to OS proc set {42}
OMP: pid 883590 tid 883729 thread 41 bound to OS proc set {41}
OMP: pid 883590 tid 883763 thread 75 bound to OS proc set {75}
OMP: pid 883590 tid 883753 thread 65 bound to OS proc set {65}
OMP: pid 883590 tid 883765 thread 77 bound to OS proc set {77}
OMP: pid 883590 tid 883754 thread 66 bound to OS proc set {66}
OMP: pid 883590 tid 883745 thread 57 bound to OS proc set {57}
OMP: pid 883590 tid 883726 thread 38 bound to OS proc set {38}
OMP: pid 883590 tid 883740 thread 52 bound to OS proc set {52}
OMP: pid 883590 tid 883727 thread 39 bound to OS proc set {39}
OMP: pid 883590 tid 883764 thread 76 bound to OS proc set {76}
OMP: pid 883590 tid 883762 thread 74 bound to OS proc set {74}
OMP: pid 883590 tid 883743 thread 55 bound to OS proc set {55}
OMP: pid 883590 tid 883715 thread 27 bound to OS proc set {27}
OMP: pid 883590 tid 883759 thread 71 bound to OS proc set {71}
OMP: pid 883590 tid 883725 thread 37 bound to OS proc set {37}
OMP: pid 883590 tid 883782 thread 94 bound to OS proc set {94}
OMP: pid 883590 tid 883761 thread 73 bound to OS proc set {73}
OMP: pid 883590 tid 883781 thread 93 bound to OS proc set {93}
OMP: pid 883590 tid 883778 thread 90 bound to OS proc set {90}
OMP: pid 883590 tid 883779 thread 91 bound to OS proc set {91}
OMP: pid 883590 tid 883756 thread 68 bound to OS proc set {68}
OMP: pid 883590 tid 883770 thread 82 bound to OS proc set {82}
OMP: pid 883590 tid 883768 thread 80 bound to OS proc set {80}
OMP: pid 883590 tid 883771 thread 83 bound to OS proc set {83}
OMP: pid 883590 tid 883780 thread 92 bound to OS proc set {92}
OMP: pid 883590 tid 883758 thread 70 bound to OS proc set {70}
OMP: pid 883590 tid 883777 thread 89 bound to OS proc set {89}
OMP: pid 883590 tid 883742 thread 54 bound to OS proc set {54}
OMP: pid 883590 tid 883757 thread 69 bound to OS proc set {69}
OMP: pid 883590 tid 883760 thread 72 bound to OS proc set {72}
OMP: pid 883590 tid 883769 thread 81 bound to OS proc set {81}
OMP: pid 883590 tid 883774 thread 86 bound to OS proc set {86}
OMP: pid 883590 tid 883775 thread 87 bound to OS proc set {87}
OMP: pid 883590 tid 883741 thread 53 bound to OS proc set {53}
OMP: pid 883590 tid 883772 thread 84 bound to OS proc set {84}
OMP: pid 883590 tid 883776 thread 88 bound to OS proc set {88}
OMP: pid 883590 tid 883773 thread 85 bound to OS proc set {85}
OMP: pid 883590 tid 883721 thread 33 bound to OS proc set {33}
OMP: pid 883590 tid 883783 thread 95 bound to OS proc set {95}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 96, "n_threads_batch": 96, "pp": 128, "tg": 0, "pl": 4, "n_kv": 512, "t_pp": 0.515455, "speed_pp": 993.297180, "t_tg": 0.000000, "speed_tg": nan, "t": 0.515455, "speed": 993.297180}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_14
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_14 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_14 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_14 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_14 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_14 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_14 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_14 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-7732/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_15-39-17/tools/lprof_npsu_run_14 #
#########################################################################################################################################################################################################################################