* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3988018 tid 3988018 thread 0 bound to OS proc set {0}
OMP: pid 3988018 tid 3988086 thread 2 bound to OS proc set {32}
OMP: pid 3988018 tid 3988085 thread 1 bound to OS proc set {16}
OMP: pid 3988018 tid 3988087 thread 3 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 13.603342, "speed_pp": 18.818905, "t_tg": 0.000000, "speed_tg": nan, "t": 13.603342, "speed": 18.818905}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_2
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_2 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3988822 tid 3988822 thread 0 bound to OS proc set {0}
OMP: pid 3988822 tid 3988890 thread 2 bound to OS proc set {16}
OMP: pid 3988822 tid 3988891 thread 3 bound to OS proc set {24}
OMP: pid 3988822 tid 3988889 thread 1 bound to OS proc set {8}
OMP: pid 3988822 tid 3988892 thread 4 bound to OS proc set {32}
OMP: pid 3988822 tid 3988894 thread 6 bound to OS proc set {48}
OMP: pid 3988822 tid 3988893 thread 5 bound to OS proc set {40}
OMP: pid 3988822 tid 3988895 thread 7 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 8.429540, "speed_pp": 30.369392, "t_tg": 0.000000, "speed_tg": nan, "t": 8.429540, "speed": 30.369392}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_3
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_3 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3990464 tid 3990464 thread 0 bound to OS proc set {0}
OMP: pid 3990464 tid 3990533 thread 3 bound to OS proc set {12}
OMP: pid 3990464 tid 3990532 thread 2 bound to OS proc set {8}
OMP: pid 3990464 tid 3990531 thread 1 bound to OS proc set {4}
OMP: pid 3990464 tid 3990542 thread 12 bound to OS proc set {48}
OMP: pid 3990464 tid 3990543 thread 13 bound to OS proc set {52}
OMP: pid 3990464 tid 3990541 thread 11 bound to OS proc set {44}
OMP: pid 3990464 tid 3990544 thread 14 bound to OS proc set {56}
OMP: pid 3990464 tid 3990538 thread 8 bound to OS proc set {32}
OMP: pid 3990464 tid 3990534 thread 4 bound to OS proc set {16}
OMP: pid 3990464 tid 3990536 thread 6 bound to OS proc set {24}
OMP: pid 3990464 tid 3990537 thread 7 bound to OS proc set {28}
OMP: pid 3990464 tid 3990540 thread 10 bound to OS proc set {40}
OMP: pid 3990464 tid 3990539 thread 9 bound to OS proc set {36}
OMP: pid 3990464 tid 3990535 thread 5 bound to OS proc set {20}
OMP: pid 3990464 tid 3990545 thread 15 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 6.404086, "speed_pp": 39.974480, "t_tg": 0.000000, "speed_tg": nan, "t": 6.404086, "speed": 39.974480}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_4
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_4 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3993881 tid 3993881 thread 0 bound to OS proc set {0}
OMP: pid 3993881 tid 3993948 thread 1 bound to OS proc set {2}
OMP: pid 3993881 tid 3993955 thread 8 bound to OS proc set {21}
OMP: pid 3993881 tid 3993953 thread 6 bound to OS proc set {16}
OMP: pid 3993881 tid 3993963 thread 16 bound to OS proc set {43}
OMP: pid 3993881 tid 3993950 thread 3 bound to OS proc set {8}
OMP: pid 3993881 tid 3993962 thread 15 bound to OS proc set {40}
OMP: pid 3993881 tid 3993954 thread 7 bound to OS proc set {18}
OMP: pid 3993881 tid 3993966 thread 19 bound to OS proc set {51}
OMP: pid 3993881 tid 3993958 thread 11 bound to OS proc set {29}
OMP: pid 3993881 tid 3993959 thread 12 bound to OS proc set {32}
OMP: pid 3993881 tid 3993956 thread 9 bound to OS proc set {24}
OMP: pid 3993881 tid 3993949 thread 2 bound to OS proc set {5}
OMP: pid 3993881 tid 3993951 thread 4 bound to OS proc set {10}
OMP: pid 3993881 tid 3993961 thread 14 bound to OS proc set {37}
OMP: pid 3993881 tid 3993967 thread 20 bound to OS proc set {54}
OMP: pid 3993881 tid 3993964 thread 17 bound to OS proc set {46}
OMP: pid 3993881 tid 3993965 thread 18 bound to OS proc set {48}
OMP: pid 3993881 tid 3993957 thread 10 bound to OS proc set {27}
OMP: pid 3993881 tid 3993969 thread 22 bound to OS proc set {59}
OMP: pid 3993881 tid 3993960 thread 13 bound to OS proc set {35}
OMP: pid 3993881 tid 3993968 thread 21 bound to OS proc set {56}
OMP: pid 3993881 tid 3993970 thread 23 bound to OS proc set {62}
OMP: pid 3993881 tid 3993952 thread 5 bound to OS proc set {13}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 5.941695, "speed_pp": 43.085346, "t_tg": 0.000000, "speed_tg": nan, "t": 5.941695, "speed": 43.085346}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_5
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_5 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 3999073 tid 3999073 thread 0 bound to OS proc set {0}
OMP: pid 3999073 tid 3999151 thread 12 bound to OS proc set {24}
OMP: pid 3999073 tid 3999147 thread 8 bound to OS proc set {16}
OMP: pid 3999073 tid 3999140 thread 1 bound to OS proc set {2}
OMP: pid 3999073 tid 3999153 thread 14 bound to OS proc set {28}
OMP: pid 3999073 tid 3999146 thread 7 bound to OS proc set {14}
OMP: pid 3999073 tid 3999141 thread 2 bound to OS proc set {4}
OMP: pid 3999073 tid 3999142 thread 3 bound to OS proc set {6}
OMP: pid 3999073 tid 3999145 thread 6 bound to OS proc set {12}
OMP: pid 3999073 tid 3999143 thread 4 bound to OS proc set {8}
OMP: pid 3999073 tid 3999148 thread 9 bound to OS proc set {18}
OMP: pid 3999073 tid 3999150 thread 11 bound to OS proc set {22}
OMP: pid 3999073 tid 3999144 thread 5 bound to OS proc set {10}
OMP: pid 3999073 tid 3999149 thread 10 bound to OS proc set {20}
OMP: pid 3999073 tid 3999167 thread 28 bound to OS proc set {56}
OMP: pid 3999073 tid 3999156 thread 17 bound to OS proc set {34}
OMP: pid 3999073 tid 3999157 thread 18 bound to OS proc set {36}
OMP: pid 3999073 tid 3999154 thread 15 bound to OS proc set {30}
OMP: pid 3999073 tid 3999152 thread 13 bound to OS proc set {26}
OMP: pid 3999073 tid 3999158 thread 19 bound to OS proc set {38}
OMP: pid 3999073 tid 3999169 thread 30 bound to OS proc set {60}
OMP: pid 3999073 tid 3999165 thread 26 bound to OS proc set {52}
OMP: pid 3999073 tid 3999163 thread 24 bound to OS proc set {48}
OMP: pid 3999073 tid 3999168 thread 29 bound to OS proc set {58}
OMP: pid 3999073 tid 3999170 thread 31 bound to OS proc set {62}
OMP: pid 3999073 tid 3999166 thread 27 bound to OS proc set {54}
OMP: pid 3999073 tid 3999159 thread 20 bound to OS proc set {40}
OMP: pid 3999073 tid 3999162 thread 23 bound to OS proc set {46}
OMP: pid 3999073 tid 3999161 thread 22 bound to OS proc set {44}
OMP: pid 3999073 tid 3999155 thread 16 bound to OS proc set {32}
OMP: pid 3999073 tid 3999160 thread 21 bound to OS proc set {42}
OMP: pid 3999073 tid 3999164 thread 25 bound to OS proc set {50}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 5.837714, "speed_pp": 43.852779, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 5.837715, "speed": 43.852772}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_6
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_6 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 4006090 tid 4006090 thread 0 bound to OS proc set {0}
OMP: pid 4006090 tid 4006157 thread 1 bound to OS proc set {1}
OMP: pid 4006090 tid 4006158 thread 2 bound to OS proc set {3}
OMP: pid 4006090 tid 4006163 thread 7 bound to OS proc set {11}
OMP: pid 4006090 tid 4006166 thread 10 bound to OS proc set {16}
OMP: pid 4006090 tid 4006159 thread 3 bound to OS proc set {4}
OMP: pid 4006090 tid 4006160 thread 4 bound to OS proc set {6}
OMP: pid 4006090 tid 4006170 thread 14 bound to OS proc set {22}
OMP: pid 4006090 tid 4006164 thread 8 bound to OS proc set {13}
OMP: pid 4006090 tid 4006161 thread 5 bound to OS proc set {8}
OMP: pid 4006090 tid 4006168 thread 12 bound to OS proc set {19}
OMP: pid 4006090 tid 4006191 thread 35 bound to OS proc set {56}
OMP: pid 4006090 tid 4006167 thread 11 bound to OS proc set {17}
OMP: pid 4006090 tid 4006187 thread 31 bound to OS proc set {50}
OMP: pid 4006090 tid 4006192 thread 36 bound to OS proc set {58}
OMP: pid 4006090 tid 4006174 thread 18 bound to OS proc set {29}
OMP: pid 4006090 tid 4006195 thread 39 bound to OS proc set {63}
OMP: pid 4006090 tid 4006188 thread 32 bound to OS proc set {52}
OMP: pid 4006090 tid 4006162 thread 6 bound to OS proc set {9}
OMP: pid 4006090 tid 4006171 thread 15 bound to OS proc set {24}
OMP: pid 4006090 tid 4006194 thread 38 bound to OS proc set {61}
OMP: pid 4006090 tid 4006173 thread 17 bound to OS proc set {27}
OMP: pid 4006090 tid 4006190 thread 34 bound to OS proc set {55}
OMP: pid 4006090 tid 4006193 thread 37 bound to OS proc set {60}
OMP: pid 4006090 tid 4006180 thread 24 bound to OS proc set {39}
OMP: pid 4006090 tid 4006183 thread 27 bound to OS proc set {43}
OMP: pid 4006090 tid 4006179 thread 23 bound to OS proc set {37}
OMP: pid 4006090 tid 4006175 thread 19 bound to OS proc set {30}
OMP: pid 4006090 tid 4006186 thread 30 bound to OS proc set {48}
OMP: pid 4006090 tid 4006184 thread 28 bound to OS proc set {45}
OMP: pid 4006090 tid 4006189 thread 33 bound to OS proc set {53}
OMP: pid 4006090 tid 4006178 thread 22 bound to OS proc set {35}
OMP: pid 4006090 tid 4006181 thread 25 bound to OS proc set {40}
OMP: pid 4006090 tid 4006169 thread 13 bound to OS proc set {21}
OMP: pid 4006090 tid 4006172 thread 16 bound to OS proc set {26}
OMP: pid 4006090 tid 4006182 thread 26 bound to OS proc set {42}
OMP: pid 4006090 tid 4006176 thread 20 bound to OS proc set {32}
OMP: pid 4006090 tid 4006185 thread 29 bound to OS proc set {47}
OMP: pid 4006090 tid 4006177 thread 21 bound to OS proc set {34}
OMP: pid 4006090 tid 4006165 thread 9 bound to OS proc set {14}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 6.114342, "speed_pp": 41.868771, "t_tg": 0.000000, "speed_tg": nan, "t": 6.114342, "speed": 41.868771}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_7
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_7 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 4014834 tid 4014834 thread 0 bound to OS proc set {0}
OMP: pid 4014834 tid 4014902 thread 2 bound to OS proc set {2}
OMP: pid 4014834 tid 4014901 thread 1 bound to OS proc set {1}
OMP: pid 4014834 tid 4014911 thread 11 bound to OS proc set {14}
OMP: pid 4014834 tid 4014903 thread 3 bound to OS proc set {4}
OMP: pid 4014834 tid 4014935 thread 35 bound to OS proc set {47}
OMP: pid 4014834 tid 4014908 thread 8 bound to OS proc set {10}
OMP: pid 4014834 tid 4014906 thread 6 bound to OS proc set {8}
OMP: pid 4014834 tid 4014910 thread 10 bound to OS proc set {13}
OMP: pid 4014834 tid 4014914 thread 14 bound to OS proc set {18}
OMP: pid 4014834 tid 4014912 thread 12 bound to OS proc set {16}
OMP: pid 4014834 tid 4014915 thread 15 bound to OS proc set {20}
OMP: pid 4014834 tid 4014913 thread 13 bound to OS proc set {17}
OMP: pid 4014834 tid 4014909 thread 9 bound to OS proc set {12}
OMP: pid 4014834 tid 4014907 thread 7 bound to OS proc set {9}
OMP: pid 4014834 tid 4014934 thread 34 bound to OS proc set {46}
OMP: pid 4014834 tid 4014904 thread 4 bound to OS proc set {5}
OMP: pid 4014834 tid 4014926 thread 26 bound to OS proc set {35}
OMP: pid 4014834 tid 4014933 thread 33 bound to OS proc set {44}
OMP: pid 4014834 tid 4014918 thread 18 bound to OS proc set {24}
OMP: pid 4014834 tid 4014944 thread 44 bound to OS proc set {59}
OMP: pid 4014834 tid 4014919 thread 19 bound to OS proc set {25}
OMP: pid 4014834 tid 4014947 thread 47 bound to OS proc set {63}
OMP: pid 4014834 tid 4014946 thread 46 bound to OS proc set {62}
OMP: pid 4014834 tid 4014931 thread 31 bound to OS proc set {41}
OMP: pid 4014834 tid 4014924 thread 24 bound to OS proc set {32}
OMP: pid 4014834 tid 4014930 thread 30 bound to OS proc set {40}
OMP: pid 4014834 tid 4014927 thread 27 bound to OS proc set {36}
OMP: pid 4014834 tid 4014923 thread 23 bound to OS proc set {31}
OMP: pid 4014834 tid 4014916 thread 16 bound to OS proc set {21}
OMP: pid 4014834 tid 4014920 thread 20 bound to OS proc set {27}
OMP: pid 4014834 tid 4014928 thread 28 bound to OS proc set {37}
OMP: pid 4014834 tid 4014925 thread 25 bound to OS proc set {33}
OMP: pid 4014834 tid 4014905 thread 5 bound to OS proc set {6}
OMP: pid 4014834 tid 4014932 thread 32 bound to OS proc set {43}
OMP: pid 4014834 tid 4014936 thread 36 bound to OS proc set {48}
OMP: pid 4014834 tid 4014917 thread 17 bound to OS proc set {23}
OMP: pid 4014834 tid 4014929 thread 29 bound to OS proc set {39}
OMP: pid 4014834 tid 4014921 thread 21 bound to OS proc set {28}
OMP: pid 4014834 tid 4014922 thread 22 bound to OS proc set {29}
OMP: pid 4014834 tid 4014945 thread 45 bound to OS proc set {60}
OMP: pid 4014834 tid 4014940 thread 40 bound to OS proc set {54}
OMP: pid 4014834 tid 4014943 thread 43 bound to OS proc set {58}
OMP: pid 4014834 tid 4014942 thread 42 bound to OS proc set {56}
OMP: pid 4014834 tid 4014939 thread 39 bound to OS proc set {52}
OMP: pid 4014834 tid 4014938 thread 38 bound to OS proc set {51}
OMP: pid 4014834 tid 4014941 thread 41 bound to OS proc set {55}
OMP: pid 4014834 tid 4014937 thread 37 bound to OS proc set {50}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 6.503687, "speed_pp": 39.362289, "t_tg": 0.000000, "speed_tg": nan, "t": 6.503687, "speed": 39.362289}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_8
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_8 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_8 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_8 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_8 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_8 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_8 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_8 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_8 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 4025354 tid 4025354 thread 0 bound to OS proc set {0}
OMP: pid 4025354 tid 4025423 thread 3 bound to OS proc set {3}
OMP: pid 4025354 tid 4025422 thread 2 bound to OS proc set {2}
OMP: pid 4025354 tid 4025421 thread 1 bound to OS proc set {1}
OMP: pid 4025354 tid 4025424 thread 4 bound to OS proc set {4}
OMP: pid 4025354 tid 4025426 thread 6 bound to OS proc set {6}
OMP: pid 4025354 tid 4025425 thread 5 bound to OS proc set {5}
OMP: pid 4025354 tid 4025469 thread 49 bound to OS proc set {56}
OMP: pid 4025354 tid 4025427 thread 7 bound to OS proc set {8}
OMP: pid 4025354 tid 4025431 thread 11 bound to OS proc set {12}
OMP: pid 4025354 tid 4025471 thread 51 bound to OS proc set {59}
OMP: pid 4025354 tid 4025468 thread 48 bound to OS proc set {55}
OMP: pid 4025354 tid 4025475 thread 55 bound to OS proc set {63}
OMP: pid 4025354 tid 4025432 thread 12 bound to OS proc set {13}
OMP: pid 4025354 tid 4025448 thread 28 bound to OS proc set {32}
OMP: pid 4025354 tid 4025466 thread 46 bound to OS proc set {53}
OMP: pid 4025354 tid 4025430 thread 10 bound to OS proc set {11}
OMP: pid 4025354 tid 4025434 thread 14 bound to OS proc set {16}
OMP: pid 4025354 tid 4025470 thread 50 bound to OS proc set {58}
OMP: pid 4025354 tid 4025433 thread 13 bound to OS proc set {15}
OMP: pid 4025354 tid 4025438 thread 18 bound to OS proc set {20}
OMP: pid 4025354 tid 4025447 thread 27 bound to OS proc set {31}
OMP: pid 4025354 tid 4025463 thread 43 bound to OS proc set {49}
OMP: pid 4025354 tid 4025472 thread 52 bound to OS proc set {60}
OMP: pid 4025354 tid 4025428 thread 8 bound to OS proc set {9}
OMP: pid 4025354 tid 4025451 thread 31 bound to OS proc set {35}
OMP: pid 4025354 tid 4025464 thread 44 bound to OS proc set {51}
OMP: pid 4025354 tid 4025465 thread 45 bound to OS proc set {52}
OMP: pid 4025354 tid 4025429 thread 9 bound to OS proc set {10}
OMP: pid 4025354 tid 4025435 thread 15 bound to OS proc set {17}
OMP: pid 4025354 tid 4025455 thread 35 bound to OS proc set {40}
OMP: pid 4025354 tid 4025444 thread 24 bound to OS proc set {27}
OMP: pid 4025354 tid 4025452 thread 32 bound to OS proc set {37}
OMP: pid 4025354 tid 4025460 thread 40 bound to OS proc set {46}
OMP: pid 4025354 tid 4025450 thread 30 bound to OS proc set {34}
OMP: pid 4025354 tid 4025446 thread 26 bound to OS proc set {30}
OMP: pid 4025354 tid 4025456 thread 36 bound to OS proc set {41}
OMP: pid 4025354 tid 4025467 thread 47 bound to OS proc set {54}
OMP: pid 4025354 tid 4025474 thread 54 bound to OS proc set {62}
OMP: pid 4025354 tid 4025459 thread 39 bound to OS proc set {45}
OMP: pid 4025354 tid 4025454 thread 34 bound to OS proc set {39}
OMP: pid 4025354 tid 4025458 thread 38 bound to OS proc set {44}
OMP: pid 4025354 tid 4025453 thread 33 bound to OS proc set {38}
OMP: pid 4025354 tid 4025457 thread 37 bound to OS proc set {42}
OMP: pid 4025354 tid 4025443 thread 23 bound to OS proc set {26}
OMP: pid 4025354 tid 4025445 thread 25 bound to OS proc set {29}
OMP: pid 4025354 tid 4025442 thread 22 bound to OS proc set {25}
OMP: pid 4025354 tid 4025440 thread 20 bound to OS proc set {23}
OMP: pid 4025354 tid 4025462 thread 42 bound to OS proc set {48}
OMP: pid 4025354 tid 4025461 thread 41 bound to OS proc set {47}
OMP: pid 4025354 tid 4025441 thread 21 bound to OS proc set {24}
OMP: pid 4025354 tid 4025439 thread 19 bound to OS proc set {22}
OMP: pid 4025354 tid 4025437 thread 17 bound to OS proc set {19}
OMP: pid 4025354 tid 4025473 thread 53 bound to OS proc set {61}
OMP: pid 4025354 tid 4025436 thread 16 bound to OS proc set {18}
OMP: pid 4025354 tid 4025449 thread 29 bound to OS proc set {33}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 6.900627, "speed_pp": 37.098076, "t_tg": 0.000000, "speed_tg": nan, "t": 6.900627, "speed": 37.098076}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_9
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_9 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_9 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_9 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_9 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_9 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_9 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_9 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_9 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 4037650 tid 4037650 thread 0 bound to OS proc set {0}
OMP: pid 4037650 tid 4037720 thread 3 bound to OS proc set {3}
OMP: pid 4037650 tid 4037719 thread 2 bound to OS proc set {2}
OMP: pid 4037650 tid 4037721 thread 4 bound to OS proc set {4}
OMP: pid 4037650 tid 4037723 thread 6 bound to OS proc set {6}
OMP: pid 4037650 tid 4037780 thread 63 bound to OS proc set {63}
OMP: pid 4037650 tid 4037725 thread 8 bound to OS proc set {8}
OMP: pid 4037650 tid 4037729 thread 12 bound to OS proc set {12}
OMP: pid 4037650 tid 4037752 thread 35 bound to OS proc set {35}
OMP: pid 4037650 tid 4037749 thread 32 bound to OS proc set {32}
OMP: pid 4037650 tid 4037732 thread 15 bound to OS proc set {15}
OMP: pid 4037650 tid 4037718 thread 1 bound to OS proc set {1}
OMP: pid 4037650 tid 4037768 thread 51 bound to OS proc set {51}
OMP: pid 4037650 tid 4037728 thread 11 bound to OS proc set {11}
OMP: pid 4037650 tid 4037764 thread 47 bound to OS proc set {47}
OMP: pid 4037650 tid 4037731 thread 14 bound to OS proc set {14}
OMP: pid 4037650 tid 4037724 thread 7 bound to OS proc set {7}
OMP: pid 4037650 tid 4037776 thread 59 bound to OS proc set {59}
OMP: pid 4037650 tid 4037733 thread 16 bound to OS proc set {16}
OMP: pid 4037650 tid 4037769 thread 52 bound to OS proc set {52}
OMP: pid 4037650 tid 4037751 thread 34 bound to OS proc set {34}
OMP: pid 4037650 tid 4037736 thread 19 bound to OS proc set {19}
OMP: pid 4037650 tid 4037778 thread 61 bound to OS proc set {61}
OMP: pid 4037650 tid 4037773 thread 56 bound to OS proc set {56}
OMP: pid 4037650 tid 4037765 thread 48 bound to OS proc set {48}
OMP: pid 4037650 tid 4037761 thread 44 bound to OS proc set {44}
OMP: pid 4037650 tid 4037730 thread 13 bound to OS proc set {13}
OMP: pid 4037650 tid 4037750 thread 33 bound to OS proc set {33}
OMP: pid 4037650 tid 4037767 thread 50 bound to OS proc set {50}
OMP: pid 4037650 tid 4037777 thread 60 bound to OS proc set {60}
OMP: pid 4037650 tid 4037772 thread 55 bound to OS proc set {55}
OMP: pid 4037650 tid 4037763 thread 46 bound to OS proc set {46}
OMP: pid 4037650 tid 4037779 thread 62 bound to OS proc set {62}
OMP: pid 4037650 tid 4037747 thread 30 bound to OS proc set {30}
OMP: pid 4037650 tid 4037735 thread 18 bound to OS proc set {18}
OMP: pid 4037650 tid 4037744 thread 27 bound to OS proc set {27}
OMP: pid 4037650 tid 4037734 thread 17 bound to OS proc set {17}
OMP: pid 4037650 tid 4037742 thread 25 bound to OS proc set {25}
OMP: pid 4037650 tid 4037775 thread 58 bound to OS proc set {58}
OMP: pid 4037650 tid 4037727 thread 10 bound to OS proc set {10}
OMP: pid 4037650 tid 4037760 thread 43 bound to OS proc set {43}
OMP: pid 4037650 tid 4037757 thread 40 bound to OS proc set {40}
OMP: pid 4037650 tid 4037743 thread 26 bound to OS proc set {26}
OMP: pid 4037650 tid 4037741 thread 24 bound to OS proc set {24}
OMP: pid 4037650 tid 4037726 thread 9 bound to OS proc set {9}
OMP: pid 4037650 tid 4037748 thread 31 bound to OS proc set {31}
OMP: pid 4037650 tid 4037737 thread 20 bound to OS proc set {20}
OMP: pid 4037650 tid 4037770 thread 53 bound to OS proc set {53}
OMP: pid 4037650 tid 4037759 thread 42 bound to OS proc set {42}
OMP: pid 4037650 tid 4037754 thread 37 bound to OS proc set {37}
OMP: pid 4037650 tid 4037745 thread 28 bound to OS proc set {28}
OMP: pid 4037650 tid 4037766 thread 49 bound to OS proc set {49}
OMP: pid 4037650 tid 4037746 thread 29 bound to OS proc set {29}
OMP: pid 4037650 tid 4037755 thread 38 bound to OS proc set {38}
OMP: pid 4037650 tid 4037762 thread 45 bound to OS proc set {45}
OMP: pid 4037650 tid 4037758 thread 41 bound to OS proc set {41}
OMP: pid 4037650 tid 4037739 thread 22 bound to OS proc set {22}
OMP: pid 4037650 tid 4037740 thread 23 bound to OS proc set {23}
OMP: pid 4037650 tid 4037774 thread 57 bound to OS proc set {57}
OMP: pid 4037650 tid 4037738 thread 21 bound to OS proc set {21}
OMP: pid 4037650 tid 4037756 thread 39 bound to OS proc set {39}
OMP: pid 4037650 tid 4037753 thread 36 bound to OS proc set {36}
OMP: pid 4037650 tid 4037771 thread 54 bound to OS proc set {54}
OMP: pid 4037650 tid 4037722 thread 5 bound to OS proc set {5}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 2, "n_kv": 256, "t_pp": 9.715161, "speed_pp": 26.350567, "t_tg": 0.000000, "speed_tg": nan, "t": 9.715161, "speed": 26.350567}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_10
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_10 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_10 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_10 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_10 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_10 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_10 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_10 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-409-4840/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-20-50/tools/lprof_npsu_run_10 #
########################################################################################################################################################################################################################################