* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 4116789 tid 4116789 thread 0 bound to OS proc set {0}
OMP: pid 4116789 tid 4116856 thread 1 bound to OS proc set {16}
OMP: pid 4116789 tid 4116857 thread 2 bound to OS proc set {32}
OMP: pid 4116789 tid 4116858 thread 3 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 49.116272, "speed_pp": 20.848488, "t_tg": 0.000000, "speed_tg": nan, "t": 49.116272, "speed": 20.848488}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_2
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_2 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 4118255 tid 4118255 thread 0 bound to OS proc set {0}
OMP: pid 4118255 tid 4118324 thread 3 bound to OS proc set {24}
OMP: pid 4118255 tid 4118322 thread 1 bound to OS proc set {8}
OMP: pid 4118255 tid 4118325 thread 4 bound to OS proc set {32}
OMP: pid 4118255 tid 4118323 thread 2 bound to OS proc set {16}
OMP: pid 4118255 tid 4118327 thread 6 bound to OS proc set {48}
OMP: pid 4118255 tid 4118326 thread 5 bound to OS proc set {40}
OMP: pid 4118255 tid 4118328 thread 7 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 29.629011, "speed_pp": 34.560722, "t_tg": 0.000000, "speed_tg": nan, "t": 29.629011, "speed": 34.560722}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_3
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_3 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 4121444 tid 4121444 thread 0 bound to OS proc set {0}
OMP: pid 4121444 tid 4121512 thread 2 bound to OS proc set {8}
OMP: pid 4121444 tid 4121513 thread 3 bound to OS proc set {12}
OMP: pid 4121444 tid 4121511 thread 1 bound to OS proc set {4}
OMP: pid 4121444 tid 4121518 thread 8 bound to OS proc set {32}
OMP: pid 4121444 tid 4121524 thread 14 bound to OS proc set {56}
OMP: pid 4121444 tid 4121522 thread 12 bound to OS proc set {48}
OMP: pid 4121444 tid 4121515 thread 5 bound to OS proc set {20}
OMP: pid 4121444 tid 4121521 thread 11 bound to OS proc set {44}
OMP: pid 4121444 tid 4121520 thread 10 bound to OS proc set {40}
OMP: pid 4121444 tid 4121523 thread 13 bound to OS proc set {52}
OMP: pid 4121444 tid 4121517 thread 7 bound to OS proc set {28}
OMP: pid 4121444 tid 4121519 thread 9 bound to OS proc set {36}
OMP: pid 4121444 tid 4121516 thread 6 bound to OS proc set {24}
OMP: pid 4121444 tid 4121514 thread 4 bound to OS proc set {16}
OMP: pid 4121444 tid 4121525 thread 15 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 20.875095, "speed_pp": 49.053669, "t_tg": 0.000000, "speed_tg": nan, "t": 20.875095, "speed": 49.053669}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_4
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_4 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 4128225 tid 4128225 thread 0 bound to OS proc set {0}
OMP: pid 4128225 tid 4128292 thread 1 bound to OS proc set {2}
OMP: pid 4128225 tid 4128294 thread 3 bound to OS proc set {8}
OMP: pid 4128225 tid 4128299 thread 8 bound to OS proc set {21}
OMP: pid 4128225 tid 4128295 thread 4 bound to OS proc set {10}
OMP: pid 4128225 tid 4128300 thread 9 bound to OS proc set {24}
OMP: pid 4128225 tid 4128307 thread 16 bound to OS proc set {43}
OMP: pid 4128225 tid 4128310 thread 19 bound to OS proc set {51}
OMP: pid 4128225 tid 4128306 thread 15 bound to OS proc set {40}
OMP: pid 4128225 tid 4128303 thread 12 bound to OS proc set {32}
OMP: pid 4128225 tid 4128305 thread 14 bound to OS proc set {37}
OMP: pid 4128225 tid 4128309 thread 18 bound to OS proc set {48}
OMP: pid 4128225 tid 4128302 thread 11 bound to OS proc set {29}
OMP: pid 4128225 tid 4128298 thread 7 bound to OS proc set {18}
OMP: pid 4128225 tid 4128311 thread 20 bound to OS proc set {54}
OMP: pid 4128225 tid 4128308 thread 17 bound to OS proc set {46}
OMP: pid 4128225 tid 4128293 thread 2 bound to OS proc set {5}
OMP: pid 4128225 tid 4128304 thread 13 bound to OS proc set {35}
OMP: pid 4128225 tid 4128313 thread 22 bound to OS proc set {59}
OMP: pid 4128225 tid 4128296 thread 5 bound to OS proc set {13}
OMP: pid 4128225 tid 4128297 thread 6 bound to OS proc set {16}
OMP: pid 4128225 tid 4128312 thread 21 bound to OS proc set {56}
OMP: pid 4128225 tid 4128301 thread 10 bound to OS proc set {27}
OMP: pid 4128225 tid 4128314 thread 23 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 17.765827, "speed_pp": 57.638744, "t_tg": 0.000000, "speed_tg": nan, "t": 17.765827, "speed": 57.638744}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_5
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_5 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 4138501 tid 4138501 thread 0 bound to OS proc set {0}
OMP: pid 4138501 tid 4138579 thread 12 bound to OS proc set {24}
OMP: pid 4138501 tid 4138581 thread 14 bound to OS proc set {28}
OMP: pid 4138501 tid 4138569 thread 2 bound to OS proc set {4}
OMP: pid 4138501 tid 4138568 thread 1 bound to OS proc set {2}
OMP: pid 4138501 tid 4138578 thread 11 bound to OS proc set {22}
OMP: pid 4138501 tid 4138570 thread 3 bound to OS proc set {6}
OMP: pid 4138501 tid 4138573 thread 6 bound to OS proc set {12}
OMP: pid 4138501 tid 4138577 thread 10 bound to OS proc set {20}
OMP: pid 4138501 tid 4138574 thread 7 bound to OS proc set {14}
OMP: pid 4138501 tid 4138571 thread 4 bound to OS proc set {8}
OMP: pid 4138501 tid 4138583 thread 16 bound to OS proc set {32}
OMP: pid 4138501 tid 4138576 thread 9 bound to OS proc set {18}
OMP: pid 4138501 tid 4138595 thread 28 bound to OS proc set {56}
OMP: pid 4138501 tid 4138572 thread 5 bound to OS proc set {10}
OMP: pid 4138501 tid 4138586 thread 19 bound to OS proc set {38}
OMP: pid 4138501 tid 4138580 thread 13 bound to OS proc set {26}
OMP: pid 4138501 tid 4138597 thread 30 bound to OS proc set {60}
OMP: pid 4138501 tid 4138582 thread 15 bound to OS proc set {30}
OMP: pid 4138501 tid 4138585 thread 18 bound to OS proc set {36}
OMP: pid 4138501 tid 4138591 thread 24 bound to OS proc set {48}
OMP: pid 4138501 tid 4138575 thread 8 bound to OS proc set {16}
OMP: pid 4138501 tid 4138596 thread 29 bound to OS proc set {58}
OMP: pid 4138501 tid 4138594 thread 27 bound to OS proc set {54}
OMP: pid 4138501 tid 4138593 thread 26 bound to OS proc set {52}
OMP: pid 4138501 tid 4138584 thread 17 bound to OS proc set {34}
OMP: pid 4138501 tid 4138587 thread 20 bound to OS proc set {40}
OMP: pid 4138501 tid 4138590 thread 23 bound to OS proc set {46}
OMP: pid 4138501 tid 4138589 thread 22 bound to OS proc set {44}
OMP: pid 4138501 tid 4138592 thread 25 bound to OS proc set {50}
OMP: pid 4138501 tid 4138598 thread 31 bound to OS proc set {62}
OMP: pid 4138501 tid 4138588 thread 21 bound to OS proc set {42}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 16.519091, "speed_pp": 61.988884, "t_tg": 0.000000, "speed_tg": nan, "t": 16.519091, "speed": 61.988884}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_6
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_6 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 4152320 tid 4152320 thread 0 bound to OS proc set {0}
OMP: pid 4152320 tid 4152387 thread 1 bound to OS proc set {1}
OMP: pid 4152320 tid 4152388 thread 2 bound to OS proc set {3}
OMP: pid 4152320 tid 4152393 thread 7 bound to OS proc set {11}
OMP: pid 4152320 tid 4152400 thread 14 bound to OS proc set {22}
OMP: pid 4152320 tid 4152389 thread 3 bound to OS proc set {4}
OMP: pid 4152320 tid 4152390 thread 4 bound to OS proc set {6}
OMP: pid 4152320 tid 4152394 thread 8 bound to OS proc set {13}
OMP: pid 4152320 tid 4152395 thread 9 bound to OS proc set {14}
OMP: pid 4152320 tid 4152392 thread 6 bound to OS proc set {9}
OMP: pid 4152320 tid 4152418 thread 32 bound to OS proc set {52}
OMP: pid 4152320 tid 4152398 thread 12 bound to OS proc set {19}
OMP: pid 4152320 tid 4152410 thread 24 bound to OS proc set {39}
OMP: pid 4152320 tid 4152404 thread 18 bound to OS proc set {29}
OMP: pid 4152320 tid 4152396 thread 10 bound to OS proc set {16}
OMP: pid 4152320 tid 4152391 thread 5 bound to OS proc set {8}
OMP: pid 4152320 tid 4152420 thread 34 bound to OS proc set {55}
OMP: pid 4152320 tid 4152419 thread 33 bound to OS proc set {53}
OMP: pid 4152320 tid 4152397 thread 11 bound to OS proc set {17}
OMP: pid 4152320 tid 4152414 thread 28 bound to OS proc set {45}
OMP: pid 4152320 tid 4152417 thread 31 bound to OS proc set {50}
OMP: pid 4152320 tid 4152424 thread 38 bound to OS proc set {61}
OMP: pid 4152320 tid 4152403 thread 17 bound to OS proc set {27}
OMP: pid 4152320 tid 4152401 thread 15 bound to OS proc set {24}
OMP: pid 4152320 tid 4152399 thread 13 bound to OS proc set {21}
OMP: pid 4152320 tid 4152416 thread 30 bound to OS proc set {48}
OMP: pid 4152320 tid 4152413 thread 27 bound to OS proc set {43}
OMP: pid 4152320 tid 4152405 thread 19 bound to OS proc set {30}
OMP: pid 4152320 tid 4152406 thread 20 bound to OS proc set {32}
OMP: pid 4152320 tid 4152409 thread 23 bound to OS proc set {37}
OMP: pid 4152320 tid 4152422 thread 36 bound to OS proc set {58}
OMP: pid 4152320 tid 4152408 thread 22 bound to OS proc set {35}
OMP: pid 4152320 tid 4152421 thread 35 bound to OS proc set {56}
OMP: pid 4152320 tid 4152415 thread 29 bound to OS proc set {47}
OMP: pid 4152320 tid 4152423 thread 37 bound to OS proc set {60}
OMP: pid 4152320 tid 4152407 thread 21 bound to OS proc set {34}
OMP: pid 4152320 tid 4152402 thread 16 bound to OS proc set {26}
OMP: pid 4152320 tid 4152411 thread 25 bound to OS proc set {40}
OMP: pid 4152320 tid 4152425 thread 39 bound to OS proc set {63}
OMP: pid 4152320 tid 4152412 thread 26 bound to OS proc set {42}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 16.415743, "speed_pp": 62.379147, "t_tg": 0.000000, "speed_tg": nan, "t": 16.415743, "speed": 62.379147}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_7
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_7 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 4169732 tid 4169732 thread 0 bound to OS proc set {0}
OMP: pid 4169732 tid 4169800 thread 2 bound to OS proc set {2}
OMP: pid 4169732 tid 4169799 thread 1 bound to OS proc set {1}
OMP: pid 4169732 tid 4169809 thread 11 bound to OS proc set {14}
OMP: pid 4169732 tid 4169801 thread 3 bound to OS proc set {4}
OMP: pid 4169732 tid 4169808 thread 10 bound to OS proc set {13}
OMP: pid 4169732 tid 4169806 thread 8 bound to OS proc set {10}
OMP: pid 4169732 tid 4169813 thread 15 bound to OS proc set {20}
OMP: pid 4169732 tid 4169804 thread 6 bound to OS proc set {8}
OMP: pid 4169732 tid 4169805 thread 7 bound to OS proc set {9}
OMP: pid 4169732 tid 4169812 thread 14 bound to OS proc set {18}
OMP: pid 4169732 tid 4169810 thread 12 bound to OS proc set {16}
OMP: pid 4169732 tid 4169811 thread 13 bound to OS proc set {17}
OMP: pid 4169732 tid 4169807 thread 9 bound to OS proc set {12}
OMP: pid 4169732 tid 4169825 thread 27 bound to OS proc set {36}
OMP: pid 4169732 tid 4169832 thread 34 bound to OS proc set {46}
OMP: pid 4169732 tid 4169817 thread 19 bound to OS proc set {25}
OMP: pid 4169732 tid 4169803 thread 5 bound to OS proc set {6}
OMP: pid 4169732 tid 4169802 thread 4 bound to OS proc set {5}
OMP: pid 4169732 tid 4169816 thread 18 bound to OS proc set {24}
OMP: pid 4169732 tid 4169814 thread 16 bound to OS proc set {21}
OMP: pid 4169732 tid 4169845 thread 47 bound to OS proc set {63}
OMP: pid 4169732 tid 4169818 thread 20 bound to OS proc set {27}
OMP: pid 4169732 tid 4169828 thread 30 bound to OS proc set {40}
OMP: pid 4169732 tid 4169824 thread 26 bound to OS proc set {35}
OMP: pid 4169732 tid 4169829 thread 31 bound to OS proc set {41}
OMP: pid 4169732 tid 4169830 thread 32 bound to OS proc set {43}
OMP: pid 4169732 tid 4169826 thread 28 bound to OS proc set {37}
OMP: pid 4169732 tid 4169822 thread 24 bound to OS proc set {32}
OMP: pid 4169732 tid 4169821 thread 23 bound to OS proc set {31}
OMP: pid 4169732 tid 4169844 thread 46 bound to OS proc set {62}
OMP: pid 4169732 tid 4169834 thread 36 bound to OS proc set {48}
OMP: pid 4169732 tid 4169815 thread 17 bound to OS proc set {23}
OMP: pid 4169732 tid 4169823 thread 25 bound to OS proc set {33}
OMP: pid 4169732 tid 4169819 thread 21 bound to OS proc set {28}
OMP: pid 4169732 tid 4169820 thread 22 bound to OS proc set {29}
OMP: pid 4169732 tid 4169827 thread 29 bound to OS proc set {39}
OMP: pid 4169732 tid 4169833 thread 35 bound to OS proc set {47}
OMP: pid 4169732 tid 4169841 thread 43 bound to OS proc set {58}
OMP: pid 4169732 tid 4169831 thread 33 bound to OS proc set {44}
OMP: pid 4169732 tid 4169842 thread 44 bound to OS proc set {59}
OMP: pid 4169732 tid 4169837 thread 39 bound to OS proc set {52}
OMP: pid 4169732 tid 4169838 thread 40 bound to OS proc set {54}
OMP: pid 4169732 tid 4169835 thread 37 bound to OS proc set {50}
OMP: pid 4169732 tid 4169840 thread 42 bound to OS proc set {56}
OMP: pid 4169732 tid 4169839 thread 41 bound to OS proc set {55}
OMP: pid 4169732 tid 4169843 thread 45 bound to OS proc set {60}
OMP: pid 4169732 tid 4169836 thread 38 bound to OS proc set {51}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 16.481930, "speed_pp": 62.128647, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 16.481932, "speed": 62.128639}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_8
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_8 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_8 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_8 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_8 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_8 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_8 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_8 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_8 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 4190639 tid 4190639 thread 0 bound to OS proc set {0}
OMP: pid 4190639 tid 4190707 thread 2 bound to OS proc set {2}
OMP: pid 4190639 tid 4190708 thread 3 bound to OS proc set {3}
OMP: pid 4190639 tid 4190706 thread 1 bound to OS proc set {1}
OMP: pid 4190639 tid 4190709 thread 4 bound to OS proc set {4}
OMP: pid 4190639 tid 4190724 thread 19 bound to OS proc set {22}
OMP: pid 4190639 tid 4190711 thread 6 bound to OS proc set {6}
OMP: pid 4190639 tid 4190729 thread 24 bound to OS proc set {27}
OMP: pid 4190639 tid 4190756 thread 51 bound to OS proc set {59}
OMP: pid 4190639 tid 4190753 thread 48 bound to OS proc set {55}
OMP: pid 4190639 tid 4190710 thread 5 bound to OS proc set {5}
OMP: pid 4190639 tid 4190725 thread 20 bound to OS proc set {23}
OMP: pid 4190639 tid 4190712 thread 7 bound to OS proc set {8}
OMP: pid 4190639 tid 4190716 thread 11 bound to OS proc set {12}
OMP: pid 4190639 tid 4190720 thread 15 bound to OS proc set {17}
OMP: pid 4190639 tid 4190754 thread 49 bound to OS proc set {56}
OMP: pid 4190639 tid 4190719 thread 14 bound to OS proc set {16}
OMP: pid 4190639 tid 4190728 thread 23 bound to OS proc set {26}
OMP: pid 4190639 tid 4190760 thread 55 bound to OS proc set {63}
OMP: pid 4190639 tid 4190717 thread 12 bound to OS proc set {13}
OMP: pid 4190639 tid 4190727 thread 22 bound to OS proc set {25}
OMP: pid 4190639 tid 4190718 thread 13 bound to OS proc set {15}
OMP: pid 4190639 tid 4190715 thread 10 bound to OS proc set {11}
OMP: pid 4190639 tid 4190726 thread 21 bound to OS proc set {24}
OMP: pid 4190639 tid 4190732 thread 27 bound to OS proc set {31}
OMP: pid 4190639 tid 4190755 thread 50 bound to OS proc set {58}
OMP: pid 4190639 tid 4190735 thread 30 bound to OS proc set {34}
OMP: pid 4190639 tid 4190714 thread 9 bound to OS proc set {10}
OMP: pid 4190639 tid 4190721 thread 16 bound to OS proc set {18}
OMP: pid 4190639 tid 4190749 thread 44 bound to OS proc set {51}
OMP: pid 4190639 tid 4190731 thread 26 bound to OS proc set {30}
OMP: pid 4190639 tid 4190713 thread 8 bound to OS proc set {9}
OMP: pid 4190639 tid 4190752 thread 47 bound to OS proc set {54}
OMP: pid 4190639 tid 4190730 thread 25 bound to OS proc set {29}
OMP: pid 4190639 tid 4190757 thread 52 bound to OS proc set {60}
OMP: pid 4190639 tid 4190745 thread 40 bound to OS proc set {46}
OMP: pid 4190639 tid 4190723 thread 18 bound to OS proc set {20}
OMP: pid 4190639 tid 4190733 thread 28 bound to OS proc set {32}
OMP: pid 4190639 tid 4190736 thread 31 bound to OS proc set {35}
OMP: pid 4190639 tid 4190737 thread 32 bound to OS proc set {37}
OMP: pid 4190639 tid 4190759 thread 54 bound to OS proc set {62}
OMP: pid 4190639 tid 4190734 thread 29 bound to OS proc set {33}
OMP: pid 4190639 tid 4190748 thread 43 bound to OS proc set {49}
OMP: pid 4190639 tid 4190739 thread 34 bound to OS proc set {39}
OMP: pid 4190639 tid 4190747 thread 42 bound to OS proc set {48}
OMP: pid 4190639 tid 4190746 thread 41 bound to OS proc set {47}
OMP: pid 4190639 tid 4190744 thread 39 bound to OS proc set {45}
OMP: pid 4190639 tid 4190722 thread 17 bound to OS proc set {19}
OMP: pid 4190639 tid 4190750 thread 45 bound to OS proc set {52}
OMP: pid 4190639 tid 4190738 thread 33 bound to OS proc set {38}
OMP: pid 4190639 tid 4190758 thread 53 bound to OS proc set {61}
OMP: pid 4190639 tid 4190742 thread 37 bound to OS proc set {42}
OMP: pid 4190639 tid 4190740 thread 35 bound to OS proc set {40}
OMP: pid 4190639 tid 4190741 thread 36 bound to OS proc set {41}
OMP: pid 4190639 tid 4190743 thread 38 bound to OS proc set {44}
OMP: pid 4190639 tid 4190751 thread 46 bound to OS proc set {53}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 16.565348, "speed_pp": 61.815788, "t_tg": 0.000000, "speed_tg": nan, "t": 16.565348, "speed": 61.815788}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_9
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_9 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_9 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_9 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_9 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_9 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_9 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_9 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_9 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 21368 tid 21368 thread 0 bound to OS proc set {0}
OMP: pid 21368 tid 21449 thread 15 bound to OS proc set {15}
OMP: pid 21368 tid 21437 thread 3 bound to OS proc set {3}
OMP: pid 21368 tid 21446 thread 12 bound to OS proc set {12}
OMP: pid 21368 tid 21436 thread 2 bound to OS proc set {2}
OMP: pid 21368 tid 21442 thread 8 bound to OS proc set {8}
OMP: pid 21368 tid 21445 thread 11 bound to OS proc set {11}
OMP: pid 21368 tid 21466 thread 32 bound to OS proc set {32}
OMP: pid 21368 tid 21448 thread 14 bound to OS proc set {14}
OMP: pid 21368 tid 21469 thread 35 bound to OS proc set {35}
OMP: pid 21368 tid 21441 thread 7 bound to OS proc set {7}
OMP: pid 21368 tid 21444 thread 10 bound to OS proc set {10}
OMP: pid 21368 tid 21447 thread 13 bound to OS proc set {13}
OMP: pid 21368 tid 21477 thread 43 bound to OS proc set {43}
OMP: pid 21368 tid 21438 thread 4 bound to OS proc set {4}
OMP: pid 21368 tid 21468 thread 34 bound to OS proc set {34}
OMP: pid 21368 tid 21453 thread 19 bound to OS proc set {19}
OMP: pid 21368 tid 21440 thread 6 bound to OS proc set {6}
OMP: pid 21368 tid 21465 thread 31 bound to OS proc set {31}
OMP: pid 21368 tid 21443 thread 9 bound to OS proc set {9}
OMP: pid 21368 tid 21461 thread 27 bound to OS proc set {27}
OMP: pid 21368 tid 21452 thread 18 bound to OS proc set {18}
OMP: pid 21368 tid 21476 thread 42 bound to OS proc set {42}
OMP: pid 21368 tid 21450 thread 16 bound to OS proc set {16}
OMP: pid 21368 tid 21467 thread 33 bound to OS proc set {33}
OMP: pid 21368 tid 21458 thread 24 bound to OS proc set {24}
OMP: pid 21368 tid 21474 thread 40 bound to OS proc set {40}
OMP: pid 21368 tid 21462 thread 28 bound to OS proc set {28}
OMP: pid 21368 tid 21464 thread 30 bound to OS proc set {30}
OMP: pid 21368 tid 21473 thread 39 bound to OS proc set {39}
OMP: pid 21368 tid 21460 thread 26 bound to OS proc set {26}
OMP: pid 21368 tid 21463 thread 29 bound to OS proc set {29}
OMP: pid 21368 tid 21457 thread 23 bound to OS proc set {23}
OMP: pid 21368 tid 21472 thread 38 bound to OS proc set {38}
OMP: pid 21368 tid 21454 thread 20 bound to OS proc set {20}
OMP: pid 21368 tid 21459 thread 25 bound to OS proc set {25}
OMP: pid 21368 tid 21475 thread 41 bound to OS proc set {41}
OMP: pid 21368 tid 21470 thread 36 bound to OS proc set {36}
OMP: pid 21368 tid 21497 thread 63 bound to OS proc set {63}
OMP: pid 21368 tid 21456 thread 22 bound to OS proc set {22}
OMP: pid 21368 tid 21482 thread 48 bound to OS proc set {48}
OMP: pid 21368 tid 21481 thread 47 bound to OS proc set {47}
OMP: pid 21368 tid 21485 thread 51 bound to OS proc set {51}
OMP: pid 21368 tid 21496 thread 62 bound to OS proc set {62}
OMP: pid 21368 tid 21455 thread 21 bound to OS proc set {21}
OMP: pid 21368 tid 21495 thread 61 bound to OS proc set {61}
OMP: pid 21368 tid 21494 thread 60 bound to OS proc set {60}
OMP: pid 21368 tid 21478 thread 44 bound to OS proc set {44}
OMP: pid 21368 tid 21471 thread 37 bound to OS proc set {37}
OMP: pid 21368 tid 21490 thread 56 bound to OS proc set {56}
OMP: pid 21368 tid 21484 thread 50 bound to OS proc set {50}
OMP: pid 21368 tid 21483 thread 49 bound to OS proc set {49}
OMP: pid 21368 tid 21435 thread 1 bound to OS proc set {1}
OMP: pid 21368 tid 21493 thread 59 bound to OS proc set {59}
OMP: pid 21368 tid 21480 thread 46 bound to OS proc set {46}
OMP: pid 21368 tid 21439 thread 5 bound to OS proc set {5}
OMP: pid 21368 tid 21479 thread 45 bound to OS proc set {45}
OMP: pid 21368 tid 21492 thread 58 bound to OS proc set {58}
OMP: pid 21368 tid 21491 thread 57 bound to OS proc set {57}
OMP: pid 21368 tid 21489 thread 55 bound to OS proc set {55}
OMP: pid 21368 tid 21486 thread 52 bound to OS proc set {52}
OMP: pid 21368 tid 21487 thread 53 bound to OS proc set {53}
OMP: pid 21368 tid 21488 thread 54 bound to OS proc set {54}
OMP: pid 21368 tid 21451 thread 17 bound to OS proc set {17}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 8, "n_kv": 1024, "t_pp": 22.616167, "speed_pp": 45.277344, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 22.616169, "speed": 45.277340}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_10
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_10 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_10 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_10 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_10 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_10 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_10 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_10 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-414-8092/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-26_15-33-01/tools/lprof_npsu_run_10 #
########################################################################################################################################################################################################################################