* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 282511 tid 282511 thread 0 bound to OS proc set {0}
OMP: pid 282511 tid 282579 thread 2 bound to OS proc set {32}
OMP: pid 282511 tid 282578 thread 1 bound to OS proc set {16}
OMP: pid 282511 tid 282580 thread 3 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 0, "tg": 128, "pl": 2, "n_kv": 256, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 13.731870, "speed_tg": 18.642763, "t": 13.731870, "speed": 18.642763}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_2
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_2 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 282600 tid 282600 thread 0 bound to OS proc set {0}
OMP: pid 282600 tid 282669 thread 3 bound to OS proc set {24}
OMP: pid 282600 tid 282668 thread 2 bound to OS proc set {16}
OMP: pid 282600 tid 282670 thread 4 bound to OS proc set {32}
OMP: pid 282600 tid 282672 thread 6 bound to OS proc set {48}
OMP: pid 282600 tid 282671 thread 5 bound to OS proc set {40}
OMP: pid 282600 tid 282667 thread 1 bound to OS proc set {8}
OMP: pid 282600 tid 282673 thread 7 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 0, "tg": 128, "pl": 2, "n_kv": 256, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 7.404547, "speed_tg": 34.573349, "t": 7.404547, "speed": 34.573349}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_3
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_3 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 282695 tid 282695 thread 0 bound to OS proc set {0}
OMP: pid 282695 tid 282765 thread 4 bound to OS proc set {16}
OMP: pid 282695 tid 282773 thread 12 bound to OS proc set {48}
OMP: pid 282695 tid 282762 thread 1 bound to OS proc set {4}
OMP: pid 282695 tid 282764 thread 3 bound to OS proc set {12}
OMP: pid 282695 tid 282775 thread 14 bound to OS proc set {56}
OMP: pid 282695 tid 282763 thread 2 bound to OS proc set {8}
OMP: pid 282695 tid 282772 thread 11 bound to OS proc set {44}
OMP: pid 282695 tid 282774 thread 13 bound to OS proc set {52}
OMP: pid 282695 tid 282769 thread 8 bound to OS proc set {32}
OMP: pid 282695 tid 282768 thread 7 bound to OS proc set {28}
OMP: pid 282695 tid 282767 thread 6 bound to OS proc set {24}
OMP: pid 282695 tid 282771 thread 10 bound to OS proc set {40}
OMP: pid 282695 tid 282766 thread 5 bound to OS proc set {20}
OMP: pid 282695 tid 282770 thread 9 bound to OS proc set {36}
OMP: pid 282695 tid 282776 thread 15 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 0, "tg": 128, "pl": 2, "n_kv": 256, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 4.398764, "speed_tg": 58.198166, "t": 4.398764, "speed": 58.198166}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_4
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_4 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 282846 tid 282846 thread 0 bound to OS proc set {0}
OMP: pid 282846 tid 282913 thread 1 bound to OS proc set {2}
OMP: pid 282846 tid 282915 thread 3 bound to OS proc set {8}
OMP: pid 282846 tid 282921 thread 9 bound to OS proc set {24}
OMP: pid 282846 tid 282928 thread 16 bound to OS proc set {43}
OMP: pid 282846 tid 282918 thread 6 bound to OS proc set {16}
OMP: pid 282846 tid 282920 thread 8 bound to OS proc set {21}
OMP: pid 282846 tid 282923 thread 11 bound to OS proc set {29}
OMP: pid 282846 tid 282924 thread 12 bound to OS proc set {32}
OMP: pid 282846 tid 282931 thread 19 bound to OS proc set {51}
OMP: pid 282846 tid 282927 thread 15 bound to OS proc set {40}
OMP: pid 282846 tid 282916 thread 4 bound to OS proc set {10}
OMP: pid 282846 tid 282922 thread 10 bound to OS proc set {27}
OMP: pid 282846 tid 282914 thread 2 bound to OS proc set {5}
OMP: pid 282846 tid 282917 thread 5 bound to OS proc set {13}
OMP: pid 282846 tid 282919 thread 7 bound to OS proc set {18}
OMP: pid 282846 tid 282926 thread 14 bound to OS proc set {37}
OMP: pid 282846 tid 282925 thread 13 bound to OS proc set {35}
OMP: pid 282846 tid 282930 thread 18 bound to OS proc set {48}
OMP: pid 282846 tid 282932 thread 20 bound to OS proc set {54}
OMP: pid 282846 tid 282934 thread 22 bound to OS proc set {59}
OMP: pid 282846 tid 282929 thread 17 bound to OS proc set {46}
OMP: pid 282846 tid 282933 thread 21 bound to OS proc set {56}
OMP: pid 282846 tid 282935 thread 23 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 0, "tg": 128, "pl": 2, "n_kv": 256, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.833351, "speed_tg": 66.782303, "t": 3.833351, "speed": 66.782303}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_5
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_5 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 282955 tid 282955 thread 0 bound to OS proc set {0}
OMP: pid 282955 tid 283033 thread 12 bound to OS proc set {24}
OMP: pid 282955 tid 283022 thread 1 bound to OS proc set {2}
OMP: pid 282955 tid 283029 thread 8 bound to OS proc set {16}
OMP: pid 282955 tid 283028 thread 7 bound to OS proc set {14}
OMP: pid 282955 tid 283031 thread 10 bound to OS proc set {20}
OMP: pid 282955 tid 283023 thread 2 bound to OS proc set {4}
OMP: pid 282955 tid 283036 thread 15 bound to OS proc set {30}
OMP: pid 282955 tid 283025 thread 4 bound to OS proc set {8}
OMP: pid 282955 tid 283026 thread 5 bound to OS proc set {10}
OMP: pid 282955 tid 283024 thread 3 bound to OS proc set {6}
OMP: pid 282955 tid 283030 thread 9 bound to OS proc set {18}
OMP: pid 282955 tid 283049 thread 28 bound to OS proc set {56}
OMP: pid 282955 tid 283037 thread 16 bound to OS proc set {32}
OMP: pid 282955 tid 283032 thread 11 bound to OS proc set {22}
OMP: pid 282955 tid 283048 thread 27 bound to OS proc set {54}
OMP: pid 282955 tid 283040 thread 19 bound to OS proc set {38}
OMP: pid 282955 tid 283050 thread 29 bound to OS proc set {58}
OMP: pid 282955 tid 283051 thread 30 bound to OS proc set {60}
OMP: pid 282955 tid 283035 thread 14 bound to OS proc set {28}
OMP: pid 282955 tid 283039 thread 18 bound to OS proc set {36}
OMP: pid 282955 tid 283034 thread 13 bound to OS proc set {26}
OMP: pid 282955 tid 283038 thread 17 bound to OS proc set {34}
OMP: pid 282955 tid 283047 thread 26 bound to OS proc set {52}
OMP: pid 282955 tid 283041 thread 20 bound to OS proc set {40}
OMP: pid 282955 tid 283044 thread 23 bound to OS proc set {46}
OMP: pid 282955 tid 283045 thread 24 bound to OS proc set {48}
OMP: pid 282955 tid 283046 thread 25 bound to OS proc set {50}
OMP: pid 282955 tid 283043 thread 22 bound to OS proc set {44}
OMP: pid 282955 tid 283042 thread 21 bound to OS proc set {42}
OMP: pid 282955 tid 283052 thread 31 bound to OS proc set {62}
OMP: pid 282955 tid 283027 thread 6 bound to OS proc set {12}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 0, "tg": 128, "pl": 2, "n_kv": 256, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.650795, "speed_tg": 70.121712, "t": 3.650795, "speed": 70.121712}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_6
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_6 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 283072 tid 283072 thread 0 bound to OS proc set {0}
OMP: pid 283072 tid 283139 thread 1 bound to OS proc set {1}
OMP: pid 283072 tid 283146 thread 8 bound to OS proc set {13}
OMP: pid 283072 tid 283140 thread 2 bound to OS proc set {3}
OMP: pid 283072 tid 283147 thread 9 bound to OS proc set {14}
OMP: pid 283072 tid 283145 thread 7 bound to OS proc set {11}
OMP: pid 283072 tid 283170 thread 32 bound to OS proc set {52}
OMP: pid 283072 tid 283152 thread 14 bound to OS proc set {22}
OMP: pid 283072 tid 283141 thread 3 bound to OS proc set {4}
OMP: pid 283072 tid 283148 thread 10 bound to OS proc set {16}
OMP: pid 283072 tid 283142 thread 4 bound to OS proc set {6}
OMP: pid 283072 tid 283144 thread 6 bound to OS proc set {9}
OMP: pid 283072 tid 283149 thread 11 bound to OS proc set {17}
OMP: pid 283072 tid 283143 thread 5 bound to OS proc set {8}
OMP: pid 283072 tid 283173 thread 35 bound to OS proc set {56}
OMP: pid 283072 tid 283162 thread 24 bound to OS proc set {39}
OMP: pid 283072 tid 283174 thread 36 bound to OS proc set {58}
OMP: pid 283072 tid 283177 thread 39 bound to OS proc set {63}
OMP: pid 283072 tid 283150 thread 12 bound to OS proc set {19}
OMP: pid 283072 tid 283171 thread 33 bound to OS proc set {53}
OMP: pid 283072 tid 283156 thread 18 bound to OS proc set {29}
OMP: pid 283072 tid 283166 thread 28 bound to OS proc set {45}
OMP: pid 283072 tid 283172 thread 34 bound to OS proc set {55}
OMP: pid 283072 tid 283176 thread 38 bound to OS proc set {61}
OMP: pid 283072 tid 283153 thread 15 bound to OS proc set {24}
OMP: pid 283072 tid 283151 thread 13 bound to OS proc set {21}
OMP: pid 283072 tid 283163 thread 25 bound to OS proc set {40}
OMP: pid 283072 tid 283158 thread 20 bound to OS proc set {32}
OMP: pid 283072 tid 283175 thread 37 bound to OS proc set {60}
OMP: pid 283072 tid 283157 thread 19 bound to OS proc set {30}
OMP: pid 283072 tid 283160 thread 22 bound to OS proc set {35}
OMP: pid 283072 tid 283161 thread 23 bound to OS proc set {37}
OMP: pid 283072 tid 283155 thread 17 bound to OS proc set {27}
OMP: pid 283072 tid 283168 thread 30 bound to OS proc set {48}
OMP: pid 283072 tid 283154 thread 16 bound to OS proc set {26}
OMP: pid 283072 tid 283165 thread 27 bound to OS proc set {43}
OMP: pid 283072 tid 283169 thread 31 bound to OS proc set {50}
OMP: pid 283072 tid 283167 thread 29 bound to OS proc set {47}
OMP: pid 283072 tid 283164 thread 26 bound to OS proc set {42}
OMP: pid 283072 tid 283159 thread 21 bound to OS proc set {34}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 0, "tg": 128, "pl": 2, "n_kv": 256, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.596534, "speed_tg": 71.179642, "t": 3.596534, "speed": 71.179642}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_7
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_7 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 283198 tid 283198 thread 0 bound to OS proc set {0}
OMP: pid 283198 tid 283266 thread 2 bound to OS proc set {2}
OMP: pid 283198 tid 283265 thread 1 bound to OS proc set {1}
OMP: pid 283198 tid 283276 thread 12 bound to OS proc set {16}
OMP: pid 283198 tid 283274 thread 10 bound to OS proc set {13}
OMP: pid 283198 tid 283273 thread 9 bound to OS proc set {12}
OMP: pid 283198 tid 283299 thread 35 bound to OS proc set {47}
OMP: pid 283198 tid 283267 thread 3 bound to OS proc set {4}
OMP: pid 283198 tid 283291 thread 27 bound to OS proc set {36}
OMP: pid 283198 tid 283272 thread 8 bound to OS proc set {10}
OMP: pid 283198 tid 283279 thread 15 bound to OS proc set {20}
OMP: pid 283198 tid 283275 thread 11 bound to OS proc set {14}
OMP: pid 283198 tid 283283 thread 19 bound to OS proc set {25}
OMP: pid 283198 tid 283297 thread 33 bound to OS proc set {44}
OMP: pid 283198 tid 283270 thread 6 bound to OS proc set {8}
OMP: pid 283198 tid 283278 thread 14 bound to OS proc set {18}
OMP: pid 283198 tid 283308 thread 44 bound to OS proc set {59}
OMP: pid 283198 tid 283290 thread 26 bound to OS proc set {35}
OMP: pid 283198 tid 283294 thread 30 bound to OS proc set {40}
OMP: pid 283198 tid 283310 thread 46 bound to OS proc set {62}
OMP: pid 283198 tid 283271 thread 7 bound to OS proc set {9}
OMP: pid 283198 tid 283288 thread 24 bound to OS proc set {32}
OMP: pid 283198 tid 283311 thread 47 bound to OS proc set {63}
OMP: pid 283198 tid 283277 thread 13 bound to OS proc set {17}
OMP: pid 283198 tid 283298 thread 34 bound to OS proc set {46}
OMP: pid 283198 tid 283280 thread 16 bound to OS proc set {21}
OMP: pid 283198 tid 283296 thread 32 bound to OS proc set {43}
OMP: pid 283198 tid 283289 thread 25 bound to OS proc set {33}
OMP: pid 283198 tid 283282 thread 18 bound to OS proc set {24}
OMP: pid 283198 tid 283284 thread 20 bound to OS proc set {27}
OMP: pid 283198 tid 283268 thread 4 bound to OS proc set {5}
OMP: pid 283198 tid 283292 thread 28 bound to OS proc set {37}
OMP: pid 283198 tid 283300 thread 36 bound to OS proc set {48}
OMP: pid 283198 tid 283295 thread 31 bound to OS proc set {41}
OMP: pid 283198 tid 283269 thread 5 bound to OS proc set {6}
OMP: pid 283198 tid 283293 thread 29 bound to OS proc set {39}
OMP: pid 283198 tid 283285 thread 21 bound to OS proc set {28}
OMP: pid 283198 tid 283281 thread 17 bound to OS proc set {23}
OMP: pid 283198 tid 283307 thread 43 bound to OS proc set {58}
OMP: pid 283198 tid 283287 thread 23 bound to OS proc set {31}
OMP: pid 283198 tid 283286 thread 22 bound to OS proc set {29}
OMP: pid 283198 tid 283309 thread 45 bound to OS proc set {60}
OMP: pid 283198 tid 283303 thread 39 bound to OS proc set {52}
OMP: pid 283198 tid 283302 thread 38 bound to OS proc set {51}
OMP: pid 283198 tid 283301 thread 37 bound to OS proc set {50}
OMP: pid 283198 tid 283306 thread 42 bound to OS proc set {56}
OMP: pid 283198 tid 283304 thread 40 bound to OS proc set {54}
OMP: pid 283198 tid 283305 thread 41 bound to OS proc set {55}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 0, "tg": 128, "pl": 2, "n_kv": 256, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.640936, "speed_tg": 70.311592, "t": 3.640936, "speed": 70.311592}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_8
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_8 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_8 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_8 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_8 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_8 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_8 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_8 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_8 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 283381 tid 283381 thread 0 bound to OS proc set {0}
OMP: pid 283381 tid 283450 thread 3 bound to OS proc set {3}
OMP: pid 283381 tid 283462 thread 15 bound to OS proc set {17}
OMP: pid 283381 tid 283449 thread 2 bound to OS proc set {2}
OMP: pid 283381 tid 283448 thread 1 bound to OS proc set {1}
OMP: pid 283381 tid 283451 thread 4 bound to OS proc set {4}
OMP: pid 283381 tid 283461 thread 14 bound to OS proc set {16}
OMP: pid 283381 tid 283460 thread 13 bound to OS proc set {15}
OMP: pid 283381 tid 283463 thread 16 bound to OS proc set {18}
OMP: pid 283381 tid 283495 thread 48 bound to OS proc set {55}
OMP: pid 283381 tid 283465 thread 18 bound to OS proc set {20}
OMP: pid 283381 tid 283464 thread 17 bound to OS proc set {19}
OMP: pid 283381 tid 283459 thread 12 bound to OS proc set {13}
OMP: pid 283381 tid 283455 thread 8 bound to OS proc set {9}
OMP: pid 283381 tid 283458 thread 11 bound to OS proc set {12}
OMP: pid 283381 tid 283479 thread 32 bound to OS proc set {37}
OMP: pid 283381 tid 283454 thread 7 bound to OS proc set {8}
OMP: pid 283381 tid 283498 thread 51 bound to OS proc set {59}
OMP: pid 283381 tid 283453 thread 6 bound to OS proc set {6}
OMP: pid 283381 tid 283497 thread 50 bound to OS proc set {58}
OMP: pid 283381 tid 283477 thread 30 bound to OS proc set {34}
OMP: pid 283381 tid 283494 thread 47 bound to OS proc set {54}
OMP: pid 283381 tid 283475 thread 28 bound to OS proc set {32}
OMP: pid 283381 tid 283496 thread 49 bound to OS proc set {56}
OMP: pid 283381 tid 283493 thread 46 bound to OS proc set {53}
OMP: pid 283381 tid 283489 thread 42 bound to OS proc set {48}
OMP: pid 283381 tid 283502 thread 55 bound to OS proc set {63}
OMP: pid 283381 tid 283490 thread 43 bound to OS proc set {49}
OMP: pid 283381 tid 283486 thread 39 bound to OS proc set {45}
OMP: pid 283381 tid 283470 thread 23 bound to OS proc set {26}
OMP: pid 283381 tid 283457 thread 10 bound to OS proc set {11}
OMP: pid 283381 tid 283476 thread 29 bound to OS proc set {33}
OMP: pid 283381 tid 283474 thread 27 bound to OS proc set {31}
OMP: pid 283381 tid 283473 thread 26 bound to OS proc set {30}
OMP: pid 283381 tid 283478 thread 31 bound to OS proc set {35}
OMP: pid 283381 tid 283466 thread 19 bound to OS proc set {22}
OMP: pid 283381 tid 283499 thread 52 bound to OS proc set {60}
OMP: pid 283381 tid 283491 thread 44 bound to OS proc set {51}
OMP: pid 283381 tid 283469 thread 22 bound to OS proc set {25}
OMP: pid 283381 tid 283492 thread 45 bound to OS proc set {52}
OMP: pid 283381 tid 283485 thread 38 bound to OS proc set {44}
OMP: pid 283381 tid 283456 thread 9 bound to OS proc set {10}
OMP: pid 283381 tid 283472 thread 25 bound to OS proc set {29}
OMP: pid 283381 tid 283482 thread 35 bound to OS proc set {40}
OMP: pid 283381 tid 283487 thread 40 bound to OS proc set {46}
OMP: pid 283381 tid 283488 thread 41 bound to OS proc set {47}
OMP: pid 283381 tid 283480 thread 33 bound to OS proc set {38}
OMP: pid 283381 tid 283471 thread 24 bound to OS proc set {27}
OMP: pid 283381 tid 283484 thread 37 bound to OS proc set {42}
OMP: pid 283381 tid 283483 thread 36 bound to OS proc set {41}
OMP: pid 283381 tid 283481 thread 34 bound to OS proc set {39}
OMP: pid 283381 tid 283467 thread 20 bound to OS proc set {23}
OMP: pid 283381 tid 283468 thread 21 bound to OS proc set {24}
OMP: pid 283381 tid 283501 thread 54 bound to OS proc set {62}
OMP: pid 283381 tid 283452 thread 5 bound to OS proc set {5}
OMP: pid 283381 tid 283500 thread 53 bound to OS proc set {61}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 0, "tg": 128, "pl": 2, "n_kv": 256, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.763835, "speed_tg": 68.015732, "t": 3.763835, "speed": 68.015732}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_9
To display your profiling results:
#######################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#######################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_9 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_9 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_9 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_9 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_9 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_9 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_9 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_9 #
#######################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-46-37.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 283522 tid 283522 thread 0 bound to OS proc set {0}
OMP: pid 283522 tid 283591 thread 3 bound to OS proc set {3}
OMP: pid 283522 tid 283590 thread 2 bound to OS proc set {2}
OMP: pid 283522 tid 283594 thread 6 bound to OS proc set {6}
OMP: pid 283522 tid 283592 thread 4 bound to OS proc set {4}
OMP: pid 283522 tid 283603 thread 15 bound to OS proc set {15}
OMP: pid 283522 tid 283600 thread 12 bound to OS proc set {12}
OMP: pid 283522 tid 283596 thread 8 bound to OS proc set {8}
OMP: pid 283522 tid 283638 thread 50 bound to OS proc set {50}
OMP: pid 283522 tid 283604 thread 16 bound to OS proc set {16}
OMP: pid 283522 tid 283599 thread 11 bound to OS proc set {11}
OMP: pid 283522 tid 283602 thread 14 bound to OS proc set {14}
OMP: pid 283522 tid 283620 thread 32 bound to OS proc set {32}
OMP: pid 283522 tid 283589 thread 1 bound to OS proc set {1}
OMP: pid 283522 tid 283636 thread 48 bound to OS proc set {48}
OMP: pid 283522 tid 283607 thread 19 bound to OS proc set {19}
OMP: pid 283522 tid 283595 thread 7 bound to OS proc set {7}
OMP: pid 283522 tid 283634 thread 46 bound to OS proc set {46}
OMP: pid 283522 tid 283612 thread 24 bound to OS proc set {24}
OMP: pid 283522 tid 283648 thread 60 bound to OS proc set {60}
OMP: pid 283522 tid 283597 thread 9 bound to OS proc set {9}
OMP: pid 283522 tid 283646 thread 58 bound to OS proc set {58}
OMP: pid 283522 tid 283616 thread 28 bound to OS proc set {28}
OMP: pid 283522 tid 283623 thread 35 bound to OS proc set {35}
OMP: pid 283522 tid 283606 thread 18 bound to OS proc set {18}
OMP: pid 283522 tid 283650 thread 62 bound to OS proc set {62}
OMP: pid 283522 tid 283615 thread 27 bound to OS proc set {27}
OMP: pid 283522 tid 283639 thread 51 bound to OS proc set {51}
OMP: pid 283522 tid 283637 thread 49 bound to OS proc set {49}
OMP: pid 283522 tid 283619 thread 31 bound to OS proc set {31}
OMP: pid 283522 tid 283598 thread 10 bound to OS proc set {10}
OMP: pid 283522 tid 283651 thread 63 bound to OS proc set {63}
OMP: pid 283522 tid 283608 thread 20 bound to OS proc set {20}
OMP: pid 283522 tid 283640 thread 52 bound to OS proc set {52}
OMP: pid 283522 tid 283621 thread 33 bound to OS proc set {33}
OMP: pid 283522 tid 283622 thread 34 bound to OS proc set {34}
OMP: pid 283522 tid 283618 thread 30 bound to OS proc set {30}
OMP: pid 283522 tid 283611 thread 23 bound to OS proc set {23}
OMP: pid 283522 tid 283613 thread 25 bound to OS proc set {25}
OMP: pid 283522 tid 283635 thread 47 bound to OS proc set {47}
OMP: pid 283522 tid 283614 thread 26 bound to OS proc set {26}
OMP: pid 283522 tid 283617 thread 29 bound to OS proc set {29}
OMP: pid 283522 tid 283647 thread 59 bound to OS proc set {59}
OMP: pid 283522 tid 283601 thread 13 bound to OS proc set {13}
OMP: pid 283522 tid 283630 thread 42 bound to OS proc set {42}
OMP: pid 283522 tid 283633 thread 45 bound to OS proc set {45}
OMP: pid 283522 tid 283628 thread 40 bound to OS proc set {40}
OMP: pid 283522 tid 283632 thread 44 bound to OS proc set {44}
OMP: pid 283522 tid 283645 thread 57 bound to OS proc set {57}
OMP: pid 283522 tid 283626 thread 38 bound to OS proc set {38}
OMP: pid 283522 tid 283625 thread 37 bound to OS proc set {37}
OMP: pid 283522 tid 283610 thread 22 bound to OS proc set {22}
OMP: pid 283522 tid 283644 thread 56 bound to OS proc set {56}
OMP: pid 283522 tid 283593 thread 5 bound to OS proc set {5}
OMP: pid 283522 tid 283629 thread 41 bound to OS proc set {41}
OMP: pid 283522 tid 283609 thread 21 bound to OS proc set {21}
OMP: pid 283522 tid 283643 thread 55 bound to OS proc set {55}
OMP: pid 283522 tid 283642 thread 54 bound to OS proc set {54}
OMP: pid 283522 tid 283627 thread 39 bound to OS proc set {39}
OMP: pid 283522 tid 283641 thread 53 bound to OS proc set {53}
OMP: pid 283522 tid 283605 thread 17 bound to OS proc set {17}
OMP: pid 283522 tid 283649 thread 61 bound to OS proc set {61}
OMP: pid 283522 tid 283631 thread 43 bound to OS proc set {43}
OMP: pid 283522 tid 283624 thread 36 bound to OS proc set {36}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 0, "tg": 128, "pl": 2, "n_kv": 256, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 3.919786, "speed_tg": 65.309685, "t": 3.919786, "speed": 65.309685}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_10
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_10 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_10 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_10 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_10 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_10 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_10 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_10 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-46-37.ec2.internal/176-406-3169/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-25_09-51-35/tools/lprof_npsu_run_10 #
########################################################################################################################################################################################################################################
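
Note on reading the JSON result lines above: the following is a minimal sketch, not part of the MAQAO or llama.cpp output, for pulling the per-run result lines out of a log like this one and comparing generation throughput across thread counts. The function name parse_bench_lines and the SAMPLE text are hypothetical; SAMPLE keeps only a subset of the fields from the three result lines in this section. The relation speed_tg = pl * tg / t_tg is inferred from the numbers in this log, not taken from any documentation. The log prints a bare lowercase nan for speed_pp, which Python's json module does not accept, so the sketch rewrites it to NaN before parsing.

```python
import json
import re


def parse_bench_lines(log_text):
    """Return one dict per JSON result line found in the log text."""
    results = []
    for line in log_text.splitlines():
        line = line.strip()
        # Skip MAQAO banners, OMP affinity lines, and anything else non-JSON.
        if not (line.startswith("{") and line.endswith("}")):
            continue
        # json.loads accepts `NaN` but rejects the lowercase `nan` in the log.
        line = re.sub(r":\s*nan\b", ": NaN", line)
        results.append(json.loads(line))
    return results


# Trimmed copies of the three result lines in this section (assumption:
# only the fields needed here are kept), plus one non-JSON line to skip.
SAMPLE = """\
OMP: pid 283381 tid 283381 thread 0 bound to OS proc set {0}
{"n_threads": 48, "pp": 0, "tg": 128, "pl": 2, "speed_pp": nan, "t_tg": 3.640936, "speed_tg": 70.311592}
{"n_threads": 56, "pp": 0, "tg": 128, "pl": 2, "speed_pp": nan, "t_tg": 3.763835, "speed_tg": 68.015732}
{"n_threads": 64, "pp": 0, "tg": 128, "pl": 2, "speed_pp": nan, "t_tg": 3.919786, "speed_tg": 65.309685}
"""

results = parse_bench_lines(SAMPLE)

for r in results:
    # Inferred relation: tokens generated across all sequences per second.
    recomputed = r["pl"] * r["tg"] / r["t_tg"]
    print(f'{r["n_threads"]:>3} threads: reported {r["speed_tg"]:.3f} t/s, '
          f'recomputed {recomputed:.3f} t/s')

# Thread count with the highest reported generation throughput.
best = max(results, key=lambda r: r["speed_tg"])
print("best n_threads:", best["n_threads"])
```

On the three runs in this section, throughput drops as the thread count rises from 48 to 64, so the sketch picks the 48-thread run as the fastest of the three.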