* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 27220 tid 27220 thread 0 bound to OS proc set {0}
OMP: pid 27220 tid 27320 thread 2 bound to OS proc set {48}
OMP: pid 27220 tid 27319 thread 1 bound to OS proc set {24}
OMP: pid 27220 tid 27321 thread 3 bound to OS proc set {72}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 31.145971, "speed_pp": 65.754890, "t_tg": 0.000000, "speed_tg": nan, "t": 31.145971, "speed": 65.754890}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_2
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_2 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 27341 tid 27341 thread 0 bound to OS proc set {0}
OMP: pid 27341 tid 27442 thread 3 bound to OS proc set {36}
OMP: pid 27341 tid 27441 thread 2 bound to OS proc set {24}
OMP: pid 27341 tid 27440 thread 1 bound to OS proc set {12}
OMP: pid 27341 tid 27443 thread 4 bound to OS proc set {48}
OMP: pid 27341 tid 27445 thread 6 bound to OS proc set {72}
OMP: pid 27341 tid 27444 thread 5 bound to OS proc set {60}
OMP: pid 27341 tid 27446 thread 7 bound to OS proc set {84}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 15.645636, "speed_pp": 130.899124, "t_tg": 0.000000, "speed_tg": nan, "t": 15.645636, "speed": 130.899124}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_3
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_3 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 27515 tid 27515 thread 0 bound to OS proc set {0}
OMP: pid 27515 tid 27615 thread 2 bound to OS proc set {12}
OMP: pid 27515 tid 27614 thread 1 bound to OS proc set {6}
OMP: pid 27515 tid 27625 thread 12 bound to OS proc set {72}
OMP: pid 27515 tid 27621 thread 8 bound to OS proc set {48}
OMP: pid 27515 tid 27627 thread 14 bound to OS proc set {84}
OMP: pid 27515 tid 27624 thread 11 bound to OS proc set {66}
OMP: pid 27515 tid 27626 thread 13 bound to OS proc set {78}
OMP: pid 27515 tid 27617 thread 4 bound to OS proc set {24}
OMP: pid 27515 tid 27620 thread 7 bound to OS proc set {42}
OMP: pid 27515 tid 27623 thread 10 bound to OS proc set {60}
OMP: pid 27515 tid 27619 thread 6 bound to OS proc set {36}
OMP: pid 27515 tid 27622 thread 9 bound to OS proc set {54}
OMP: pid 27515 tid 27618 thread 5 bound to OS proc set {30}
OMP: pid 27515 tid 27616 thread 3 bound to OS proc set {18}
OMP: pid 27515 tid 27628 thread 15 bound to OS proc set {90}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 7.915881, "speed_pp": 258.720398, "t_tg": 0.000000, "speed_tg": nan, "t": 7.915881, "speed": 258.720398}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_4
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_4 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 27648 tid 27648 thread 0 bound to OS proc set {0}
OMP: pid 27648 tid 27749 thread 3 bound to OS proc set {12}
OMP: pid 27648 tid 27748 thread 2 bound to OS proc set {8}
OMP: pid 27648 tid 27761 thread 15 bound to OS proc set {60}
OMP: pid 27648 tid 27762 thread 16 bound to OS proc set {64}
OMP: pid 27648 tid 27747 thread 1 bound to OS proc set {4}
OMP: pid 27648 tid 27758 thread 12 bound to OS proc set {48}
OMP: pid 27648 tid 27753 thread 7 bound to OS proc set {28}
OMP: pid 27648 tid 27764 thread 18 bound to OS proc set {72}
OMP: pid 27648 tid 27750 thread 4 bound to OS proc set {16}
OMP: pid 27648 tid 27765 thread 19 bound to OS proc set {76}
OMP: pid 27648 tid 27751 thread 5 bound to OS proc set {20}
OMP: pid 27648 tid 27766 thread 20 bound to OS proc set {80}
OMP: pid 27648 tid 27759 thread 13 bound to OS proc set {52}
OMP: pid 27648 tid 27754 thread 8 bound to OS proc set {32}
OMP: pid 27648 tid 27757 thread 11 bound to OS proc set {44}
OMP: pid 27648 tid 27763 thread 17 bound to OS proc set {68}
OMP: pid 27648 tid 27760 thread 14 bound to OS proc set {56}
OMP: pid 27648 tid 27756 thread 10 bound to OS proc set {40}
OMP: pid 27648 tid 27755 thread 9 bound to OS proc set {36}
OMP: pid 27648 tid 27767 thread 21 bound to OS proc set {84}
OMP: pid 27648 tid 27752 thread 6 bound to OS proc set {24}
OMP: pid 27648 tid 27768 thread 22 bound to OS proc set {88}
OMP: pid 27648 tid 27769 thread 23 bound to OS proc set {92}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 5.800039, "speed_pp": 353.101074, "t_tg": 0.000000, "speed_tg": nan, "t": 5.800039, "speed": 353.101074}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_5
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_5 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 27791 tid 27791 thread 0 bound to OS proc set {0}
OMP: pid 27791 tid 27896 thread 7 bound to OS proc set {21}
OMP: pid 27791 tid 27893 thread 4 bound to OS proc set {12}
OMP: pid 27791 tid 27904 thread 15 bound to OS proc set {45}
OMP: pid 27791 tid 27894 thread 5 bound to OS proc set {15}
OMP: pid 27791 tid 27903 thread 14 bound to OS proc set {42}
OMP: pid 27791 tid 27901 thread 12 bound to OS proc set {36}
OMP: pid 27791 tid 27892 thread 3 bound to OS proc set {9}
OMP: pid 27791 tid 27891 thread 2 bound to OS proc set {6}
OMP: pid 27791 tid 27890 thread 1 bound to OS proc set {3}
OMP: pid 27791 tid 27900 thread 11 bound to OS proc set {33}
OMP: pid 27791 tid 27902 thread 13 bound to OS proc set {39}
OMP: pid 27791 tid 27897 thread 8 bound to OS proc set {24}
OMP: pid 27791 tid 27899 thread 10 bound to OS proc set {30}
OMP: pid 27791 tid 27917 thread 28 bound to OS proc set {84}
OMP: pid 27791 tid 27913 thread 24 bound to OS proc set {72}
OMP: pid 27791 tid 27895 thread 6 bound to OS proc set {18}
OMP: pid 27791 tid 27915 thread 26 bound to OS proc set {78}
OMP: pid 27791 tid 27920 thread 31 bound to OS proc set {93}
OMP: pid 27791 tid 27905 thread 16 bound to OS proc set {48}
OMP: pid 27791 tid 27919 thread 30 bound to OS proc set {90}
OMP: pid 27791 tid 27916 thread 27 bound to OS proc set {81}
OMP: pid 27791 tid 27908 thread 19 bound to OS proc set {57}
OMP: pid 27791 tid 27918 thread 29 bound to OS proc set {87}
OMP: pid 27791 tid 27914 thread 25 bound to OS proc set {75}
OMP: pid 27791 tid 27912 thread 23 bound to OS proc set {69}
OMP: pid 27791 tid 27907 thread 18 bound to OS proc set {54}
OMP: pid 27791 tid 27906 thread 17 bound to OS proc set {51}
OMP: pid 27791 tid 27909 thread 20 bound to OS proc set {60}
OMP: pid 27791 tid 27911 thread 22 bound to OS proc set {66}
OMP: pid 27791 tid 27898 thread 9 bound to OS proc set {27}
OMP: pid 27791 tid 27910 thread 21 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 4.594366, "speed_pp": 445.763336, "t_tg": 0.000000, "speed_tg": nan, "t": 4.594366, "speed": 445.763336}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_6
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_6 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 27942 tid 27942 thread 0 bound to OS proc set {0}
OMP: pid 27942 tid 28041 thread 1 bound to OS proc set {2}
OMP: pid 27942 tid 28043 thread 3 bound to OS proc set {7}
OMP: pid 27942 tid 28042 thread 2 bound to OS proc set {4}
OMP: pid 27942 tid 28047 thread 7 bound to OS proc set {16}
OMP: pid 27942 tid 28046 thread 6 bound to OS proc set {14}
OMP: pid 27942 tid 28055 thread 15 bound to OS proc set {36}
OMP: pid 27942 tid 28075 thread 35 bound to OS proc set {84}
OMP: pid 27942 tid 28054 thread 14 bound to OS proc set {33}
OMP: pid 27942 tid 28045 thread 5 bound to OS proc set {12}
OMP: pid 27942 tid 28072 thread 32 bound to OS proc set {77}
OMP: pid 27942 tid 28044 thread 4 bound to OS proc set {9}
OMP: pid 27942 tid 28056 thread 16 bound to OS proc set {38}
OMP: pid 27942 tid 28057 thread 17 bound to OS proc set {41}
OMP: pid 27942 tid 28070 thread 30 bound to OS proc set {72}
OMP: pid 27942 tid 28048 thread 8 bound to OS proc set {19}
OMP: pid 27942 tid 28079 thread 39 bound to OS proc set {94}
OMP: pid 27942 tid 28050 thread 10 bound to OS proc set {24}
OMP: pid 27942 tid 28052 thread 12 bound to OS proc set {29}
OMP: pid 27942 tid 28071 thread 31 bound to OS proc set {75}
OMP: pid 27942 tid 28051 thread 11 bound to OS proc set {26}
OMP: pid 27942 tid 28068 thread 28 bound to OS proc set {67}
OMP: pid 27942 tid 28053 thread 13 bound to OS proc set {31}
OMP: pid 27942 tid 28066 thread 26 bound to OS proc set {63}
OMP: pid 27942 tid 28076 thread 36 bound to OS proc set {87}
OMP: pid 27942 tid 28074 thread 34 bound to OS proc set {82}
OMP: pid 27942 tid 28073 thread 33 bound to OS proc set {80}
OMP: pid 27942 tid 28078 thread 38 bound to OS proc set {92}
OMP: pid 27942 tid 28049 thread 9 bound to OS proc set {21}
OMP: pid 27942 tid 28064 thread 24 bound to OS proc set {58}
OMP: pid 27942 tid 28067 thread 27 bound to OS proc set {65}
OMP: pid 27942 tid 28069 thread 29 bound to OS proc set {70}
OMP: pid 27942 tid 28058 thread 18 bound to OS proc set {43}
OMP: pid 27942 tid 28065 thread 25 bound to OS proc set {60}
OMP: pid 27942 tid 28077 thread 37 bound to OS proc set {89}
OMP: pid 27942 tid 28060 thread 20 bound to OS proc set {48}
OMP: pid 27942 tid 28063 thread 23 bound to OS proc set {55}
OMP: pid 27942 tid 28062 thread 22 bound to OS proc set {53}
OMP: pid 27942 tid 28059 thread 19 bound to OS proc set {46}
OMP: pid 27942 tid 28061 thread 21 bound to OS proc set {50}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 3.865285, "speed_pp": 529.844482, "t_tg": 0.000000, "speed_tg": nan, "t": 3.865285, "speed": 529.844482}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_7
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_7 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 28152 tid 28152 thread 0 bound to OS proc set {0}
OMP: pid 28152 tid 28251 thread 1 bound to OS proc set {2}
OMP: pid 28152 tid 28268 thread 18 bound to OS proc set {36}
OMP: pid 28152 tid 28258 thread 8 bound to OS proc set {16}
OMP: pid 28152 tid 28266 thread 16 bound to OS proc set {32}
OMP: pid 28152 tid 28259 thread 9 bound to OS proc set {18}
OMP: pid 28152 tid 28252 thread 2 bound to OS proc set {4}
OMP: pid 28152 tid 28282 thread 32 bound to OS proc set {64}
OMP: pid 28152 tid 28297 thread 47 bound to OS proc set {94}
OMP: pid 28152 tid 28294 thread 44 bound to OS proc set {88}
OMP: pid 28152 tid 28262 thread 12 bound to OS proc set {24}
OMP: pid 28152 tid 28257 thread 7 bound to OS proc set {14}
OMP: pid 28152 tid 28273 thread 23 bound to OS proc set {46}
OMP: pid 28152 tid 28285 thread 35 bound to OS proc set {70}
OMP: pid 28152 tid 28253 thread 3 bound to OS proc set {6}
OMP: pid 28152 tid 28264 thread 14 bound to OS proc set {28}
OMP: pid 28152 tid 28281 thread 31 bound to OS proc set {62}
OMP: pid 28152 tid 28256 thread 6 bound to OS proc set {12}
OMP: pid 28152 tid 28296 thread 46 bound to OS proc set {92}
OMP: pid 28152 tid 28254 thread 4 bound to OS proc set {8}
OMP: pid 28152 tid 28265 thread 15 bound to OS proc set {30}
OMP: pid 28152 tid 28263 thread 13 bound to OS proc set {26}
OMP: pid 28152 tid 28274 thread 24 bound to OS proc set {48}
OMP: pid 28152 tid 28261 thread 11 bound to OS proc set {22}
OMP: pid 28152 tid 28255 thread 5 bound to OS proc set {10}
OMP: pid 28152 tid 28277 thread 27 bound to OS proc set {54}
OMP: pid 28152 tid 28260 thread 10 bound to OS proc set {20}
OMP: pid 28152 tid 28278 thread 28 bound to OS proc set {56}
OMP: pid 28152 tid 28279 thread 29 bound to OS proc set {58}
OMP: pid 28152 tid 28290 thread 40 bound to OS proc set {80}
OMP: pid 28152 tid 28270 thread 20 bound to OS proc set {40}
OMP: pid 28152 tid 28289 thread 39 bound to OS proc set {78}
OMP: pid 28152 tid 28271 thread 21 bound to OS proc set {42}
OMP: pid 28152 tid 28293 thread 43 bound to OS proc set {86}
OMP: pid 28152 tid 28267 thread 17 bound to OS proc set {34}
OMP: pid 28152 tid 28272 thread 22 bound to OS proc set {44}
OMP: pid 28152 tid 28295 thread 45 bound to OS proc set {90}
OMP: pid 28152 tid 28269 thread 19 bound to OS proc set {38}
OMP: pid 28152 tid 28280 thread 30 bound to OS proc set {60}
OMP: pid 28152 tid 28276 thread 26 bound to OS proc set {52}
OMP: pid 28152 tid 28286 thread 36 bound to OS proc set {72}
OMP: pid 28152 tid 28283 thread 33 bound to OS proc set {66}
OMP: pid 28152 tid 28284 thread 34 bound to OS proc set {68}
OMP: pid 28152 tid 28275 thread 25 bound to OS proc set {50}
OMP: pid 28152 tid 28288 thread 38 bound to OS proc set {76}
OMP: pid 28152 tid 28287 thread 37 bound to OS proc set {74}
OMP: pid 28152 tid 28292 thread 42 bound to OS proc set {84}
OMP: pid 28152 tid 28291 thread 41 bound to OS proc set {82}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 3.352615, "speed_pp": 610.866394, "t_tg": 0.000000, "speed_tg": nan, "t": 3.352615, "speed": 610.866394}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_8
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_8 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_8 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_8 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_8 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_8 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_8 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_8 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_8 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 28317 tid 28317 thread 0 bound to OS proc set {0}
OMP: pid 28317 tid 28416 thread 1 bound to OS proc set {1}
OMP: pid 28317 tid 28418 thread 3 bound to OS proc set {5}
OMP: pid 28317 tid 28417 thread 2 bound to OS proc set {3}
OMP: pid 28317 tid 28470 thread 55 bound to OS proc set {95}
OMP: pid 28317 tid 28423 thread 8 bound to OS proc set {13}
OMP: pid 28317 tid 28466 thread 51 bound to OS proc set {88}
OMP: pid 28317 tid 28465 thread 50 bound to OS proc set {86}
OMP: pid 28317 tid 28447 thread 32 bound to OS proc set {55}
OMP: pid 28317 tid 28463 thread 48 bound to OS proc set {83}
OMP: pid 28317 tid 28430 thread 15 bound to OS proc set {25}
OMP: pid 28317 tid 28434 thread 19 bound to OS proc set {32}
OMP: pid 28317 tid 28426 thread 11 bound to OS proc set {19}
OMP: pid 28317 tid 28443 thread 28 bound to OS proc set {48}
OMP: pid 28317 tid 28467 thread 52 bound to OS proc set {90}
OMP: pid 28317 tid 28422 thread 7 bound to OS proc set {12}
OMP: pid 28317 tid 28464 thread 49 bound to OS proc set {84}
OMP: pid 28317 tid 28429 thread 14 bound to OS proc set {24}
OMP: pid 28317 tid 28469 thread 54 bound to OS proc set {93}
OMP: pid 28317 tid 28439 thread 24 bound to OS proc set {41}
OMP: pid 28317 tid 28433 thread 18 bound to OS proc set {31}
OMP: pid 28317 tid 28425 thread 10 bound to OS proc set {17}
OMP: pid 28317 tid 28431 thread 16 bound to OS proc set {27}
OMP: pid 28317 tid 28462 thread 47 bound to OS proc set {81}
OMP: pid 28317 tid 28441 thread 26 bound to OS proc set {45}
OMP: pid 28317 tid 28419 thread 4 bound to OS proc set {6}
OMP: pid 28317 tid 28427 thread 12 bound to OS proc set {20}
OMP: pid 28317 tid 28444 thread 29 bound to OS proc set {50}
OMP: pid 28317 tid 28432 thread 17 bound to OS proc set {29}
OMP: pid 28317 tid 28450 thread 35 bound to OS proc set {60}
OMP: pid 28317 tid 28428 thread 13 bound to OS proc set {22}
OMP: pid 28317 tid 28459 thread 44 bound to OS proc set {76}
OMP: pid 28317 tid 28421 thread 6 bound to OS proc set {10}
OMP: pid 28317 tid 28442 thread 27 bound to OS proc set {46}
OMP: pid 28317 tid 28468 thread 53 bound to OS proc set {91}
OMP: pid 28317 tid 28445 thread 30 bound to OS proc set {51}
OMP: pid 28317 tid 28435 thread 20 bound to OS proc set {34}
OMP: pid 28317 tid 28461 thread 46 bound to OS proc set {79}
OMP: pid 28317 tid 28458 thread 43 bound to OS proc set {74}
OMP: pid 28317 tid 28437 thread 22 bound to OS proc set {38}
OMP: pid 28317 tid 28449 thread 34 bound to OS proc set {58}
OMP: pid 28317 tid 28438 thread 23 bound to OS proc set {39}
OMP: pid 28317 tid 28448 thread 33 bound to OS proc set {57}
OMP: pid 28317 tid 28455 thread 40 bound to OS proc set {69}
OMP: pid 28317 tid 28420 thread 5 bound to OS proc set {8}
OMP: pid 28317 tid 28424 thread 9 bound to OS proc set {15}
OMP: pid 28317 tid 28457 thread 42 bound to OS proc set {72}
OMP: pid 28317 tid 28453 thread 38 bound to OS proc set {65}
OMP: pid 28317 tid 28451 thread 36 bound to OS proc set {62}
OMP: pid 28317 tid 28454 thread 39 bound to OS proc set {67}
OMP: pid 28317 tid 28460 thread 45 bound to OS proc set {77}
OMP: pid 28317 tid 28452 thread 37 bound to OS proc set {64}
OMP: pid 28317 tid 28440 thread 25 bound to OS proc set {43}
OMP: pid 28317 tid 28456 thread 41 bound to OS proc set {71}
OMP: pid 28317 tid 28446 thread 31 bound to OS proc set {53}
OMP: pid 28317 tid 28436 thread 21 bound to OS proc set {36}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 2.968510, "speed_pp": 689.908447, "t_tg": 0.000000, "speed_tg": nan, "t": 2.968510, "speed": 689.908447}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_9
To display your profiling results:
########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_9 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_9 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_9 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_9 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_9 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_9 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_9 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_9 #
########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 28490 tid 28490 thread 0 bound to OS proc set {0}
OMP: pid 28490 tid 28589 thread 1 bound to OS proc set {1}
OMP: pid 28490 tid 28590 thread 2 bound to OS proc set {3}
OMP: pid 28490 tid 28596 thread 8 bound to OS proc set {12}
OMP: pid 28490 tid 28601 thread 13 bound to OS proc set {19}
OMP: pid 28490 tid 28603 thread 15 bound to OS proc set {22}
OMP: pid 28490 tid 28602 thread 14 bound to OS proc set {21}
OMP: pid 28490 tid 28591 thread 3 bound to OS proc set {4}
OMP: pid 28490 tid 28597 thread 9 bound to OS proc set {13}
OMP: pid 28490 tid 28595 thread 7 bound to OS proc set {10}
OMP: pid 28490 tid 28651 thread 63 bound to OS proc set {95}
OMP: pid 28490 tid 28598 thread 10 bound to OS proc set {15}
OMP: pid 28490 tid 28599 thread 11 bound to OS proc set {16}
OMP: pid 28490 tid 28639 thread 51 bound to OS proc set {77}
OMP: pid 28490 tid 28648 thread 60 bound to OS proc set {90}
OMP: pid 28490 tid 28592 thread 4 bound to OS proc set {6}
OMP: pid 28490 tid 28594 thread 6 bound to OS proc set {9}
OMP: pid 28490 tid 28620 thread 32 bound to OS proc set {48}
OMP: pid 28490 tid 28600 thread 12 bound to OS proc set {18}
OMP: pid 28490 tid 28650 thread 62 bound to OS proc set {93}
OMP: pid 28490 tid 28606 thread 18 bound to OS proc set {27}
OMP: pid 28490 tid 28604 thread 16 bound to OS proc set {24}
OMP: pid 28490 tid 28623 thread 35 bound to OS proc set {53}
OMP: pid 28490 tid 28628 thread 40 bound to OS proc set {60}
OMP: pid 28490 tid 28607 thread 19 bound to OS proc set {28}
OMP: pid 28490 tid 28593 thread 5 bound to OS proc set {7}
OMP: pid 28490 tid 28615 thread 27 bound to OS proc set {40}
OMP: pid 28490 tid 28617 thread 29 bound to OS proc set {43}
OMP: pid 28490 tid 28627 thread 39 bound to OS proc set {59}
OMP: pid 28490 tid 28636 thread 48 bound to OS proc set {72}
OMP: pid 28490 tid 28635 thread 47 bound to OS proc set {71}
OMP: pid 28490 tid 28619 thread 31 bound to OS proc set {46}
OMP: pid 28490 tid 28612 thread 24 bound to OS proc set {36}
OMP: pid 28490 tid 28608 thread 20 bound to OS proc set {30}
OMP: pid 28490 tid 28643 thread 55 bound to OS proc set {83}
OMP: pid 28490 tid 28611 thread 23 bound to OS proc set {34}
OMP: pid 28490 tid 28618 thread 30 bound to OS proc set {45}
OMP: pid 28490 tid 28638 thread 50 bound to OS proc set {75}
OMP: pid 28490 tid 28605 thread 17 bound to OS proc set {25}
OMP: pid 28490 tid 28630 thread 42 bound to OS proc set {63}
OMP: pid 28490 tid 28616 thread 28 bound to OS proc set {42}
OMP: pid 28490 tid 28631 thread 43 bound to OS proc set {65}
OMP: pid 28490 tid 28621 thread 33 bound to OS proc set {50}
OMP: pid 28490 tid 28647 thread 59 bound to OS proc set {89}
OMP: pid 28490 tid 28649 thread 61 bound to OS proc set {92}
OMP: pid 28490 tid 28613 thread 25 bound to OS proc set {37}
OMP: pid 28490 tid 28625 thread 37 bound to OS proc set {56}
OMP: pid 28490 tid 28614 thread 26 bound to OS proc set {39}
OMP: pid 28490 tid 28634 thread 46 bound to OS proc set {69}
OMP: pid 28490 tid 28624 thread 36 bound to OS proc set {54}
OMP: pid 28490 tid 28629 thread 41 bound to OS proc set {62}
OMP: pid 28490 tid 28637 thread 49 bound to OS proc set {74}
OMP: pid 28490 tid 28626 thread 38 bound to OS proc set {57}
OMP: pid 28490 tid 28622 thread 34 bound to OS proc set {51}
OMP: pid 28490 tid 28644 thread 56 bound to OS proc set {84}
OMP: pid 28490 tid 28642 thread 54 bound to OS proc set {81}
OMP: pid 28490 tid 28640 thread 52 bound to OS proc set {78}
OMP: pid 28490 tid 28646 thread 58 bound to OS proc set {87}
OMP: pid 28490 tid 28633 thread 45 bound to OS proc set {68}
OMP: pid 28490 tid 28645 thread 57 bound to OS proc set {86}
OMP: pid 28490 tid 28632 thread 44 bound to OS proc set {66}
OMP: pid 28490 tid 28610 thread 22 bound to OS proc set {33}
OMP: pid 28490 tid 28641 thread 53 bound to OS proc set {80}
OMP: pid 28490 tid 28609 thread 21 bound to OS proc set {31}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 2.634831, "speed_pp": 777.279480, "t_tg": 0.000000, "speed_tg": nan, "t": 2.634831, "speed": 777.279480}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_10
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_10 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_10 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_10 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_10 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_10 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_10 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_10 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_10 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 28671 tid 28671 thread 0 bound to OS proc set {0}
OMP: pid 28671 tid 28771 thread 2 bound to OS proc set {2}
OMP: pid 28671 tid 28770 thread 1 bound to OS proc set {1}
OMP: pid 28671 tid 28840 thread 71 bound to OS proc set {95}
OMP: pid 28671 tid 28836 thread 67 bound to OS proc set {90}
OMP: pid 28671 tid 28817 thread 48 bound to OS proc set {64}
OMP: pid 28671 tid 28772 thread 3 bound to OS proc set {4}
OMP: pid 28671 tid 28837 thread 68 bound to OS proc set {91}
OMP: pid 28671 tid 28835 thread 66 bound to OS proc set {88}
OMP: pid 28671 tid 28781 thread 12 bound to OS proc set {16}
OMP: pid 28671 tid 28834 thread 65 bound to OS proc set {87}
OMP: pid 28671 tid 28775 thread 6 bound to OS proc set {8}
OMP: pid 28671 tid 28833 thread 64 bound to OS proc set {86}
OMP: pid 28671 tid 28832 thread 63 bound to OS proc set {84}
OMP: pid 28671 tid 28815 thread 46 bound to OS proc set {61}
OMP: pid 28671 tid 28820 thread 51 bound to OS proc set {68}
OMP: pid 28671 tid 28780 thread 11 bound to OS proc set {14}
OMP: pid 28671 tid 28828 thread 59 bound to OS proc set {79}
OMP: pid 28671 tid 28818 thread 49 bound to OS proc set {66}
OMP: pid 28671 tid 28783 thread 14 bound to OS proc set {18}
OMP: pid 28671 tid 28816 thread 47 bound to OS proc set {63}
OMP: pid 28671 tid 28773 thread 4 bound to OS proc set {5}
OMP: pid 28671 tid 28821 thread 52 bound to OS proc set {70}
OMP: pid 28671 tid 28829 thread 60 bound to OS proc set {80}
OMP: pid 28671 tid 28803 thread 34 bound to OS proc set {45}
OMP: pid 28671 tid 28777 thread 8 bound to OS proc set {10}
OMP: pid 28671 tid 28804 thread 35 bound to OS proc set {47}
OMP: pid 28671 tid 28779 thread 10 bound to OS proc set {13}
OMP: pid 28671 tid 28801 thread 32 bound to OS proc set {43}
OMP: pid 28671 tid 28784 thread 15 bound to OS proc set {20}
OMP: pid 28671 tid 28786 thread 17 bound to OS proc set {22}
OMP: pid 28671 tid 28776 thread 7 bound to OS proc set {9}
OMP: pid 28671 tid 28782 thread 13 bound to OS proc set {17}
OMP: pid 28671 tid 28789 thread 20 bound to OS proc set {26}
OMP: pid 28671 tid 28799 thread 30 bound to OS proc set {40}
OMP: pid 28671 tid 28792 thread 23 bound to OS proc set {30}
OMP: pid 28671 tid 28838 thread 69 bound to OS proc set {92}
OMP: pid 28671 tid 28839 thread 70 bound to OS proc set {94}
OMP: pid 28671 tid 28791 thread 22 bound to OS proc set {29}
OMP: pid 28671 tid 28798 thread 29 bound to OS proc set {39}
OMP: pid 28671 tid 28813 thread 44 bound to OS proc set {59}
OMP: pid 28671 tid 28831 thread 62 bound to OS proc set {83}
OMP: pid 28671 tid 28774 thread 5 bound to OS proc set {6}
OMP: pid 28671 tid 28787 thread 18 bound to OS proc set {24}
OMP: pid 28671 tid 28778 thread 9 bound to OS proc set {12}
OMP: pid 28671 tid 28819 thread 50 bound to OS proc set {67}
OMP: pid 28671 tid 28809 thread 40 bound to OS proc set {53}
OMP: pid 28671 tid 28814 thread 45 bound to OS proc set {60}
OMP: pid 28671 tid 28802 thread 33 bound to OS proc set {44}
OMP: pid 28671 tid 28807 thread 38 bound to OS proc set {51}
OMP: pid 28671 tid 28823 thread 54 bound to OS proc set {72}
OMP: pid 28671 tid 28788 thread 19 bound to OS proc set {25}
OMP: pid 28671 tid 28793 thread 24 bound to OS proc set {32}
OMP: pid 28671 tid 28800 thread 31 bound to OS proc set {41}
OMP: pid 28671 tid 28797 thread 28 bound to OS proc set {37}
OMP: pid 28671 tid 28785 thread 16 bound to OS proc set {21}
OMP: pid 28671 tid 28790 thread 21 bound to OS proc set {28}
OMP: pid 28671 tid 28794 thread 25 bound to OS proc set {33}
OMP: pid 28671 tid 28808 thread 39 bound to OS proc set {52}
OMP: pid 28671 tid 28805 thread 36 bound to OS proc set {48}
OMP: pid 28671 tid 28796 thread 27 bound to OS proc set {36}
OMP: pid 28671 tid 28795 thread 26 bound to OS proc set {35}
OMP: pid 28671 tid 28825 thread 56 bound to OS proc set {75}
OMP: pid 28671 tid 28824 thread 55 bound to OS proc set {74}
OMP: pid 28671 tid 28830 thread 61 bound to OS proc set {82}
OMP: pid 28671 tid 28812 thread 43 bound to OS proc set {57}
OMP: pid 28671 tid 28826 thread 57 bound to OS proc set {76}
OMP: pid 28671 tid 28811 thread 42 bound to OS proc set {56}
OMP: pid 28671 tid 28822 thread 53 bound to OS proc set {71}
OMP: pid 28671 tid 28806 thread 37 bound to OS proc set {49}
OMP: pid 28671 tid 28810 thread 41 bound to OS proc set {55}
OMP: pid 28671 tid 28827 thread 58 bound to OS proc set {78}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 72, "n_threads_batch": 72, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 2.456781, "speed_pp": 833.611145, "t_tg": 0.000000, "speed_tg": nan, "t": 2.456781, "speed": 833.611145}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_11
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_11 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_11 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_11 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_11 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_11 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_11 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_11 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_11 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 28860 tid 28860 thread 0 bound to OS proc set {0}
OMP: pid 28860 tid 28961 thread 3 bound to OS proc set {3}
OMP: pid 28860 tid 28960 thread 2 bound to OS proc set {2}
OMP: pid 28860 tid 28962 thread 4 bound to OS proc set {4}
OMP: pid 28860 tid 28959 thread 1 bound to OS proc set {1}
OMP: pid 28860 tid 28970 thread 12 bound to OS proc set {14}
OMP: pid 28860 tid 28965 thread 7 bound to OS proc set {8}
OMP: pid 28860 tid 28968 thread 10 bound to OS proc set {12}
OMP: pid 28860 tid 29008 thread 50 bound to OS proc set {60}
OMP: pid 28860 tid 28969 thread 11 bound to OS proc set {13}
OMP: pid 28860 tid 29037 thread 79 bound to OS proc set {95}
OMP: pid 28860 tid 28985 thread 27 bound to OS proc set {32}
OMP: pid 28860 tid 29007 thread 49 bound to OS proc set {59}
OMP: pid 28860 tid 28972 thread 14 bound to OS proc set {16}
OMP: pid 28860 tid 28973 thread 15 bound to OS proc set {18}
OMP: pid 28860 tid 28971 thread 13 bound to OS proc set {15}
OMP: pid 28860 tid 28963 thread 5 bound to OS proc set {6}
OMP: pid 28860 tid 28966 thread 8 bound to OS proc set {9}
OMP: pid 28860 tid 28964 thread 6 bound to OS proc set {7}
OMP: pid 28860 tid 29004 thread 46 bound to OS proc set {55}
OMP: pid 28860 tid 29009 thread 51 bound to OS proc set {61}
OMP: pid 28860 tid 29003 thread 45 bound to OS proc set {54}
OMP: pid 28860 tid 29036 thread 78 bound to OS proc set {94}
OMP: pid 28860 tid 28978 thread 20 bound to OS proc set {24}
OMP: pid 28860 tid 28998 thread 40 bound to OS proc set {48}
OMP: pid 28860 tid 28975 thread 17 bound to OS proc set {20}
OMP: pid 28860 tid 29006 thread 48 bound to OS proc set {58}
OMP: pid 28860 tid 28993 thread 35 bound to OS proc set {42}
OMP: pid 28860 tid 28967 thread 9 bound to OS proc set {10}
OMP: pid 28860 tid 28980 thread 22 bound to OS proc set {26}
OMP: pid 28860 tid 29005 thread 47 bound to OS proc set {56}
OMP: pid 28860 tid 28982 thread 24 bound to OS proc set {29}
OMP: pid 28860 tid 28974 thread 16 bound to OS proc set {19}
OMP: pid 28860 tid 29001 thread 43 bound to OS proc set {52}
OMP: pid 28860 tid 28997 thread 39 bound to OS proc set {47}
OMP: pid 28860 tid 28987 thread 29 bound to OS proc set {35}
OMP: pid 28860 tid 29002 thread 44 bound to OS proc set {53}
OMP: pid 28860 tid 28984 thread 26 bound to OS proc set {31}
OMP: pid 28860 tid 29019 thread 61 bound to OS proc set {73}
OMP: pid 28860 tid 29011 thread 53 bound to OS proc set {64}
OMP: pid 28860 tid 28977 thread 19 bound to OS proc set {23}
OMP: pid 28860 tid 29020 thread 62 bound to OS proc set {75}
OMP: pid 28860 tid 29000 thread 42 bound to OS proc set {50}
OMP: pid 28860 tid 28976 thread 18 bound to OS proc set {21}
OMP: pid 28860 tid 28988 thread 30 bound to OS proc set {36}
OMP: pid 28860 tid 29010 thread 52 bound to OS proc set {63}
OMP: pid 28860 tid 28992 thread 34 bound to OS proc set {41}
OMP: pid 28860 tid 29014 thread 56 bound to OS proc set {67}
OMP: pid 28860 tid 29021 thread 63 bound to OS proc set {76}
OMP: pid 28860 tid 28979 thread 21 bound to OS proc set {25}
OMP: pid 28860 tid 28994 thread 36 bound to OS proc set {43}
OMP: pid 28860 tid 28990 thread 32 bound to OS proc set {38}
OMP: pid 28860 tid 28986 thread 28 bound to OS proc set {33}
OMP: pid 28860 tid 28999 thread 41 bound to OS proc set {49}
OMP: pid 28860 tid 28983 thread 25 bound to OS proc set {30}
OMP: pid 28860 tid 29017 thread 59 bound to OS proc set {71}
OMP: pid 28860 tid 28991 thread 33 bound to OS proc set {40}
OMP: pid 28860 tid 29022 thread 64 bound to OS proc set {77}
OMP: pid 28860 tid 28995 thread 37 bound to OS proc set {44}
OMP: pid 28860 tid 28989 thread 31 bound to OS proc set {37}
OMP: pid 28860 tid 29034 thread 76 bound to OS proc set {92}
OMP: pid 28860 tid 29016 thread 58 bound to OS proc set {70}
OMP: pid 28860 tid 29035 thread 77 bound to OS proc set {93}
OMP: pid 28860 tid 28996 thread 38 bound to OS proc set {46}
OMP: pid 28860 tid 28981 thread 23 bound to OS proc set {27}
OMP: pid 28860 tid 29023 thread 65 bound to OS proc set {78}
OMP: pid 28860 tid 29018 thread 60 bound to OS proc set {72}
OMP: pid 28860 tid 29012 thread 54 bound to OS proc set {65}
OMP: pid 28860 tid 29015 thread 57 bound to OS proc set {69}
OMP: pid 28860 tid 29025 thread 67 bound to OS proc set {81}
OMP: pid 28860 tid 29013 thread 55 bound to OS proc set {66}
OMP: pid 28860 tid 29033 thread 75 bound to OS proc set {90}
OMP: pid 28860 tid 29028 thread 70 bound to OS proc set {84}
OMP: pid 28860 tid 29026 thread 68 bound to OS proc set {82}
OMP: pid 28860 tid 29029 thread 71 bound to OS proc set {86}
OMP: pid 28860 tid 29027 thread 69 bound to OS proc set {83}
OMP: pid 28860 tid 29030 thread 72 bound to OS proc set {87}
OMP: pid 28860 tid 29024 thread 66 bound to OS proc set {80}
OMP: pid 28860 tid 29031 thread 73 bound to OS proc set {88}
OMP: pid 28860 tid 29032 thread 74 bound to OS proc set {89}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 80, "n_threads_batch": 80, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 2.247592, "speed_pp": 911.197388, "t_tg": 0.000000, "speed_tg": nan, "t": 2.247592, "speed": 911.197388}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_12
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_12 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_12 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_12 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_12 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_12 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_12 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_12 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_12 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 29106 tid 29106 thread 0 bound to OS proc set {0}
OMP: pid 29106 tid 29207 thread 3 bound to OS proc set {3}
OMP: pid 29106 tid 29206 thread 2 bound to OS proc set {2}
OMP: pid 29106 tid 29212 thread 8 bound to OS proc set {8}
OMP: pid 29106 tid 29205 thread 1 bound to OS proc set {1}
OMP: pid 29106 tid 29208 thread 4 bound to OS proc set {4}
OMP: pid 29106 tid 29211 thread 7 bound to OS proc set {7}
OMP: pid 29106 tid 29213 thread 9 bound to OS proc set {9}
OMP: pid 29106 tid 29210 thread 6 bound to OS proc set {6}
OMP: pid 29106 tid 29218 thread 14 bound to OS proc set {15}
OMP: pid 29106 tid 29209 thread 5 bound to OS proc set {5}
OMP: pid 29106 tid 29215 thread 11 bound to OS proc set {12}
OMP: pid 29106 tid 29251 thread 47 bound to OS proc set {51}
OMP: pid 29106 tid 29267 thread 63 bound to OS proc set {69}
OMP: pid 29106 tid 29250 thread 46 bound to OS proc set {50}
OMP: pid 29106 tid 29248 thread 44 bound to OS proc set {48}
OMP: pid 29106 tid 29236 thread 32 bound to OS proc set {35}
OMP: pid 29106 tid 29219 thread 15 bound to OS proc set {16}
OMP: pid 29106 tid 29216 thread 12 bound to OS proc set {13}
OMP: pid 29106 tid 29217 thread 13 bound to OS proc set {14}
OMP: pid 29106 tid 29255 thread 51 bound to OS proc set {56}
OMP: pid 29106 tid 29280 thread 76 bound to OS proc set {83}
OMP: pid 29106 tid 29222 thread 18 bound to OS proc set {19}
OMP: pid 29106 tid 29247 thread 43 bound to OS proc set {47}
OMP: pid 29106 tid 29279 thread 75 bound to OS proc set {82}
OMP: pid 29106 tid 29235 thread 31 bound to OS proc set {34}
OMP: pid 29106 tid 29287 thread 83 bound to OS proc set {91}
OMP: pid 29106 tid 29243 thread 39 bound to OS proc set {42}
OMP: pid 29106 tid 29249 thread 45 bound to OS proc set {49}
OMP: pid 29106 tid 29232 thread 28 bound to OS proc set {30}
OMP: pid 29106 tid 29264 thread 60 bound to OS proc set {66}
OMP: pid 29106 tid 29260 thread 56 bound to OS proc set {61}
OMP: pid 29106 tid 29252 thread 48 bound to OS proc set {52}
OMP: pid 29106 tid 29240 thread 36 bound to OS proc set {39}
OMP: pid 29106 tid 29254 thread 50 bound to OS proc set {55}
OMP: pid 29106 tid 29242 thread 38 bound to OS proc set {41}
OMP: pid 29106 tid 29239 thread 35 bound to OS proc set {38}
OMP: pid 29106 tid 29265 thread 61 bound to OS proc set {67}
OMP: pid 29106 tid 29246 thread 42 bound to OS proc set {46}
OMP: pid 29106 tid 29214 thread 10 bound to OS proc set {11}
OMP: pid 29106 tid 29244 thread 40 bound to OS proc set {44}
OMP: pid 29106 tid 29234 thread 30 bound to OS proc set {33}
OMP: pid 29106 tid 29262 thread 58 bound to OS proc set {63}
OMP: pid 29106 tid 29237 thread 33 bound to OS proc set {36}
OMP: pid 29106 tid 29245 thread 41 bound to OS proc set {45}
OMP: pid 29106 tid 29261 thread 57 bound to OS proc set {62}
OMP: pid 29106 tid 29256 thread 52 bound to OS proc set {57}
OMP: pid 29106 tid 29220 thread 16 bound to OS proc set {17}
OMP: pid 29106 tid 29224 thread 20 bound to OS proc set {22}
OMP: pid 29106 tid 29231 thread 27 bound to OS proc set {29}
OMP: pid 29106 tid 29275 thread 71 bound to OS proc set {78}
OMP: pid 29106 tid 29268 thread 64 bound to OS proc set {70}
OMP: pid 29106 tid 29221 thread 17 bound to OS proc set {18}
OMP: pid 29106 tid 29269 thread 65 bound to OS proc set {71}
OMP: pid 29106 tid 29270 thread 66 bound to OS proc set {72}
OMP: pid 29106 tid 29228 thread 24 bound to OS proc set {26}
OMP: pid 29106 tid 29226 thread 22 bound to OS proc set {24}
OMP: pid 29106 tid 29223 thread 19 bound to OS proc set {20}
OMP: pid 29106 tid 29283 thread 79 bound to OS proc set {87}
OMP: pid 29106 tid 29259 thread 55 bound to OS proc set {60}
OMP: pid 29106 tid 29230 thread 26 bound to OS proc set {28}
OMP: pid 29106 tid 29263 thread 59 bound to OS proc set {65}
OMP: pid 29106 tid 29286 thread 82 bound to OS proc set {90}
OMP: pid 29106 tid 29258 thread 54 bound to OS proc set {59}
OMP: pid 29106 tid 29233 thread 29 bound to OS proc set {31}
OMP: pid 29106 tid 29241 thread 37 bound to OS proc set {40}
OMP: pid 29106 tid 29274 thread 70 bound to OS proc set {77}
OMP: pid 29106 tid 29266 thread 62 bound to OS proc set {68}
OMP: pid 29106 tid 29281 thread 77 bound to OS proc set {84}
OMP: pid 29106 tid 29238 thread 34 bound to OS proc set {37}
OMP: pid 29106 tid 29272 thread 68 bound to OS proc set {74}
OMP: pid 29106 tid 29229 thread 25 bound to OS proc set {27}
OMP: pid 29106 tid 29278 thread 74 bound to OS proc set {81}
OMP: pid 29106 tid 29253 thread 49 bound to OS proc set {54}
OMP: pid 29106 tid 29257 thread 53 bound to OS proc set {58}
OMP: pid 29106 tid 29227 thread 23 bound to OS proc set {25}
OMP: pid 29106 tid 29282 thread 78 bound to OS proc set {85}
OMP: pid 29106 tid 29288 thread 84 bound to OS proc set {92}
OMP: pid 29106 tid 29276 thread 72 bound to OS proc set {79}
OMP: pid 29106 tid 29273 thread 69 bound to OS proc set {76}
OMP: pid 29106 tid 29277 thread 73 bound to OS proc set {80}
OMP: pid 29106 tid 29225 thread 21 bound to OS proc set {23}
OMP: pid 29106 tid 29271 thread 67 bound to OS proc set {73}
OMP: pid 29106 tid 29290 thread 86 bound to OS proc set {94}
OMP: pid 29106 tid 29291 thread 87 bound to OS proc set {95}
OMP: pid 29106 tid 29284 thread 80 bound to OS proc set {88}
OMP: pid 29106 tid 29285 thread 81 bound to OS proc set {89}
OMP: pid 29106 tid 29289 thread 85 bound to OS proc set {93}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 88, "n_threads_batch": 88, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 2.098129, "speed_pp": 976.107727, "t_tg": 0.000000, "speed_tg": nan, "t": 2.098129, "speed": 976.107727}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_13
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_13 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_13 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_13 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_13 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_13 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_13 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_13 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_13 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 29311 tid 29311 thread 0 bound to OS proc set {0}
OMP: pid 29311 tid 29424 thread 15 bound to OS proc set {15}
OMP: pid 29311 tid 29412 thread 3 bound to OS proc set {3}
OMP: pid 29311 tid 29421 thread 12 bound to OS proc set {12}
OMP: pid 29311 tid 29423 thread 14 bound to OS proc set {14}
OMP: pid 29311 tid 29411 thread 2 bound to OS proc set {2}
OMP: pid 29311 tid 29420 thread 11 bound to OS proc set {11}
OMP: pid 29311 tid 29417 thread 8 bound to OS proc set {8}
OMP: pid 29311 tid 29422 thread 13 bound to OS proc set {13}
OMP: pid 29311 tid 29410 thread 1 bound to OS proc set {1}
OMP: pid 29311 tid 29416 thread 7 bound to OS proc set {7}
OMP: pid 29311 tid 29419 thread 10 bound to OS proc set {10}
OMP: pid 29311 tid 29428 thread 19 bound to OS proc set {19}
OMP: pid 29311 tid 29425 thread 16 bound to OS proc set {16}
OMP: pid 29311 tid 29415 thread 6 bound to OS proc set {6}
OMP: pid 29311 tid 29413 thread 4 bound to OS proc set {4}
OMP: pid 29311 tid 29427 thread 18 bound to OS proc set {18}
OMP: pid 29311 tid 29418 thread 9 bound to OS proc set {9}
OMP: pid 29311 tid 29433 thread 24 bound to OS proc set {24}
OMP: pid 29311 tid 29426 thread 17 bound to OS proc set {17}
OMP: pid 29311 tid 29429 thread 20 bound to OS proc set {20}
OMP: pid 29311 tid 29432 thread 23 bound to OS proc set {23}
OMP: pid 29311 tid 29459 thread 50 bound to OS proc set {50}
OMP: pid 29311 tid 29431 thread 22 bound to OS proc set {22}
OMP: pid 29311 tid 29437 thread 28 bound to OS proc set {28}
OMP: pid 29311 tid 29430 thread 21 bound to OS proc set {21}
OMP: pid 29311 tid 29472 thread 63 bound to OS proc set {63}
OMP: pid 29311 tid 29441 thread 32 bound to OS proc set {32}
OMP: pid 29311 tid 29476 thread 67 bound to OS proc set {67}
OMP: pid 29311 tid 29487 thread 78 bound to OS proc set {78}
OMP: pid 29311 tid 29443 thread 34 bound to OS proc set {34}
OMP: pid 29311 tid 29469 thread 60 bound to OS proc set {60}
OMP: pid 29311 tid 29439 thread 30 bound to OS proc set {30}
OMP: pid 29311 tid 29470 thread 61 bound to OS proc set {61}
OMP: pid 29311 tid 29460 thread 51 bound to OS proc set {51}
OMP: pid 29311 tid 29471 thread 62 bound to OS proc set {62}
OMP: pid 29311 tid 29457 thread 48 bound to OS proc set {48}
OMP: pid 29311 tid 29475 thread 66 bound to OS proc set {66}
OMP: pid 29311 tid 29461 thread 52 bound to OS proc set {52}
OMP: pid 29311 tid 29467 thread 58 bound to OS proc set {58}
OMP: pid 29311 tid 29453 thread 44 bound to OS proc set {44}
OMP: pid 29311 tid 29488 thread 79 bound to OS proc set {79}
OMP: pid 29311 tid 29449 thread 40 bound to OS proc set {40}
OMP: pid 29311 tid 29440 thread 31 bound to OS proc set {31}
OMP: pid 29311 tid 29468 thread 59 bound to OS proc set {59}
OMP: pid 29311 tid 29447 thread 38 bound to OS proc set {38}
OMP: pid 29311 tid 29458 thread 49 bound to OS proc set {49}
OMP: pid 29311 tid 29435 thread 26 bound to OS proc set {26}
OMP: pid 29311 tid 29465 thread 56 bound to OS proc set {56}
OMP: pid 29311 tid 29444 thread 35 bound to OS proc set {35}
OMP: pid 29311 tid 29456 thread 47 bound to OS proc set {47}
OMP: pid 29311 tid 29455 thread 46 bound to OS proc set {46}
OMP: pid 29311 tid 29486 thread 77 bound to OS proc set {77}
OMP: pid 29311 tid 29436 thread 27 bound to OS proc set {27}
OMP: pid 29311 tid 29483 thread 74 bound to OS proc set {74}
OMP: pid 29311 tid 29451 thread 42 bound to OS proc set {42}
OMP: pid 29311 tid 29484 thread 75 bound to OS proc set {75}
OMP: pid 29311 tid 29504 thread 95 bound to OS proc set {95}
OMP: pid 29311 tid 29445 thread 36 bound to OS proc set {36}
OMP: pid 29311 tid 29434 thread 25 bound to OS proc set {25}
OMP: pid 29311 tid 29491 thread 82 bound to OS proc set {82}
OMP: pid 29311 tid 29448 thread 39 bound to OS proc set {39}
OMP: pid 29311 tid 29452 thread 43 bound to OS proc set {43}
OMP: pid 29311 tid 29446 thread 37 bound to OS proc set {37}
OMP: pid 29311 tid 29450 thread 41 bound to OS proc set {41}
OMP: pid 29311 tid 29466 thread 57 bound to OS proc set {57}
OMP: pid 29311 tid 29485 thread 76 bound to OS proc set {76}
OMP: pid 29311 tid 29503 thread 94 bound to OS proc set {94}
OMP: pid 29311 tid 29464 thread 55 bound to OS proc set {55}
OMP: pid 29311 tid 29473 thread 64 bound to OS proc set {64}
OMP: pid 29311 tid 29463 thread 54 bound to OS proc set {54}
OMP: pid 29311 tid 29482 thread 73 bound to OS proc set {73}
OMP: pid 29311 tid 29477 thread 68 bound to OS proc set {68}
OMP: pid 29311 tid 29479 thread 70 bound to OS proc set {70}
OMP: pid 29311 tid 29480 thread 71 bound to OS proc set {71}
OMP: pid 29311 tid 29481 thread 72 bound to OS proc set {72}
OMP: pid 29311 tid 29490 thread 81 bound to OS proc set {81}
OMP: pid 29311 tid 29492 thread 83 bound to OS proc set {83}
OMP: pid 29311 tid 29499 thread 90 bound to OS proc set {90}
OMP: pid 29311 tid 29500 thread 91 bound to OS proc set {91}
OMP: pid 29311 tid 29414 thread 5 bound to OS proc set {5}
OMP: pid 29311 tid 29489 thread 80 bound to OS proc set {80}
OMP: pid 29311 tid 29496 thread 87 bound to OS proc set {87}
OMP: pid 29311 tid 29501 thread 92 bound to OS proc set {92}
OMP: pid 29311 tid 29502 thread 93 bound to OS proc set {93}
OMP: pid 29311 tid 29493 thread 84 bound to OS proc set {84}
OMP: pid 29311 tid 29498 thread 89 bound to OS proc set {89}
OMP: pid 29311 tid 29495 thread 86 bound to OS proc set {86}
OMP: pid 29311 tid 29474 thread 65 bound to OS proc set {65}
OMP: pid 29311 tid 29497 thread 88 bound to OS proc set {88}
OMP: pid 29311 tid 29442 thread 33 bound to OS proc set {33}
OMP: pid 29311 tid 29478 thread 69 bound to OS proc set {69}
OMP: pid 29311 tid 29494 thread 85 bound to OS proc set {85}
OMP: pid 29311 tid 29462 thread 53 bound to OS proc set {53}
OMP: pid 29311 tid 29438 thread 29 bound to OS proc set {29}
OMP: pid 29311 tid 29454 thread 45 bound to OS proc set {45}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 96, "n_threads_batch": 96, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 2.002769, "speed_pp": 1022.584229, "t_tg": 0.000000, "speed_tg": nan, "t": 2.002769, "speed": 1022.584229}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_14
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_14 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_14 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_14 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_14 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_14 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_14 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_14 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-398-1667/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_12-04-19/tools/lprof_npsu_run_14 #
#########################################################################################################################################################################################################################################