Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 7.840710, "speed_pp": 16.325052, "t_tg": 0.000000, "speed_tg": nan, "t": 7.840710, "speed": 16.325052}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_0

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_0  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 855055 tid 855055 thread 0 bound to OS proc set {0}
OMP: pid 855055 tid 855154 thread 1 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 3.908022, "speed_pp": 32.753143, "t_tg": 0.000000, "speed_tg": nan, "t": 3.908022, "speed": 32.753143}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_1

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_1  #
########################################################################################################################################################################################################################################
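
The JSON result lines printed by each run can be compared to estimate how throughput scales with thread count. The following is a minimal sketch, not part of MAQAO or llama.cpp: the helper names `parse_results` and `scaling` are hypothetical, and `speed_pp` is assumed to be the prompt-processing throughput. The result lines contain a bare `nan`, which Python's `json` module rejects, so the sketch maps it to `null` before parsing.

```python
import json
import re

# Sample log text: the two result lines from the 1-thread and 2-thread runs above,
# plus one OMP binding line to show that non-result lines are skipped.
LOG = r"""
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 7.840710, "speed_pp": 16.325052, "t_tg": 0.000000, "speed_tg": nan, "t": 7.840710, "speed": 16.325052}
OMP: pid 855055 tid 855055 thread 0 bound to OS proc set {0}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 3.908022, "speed_pp": 32.753143, "t_tg": 0.000000, "speed_tg": nan, "t": 3.908022, "speed": 32.753143}
"""


def parse_results(log_text):
    """Return one dict per JSON result line found in the log text."""
    results = []
    for line in log_text.splitlines():
        line = line.strip()
        # Result lines are the only ones that both start with '{' and end with '}'.
        if not (line.startswith("{") and line.endswith("}")):
            continue
        # A bare `nan` is not valid JSON; replace it with null before parsing.
        line = re.sub(r":\s*nan\b", ": null", line)
        results.append(json.loads(line))
    return results


def scaling(results):
    """Return (n_threads, speedup, efficiency) tuples relative to the smallest run."""
    ordered = sorted(results, key=lambda r: r["n_threads"])
    base = ordered[0]
    rows = []
    for r in ordered:
        speedup = r["speed_pp"] / base["speed_pp"]
        # Efficiency is the speedup divided by the relative increase in threads.
        efficiency = speedup * base["n_threads"] / r["n_threads"]
        rows.append((r["n_threads"], speedup, efficiency))
    return rows


if __name__ == "__main__":
    for n_threads, speedup, efficiency in scaling(parse_results(LOG)):
        print(f"{n_threads:>3} threads  speedup {speedup:.2f}  efficiency {efficiency:.2f}")
```

The same functions can be applied to the full log by passing its text to `parse_results`; an efficiency close to 1 indicates near-linear scaling for that thread count.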


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 855174 tid 855174 thread 0 bound to OS proc set {0}
OMP: pid 855174 tid 855274 thread 2 bound to OS proc set {48}
OMP: pid 855174 tid 855273 thread 1 bound to OS proc set {24}
OMP: pid 855174 tid 855275 thread 3 bound to OS proc set {72}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 1.978064, "speed_pp": 64.709740, "t_tg": 0.000000, "speed_tg": nan, "t": 1.978064, "speed": 64.709740}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_2

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_2  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 855298 tid 855298 thread 0 bound to OS proc set {0}
OMP: pid 855298 tid 855399 thread 3 bound to OS proc set {36}
OMP: pid 855298 tid 855398 thread 2 bound to OS proc set {24}
OMP: pid 855298 tid 855400 thread 4 bound to OS proc set {48}
OMP: pid 855298 tid 855397 thread 1 bound to OS proc set {12}
OMP: pid 855298 tid 855402 thread 6 bound to OS proc set {72}
OMP: pid 855298 tid 855401 thread 5 bound to OS proc set {60}
OMP: pid 855298 tid 855403 thread 7 bound to OS proc set {84}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.991181, "speed_pp": 129.138870, "t_tg": 0.000000, "speed_tg": nan, "t": 0.991181, "speed": 129.138870}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_3

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_3  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 855472 tid 855472 thread 0 bound to OS proc set {0}
OMP: pid 855472 tid 855572 thread 2 bound to OS proc set {12}
OMP: pid 855472 tid 855573 thread 3 bound to OS proc set {18}
OMP: pid 855472 tid 855574 thread 4 bound to OS proc set {24}
OMP: pid 855472 tid 855571 thread 1 bound to OS proc set {6}
OMP: pid 855472 tid 855578 thread 8 bound to OS proc set {48}
OMP: pid 855472 tid 855584 thread 14 bound to OS proc set {84}
OMP: pid 855472 tid 855582 thread 12 bound to OS proc set {72}
OMP: pid 855472 tid 855581 thread 11 bound to OS proc set {66}
OMP: pid 855472 tid 855580 thread 10 bound to OS proc set {60}
OMP: pid 855472 tid 855577 thread 7 bound to OS proc set {42}
OMP: pid 855472 tid 855575 thread 5 bound to OS proc set {30}
OMP: pid 855472 tid 855583 thread 13 bound to OS proc set {78}
OMP: pid 855472 tid 855576 thread 6 bound to OS proc set {36}
OMP: pid 855472 tid 855579 thread 9 bound to OS proc set {54}
OMP: pid 855472 tid 855585 thread 15 bound to OS proc set {90}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.509760, "speed_pp": 251.098541, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 0.509761, "speed": 251.098053}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_4

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_4  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 855605 tid 855605 thread 0 bound to OS proc set {0}
OMP: pid 855605 tid 855708 thread 3 bound to OS proc set {12}
OMP: pid 855605 tid 855707 thread 2 bound to OS proc set {8}
OMP: pid 855605 tid 855720 thread 15 bound to OS proc set {60}
OMP: pid 855605 tid 855716 thread 11 bound to OS proc set {44}
OMP: pid 855605 tid 855719 thread 14 bound to OS proc set {56}
OMP: pid 855605 tid 855706 thread 1 bound to OS proc set {4}
OMP: pid 855605 tid 855717 thread 12 bound to OS proc set {48}
OMP: pid 855605 tid 855710 thread 5 bound to OS proc set {20}
OMP: pid 855605 tid 855709 thread 4 bound to OS proc set {16}
OMP: pid 855605 tid 855714 thread 9 bound to OS proc set {36}
OMP: pid 855605 tid 855724 thread 19 bound to OS proc set {76}
OMP: pid 855605 tid 855713 thread 8 bound to OS proc set {32}
OMP: pid 855605 tid 855721 thread 16 bound to OS proc set {64}
OMP: pid 855605 tid 855725 thread 20 bound to OS proc set {80}
OMP: pid 855605 tid 855723 thread 18 bound to OS proc set {72}
OMP: pid 855605 tid 855711 thread 6 bound to OS proc set {24}
OMP: pid 855605 tid 855718 thread 13 bound to OS proc set {52}
OMP: pid 855605 tid 855715 thread 10 bound to OS proc set {40}
OMP: pid 855605 tid 855712 thread 7 bound to OS proc set {28}
OMP: pid 855605 tid 855722 thread 17 bound to OS proc set {68}
OMP: pid 855605 tid 855726 thread 21 bound to OS proc set {84}
OMP: pid 855605 tid 855727 thread 22 bound to OS proc set {88}
OMP: pid 855605 tid 855728 thread 23 bound to OS proc set {92}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.377279, "speed_pp": 339.271454, "t_tg": 0.000000, "speed_tg": nan, "t": 0.377279, "speed": 339.271454}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_5

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_5  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 855748 tid 855748 thread 0 bound to OS proc set {0}
OMP: pid 855748 tid 855850 thread 4 bound to OS proc set {12}
OMP: pid 855748 tid 855858 thread 12 bound to OS proc set {36}
OMP: pid 855748 tid 855849 thread 3 bound to OS proc set {9}
OMP: pid 855748 tid 855853 thread 7 bound to OS proc set {21}
OMP: pid 855748 tid 855851 thread 5 bound to OS proc set {15}
OMP: pid 855748 tid 855860 thread 14 bound to OS proc set {42}
OMP: pid 855748 tid 855852 thread 6 bound to OS proc set {18}
OMP: pid 855748 tid 855848 thread 2 bound to OS proc set {6}
OMP: pid 855748 tid 855861 thread 15 bound to OS proc set {45}
OMP: pid 855748 tid 855854 thread 8 bound to OS proc set {24}
OMP: pid 855748 tid 855857 thread 11 bound to OS proc set {33}
OMP: pid 855748 tid 855847 thread 1 bound to OS proc set {3}
OMP: pid 855748 tid 855859 thread 13 bound to OS proc set {39}
OMP: pid 855748 tid 855874 thread 28 bound to OS proc set {84}
OMP: pid 855748 tid 855856 thread 10 bound to OS proc set {30}
OMP: pid 855748 tid 855864 thread 18 bound to OS proc set {54}
OMP: pid 855748 tid 855862 thread 16 bound to OS proc set {48}
OMP: pid 855748 tid 855876 thread 30 bound to OS proc set {90}
OMP: pid 855748 tid 855870 thread 24 bound to OS proc set {72}
OMP: pid 855748 tid 855872 thread 26 bound to OS proc set {78}
OMP: pid 855748 tid 855875 thread 29 bound to OS proc set {87}
OMP: pid 855748 tid 855873 thread 27 bound to OS proc set {81}
OMP: pid 855748 tid 855877 thread 31 bound to OS proc set {93}
OMP: pid 855748 tid 855865 thread 19 bound to OS proc set {57}
OMP: pid 855748 tid 855863 thread 17 bound to OS proc set {51}
OMP: pid 855748 tid 855869 thread 23 bound to OS proc set {69}
OMP: pid 855748 tid 855871 thread 25 bound to OS proc set {75}
OMP: pid 855748 tid 855866 thread 20 bound to OS proc set {60}
OMP: pid 855748 tid 855868 thread 22 bound to OS proc set {66}
OMP: pid 855748 tid 855855 thread 9 bound to OS proc set {27}
OMP: pid 855748 tid 855867 thread 21 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.303041, "speed_pp": 422.385071, "t_tg": 0.000000, "speed_tg": nan, "t": 0.303041, "speed": 422.385071}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_6

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_6  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 855897 tid 855897 thread 0 bound to OS proc set {0}
OMP: pid 855897 tid 856009 thread 14 bound to OS proc set {33}
OMP: pid 855897 tid 855998 thread 3 bound to OS proc set {7}
OMP: pid 855897 tid 856027 thread 32 bound to OS proc set {77}
OMP: pid 855897 tid 855997 thread 2 bound to OS proc set {4}
OMP: pid 855897 tid 856002 thread 7 bound to OS proc set {16}
OMP: pid 855897 tid 856000 thread 5 bound to OS proc set {12}
OMP: pid 855897 tid 856029 thread 34 bound to OS proc set {82}
OMP: pid 855897 tid 855996 thread 1 bound to OS proc set {2}
OMP: pid 855897 tid 856028 thread 33 bound to OS proc set {80}
OMP: pid 855897 tid 856031 thread 36 bound to OS proc set {87}
OMP: pid 855897 tid 856033 thread 38 bound to OS proc set {92}
OMP: pid 855897 tid 856001 thread 6 bound to OS proc set {14}
OMP: pid 855897 tid 856026 thread 31 bound to OS proc set {75}
OMP: pid 855897 tid 856010 thread 15 bound to OS proc set {36}
OMP: pid 855897 tid 855999 thread 4 bound to OS proc set {9}
OMP: pid 855897 tid 856003 thread 8 bound to OS proc set {19}
OMP: pid 855897 tid 856008 thread 13 bound to OS proc set {31}
OMP: pid 855897 tid 856005 thread 10 bound to OS proc set {24}
OMP: pid 855897 tid 856007 thread 12 bound to OS proc set {29}
OMP: pid 855897 tid 856011 thread 16 bound to OS proc set {38}
OMP: pid 855897 tid 856030 thread 35 bound to OS proc set {84}
OMP: pid 855897 tid 856013 thread 18 bound to OS proc set {43}
OMP: pid 855897 tid 856006 thread 11 bound to OS proc set {26}
OMP: pid 855897 tid 856022 thread 27 bound to OS proc set {65}
OMP: pid 855897 tid 856025 thread 30 bound to OS proc set {72}
OMP: pid 855897 tid 856019 thread 24 bound to OS proc set {58}
OMP: pid 855897 tid 856012 thread 17 bound to OS proc set {41}
OMP: pid 855897 tid 856024 thread 29 bound to OS proc set {70}
OMP: pid 855897 tid 856023 thread 28 bound to OS proc set {67}
OMP: pid 855897 tid 856021 thread 26 bound to OS proc set {63}
OMP: pid 855897 tid 856004 thread 9 bound to OS proc set {21}
OMP: pid 855897 tid 856032 thread 37 bound to OS proc set {89}
OMP: pid 855897 tid 856020 thread 25 bound to OS proc set {60}
OMP: pid 855897 tid 856014 thread 19 bound to OS proc set {46}
OMP: pid 855897 tid 856017 thread 22 bound to OS proc set {53}
OMP: pid 855897 tid 856018 thread 23 bound to OS proc set {55}
OMP: pid 855897 tid 856015 thread 20 bound to OS proc set {48}
OMP: pid 855897 tid 856016 thread 21 bound to OS proc set {50}
OMP: pid 855897 tid 856034 thread 39 bound to OS proc set {94}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.258599, "speed_pp": 494.974823, "t_tg": 0.000000, "speed_tg": nan, "t": 0.258599, "speed": 494.974823}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_7

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_7  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 856054 tid 856054 thread 0 bound to OS proc set {0}
OMP: pid 856054 tid 856167 thread 15 bound to OS proc set {30}
OMP: pid 856054 tid 856163 thread 11 bound to OS proc set {22}
OMP: pid 856054 tid 856159 thread 7 bound to OS proc set {14}
OMP: pid 856054 tid 856154 thread 2 bound to OS proc set {4}
OMP: pid 856054 tid 856155 thread 3 bound to OS proc set {6}
OMP: pid 856054 tid 856153 thread 1 bound to OS proc set {2}
OMP: pid 856054 tid 856156 thread 4 bound to OS proc set {8}
OMP: pid 856054 tid 856168 thread 16 bound to OS proc set {32}
OMP: pid 856054 tid 856161 thread 9 bound to OS proc set {18}
OMP: pid 856054 tid 856160 thread 8 bound to OS proc set {16}
OMP: pid 856054 tid 856158 thread 6 bound to OS proc set {12}
OMP: pid 856054 tid 856196 thread 44 bound to OS proc set {88}
OMP: pid 856054 tid 856170 thread 18 bound to OS proc set {36}
OMP: pid 856054 tid 856164 thread 12 bound to OS proc set {24}
OMP: pid 856054 tid 856187 thread 35 bound to OS proc set {70}
OMP: pid 856054 tid 856176 thread 24 bound to OS proc set {48}
OMP: pid 856054 tid 856157 thread 5 bound to OS proc set {10}
OMP: pid 856054 tid 856184 thread 32 bound to OS proc set {64}
OMP: pid 856054 tid 856199 thread 47 bound to OS proc set {94}
OMP: pid 856054 tid 856175 thread 23 bound to OS proc set {46}
OMP: pid 856054 tid 856186 thread 34 bound to OS proc set {68}
OMP: pid 856054 tid 856183 thread 31 bound to OS proc set {62}
OMP: pid 856054 tid 856162 thread 10 bound to OS proc set {20}
OMP: pid 856054 tid 856192 thread 40 bound to OS proc set {80}
OMP: pid 856054 tid 856195 thread 43 bound to OS proc set {86}
OMP: pid 856054 tid 856166 thread 14 bound to OS proc set {28}
OMP: pid 856054 tid 856169 thread 17 bound to OS proc set {34}
OMP: pid 856054 tid 856182 thread 30 bound to OS proc set {60}
OMP: pid 856054 tid 856198 thread 46 bound to OS proc set {92}
OMP: pid 856054 tid 856172 thread 20 bound to OS proc set {40}
OMP: pid 856054 tid 856197 thread 45 bound to OS proc set {90}
OMP: pid 856054 tid 856171 thread 19 bound to OS proc set {38}
OMP: pid 856054 tid 856165 thread 13 bound to OS proc set {26}
OMP: pid 856054 tid 856179 thread 27 bound to OS proc set {54}
OMP: pid 856054 tid 856191 thread 39 bound to OS proc set {78}
OMP: pid 856054 tid 856194 thread 42 bound to OS proc set {84}
OMP: pid 856054 tid 856180 thread 28 bound to OS proc set {56}
OMP: pid 856054 tid 856177 thread 25 bound to OS proc set {50}
OMP: pid 856054 tid 856178 thread 26 bound to OS proc set {52}
OMP: pid 856054 tid 856185 thread 33 bound to OS proc set {66}
OMP: pid 856054 tid 856190 thread 38 bound to OS proc set {76}
OMP: pid 856054 tid 856174 thread 22 bound to OS proc set {44}
OMP: pid 856054 tid 856188 thread 36 bound to OS proc set {72}
OMP: pid 856054 tid 856181 thread 29 bound to OS proc set {58}
OMP: pid 856054 tid 856189 thread 37 bound to OS proc set {74}
OMP: pid 856054 tid 856193 thread 41 bound to OS proc set {82}
OMP: pid 856054 tid 856173 thread 21 bound to OS proc set {42}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.229819, "speed_pp": 556.960022, "t_tg": 0.000000, "speed_tg": nan, "t": 0.229819, "speed": 556.960022}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_8

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_8  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 856219 tid 856219 thread 0 bound to OS proc set {0}
OMP: pid 856219 tid 856318 thread 1 bound to OS proc set {1}
OMP: pid 856219 tid 856319 thread 2 bound to OS proc set {3}
OMP: pid 856219 tid 856325 thread 8 bound to OS proc set {13}
OMP: pid 856219 tid 856324 thread 7 bound to OS proc set {12}
OMP: pid 856219 tid 856326 thread 9 bound to OS proc set {15}
OMP: pid 856219 tid 856328 thread 11 bound to OS proc set {19}
OMP: pid 856219 tid 856336 thread 19 bound to OS proc set {32}
OMP: pid 856219 tid 856329 thread 12 bound to OS proc set {20}
OMP: pid 856219 tid 856321 thread 4 bound to OS proc set {6}
OMP: pid 856219 tid 856344 thread 27 bound to OS proc set {46}
OMP: pid 856219 tid 856372 thread 55 bound to OS proc set {95}
OMP: pid 856219 tid 856368 thread 51 bound to OS proc set {88}
OMP: pid 856219 tid 856352 thread 35 bound to OS proc set {60}
OMP: pid 856219 tid 856371 thread 54 bound to OS proc set {93}
OMP: pid 856219 tid 856349 thread 32 bound to OS proc set {55}
OMP: pid 856219 tid 856369 thread 52 bound to OS proc set {90}
OMP: pid 856219 tid 856331 thread 14 bound to OS proc set {24}
OMP: pid 856219 tid 856323 thread 6 bound to OS proc set {10}
OMP: pid 856219 tid 856327 thread 10 bound to OS proc set {17}
OMP: pid 856219 tid 856347 thread 30 bound to OS proc set {51}
OMP: pid 856219 tid 856320 thread 3 bound to OS proc set {5}
OMP: pid 856219 tid 856343 thread 26 bound to OS proc set {45}
OMP: pid 856219 tid 856341 thread 24 bound to OS proc set {41}
OMP: pid 856219 tid 856348 thread 31 bound to OS proc set {53}
OMP: pid 856219 tid 856337 thread 20 bound to OS proc set {34}
OMP: pid 856219 tid 856351 thread 34 bound to OS proc set {58}
OMP: pid 856219 tid 856350 thread 33 bound to OS proc set {57}
OMP: pid 856219 tid 856335 thread 18 bound to OS proc set {31}
OMP: pid 856219 tid 856364 thread 47 bound to OS proc set {81}
OMP: pid 856219 tid 856342 thread 25 bound to OS proc set {43}
OMP: pid 856219 tid 856340 thread 23 bound to OS proc set {39}
OMP: pid 856219 tid 856333 thread 16 bound to OS proc set {27}
OMP: pid 856219 tid 856357 thread 40 bound to OS proc set {69}
OMP: pid 856219 tid 856332 thread 15 bound to OS proc set {25}
OMP: pid 856219 tid 856345 thread 28 bound to OS proc set {48}
OMP: pid 856219 tid 856322 thread 5 bound to OS proc set {8}
OMP: pid 856219 tid 856346 thread 29 bound to OS proc set {50}
OMP: pid 856219 tid 856360 thread 43 bound to OS proc set {74}
OMP: pid 856219 tid 856361 thread 44 bound to OS proc set {76}
OMP: pid 856219 tid 856334 thread 17 bound to OS proc set {29}
OMP: pid 856219 tid 856355 thread 38 bound to OS proc set {65}
OMP: pid 856219 tid 856370 thread 53 bound to OS proc set {91}
OMP: pid 856219 tid 856330 thread 13 bound to OS proc set {22}
OMP: pid 856219 tid 856339 thread 22 bound to OS proc set {38}
OMP: pid 856219 tid 856353 thread 36 bound to OS proc set {62}
OMP: pid 856219 tid 856359 thread 42 bound to OS proc set {72}
OMP: pid 856219 tid 856354 thread 37 bound to OS proc set {64}
OMP: pid 856219 tid 856367 thread 50 bound to OS proc set {86}
OMP: pid 856219 tid 856356 thread 39 bound to OS proc set {67}
OMP: pid 856219 tid 856363 thread 46 bound to OS proc set {79}
OMP: pid 856219 tid 856366 thread 49 bound to OS proc set {84}
OMP: pid 856219 tid 856365 thread 48 bound to OS proc set {83}
OMP: pid 856219 tid 856362 thread 45 bound to OS proc set {77}
OMP: pid 856219 tid 856358 thread 41 bound to OS proc set {71}
OMP: pid 856219 tid 856338 thread 21 bound to OS proc set {36}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.207445, "speed_pp": 617.031006, "t_tg": 0.000000, "speed_tg": nan, "t": 0.207445, "speed": 617.031006}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_9

To display your profiling results:
########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                #
########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_9  #
########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 856441 tid 856441 thread 0 bound to OS proc set {0}
OMP: pid 856441 tid 856540 thread 1 bound to OS proc set {1}
OMP: pid 856441 tid 856541 thread 2 bound to OS proc set {3}
OMP: pid 856441 tid 856547 thread 8 bound to OS proc set {12}
OMP: pid 856441 tid 856554 thread 15 bound to OS proc set {22}
OMP: pid 856441 tid 856542 thread 3 bound to OS proc set {4}
OMP: pid 856441 tid 856590 thread 51 bound to OS proc set {77}
OMP: pid 856441 tid 856553 thread 14 bound to OS proc set {21}
OMP: pid 856441 tid 856602 thread 63 bound to OS proc set {95}
OMP: pid 856441 tid 856546 thread 7 bound to OS proc set {10}
OMP: pid 856441 tid 856599 thread 60 bound to OS proc set {90}
OMP: pid 856441 tid 856548 thread 9 bound to OS proc set {13}
OMP: pid 856441 tid 856589 thread 50 bound to OS proc set {75}
OMP: pid 856441 tid 856552 thread 13 bound to OS proc set {19}
OMP: pid 856441 tid 856558 thread 19 bound to OS proc set {28}
OMP: pid 856441 tid 856551 thread 12 bound to OS proc set {18}
OMP: pid 856441 tid 856543 thread 4 bound to OS proc set {6}
OMP: pid 856441 tid 856587 thread 48 bound to OS proc set {72}
OMP: pid 856441 tid 856569 thread 30 bound to OS proc set {45}
OMP: pid 856441 tid 856571 thread 32 bound to OS proc set {48}
OMP: pid 856441 tid 856557 thread 18 bound to OS proc set {27}
OMP: pid 856441 tid 856550 thread 11 bound to OS proc set {16}
OMP: pid 856441 tid 856544 thread 5 bound to OS proc set {7}
OMP: pid 856441 tid 856601 thread 62 bound to OS proc set {93}
OMP: pid 856441 tid 856570 thread 31 bound to OS proc set {46}
OMP: pid 856441 tid 856574 thread 35 bound to OS proc set {53}
OMP: pid 856441 tid 856573 thread 34 bound to OS proc set {51}
OMP: pid 856441 tid 856549 thread 10 bound to OS proc set {15}
OMP: pid 856441 tid 856555 thread 16 bound to OS proc set {24}
OMP: pid 856441 tid 856562 thread 23 bound to OS proc set {34}
OMP: pid 856441 tid 856572 thread 33 bound to OS proc set {50}
OMP: pid 856441 tid 856567 thread 28 bound to OS proc set {42}
OMP: pid 856441 tid 856598 thread 59 bound to OS proc set {89}
OMP: pid 856441 tid 856600 thread 61 bound to OS proc set {92}
OMP: pid 856441 tid 856586 thread 47 bound to OS proc set {71}
OMP: pid 856441 tid 856545 thread 6 bound to OS proc set {9}
OMP: pid 856441 tid 856588 thread 49 bound to OS proc set {74}
OMP: pid 856441 tid 856568 thread 29 bound to OS proc set {43}
OMP: pid 856441 tid 856595 thread 56 bound to OS proc set {84}
OMP: pid 856441 tid 856578 thread 39 bound to OS proc set {59}
OMP: pid 856441 tid 856563 thread 24 bound to OS proc set {36}
OMP: pid 856441 tid 856556 thread 17 bound to OS proc set {25}
OMP: pid 856441 tid 856579 thread 40 bound to OS proc set {60}
OMP: pid 856441 tid 856566 thread 27 bound to OS proc set {40}
OMP: pid 856441 tid 856594 thread 55 bound to OS proc set {83}
OMP: pid 856441 tid 856575 thread 36 bound to OS proc set {54}
OMP: pid 856441 tid 856577 thread 38 bound to OS proc set {57}
OMP: pid 856441 tid 856583 thread 44 bound to OS proc set {66}
OMP: pid 856441 tid 856591 thread 52 bound to OS proc set {78}
OMP: pid 856441 tid 856580 thread 41 bound to OS proc set {62}
OMP: pid 856441 tid 856585 thread 46 bound to OS proc set {69}
OMP: pid 856441 tid 856582 thread 43 bound to OS proc set {65}
OMP: pid 856441 tid 856581 thread 42 bound to OS proc set {63}
OMP: pid 856441 tid 856565 thread 26 bound to OS proc set {39}
OMP: pid 856441 tid 856559 thread 20 bound to OS proc set {30}
OMP: pid 856441 tid 856561 thread 22 bound to OS proc set {33}
OMP: pid 856441 tid 856593 thread 54 bound to OS proc set {81}
OMP: pid 856441 tid 856560 thread 21 bound to OS proc set {31}
OMP: pid 856441 tid 856576 thread 37 bound to OS proc set {56}
OMP: pid 856441 tid 856584 thread 45 bound to OS proc set {68}
OMP: pid 856441 tid 856597 thread 58 bound to OS proc set {87}
OMP: pid 856441 tid 856596 thread 57 bound to OS proc set {86}
OMP: pid 856441 tid 856592 thread 53 bound to OS proc set {80}
OMP: pid 856441 tid 856564 thread 25 bound to OS proc set {37}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.194744, "speed_pp": 657.273132, "t_tg": 0.000000, "speed_tg": nan, "t": 0.194744, "speed": 657.273132}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_10

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_10  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 856622 tid 856622 thread 0 bound to OS proc set {0}
OMP: pid 856622 tid 856722 thread 2 bound to OS proc set {2}
OMP: pid 856622 tid 856721 thread 1 bound to OS proc set {1}
OMP: pid 856622 tid 856732 thread 12 bound to OS proc set {16}
OMP: pid 856622 tid 856786 thread 66 bound to OS proc set {88}
OMP: pid 856622 tid 856784 thread 64 bound to OS proc set {86}
OMP: pid 856622 tid 856771 thread 51 bound to OS proc set {68}
OMP: pid 856622 tid 856791 thread 71 bound to OS proc set {95}
OMP: pid 856622 tid 856733 thread 13 bound to OS proc set {17}
OMP: pid 856622 tid 856787 thread 67 bound to OS proc set {90}
OMP: pid 856622 tid 856770 thread 50 bound to OS proc set {67}
OMP: pid 856622 tid 856735 thread 15 bound to OS proc set {20}
OMP: pid 856622 tid 856731 thread 11 bound to OS proc set {14}
OMP: pid 856622 tid 856755 thread 35 bound to OS proc set {47}
OMP: pid 856622 tid 856754 thread 34 bound to OS proc set {45}
OMP: pid 856622 tid 856727 thread 7 bound to OS proc set {9}
OMP: pid 856622 tid 856788 thread 68 bound to OS proc set {91}
OMP: pid 856622 tid 856747 thread 27 bound to OS proc set {36}
OMP: pid 856622 tid 856723 thread 3 bound to OS proc set {4}
OMP: pid 856622 tid 856783 thread 63 bound to OS proc set {84}
OMP: pid 856622 tid 856767 thread 47 bound to OS proc set {63}
OMP: pid 856622 tid 856728 thread 8 bound to OS proc set {10}
OMP: pid 856622 tid 856730 thread 10 bound to OS proc set {13}
OMP: pid 856622 tid 856790 thread 70 bound to OS proc set {94}
OMP: pid 856622 tid 856726 thread 6 bound to OS proc set {8}
OMP: pid 856622 tid 856725 thread 5 bound to OS proc set {6}
OMP: pid 856622 tid 856775 thread 55 bound to OS proc set {74}
OMP: pid 856622 tid 856729 thread 9 bound to OS proc set {12}
OMP: pid 856622 tid 856751 thread 31 bound to OS proc set {41}
OMP: pid 856622 tid 856724 thread 4 bound to OS proc set {5}
OMP: pid 856622 tid 856738 thread 18 bound to OS proc set {24}
OMP: pid 856622 tid 856785 thread 65 bound to OS proc set {87}
OMP: pid 856622 tid 856745 thread 25 bound to OS proc set {33}
OMP: pid 856622 tid 856739 thread 19 bound to OS proc set {25}
OMP: pid 856622 tid 856736 thread 16 bound to OS proc set {21}
OMP: pid 856622 tid 856746 thread 26 bound to OS proc set {35}
OMP: pid 856622 tid 856753 thread 33 bound to OS proc set {44}
OMP: pid 856622 tid 856734 thread 14 bound to OS proc set {18}
OMP: pid 856622 tid 856764 thread 44 bound to OS proc set {59}
OMP: pid 856622 tid 856742 thread 22 bound to OS proc set {29}
OMP: pid 856622 tid 856743 thread 23 bound to OS proc set {30}
OMP: pid 856622 tid 856752 thread 32 bound to OS proc set {43}
OMP: pid 856622 tid 856772 thread 52 bound to OS proc set {70}
OMP: pid 856622 tid 856769 thread 49 bound to OS proc set {66}
OMP: pid 856622 tid 856744 thread 24 bound to OS proc set {32}
OMP: pid 856622 tid 856779 thread 59 bound to OS proc set {79}
OMP: pid 856622 tid 856750 thread 30 bound to OS proc set {40}
OMP: pid 856622 tid 856760 thread 40 bound to OS proc set {53}
OMP: pid 856622 tid 856789 thread 69 bound to OS proc set {92}
OMP: pid 856622 tid 856749 thread 29 bound to OS proc set {39}
OMP: pid 856622 tid 856740 thread 20 bound to OS proc set {26}
OMP: pid 856622 tid 856766 thread 46 bound to OS proc set {61}
OMP: pid 856622 tid 856768 thread 48 bound to OS proc set {64}
OMP: pid 856622 tid 856774 thread 54 bound to OS proc set {72}
OMP: pid 856622 tid 856782 thread 62 bound to OS proc set {83}
OMP: pid 856622 tid 856756 thread 36 bound to OS proc set {48}
OMP: pid 856622 tid 856780 thread 60 bound to OS proc set {80}
OMP: pid 856622 tid 856757 thread 37 bound to OS proc set {49}
OMP: pid 856622 tid 856737 thread 17 bound to OS proc set {22}
OMP: pid 856622 tid 856748 thread 28 bound to OS proc set {37}
OMP: pid 856622 tid 856741 thread 21 bound to OS proc set {28}
OMP: pid 856622 tid 856773 thread 53 bound to OS proc set {71}
OMP: pid 856622 tid 856763 thread 43 bound to OS proc set {57}
OMP: pid 856622 tid 856761 thread 41 bound to OS proc set {55}
OMP: pid 856622 tid 856758 thread 38 bound to OS proc set {51}
OMP: pid 856622 tid 856759 thread 39 bound to OS proc set {52}
OMP: pid 856622 tid 856765 thread 45 bound to OS proc set {60}
OMP: pid 856622 tid 856778 thread 58 bound to OS proc set {78}
OMP: pid 856622 tid 856762 thread 42 bound to OS proc set {56}
OMP: pid 856622 tid 856776 thread 56 bound to OS proc set {75}
OMP: pid 856622 tid 856781 thread 61 bound to OS proc set {82}
OMP: pid 856622 tid 856777 thread 57 bound to OS proc set {76}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 72, "n_threads_batch": 72, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.180909, "speed_pp": 707.538086, "t_tg": 0.000000, "speed_tg": nan, "t": 0.180909, "speed": 707.538086}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_11

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_11      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_11  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_11  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_11  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_11      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_11  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_11  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_11  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 856811 tid 856811 thread 0 bound to OS proc set {0}
OMP: pid 856811 tid 856912 thread 3 bound to OS proc set {3}
OMP: pid 856811 tid 856911 thread 2 bound to OS proc set {2}
OMP: pid 856811 tid 856913 thread 4 bound to OS proc set {4}
OMP: pid 856811 tid 856910 thread 1 bound to OS proc set {1}
OMP: pid 856811 tid 856976 thread 67 bound to OS proc set {81}
OMP: pid 856811 tid 856988 thread 79 bound to OS proc set {95}
OMP: pid 856811 tid 856924 thread 15 bound to OS proc set {18}
OMP: pid 856811 tid 856921 thread 12 bound to OS proc set {14}
OMP: pid 856811 tid 856916 thread 7 bound to OS proc set {8}
OMP: pid 856811 tid 856975 thread 66 bound to OS proc set {80}
OMP: pid 856811 tid 856937 thread 28 bound to OS proc set {33}
OMP: pid 856811 tid 856920 thread 11 bound to OS proc set {13}
OMP: pid 856811 tid 856956 thread 47 bound to OS proc set {56}
OMP: pid 856811 tid 856987 thread 78 bound to OS proc set {94}
OMP: pid 856811 tid 856923 thread 14 bound to OS proc set {16}
OMP: pid 856811 tid 856985 thread 76 bound to OS proc set {92}
OMP: pid 856811 tid 856922 thread 13 bound to OS proc set {15}
OMP: pid 856811 tid 856936 thread 27 bound to OS proc set {32}
OMP: pid 856811 tid 856984 thread 75 bound to OS proc set {90}
OMP: pid 856811 tid 856986 thread 77 bound to OS proc set {93}
OMP: pid 856811 tid 856944 thread 35 bound to OS proc set {42}
OMP: pid 856811 tid 856914 thread 5 bound to OS proc set {6}
OMP: pid 856811 tid 856939 thread 30 bound to OS proc set {36}
OMP: pid 856811 tid 856933 thread 24 bound to OS proc set {29}
OMP: pid 856811 tid 856952 thread 43 bound to OS proc set {52}
OMP: pid 856811 tid 856917 thread 8 bound to OS proc set {9}
OMP: pid 856811 tid 856925 thread 16 bound to OS proc set {19}
OMP: pid 856811 tid 856974 thread 65 bound to OS proc set {78}
OMP: pid 856811 tid 856935 thread 26 bound to OS proc set {31}
OMP: pid 856811 tid 856971 thread 62 bound to OS proc set {75}
OMP: pid 856811 tid 856958 thread 49 bound to OS proc set {59}
OMP: pid 856811 tid 856942 thread 33 bound to OS proc set {40}
OMP: pid 856811 tid 856915 thread 6 bound to OS proc set {7}
OMP: pid 856811 tid 856932 thread 23 bound to OS proc set {27}
OMP: pid 856811 tid 856938 thread 29 bound to OS proc set {35}
OMP: pid 856811 tid 856953 thread 44 bound to OS proc set {53}
OMP: pid 856811 tid 856943 thread 34 bound to OS proc set {41}
OMP: pid 856811 tid 856934 thread 25 bound to OS proc set {30}
OMP: pid 856811 tid 856949 thread 40 bound to OS proc set {48}
OMP: pid 856811 tid 856919 thread 10 bound to OS proc set {12}
OMP: pid 856811 tid 856927 thread 18 bound to OS proc set {21}
OMP: pid 856811 tid 856967 thread 58 bound to OS proc set {70}
OMP: pid 856811 tid 856960 thread 51 bound to OS proc set {61}
OMP: pid 856811 tid 856940 thread 31 bound to OS proc set {37}
OMP: pid 856811 tid 856957 thread 48 bound to OS proc set {58}
OMP: pid 856811 tid 856973 thread 64 bound to OS proc set {77}
OMP: pid 856811 tid 856977 thread 68 bound to OS proc set {82}
OMP: pid 856811 tid 856929 thread 20 bound to OS proc set {24}
OMP: pid 856811 tid 856950 thread 41 bound to OS proc set {49}
OMP: pid 856811 tid 856955 thread 46 bound to OS proc set {55}
OMP: pid 856811 tid 856945 thread 36 bound to OS proc set {43}
OMP: pid 856811 tid 856970 thread 61 bound to OS proc set {73}
OMP: pid 856811 tid 856968 thread 59 bound to OS proc set {71}
OMP: pid 856811 tid 856980 thread 71 bound to OS proc set {86}
OMP: pid 856811 tid 856961 thread 52 bound to OS proc set {63}
OMP: pid 856811 tid 856951 thread 42 bound to OS proc set {50}
OMP: pid 856811 tid 856918 thread 9 bound to OS proc set {10}
OMP: pid 856811 tid 856969 thread 60 bound to OS proc set {72}
OMP: pid 856811 tid 856972 thread 63 bound to OS proc set {76}
OMP: pid 856811 tid 856963 thread 54 bound to OS proc set {65}
OMP: pid 856811 tid 856959 thread 50 bound to OS proc set {60}
OMP: pid 856811 tid 856946 thread 37 bound to OS proc set {44}
OMP: pid 856811 tid 856941 thread 32 bound to OS proc set {38}
OMP: pid 856811 tid 856966 thread 57 bound to OS proc set {69}
OMP: pid 856811 tid 856981 thread 72 bound to OS proc set {87}
OMP: pid 856811 tid 856926 thread 17 bound to OS proc set {20}
OMP: pid 856811 tid 856948 thread 39 bound to OS proc set {47}
OMP: pid 856811 tid 856965 thread 56 bound to OS proc set {67}
OMP: pid 856811 tid 856979 thread 70 bound to OS proc set {84}
OMP: pid 856811 tid 856983 thread 74 bound to OS proc set {89}
OMP: pid 856811 tid 856964 thread 55 bound to OS proc set {66}
OMP: pid 856811 tid 856931 thread 22 bound to OS proc set {26}
OMP: pid 856811 tid 856962 thread 53 bound to OS proc set {64}
OMP: pid 856811 tid 856928 thread 19 bound to OS proc set {23}
OMP: pid 856811 tid 856947 thread 38 bound to OS proc set {46}
OMP: pid 856811 tid 856982 thread 73 bound to OS proc set {88}
OMP: pid 856811 tid 856978 thread 69 bound to OS proc set {83}
OMP: pid 856811 tid 856930 thread 21 bound to OS proc set {25}
OMP: pid 856811 tid 856954 thread 45 bound to OS proc set {54}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 80, "n_threads_batch": 80, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.174281, "speed_pp": 734.446106, "t_tg": 0.000000, "speed_tg": nan, "t": 0.174281, "speed": 734.446106}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_12

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_12      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_12  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_12  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_12  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_12      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_12  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_12  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_12  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 857008 tid 857008 thread 0 bound to OS proc set {0}
OMP: pid 857008 tid 857109 thread 3 bound to OS proc set {3}
OMP: pid 857008 tid 857108 thread 2 bound to OS proc set {2}
OMP: pid 857008 tid 857114 thread 8 bound to OS proc set {8}
OMP: pid 857008 tid 857125 thread 19 bound to OS proc set {20}
OMP: pid 857008 tid 857110 thread 4 bound to OS proc set {4}
OMP: pid 857008 tid 857113 thread 7 bound to OS proc set {7}
OMP: pid 857008 tid 857112 thread 6 bound to OS proc set {6}
OMP: pid 857008 tid 857107 thread 1 bound to OS proc set {1}
OMP: pid 857008 tid 857115 thread 9 bound to OS proc set {9}
OMP: pid 857008 tid 857111 thread 5 bound to OS proc set {5}
OMP: pid 857008 tid 857117 thread 11 bound to OS proc set {12}
OMP: pid 857008 tid 857121 thread 15 bound to OS proc set {16}
OMP: pid 857008 tid 857116 thread 10 bound to OS proc set {11}
OMP: pid 857008 tid 857153 thread 47 bound to OS proc set {51}
OMP: pid 857008 tid 857162 thread 56 bound to OS proc set {61}
OMP: pid 857008 tid 857136 thread 30 bound to OS proc set {33}
OMP: pid 857008 tid 857150 thread 44 bound to OS proc set {48}
OMP: pid 857008 tid 857189 thread 83 bound to OS proc set {91}
OMP: pid 857008 tid 857124 thread 18 bound to OS proc set {19}
OMP: pid 857008 tid 857167 thread 61 bound to OS proc set {67}
OMP: pid 857008 tid 857120 thread 14 bound to OS proc set {15}
OMP: pid 857008 tid 857158 thread 52 bound to OS proc set {57}
OMP: pid 857008 tid 857118 thread 12 bound to OS proc set {13}
OMP: pid 857008 tid 857165 thread 59 bound to OS proc set {65}
OMP: pid 857008 tid 857154 thread 48 bound to OS proc set {52}
OMP: pid 857008 tid 857138 thread 32 bound to OS proc set {35}
OMP: pid 857008 tid 857182 thread 76 bound to OS proc set {83}
OMP: pid 857008 tid 857133 thread 27 bound to OS proc set {29}
OMP: pid 857008 tid 857132 thread 26 bound to OS proc set {28}
OMP: pid 857008 tid 857137 thread 31 bound to OS proc set {34}
OMP: pid 857008 tid 857166 thread 60 bound to OS proc set {66}
OMP: pid 857008 tid 857161 thread 55 bound to OS proc set {60}
OMP: pid 857008 tid 857139 thread 33 bound to OS proc set {36}
OMP: pid 857008 tid 857146 thread 40 bound to OS proc set {44}
OMP: pid 857008 tid 857152 thread 46 bound to OS proc set {50}
OMP: pid 857008 tid 857123 thread 17 bound to OS proc set {18}
OMP: pid 857008 tid 857149 thread 43 bound to OS proc set {47}
OMP: pid 857008 tid 857157 thread 51 bound to OS proc set {56}
OMP: pid 857008 tid 857134 thread 28 bound to OS proc set {30}
OMP: pid 857008 tid 857122 thread 16 bound to OS proc set {17}
OMP: pid 857008 tid 857172 thread 66 bound to OS proc set {72}
OMP: pid 857008 tid 857173 thread 67 bound to OS proc set {73}
OMP: pid 857008 tid 857177 thread 71 bound to OS proc set {78}
OMP: pid 857008 tid 857144 thread 38 bound to OS proc set {41}
OMP: pid 857008 tid 857164 thread 58 bound to OS proc set {63}
OMP: pid 857008 tid 857169 thread 63 bound to OS proc set {69}
OMP: pid 857008 tid 857129 thread 23 bound to OS proc set {25}
OMP: pid 857008 tid 857130 thread 24 bound to OS proc set {26}
OMP: pid 857008 tid 857155 thread 49 bound to OS proc set {54}
OMP: pid 857008 tid 857159 thread 53 bound to OS proc set {58}
OMP: pid 857008 tid 857128 thread 22 bound to OS proc set {24}
OMP: pid 857008 tid 857168 thread 62 bound to OS proc set {68}
OMP: pid 857008 tid 857126 thread 20 bound to OS proc set {22}
OMP: pid 857008 tid 857156 thread 50 bound to OS proc set {55}
OMP: pid 857008 tid 857174 thread 68 bound to OS proc set {74}
OMP: pid 857008 tid 857179 thread 73 bound to OS proc set {80}
OMP: pid 857008 tid 857127 thread 21 bound to OS proc set {23}
OMP: pid 857008 tid 857131 thread 25 bound to OS proc set {27}
OMP: pid 857008 tid 857163 thread 57 bound to OS proc set {62}
OMP: pid 857008 tid 857145 thread 39 bound to OS proc set {42}
OMP: pid 857008 tid 857181 thread 75 bound to OS proc set {82}
OMP: pid 857008 tid 857140 thread 34 bound to OS proc set {37}
OMP: pid 857008 tid 857180 thread 74 bound to OS proc set {81}
OMP: pid 857008 tid 857148 thread 42 bound to OS proc set {46}
OMP: pid 857008 tid 857160 thread 54 bound to OS proc set {59}
OMP: pid 857008 tid 857171 thread 65 bound to OS proc set {71}
OMP: pid 857008 tid 857188 thread 82 bound to OS proc set {90}
OMP: pid 857008 tid 857142 thread 36 bound to OS proc set {39}
OMP: pid 857008 tid 857176 thread 70 bound to OS proc set {77}
OMP: pid 857008 tid 857141 thread 35 bound to OS proc set {38}
OMP: pid 857008 tid 857143 thread 37 bound to OS proc set {40}
OMP: pid 857008 tid 857147 thread 41 bound to OS proc set {45}
OMP: pid 857008 tid 857151 thread 45 bound to OS proc set {49}
OMP: pid 857008 tid 857178 thread 72 bound to OS proc set {79}
OMP: pid 857008 tid 857183 thread 77 bound to OS proc set {84}
OMP: pid 857008 tid 857184 thread 78 bound to OS proc set {85}
OMP: pid 857008 tid 857185 thread 79 bound to OS proc set {87}
OMP: pid 857008 tid 857135 thread 29 bound to OS proc set {31}
OMP: pid 857008 tid 857170 thread 64 bound to OS proc set {70}
OMP: pid 857008 tid 857186 thread 80 bound to OS proc set {88}
OMP: pid 857008 tid 857175 thread 69 bound to OS proc set {76}
OMP: pid 857008 tid 857187 thread 81 bound to OS proc set {89}
OMP: pid 857008 tid 857190 thread 84 bound to OS proc set {92}
OMP: pid 857008 tid 857191 thread 85 bound to OS proc set {93}
OMP: pid 857008 tid 857192 thread 86 bound to OS proc set {94}
OMP: pid 857008 tid 857193 thread 87 bound to OS proc set {95}
OMP: pid 857008 tid 857119 thread 13 bound to OS proc set {14}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 88, "n_threads_batch": 88, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.163418, "speed_pp": 783.267456, "t_tg": 0.000000, "speed_tg": nan, "t": 0.163418, "speed": 783.267456}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_13

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_13      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_13  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_13  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_13  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_13      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_13  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_13  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_13  #
#########################################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-35-140.ec2.internal. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 857213 tid 857213 thread 0 bound to OS proc set {0}
OMP: pid 857213 tid 857314 thread 3 bound to OS proc set {3}
OMP: pid 857213 tid 857323 thread 12 bound to OS proc set {12}
OMP: pid 857213 tid 857313 thread 2 bound to OS proc set {2}
OMP: pid 857213 tid 857326 thread 15 bound to OS proc set {15}
OMP: pid 857213 tid 857362 thread 51 bound to OS proc set {51}
OMP: pid 857213 tid 857325 thread 14 bound to OS proc set {14}
OMP: pid 857213 tid 857319 thread 8 bound to OS proc set {8}
OMP: pid 857213 tid 857342 thread 31 bound to OS proc set {31}
OMP: pid 857213 tid 857312 thread 1 bound to OS proc set {1}
OMP: pid 857213 tid 857324 thread 13 bound to OS proc set {13}
OMP: pid 857213 tid 857371 thread 60 bound to OS proc set {60}
OMP: pid 857213 tid 857318 thread 7 bound to OS proc set {7}
OMP: pid 857213 tid 857339 thread 28 bound to OS proc set {28}
OMP: pid 857213 tid 857346 thread 35 bound to OS proc set {35}
OMP: pid 857213 tid 857343 thread 32 bound to OS proc set {32}
OMP: pid 857213 tid 857358 thread 47 bound to OS proc set {47}
OMP: pid 857213 tid 857338 thread 27 bound to OS proc set {27}
OMP: pid 857213 tid 857341 thread 30 bound to OS proc set {30}
OMP: pid 857213 tid 857370 thread 59 bound to OS proc set {59}
OMP: pid 857213 tid 857322 thread 11 bound to OS proc set {11}
OMP: pid 857213 tid 857315 thread 4 bound to OS proc set {4}
OMP: pid 857213 tid 857330 thread 19 bound to OS proc set {19}
OMP: pid 857213 tid 857359 thread 48 bound to OS proc set {48}
OMP: pid 857213 tid 857335 thread 24 bound to OS proc set {24}
OMP: pid 857213 tid 857327 thread 16 bound to OS proc set {16}
OMP: pid 857213 tid 857354 thread 43 bound to OS proc set {43}
OMP: pid 857213 tid 857361 thread 50 bound to OS proc set {50}
OMP: pid 857213 tid 857337 thread 26 bound to OS proc set {26}
OMP: pid 857213 tid 857345 thread 34 bound to OS proc set {34}
OMP: pid 857213 tid 857340 thread 29 bound to OS proc set {29}
OMP: pid 857213 tid 857321 thread 10 bound to OS proc set {10}
OMP: pid 857213 tid 857366 thread 55 bound to OS proc set {55}
OMP: pid 857213 tid 857367 thread 56 bound to OS proc set {56}
OMP: pid 857213 tid 857320 thread 9 bound to OS proc set {9}
OMP: pid 857213 tid 857317 thread 6 bound to OS proc set {6}
OMP: pid 857213 tid 857357 thread 46 bound to OS proc set {46}
OMP: pid 857213 tid 857363 thread 52 bound to OS proc set {52}
OMP: pid 857213 tid 857316 thread 5 bound to OS proc set {5}
OMP: pid 857213 tid 857329 thread 18 bound to OS proc set {18}
OMP: pid 857213 tid 857336 thread 25 bound to OS proc set {25}
OMP: pid 857213 tid 857369 thread 58 bound to OS proc set {58}
OMP: pid 857213 tid 857355 thread 44 bound to OS proc set {44}
OMP: pid 857213 tid 857360 thread 49 bound to OS proc set {49}
OMP: pid 857213 tid 857351 thread 40 bound to OS proc set {40}
OMP: pid 857213 tid 857350 thread 39 bound to OS proc set {39}
OMP: pid 857213 tid 857334 thread 23 bound to OS proc set {23}
OMP: pid 857213 tid 857353 thread 42 bound to OS proc set {42}
OMP: pid 857213 tid 857365 thread 54 bound to OS proc set {54}
OMP: pid 857213 tid 857328 thread 17 bound to OS proc set {17}
OMP: pid 857213 tid 857331 thread 20 bound to OS proc set {20}
OMP: pid 857213 tid 857368 thread 57 bound to OS proc set {57}
OMP: pid 857213 tid 857347 thread 36 bound to OS proc set {36}
OMP: pid 857213 tid 857356 thread 45 bound to OS proc set {45}
OMP: pid 857213 tid 857364 thread 53 bound to OS proc set {53}
OMP: pid 857213 tid 857344 thread 33 bound to OS proc set {33}
OMP: pid 857213 tid 857352 thread 41 bound to OS proc set {41}
OMP: pid 857213 tid 857333 thread 22 bound to OS proc set {22}
OMP: pid 857213 tid 857349 thread 38 bound to OS proc set {38}
OMP: pid 857213 tid 857374 thread 63 bound to OS proc set {63}
OMP: pid 857213 tid 857348 thread 37 bound to OS proc set {37}
OMP: pid 857213 tid 857373 thread 62 bound to OS proc set {62}
OMP: pid 857213 tid 857332 thread 21 bound to OS proc set {21}
OMP: pid 857213 tid 857390 thread 79 bound to OS proc set {79}
OMP: pid 857213 tid 857387 thread 76 bound to OS proc set {76}
OMP: pid 857213 tid 857403 thread 92 bound to OS proc set {92}
OMP: pid 857213 tid 857372 thread 61 bound to OS proc set {61}
OMP: pid 857213 tid 857386 thread 75 bound to OS proc set {75}
OMP: pid 857213 tid 857388 thread 77 bound to OS proc set {77}
OMP: pid 857213 tid 857389 thread 78 bound to OS proc set {78}
OMP: pid 857213 tid 857378 thread 67 bound to OS proc set {67}
OMP: pid 857213 tid 857394 thread 83 bound to OS proc set {83}
OMP: pid 857213 tid 857376 thread 65 bound to OS proc set {65}
OMP: pid 857213 tid 857377 thread 66 bound to OS proc set {66}
OMP: pid 857213 tid 857383 thread 72 bound to OS proc set {72}
OMP: pid 857213 tid 857399 thread 88 bound to OS proc set {88}
OMP: pid 857213 tid 857402 thread 91 bound to OS proc set {91}
OMP: pid 857213 tid 857406 thread 95 bound to OS proc set {95}
OMP: pid 857213 tid 857392 thread 81 bound to OS proc set {81}
OMP: pid 857213 tid 857395 thread 84 bound to OS proc set {84}
OMP: pid 857213 tid 857398 thread 87 bound to OS proc set {87}
OMP: pid 857213 tid 857393 thread 82 bound to OS proc set {82}
OMP: pid 857213 tid 857382 thread 71 bound to OS proc set {71}
OMP: pid 857213 tid 857381 thread 70 bound to OS proc set {70}
OMP: pid 857213 tid 857384 thread 73 bound to OS proc set {73}
OMP: pid 857213 tid 857385 thread 74 bound to OS proc set {74}
OMP: pid 857213 tid 857405 thread 94 bound to OS proc set {94}
OMP: pid 857213 tid 857375 thread 64 bound to OS proc set {64}
OMP: pid 857213 tid 857401 thread 90 bound to OS proc set {90}
OMP: pid 857213 tid 857404 thread 93 bound to OS proc set {93}
OMP: pid 857213 tid 857379 thread 68 bound to OS proc set {68}
OMP: pid 857213 tid 857397 thread 86 bound to OS proc set {86}
OMP: pid 857213 tid 857396 thread 85 bound to OS proc set {85}
OMP: pid 857213 tid 857400 thread 89 bound to OS proc set {89}
OMP: pid 857213 tid 857380 thread 69 bound to OS proc set {69}
OMP: pid 857213 tid 857391 thread 80 bound to OS proc set {80}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 96, "n_threads_batch": 96, "pp": 128, "tg": 0, "pl": 1, "n_kv": 128, "t_pp": 0.162197, "speed_pp": 789.163818, "t_tg": 0.000000, "speed_tg": nan, "t": 0.162197, "speed": 789.163818}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_14

To display your profiling results:
#########################################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                                COMMAND                                                                                                 #
#########################################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_14      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_14  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_14  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_14  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_14      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_14  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_14  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-399-4096/llama.cpp/run/oneview_runs/multicore/armclang/maqao_2025-11-24_14-32-03/tools/lprof_npsu_run_14  #
#########################################################################################################################################################################################################################################
