| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/build/_deps/kleidiai_download-src/kai/ukernels/matmul/matmul_clamp_f32_qsi8d32p_qsi4c32p/kai_matmul_clamp_f32_qsi8d32p1x8_qsi4c32p4x8_1x4x32_neon_dotprod.c: 128-128
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/gcc/_deps/kleidiai_download-src/kai/ukernels/matmul/matmul_clamp_f32_qsi8d32p_qsi4c32p/kai_matmul_clamp_f32_qsi8d32p1x8_qsi4c32p4x8_1x4x32_neon_dotprod.c: 128-128
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/gcc_2/_deps/kleidiai_download-src/kai/ukernels/matmul/matmul_clamp_f32_qsi8d32p_qsi4c32p/kai_matmul_clamp_f32_qsi8d32p1x8_qsi4c32p4x8_1x4x32_neon_dotprod.c: 128-128
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 2567 | 1.17 | 1.30 | 39.33 | 57.14 | 79.29 | 2060 | 1.18 | 1.51 | 41.64 | 57.14 | 79.29 | 2064 | 1.21 | 1.50 | 43.16 | 57.14 | 79.29 |
| | |
| Sum on 1 analyzed binary loop (libggml-cpu.so - 2567) | Sum on 1 analyzed binary loop (libggml-cpu.so - 2060) | Sum on 1 analyzed binary loop (libggml-cpu.so - 2064) |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Data Access Issues | | Data Access Issues | | Data Access Issues | |
| Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 1 |
| Vectorization Roadblocks | | Vectorization Roadblocks | | Vectorization Roadblocks | |
| Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 1 |
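Each metrics row in this report packs the three runs side by side, six columns per run. The following is a minimal Python sketch for splitting such a row back into per-run records. The helper name `split_runs` is hypothetical and is not part of any QaaS or MAQAO tooling; the row string is copied from the table above, and the sketch assumes every run occupies exactly six cells (a run with no loop is assumed to appear as six empty cells).

```python
RUNS = ["orig_default", "gcc_default", "gcc_2"]
COLUMNS = [
    "ASM Loop ID",
    "Max Time Over Threads (s)",
    "Time w.r.t. Wall Time (s)",
    "Cov (%)",
    "Vect. Ratio (%)",
    "Vector Length Use (%)",
]


def split_runs(row):
    """Split one flattened metrics row into {run: {column: value} or None}.

    Assumption: the row holds len(RUNS) * len(COLUMNS) cells. A run whose
    six cells are all empty is mapped to None.
    """
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    expected = len(RUNS) * len(COLUMNS)
    if len(cells) != expected:
        raise ValueError(f"expected {expected} cells, got {len(cells)}")
    records = {}
    for i, run in enumerate(RUNS):
        chunk = cells[i * len(COLUMNS):(i + 1) * len(COLUMNS)]
        if not any(chunk):
            records[run] = None  # no loop reported for this run
            continue
        records[run] = {
            col: (int(val) if col == "ASM Loop ID" else float(val))
            for col, val in zip(COLUMNS, chunk)
        }
    return records


# Row copied from the first loop table of this report.
row = (
    "| 2567 | 1.17 | 1.30 | 39.33 | 57.14 | 79.29 "
    "| 2060 | 1.18 | 1.51 | 41.64 | 57.14 | 79.29 "
    "| 2064 | 1.21 | 1.50 | 43.16 | 57.14 | 79.29 |"
)
records = split_runs(row)
for run, rec in records.items():
    print(run, rec["ASM Loop ID"], rec["Cov (%)"])
```

Rows with a different cell count are rejected rather than guessed at, since a shifted column would silently attribute a loop to the wrong run.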
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/simd-mappings.h: 51-51
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/arch/arm/quants.c: 2683-2684
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/arch/arm/quants.c: 2693-2699
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/arch/arm/quants.c: 2709-2758
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/simd-mappings.h: 51-51
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/arch/arm/quants.c: 2683-2683
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/arch/arm/quants.c: 2693-2702
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/arch/arm/quants.c: 2709-2758
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/arch/arm/quants.c: 2769-2814
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/simd-mappings.h: 51-51
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/arch/arm/quants.c: 2683-2683
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/arch/arm/quants.c: 2693-2702
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/arch/arm/quants.c: 2709-2758
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/arch/arm/quants.c: 2769-2812
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 2462 | 0.14 | 0.14 | 4.15 | 75.14 | 82.66 | 1953 | 0.15 | 0.17 | 4.72 | 59.53 | 80.22 | 1954 | 0.15 | 0.16 | 4.66 | 69.54 | 84.89 |
| | |
| Sum on 1 analyzed binary loop (libggml-cpu.so - 2462) | Sum on 1 analyzed binary loop (libggml-cpu.so - 1953) | Sum on 1 analyzed binary loop (libggml-cpu.so - 1954) |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Control Flow Issues | | Control Flow Issues | | Control Flow Issues | |
| Presence of 2 to 4 paths | 0 | Presence of 2 to 4 paths | 1 | Presence of 2 to 4 paths | 1 |
| Data Access Issues | | Data Access Issues | | Data Access Issues | |
| Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 1 |
| Vectorization Roadblocks | | Vectorization Roadblocks | | Vectorization Roadblocks | |
| Presence of 2 to 4 paths | 0 | Presence of 2 to 4 paths | 1 | Presence of 2 to 4 paths | 1 |
| Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 1 |
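Unlike the first loop in this report, the vectorization metrics of this loop differ between the three builds. As a rough sketch, the differences relative to `orig_default` can be computed directly from the row above. The values are copied from the table; the function and variable names are illustrative only.

```python
# Values copied from the metrics row of the quants.c loop above.
vect_ratio = {"orig_default": 75.14, "gcc_default": 59.53, "gcc_2": 69.54}
vector_length_use = {"orig_default": 82.66, "gcc_default": 80.22, "gcc_2": 84.89}


def delta_vs_baseline(metric, baseline="orig_default"):
    """Return the difference in percentage points of each run against the baseline run."""
    base = metric[baseline]
    return {
        run: round(value - base, 2)
        for run, value in metric.items()
        if run != baseline
    }


print("Vect. Ratio delta (points):", delta_vs_baseline(vect_ratio))
print("Vector Length Use delta (points):", delta_vs_baseline(vector_length_use))
```

The deltas are plain differences in percentage points, not relative changes, and they say nothing about cause; the analysis rows above only show that both GCC builds additionally report "Presence of 2 to 4 paths" for this loop.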
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/build/_deps/kleidiai_download-src/kai/ukernels/matmul/matmul_clamp_f32_qsi8d32p_qsi4c32p/kai_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm.c: 131-131
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/gcc/_deps/kleidiai_download-src/kai/ukernels/matmul/matmul_clamp_f32_qsi8d32p_qsi4c32p/kai_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm.c: 131-131
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/gcc_2/_deps/kleidiai_download-src/kai/ukernels/matmul/matmul_clamp_f32_qsi8d32p_qsi4c32p/kai_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm.c: 131-131
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 2582 | 0.03 | 0.02 | 0.47 | 28.14 | 55.37 | 2075 | 0.03 | 0.02 | 0.58 | 28.14 | 55.37 | 2079 | 0.03 | 0.02 | 0.58 | 28.14 | 55.37 |
| | |
| Sum on 1 analyzed binary loop (libggml-cpu.so - 2582) | Sum on 1 analyzed binary loop (libggml-cpu.so - 2075) | Sum on 1 analyzed binary loop (libggml-cpu.so - 2079) |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Data Access Issues | | Data Access Issues | | Data Access Issues | |
| Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 1 |
| Vectorization Roadblocks | | Vectorization Roadblocks | | Vectorization Roadblocks | |
| Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 1 |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | | Loop Source Regions | | Loop Source Regions | |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 2473 | 0.00 | 0.00 | 0.00 | 0 | 0 | 4180 | 0.01 | 0.00 | 0.00 | 0 | 0 | 3846 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 2634 | 0.00 | 0.00 | 0.00 | 0 | 0 | 4472 | 0.01 | 0.00 | 0.00 | 0 | 0 | 4198 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 2790 | 0.00 | 0.00 | 0.00 | 0 | 0 | 3728 | 0.01 | 0.00 | 0.00 | 0 | 0 | 4209 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 2886 | 0.00 | 0.00 | 0.00 | 0 | 0 | 3732 | 0.01 | 0.00 | 0.00 | 0 | 0 | 4195 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 2633 | 0.00 | 0.00 | 0.00 | 0 | 0 | 3907 | 0.01 | 0.00 | 0.00 | 0 | 0 | 4245 | 0.01 | 0.00 | 0.00 | 0 | 0 |
| 2650 | 0.00 | 0.00 | 0.00 | 0 | 0 | 4140 | 0.01 | 0.00 | 0.00 | 0 | 0 | 3971 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 2767 | 0.00 | 0.00 | 0.00 | 0 | 0 | 2043 | 0.00 | 0.00 | 0.00 | 0 | 0 | 3969 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 2764 | 0.00 | 0.00 | 0.00 | 0 | 0 | 800 | 0.00 | 0.00 | 0.00 | 0 | 0 | 2046 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 2649 | 0.01 | 0.00 | 0.00 | 0 | 0 | 1447 | 0.00 | 0.00 | 0.00 | 0 | 0 | 2062 | 0.01 | 0.00 | 0.00 | 0 | 0 |
| 1904 | 0.00 | 0.00 | 0.00 | 0 | 0 | 55 | 0.01 | 0.00 | 0.01 | 0 | 0 | 839 | 0.00 | 0.00 | 0.01 | 0 | 0 |
| 2545 | 0.01 | 0.00 | 0.00 | 0 | 0 | 2057 | 0.01 | 0.00 | 0.00 | 0 | 0 | 2011 | 0.01 | 0.00 | 0.00 | 0 | 0 |
| 77 | 0.00 | 0.00 | 0.01 | 0 | 0 | 495 | 0.02 | 0.00 | 0.01 | 0 | 0 | 1159 | 0.01 | 0.00 | 0.00 | 0 | 0 |
| 434 | 0.02 | 0.00 | 0.01 | 0 | 0 | 411 | 0.01 | 0.00 | 0.00 | 0 | 0 | 98 | 0.01 | 0.00 | 0.00 | 0 | 0 |
| 1570 | 0.01 | 0.00 | 0.01 | 0 | 0 | 854 | 0.00 | 0.00 | 0.00 | 0 | 0 | 413 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 1401 | 0.01 | 0.00 | 0.00 | 0 | 0 | 1450 | 0.01 | 0.00 | 0.00 | 0 | 0 | 1496 | 0.01 | 0.00 | 0.01 | 0 | 0 |
| 400 | 0.01 | 0.00 | 0.01 | 0 | 0 | 310 | 0.01 | 0.00 | 0.00 | 0 | 0 | 35 | 0.01 | 0.00 | 0.00 | 0 | 0 |
| 551 | 0.02 | 0.00 | 0.01 | 0 | 0 | 74 | 0.01 | 0.00 | 0.00 | 0 | 0 | | | | | | |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.h: 1008-1034
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.cpp: 385-387
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.h: 1009-1023
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.h: 1031-1034
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.cpp: 385-387
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.h: 1009-1023
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.h: 1031-1034
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.cpp: 385-387
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 1015 | 0.20 | 0.00 | 0.10 | 68.18 | 82.24 | 767 | 0.07 | 0.00 | 0.04 | 80 | 97.68 | 790 | 0.08 | 0.00 | 0.04 | 90 | 98.67 |
| | |
| Sum on 1 analyzed binary loop (libggml-cpu.so - 1015) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Loop Computation Issues | | | | | |
| Presence of expensive FP instructions | 1 | | | | |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/build/_deps/kleidiai_download-src/kai/ukernels/matmul/pack/kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c: 127-128
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/build/_deps/kleidiai_download-src/kai/ukernels/matmul/pack/kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c: 137-139
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/gcc/_deps/kleidiai_download-src/kai/ukernels/matmul/pack/kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c: 127-134
| Loop Source Regions | |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 2565 | 0.29 | 0.00 | 0.12 | 0 | 50 | 2056 | 0.09 | 0.00 | 0.04 | 0 | 43.75 | | | | | | |
| | |
| Sum on 1 analyzed binary loop (libggml-cpu.so - 2565) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Data Access Issues | | | | | |
| Presence of indirect access | 1 | | | | |
| Vectorization Roadblocks | | | | | |
| Presence of indirect access | 1 | | | | |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/build/_deps/kleidiai_download-src/kai/ukernels/matmul/pack/kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c: 115-118
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/gcc/_deps/kleidiai_download-src/kai/ukernels/matmul/pack/kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c: 115-118
| Loop Source Regions | |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 2566 | 0.14 | 0.00 | 0.06 | 0 | 31.25 | 2058 | 0.20 | 0.00 | 0.09 | 0 | 31.25 | | | | | | |
| | |
| Sum on 1 analyzed binary loop (libggml-cpu.so - 2566) | Sum on 1 analyzed binary loop (libggml-cpu.so - 2058) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Data Access Issues | | Data Access Issues | | | |
| Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 0 | | |
| Presence of indirect access | 1 | Presence of indirect access | 1 | | |
| Vectorization Roadblocks | | Vectorization Roadblocks | | | |
| Presence of constant non-unit stride data access | 1 | Presence of constant non-unit stride data access | 0 | | |
| Presence of indirect access | 1 | Presence of indirect access | 1 | | |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | | Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ops.cpp: 6210-6211
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ops.cpp: 6220-6230
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ops.cpp: 6238-6245
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ops.cpp: 6413-6413
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ops.cpp: 6211-6211
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ops.cpp: 6223-6231
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | | 790 | 0.01 | 0.00 | 0.10 | 2.33 | 20.2 | 805 | 0.01 | 0.00 | 0.04 | 0 | 25.52 |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 790) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| | | Loop Computation Issues | | | |
| | | Presence of expensive FP instructions | 1 | | |
| | | Presence of a large number of scalar integer instructions | 1 | | |
| | | Control Flow Issues | | | |
| | | Presence of calls | 1 | | |
| | | Presence of more than 4 paths | 1 | | |
| | | Vectorization Roadblocks | | | |
| | | Presence of calls | 1 | | |
| | | Presence of more than 4 paths | 1 | | |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.cpp: 231-262
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.cpp: 231-262
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.cpp: 231-262
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 1007 | 0.01 | 0.00 | 0.04 | 100 | 100 | 764 | 0.02 | 0.00 | 0.07 | 100 | 100 | 788 | 0.01 | 0.00 | 0.03 | 100 | 100 |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 764) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| | | Data Access Issues | | | |
| | | Presence of constant non-unit stride data access | 1 | | |
| | | Vectorization Roadblocks | | | |
| | | Presence of constant non-unit stride data access | 1 | | |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | | Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/kleidiai/kleidiai.cpp: 92-92
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/kleidiai/kleidiai.cpp: 341-342
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/kleidiai/kleidiai.cpp: 348-382
- /usr/include/c++/11/variant: 1594-1594
- /usr/include/c++/11/variant: 1726-1727
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/gcc/_deps/kleidiai_download-src/kai/kai_common.h: 143-143
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/gcc_2/_deps/kleidiai_download-src/kai/kai_common.h: 143-143
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/kleidiai/kleidiai.cpp: 92-92
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/kleidiai/kleidiai.cpp: 341-342
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/kleidiai/kleidiai.cpp: 348-382
- /usr/include/c++/11/variant: 1594-1594
- /usr/include/c++/11/variant: 1726-1727
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | | 2017 | 0.01 | 0.00 | 0.04 | 0 | 45.71 | 2018 | 0.01 | 0.00 | 0.07 | 0 | 45.95 |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 2018) |
| Analysis | Count | Analysis | Count | Analysis | Count |
| | | | | Loop Computation Issues | |
| | | | | Presence of a large number of scalar integer instructions | 1 |
| | | | | Control Flow Issues | |
| | | | | Presence of calls | 1 |
| | | | | Vectorization Roadblocks | |
| | | | | Presence of calls | 1 |
| | | | | Presence of more than 4 paths | 1 |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.h: 411-458
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.h: 411-458
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.h: 411-458
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 1900 | 0.01 | 0.00 | 0.04 | 100 | 100 | 1454 | 0.02 | 0.00 | 0.04 | 100 | 100 | 1501 | 0.01 | 0.00 | 0.02 | 100 | 100 |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/build/_deps/kleidiai_download-src/kai/ukernels/matmul/pack/kai_lhs_quant_pack_qsi8d32p_f32.c: 87-89
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/gcc/_deps/kleidiai_download-src/kai/ukernels/matmul/pack/kai_lhs_quant_pack_qsi8d32p_f32.c: 87-89
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/gcc_2/_deps/kleidiai_download-src/kai/ukernels/matmul/pack/kai_lhs_quant_pack_qsi8d32p_f32.c: 87-89
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 2532 | 0.02 | 0.00 | 0.01 | 0 | 19.64 | 2034 | 0.05 | 0.00 | 0.02 | 0 | 18.75 | 2035 | 0.15 | 0.00 | 0.06 | 0 | 18.75 |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 2035) |
| Analysis | Count | Analysis | Count | Analysis | Count |
| | | | | Loop Computation Issues | |
| | | | | Presence of a large number of scalar integer instructions | 1 |
| | | | | Control Flow Issues | |
| | | | | Presence of more than 4 paths | 1 |
| | | | | Vectorization Roadblocks | |
| | | | | Presence of more than 4 paths | 1 |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | | Loop Source Regions | | Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ops.cpp: 6238-6245
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | | | | | | | | 829 | 0.01 | 0.00 | 0.06 | 0 | 28.37 |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 829) |
| Analysis | Count | Analysis | Count | Analysis | Count |
| | | | | Loop Computation Issues | |
| | | | | Presence of expensive FP instructions | 1 |
| | | | | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| | | | | Control Flow Issues | |
| | | | | Presence of calls | 1 |
| | | | | Data Access Issues | |
| | | | | Presence of constant non-unit stride data access | 1 |
| | | | | Presence of indirect access | 1 |
| | | | | Vectorization Roadblocks | |
| | | | | Presence of calls | 1 |
| | | | | Presence of constant non-unit stride data access | 1 |
| | | | | Presence of indirect access | 1 |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ops.cpp: 6220-6220
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ops.cpp: 6229-6230
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ops.cpp: 6238-6245
| Loop Source Regions | | Loop Source Regions | |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 1565 | 0.02 | 0.00 | 0.06 | 0 | 26.34 | | | | | | | | | | | | |
| | |
| Sum on 1 analyzed binary loop (libggml-cpu.so - 1565) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Loop Computation Issues | | | | | |
| Presence of expensive FP instructions | 1 | | | | |
| Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 | | | | |
| Control Flow Issues | | | | | |
| Presence of calls | 1 | | | | |
| Data Access Issues | | | | | |
| Presence of constant non-unit stride data access | 1 | | | | |
| Vectorization Roadblocks | | | | | |
| Presence of calls | 1 | | | | |
| Presence of constant non-unit stride data access | 1 | | | | |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 3228-3229
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/./ggml-impl.h: 389-404
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 3228-3229
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/./ggml-impl.h: 354-354
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/./ggml-impl.h: 389-404
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 3228-3229
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/./ggml-impl.h: 389-404
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 0 | 0.00 | 0.00 | 0.01 | 92.5 | 98.75 | 6 | 0.01 | 0.00 | 0.03 | 0 | 18.47 | 1 | 0.01 | 0.00 | 0.01 | 72.6 | 83.56 |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/build/_deps/kleidiai_download-src/kai/ukernels/matmul/pack/kai_lhs_quant_pack_qsi8d32p_f32.c: 103-112
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/gcc/_deps/kleidiai_download-src/kai/ukernels/matmul/pack/kai_lhs_quant_pack_qsi8d32p_f32.c: 103-112
| Loop Source Regions | |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 2531 | 0.05 | 0.00 | 0.02 | 0 | 25.14 | 2037 | 0.04 | 0.00 | 0.02 | 0 | 22.66 | | | | | | |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | | Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/kleidiai/kleidiai.cpp: 535-535
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/kleidiai/kleidiai.cpp: 546-547
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/kleidiai/kleidiai.cpp: 535-535
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/kleidiai/kleidiai.cpp: 546-547
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | | 2021 | 0.01 | 0.00 | 0.02 | 0 | 50 | 2022 | 0.01 | 0.00 | 0.02 | 0 | 50 |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | | Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 1126-1130
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 1142-1142
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 1371-1379
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 1386-1395
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 1126-1130
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 1142-1142
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 1371-1379
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 1386-1395
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | | 51 | 0.01 | 0.00 | 0.02 | 0 | 48.72 | 55 | 0.01 | 0.00 | 0.02 | 0 | 47.98 |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ops.cpp: 4325-4326
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ops.cpp: 4325-4326
| Loop Source Regions | |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 1398 | 0.03 | 0.00 | 0.01 | 96.97 | 98.48 | 1127 | 0.04 | 0.00 | 0.02 | 0 | 26.56 | | | | | | |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | | Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 2879-2898
| Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 2879-2898
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | | 63 | 0.01 | 0.00 | 0.02 | 0 | 32.95 | 67 | 0.01 | 0.00 | 0.01 | 0 | 31.91 |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | | Loop Source Regions | | Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/binary-ops.cpp: 18-18
- /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/binary-ops.cpp: 31-32
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | | | | | | | | 501 | 0.05 | 0.00 | 0.02 | 25 | 100 |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | | Loop Source Regions | | Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c: 1193-1194
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | | | | | | | | 59 | 0.01 | 0.00 | 0.01 | 0 | 46.88 |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |
| Run orig_default | Run gcc_default | Run gcc_2 |
| Loop Source Regions | | Loop Source Regions | - /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-35-140.ec2.internal/176-406-0470/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/traits.cpp: 13-17
| Loop Source Regions | |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | | 369 | 0.01 | 0.00 | 0.01 | 0 | 50 | | | | | | |
| | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count |