Runs compared: Neoverse V2 GCC O3 Base (250 iterations, 96 threads) | Neoverse V2 GCC O3 SoA (250 iterations, 96 threads) | Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads) | Neoverse V2 GCC O3 Manual Unroll + SoA (250 iterations, 96 threads)

Loop Source Regions: /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 74-87

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| Neoverse V2 GCC O3 Manual Unroll + SoA (250 iterations, 96 threads) | 2 | 18.97 | 13.74 | 61.10 | 0 | 48.61 | 100.05 |

The Base, SoA, and Manual Unroll runs have no entry for this loop and report "No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count."

Manual Unroll + SoA: sum on 1 analyzed binary loop (kmeans-gcc-O3 - 2)

| Category | Analysis | Count |
|---|---|---|
| Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |
| Control Flow Issues | Presence of 2 to 4 paths | 1 |
| Vectorization Roadblocks | Presence of 2 to 4 paths | 1 |

Loop Source Regions: /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 71-75

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| Neoverse V2 GCC O3 SoA (250 iterations, 96 threads) | 1 | 8.80 | 8.48 | 59.73 | 0 | 47.22 | 105.92 |

The Base, Manual Unroll, and Manual Unroll + SoA runs have no entry for this loop and report "No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count."

SoA: sum on 1 analyzed binary loop (kmeans-gcc-O3 - 1). The Analysis/Count table for this loop is empty.

Loop Source Regions: /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 118-131

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads) | 2 | 17.60 | 13.27 | 58.19 | 12.5 | 54.69 | 101.73 |

The Base, SoA, and Manual Unroll + SoA runs have no entry for this loop and report "No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count."

Manual Unroll: sum on 1 analyzed binary loop (kmeans-gcc-O3 - 2)

| Category | Analysis | Count |
|---|---|---|
| Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |
| Control Flow Issues | Presence of 2 to 4 paths | 1 |
| Vectorization Roadblocks | Presence of 2 to 4 paths | 1 |

Loop Source Regions: /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 115-120

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| Neoverse V2 GCC O3 Base (250 iterations, 96 threads) | 2 | 6.00 | 5.09 | 41.49 | 18.18 | 52.27 | 114.87 |

The SoA, Manual Unroll, and Manual Unroll + SoA runs have no entry for this loop and report "No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count."

Base: sum on 1 analyzed binary loop (kmeans-gcc-O3 - 2)

| Category | Analysis | Count |
|---|---|---|
| Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |
| Control Flow Issues | | |
| Vectorization Roadblocks | Presence of more than 4 paths | 1 |

Loop Source Regions: /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 140-143

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| Neoverse V2 GCC O3 Base (250 iterations, 96 threads) | 11 | 0.82 | 0.75 | 6.14 | 10 | 47.5 | 46.61 |

The SoA, Manual Unroll, and Manual Unroll + SoA runs have no entry for this loop and report "No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count."

Base: sum on 1 analyzed binary loop (kmeans-gcc-O3 - 11)

| Category | Analysis | Count |
|---|---|---|
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |
| Data Access Issues | Presence of indirect access | 1 |
| Vectorization Roadblocks | Presence of indirect access | 1 |

Loop Source Regions: /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 94-97

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| Neoverse V2 GCC O3 SoA (250 iterations, 96 threads) | 12 | 0.60 | 0.29 | 2.02 | 0 | 43.18 | 46.44 |

The Base, Manual Unroll, and Manual Unroll + SoA runs have no entry for this loop and report "No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count."

SoA: sum on 1 analyzed binary loop (kmeans-gcc-O3 - 12)

| Category | Analysis | Count |
|---|---|---|
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |
| Data Access Issues | Presence of indirect access | 1 |
| Vectorization Roadblocks | Presence of indirect access | 1 |

Loop Source Regions: /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 157-160

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads) | 16 | 0.67 | 0.23 | 1.01 | 10 | 47.5 | 45.78 |

The Base, SoA, and Manual Unroll + SoA runs have no entry for this loop and report "No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count."

Manual Unroll: sum on 1 analyzed binary loop (kmeans-gcc-O3 - 16)

| Category | Analysis | Count |
|---|---|---|
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |
| Data Access Issues | Presence of indirect access | 1 |
| Vectorization Roadblocks | Presence of indirect access | 1 |

Loop Source Regions: /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 113-116

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| Neoverse V2 GCC O3 Manual Unroll + SoA (250 iterations, 96 threads) | 16 | 0.24 | 0.12 | 0.53 | 0 | 43.18 | 45.76 |

The Base, SoA, and Manual Unroll runs have no entry for this loop and report "No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count."

Manual Unroll + SoA: sum on 1 analyzed binary loop (kmeans-gcc-O3 - 16)

| Category | Analysis | Count |
|---|---|---|
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |
| Data Access Issues | Presence of indirect access | 1 |
| Vectorization Roadblocks | Presence of indirect access | 1 |

Loop Source Regions: none listed for any run

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| Neoverse V2 GCC O3 Base (250 iterations, 96 threads) | 14 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| Neoverse V2 GCC O3 Base (250 iterations, 96 threads) | 10 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| Neoverse V2 GCC O3 Base (250 iterations, 96 threads) | 12 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| Neoverse V2 GCC O3 SoA (250 iterations, 96 threads) | 11 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads) | 13 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |

The Manual Unroll + SoA run has no entry in this group. All four runs report "No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count." The Analysis/Count tables are empty.
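
The time and coverage columns can be cross-checked against each other. The sketch below is only an illustration, assuming that Cov (%) is the loop's share of total wall time, so that wall time is roughly Time w.r.t. Wall Time divided by coverage. That reading of the columns is an assumption and is not stated in the report itself. The figures are the highest-coverage loop rows listed for each run, copied from the tables above; the short run labels and the helper name `estimated_wall_time` are hypothetical.

```python
# Highest-coverage loop row per run, copied from the tables above:
# (run label, Time w.r.t. Wall Time in seconds, Cov in percent)
hot_loops = [
    ("Base", 5.09, 41.49),
    ("SoA", 8.48, 59.73),
    ("Manual Unroll", 13.27, 58.19),
    ("Manual Unroll + SoA", 13.74, 61.10),
]


def estimated_wall_time(loop_time_s, coverage_pct):
    """Estimate total wall time, ASSUMING coverage = loop time / wall time."""
    return loop_time_s / (coverage_pct / 100.0)


# Map each run label to its estimated wall time in seconds.
estimates = {run: estimated_wall_time(t, cov) for run, t, cov in hot_loops}

for run, wall in estimates.items():
    print(f"{run}: ~{wall:.1f} s estimated wall time")
```

Because the rows come from different source regions in each variant, the estimates are best read as a consistency check on the report's own numbers, not as a like-for-like comparison of a single loop.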