Run | Loop Source Regions | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
Cascade Lake ICPX Ofast STATIC SCHEDULING | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 117-123 | 26 | 23.06 | 24.82 | 85.53 | 57.89 | 18.86 | 35.12 |
Cascade Lake ICPX Ofast DYNAMIC SCHEDULING | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 117-123 | 26 | 1.08 | 0.97 | 2.43 | 58.57 | 19.38 | 41.39 |
Cascade Lake ICPX Ofast DYNAMIC SCHEDULING | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 117-123 | 27 | 19.62 | 20.35 | 50.93 | 57.14 | 18.75 | 33.78 |
Cascade Lake ICPX Ofast GUIDED SELF SCHEDULING | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 116-122 | 15 | 28.70 | 31.11 | 91.72 | 57.89 | 18.86 | 34.92 |
Analysis | Count (STATIC SCHEDULING) | Count (DYNAMIC SCHEDULING) | Count (GUIDED SELF SCHEDULING) |
Analyzed binary loops summed | 1 (kmeans-icpx-Ofast - 26) | 2 (kmeans-icpx-Ofast - 26, kmeans-icpx-Ofast - 27) | 1 (kmeans-icpx-Ofast - 15) |
Loop Computation Issues | | | |
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 | 1 | 1 |
Presence of a large number of scalar integer instructions | 0 | 1 | 0 |
Control Flow Issues | | | |
Presence of more than 4 paths | 1 | 1 | 1 |
Data Access Issues | | | |
Presence of special instructions executing on a single port | 1 | 1 | 1 |
Vectorization Roadblocks | | | |
Presence of more than 4 paths | 1 | 1 | 1 |
Inefficient Vectorization | | | |
Presence of special instructions executing on a single port | 1 | 1 | 1 |