- r_1 - gcc_o3_ov1_o52 - 4 analyzed loop(s)
  - Loop 23 - spmxv.exe
  - Loop 21 - spmxv.exe
  - Loop 20 - spmxv.exe
  - Loop 22 - spmxv.exe
- r_2 - gcc_o3-ffastmath_ov1_o52 - 4 analyzed loop(s)
  - Loop 23 - spmxv.exe
  - Loop 21 - spmxv.exe
  - Loop 20 - spmxv.exe
  - Loop 22 - spmxv.exe
- r_3 - gcc_ofast_ov1_o52 - 4 analyzed loop(s)
  - Loop 23 - spmxv.exe
  - Loop 21 - spmxv.exe
  - Loop 20 - spmxv.exe
  - Loop 22 - spmxv.exe
- r_4 - icx_o3_ov1_o52 - 5 analyzed loop(s)
  - Loop 16 - spmxv.exe
  - Loop 14 - spmxv.exe
  - Loop 15 - spmxv.exe
  - Loop 13 - spmxv.exe
  - Loop 12 - spmxv.exe
- r_5 - icx_o3-ffastmath_ov1_o52 - 5 analyzed loop(s)
  - Loop 16 - spmxv.exe
  - Loop 14 - spmxv.exe
  - Loop 15 - spmxv.exe
  - Loop 13 - spmxv.exe
  - Loop 12 - spmxv.exe
- r_6 - icx_fast_ov1_o52 - 5 analyzed loop(s)
  - Loop 16 - spmxv.exe
  - Loop 14 - spmxv.exe
  - Loop 15 - spmxv.exe
  - Loop 13 - spmxv.exe
  - Loop 12 - spmxv.exe

Category | Analysis | Count | Percentage (%) | Weighted Count |
---|---|---|---|---|
Loop Computation Issues | (category total) | 18 | | |
Loop Computation Issues | Presence of a large number of scalar integer instructions | 12 | 44.44 | 1.36 |
Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 3 | 11.11 | 0.64 |
Loop Computation Issues | Low iteration count | 3 | 11.11 | 0.48 |
Control Flow Issues | (category total) | 45 | | |
Control Flow Issues | Non-innermost loop | 18 | 66.67 | 1.36 |
Control Flow Issues | Presence of more than 4 paths | 9 | 33.33 | 0.69 |
Control Flow Issues | Presence of 2 to 4 paths | 9 | 33.33 | 0.68 |
Control Flow Issues | Presence of calls | 6 | 22.22 | 0.00 |
Control Flow Issues | Low iteration count | 3 | 11.11 | 0.48 |
Data Access Issues | (category total) | 37 | | |
Data Access Issues | More than 20% of the loads are accessing the stack | 14 | 51.85 | 0.65 |
Data Access Issues | Presence of indirect access | 11 | 40.74 | 3.97 |
Data Access Issues | Presence of special instructions executing on a single port | 7 | 25.93 | 2.41 |
Data Access Issues | Presence of expensive instructions: scatter/gather | 3 | 11.11 | 1.50 |
Data Access Issues | Presence of constant non-unit stride data access | 2 | 7.41 | 0.03 |
Vectorization Roadblocks | (category total) | 55 | | |
Vectorization Roadblocks | Non-innermost loop | 18 | 66.67 | 1.36 |
Vectorization Roadblocks | Presence of indirect access | 11 | 40.74 | 3.97 |
Vectorization Roadblocks | Presence of 2 to 4 paths | 9 | 33.33 | 0.68 |
Vectorization Roadblocks | Presence of more than 4 paths | 9 | 33.33 | 0.69 |
Vectorization Roadblocks | Presence of calls | 6 | 22.22 | 0.00 |
Vectorization Roadblocks | Presence of constant non-unit stride data access | 2 | 7.41 | 0.03 |
Inefficient Vectorization | (category total) | 10 | | |
Inefficient Vectorization | Presence of special instructions executing on a single port | 7 | 25.93 | 2.41 |
Inefficient Vectorization | Presence of expensive instructions: scatter/gather | 3 | 11.11 | 1.50 |
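
The Percentage column appears to be each count divided by the total number of analyzed loops across the six runs (3 × 4 + 3 × 5 = 27). The report does not state this; it is inferred from the figures. A minimal Python sketch that checks the inference against a few rows of the table above (all variable and function names are illustrative):

```python
# Analyzed loops per run, copied from the run list (r_1 .. r_6).
loops_per_run = {"r_1": 4, "r_2": 4, "r_3": 4, "r_4": 5, "r_5": 5, "r_6": 5}
total_loops = sum(loops_per_run.values())

# (count, reported percentage) for a few rows of the summary table.
rows = {
    "Presence of a large number of scalar integer instructions": (12, 44.44),
    "Non-innermost loop": (18, 66.67),
    "Presence of indirect access": (11, 40.74),
    "Presence of constant non-unit stride data access": (2, 7.41),
}


def percentage(count, total):
    """Share of analyzed loops flagged by an analysis, in percent."""
    return 100.0 * count / total


# Collect rows whose reported value differs from count / total by more
# than the two-decimal rounding used in the table.
mismatches = {
    name: (count, reported)
    for name, (count, reported) in rows.items()
    if abs(percentage(count, total_loops) - reported) >= 0.005
}

for name, (count, reported) in rows.items():
    print(f"{name}: {count}/{total_loops} = {percentage(count, total_loops):.2f} (reported {reported})")
print("mismatches:", mismatches)
```

Under this reading, a row at 66.67 means the issue was flagged on two thirds of all analyzed loops, counted across every run rather than per run.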

Category | Analysis | r_1 | r_2 | r_3 | r_4 | r_5 | r_6 |
---|---|---|---|---|---|---|---|
Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 0 | 0 | 0 | 1 | 1 | 1 |
Loop Computation Issues | Presence of a large number of scalar integer instructions | 2 | 2 | 2 | 2 | 2 | 2 |
Loop Computation Issues | Low iteration count | 0 | 0 | 0 | 1 | 1 | 1 |
Control Flow Issues | Presence of calls | 1 | 1 | 1 | 1 | 1 | 1 |
Control Flow Issues | Presence of 2 to 4 paths | 3 | 0 | 0 | 2 | 2 | 2 |
Control Flow Issues | Presence of more than 4 paths | 0 | 3 | 3 | 1 | 1 | 1 |
Control Flow Issues | Non-innermost loop | 3 | 3 | 3 | 3 | 3 | 3 |
Control Flow Issues | Low iteration count | 0 | 0 | 0 | 1 | 1 | 1 |
Data Access Issues | Presence of constant non-unit stride data access | 2 | 0 | 0 | 0 | 0 | 0 |
Data Access Issues | Presence of indirect access | 3 | 1 | 1 | 2 | 2 | 2 |
Data Access Issues | Presence of expensive instructions: scatter/gather | 0 | 0 | 0 | 1 | 1 | 1 |
Data Access Issues | Presence of special instructions executing on a single port | 0 | 2 | 2 | 1 | 1 | 1 |
Data Access Issues | More than 20% of the loads are accessing the stack | 1 | 2 | 2 | 3 | 3 | 3 |
Vectorization Roadblocks | Presence of calls | 1 | 1 | 1 | 1 | 1 | 1 |
Vectorization Roadblocks | Presence of 2 to 4 paths | 3 | 0 | 0 | 2 | 2 | 2 |
Vectorization Roadblocks | Presence of more than 4 paths | 0 | 3 | 3 | 1 | 1 | 1 |
Vectorization Roadblocks | Non-innermost loop | 3 | 3 | 3 | 3 | 3 | 3 |
Vectorization Roadblocks | Presence of constant non-unit stride data access | 2 | 0 | 0 | 0 | 0 | 0 |
Vectorization Roadblocks | Presence of indirect access | 3 | 1 | 1 | 2 | 2 | 2 |
Inefficient Vectorization | Presence of expensive instructions: scatter/gather | 0 | 0 | 0 | 1 | 1 | 1 |
Inefficient Vectorization | Presence of special instructions executing on a single port | 0 | 2 | 2 | 1 | 1 | 1 |
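
Each Count in the summary table appears to be the sum of the per-run counts in this table, and each category total the sum of its rows. A short Python sketch of that cross-check for the Data Access Issues rows, with the numbers copied from the two tables (the dictionary names are illustrative, not part of the report):

```python
# Per-run counts (r_1 .. r_6) for the Data Access Issues rows.
per_run = {
    "Presence of constant non-unit stride data access": [2, 0, 0, 0, 0, 0],
    "Presence of indirect access": [3, 1, 1, 2, 2, 2],
    "Presence of expensive instructions: scatter/gather": [0, 0, 0, 1, 1, 1],
    "Presence of special instructions executing on a single port": [0, 2, 2, 1, 1, 1],
    "More than 20% of the loads are accessing the stack": [1, 2, 2, 3, 3, 3],
}

# Counts reported for the same rows in the summary table.
summary_count = {
    "Presence of constant non-unit stride data access": 2,
    "Presence of indirect access": 11,
    "Presence of expensive instructions: scatter/gather": 3,
    "Presence of special instructions executing on a single port": 7,
    "More than 20% of the loads are accessing the stack": 14,
}
category_total = 37  # "Data Access Issues" row of the summary table

# Sum each row across runs and compare with the summary.
row_sums = {name: sum(counts) for name, counts in per_run.items()}
rows_agree = row_sums == summary_count
category_agrees = sum(row_sums.values()) == category_total

print("row sums:", row_sums)
print("rows agree with summary:", rows_agree)
print("category total agrees:", category_agrees)
```

The same check can be repeated for the other categories by swapping in their rows.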