
Statistics

| Analysis | Count | Percentage | Weighted Count |
|---|---|---|---|
| Loop Computation Issues | 19 | | |
| Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 10 | 50.00 | 0.35 |
| Presence of a large number of scalar integer instructions | 5 | 25.00 | 0.14 |
| Presence of expensive FP instructions | 2 | 10.00 | 0.03 |
| Large loop body over microp cache size | 1 | 5.00 | 0.01 |
| Bottleneck in the front-end | 1 | 5.00 | 0.01 |
| Control Flow Issues | 9 | | |
| Presence of 2 to 4 paths | 4 | 20.00 | 0.07 |
| Presence of calls | 3 | 15.00 | 0.17 |
| Presence of more than 4 paths | 1 | 5.00 | 0.02 |
| Non-innermost loop | 1 | 5.00 | 0.01 |
| Data Access Issues | 23 | | |
| More than 20% of the loads are accessing the stack | 8 | 40.00 | 0.35 |
| Presence of indirect access | 4 | 20.00 | 0.12 |
| Presence of constant non-unit stride data access | 4 | 20.00 | 0.13 |
| Presence of special instructions executing on a single port | 3 | 15.00 | 0.06 |
| More than 10% of the vector loads instructions are unaligned | 3 | 15.00 | 0.06 |
| Presence of expensive instructions: scatter/gather | 1 | 5.00 | 0.03 |
| Vectorization Roadblocks | 20 | | |
| Presence of more than 4 paths | 4 | 20.00 | 0.19 |
| Presence of 2 to 4 paths | 4 | 20.00 | 0.07 |
| Presence of constant non-unit stride data access | 4 | 20.00 | 0.13 |
| Presence of indirect access | 4 | 20.00 | 0.12 |
| Presence of calls | 3 | 15.00 | 0.17 |
| Non-innermost loop | 1 | 5.00 | 0.01 |
| Inefficient Vectorization | 5 | | |
| Presence of special instructions executing on a single port | 3 | 15.00 | 0.06 |
| Use of masked instructions | 1 | 5.00 | 0.01 |
| Presence of expensive instructions: scatter/gather | 1 | 5.00 | 0.03 |
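
The Percentage column appears to be the Count expressed as a share of the analyzed loops. The figures are consistent with 20 loops in total (a Count of 10 corresponds to 50.00). The sketch below checks that reading on a few rows. The total of 20 is inferred from the table and is not stated by the report, and the names `TOTAL_LOOPS` and `percentage` are illustrative. The Weighted Count column cannot be reproduced from the data shown here, so it is left out.

```python
# Assumption: 20 analyzed loops in total, inferred from "Count 10 = 50.00".
TOTAL_LOOPS = 20

# (issue, count, reported percentage) copied from the Statistics table.
rows = [
    ("Less than 10% of the FP ADD/SUB/MUL operations use FMA", 10, 50.00),
    ("More than 20% of the loads are accessing the stack", 8, 40.00),
    ("Presence of 2 to 4 paths", 4, 20.00),
    ("Use of masked instructions", 1, 5.00),
]


def percentage(count, total=TOTAL_LOOPS):
    """Share of analyzed loops flagged with an issue, in percent."""
    return round(100.0 * count / total, 2)


# Every sampled row should match the value printed in the report.
for issue, count, reported in rows:
    assert percentage(count) == reported, issue
```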

Details

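| Analysis | Issue | r_1 | r_2 |
|---|---|---|---|
| Loop Computation Issues | Presence of expensive FP instructions | 1 | 1 |
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 5 | 5 |
| Loop Computation Issues | Large loop body over microp cache size | 1 | 0 |
| Loop Computation Issues | Presence of a large number of scalar integer instructions | 2 | 3 |
| Loop Computation Issues | Bottleneck in the front-end | 1 | 0 |
| Control Flow Issues | Presence of calls | 2 | 1 |
| Control Flow Issues | Presence of 2 to 4 paths | 2 | 2 |
| Control Flow Issues | Presence of more than 4 paths | 0 | 1 |
| Control Flow Issues | Non-innermost loop | 1 | 0 |
| Data Access Issues | Presence of constant non-unit stride data access | 2 | 2 |
| Data Access Issues | Presence of indirect access | 2 | 2 |
| Data Access Issues | More than 10% of the vector loads instructions are unaligned | 3 | 0 |
| Data Access Issues | Presence of expensive instructions: scatter/gather | 0 | 1 |
| Data Access Issues | Presence of special instructions executing on a single port | 3 | 0 |
| Data Access Issues | More than 20% of the loads are accessing the stack | 3 | 5 |
| Vectorization Roadblocks | Presence of calls | 2 | 1 |
| Vectorization Roadblocks | Presence of 2 to 4 paths | 2 | 2 |
| Vectorization Roadblocks | Presence of more than 4 paths | 2 | 2 |
| Vectorization Roadblocks | Non-innermost loop | 1 | 0 |
| Vectorization Roadblocks | Presence of constant non-unit stride data access | 2 | 2 |
| Vectorization Roadblocks | Presence of indirect access | 2 | 2 |
| Inefficient Vectorization | Presence of expensive instructions: scatter/gather | 0 | 1 |
| Inefficient Vectorization | Presence of special instructions executing on a single port | 3 | 0 |
| Inefficient Vectorization | Use of masked instructions | 1 | 0 |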
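
The per-run columns r_1 and r_2 appear to add up to the Count values in the Statistics section. As a minimal sketch of that relationship, the snippet below sums the two runs per category for two of the categories. The list layout and the helper name `category_totals` are illustrative choices, not part of the report.

```python
from collections import defaultdict

# (category, issue, r_1, r_2) copied from the Details table.
details = [
    ("Control Flow Issues", "Presence of calls", 2, 1),
    ("Control Flow Issues", "Presence of 2 to 4 paths", 2, 2),
    ("Control Flow Issues", "Presence of more than 4 paths", 0, 1),
    ("Control Flow Issues", "Non-innermost loop", 1, 0),
    ("Inefficient Vectorization",
     "Presence of expensive instructions: scatter/gather", 0, 1),
    ("Inefficient Vectorization",
     "Presence of special instructions executing on a single port", 3, 0),
    ("Inefficient Vectorization", "Use of masked instructions", 1, 0),
]


def category_totals(rows):
    """Sum the r_1 and r_2 counts of every issue within each category."""
    totals = defaultdict(int)
    for category, _issue, r1, r2 in rows:
        totals[category] += r1 + r2
    return dict(totals)


totals = category_totals(details)
print(totals)  # {'Control Flow Issues': 9, 'Inefficient Vectorization': 5}
```

The two totals match the category counts listed under Statistics (9 and 5).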