OV - Compare Loops

MAQAO

Loops

▶main.cpp: 74 - 61.10 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Run Neoverse V2 GCC O3 Base (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 SoA (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll + SoA (250 iterations, 96 threads)
Loop Source Regions							Loop Source Regions							Loop Source Regions							Loop Source Regions	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 74-87
																					2	18.97	13.74	61.10	0	48.61	100.05

No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							Sum on 1 analyzed binary loop (kmeans-gcc-O3 - 2)
Analysis						Count	Analysis						Count	Analysis						Count	Analysis						Count
																					Loop Computation Issues
																					Presence of a large number of scalar integer instructions						1
																					Control Flow Issues
																					Presence of 2 to 4 paths						1
																					Vectorization Roadblocks
																					Presence of 2 to 4 paths						1

▶main.cpp: 71 - 59.73 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Run Neoverse V2 GCC O3 Base (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 SoA (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll + SoA (250 iterations, 96 threads)
Loop Source Regions							Loop Source Regions	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 71-75						Loop Source Regions							Loop Source Regions
							1	8.80	8.48	59.73	0	47.22	105.92

No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							Sum on 1 analyzed binary loop (kmeans-gcc-O3 - 1)							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Analysis						Count	Analysis						Count	Analysis						Count	Analysis						Count

▶main.cpp: 118 - 58.19 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Run Neoverse V2 GCC O3 Base (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 SoA (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll + SoA (250 iterations, 96 threads)
Loop Source Regions							Loop Source Regions							Loop Source Regions	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 118-131						Loop Source Regions
														2	17.60	13.27	58.19	12.5	54.69	101.73

No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							Sum on 1 analyzed binary loop (kmeans-gcc-O3 - 2)							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Analysis						Count	Analysis						Count	Analysis						Count	Analysis						Count
														Loop Computation Issues
														Presence of a large number of scalar integer instructions						1
														Control Flow Issues
														Presence of 2 to 4 paths						1
														Vectorization Roadblocks
														Presence of 2 to 4 paths						1

▶main.cpp: 115 - 41.49 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Run Neoverse V2 GCC O3 Base (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 SoA (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll + SoA (250 iterations, 96 threads)
Loop Source Regions	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 115-120						Loop Source Regions							Loop Source Regions							Loop Source Regions
2	6.00	5.09	41.49	18.18	52.27	114.87

Sum on 1 analyzed binary loop (kmeans-gcc-O3 - 2)							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Analysis						Count	Analysis						Count	Analysis						Count	Analysis						Count
Loop Computation Issues
Presence of a large number of scalar integer instructions						1
Control Flow Issues
Vectorization Roadblocks
Presence of more than 4 paths						1

▶main.cpp: 140 - 6.14 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Run Neoverse V2 GCC O3 Base (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 SoA (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll + SoA (250 iterations, 96 threads)
Loop Source Regions	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 140-143						Loop Source Regions							Loop Source Regions							Loop Source Regions
11	0.82	0.75	6.14	10	47.5	46.61

Sum on 1 analyzed binary loop (kmeans-gcc-O3 - 11)							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Analysis						Count	Analysis						Count	Analysis						Count	Analysis						Count
Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA						1
Presence of a large number of scalar integer instructions						1
Data Access Issues
Presence of indirect access						1
Vectorization Roadblocks
Presence of indirect access						1

▶main.cpp: 94 - 2.02 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Run Neoverse V2 GCC O3 Base (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 SoA (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll + SoA (250 iterations, 96 threads)
Loop Source Regions							Loop Source Regions	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 94-97						Loop Source Regions							Loop Source Regions
							12	0.60	0.29	2.02	0	43.18	46.44

No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							Sum on 1 analyzed binary loop (kmeans-gcc-O3 - 12)							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Analysis						Count	Analysis						Count	Analysis						Count	Analysis						Count
							Loop Computation Issues
							Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA						1
							Presence of a large number of scalar integer instructions						1
							Data Access Issues
							Presence of indirect access						1
							Vectorization Roadblocks
							Presence of indirect access						1

▶main.cpp: 157 - 1.01 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Run Neoverse V2 GCC O3 Base (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 SoA (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll + SoA (250 iterations, 96 threads)
Loop Source Regions							Loop Source Regions							Loop Source Regions	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 157-160						Loop Source Regions
														16	0.67	0.23	1.01	10	47.5	45.78

No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							Sum on 1 analyzed binary loop (kmeans-gcc-O3 - 16)							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Analysis						Count	Analysis						Count	Analysis						Count	Analysis						Count
														Loop Computation Issues
														Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA						1
														Presence of a large number of scalar integer instructions						1
														Data Access Issues
														Presence of indirect access						1
														Vectorization Roadblocks
														Presence of indirect access						1

▶main.cpp: 113 - 0.53 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Run Neoverse V2 GCC O3 Base (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 SoA (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll + SoA (250 iterations, 96 threads)
Loop Source Regions							Loop Source Regions							Loop Source Regions							Loop Source Regions	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 113-116
																					16	0.24	0.12	0.53	0	43.18	45.76

No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							Sum on 1 analyzed binary loop (kmeans-gcc-O3 - 16)
Analysis						Count	Analysis						Count	Analysis						Count	Analysis						Count
																					Loop Computation Issues
																					Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA						1
																					Presence of a large number of scalar integer instructions						1
																					Data Access Issues
																					Presence of indirect access						1
																					Vectorization Roadblocks
																					Presence of indirect access						1

▶<unknown>: 0 - 0.00 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Run Neoverse V2 GCC O3 Base (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 SoA (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll (250 iterations, 96 threads)							Run Neoverse V2 GCC O3 Manual Unroll + SoA (250 iterations, 96 threads)
Loop Source Regions							Loop Source Regions							Loop Source Regions							Loop Source Regions
14	0.00	0.00	0.00	0	0	0	11	0.00	0.00	0.00	0	0	0	13	0.00	0.00	0.00	0	0	0
10	0.00	0.00	0.00	0	0	0
12	0.00	0.00	0.00	0	0	0

No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.							No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Analysis						Count	Analysis						Count	Analysis						Count	Analysis						Count
