Run | Source location | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Skylake ICPX Ofast AoS (base) | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 113-125 | 13 | 79.84 | 83.57 | 103.39 | 26 | 8.12 | 8.41 | 78.52
Skylake ICPX Ofast SoA | NA | NA | NA | NA | NA | NA | NA | NA | NA
Skylake ICPX Ofast Manual Unroll | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 113-142 | 13 | 88.88 | 25.01 | 24.15 | 26 | 6.06 | 1.48 | 385.03
Skylake ICPX Ofast Manual Unroll + SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 113-142 | 13 | 88.33 | 24.82 | 24.36 | 26 | 5.04 | 1.26 | 385.48
Run | Source location | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Skylake ICPX Ofast AoS (base) | NA | NA | NA | NA | NA | NA | NA | NA | NA
Skylake ICPX Ofast SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 67-79 | 13 | 96.86 | 95.70 | 95.27 | 26 | 1.54 | 1.52 | 85.49
Skylake ICPX Ofast Manual Unroll | NA | NA | NA | NA | NA | NA | NA | NA | NA
Skylake ICPX Ofast Manual Unroll + SoA | NA | NA | NA | NA | NA | NA | NA | NA | NA
Run | Source location | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Skylake ICPX Ofast AoS (base) | NA | NA | NA | NA | NA | NA | NA | NA | NA
Skylake ICPX Ofast SoA | NA | NA | NA | NA | NA | NA | NA | NA | NA
Skylake ICPX Ofast Manual Unroll | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 155-161 | 12 | 10.51 | 2.96 | 4.70 | 26 | 6.07 | 1.52 | 9.62
Skylake ICPX Ofast Manual Unroll + SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 155-161 | 12 | 11.11 | 3.12 | 4.59 | 26 | 5.01 | 1.26 | 9.63
Run | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Skylake ICPX Ofast AoS (base) | 1164 | 0.32 | 0.34 | 0.67 | 26 | 0.15 | 0.16 | 0.00
Skylake ICPX Ofast AoS (base) | 1122 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast AoS (base) | 1106 | 0.44 | 0.46 | 0.85 | 26 | 0.20 | 0.20 | 0.00
Skylake ICPX Ofast AoS (base) | 2 | 0.00 | 0.00 | 0.01 | 4 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast AoS (base) | 2813 | 0.01 | 0.01 | 0.02 | 19 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast AoS (base) | 1102 | 17.35 | 18.16 | 33.15 | 26 | 7.72 | 8.03 | 0.00
Skylake ICPX Ofast AoS (base) | 18 | 0.00 | 0.00 | 0.01 | 2 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast AoS (base) | -1 | 0.00 | 0.00 | 0.00 | 7 | 0.00 | 0.00 | NA
Skylake ICPX Ofast SoA | 1164 | 0.00 | 0.00 | 0.01 | 9 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast SoA | 1106 | 0.00 | 0.00 | 0.01 | 11 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast SoA | 1087 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast SoA | 1993 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast SoA | 1102 | 0.16 | 0.16 | 0.19 | 26 | 0.03 | 0.03 | 0.00
Skylake ICPX Ofast SoA | -1 | 0.00 | 0.00 | 0.00 | 5 | 0.00 | 0.00 | NA
Skylake ICPX Ofast Manual Unroll | 1164 | 0.00 | 0.00 | 0.01 | 5 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast Manual Unroll | 1106 | 0.01 | 0.00 | 0.01 | 9 | 0.01 | 0.00 | 0.00
Skylake ICPX Ofast Manual Unroll | 1087 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast Manual Unroll | 1089 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast Manual Unroll | 2806 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast Manual Unroll | 1937 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast Manual Unroll | 1102 | 0.58 | 0.16 | 0.22 | 26 | 0.13 | 0.03 | 0.00
Skylake ICPX Ofast Manual Unroll | -1 | 0.00 | 0.00 | 0.00 | 3 | 0.00 | 0.00 | NA
Skylake ICPX Ofast Manual Unroll + SoA | 1164 | 0.01 | 0.00 | 0.01 | 5 | 0.01 | 0.00 | 0.00
Skylake ICPX Ofast Manual Unroll + SoA | 1106 | 0.01 | 0.00 | 0.01 | 8 | 0.01 | 0.00 | 0.00
Skylake ICPX Ofast Manual Unroll + SoA | 1087 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast Manual Unroll + SoA | 1083 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast Manual Unroll + SoA | 1102 | 0.54 | 0.15 | 0.18 | 26 | 0.11 | 0.03 | 0.00
Skylake ICPX Ofast Manual Unroll + SoA | 18 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00
Skylake ICPX Ofast Manual Unroll + SoA | -1 | 0.00 | 0.00 | 0.00 | 3 | 0.00 | 0.00 | NA
Run | Source location | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Skylake ICPX Ofast AoS (base) | NA | NA | NA | NA | NA | NA | NA | NA | NA
Skylake ICPX Ofast SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 92-98 | 12 | 2.97 | 2.93 | 5.03 | 26 | 1.54 | 1.47 | 9.80
Skylake ICPX Ofast Manual Unroll | NA | NA | NA | NA | NA | NA | NA | NA | NA
Skylake ICPX Ofast Manual Unroll + SoA | NA | NA | NA | NA | NA | NA | NA | NA | NA
Run | Source location | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Skylake ICPX Ofast AoS (base) | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 139-145 | 12 | 2.03 | 2.13 | 3.94 | 26 | 1.15 | 1.20 | 10.17
Skylake ICPX Ofast SoA | NA | NA | NA | NA | NA | NA | NA | NA | NA
Skylake ICPX Ofast Manual Unroll | NA | NA | NA | NA | NA | NA | NA | NA | NA
Skylake ICPX Ofast Manual Unroll + SoA | NA | NA | NA | NA | NA | NA | NA | NA | NA
Name | Module | Coverage (%): Skylake ICPX Ofast AoS (base) | Coverage (%): Skylake ICPX Ofast SoA | Coverage (%): Skylake ICPX Ofast Manual Unroll | Coverage (%): Skylake ICPX Ofast Manual Unroll + SoA | Inclusive Time w.r.t. Wall Time (s): Skylake ICPX Ofast AoS (base) | Inclusive Time w.r.t. Wall Time (s): Skylake ICPX Ofast SoA | Inclusive Time w.r.t. Wall Time (s): Skylake ICPX Ofast Manual Unroll | Inclusive Time w.r.t. Wall Time (s): Skylake ICPX Ofast Manual Unroll + SoA | Max Inc. Time over Threads (s): Skylake ICPX Ofast AoS (base) | Max Inc. Time over Threads (s): Skylake ICPX Ofast SoA | Max Inc. Time over Threads (s): Skylake ICPX Ofast Manual Unroll | Max Inc. Time over Threads (s): Skylake ICPX Ofast Manual Unroll + SoA | Nb Threads: Skylake ICPX Ofast AoS (base) | Nb Threads: Skylake ICPX Ofast SoA | Nb Threads: Skylake ICPX Ofast Manual Unroll | Nb Threads: Skylake ICPX Ofast Manual Unroll + SoA | GFLOP/s: Skylake ICPX Ofast AoS (base) | GFLOP/s: Skylake ICPX Ofast SoA | GFLOP/s: Skylake ICPX Ofast Manual Unroll | GFLOP/s: Skylake ICPX Ofast Manual Unroll + SoA | Deviation (coverage): Skylake ICPX Ofast AoS (base) | Deviation (coverage): Skylake ICPX Ofast SoA | Deviation (coverage): Skylake ICPX Ofast Manual Unroll | Deviation (coverage): Skylake ICPX Ofast Manual Unroll + SoA | Deviation (time): Skylake ICPX Ofast AoS (base) | Deviation (time): Skylake ICPX Ofast SoA | Deviation (time): Skylake ICPX Ofast Manual Unroll | Deviation (time): Skylake ICPX Ofast Manual Unroll + SoA
k_means(int, point_t*, point_t*, int*, int, int) [clone .extracted.18] | binary | 79.84 | NA | 88.88 | 88.33 | 83.57 | NA | 25.01 | 24.82 | 103.39 | NA | 24.15 | 24.36 | 26 | NA | 26 | 26 | 78.52 | NA | 385.03 | 385.48 | 8.12 | NA | 6.06 | 5.04 | 8.41 | NA | 1.48 | 1.26 |
k_means(int, point_t&, point_t&, int*, int, int) [clone .extracted.18] | binary | NA | 96.86 | NA | NA | NA | 95.70 | NA | NA | NA | 95.27 | NA | NA | NA | 26 | NA | NA | NA | 85.49 | NA | NA | NA | 1.54 | NA | NA | NA | 1.52 | NA | NA |
k_means(int, point_t*, point_t*, int*, int, int) [clone .extracted] | binary | 2.03 | NA | 10.51 | 11.11 | 2.13 | NA | 2.96 | 3.12 | 3.94 | NA | 4.70 | 4.59 | 26 | NA | 26 | 26 | 10.17 | NA | 9.62 | 9.63 | 1.15 | NA | 6.07 | 5.01 | 1.20 | NA | 1.52 | 1.26 |
kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libiomp5.so | 17.35 | 0.16 | 0.58 | 0.54 | 18.16 | 0.16 | 0.16 | 0.15 | 33.15 | 0.19 | 0.22 | 0.18 | 26 | 26 | 26 | 26 | 0.00 | 0.00 | 0.00 | 0.00 | 7.72 | 0.03 | 0.13 | 0.11 | 8.03 | 0.03 | 0.03 | 0.03 |
k_means(int, point_t&, point_t&, int*, int, int) [clone .extracted] | binary | NA | 2.97 | NA | NA | NA | 2.93 | NA | NA | NA | 5.03 | NA | NA | NA | 26 | NA | NA | NA | 9.80 | NA | NA | NA | 1.54 | NA | NA | NA | 1.47 | NA | NA |
kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check() | libiomp5.so | 0.44 | 0.00 | 0.01 | 0.01 | 0.46 | 0.00 | 0.00 | 0.00 | 0.85 | 0.01 | 0.01 | 0.01 | 26 | 11 | 9 | 8 | 0.00 | 0.00 | 0.00 | 0.00 | 0.20 | 0.00 | 0.01 | 0.01 | 0.20 | 0.00 | 0.00 | 0.00 |
__sched_yield | libc.so.6 | 0.32 | 0.00 | 0.00 | 0.01 | 0.34 | 0.00 | 0.00 | 0.00 | 0.67 | 0.01 | 0.01 | 0.01 | 26 | 9 | 5 | 5 | 0.00 | 0.00 | 0.00 | 0.00 | 0.15 | 0.00 | 0.00 | 0.01 | 0.16 | 0.00 | 0.00 | 0.00 |
__kmp_yield | libiomp5.so | 0.01 | NA | NA | NA | 0.01 | NA | NA | NA | 0.02 | NA | NA | NA | 19 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
__kmp_join_barrier(int) | libiomp5.so | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.01 | 0.01 | 0.01 | NA | 1 | 1 | 1 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 |
__intel_avx_rep_memset | binary | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.01 | NA | NA | 0.01 | 2 | NA | NA | 1 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 |
__kmp_fork_call | libiomp5.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
__kmp_hyper_barrier_gather(barrier_type, kmp_info*, int, int, void (*)(void*, void*), void*) | libiomp5.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
__kmp_invoke_microtask | libiomp5.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
__kmp_hyper_barrier_release(barrier_type, kmp_info*, int, int, int, void*) | libiomp5.so | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
.plt.sec@start | libiomp5.so | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 4 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
__kmp_invoke_task_func | libiomp5.so | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
getppid | libc.so.6 | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
unknown_kernel_region | kernel | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 7 | 5 | 3 | 3 | NA | NA | NA | NA | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |