Run | Source location | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
Skylake GCC Ofast Manual Unroll + SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 69-97 | 6 | 96.62 | 12.21 | 12.43 | 26 | 1.65 | 0.69 | 86.61 |
Skylake Clang O3 + ffast-math Manual Unroll + SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 69-97 | 9 | 96.19 | 11.67 | 12.08 | 26 | 1.80 | 0.72 | 93.00 |
Skylake ICPX Ofast Manual Unroll + SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 69-98 | 13 | 81.09 | 4.99 | 5.16 | 26 | 7.67 | 0.61 | 414.52 |
Skylake ICPX Ofast Manual Unroll + SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 111-117 | 12 | 13.66 | 0.84 | 0.93 | 26 | 7.54 | 0.24 | 9.64 |
Skylake GCC Ofast Manual Unroll + SoA | | 7 | 2.98 | 0.38 | 0.51 | 26 | 1.68 | 0.16 | 9.18 |
Skylake GCC Ofast Manual Unroll + SoA | | -1 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00 |
Skylake GCC Ofast Manual Unroll + SoA | | 332 | 0.06 | 0.01 | 0.03 | 11 | 0.09 | 0.01 | 0.00 |
Skylake GCC Ofast Manual Unroll + SoA | | 336 | 0.33 | 0.04 | 0.06 | 26 | 0.13 | 0.01 | 0.00 |
Skylake Clang O3 + ffast-math Manual Unroll + SoA | | 10 | 3.17 | 0.38 | 0.52 | 26 | 1.81 | 0.16 | 10.18 |
Skylake Clang O3 + ffast-math Manual Unroll + SoA | | 8 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00 |
Skylake Clang O3 + ffast-math Manual Unroll + SoA | | 1164 | 0.01 | 0.00 | 0.01 | 4 | 0.03 | 0.00 | 0.00 |
Skylake Clang O3 + ffast-math Manual Unroll + SoA | | 651 | 0.63 | 0.08 | 0.08 | 26 | 0.17 | 0.01 | 0.00 |
Skylake ICPX Ofast Manual Unroll + SoA | | 1164 | 0.07 | 0.00 | 0.01 | 6 | 0.11 | 0.00 | 0.00 |
Skylake ICPX Ofast Manual Unroll + SoA | | 1106 | 0.19 | 0.01 | 0.02 | 15 | 0.15 | 0.00 | 0.00 |
Skylake ICPX Ofast Manual Unroll + SoA | | 1102 | 4.98 | 0.31 | 0.21 | 26 | 2.34 | 0.06 | 0.00 |
Runs: GCC = Skylake GCC Ofast Manual Unroll + SoA; Clang = Skylake Clang O3 + ffast-math Manual Unroll + SoA; ICPX = Skylake ICPX Ofast Manual Unroll + SoA
Name | Module | Coverage (%) [GCC] | Coverage (%) [Clang] | Coverage (%) [ICPX] | Inclusive Time w.r.t. Wall Time (s) [GCC] | Inclusive Time w.r.t. Wall Time (s) [Clang] | Inclusive Time w.r.t. Wall Time (s) [ICPX] | Max Inc. Time over Threads (s) [GCC] | Max Inc. Time over Threads (s) [Clang] | Max Inc. Time over Threads (s) [ICPX] | Nb Threads [GCC] | Nb Threads [Clang] | Nb Threads [ICPX] | GFLOP/s [GCC] | GFLOP/s [Clang] | GFLOP/s [ICPX] | Deviation (coverage) [GCC] | Deviation (coverage) [Clang] | Deviation (coverage) [ICPX] | Deviation (time) [GCC] | Deviation (time) [Clang] | Deviation (time) [ICPX] |
k_means(int, point_t&, point_t&, int*, int, int) [clone ._omp_fn.0] | binary | 96.62 | NA | NA | 12.21 | NA | NA | 12.43 | NA | NA | 26 | NA | NA | 86.61 | NA | NA | 1.65 | NA | NA | 0.69 | NA | NA |
k_means(int, point_t&, point_t&, int*, int, int) [clone .omp_outlined] | binary | NA | 96.19 | NA | NA | 11.67 | NA | NA | 12.08 | NA | NA | 26 | NA | NA | 93.00 | NA | NA | 1.80 | NA | NA | 0.72 | NA |
k_means(int, point_t&, point_t&, int*, int, int) [clone .extracted.18] | binary | NA | NA | 81.09 | NA | NA | 4.99 | NA | NA | 5.16 | NA | NA | 26 | NA | NA | 414.52 | NA | NA | 7.67 | NA | NA | 0.61 |
k_means(int, point_t&, point_t&, int*, int, int) [clone .extracted] | binary | NA | NA | 13.66 | NA | NA | 0.84 | NA | NA | 0.93 | NA | NA | 26 | NA | NA | 9.64 | NA | NA | 7.54 | NA | NA | 0.24 |
kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libiomp5.so | NA | NA | 4.98 | NA | NA | 0.31 | NA | NA | 0.21 | NA | NA | 26 | NA | NA | 0.00 | NA | NA | 2.34 | NA | NA | 0.06 |
k_means(int, point_t&, point_t&, int*, int, int) [clone .omp_outlined.2] | binary | NA | 3.17 | NA | NA | 0.38 | NA | NA | 0.52 | NA | NA | 26 | NA | NA | 10.18 | NA | NA | 1.81 | NA | NA | 0.16 | NA |
k_means(int, point_t&, point_t&, int*, int, int) [clone ._omp_fn.1] | binary | 2.98 | NA | NA | 0.38 | NA | NA | 0.51 | NA | NA | 26 | NA | NA | 9.18 | NA | NA | 1.68 | NA | NA | 0.16 | NA | NA |
__kmpc_threadprivate_register_vec | libomp.so | NA | 0.63 | NA | NA | 0.08 | NA | NA | 0.08 | NA | NA | 26 | NA | NA | 0.00 | NA | NA | 0.17 | NA | NA | 0.01 | NA |
gomp_team_barrier_wait_end | libgomp.so.1.0.0 | 0.33 | NA | NA | 0.04 | NA | NA | 0.06 | NA | NA | 26 | NA | NA | 0.00 | NA | NA | 0.13 | NA | NA | 0.01 | NA | NA |
kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check() | libiomp5.so | NA | NA | 0.19 | NA | NA | 0.01 | NA | NA | 0.02 | NA | NA | 15 | NA | NA | 0.00 | NA | NA | 0.15 | NA | NA | 0.00 |
__sched_yield | libc.so.6 | NA | 0.01 | 0.07 | NA | 0.00 | 0.00 | NA | 0.01 | 0.01 | NA | 4 | 6 | NA | 0.00 | 0.00 | NA | 0.03 | 0.11 | NA | 0.00 | 0.00 |
gomp_barrier_wait_end | libgomp.so.1.0.0 | 0.06 | NA | NA | 0.01 | NA | NA | 0.03 | NA | NA | 11 | NA | NA | 0.00 | NA | NA | 0.09 | NA | NA | 0.01 | NA | NA |
_dl_rtld_di_serinfo | ld-linux-x86-64.so.2 | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
unknown_function | binary | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |