Run Skylake ICPX Ofast Manual Unroll ONLY (no Hoisting) | Run Skylake ICPX Ofast Hoisting ONLY (no Manual Unroll) |
/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 68-86, 93-101 | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 68-81 |
ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
13 | 84.63 | 89.10 | 103.50 | 26 | 7.13 | 7.35 | 73.93 | 13 | 87.35 | 24.28 | 24.18 | 26 | 5.59 | 1.46 | 384.29 |
Run Skylake ICPX Ofast Manual Unroll ONLY (no Hoisting) | Run Skylake ICPX Ofast Hoisting ONLY (no Manual Unroll) |
ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
1168 | 0.23 | 0.25 | 0.51 | 25 | 0.12 | 0.12 | 0.00 | 1168 | 0.01 | 0.00 | 0.01 | 8 | 0.01 | 0.00 | 0.00 |
2 | 0.00 | 0.00 | 0.01 | 2 | 0.00 | 0.00 | 0.00 | 1106 | 0.02 | 0.00 | 0.01 | 13 | 0.01 | 0.00 | 0.00 |
1106 | 0.32 | 0.34 | 0.69 | 26 | 0.17 | 0.18 | 0.00 | 1976 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00 |
2813 | 0.00 | 0.00 | 0.02 | 7 | 0.00 | 0.00 | 0.00 | 1083 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00 |
1083 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00 | 1102 | 0.70 | 0.19 | 0.25 | 26 | 0.15 | 0.04 | 0.00 |
1102 | 13.04 | 13.73 | 26.27 | 26 | 6.57 | 6.85 | 0.00 | -1 | 0.00 | 0.00 | 0.00 | 2 | 0.00 | 0.00 | NA |
-1 | 0.00 | 0.00 | 0.00 | 7 | 0.00 | 0.00 | NA | |
Run Skylake ICPX Ofast Manual Unroll ONLY (no Hoisting) | Run Skylake ICPX Ofast Hoisting ONLY (no Manual Unroll) |
| /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 95-101 |
ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
| 12 | 11.92 | 3.31 | 5.02 | 26 | 5.58 | 1.39 | 10.02 |
Run Skylake ICPX Ofast Manual Unroll ONLY (no Hoisting) | Run Skylake ICPX Ofast Hoisting ONLY (no Manual Unroll) |
/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 114-120 | |
ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
12 | 1.78 | 1.87 | 3.91 | 26 | 1.10 | 1.15 | 10.12 | |
Name | Module | Coverage (%) [Manual Unroll ONLY] | Coverage (%) [Hoisting ONLY] | Inclusive Time w.r.t. Wall Time (s) [Manual Unroll ONLY] | Inclusive Time w.r.t. Wall Time (s) [Hoisting ONLY] | Max Inc. Time over Threads (s) [Manual Unroll ONLY] | Max Inc. Time over Threads (s) [Hoisting ONLY] | Nb Threads [Manual Unroll ONLY] | Nb Threads [Hoisting ONLY] | GFLOP/s [Manual Unroll ONLY] | GFLOP/s [Hoisting ONLY] | Deviation (coverage) [Manual Unroll ONLY] | Deviation (coverage) [Hoisting ONLY] | Deviation (time) [Manual Unroll ONLY] | Deviation (time) [Hoisting ONLY] |
k_means(int, point_t*, point_t*, int*, int, int) [clone .extracted.18] | binary | 84.63 | 87.35 | 89.10 | 24.28 | 103.50 | 24.18 | 26 | 26 | 73.93 | 384.29 | 7.13 | 5.59 | 7.35 | 1.46 |
kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libiomp5.so | 13.04 | 0.70 | 13.73 | 0.19 | 26.27 | 0.25 | 26 | 26 | 0.00 | 0.00 | 6.57 | 0.15 | 6.85 | 0.04 |
k_means(int, point_t*, point_t*, int*, int, int) [clone .extracted] | binary | 1.78 | 11.92 | 1.87 | 3.31 | 3.91 | 5.02 | 26 | 26 | 10.12 | 10.02 | 1.10 | 5.58 | 1.15 | 1.39 |
kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check() | libiomp5.so | 0.32 | 0.02 | 0.34 | 0.00 | 0.69 | 0.01 | 26 | 13 | 0.00 | 0.00 | 0.17 | 0.01 | 0.18 | 0.00 |
__sched_yield | libc.so.6 | 0.23 | 0.01 | 0.25 | 0.00 | 0.51 | 0.01 | 25 | 8 | 0.00 | 0.00 | 0.12 | 0.01 | 0.12 | 0.00 |
__kmp_yield | libiomp5.so | 0.00 | NA | 0.00 | NA | 0.02 | NA | 7 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
__kmp_hyper_barrier_release(barrier_type, kmp_info*, int, int, int, void*) | libiomp5.so | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
__kmp_launch_thread | libiomp5.so | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
.plt.sec@start | libiomp5.so | 0.00 | NA | 0.00 | NA | 0.01 | NA | 2 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
unknown_kernel_region | kernel | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 7 | 2 | NA | NA | 0.00 | 0.00 | 0.00 | 0.00 |
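
The two runs can be compared with simple ratios on the dominant kernel. The sketch below is illustrative only: it uses the inclusive time and GFLOP/s values reported above for k_means [clone .extracted.18], and the dictionary keys and the `ratio` helper are hypothetical names, not part of the profiler's output. It also assumes that "Inclusive Time w.r.t. Wall Time" is a reasonable basis for a time ratio.

```python
# Values copied from the k_means [clone .extracted.18] row of the function table.
# Keys and helper names are illustrative, not part of the profiler's output.
manual_unroll = {"inc_time_s": 89.10, "gflops": 73.93}
hoisting = {"inc_time_s": 24.28, "gflops": 384.29}


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, rejecting a zero denominator."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator


# How many times less inclusive time the hoisting-only run spends in the kernel.
time_ratio = ratio(manual_unroll["inc_time_s"], hoisting["inc_time_s"])
# How many times higher the reported GFLOP/s is in the hoisting-only run.
gflops_ratio = ratio(hoisting["gflops"], manual_unroll["gflops"])

print(f"inclusive time ratio: {time_ratio:.2f}x")
print(f"GFLOP/s ratio: {gflops_ratio:.2f}x")
```

With the figures above, this gives roughly 3.7x for inclusive time and 5.2x for GFLOP/s. The two ratios differ, so they should be read as separate indicators rather than as one speedup figure.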