Run run_0 | Number processes: 1; Number nodes: 1; Run Command: <executable> ../qmckl_bench/data/Alz_large.h5; MPI Command: ; Dataset: ; Run Directory: .; OMP_NUM_THREADS: 1 |
---|---|
Run run_1 | OMP_NUM_THREADS: 2 |
Run run_2 | OMP_NUM_THREADS: 4 |
Run run_3 | OMP_NUM_THREADS: 8 |
Run run_4 | OMP_NUM_THREADS: 16 |
Run run_5 | OMP_NUM_THREADS: 26 |
Run run_6 | OMP_NUM_THREADS: 52 |

Loop id | Source Location | Source Function | Level | Coverage run_0 (%) | Coverage run_1 (%) | Coverage run_2 (%) | Coverage run_3 (%) | Coverage run_4 (%) | Coverage run_5 (%) | Coverage run_6 (%) | Max Time Over Threads run_0 (s) | Max Time Over Threads run_1 (s) | Max Time Over Threads run_2 (s) | Max Time Over Threads run_3 (s) | Max Time Over Threads run_4 (s) | Max Time Over Threads run_5 (s) | Max Time Over Threads run_6 (s) | Time w.r.t. Wall Time run_0 (s) | Time w.r.t. Wall Time run_1 (s) | Time w.r.t. Wall Time run_2 (s) | Time w.r.t. Wall Time run_3 (s) | Time w.r.t. Wall Time run_4 (s) | Time w.r.t. Wall Time run_5 (s) | Time w.r.t. Wall Time run_6 (s) | Nb Threads run_0 | Nb Threads run_1 | Nb Threads run_2 | Nb Threads run_3 | Nb Threads run_4 | Nb Threads run_5 | Nb Threads run_6 | Vectorization Ratio (%) | Vector Length Use (%) | Speedup If No Scalar Integer | Speedup If FP Vectorized | Speedup If Fully Vectorized | Speedup If Perfect Load Balancing run_0 | Speedup If Perfect Load Balancing run_1 | Speedup If Perfect Load Balancing run_2 | Speedup If Perfect Load Balancing run_3 | Speedup If Perfect Load Balancing run_4 | Speedup If Perfect Load Balancing run_5 | Speedup If Perfect Load Balancing run_6 | Stride 0 | Stride 1 | Stride n | Stride Unknown | Stride Indirect | (run_0) Efficiency | (run_0) Potential Speed-Up (%) | (run_1) Efficiency | (run_1) Potential Speed-Up (%) | (run_2) Efficiency | (run_2) Potential Speed-Up (%) | (run_3) Efficiency | (run_3) Potential Speed-Up (%) | (run_4) Efficiency | (run_4) Potential Speed-Up (%) | (run_5) Efficiency | (run_5) Potential Speed-Up (%) | (run_6) Efficiency | (run_6) Potential Speed-Up (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
36 | libqmckl.so.0.0.0 - qmckl_ao.c:2433-3595 [...] | qmckl_compute_ao_vgl_hpc_gaussian | InBetween | 18.04 | 17.87 | 14.57 | 16.62 | 15.24 | 13.6 | 22.65 | 20.02 | 10.07 | 5.22 | 2.57 | 1.41 | 0.92 | 2.31 | 20.02 | 10.01 | 5 | 2.47 | 1.33 | 0.85 | 1.66 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 29.02 | 21.22 | 1.5 | 1.02 | 5.1 | 1 | 1.01 | 1.05 | 1.05 | 1.09 | 1.14 | 1.45 | NA | NA | NA | NA | NA | 1 | 0 | 1 | 0 | 1 | 0 | 1.01 | 0 | 0.94 | 0.9 | 0.91 | 1.28 | 0.23 | 17.4 |
106 | libqmckl.so.0.0.0 - qmckl_ao.c:2433-2986 [...] | qmckl_compute_ao_value_hpc_gaussian | InBetween | 12.88 | 12.61 | 10.13 | 12.61 | 10.66 | 9.43 | 4.07 | 14.29 | 7.06 | 3.59 | 2 | 0.98 | 0.68 | 0.34 | 14.29 | 7.06 | 3.48 | 1.88 | 0.93 | 0.59 | 0.3 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 28.75 | 20.91 | 1.41 | 1.02 | 5.17 | 1 | 1 | 1.04 | 1.08 | 1.09 | 1.21 | 1.17 | NA | NA | NA | NA | NA | 1 | 0 | 1.01 | 0 | 1.03 | 0 | 0.95 | 0.63 | 0.96 | 0.42 | 0.93 | 0.65 | 0.92 | 0.34 |
37 | libqmckl.so.0.0.0 - qmckl_ao.c:3480-3595 [...] | qmckl_compute_ao_vgl_hpc_gaussian | InBetween | 8.71 | 8.6 | 7.55 | 8.49 | 7.17 | 6.27 | 5.02 | 9.67 | 4.84 | 2.61 | 1.34 | 0.68 | 0.48 | 0.57 | 9.67 | 4.81 | 2.59 | 1.26 | 0.63 | 0.39 | 0.37 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 37.96 | 25.69 | 1.92 | 1.84 | 6.03 | 1 | 1.01 | 1.02 | 1.07 | 1.11 | 1.26 | 1.63 | NA | NA | NA | NA | NA | 1 | 0 | 1.01 | 0 | 0.93 | 0.5 | 0.96 | 0.35 | 0.96 | 0.29 | 0.95 | 0.29 | 0.5 | 2.5 |
40 | libqmckl.so.0.0.0 - qmckl_ao.c:3454-3470 | qmckl_compute_ao_vgl_hpc_gaussian | InBetween | 3.2 | 3.28 | 2.59 | 3.05 | 2.68 | 2.47 | 2.48 | 3.55 | 1.92 | 0.97 | 0.52 | 0.31 | 0.22 | 0.31 | 3.55 | 1.84 | 0.89 | 0.45 | 0.23 | 0.15 | 0.18 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 28.15 | 21.71 | 1.58 | 1 | 3.93 | 1 | 1.05 | 1.1 | 1.16 | 1.35 | 1.47 | 1.82 | 1 | 1.67 | 0 | 1.67 | 0.67 | 1 | 0 | 0.96 | 0.12 | 1 | 0.01 | 0.99 | 0.04 | 0.96 | 0.09 | 0.91 | 0.22 | 0.38 | 1.54 |
107 | libqmckl.so.0.0.0 - qmckl_ao.c:2940-2986 [...] | qmckl_compute_ao_value_hpc_gaussian | InBetween | 3.08 | 3.16 | 2.58 | 2.68 | 2.47 | 2.08 | 0.94 | 3.42 | 1.83 | 0.92 | 0.5 | 0.26 | 0.18 | 0.13 | 3.42 | 1.77 | 0.89 | 0.4 | 0.22 | 0.13 | 0.07 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 10 | 14.13 | 3.3 | 1.17 | 8.99 | 1 | 1.03 | 1.05 | 1.28 | 1.24 | 1.5 | 1.86 | NA | NA | NA | NA | NA | 1 | 0 | 0.97 | 0.11 | 0.96 | 0.1 | 1.07 | 0 | 0.97 | 0.07 | 1.01 | 0 | 0.94 | 0.06 |
112 | libqmckl.so.0.0.0 - qmckl_ao.c:2925-2985 [...] | qmckl_compute_ao_value_hpc_gaussian | InBetween | 3.02 | 3 | 2.57 | 2.78 | 2.58 | 2.28 | 0.93 | 3.35 | 1.69 | 0.96 | 0.47 | 0.28 | 0.19 | 0.11 | 3.35 | 1.68 | 0.88 | 0.41 | 0.23 | 0.14 | 0.07 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 0 | 9.82 | 3.75 | 1 | 10.91 | 1 | 1.01 | 1.09 | 1.15 | 1.27 | 1.36 | 1.57 | NA | NA | NA | NA | NA | 1 | 0 | 1 | 0.01 | 0.95 | 0.12 | 1.02 | 0 | 0.91 | 0.23 | 0.92 | 0.18 | 0.92 | 0.07 |
34 | bench_aos - | __intel_avx_rep_memset | Single | 2.77 | 2.8 | 2.62 | 3.24 | 2.82 | 2.77 | 1.8 | 3.07 | 1.65 | 1.05 | 0.6 | 0.3 | 0.22 | 0.19 | 3.07 | 1.57 | 0.9 | 0.48 | 0.25 | 0.17 | 0.13 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 100 | 50 | 1 | 1 | 2 | 1 | 1.06 | 1.18 | 1.25 | 1.25 | 1.29 | 1.46 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0.98 | 0.06 | 0.85 | 0.39 | 0.8 | 0.65 | 0.77 | 0.66 | 0.69 | 0.85 | 0.45 | 0.98 |
41 | libqmckl.so.0.0.0 - qmckl_ao.c:3463-3470 | qmckl_compute_ao_vgl_hpc_gaussian | Innermost | 2.42 | 2.41 | 1.95 | 2.26 | 1.83 | 1.81 | 2.49 | 2.69 | 1.36 | 0.7 | 0.38 | 0.22 | 0.16 | 0.34 | 2.69 | 1.35 | 0.67 | 0.34 | 0.16 | 0.11 | 0.18 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 75 | 40.63 | 1.15 | 1.33 | 3.33 | 1 | 1.01 | 1.06 | 1.15 | 1.38 | 1.45 | 2 | 1 | 2 | 0 | 0 | 1 | 1 | 0 | 1 | 0.01 | 1 | 0 | 0.99 | 0.02 | 1.05 | 0 | 0.94 | 0.11 | 0.29 | 1.77 |
113 | libqmckl.so.0.0.0 - qmckl_ao.c:2929-2932 | qmckl_compute_ao_value_hpc_gaussian | Innermost | 1.71 | 1.74 | 1.44 | 1.71 | 1.4 | 1.18 | 0.54 | 1.9 | 1 | 0.57 | 0.29 | 0.15 | 0.1 | 0.07 | 1.9 | 0.97 | 0.49 | 0.25 | 0.12 | 0.07 | 0.04 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 0 | 11.25 | 1.44 | 2.14 | 11.87 | 1 | 1.03 | 1.16 | 1.16 | 1.25 | 1.43 | 1.75 | 2 | 0 | 1 | 0 | 3 | 1 | 0 | 0.98 | 0.04 | 0.97 | 0.04 | 0.95 | 0.09 | 0.99 | 0.01 | 1.04 | 0 | 0.91 | 0.05 |
49 | libqmckl.so.0.0.0 - qmckl_ao.c:3404-3408 | qmckl_compute_ao_vgl_hpc_gaussian | Innermost | 1.71 | 1.56 | 1.4 | 1.73 | 1.35 | 1.22 | 0.81 | 1.9 | 0.91 | 0.5 | 0.31 | 0.16 | 0.11 | 0.11 | 1.9 | 0.87 | 0.48 | 0.26 | 0.12 | 0.08 | 0.06 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 0 | 12.5 | 1.13 | 2.25 | 8 | 1 | 1.05 | 1.04 | 1.24 | 1.45 | 1.57 | 1.83 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 1.09 | 0 | 0.99 | 0.01 | 0.91 | 0.15 | 0.99 | 0.01 | 0.91 | 0.11 | 0.61 | 0.32 |
120 | libqmckl.so.0.0.0 - qmckl_ao.c:2886-2890 | qmckl_compute_ao_value_hpc_gaussian | Innermost | 1.52 | 1.69 | 1.37 | 1.51 | 1.24 | 1.21 | 0.49 | 1.69 | 0.99 | 0.5 | 0.24 | 0.14 | 0.12 | 0.06 | 1.69 | 0.95 | 0.47 | 0.22 | 0.11 | 0.08 | 0.04 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 0 | 12.5 | 1.13 | 2.25 | 8 | 1 | 1.05 | 1.06 | 1.09 | 1.27 | 1.71 | 2 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0.89 | 0.19 | 0.9 | 0.14 | 0.96 | 0.06 | 0.96 | 0.05 | 0.81 | 0.23 | 0.81 | 0.09 |
35 | libqmckl.so.0.0.0 - qmckl_ao.c:2433-3595 [...] | qmckl_compute_ao_vgl_hpc_gaussian | Outermost | 1.4 | 1.41 | 1.42 | 1.42 | 1.36 | 1.29 | 2.84 | 1.55 | 0.82 | 0.77 | 0.26 | 0.15 | 0.13 | 0.27 | 1.55 | 0.79 | 0.49 | 0.21 | 0.12 | 0.08 | 0.21 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 9.09 | 13.64 | 3.5 | 1 | 7.8 | 1 | 1.04 | 1.6 | 1.24 | 1.25 | 1.63 | 1.35 | NA | NA | NA | NA | NA | 1 | 0 | 0.98 | 0.03 | 0.79 | 0.3 | 0.92 | 0.11 | 0.81 | 0.26 | 0.75 | 0.33 | 0.14 | 2.44 |
42 | libqmckl.so.0.0.0 - qmckl_ao.c:3441-3447 | qmckl_compute_ao_vgl_hpc_gaussian | Innermost | 1.23 | 1.35 | 1.05 | 1.05 | 1.1 | 0.96 | 0.37 | 1.36 | 0.77 | 0.41 | 0.19 | 0.13 | 0.08 | 0.05 | 1.36 | 0.76 | 0.36 | 0.16 | 0.1 | 0.06 | 0.03 | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 7.14 | 13.39 | 1.06 | 1.13 | 7.58 | 1 | 1.03 | 1.14 | 1.27 | 1.44 | 1.33 | 1.67 | 1 | 2 | 0 | 1 | 0 | 1 | 0 | 0.89 | 0.14 | 0.94 | 0.06 | 1.06 | 0 | 0.85 | 0.17 | 0.87 | 0.12 | 0.87 | 0.05 |
32 | bench_aos - | __intel_avx_rep_memcpy | Single | 0.43 | 0.43 | 0.35 | 0.41 | 0.34 | 0.29 | 0.13 | 0.48 | 0.48 | 0.48 | 0.48 | 0.46 | 0.45 | 0.47 | 0.48 | 0.24 | 0.12 | 0.06 | 0.03 | 0.02 | 0.01 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 100 | 50 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0.92 | 0.02 | 0.92 | 0.01 |
50 | libqmckl.so.0.0.0 - qmckl_ao.c:2669-2695 | qmckl_compute_ao_vgl_hpc_gaussian | Innermost | 0.32 | 0.31 | 0.24 | 0.29 | 0.36 | 0.26 | 0.25 | 0.36 | 0.19 | 0.12 | 0.07 | 0.05 | 0.03 | 0.04 | 0.36 | 0.17 | 0.08 | 0.04 | 0.03 | 0.02 | 0.02 | 1 | 2 | 4 | 8 | 16 | 25 | 47 | 12.9 | 13.51 | 1.18 | 1.25 | 9.41 | 1 | 1.12 | 1.5 | 1.75 | 1.67 | 1.5 | 2 | 1 | 8 | 0 | 0 | 1 | 1 | 0 | 1.06 | 0 | 1.13 | 0 | 1.13 | 0 | 0.75 | 0.09 | 0.69 | 0.08 | 0.35 | 0.16 |
121 | libqmckl.so.0.0.0 - qmckl_ao.c:2669-2695 | qmckl_compute_ao_value_hpc_gaussian | Innermost | 0.31 | 0.31 | 0.23 | 0.26 | 0.28 | 0.23 | 0.1 | 0.34 | 0.17 | 0.09 | 0.07 | 0.05 | 0.02 | 0.02 | 0.34 | 0.18 | 0.08 | 0.04 | 0.02 | 0.01 | 0.01 | 1 | 2 | 4 | 8 | 16 | 26 | 39 | 12.9 | 13.51 | 1.18 | 1.25 | 9.41 | 1 | 1 | 1.13 | 1.75 | 2.5 | 2 | 2 | 1 | 8 | 0 | 0 | 1 | 1 | 0 | 0.94 | 0.02 | 1.06 | 0 | 1.06 | 0 | 1.06 | 0 | 1.31 | 0 | 0.65 | 0.03 |
104 | libqmckl.so.0.0.0 - qmckl_ao.c:2433-2986 [...] | qmckl_compute_ao_value_hpc_gaussian | InBetween | 0.25 | 0.24 | 0.21 | 0.23 | 0.29 | 0.23 | 0.16 | 0.28 | 0.16 | 0.09 | 0.05 | 0.05 | 0.03 | 0.04 | 0.28 | 0.14 | 0.07 | 0.03 | 0.03 | 0.01 | 0.01 | 1 | 2 | 4 | 8 | 16 | 24 | 47 | 16.67 | 14.58 | 2.93 | 1 | 7.79 | 1 | 1.23 | 1.29 | 1.67 | 1.67 | 1.5 | 4 | NA | NA | NA | NA | NA | 1 | 0 | 1 | 0 | 1 | 0 | 1.17 | 0 | 0.58 | 0.12 | 1.08 | 0 | 0.54 | 0.07 |
119 | libqmckl.so.0.0.0 - qmckl_ao.c:2897-2898 | qmckl_compute_ao_value_hpc_gaussian | Innermost | 0.24 | 0.24 | 0.19 | 0.19 | 0.21 | 0.16 | 0.07 | 0.27 | 0.14 | 0.08 | 0.04 | 0.04 | 0.02 | 0.02 | 0.27 | 0.13 | 0.07 | 0.03 | 0.02 | 0.01 | 0.01 | 1 | 2 | 4 | 8 | 15 | 22 | 34 | 100 | 50 | 1.6 | 1 | 2 | 1 | 1.08 | 1.14 | 1.33 | 2 | 2 | 2 | 1 | 2 | 0 | 0 | 0 | 1 | 0 | 1.04 | 0 | 0.96 | 0.01 | 1.13 | 0 | 0.84 | 0.03 | 1.04 | 0 | 0.52 | 0.03 |
122 | libqmckl.so.0.0.0 - qmckl_ao.c:2666-2698 | qmckl_compute_ao_value_hpc_gaussian | InBetween | 0.19 | 0.11 | 0.13 | 0.13 | 0.14 | 0.14 | 0.04 | 0.21 | 0.08 | 0.05 | 0.04 | 0.03 | 0.02 | 0.01 | 0.21 | 0.06 | 0.05 | 0.02 | 0.01 | 0.01 | 0 | 1 | 2 | 4 | 8 | 14 | 21 | 22 | 22.22 | 13.54 | 2.88 | 2.88 | 13.43 | 1 | 1.33 | 1.25 | 2 | 3 | 2 | 1 | 2 | 5 | 3 | 2 | 0 | 1 | 0 | 1.75 | 0 | 1.05 | 0 | 1.31 | 0 | 1.31 | 0 | 0.81 | 0.03 | 1 | 0 |
51 | libqmckl.so.0.0.0 - qmckl_ao.c:2666-2698 | qmckl_compute_ao_vgl_hpc_gaussian | InBetween | 0.16 | 0.14 | 0.08 | 0.17 | 0.14 | 0.14 | 0.09 | 0.18 | 0.09 | 0.04 | 0.04 | 0.03 | 0.02 | 0.03 | 0.18 | 0.08 | 0.03 | 0.02 | 0.01 | 0.01 | 0.01 | 1 | 2 | 4 | 8 | 15 | 23 | 34 | 22.22 | 13.54 | 2.88 | 2.88 | 13.43 | 1 | 1.13 | 1.33 | 2 | 3 | 2 | 3 | 2 | 1 | 2 | 6 | 0 | 1 | 0 | 1.13 | 0 | 1.5 | 0 | 1.13 | 0 | 1.13 | 0 | 0.69 | 0.04 | 0.35 | 0.06 |
33 | bench_aos - | __intel_avx_rep_memcpy | Single | 0.07 | 0.07 | 0.06 | 0.05 | 0.06 | 0.04 | 0.02 | 0.07 | 0.07 | 0.08 | 0.06 | 0.08 | 0.06 | 0.08 | 0.08 | 0.04 | 0.02 | 0.01 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 100 | 50 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
43 | libqmckl.so.0.0.0 - qmckl_ao.c:3441-3447 | qmckl_compute_ao_vgl_hpc_gaussian | Innermost | 0.04 | 0.07 | 0.04 | 0.08 | 0.06 | 0.05 | 0.04 | 0.04 | 0.05 | 0.02 | 0.04 | 0.01 | 0 | 0.01 | 0.04 | 0.04 | 0.01 | 0.01 | 0.01 | 0 | 0 | 1 | 2 | 3 | 6 | 12 | 16 | 19 | 81.4 | 37.79 | 1.02 | 1.24 | 2.88 | 1 | 1.25 | 1 | 4 | 1 | 0 | 1 | 1 | 1 | 4 | 0 | 0 | 1 | 0 | 0.5 | 0.04 | 1 | 0 | 0.5 | 0.04 | 0.25 | 0.04 | 1 | 0 | 1 | 0 |
56 | libqmckl.so.0.0.0 - qmckl_ao.c:2625-2628 | qmckl_compute_ao_vgl_hpc_gaussian | Innermost | 0.02 | 0.01 | 0.02 | 0.01 | 0.03 | 0.02 | 0.01 | 0.02 | 0.01 | 0.01 | 0.01 | 0.01 | 0 | 0.01 | 0.02 | 0.01 | 0.01 | 0 | 0 | 0 | 0 | 1 | 2 | 4 | 3 | 7 | 6 | 6 | 0 | 12.5 | 1 | 1.33 | 8 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0.5 | 0.01 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
63 | libqmckl.so.0.0.0 - qmckl_ao.c:3300-3304 | qmckl_compute_ao_vgl_hpc_gaussian | Innermost | 0.01 | 0.01 | 0.04 | 0.12 | 0.2 | 0.39 | 0.56 | 0.02 | 0.01 | 0.03 | 0.03 | 0.03 | 0.04 | 0.08 | 0.01 | 0.01 | 0.01 | 0.02 | 0.02 | 0.02 | 0.04 | 1 | 1 | 4 | 8 | 16 | 26 | 52 | 91.37 | 46.76 | 1.36 | 1 | 2.36 | 1 | 1 | 3 | 1.5 | 1.5 | 2 | 2 | 2 | 0 | 0 | 3.25 | 0 | 1 | 0 | 0.5 | 0.01 | 0.25 | 0.03 | 0.06 | 0.11 | 0.03 | 0.19 | 0.02 | 0.38 | 0 | 0.56 |
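
The per-run "Efficiency" columns can be cross-checked against the "Time w.r.t. Wall Time" and "Nb Threads" columns. The sketch below assumes the usual strong-scaling definition, efficiency = T(run_0) / (n_threads × T(run_n)), which the report does not state explicitly; the helper name `parallel_efficiency` is hypothetical. The input values are copied from the loop 36 row.

```python
# Cross-check of the "Efficiency" columns for loop 36
# (qmckl_compute_ao_vgl_hpc_gaussian, qmckl_ao.c:2433-3595).
# Assumption: efficiency = T_ref / (n_threads * T_run), with run_0 as reference.
# The report itself does not document this formula.

# "Nb Threads" for run_0 .. run_6, copied from the table.
threads = [1, 2, 4, 8, 16, 26, 52]

# "Time w.r.t. Wall Time (s)" for run_0 .. run_6, copied from the table.
wall_time_s = [20.02, 10.01, 5.0, 2.47, 1.33, 0.85, 1.66]


def parallel_efficiency(t_ref: float, t_run: float, n_threads: int) -> float:
    """Strong-scaling efficiency relative to the single-thread reference run."""
    return t_ref / (n_threads * t_run)


efficiencies = [
    round(parallel_efficiency(wall_time_s[0], t, n), 2)
    for t, n in zip(wall_time_s, threads)
]

for run_id, (n, eff) in enumerate(zip(threads, efficiencies)):
    print(f"run_{run_id}: {n:2d} threads, efficiency {eff:.2f}")
```

Under this assumption the computed values (1.00, 1.00, 1.00, 1.01, 0.94, 0.91, 0.23) agree with the efficiencies reported for loop 36, including the drop at 52 threads, where wall time rises from 0.85 s to 1.66 s. Rows whose "Nb Threads" differs from OMP_NUM_THREADS (for example loop 50, with 47 threads in run_6) may not follow the same rule.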