Run | Configuration |
---|---|
Run 1x1 | Number processes: 1; Number nodes: 1; Number processes per node: 1; Run Command: <executable>; MPI Command: mpirun -np <number_processes> -ppn <number_processes_per_node>; Dataset: (empty); Run Directory: /home/eoseret/tst_HACCmk; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread; OMP_NUM_THREADS: 1 |
Run 1x2 | OMP_NUM_THREADS: 2; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x4 | OMP_NUM_THREADS: 4; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x8 | OMP_NUM_THREADS: 8; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x16 | OMP_NUM_THREADS: 16; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x32 | OMP_NUM_THREADS: 32; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x64 | OMP_NUM_THREADS: 64; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x128 | OMP_NUM_THREADS: 128; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 1x192 | OMP_NUM_THREADS: 192; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |

Name | Module | Coverage 1x1 (%) | Coverage 1x2 (%) | Coverage 1x4 (%) | Coverage 1x8 (%) | Coverage 1x16 (%) | Coverage 1x32 (%) | Coverage 1x64 (%) | Coverage 1x128 (%) | Coverage 1x192 (%) | Max Time Over Threads 1x1 (s) | Max Time Over Threads 1x2 (s) | Max Time Over Threads 1x4 (s) | Max Time Over Threads 1x8 (s) | Max Time Over Threads 1x16 (s) | Max Time Over Threads 1x32 (s) | Max Time Over Threads 1x64 (s) | Max Time Over Threads 1x128 (s) | Max Time Over Threads 1x192 (s) | Time w.r.t. Wall Time 1x1 (s) | Time w.r.t. Wall Time 1x2 (s) | Time w.r.t. Wall Time 1x4 (s) | Time w.r.t. Wall Time 1x8 (s) | Time w.r.t. Wall Time 1x16 (s) | Time w.r.t. Wall Time 1x32 (s) | Time w.r.t. Wall Time 1x64 (s) | Time w.r.t. Wall Time 1x128 (s) | Time w.r.t. Wall Time 1x192 (s) | Nb Threads 1x1 | Nb Threads 1x2 | Nb Threads 1x4 | Nb Threads 1x8 | Nb Threads 1x16 | Nb Threads 1x32 | Nb Threads 1x64 | Nb Threads 1x128 | Nb Threads 1x192 | Deviation (coverage) 1x1 | Deviation (coverage) 1x2 | Deviation (coverage) 1x4 | Deviation (coverage) 1x8 | Deviation (coverage) 1x16 | Deviation (coverage) 1x32 | Deviation (coverage) 1x64 | Deviation (coverage) 1x128 | Deviation (coverage) 1x192 | Deviation (walltime) 1x1 | Deviation (walltime) 1x2 | Deviation (walltime) 1x4 | Deviation (walltime) 1x8 | Deviation (walltime) 1x16 | Deviation (walltime) 1x32 | Deviation (walltime) 1x64 | Deviation (walltime) 1x128 | Deviation (walltime) 1x192 | Categories 1x1 | Categories 1x2 | Categories 1x4 | Categories 1x8 | Categories 1x16 | Categories 1x32 | Categories 1x64 | Categories 1x128 | Categories 1x192 | GFLOPS 1x1 | GFLOPS 1x2 | GFLOPS 1x4 | GFLOPS 1x8 | GFLOPS 1x16 | GFLOPS 1x32 | GFLOPS 1x64 | GFLOPS 1x128 | GFLOPS 1x192 | Compilation Options | (1x1) Efficiency | (1x1) Potential Speed-Up (%) | (1x2) Efficiency | (1x2) Potential Speed-Up (%) | (1x4) Efficiency | (1x4) Potential Speed-Up (%) | (1x8) Efficiency | (1x8) Potential Speed-Up (%) | (1x16) Efficiency | (1x16) Potential Speed-Up (%) | (1x32) Efficiency | (1x32) Potential Speed-Up (%) | (1x64) Efficiency | (1x64) Potential Speed-Up (%) | (1x128) Efficiency | (1x128) Potential Speed-Up (%) | (1x192) Efficiency | (1x192) Potential Speed-Up (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
►Step10_orig | exec | 99.93 | 98.69 | 97.25 | 94.66 | 89.87 | 79.49 | 67.85 | 51.59 | 43.13 | 661.48 | 331.24 | 165.81 | 83.29 | 42.29 | 22.14 | 11.96 | 6.67 | 5.16 | 661.48 | 331.21 | 165.6 | 82.98 | 41.94 | 21.35 | 11.3 | 6.04 | 4.51 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 192 | 0.00 | 0.01 | 0.12 | 0.29 | 0.50 | 1.72 | 2.59 | 3.62 | 4.67 | 0.00 | 0.05 | 0.20 | 0.25 | 0.23 | 0.46 | 0.43 | 0.42 | 0.49 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | 3.54 | 7.08 | 14.16 | 28.26 | 55.90 | 109.81 | 207.46 | 388.08 | 519.68 | GNU GIMPLE 13.2.0 -march=znver4 -g -g -O3 -O3 -O3 -O3 -fno-openacc -fno-pie -fcf-protection=none -fno-omit-frame-pointer -fcf-protection=none -fopenmp -funroll-loops -ffast-math -fltrans | 1 | 0 | 1 | 0.14 | 1 | 0.14 | 1 | 0.34 | 0.99 | 1.28 | 0.97 | 2.53 | 0.91 | 5.79 | 0.86 | 7.45 | 0.76 | 10.18 |
○Loop 3 - Step10_orig.c:19-31 - exec | | 99.87 | 98.63 | 97.2 | 94.61 | 89.83 | 79.44 | 67.81 | 51.57 | 43.11 | 661.1 | 331.1 | 165.68 | 83.25 | 42.28 | 22.12 | 11.96 | 6.67 | 5.16 | 661.1 | 331.04 | 165.5 | 82.94 | 41.92 | 21.33 | 11.29 | 6.04 | 4.51 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 192 | 0.00 | 0.03 | 0.11 | 0.29 | 0.50 | 1.72 | 2.59 | 3.61 | 4.67 | 0.00 | 0.08 | 0.19 | 0.25 | 0.23 | 0.45 | 0.43 | 0.42 | 0.49 | | | | | | | | | | 3.54 | 7.08 | 14.16 | 28.25 | 55.89 | 109.85 | 207.54 | 387.83 | 519.28 | | 1 | 0 | 1 | 0.15 | 1 | 0.13 | 1 | 0.35 | 0.99 | 1.29 | 0.97 | 2.5 | 0.91 | 5.77 | 0.86 | 7.47 | 0.76 | 10.2 |
○Loop 2 - Step10_orig.c:19-35 - exec | | 0.02 | 0.01 | 0.01 | 0.01 | 0.01 | 0.02 | 0.01 | 0.01 | 0.01 | 0.11 | 0.05 | 0.04 | 0.01 | 0.02 | 0.01 | 0.01 | 0 | 0.01 | 0.11 | 0.04 | 0.02 | 0.01 | 0.01 | 0 | 0 | 0 | 0 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 124 | 176 | 0.00 | 0.00 | 0.01 | 0.00 | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 | 0.00 | 0.01 | 0.01 | 0.00 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | | | | | | | | | | 4.99 | 13.59 | 26.63 | 55.25 | 50.88 | 0.00 | 0.00 | 0.00 | 0.00 | | 1 | 0 | 1.38 | -0 | 1.38 | -0 | 1.38 | -0 | 0.69 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
►main | exec | 0.04 | 0.61 | 0.61 | 0.61 | 0.56 | 0.51 | 0.41 | 0.32 | 0.24 | 0.25 | 4.08 | 4.14 | 4.3 | 4.17 | 4.39 | 4.41 | 4.74 | 4.74 | 0.25 | 2.04 | 1.04 | 0.54 | 0.26 | 0.14 | 0.07 | 0.04 | 0.02 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | 10.01 | 1.22 | 2.40 | 4.62 | 9.62 | 17.95 | 35.61 | 62.44 | 124.69 | GNU GIMPLE 13.2.0 -march=znver4 -g -g -O3 -O3 -O3 -O3 -fno-openacc -fno-pie -fcf-protection=none -fno-omit-frame-pointer -fcf-protection=none -fopenmp -funroll-loops -ffast-math -fltrans | 1 | 0 | 0.06 | 0.57 | 0.06 | 0.57 | 0.06 | 0.57 | 0.06 | 0.53 | 0.06 | 0.48 | 0.06 | 0.39 | 0.05 | 0.3 | 0.07 | 0.22 |
►Loop 1 - main.c:57-169 - exec [...] | | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | | | | | | | | | | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | | | | | | | | | | | | | | | | | | | |
○Loop 0 - main.c:111-116 - exec | | 0.04 | 0.61 | 0.61 | 0.61 | 0.56 | 0.51 | 0.41 | 0.32 | 0.24 | 0.25 | 4.08 | 4.14 | 4.3 | 4.17 | 4.39 | 4.41 | 4.74 | 4.74 | 0.25 | 2.04 | 1.04 | 0.54 | 0.26 | 0.14 | 0.07 | 0.04 | 0.02 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | | | | | | | | | | 10.01 | 1.22 | 2.40 | 4.62 | 9.62 | 17.95 | 35.61 | 62.44 | 124.69 | | 1 | 0 | 0.06 | 0.57 | 0.06 | 0.57 | 0.06 | 0.57 | 0.06 | 0.53 | 0.06 | 0.48 | 0.06 | 0.39 | 0.05 | 0.3 | 0.07 | 0.22 |
►main._omp_fn.1 | exec | 0.02 | 0.02 | 0.02 | 0.02 | 0.04 | 0.06 | 0.07 | 0.19 | 0.19 | 0.13 | 0.07 | 0.05 | 0.03 | 0.03 | 0.03 | 0.03 | 0.07 | 0.06 | 0.13 | 0.06 | 0.03 | 0.02 | 0.02 | 0.02 | 0.01 | 0.02 | 0.02 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 192 | 0.00 | 0.00 | 0.01 | 0.01 | 0.02 | 0.03 | 0.05 | 0.14 | 0.16 | 0.00 | 0.01 | 0.02 | 0.01 | 0.01 | 0.01 | 0.01 | 0.02 | 0.02 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | Exe (%): 100.00 | 2.78 | 6.25 | 13.00 | 18.50 | 26.75 | 28.63 | 56.13 | 31.06 | 37.38 | GNU GIMPLE 13.2.0 -march=znver4 -g -g -O3 -O3 -O3 -O3 -fno-openacc -fno-pie -fcf-protection=none -fno-omit-frame-pointer -fcf-protection=none -fopenmp -funroll-loops -ffast-math -fltrans | 1 | 0 | 1.08 | -0 | 1.08 | -0 | 0.81 | 0 | 0.41 | 0.02 | 0.2 | 0.05 | 0.2 | 0.06 | 0.05 | 0.18 | 0.03 | 0.18 |
○Loop 4 - main.c:142-146 - exec | | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 | 0.04 | 0.03 | 0.03 | 0.02 | 0.12 | 0.07 | 0.05 | 0.02 | 0.03 | 0.03 | 0.01 | 0.02 | 0.04 | 0.12 | 0.06 | 0.03 | 0.02 | 0.01 | 0.01 | 0 | 0 | 0 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 124 | 181 | 0.00 | 0.00 | 0.01 | 0.01 | 0.02 | 0.03 | 0.03 | 0.04 | 0.05 | 0.00 | 0.01 | 0.02 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 | | | | | | | | | | 3.01 | 6.23 | 12.96 | 18.38 | 52.75 | 54.00 | 0.00 | 0.00 | 0.00 | | 1 | 0 | 1 | 0 | 1 | 0 | 0.75 | 0.01 | 0.75 | 0.01 | 0.38 | 0.03 | 1 | 0 | 1 | 0 | 1 | 0 |
○__memset_avx512_unaligned_erms | libc.so.6 | 0.01 | 0.01 | 0.01 | 0.01 | 0.02 | 0.01 | 0.01 | 0.01 | 0.01 | 0.07 | 0.08 | 0.04 | 0.08 | 0.11 | 0.12 | 0.13 | 0.12 | 0.12 | 0.07 | 0.04 | 0.01 | 0.01 | 0.01 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | Memory (%): 100.00 | Memory (%): 100.00 | Memory (%): 100.00 | Memory (%): 100.00 | Memory (%): 100.00 | Memory (%): 100.00 | Memory (%): 100.00 | Memory (%): 100.00 | Memory (%): 100.00 | 1.46 | 2.31 | 10.00 | 10.13 | 10.00 | 0.00 | 0.00 | 0.00 | 0.00 | | 1 | 0 | 0.88 | 0 | 1.75 | 0 | 0.88 | 0 | 0.44 | 0.01 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
○gomp_barrier_wait_end | libgomp.so.1.0.0 | 0 | 0.61 | 1.91 | 4.41 | 8.91 | 17.7 | 27.75 | 43.48 | 49.82 | 0 | 4.11 | 4.36 | 4.51 | 4.73 | 5.11 | 5.17 | 5.24 | 5.39 | 0 | 2.05 | 3.26 | 3.86 | 4.16 | 4.75 | 4.62 | 5.09 | 5.21 | 0 | 1 | 3 | 8 | 15 | 31 | 63 | 127 | 191 | 0.00 | 0.00 | 0.01 | 1.78 | 0.26 | 0.46 | 0.92 | 0.60 | 0.44 | 0.00 | 0.00 | 0.02 | 1.56 | 0.12 | 0.12 | 0.15 | 0.07 | 0.05 | NA | OMP (%): 100.00 | OMP (%): 100.00 | OMP (%): 100.00 | OMP (%): 100.00 | OMP (%): 100.00 | OMP (%): 100.00 | OMP (%): 100.00 | OMP (%): 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
○gomp_team_barrier_wait_end | libgomp.so.1.0.0 | 0 | 0.06 | 0.19 | 0.29 | 0.58 | 2.16 | 3.79 | 4.17 | 6.32 | 0 | 0.2 | 0.54 | 0.45 | 0.65 | 1.42 | 1.51 | 1.39 | 1.74 | 0 | 0.19 | 0.32 | 0.25 | 0.27 | 0.58 | 0.63 | 0.49 | 0.66 | 0 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 192 | 0.00 | 0.00 | 0.13 | 0.19 | 0.47 | 1.62 | 2.95 | 3.61 | 4.77 | 0.00 | 0.01 | 0.22 | 0.16 | 0.22 | 0.43 | 0.49 | 0.42 | 0.50 | NA | OMP (%): 100.00 | OMP (%): 100.00 | OMP (%): 100.00 | OMP (%): 100.00 | OMP (%): 100.00 | OMP (%): 100.00 | OMP (%): 100.00 | OMP (%): 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.01 | | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
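
The report does not say how the Efficiency columns are computed. For the Step10_orig row, the values appear consistent with the usual strong-scaling definition, reference time divided by (thread count × time at that thread count), applied to the "Time w.r.t. Wall Time" column. The sketch below checks that reading; the formula is an inference from the numbers, and the function name `parallel_efficiency` is hypothetical, not part of the tool.

```python
# Values copied from the Step10_orig row of the table above.
threads = [1, 2, 4, 8, 16, 32, 64, 128, 192]
wall_time_s = [661.48, 331.21, 165.6, 82.98, 41.94, 21.35, 11.3, 6.04, 4.51]


def parallel_efficiency(t_ref, t_n, n):
    """Strong-scaling efficiency: reference time / (n * time with n threads).

    Assumed definition; the report itself does not state the formula.
    """
    return t_ref / (n * t_n)


# Efficiency of each run relative to the single-thread run (1x1).
efficiencies = [
    round(parallel_efficiency(wall_time_s[0], t, n), 2)
    for n, t in zip(threads, wall_time_s)
]

for n, e in zip(threads, efficiencies):
    print(f"1x{n}: efficiency = {e:.2f}")
```

Under this assumption the computed values match the Efficiency cells reported for Step10_orig (1, 1, 1, 1, 0.99, 0.97, 0.91, 0.86, 0.76). The same check does not necessarily hold for rows with near-zero times, where rounding dominates.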