Total Time (s) 18.23
Profiled Time (s) 16.60
Time in analyzed loops (%) 48.9
Time in analyzed innermost loops (%) 41.1
Time in user code (%) 52.0
Compilation Options Score (%) 100
Array Access Efficiency (%) 52.1
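The percentages above can be turned into wall-clock seconds for a rough sense of scale. The sketch below is only an illustration. It assumes that each percentage is a share of the profiled time (16.60 s), which the report does not state explicitly, and the names `shares_pct` and `share_to_seconds` are hypothetical.

```python
# Values copied from the global metrics above.
total_time_s = 18.23
profiled_time_s = 16.60

# Assumption: each percentage is a share of the profiled time.
shares_pct = {
    "analyzed loops": 48.9,
    "analyzed innermost loops": 41.1,
    "user code": 52.0,
}


def share_to_seconds(base_s: float, pct: float) -> float:
    """Convert a percentage of base_s into seconds."""
    return base_s * pct / 100.0


seconds = {
    name: share_to_seconds(profiled_time_s, pct)
    for name, pct in shares_pct.items()
}

# Fraction of the total run that the profiler actually covered.
profiled_coverage_pct = 100.0 * profiled_time_s / total_time_s

for name, value in seconds.items():
    print(f"{name}: {value:.2f} s")
print(f"profiled / total: {profiled_coverage_pct:.1f} %")
```

If the percentages are instead relative to the total time, replace `profiled_time_s` with `total_time_s` in the conversion.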
Potential Speedups
Perfect Flow Complexity 1.02
Perfect OpenMP + MPI + Pthread 1.23
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution 2.14
No Scalar Integer Potential Speedup 1.03
  Nb Loops to get 80% 14
FP Vectorised Potential Speedup 1.02
  Nb Loops to get 80% 8
Fully Vectorised Potential Speedup 1.10
  Nb Loops to get 80% 33
FP Arithmetic Only Potential Speedup 1.10
  Nb Loops to get 80% 32
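Each potential speedup is a ratio, so it can be read as a projected run time by dividing a baseline time by the factor. A minimal sketch follows. It assumes the factors apply to the total time of 18.23 s (the report does not say which baseline is meant), and it treats each factor on its own rather than combining them. The helper `projected_time` is hypothetical.

```python
# Baseline and speedup factors copied from the report.
baseline_s = 18.23  # assumption: speedups are relative to total time

speedups = {
    "Perfect Flow Complexity": 1.02,
    "Perfect OpenMP + MPI + Pthread": 1.23,
    "Perfect OpenMP + MPI + Pthread + Perfect Load Distribution": 2.14,
    "No Scalar Integer": 1.03,
    "FP Vectorised": 1.02,
    "Fully Vectorised": 1.10,
    "FP Arithmetic Only": 1.10,
}


def projected_time(base_s: float, speedup: float) -> float:
    """Run time that would result if the given speedup were fully achieved."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return base_s / speedup


projected = {name: projected_time(baseline_s, s) for name, s in speedups.items()}

# Item with the largest estimated gain.
best = max(speedups, key=speedups.get)

for name, t in sorted(projected.items(), key=lambda kv: kv[1]):
    print(f"{name}: {t:.2f} s (saves {baseline_s - t:.2f} s)")
print("largest potential gain:", best)
```

In this report the parallelism and load-distribution item (2.14) is the largest factor by a wide margin. The vectorisation and arithmetic items range from 1.02 to 1.10.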
Source Object Issue
libgromacs_mpi.so.9.0.0
○ fft5d.cpp
○ threaded_force_buffer.cpp
○ pme_pp.cpp
○ pme_gather.cpp
○ listed_forces.cpp
○ simd_prune_kernel.cpp
○ partition.cpp
○ settle.cpp
○ pairlist.cpp
○ update.cpp
○ md_support.cpp
○ kernel_common.cpp
○ mdatoms.cpp
○ lincs.cpp
○ pbc.cpp
○ calc_verletbuf.h
○ pme_redistribute.cpp
○ domdec_specatomcomm.cpp
○ pme_grid.cpp
○ localtopology.cpp
○ pme_solve.cpp
○ pme_spread.cpp
○ calc_verletbuf.cpp
○ simd_kernel.h
○ fft_mkl.cpp
○ bonded.cpp
○ sim_util.cpp
○ grid.cpp
○ vec.h
○ domdec_constraints.cpp
○ pairs.cpp
○ domdec.cpp
○ atomdata.cpp
Application ../../install_gcc/bin/gmx_mpi
Timestamp 2024-08-05 19:41:40
Universal Timestamp 1722879700
Number of processes observed 48
Number of threads observed 192
Experiment Type MPI; OpenMP
Machine ins01.benchmarkcenter.megware.com
Model Name AMD EPYC 9654 96-Core Processor
Architecture x86_64
Micro Architecture ZEN_V4
Cache Size 1024 KB
Number of Cores 96
OS Version Linux 5.14.0-427.18.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Tue May 28 06:27:02 EDT 2024
Architecture used during static analysis x86_64
Micro Architecture used during static analysis ZEN_V4
Frequency Driver acpi-cpufreq
Frequency Governor performance
Huge Pages always
Hyperthreading on
Number of sockets 2
Number of cores per socket 96
Compilation Options libgromacs_mpi.so.9.0.0 : GNU C++17 13.2.0 -march=skylake-avx512 -g -O3 -std=c++17 -fno-omit-frame-pointer -fPIC -fexcess-precision=fast -funroll-all-loops -fopenmp
Comments GROMACS 2024.2 compiled with gcc 13.2, running on two 96-core AMD Zen 4 processors with 48 MPI ranks and 4 OpenMP threads per MPI rank. Pinning is controlled by GROMACS.
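The figures in this summary can be cross-checked against each other. The sketch below does two checks under stated assumptions: that "Universal Timestamp" is a Unix epoch value in seconds, and that "Timestamp" is the machine's local time. All variable names are hypothetical.

```python
from datetime import datetime, timezone

# Values copied from the experiment summary.
mpi_ranks = 48
omp_threads_per_rank = 4
threads_observed = 192
sockets = 2
cores_per_socket = 96

# Check 1: ranks x threads per rank should match the observed thread count.
total_threads = mpi_ranks * omp_threads_per_rank
physical_cores = sockets * cores_per_socket
threads_per_core = total_threads / physical_cores

# Check 2: compare the two timestamps.
# Assumption: "Timestamp" is local time, "Universal Timestamp" is epoch seconds.
local_stamp = datetime(2024, 8, 5, 19, 41, 40)
utc_stamp = datetime.fromtimestamp(1722879700, tz=timezone.utc)
utc_offset = local_stamp - utc_stamp.replace(tzinfo=None)

print("threads:", total_threads, "observed:", threads_observed)
print("threads per physical core:", threads_per_core)
print("UTC time:", utc_stamp.isoformat())
print("local offset from UTC:", utc_offset)
```

Under these assumptions the run uses 192 threads on 192 physical cores, and the local timestamp is two hours ahead of UTC. Whether each thread actually sat on its own physical core depends on the pinning chosen by GROMACS, which the report does not show.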
Dataset
Run Command <executable> mdrun -s ion_channel.tpr -nsteps 10000 -pin on -deffnm gcc
MPI Command mpirun -genv I_MPI_FABRICS=shm -n <number_processes>
Number Processes 48
Number Nodes 1
Number Processes per Nodes 48
Filter Not Used
Profile Start Not Used
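The "MPI Command" and "Run Command" entries are templates with the placeholders `<number_processes>` and `<executable>`. As a sketch of how they might combine into the launched command line, the code below substitutes the "Number Processes" and "Application" values from this report. The function `build_command` is hypothetical, and the report does not show how the 4 OpenMP threads per rank were requested, so that part is left out.

```python
import shlex

# Templates copied verbatim from the report.
mpi_template = "mpirun -genv I_MPI_FABRICS=shm -n <number_processes>"
run_template = (
    "<executable> mdrun -s ion_channel.tpr -nsteps 10000 -pin on -deffnm gcc"
)


def build_command(executable: str, number_processes: int) -> list[str]:
    """Fill both placeholders and return the command as an argument list."""
    mpi_part = mpi_template.replace("<number_processes>", str(number_processes))
    run_part = run_template.replace("<executable>", executable)
    return shlex.split(f"{mpi_part} {run_part}")


# Values taken from the "Application" and "Number Processes" entries.
argv = build_command("../../install_gcc/bin/gmx_mpi", 48)
command_line = shlex.join(argv)
print(command_line)
```

An argument list of this form can be passed to `subprocess.run(argv)` without going through a shell.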