* Info: Selecting the 'perf-low-ppn' engine for node inti6212
* Info: "ref-cycles" not supported on inti6212: fallback to "cpu-clock"
* Info: Process launched (host inti6212, process 1663741)
miniqmc git branch: OMP_offload
miniqmc git commit: de45b04eb021c4b57ba6f4bee8f563c614d11135-dirty
number of ranks : 1, number of accelerators : 0
Number of orbitals/splines = 1536
Tile size = 1536
Number of tiles = 1
Number of electrons = 3072
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
OpenMP threads = 16
Number of walkers per rank = 16
SPO coefficients size = 786432000 bytes (750 MB)
delayed update rank = 32
Using SoA distance table, Jastrow + einspline,
and determinant update.
==================================
Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time    Calls  Time_per_call
Setup                                               0.0418          0.0418        1    0.041798230
  ParticleSet:::update                              0.0000          0.0000        1    0.000004210
Total                                              46.4226         12.0436        1   46.422591094
  Diffusion                                         9.5428          0.0190        5    1.908557894
    Complete Updates                                0.0805          0.0000        5    0.016093480
      Determinant::update                           0.0805          0.0805       10    0.008045307
    Current Gradient                                0.2733          0.0112    15360    0.000017793
      Determinant::ratio                            0.2576          0.2576    15360    0.000016773
      OneBodyJastrow                                0.0031          0.0031    15360    0.000000203
      TwoBodyJastrow                                0.0014          0.0014    15360    0.000000089
    Kinetic Energy                                  0.1355          0.1353        5    0.027108807
      OneBodyJastrow                                0.0001          0.0001        5    0.000025414
      TwoBodyJastrow                                0.0001          0.0001        5    0.000014982
    New Gradient                                    6.0263          0.0151    15360    0.000392337
      Determinant::ratio                            0.0244          0.0244    15360    0.000001585
      Determinant::spovgl                           5.8265          0.0807    15360    0.000379329
        Single-Particle Orbitals                    5.7458          5.7458    15360    0.000374072
      OneBodyJastrow                                0.0192          0.0192    15360    0.000001248
      TwoBodyJastrow                                0.1412          0.1412    15360    0.000009190
    ParticleSet:::acceptMove                        0.1777          0.0033     7611    0.000023350
      DTAAOMPTarget::update_e_e                     0.1703          0.1703     7611    0.000022374
      DTABOMPTarget::update_ion_e                   0.0042          0.0042     7611    0.000000549
    ParticleSet:::computeNewPosDT                   0.2670          0.0057    15360    0.000017381
      DTAAOMPTarget::move_e_e                       0.2350          0.2350    15360    0.000015298
      DTABOMPTarget::move_ion_e                     0.0263          0.0263    15360    0.000001710
    ParticleSet:::donePbyP                          0.0000          0.0000        5    0.000001472
    Update                                          2.5635          0.0062     7611    0.000336811
      Determinant::update                           2.3872          2.3872     7611    0.000313653
      OneBodyJastrow                                0.0019          0.0019     7611    0.000000253
      TwoBodyJastrow                                0.1681          0.1681     7611    0.000022088
  Initialization                                    1.6766          0.2618        1    1.676575295
    Determinant::inverse                            0.1859          0.1859        2    0.092954203
    Determinant::spovgl                             1.1257          0.0281        2    0.562825434
      Single-Particle Orbitals                      1.0976          1.0976     3072    0.000357288
    OneBodyJastrow                                  0.0030          0.0030        1    0.002987749
    ParticleSet:::update                            0.0669          0.0180        2    0.033447637
      DTAAOMPTarget::evaluate_e_e                   0.0391          0.0391        1    0.039131590
      DTABOMPTarget::evaluate_ion_e                 0.0097          0.0002        1    0.009737378
        DTABOMPTarget::offload_ion_e                0.0095          0.0095        1    0.009537898
    TwoBodyJastrow                                  0.0334          0.0334        1    0.033375522
  Pseudopotential                                  23.1596          0.0175        5    4.631927550
    Determinant::spoval                            22.2742          0.0088     5359    0.004156403
      Single-Particle Orbitals                     22.2654         22.2654     5359    0.004154763
    OneBodyJastrow                                  0.0097          0.0097     5359    0.000001818
    ParticleSet:::update                            0.6083          0.0036     5359    0.000113504
      DTABOMPTarget::evaluate_e_virtual             0.5557          0.0015     5359    0.000103701
        DTABOMPTarget::offload_e_virtual            0.5542          0.5542     5359    0.000103419
      DTABOMPTarget::evaluate_ion_virtual           0.0489          0.0014     5359    0.000009129
        DTABOMPTarget::offload_ion_virtual          0.0476          0.0476     5359    0.000008875
    TwoBodyJastrow                                  0.2500          0.2500     5359    0.000046650
========== Throughput ============
Total throughput ( N_walkers * N_elec^3 / Total time ) = 9.99204e+09
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 4.86081e+10
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 6.51975e+06
* Info: Process finished (host inti6212, process 1663741)
* Warning: Collected empty callchains for 67.8% of 1st-event samples
* Info: Callchains info will be incomplete
* Info: Try to recompile your application with -fno-omit-frame-pointer or to rerun with btm=stack
* Info: Dumping samples (host inti6212, process 1663741)
* Info: Dumping source info for callchain nodes (host inti6212, process 1663741)
* Info: Building/writing metadata (host inti6212)
* Info: Finished collect step (host inti6212, process 1663741)
Your experiment path is /ccc/work/cont001/ocre/oserete/miniqmc/build_icx/maqao_2023-06-18_12-31-42/tools/lprof_npsu_run_0
To display your profiling results:
#############################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#############################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/ccc/work/cont001/ocre/oserete/miniqmc/build_icx/maqao_2023-06-18_12-31-42/tools/lprof_npsu_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/ccc/work/cont001/ocre/oserete/miniqmc/build_icx/maqao_2023-06-18_12-31-42/tools/lprof_npsu_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/ccc/work/cont001/ocre/oserete/miniqmc/build_icx/maqao_2023-06-18_12-31-42/tools/lprof_npsu_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/ccc/work/cont001/ocre/oserete/miniqmc/build_icx/maqao_2023-06-18_12-31-42/tools/lprof_npsu_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/ccc/work/cont001/ocre/oserete/miniqmc/build_icx/maqao_2023-06-18_12-31-42/tools/lprof_npsu_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/ccc/work/cont001/ocre/oserete/miniqmc/build_icx/maqao_2023-06-18_12-31-42/tools/lprof_npsu_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/ccc/work/cont001/ocre/oserete/miniqmc/build_icx/maqao_2023-06-18_12-31-42/tools/lprof_npsu_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/ccc/work/cont001/ocre/oserete/miniqmc/build_icx/maqao_2023-06-18_12-31-42/tools/lprof_npsu_run_0 #
#############################################################################################################################################################
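
The three throughput figures in the log follow from values printed earlier in the same run (walkers per rank, electron count, and the inclusive times of the Total, Diffusion and Pseudopotential timers). The sketch below recomputes them as a sanity check. The function and variable names are illustrative only and are not part of miniqmc or MAQAO. The times are the rounded Inclusive_time values from the timer profile, so the results should be expected to match the logged figures only to about four or five significant digits.

```python
# Values copied from the log of this run.
n_walkers = 16            # "Number of walkers per rank"
n_elec = 3072             # "Number of electrons"
total_time = 46.4226      # Inclusive_time of "Total" (s)
diffusion_time = 9.5428   # Inclusive_time of "Diffusion" (s)
pseudo_time = 23.1596     # Inclusive_time of "Pseudopotential" (s)


def throughput(walkers: int, electrons: int, power: int, seconds: float) -> float:
    """Return walkers * electrons**power / seconds, the formula printed in the log."""
    return walkers * electrons ** power / seconds


# Total and Diffusion use N_elec^3; Pseudopotential uses N_elec^2.
total_tp = throughput(n_walkers, n_elec, 3, total_time)
diffusion_tp = throughput(n_walkers, n_elec, 3, diffusion_time)
pseudo_tp = throughput(n_walkers, n_elec, 2, pseudo_time)

print(f"Total throughput           = {total_tp:.4e}")
print(f"Diffusion throughput       = {diffusion_tp:.4e}")
print(f"Pseudopotential throughput = {pseudo_tp:.4e}")
```

The same timer values also show where the run spends its time: Pseudopotential (23.1596 s) accounts for roughly half of Total (46.4226 s), almost all of it in Single-Particle Orbitals under Determinant::spoval (22.2654 s).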