Executable Output

* Info: Selecting the 'perf-high-ppn' engine for node gmz16.benchmarkcenter.megware.com

* Info: Process launched (host gmz16.benchmarkcenter.megware.com, process 9469)
* Info: "ref-cycles" not supported on gmz16.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz16.benchmarkcenter.megware.com, process 9474)
   _  __       _         _
  | |/ /      (_)       | |
  | ' /  _ __  _  _ __  | | __ ___
  |  <  | '__|| || '_ \ | |/ // _ \ 
  | . \ | |   | || |_) ||   <|  __/
  |_|\_\|_|   |_|| .__/ |_|\_\\___|
                 | |
                 |_|        Version 1.2.4

LLNL-CODE-775068

Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC

Kripke is released under the BSD 3-Clause License, please see the
LICENSE file for the full license

This work was produced under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract
DE-AC52-07NA27344.

Author: Adam J. Kunen 

Compilation Options:
  Architecture:           OpenMP
  Compiler:               /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
  Compiler Flags:         "-O3 -march=native -O3 -march=znver4 -mprefer-vector-width=512 -flto -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=clang++     -Wall -Wextra  "
  Linker Flags:           " "
  CHAI Enabled:           No
  CUDA Enabled:           No
  MPI Enabled:            Yes
  OpenMP Enabled:         Yes
  Caliper Enabled:        No

OpenMP Thread->Core mapping for 1 threads on rank 0
    0->  0

Input Parameters
================

  Problem Size:
    Zones:                 16 x 16 x 16  (4096 total)
    Groups:                1024
    Legendre Order:        4
    Quadrature Set:        Dummy S2 with 96 points

  Physical Properties:
    Total X-Sec:           sigt=[0.100000, 0.000100, 0.100000]
    Scattering X-Sec:      sigs=[0.050000, 0.000050, 0.050000]

  Solver Options:
    Number iterations:     10

  MPI Decomposition Options:
    Total MPI tasks:       2
    Spatial decomp:        2 x 1 x 1 MPI tasks
    Block solve method:    Sweep

  Per-Task Options:
    DirSets/Directions:    8 sets, 12 directions/set
    GroupSet/Groups:       2 sets, 512 groups/set
    Zone Sets:             1 x 1 x 1
    Architecture:          OpenMP
    Data Layout:           DGZ

Generating Problem
==================

  Decomposition Space:   Procs:      Subdomains (local/global):
  ---------------------  ----------  --------------------------
  (P) Energy:            1           2 / 2
  (Q) Direction:         1           8 / 8
  (R) Space:             2           1 / 2
  (Rx,Ry,Rz) R in XYZ:   2x1x1       1x1x1 / 2x1x1
  (PQR) TOTAL:           2           16 / 32

  Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]

  Memory breakdown of Field variables:
  Field Variable            Num Elements    Megabytes
  --------------            ------------    ---------
  data/sigs                     15728640      120.000
  dx                                  16        0.000
  dy                                  16        0.000
  dz                                  16        0.000
  ell                               2400        0.018
  ell_plus                          2400        0.018
  i_plane                       25165824      192.000
  j_plane                       25165824      192.000
  k_plane                       25165824      192.000
  mixelem_to_fraction               4352        0.033
  phi                          104857600      800.000
  phi_out                      104857600      800.000
  psi                          402653184     3072.000
  quadrature/w                        96        0.001
  quadrature/xcos                     96        0.001
  quadrature/ycos                     96        0.001
  quadrature/zcos                     96        0.001
  rhs                          402653184     3072.000
  sigt_zonal                     4194304       32.000
  volume                            4096        0.031
  --------                  ------------    ---------
  TOTAL                       1110455664     8472.104

  Generation Complete!

Steady State Solve
==================

  iter 0: particle count=1.197998e+09, change=1.000000e+00
  iter 1: particle count=1.801368e+09, change=3.349511e-01
  iter 2: particle count=2.102278e+09, change=1.431351e-01
  iter 3: particle count=2.251810e+09, change=6.640521e-02
  iter 4: particle count=2.325888e+09, change=3.184924e-02
  iter 5: particle count=2.362467e+09, change=1.548355e-02
  iter 6: particle count=2.380471e+09, change=7.563193e-03
  iter 7: particle count=2.389305e+09, change=3.697158e-03
  iter 8: particle count=2.393627e+09, change=1.805479e-03
  iter 9: particle count=2.395735e+09, change=8.801810e-04
  Solver terminated

Timers
======

  Timer                    Count       Seconds
  ----------------  ------------  ------------
  Generate                     1       0.02985
  LPlusTimes                  10      17.01816
  LTimes                      10      25.68544
  Population                  10       1.67824
  Scattering                  10    1025.60752
  Solve                        1    1107.14039
  Source                      10       0.04474
  SweepSolver                 10      36.24844
  SweepSubdomain             160      18.97942

TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.029850,17.018161,25.685436,1.678242,1025.607521,1107.140387,0.044739,36.248443,18.979417

Figures of Merit
================

  Throughput:         3.636876e+06 [unknowns/(second/iteration)]
  Grind time :        2.749613e-07 [(seconds/iteration)/unknowns]
  Sweep efficiency :  52.35926 [100.0 * SweepSubdomain time / SweepSolver time]
  Number of unknowns: 402653184

END

* Info: Process finished (host gmz16.benchmarkcenter.megware.com, process 9469)
* Warning: (host gmz16.benchmarkcenter.megware.com, process 9469) Observed more threads (2) than expected (1): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=2.

* Info: Process finished (host gmz16.benchmarkcenter.megware.com, process 9474)
* Warning: (host gmz16.benchmarkcenter.megware.com, process 9474) Observed more threads (2) than expected (1): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=2.


Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/kripke/intel/Kripke/run/oneview_runs/compilers/aocc_10/oneview_results_scal/tools/lprof_npsu_run_0

To display your profiling results:
########################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                COMMAND                                                                                #
########################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/kripke/intel/Kripke/run/oneview_runs/compilers/aocc_10/oneview_results_scal/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/kripke/intel/Kripke/run/oneview_runs/compilers/aocc_10/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/kripke/intel/Kripke/run/oneview_runs/compilers/aocc_10/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/kripke/intel/Kripke/run/oneview_runs/compilers/aocc_10/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/kripke/intel/Kripke/run/oneview_runs/compilers/aocc_10/oneview_results_scal/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/kripke/intel/Kripke/run/oneview_runs/compilers/aocc_10/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/kripke/intel/Kripke/run/oneview_runs/compilers/aocc_10/oneview_results_scal/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/kripke/intel/Kripke/run/oneview_runs/compilers/aocc_10/oneview_results_scal/tools/lprof_npsu_run_0  #
########################################################################################################################################################################################################
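
The "Figures of Merit" block in the Kripke output above can be cross-checked against the TIMER_NAMES / TIMER_DATA lines and the input summary. The sketch below is not taken from Kripke's source. It assumes that the number of unknowns is zones x groups x quadrature points, and that throughput is unknowns x iterations divided by the Solve timer, as inferred from the printed units. Only the sweep efficiency formula is stated in the log itself. Under these assumptions the results agree with the reported figures to the printed precision.

```python
# Cross-check of Kripke's "Figures of Merit" from the timer lines in the log.
# The timer strings are copied verbatim from the TIMER_NAMES / TIMER_DATA lines.
timer_names = ("Generate,LPlusTimes,LTimes,Population,Scattering,"
               "Solve,Source,SweepSolver,SweepSubdomain").split(",")
timer_data = ("0.029850,17.018161,25.685436,1.678242,1025.607521,"
              "1107.140387,0.044739,36.248443,18.979417").split(",")
timers = dict(zip(timer_names, map(float, timer_data)))

# Problem size as reported under "Input Parameters".
zones = 16 * 16 * 16   # 4096 zones in total
groups = 1024          # energy groups
directions = 96        # quadrature points
iterations = 10        # solver iterations

# Assumption: one unknown per (zone, group, direction) triple.
unknowns = zones * groups * directions

# Assumption inferred from the units "unknowns/(second/iteration)".
throughput = unknowns * iterations / timers["Solve"]
grind_time = 1.0 / throughput

# This formula is printed in the log next to the value.
sweep_efficiency = 100.0 * timers["SweepSubdomain"] / timers["SweepSolver"]

print(f"Number of unknowns: {unknowns}")
print(f"Throughput:         {throughput:.6e}")
print(f"Grind time:         {grind_time:.6e}")
print(f"Sweep efficiency:   {sweep_efficiency:.5f}")
```

The same dictionary also shows where the time goes: the Scattering timer accounts for most of the Solve time in this run (about 1025.6 s out of 1107.1 s).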
