Frontier (OLCF)

The Frontier cluster (see: Crusher) is located at OLCF. Each node contains 4 AMD MI250X GPUs, each with 2 Graphics Compute Dies (GCDs) for a total of 8 GCDs per node. You can think of the 8 GCDs as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).
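As a quick check, each GCD indeed appears as its own device: on an allocated compute node (see the getNode alias below), listing the GPUs with the ROCm tools should show eight entries. This is a minimal sketch and assumes the rocm module from the profile below is loaded:

# list the GPU devices visible on the compute node; expect 8 entries (one per GCD)
srun -N 1 -n 1 rocm-smi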

Introduction

If you are new to this system, please see the following resources:

  • Crusher user guide

  • Batch system: Slurm

  • Production directories:

    • $PROJWORK/$proj/: shared with all members of a project, purged every 90 days (recommended)

    • $MEMBERWORK/$proj/: single user, purged every 90 days (usually smaller quota)

    • $WORLDWORK/$proj/: shared with all users, purged every 90 days

    • Note that the $HOME directory is mounted as read-only on compute nodes. That means you cannot run in your $HOME.
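A common workflow is therefore to keep sources and builds in $HOME but to create run directories on the project work file system. A minimal sketch, assuming $proj is set to your project ID (as in the profile below) and my_run is just a placeholder name:

# create a run directory on $PROJWORK (purged every 90 days; move results you want to keep)
mkdir -p $PROJWORK/$proj/$USER/my_run
cd $PROJWORK/$proj/$USER/my_run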

Installation

Use the following commands to download the WarpX source code and its dependencies at the correct branches. You have to do this from a system with internet access, e.g., Summit or the OLCF home environment, since Frontier cannot connect directly to the internet:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
git clone https://github.com/AMReX-Codes/amrex.git $HOME/src/amrex
git clone https://github.com/ECP-WarpX/picsar.git $HOME/src/picsar
git clone -b 0.14.5 https://github.com/openPMD/openPMD-api.git $HOME/src/openPMD-api
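To update these sources later (again from a system with internet access), you can simply pull in each clone; note that openPMD-api was checked out at the fixed tag 0.14.5 above and does not need updating. A sketch:

# update the previously cloned repositories in place
cd $HOME/src/warpx  && git pull
cd $HOME/src/amrex  && git pull
cd $HOME/src/picsar && git pull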

To enable HDF5 support, work around the broken (empty) HDF5_VERSION variable in the Cray PE by commenting out the following lines in $HOME/src/openPMD-api/CMakeLists.txt: https://github.com/openPMD/openPMD-api/blob/0.14.5/CMakeLists.txt#L216-L220
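One way to comment out that block is with sed; this is only a sketch, so please verify the line range against the linked file for your checked-out version before applying it:

# prefix lines 216-220 of the openPMD-api CMakeLists.txt with '#'
sed -i '216,220 s/^/#/' $HOME/src/openPMD-api/CMakeLists.txt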

We use the following modules and environments on the system ($HOME/frontier_warpx.profile).

Listing 16 You can copy this file from Tools/machines/frontier-olcf/frontier_warpx.profile.example.
# please set your project account
#export proj=APH114-frontier

# required dependencies
module load cmake/3.22.2
module load craype-accel-amd-gfx90a
module load rocm/5.1.0
module load cray-mpich
module load cce/14.0.1  # must be loaded after rocm

# optional: faster builds
module load ccache
module load ninja

# optional: just an additional text editor
module load nano

# optional: for PSATD in RZ geometry support (not yet available)
#module load cray-libsci_acc/22.06.1.2
#module load blaspp
#module load lapackpp

# optional: for QED lookup table generation support
module load boost/1.79.0-cxx17

# optional: for openPMD support
#module load adios2/2.7.1
module load cray-hdf5-parallel/1.12.1.1

# optional: for Python bindings or libEnsemble
module load cray-python/3.9.12.1

# fix system defaults: do not escape $ with a \ on tab completion
shopt -s direxpand

# make output group-readable by default
umask 0027

# an alias to request an interactive batch node for one hour
#   for parallel execution, start on the batch node: srun <command>
alias getNode="salloc -A $proj -J warpx -t 01:00:00 -p batch -N 1 --ntasks-per-node=8 --gpus-per-task=1 --gpu-bind=closest"
# an alias to run a command on a batch node for up to 30min
#   usage: runNode <command>
alias runNode="srun -A $proj -J warpx -t 00:30:00 -p batch -N 1 --ntasks-per-node=8 --gpus-per-task=1 --gpu-bind=closest"

# GPU-aware MPI
export MPICH_GPU_SUPPORT_ENABLED=1

# optimize ROCm/HIP compilation for MI250X
export AMREX_AMD_ARCH=gfx90a

# compiler environment hints
export CC=$(which cc)
export CXX=$(which CC)
export FC=$(which ftn)
export CFLAGS="-I${ROCM_PATH}/include"
export CXXFLAGS="-I${ROCM_PATH}/include -Wno-pass-failed"
export LDFLAGS="-L${ROCM_PATH}/lib -lamdhip64"

We recommend storing the above lines in a file, such as $HOME/frontier_warpx.profile, and loading it into your shell after login:

source $HOME/frontier_warpx.profile
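If you want the profile loaded automatically at every login, one option is to source it from your ~/.bashrc; run the following once (it only appends the source line):

# source the profile from every future login shell
echo 'source $HOME/frontier_warpx.profile' >> $HOME/.bashrc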

Then, cd into the directory $HOME/src/warpx and use the following commands to compile:

cd $HOME/src/warpx
rm -rf build

cmake -S . -B build   \
  -DWarpX_COMPUTE=HIP \
  -DWarpX_amrex_src=$HOME/src/amrex \
  -DWarpX_picsar_src=$HOME/src/picsar \
  -DWarpX_openpmd_src=$HOME/src/openPMD-api
cmake --build build -j 32

The general cmake compile-time options apply as usual.
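For example, you could extend the configure step with further options such as the dimensionality or explicit openPMD output support; this is only a sketch, so adjust the options to your needs:

cmake -S . -B build   \
  -DWarpX_COMPUTE=HIP \
  -DWarpX_DIMS=3      \
  -DWarpX_OPENPMD=ON  \
  -DWarpX_amrex_src=$HOME/src/amrex \
  -DWarpX_picsar_src=$HOME/src/picsar \
  -DWarpX_openpmd_src=$HOME/src/openPMD-api
cmake --build build -j 32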

Running

MI250X GPUs (2x64 GB)

After requesting an interactive node with the getNode alias above, run a simulation like this, here using 8 MPI ranks and a single node:

runNode ./warpx inputs
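For reference, the aliases expand to standard Slurm commands; the following sketch shows the same interactive workflow without them (assuming $proj is set, as in the profile above):

# request one interactive node for one hour: 8 tasks, one GCD per task
salloc -A $proj -J warpx -t 01:00:00 -p batch -N 1 --ntasks-per-node=8 --gpus-per-task=1 --gpu-bind=closest
# from within the allocation, launch WarpX on the compute node
srun ./warpx inputs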

Or use a batch script for non-interactive runs:

Listing 17 You can copy this file from Tools/machines/frontier-olcf/submit.sh.
#!/usr/bin/env bash

#SBATCH -A <project id>
#SBATCH -J warpx
#SBATCH -o %x-%j.out
#SBATCH -t 00:10:00
#SBATCH -p batch
# Currently not configured on Frontier:
#S BATCH --ntasks-per-node=8
#S BATCH --cpus-per-task=8
#S BATCH --gpus-per-task=1
#S BATCH --gpu-bind=closest
#SBATCH -N 20

# load cray libs and ROCm libs
#export LD_LIBRARY_PATH=${CRAY_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH}

# From the documentation:
# Each Frontier compute node consists of [1x] 64-core AMD EPYC 7A53
# "Optimized 3rd Gen EPYC" CPU (with 2 hardware threads per physical core) with
# access to 512 GB of DDR4 memory.
# Each node also contains [4x] AMD MI250X, each with 2 Graphics Compute Dies
# (GCDs) for a total of 8 GCDs per node. The programmer can think of the 8 GCDs
# as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).

# note (5-16-22 and 7-12-22)
# this environment setting is currently needed on Frontier to work-around a
# known issue with Libfabric (both in the May and June PE)
#export FI_MR_CACHE_MAX_COUNT=0  # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks  # alternative cache monitor

# note (9-2-22, OLCFDEV-1079)
# this environment setting is needed to avoid that rocFFT writes a cache in
# the home directory, which does not scale.
export ROCFFT_RTC_CACHE_PATH=/dev/null

export OMP_NUM_THREADS=1
export WARPX_NMPI_PER_NODE=8
export TOTAL_NMPI=$(( ${SLURM_JOB_NUM_NODES} * ${WARPX_NMPI_PER_NODE} ))
srun -N${SLURM_JOB_NUM_NODES} -n${TOTAL_NMPI} --ntasks-per-node=${WARPX_NMPI_PER_NODE} \
    ./warpx inputs > output.txt
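The job script is then submitted and monitored with the usual Slurm commands; output.txt is the file written by the srun line above:

# submit the batch script and check the queue
sbatch submit.sh
squeue -u $USER
# follow the simulation output once the job is running
tail -f output.txt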

Post-Processing

For post-processing, most users use Python via OLCF's Jupyter service (Docs).

Please follow the same guidance as for OLCF Summit post-processing.

Known System Issues

Warning

May 16th, 2022 (OLCFHELP-6888): There is a caching bug in Libfabric that causes WarpX simulations to occasionally hang on Frontier when running on more than one node.

As a work-around, please export the following environment variable in your job scripts until the issue is fixed:

#export FI_MR_CACHE_MAX_COUNT=0  # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks  # alternative cache monitor

Warning

Sep 2nd, 2022 (OLCFDEV-1079): rocFFT in ROCm 5.1+ tries to write to a cache in the home area by default. This does not scale; disable it via:

export ROCFFT_RTC_CACHE_PATH=/dev/null