Frontier (OLCF)¶
The Frontier cluster is located at OLCF; its test and development system is Crusher. Each node contains 4 AMD MI250X GPUs, each with 2 Graphics Compute Dies (GCDs), for a total of 8 GCDs per node. You can think of the 8 GCDs as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).
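The node layout above can be summarized with a quick sanity check (plain arithmetic, no assumptions beyond the numbers already stated):

```shell
# Sanity arithmetic for the Frontier node layout described above
GCDS_PER_NODE=$(( 4 * 2 ))              # 4 MI250X cards, 2 GCDs each
HBM_PER_NODE=$(( GCDS_PER_NODE * 64 ))  # 64 GB of HBM2E per GCD
echo "${GCDS_PER_NODE} GCDs, ${HBM_PER_NODE} GB HBM per node"
```

This is why the batch examples below use 8 MPI ranks per node: one rank per GCD.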
Introduction¶
If you are new to this system, please see the following resources:

- Batch system: Slurm
- Production directories:
  - $PROJWORK/$proj/: shared with all members of a project, purged every 90 days (recommended)
  - $MEMBERWORK/$proj/: single user, purged every 90 days (usually smaller quota)
  - $WORLDWORK/$proj/: shared with all users, purged every 90 days

Note that the $HOME directory is mounted as read-only on compute nodes. That means you cannot run in your $HOME.
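Since $HOME is read-only on compute nodes, stage your runs in one of the work directories above. A minimal sketch, using a hypothetical project ID for $proj and a /tmp fallback for $PROJWORK (on Frontier, $PROJWORK is set by the system):

```shell
# Sketch: stage a run directory on the project work filesystem.
# "aph114" is a hypothetical project ID; the /tmp fallback is only
# for illustration off-site — on Frontier, $PROJWORK is predefined.
proj=${proj:-aph114}
workdir=${PROJWORK:-/tmp/projwork}/$proj/${USER:-$(whoami)}
rundir=$workdir/warpx_run_001
mkdir -p "$rundir"
cd "$rundir"
pwd
```

Copy your inputs file into such a directory before submitting a job, so all output lands on the purged, high-capacity filesystem rather than in $HOME.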
Installation¶
Use the following commands to download the WarpX source code and switch to the correct branch. Run them from a system with internet access, such as an OLCF home/login resource or Summit, since Frontier cannot connect directly to the internet:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
git clone https://github.com/AMReX-Codes/amrex.git $HOME/src/amrex
git clone https://github.com/ECP-WarpX/picsar.git $HOME/src/picsar
git clone -b 0.14.5 https://github.com/openPMD/openPMD-api.git $HOME/src/openPMD-api
To enable HDF5, work around the broken HDF5_VERSION variable (empty) in the Cray PE by commenting out the following lines in $HOME/src/openPMD-api/CMakeLists.txt:
https://github.com/openPMD/openPMD-api/blob/0.14.5/CMakeLists.txt#L216-L220
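Rather than editing by hand, the indicated lines can be commented out in one step. A hedged sketch; the line range 216-220 is taken from the link above and applies to the 0.14.5 tag only:

```shell
# Sketch: comment out the HDF5_VERSION check (lines 216-220 in the
# 0.14.5 tag) in the cloned openPMD-api source. A .bak backup of the
# original file is kept alongside.
sed -i.bak '216,220 s/^/#/' $HOME/src/openPMD-api/CMakeLists.txt
sed -n '216,220p' $HOME/src/openPMD-api/CMakeLists.txt  # verify: lines now start with '#'
```

Verify the printed lines start with `#` before configuring; on a different tag, check the link above for the current line range first.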
We use the following modules and environments on the system ($HOME/frontier_warpx.profile).
# please set your project account
#export proj=APH114-frontier
# required dependencies
module load cmake/3.22.2
module load craype-accel-amd-gfx90a
module load rocm/5.1.0
module load cray-mpich
module load cce/14.0.1 # must be loaded after rocm
# optional: faster builds
module load ccache
module load ninja
# optional: just an additional text editor
module load nano
# optional: for PSATD in RZ geometry support (not yet available)
#module load cray-libsci_acc/22.06.1.2
#module load blaspp
#module load lapackpp
# optional: for QED lookup table generation support
module load boost/1.79.0-cxx17
# optional: for openPMD support
#module load adios2/2.7.1
module load cray-hdf5-parallel/1.12.1.1
# optional: for Python bindings or libEnsemble
module load cray-python/3.9.12.1
# fix system defaults: do not escape $ with a \ on tab completion
shopt -s direxpand
# make output group-readable by default
umask 0027
# an alias to request an interactive batch node for one hour
# for parallel execution, start on the batch node: srun <command>
alias getNode="salloc -A $proj -J warpx -t 01:00:00 -p batch -N 1 --ntasks-per-node=8 --gpus-per-task=1 --gpu-bind=closest"
# an alias to run a command on a batch node for up to 30min
# usage: runNode <command>
alias runNode="srun -A $proj -J warpx -t 00:30:00 -p batch -N 1 --ntasks-per-node=8 --gpus-per-task=1 --gpu-bind=closest"
# GPU-aware MPI
export MPICH_GPU_SUPPORT_ENABLED=1
# optimize ROCm/HIP compilation for MI250X
export AMREX_AMD_ARCH=gfx90a
# compiler environment hints
export CC=$(which cc)
export CXX=$(which CC)
export FC=$(which ftn)
export CFLAGS="-I${ROCM_PATH}/include"
export CXXFLAGS="-I${ROCM_PATH}/include -Wno-pass-failed"
export LDFLAGS="-L${ROCM_PATH}/lib -lamdhip64"
We recommend storing the above lines in a file, such as $HOME/frontier_warpx.profile, and loading it into your shell after login:
source $HOME/frontier_warpx.profile
Then, cd into the directory $HOME/src/warpx and use the following commands to compile:
cd $HOME/src/warpx
rm -rf build
cmake -S . -B build \
-DWarpX_COMPUTE=HIP \
-DWarpX_amrex_src=$HOME/src/amrex \
-DWarpX_picsar_src=$HOME/src/picsar \
-DWarpX_openpmd_src=$HOME/src/openPMD-api
cmake --build build -j 32
The general cmake compile-time options apply as usual.
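For instance, further options from the general build documentation can be appended to the same configure line. A hedged example; -DWarpX_DIMS and -DWarpX_QED follow the general WarpX CMake option names, shown here as a sketch rather than a prescribed configuration:

```shell
# Hedged example: the Frontier configure step with extra common options
# (-DWarpX_DIMS selects dimensionality, -DWarpX_QED enables QED modules)
cmake -S . -B build             \
  -DWarpX_COMPUTE=HIP           \
  -DWarpX_DIMS=3                \
  -DWarpX_QED=ON                \
  -DWarpX_amrex_src=$HOME/src/amrex     \
  -DWarpX_picsar_src=$HOME/src/picsar   \
  -DWarpX_openpmd_src=$HOME/src/openPMD-api
cmake --build build -j 32
```

Reconfiguring with changed options in the same build directory is usually fine; `rm -rf build` only when switching compilers or modules.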
Running¶
MI250X GPUs (2x64 GB)¶
After requesting an interactive node with the getNode alias above, run a simulation like this, here using 8 MPI ranks on a single node:
runNode ./warpx inputs
Or in non-interactive runs:
#!/usr/bin/env bash
#SBATCH -A <project id>
#SBATCH -J warpx
#SBATCH -o %x-%j.out
#SBATCH -t 00:10:00
#SBATCH -p batch
# Currently not configured on Frontier:
#S BATCH --ntasks-per-node=8
#S BATCH --cpus-per-task=8
#S BATCH --gpus-per-task=1
#S BATCH --gpu-bind=closest
#SBATCH -N 20
# load cray libs and ROCm libs
#export LD_LIBRARY_PATH=${CRAY_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH}
# From the documentation:
# Each Frontier compute node consists of [1x] 64-core AMD EPYC 7A53
# "Optimized 3rd Gen EPYC" CPU (with 2 hardware threads per physical core) with
# access to 512 GB of DDR4 memory.
# Each node also contains [4x] AMD MI250X, each with 2 Graphics Compute Dies
# (GCDs) for a total of 8 GCDs per node. The programmer can think of the 8 GCDs
# as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).
# note (5-16-22 and 7-12-22)
# this environment setting is currently needed on Frontier to work-around a
# known issue with Libfabric (both in the May and June PE)
#export FI_MR_CACHE_MAX_COUNT=0 # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks # alternative cache monitor
# note (9-2-22, OLCFDEV-1079)
# this environment setting is needed to avoid that rocFFT writes a cache in
# the home directory, which does not scale.
export ROCFFT_RTC_CACHE_PATH=/dev/null
export OMP_NUM_THREADS=1
export WARPX_NMPI_PER_NODE=8
export TOTAL_NMPI=$(( ${SLURM_JOB_NUM_NODES} * ${WARPX_NMPI_PER_NODE} ))
srun -N${SLURM_JOB_NUM_NODES} -n${TOTAL_NMPI} --ntasks-per-node=${WARPX_NMPI_PER_NODE} \
./warpx inputs > output.txt
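The rank count in the script above follows from the node count. With the 20 nodes requested via #SBATCH -N 20, the arithmetic can be checked in isolation (the Slurm-provided variable is replaced by a stand-in value here):

```shell
# Stand-in for $SLURM_JOB_NUM_NODES, which Slurm sets inside a real job
SLURM_JOB_NUM_NODES=20
WARPX_NMPI_PER_NODE=8      # one MPI rank per GCD (8 GCDs per node)
TOTAL_NMPI=$(( SLURM_JOB_NUM_NODES * WARPX_NMPI_PER_NODE ))
echo "${TOTAL_NMPI} MPI ranks"
```

Scaling the job is then a matter of changing #SBATCH -N alone; the srun line picks up the new total automatically.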
Post-Processing¶
For post-processing, most users use Python via OLCF's Jupyter service (Docs).
Please follow the same guidance as for OLCF Summit post-processing.
Known System Issues¶
Warning
May 16th, 2022 (OLCFHELP-6888): There is a caching bug in Libfabric that can cause WarpX simulations to occasionally hang on Frontier when running on more than one node.
As a work-around, please export the following environment variable in your job scripts until the issue is fixed:
#export FI_MR_CACHE_MAX_COUNT=0 # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks # alternative cache monitor
Warning
Sep 2nd, 2022 (OLCFDEV-1079): rocFFT in ROCm 5.1+ tries to write a cache to the home area by default. This does not scale; disable it via:
export ROCFFT_RTC_CACHE_PATH=/dev/null