Perlmutter (NERSC)
The Perlmutter cluster is located at NERSC.
Introduction
If you are new to this system, please see the following resources:
- Batch system: Slurm
- $HOME: per-user directory, use only for inputs, source and scripts; backed up (40 GB)
- ${CFS}/m3239/: community file system for users in the project m3239 (or equivalent); moderate performance (20 TB default)
- $PSCRATCH: per-user production directory; very fast for parallel jobs; purged every 8 weeks (20 TB default)
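A typical workflow with these filesystems keeps sources and scripts in $HOME and stages production runs in $PSCRATCH. The directory and file names below are only an illustration:
# keep inputs, sources and scripts in $HOME (backed up); run from $PSCRATCH (fast, but purged)
mkdir -p $PSCRATCH/runs/my_run
cp $HOME/my_inputs $PSCRATCH/runs/my_run/
cd $PSCRATCH/runs/my_run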
Preparation
Use the following commands to download the WarpX source code:
git clone https://github.com/BLAST-WarpX/warpx.git $HOME/src/warpx
On Perlmutter, you can run either on GPU nodes with fast A100 GPUs (recommended) or on CPU nodes.
For GPU nodes, we use system software modules and add environment hints and further dependencies via the file $HOME/perlmutter_gpu_warpx.profile.
Create it now:
cp $HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_gpu_warpx.profile.example $HOME/perlmutter_gpu_warpx.profile
Edit the 2nd line of this script, which sets the export proj="" variable.
Perlmutter GPU projects must end in ..._g.
For example, if you are a member of the project m3239, then run nano $HOME/perlmutter_gpu_warpx.profile and edit line 2 to read:
export proj="m3239_g"
Exit the nano editor with Ctrl + O (save) and then Ctrl + X (exit).
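Alternatively, if you prefer a non-interactive edit, a sed one-liner does the same (using the m3239 example project; adjust to your own project):
sed -i 's/export proj=""/export proj="m3239_g"/' $HOME/perlmutter_gpu_warpx.profile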
Important
Now, and as the first step on future logins to Perlmutter, activate these environment settings:
source $HOME/perlmutter_gpu_warpx.profile
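As a quick, optional sanity check that the profile was sourced, the project variable should now be set:
echo $proj   # should print your project, e.g., m3239_g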
Finally, since Perlmutter does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/perlmutter-nersc/install_gpu_dependencies.sh
source ${PSCRATCH}/storage/sw/warpx/perlmutter/gpu/venvs/warpx-gpu/bin/activate
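Optionally, verify that the Python virtual environment is active; python3 should now resolve inside the warpx-gpu venv:
which python3   # expected to point into .../venvs/warpx-gpu/bin/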
For CPU nodes, we use system software modules and add environment hints and further dependencies via the file $HOME/perlmutter_cpu_warpx.profile.
Create it now:
cp $HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_cpu_warpx.profile.example $HOME/perlmutter_cpu_warpx.profile
Edit the 2nd line of this script, which sets the export proj="" variable.
For example, if you are a member of the project m3239, then run nano $HOME/perlmutter_cpu_warpx.profile and edit line 2 to read:
export proj="m3239"
Exit the nano editor with Ctrl + O (save) and then Ctrl + X (exit).
Important
Now, and as the first step on future logins to Perlmutter, activate these environment settings:
source $HOME/perlmutter_cpu_warpx.profile
Finally, since Perlmutter does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/perlmutter-nersc/install_cpu_dependencies.sh
source ${PSCRATCH}/storage/sw/warpx/perlmutter/cpu/venvs/warpx-cpu/bin/activate
Compilation
For GPU nodes, use the following cmake commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build_pm_gpu
cmake -S . -B build_pm_gpu -DWarpX_COMPUTE=CUDA -DWarpX_FFT=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_gpu -j 16
The WarpX application executables are now in $HOME/src/warpx/build_pm_gpu/bin/.
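The executable names typically encode the build configuration (dimensionality and enabled options); to see what was produced:
ls $HOME/src/warpx/build_pm_gpu/bin/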
Additionally, the following commands will install WarpX as a Python module:
cd $HOME/src/warpx
rm -rf build_pm_gpu_py
cmake -S . -B build_pm_gpu_py -DWarpX_COMPUTE=CUDA -DWarpX_FFT=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_gpu_py -j 16 --target pip_install
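As a quick, optional check that the Python module was installed (run inside the activated warpx-gpu virtual environment from above):
python3 -c "import pywarpx; print('pywarpx imported successfully')"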
For CPU nodes, compile with:
cd $HOME/src/warpx
rm -rf build_pm_cpu
cmake -S . -B build_pm_cpu -DWarpX_COMPUTE=OMP -DWarpX_FFT=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_cpu -j 16
The WarpX application executables are now in $HOME/src/warpx/build_pm_cpu/bin/.
Additionally, the following commands will install WarpX as a Python module:
cd $HOME/src/warpx
rm -rf build_pm_cpu_py
cmake -S . -B build_pm_cpu_py -DWarpX_COMPUTE=OMP -DWarpX_FFT=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_cpu_py -j 16 --target pip_install
Now, you can submit Perlmutter compute jobs for WarpX Python (PICMI) scripts (example scripts).
Or, you can use the WarpX executables to submit Perlmutter jobs (example inputs).
For executables, you can reference their location in your job script or copy them to a location in $PSCRATCH.
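For example, to stage a run in $PSCRATCH (the run-directory name is just an illustration; actual executable names depend on your build options, so adjust the glob):
mkdir -p $PSCRATCH/runs/my_first_run
cp $HOME/src/warpx/build_pm_gpu/bin/warpx.3d* $PSCRATCH/runs/my_first_run/
cd $PSCRATCH/runs/my_first_run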
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd $HOME/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed,
- update the perlmutter_gpu_warpx.profile or perlmutter_cpu_warpx.profile files,
- log out and into the system, then activate the now updated environment profile as usual.
As a last step, clean the build directory rm -rf $HOME/src/warpx/build_pm_* and rebuild WarpX.
Running
The batch script below can be used to run a WarpX simulation on multiple nodes (change -N accordingly) on the supercomputer Perlmutter at NERSC.
The GPU partition has up to 1536 nodes.
Replace descriptions between chevrons <> by relevant values, for instance <input file> could be plasma_mirror_inputs.
Note that we run one MPI rank per GPU.
$HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_gpu.sbatch:
#!/bin/bash -l
# Copyright 2021-2023 Axel Huebl, Kevin Gott
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#SBATCH -t 00:10:00
#SBATCH -N 2
#SBATCH -J WarpX
# note: <proj> must end on _g
#SBATCH -A <proj>
#SBATCH -q regular
# A100 40GB (most nodes)
#SBATCH -C gpu
# A100 80GB (256 nodes)
# note: the space in "#S BATCH" leaves this alternative constraint disabled; see the note below the script
#S BATCH -C gpu&hbm80g
#SBATCH --exclusive
#SBATCH --cpus-per-task=32
# ideally single:1, but NERSC cgroups issue
#SBATCH --gpu-bind=none
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-node=4
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j
# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=inputs
# pin to closest NIC to GPU
export MPICH_OFI_NIC_POLICY=GPU
# threads for OpenMP and threaded compressors per MPI rank
# note: 16 avoids hyperthreading (32 virtual cores, 16 physical)
export OMP_NUM_THREADS=16
# GPU-aware MPI optimizations
GPU_AWARE_MPI="amrex.use_gpu_aware_mpi=1"
# CUDA visible devices are ordered inverse to local task IDs
# Reference: nvidia-smi topo -m
srun --cpu-bind=cores bash -c "
export CUDA_VISIBLE_DEVICES=\$((3-SLURM_LOCALID));
${EXE} ${INPUTS} ${GPU_AWARE_MPI}" \
> output.txt
To run a simulation, copy the lines above to a file perlmutter_gpu.sbatch and run
sbatch perlmutter_gpu.sbatch
to submit the job.
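Once submitted, the job can be monitored with standard Slurm tools, for example:
squeue -u $USER     # list your queued and running jobs
tail -f output.txt  # follow the simulation output once the job starts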
Perlmutter has 256 nodes that provide 80 GB HBM per A100 GPU.
In the A100 (40 GB) batch script, replace -C gpu with -C gpu&hbm80g to use these large-memory GPUs.
The Perlmutter CPU partition has up to 3072 nodes, each with 2x AMD EPYC 7763 CPUs.
$HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_cpu.sbatch:
#!/bin/bash -l
# Copyright 2021-2023 WarpX
#
# This file is part of WarpX.
#
# Authors: Axel Huebl
# License: BSD-3-Clause-LBNL
#SBATCH -t 00:10:00
#SBATCH -N 2
#SBATCH -J WarpX
#SBATCH -A <proj>
#SBATCH -q regular
#SBATCH -C cpu
# 8 cores per chiplet, 2x SMP
#SBATCH --cpus-per-task=16
#SBATCH --ntasks-per-node=16
#SBATCH --exclusive
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j
# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=inputs_small
# each CPU node on Perlmutter (NERSC) has 64 hardware cores with
# 2x Hyperthreading/SMP
# https://en.wikichip.org/wiki/amd/epyc/7763
# https://www.amd.com/en/products/cpu/amd-epyc-7763
# Each CPU is made up of 8 chiplets, each sharing 32MB L3 cache.
# This will be our MPI rank assignment (2x8 is 16 ranks/node).
# threads for OpenMP and threaded compressors per MPI rank
export OMP_PLACES=threads
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
srun --cpu-bind=cores \
${EXE} ${INPUTS} \
> output.txt
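As with the GPU version, copy these lines to a file (e.g., perlmutter_cpu.sbatch) and submit it with:
sbatch perlmutter_cpu.sbatch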
Post-Processing
For post-processing, most users use Python via NERSC’s Jupyter service (documentation).
As a one-time preparatory setup, log into Perlmutter via SSH and do not source the WarpX profile script above. Create your own Conda environment and Jupyter kernel for post-processing:
module load python
conda config --set auto_activate_base false
# create conda environment
rm -rf $HOME/.conda/envs/warpx-pm-postproc
conda create --yes -n warpx-pm-postproc -c conda-forge mamba conda-libmamba-solver
conda activate warpx-pm-postproc
conda config --set solver libmamba
mamba install --yes -c conda-forge python ipykernel ipympl matplotlib numpy pandas yt openpmd-viewer openpmd-api h5py fast-histogram dask dask-jobqueue pyarrow
# create Jupyter kernel
rm -rf $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/
python -m ipykernel install --user --name warpx-pm-postproc --display-name WarpX-PM-PostProcessing
echo -e '#!/bin/bash\nmodule load python\nsource activate warpx-pm-postproc\nexec "$@"' > $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel-helper.sh
chmod a+rx $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel-helper.sh
KERNEL_STR=$(jq '.argv |= ["{resource_dir}/kernel-helper.sh"] + .' $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel.json | jq '.argv[1] = "python"')
echo ${KERNEL_STR} | jq > $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel.json
exit
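If the new kernel does not show up in Jupyter, a quick check is to inspect the files created above:
ls $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/
cat $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel.json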
When opening a Jupyter notebook on https://jupyter.nersc.gov, just select WarpX-PM-PostProcessing from the list of available kernels on the top right of the notebook.
Additional software can be installed later on, e.g., in a Jupyter cell using !mamba install -y -c conda-forge ...
Software that is not available via conda can be installed via !python -m pip install ...