WarpX

WarpX is an advanced, time-domain electromagnetic & electrostatic Particle-In-Cell code.

It supports many features including:

  • Perfectly-Matched Layers (PML)

  • Boosted-frame simulations

  • Mesh refinement

For details on the algorithms that WarpX implements, see the theory section.

WarpX is a highly-parallel and highly-optimized code, which can run on GPUs and multi-core CPUs, and includes load balancing capabilities. WarpX scales to the world’s largest supercomputers and was awarded the 2022 ACM Gordon Bell Prize. In addition, WarpX is also a multi-platform code and runs on Linux, macOS and Windows.

Contact us

If you are just starting to use WarpX, or if you have a user question, please drop by our discussions page and get in touch with the community.

The WarpX GitHub repo is the main communication platform. Have a look at the action icons on the top right of the web page: feel free to watch the repo if you want to receive updates, or to star the repo to support the project. For bug reports or to request new features, you can also open a new issue.

We also have a discussion page where you can browse previously answered questions, add new questions, get help with installation procedures, discuss ideas, or share comments.

Code of Conduct

Our Pledge

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

Our Standards

Examples of behavior that contributes to creating a positive environment include:

  • Using welcoming and inclusive language

  • Being respectful of differing viewpoints and experiences

  • Gracefully accepting constructive criticism

  • Focusing on what is best for the community

  • Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

  • The use of sexualized language or imagery and unwelcome sexual attention or advances

  • Trolling, insulting/derogatory comments, and personal or political attacks

  • Public or private harassment

  • Publishing others’ private information, such as a physical or electronic address, without explicit permission

  • Other conduct which could reasonably be considered inappropriate in a professional setting

Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

Scope

This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.

Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at warpx-coc@lbl.gov. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project’s leadership.

Attribution

This Code of Conduct is adapted from the Contributor Covenant, version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

For answers to common questions about this code of conduct, see https://www.contributor-covenant.org/faq

Acknowledge WarpX

Please acknowledge the role that WarpX played in your research.

In presentations

For your presentations, you can find WarpX slides here. Several flavors are available:

  • full slide

  • half-slide (portrait or landscape format)

  • small inset.

Feel free to use the one that fits into your presentation and adequately acknowledges the part that WarpX played in your research.

In publications

Please add the following sentence to your publications; it helps contributors stay in touch with the community and promotes the project.

Plain text:

This research used the open-source particle-in-cell code WarpX https://github.com/ECP-WarpX/WarpX, primarily funded by the US DOE Exascale Computing Project. Primary WarpX contributors are with LBNL, LLNL, CEA-LIDYL, SLAC, DESY, CERN, and TAE Technologies. We acknowledge all WarpX contributors.

LaTeX:

\usepackage{hyperref}
This research used the open-source particle-in-cell code WarpX \url{https://github.com/ECP-WarpX/WarpX}, primarily funded by the US DOE Exascale Computing Project.
Primary WarpX contributors are with LBNL, LLNL, CEA-LIDYL, SLAC, DESY, CERN, and TAE Technologies.
We acknowledge all WarpX contributors.

Latest WarpX reference

If your project leads to a scientific publication, please consider citing the paper below.

  • Fedeli L, Huebl A, Boillod-Cerneux F, Clark T, Gott K, Hillairet C, Jaure S, Leblanc A, Lehe R, Myers A, Piechurski C, Sato M, Zaim N, Zhang W, Vay J-L, Vincenti H. Pushing the Frontier in the Design of Laser-Based Electron Accelerators with Groundbreaking Mesh-Refined Particle-In-Cell Simulations on Exascale-Class Supercomputers. SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (SC). ISSN:2167-4337, pp. 25-36, Dallas, TX, US, 2022. DOI:10.1109/SC41404.2022.00008 (preprint here)

Prior WarpX references

If your project uses a specific algorithm or component, please consider citing the respective publications in addition.

  • Sandberg R T, Lehe R, Mitchell C E, Garten M, Myers A, Qiang J, Vay J-L and Huebl A. Synthesizing Particle-in-Cell Simulations Through Learning and GPU Computing for Hybrid Particle Accelerator Beamlines. Proc. of Platform for Advanced Scientific Computing (PASC’24), submitted, 2024. Preprint: http://arxiv.org/abs/2402.17248

  • Sandberg R T, Lehe R, Mitchell C E, Garten M, Qiang J, Vay J-L and Huebl A. Hybrid Beamline Element ML-Training for Surrogates in the ImpactX Beam-Dynamics Code. 14th International Particle Accelerator Conference (IPAC’23), WEPA101, 2023. DOI:10.18429/JACoW-IPAC2023-WEPA101

  • Huebl A, Lehe R, Zoni E, Shapoval O, Sandberg R T, Garten M, Formenti A, Jambunathan R, Kumar P, Gott K, Myers A, Zhang W, Almgren A, Mitchell C E, Qiang J, Sinn A, Diederichs S, Thevenet M, Grote D, Fedeli L, Clark T, Zaim N, Vincenti H, Vay JL. From Compact Plasma Particle Sources to Advanced Accelerators with Modeling at Exascale. Proceedings of the 20th Advanced Accelerator Concepts Workshop (AAC’22), in print, 2023. arXiv:2303.12873

  • Huebl A, Lehe R, Mitchell C E, Qiang J, Ryne R D, Sandberg R T, Vay JL. Next Generation Computational Tools for the Modeling and Design of Particle Accelerators at Exascale. Proceedings of the 2022 North American Particle Accelerator Conference (NAPAC’22), TUYE2, pp. 302-306, 2022. arXiv:2208.02382, DOI:10.18429/JACoW-NAPAC2022-TUYE2

  • Fedeli L, Zaim N, Sainte-Marie A, Thevenet M, Huebl A, Myers A, Vay JL, Vincenti H. PICSAR-QED: a Monte Carlo module to simulate Strong-Field Quantum Electrodynamics in Particle-In-Cell codes for exascale architectures. New Journal of Physics 24 025009, 2022. DOI:10.1088/1367-2630/ac4ef1

  • Lehe R, Blelly A, Giacomel L, Jambunathan R, Vay JL. Absorption of charged particles in perfectly matched layers by optimal damping of the deposited current. Physical Review E 106 045306, 2022. DOI:10.1103/PhysRevE.106.045306

  • Zoni E, Lehe R, Shapoval O, Belkin D, Zaim N, Fedeli L, Vincenti H, Vay JL. A hybrid nodal-staggered pseudo-spectral electromagnetic particle-in-cell method with finite-order centering. Computer Physics Communications 279, 2022. DOI:10.1016/j.cpc.2022.108457

  • Myers A, Almgren A, Amorim LD, Bell J, Fedeli L, Ge L, Gott K, Grote DP, Hogan M, Huebl A, Jambunathan R, Lehe R, Ng C, Rowan M, Shapoval O, Thevenet M, Vay JL, Vincenti H, Yang E, Zaim N, Zhang W, Zhao Y, Zoni E. Porting WarpX to GPU-accelerated platforms. Parallel Computing. 2021 Sep, 108:102833. DOI:10.1016/j.parco.2021.102833

  • Shapoval O, Lehe R, Thevenet M, Zoni E, Zhao Y, Vay JL. Overcoming timestep limitations in boosted-frame Particle-In-Cell simulations of plasma-based acceleration. Phys. Rev. E Nov 2021, 104:055311. arXiv:2104.13995, DOI:10.1103/PhysRevE.104.055311

  • Vay JL, Huebl A, Almgren A, Amorim LD, Bell J, Fedeli L, Ge L, Gott K, Grote DP, Hogan M, Jambunathan R, Lehe R, Myers A, Ng C, Rowan M, Shapoval O, Thevenet M, Vincenti H, Yang E, Zaim N, Zhang W, Zhao Y, Zoni E. Modeling of a chain of three plasma accelerator stages with the WarpX electromagnetic PIC code on GPUs. Physics of Plasmas. 2021 Feb 9, 28(2):023105. DOI:10.1063/5.0028512

  • Rowan ME, Gott KN, Deslippe J, Huebl A, Thevenet M, Lehe R, Vay JL. In-situ assessment of device-side compute work for dynamic load balancing in a GPU-accelerated PIC code. PASC ‘21: Proceedings of the Platform for Advanced Scientific Computing Conference. 2021 July, 10, pages 1-11. DOI:10.1145/3468267.3470614

  • Vay JL, Almgren A, Bell J, Ge L, Grote DP, Hogan M, Kononenko O, Lehe R, Myers A, Ng C, Park J, Ryne R, Shapoval O, Thevenet M, Zhang W. Warp-X: A new exascale computing platform for beam–plasma simulations. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment. 2018 Nov, 909(12) Pages 476-479. DOI: 10.1016/j.nima.2018.01.035

  • Kirchen M, Lehe R, Jalas S, Shapoval O, Vay JL, Maier AR. Scalable spectral solver in Galilean coordinates for eliminating the numerical Cherenkov instability in particle-in-cell simulations of streaming plasmas. Physical Review E. 2020 July, 102(1-1):013202. DOI: 10.1103/PhysRevE.102.013202

  • Shapoval O, Vay JL, Vincenti H. Two-step perfectly matched layer for arbitrary-order pseudo-spectral analytical time-domain methods. Computer Physics Communications. 2019 Feb, 235, pages 102-110. DOI: 10.1016/j.cpc.2018.09.015

  • Lehe R, Kirchen M, Godfrey BB, Maier AR, Vay JL. Elimination of numerical Cherenkov instability in flowing-plasma particle-in-cell simulations by using galilean coordinates. Physical Review E. 2016 Nov, 94:053305. DOI: 10.1103/PhysRevE.94.053305

Science Highlights

WarpX can be used in many domains of laser-plasma science, plasma physics, accelerator physics and beyond. Below, we collect a series of scientific publications that used WarpX. Please acknowledge WarpX in your work so that we can find and list it.

Is your publication missing? Contact us or edit this page via a pull request.

Plasma-Based Acceleration

Scientific works in laser-plasma and beam-plasma acceleration.

  1. Peng, H. and Huang, T. W. and Jiang, K. and Li, R. and Wu, C. N. and Yu, M. Y. and Riconda, C. and Weber, S. and Zhou, C. T. and Ruan, S. C. Coherent Subcycle Optical Shock from a Superluminal Plasma Wake. Phys. Rev. Lett. 131, 145003, 2023 DOI:10.1103/PhysRevLett.131.145003

  2. Mewes SM, Boyle GJ, Ferran Pousa A, Shalloo RJ, Osterhoff J, Arran C, Corner L, Walczak R, Hooker SM, Thévenet M. Demonstration of tunability of HOFI waveguides via start-to-end simulations. Phys. Rev. Research 5, 033112, 2023 DOI:10.1103/PhysRevResearch.5.033112

  3. Sandberg R T, Lehe R, Mitchell C E, Garten M, Myers A, Qiang J, Vay J-L and Huebl A. Synthesizing Particle-in-Cell Simulations Through Learning and GPU Computing for Hybrid Particle Accelerator Beamlines. Proc. of Platform for Advanced Scientific Computing (PASC’24), submitted, 2024. preprint

  4. Sandberg R T, Lehe R, Mitchell C E, Garten M, Qiang J, Vay J-L and Huebl A. Hybrid Beamline Element ML-Training for Surrogates in the ImpactX Beam-Dynamics Code. 14th International Particle Accelerator Conference (IPAC’23), WEPA101, 2023. DOI:10.18429/JACoW-IPAC2023-WEPA101

  5. Wang J, Zeng M, Li D, Wang X, Gao J. High quality beam produced by tightly focused laser driven wakefield accelerators. Phys. Rev. Accel. Beams, 26, 091303, 2023. DOI:10.1103/PhysRevAccelBeams.26.091303

  6. Fedeli L, Huebl A, Boillod-Cerneux F, Clark T, Gott K, Hillairet C, Jaure S, Leblanc A, Lehe R, Myers A, Piechurski C, Sato M, Zaim N, Zhang W, Vay J-L, Vincenti H. Pushing the Frontier in the Design of Laser-Based Electron Accelerators with Groundbreaking Mesh-Refined Particle-In-Cell Simulations on Exascale-Class Supercomputers. SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (SC). ISSN:2167-4337, pp. 25-36, Dallas, TX, US, 2022. DOI:10.1109/SC41404.2022.00008 (preprint here)

  7. Zhao Y, Lehe R, Myers A, Thevenet M, Huebl A, Schroeder CB, Vay J-L. Plasma electron contribution to beam emittance growth from Coulomb collisions in plasma-based accelerators. Physics of Plasmas 29, 103109, 2022. DOI:10.1063/5.0102919

  8. Wang J, Zeng M, Li D, Wang X, Lu W, Gao J. Injection induced by coaxial laser interference in laser wakefield accelerators. Matter and Radiation at Extremes 7, 054001, 2022. DOI:10.1063/5.0101098

  9. Miao B, Shrock JE, Feder L, Hollinger RC, Morrison J, Nedbailo R, Picksley A, Song H, Wang S, Rocca JJ, Milchberg HM. Multi-GeV electron bunches from an all-optical laser wakefield accelerator. Physical Review X 12, 031038, 2022. DOI:10.1103/PhysRevX.12.031038

  10. Mirani F, Calzolari D, Formenti A, Passoni M. Superintense laser-driven photon activation analysis. Nature Communications Physics volume 4.185, 2021. DOI:10.1038/s42005-021-00685-2

  11. Zhao Y, Lehe R, Myers A, Thevenet M, Huebl A, Schroeder CB, Vay J-L. Modeling of emittance growth due to Coulomb collisions in plasma-based accelerators. Physics of Plasmas 27, 113105, 2020. DOI:10.1063/5.0023776

Laser-Plasma Interaction

Scientific works in laser-ion acceleration and laser-matter interaction.

  1. Knight B, Gautam C, Stoner C, Egner B, Smith J, Orban C, Manfredi J, Frische K, Dexter M, Chowdhury E, Patnaik A (2023). Detailed Characterization of a kHz-rate Laser-Driven Fusion at a Thin Liquid Sheet with a Neutron Detection Suite. High Power Laser Science and Engineering, 1-13, 2023. DOI:10.1017/hpl.2023.84

  2. Fedeli L, Huebl A, Boillod-Cerneux F, Clark T, Gott K, Hillairet C, Jaure S, Leblanc A, Lehe R, Myers A, Piechurski C, Sato M, Zaim N, Zhang W, Vay J-L, Vincenti H. Pushing the Frontier in the Design of Laser-Based Electron Accelerators with Groundbreaking Mesh-Refined Particle-In-Cell Simulations on Exascale-Class Supercomputers. SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (SC). ISSN:2167-4337, pp. 25-36, Dallas, TX, US, 2022. DOI:10.1109/SC41404.2022.00008 (preprint here)

  3. Hakimi S, Obst-Huebl L, Huebl A, Nakamura K, Bulanov SS, Steinke S, Leemans WP, Kober Z, Ostermayr TM, Schenkel T, Gonsalves AJ, Vay J-L, Tilborg Jv, Toth C, Schroeder CB, Esarey E, Geddes CGR. Laser-solid interaction studies enabled by the new capabilities of the iP2 BELLA PW beamline. Physics of Plasmas 29, 083102, 2022. DOI:10.1063/5.0089331

  4. Levy D, Andriyash IA, Haessler S, Kaur J, Ouille M, Flacco A, Kroupp E, Malka V, Lopez-Martens R. Low-divergence MeV-class proton beams from kHz-driven laser-solid interactions. Phys. Rev. Accel. Beams 25, 093402, 2022. DOI:10.1103/PhysRevAccelBeams.25.093402

Particle Accelerator & Beam Physics

Scientific works in particle and beam modeling.

  1. Sandberg R T, Lehe R, Mitchell C E, Garten M, Myers A, Qiang J, Vay J-L and Huebl A. Synthesizing Particle-in-Cell Simulations Through Learning and GPU Computing for Hybrid Particle Accelerator Beamlines. Proc. of Platform for Advanced Scientific Computing (PASC’24), submitted, 2024. preprint

  2. Sandberg R T, Lehe R, Mitchell C E, Garten M, Qiang J, Vay J-L, Huebl A. Hybrid Beamline Element ML-Training for Surrogates in the ImpactX Beam-Dynamics Code. 14th International Particle Accelerator Conference (IPAC’23), WEPA101, in print, 2023. preprint, DOI:10.18429/JACoW-IPAC2023-WEPA101

  3. Tan W H, Piot P, Myers A, Zhang W, Rheaume T, Jambunathan R, Huebl A, Lehe R, Vay J-L. Simulation studies of drive-beam instability in a dielectric wakefield accelerator. 13th International Particle Accelerator Conference (IPAC’22), MOPOMS012, 2022. DOI:10.18429/JACoW-IPAC2022-MOPOMS012

High Energy Astrophysical Plasma Physics

Scientific works in astrophysical plasma modeling.

  1. Klion H, Jambunathan R, Rowan ME, Yang E, Willcox D, Vay J-L, Lehe R, Myers A, Huebl A, Zhang W. Particle-in-Cell simulations of relativistic magnetic reconnection with advanced Maxwell solver algorithms. arXiv pre-print, 2023. DOI:10.48550/arXiv.2304.10566

Microelectronics

ARTEMIS (Adaptive mesh Refinement Time-domain ElectrodynaMIcs Solver) is based on WarpX and couples WarpX's implementation of Maxwell's equations with classical equations that describe quantum material behavior (such as the LLG equation for micromagnetics and the London equation for superconducting materials) in order to quantify the performance of next-generation microelectronics.

  1. Sawant S S, Yao Z, Jambunathan R, Nonaka A. Characterization of Transmission Lines in Microelectronic Circuits Using the ARTEMIS Solver. IEEE Journal on Multiscale and Multiphysics Computational Techniques, vol. 8, pp. 31-39, 2023. DOI:10.1109/JMMCT.2022.3228281

  2. Kumar P, Nonaka A, Jambunathan R, Pahwa G and Salahuddin S, Yao Z. FerroX: A GPU-accelerated, 3D Phase-Field Simulation Framework for Modeling Ferroelectric Devices. arXiv preprint, 2022. arXiv:2210.15668

  3. Yao Z, Jambunathan R, Zeng Y, Nonaka A. A Massively Parallel Time-Domain Coupled Electrodynamics–Micromagnetics Solver. The International Journal of High Performance Computing Applications, 36(2):167-181, 2022. DOI:10.1177/10943420211057906

High-Performance Computing and Numerics

Scientific works in High-Performance Computing, applied mathematics and numerics.

Please see this section.

Nuclear Fusion - Magnetically Confined Plasmas

  1. Nicks B. S., Putvinski S. and Tajima T. Stabilization of the Alfvén-ion cyclotron instability through short plasmas: Fully kinetic simulations in a high-beta regime. Physics of Plasmas 30, 102108, 2023. DOI:10.1063/5.0163889

  2. Groenewald R. E., Veksler A., Ceccherini F., Necas A., Nicks B. S., Barnes D. C., Tajima T. and Dettrick S. A. Accelerated kinetic model for global macro stability studies of high-beta fusion reactors. Physics of Plasmas 30, 122508, 2023. DOI:10.1063/5.0178288

Installation

Users

Our community is here to help. Please report installation problems if you get stuck.

Choose one of the installation methods below to get started:


HPC Systems

If you want to use WarpX on a specific high-performance computing (HPC) system, jump directly to our HPC system-specific documentation.


Using the Conda Package

A package for WarpX is available via the Conda package manager.

Tip

We recommend configuring conda to use the faster libmamba dependency solver.

conda update -y -n base conda
conda install -y -n base conda-libmamba-solver
conda config --set solver libmamba

We also recommend disabling conda's automatic activation of its base environment, which avoids interference with the system and other package managers.

conda config --set auto_activate_base false
conda create -n warpx -c conda-forge warpx
conda activate warpx

Note

The warpx conda package does not yet provide GPU support.


Using the Spack Package

Packages for WarpX are available via the Spack package manager. The package warpx installs executables and the package py-warpx includes Python bindings, i.e. PICMI.

# optional: activate Spack binary caches
spack mirror add rolling https://binaries.spack.io/develop
spack buildcache keys --install --trust

# see `spack info py-warpx` for build options.
# optional arguments:  -mpi ^warpx dims=2 compute=cuda
spack install py-warpx
spack load py-warpx

See spack info warpx or spack info py-warpx and the official Spack tutorial for more information.
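For example, a customized install that builds a 2D, CUDA-enabled WarpX underneath the Python bindings could look like the following sketch (variants follow the optional arguments shown in the comment above; adjust them to your hardware):

spack install py-warpx ^warpx dims=2 compute=cuda
spack load py-warpx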


Using the PyPI Package

Given that you have the WarpX dependencies installed, you can use pip to install WarpX with PICMI from source:

python3 -m pip install -U pip
python3 -m pip install -U build packaging setuptools wheel
python3 -m pip install -U cmake

python3 -m pip wheel -v git+https://github.com/ECP-WarpX/WarpX.git
python3 -m pip install *whl

In the future, we will publish pre-compiled binary packages on PyPI for faster installs. (Consider using conda in the meantime.)


Using the Brew Package

Note

Coming soon.


From Source with CMake

After installing the WarpX dependencies, you can also install WarpX from source with CMake:

# get the source code
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
cd $HOME/src/warpx

# configure
cmake -S . -B build

# optional: change configuration
ccmake build

# compile
#   on Windows:          --config RelWithDebInfo
cmake --build build -j 4

# executables for WarpX are now in build/bin/

We document the details in the developer installation.

Tips for macOS Users

Tip

Before getting started with package managers, please check what you manually installed in /usr/local. If you find entries in bin/, lib/ et al. that look like you manually installed MPI, HDF5 or other software in the past, then remove those files first.

If you find software such as MPI in the same directories shown as symbolic links, then you likely installed software with brew before. If you are trying a package manager other than brew, run brew unlink … on such packages first to avoid software incompatibilities (see the example below).

See also: A. Huebl, Working With Multiple Package Managers, Collegeville Workshop (CW20), 2020
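As a minimal sketch, assuming brew-installed MPI and parallel HDF5 are the conflicting packages (adapt the names to what brew actually installed on your machine):

brew unlink open-mpi
brew unlink hdf5-mpi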

Developers

CMake is our primary build system. If you are new to CMake, this short tutorial from the HEP Software Foundation is the perfect place to get started. If you just want to use CMake to build the project, jump into sections 1. Introduction, 2. Building with CMake and 9. Finding Packages.

Dependencies

Before you start, you will need a copy of the WarpX source code:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
cd $HOME/src/warpx

WarpX depends on the following popular third party software. Please see installation instructions below.

  • a mature C++17 compiler, e.g., GCC 8.4+, Clang 7, NVCC 11.0, MSVC 19.15 or newer

  • CMake 3.20.0+

  • Git 2.18+

  • AMReX: we automatically download and compile a copy of AMReX

  • PICSAR: we automatically download and compile a copy of PICSAR

and for Python bindings:

  • pyAMReX: we automatically download and compile a copy of pyAMReX

  • pybind11: we automatically download and compile a copy of pybind11

Optional dependencies (e.g., MPI, CUDA/HIP for GPUs, FFTW for the PSATD spectral solver, BLAS++/LAPACK++ for PSATD in RZ, openPMD-api for I/O, and Boost for QED table generation) are covered by the package-manager instructions below.

If you are on a high-performance computing (HPC) system, then please see our separate HPC documentation.

For all other systems, we recommend using a package dependency manager: pick one of the installation methods below to install all dependencies for WarpX development in a consistent manner.

Conda (Linux/macOS/Windows)

Conda/Mamba are cross-compatible, user-level package managers.

Tip

We recommend configuring conda to use the faster libmamba dependency solver.

conda update -y -n base conda
conda install -y -n base conda-libmamba-solver
conda config --set solver libmamba

We also recommend disabling conda's automatic activation of its base environment, which avoids interference with the system and other package managers.

conda config --set auto_activate_base false
# with MPI (MPICH) support:
conda create -n warpx-cpu-mpich-dev -c conda-forge blaspp boost ccache cmake compilers git lapackpp "openpmd-api=*=mpi_mpich*" openpmd-viewer python make numpy pandas scipy yt "fftw=*=mpi_mpich*" pkg-config matplotlib mamba mpich mpi4py ninja pip virtualenv
conda activate warpx-cpu-mpich-dev
# compile WarpX with -DWarpX_MPI=ON
# for pip, use: export WARPX_MPI=ON

# or, without MPI support:
conda create -n warpx-cpu-dev -c conda-forge blaspp boost ccache cmake compilers git lapackpp openpmd-api openpmd-viewer python make numpy pandas scipy yt fftw pkg-config matplotlib mamba ninja pip virtualenv
conda activate warpx-cpu-dev
# compile WarpX with -DWarpX_MPI=OFF
# for pip, use: export WARPX_MPI=OFF

For OpenMP support, you will further need one of the following, depending on your platform and compiler:

# on Linux (GCC):
conda install -c conda-forge libgomp

# on macOS, or when using LLVM/Clang:
conda install -c conda-forge llvm-openmp

For Nvidia CUDA GPU support, you will need a recent CUDA driver installed, or you can lower the version of the Nvidia cuda package from conda-forge to match your driver. Then add these packages:

conda install -c nvidia -c conda-forge cuda cupy

More info for CUDA-enabled ML packages.

Spack (Linux/macOS)

Spack is a user-level package manager. It is primarily written for Linux, with slightly less support for macOS, and future support for Windows.

First, download a WarpX Spack desktop development environment of your choice. For most desktop development, pick the OpenMP environment for CPUs unless you have a supported GPU.

  • Debian/Ubuntu Linux:

    • OpenMP: system=ubuntu; compute=openmp (CPUs)

    • CUDA: system=ubuntu; compute=cuda (Nvidia GPUs)

    • ROCm: system=ubuntu; compute=rocm (AMD GPUs)

    • SYCL: todo (Intel GPUs)

  • macOS: first, prepare with brew install gpg2; brew install gcc

    • OpenMP: system=macos; compute=openmp

If you already installed Spack, we recommend activating its binary caches for faster builds:

spack mirror add rolling https://binaries.spack.io/develop
spack buildcache keys --install --trust

Now install the WarpX dependencies in a new WarpX development environment:

# download environment file
curl -sLO https://raw.githubusercontent.com/ECP-WarpX/WarpX/development/Tools/machines/desktop/spack-${system}-${compute}.yaml

# create new development environment
spack env create warpx-${compute}-dev spack-${system}-${compute}.yaml
spack env activate warpx-${compute}-dev

# installation
spack install
python3 -m pip install jupyter matplotlib numpy openpmd-api openpmd-viewer pandas scipy virtualenv yt

In new terminal sessions, re-activate the environment with

spack env activate warpx-openmp-dev

again. Replace openmp with the equivalent you chose.

Compile WarpX with -DWarpX_MPI=ON. For pip, use export WARPX_MPI=ON.
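As a sketch, the subsequent source build inside the activated Spack environment then follows the generic CMake workflow from the Compile section below:

cd $HOME/src/warpx
cmake -S . -B build -DWarpX_MPI=ON
cmake --build build -j 4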

Brew (macOS/Linux)

Homebrew (Brew) is a user-level package manager primarily for Apple macOS, but also supports Linux.

brew update
brew tap openpmd/openpmd
brew install adios2      # for openPMD
brew install ccache
brew install cmake
brew install fftw        # for PSATD
brew install git
brew install hdf5-mpi    # for openPMD
brew install libomp
brew unlink gcc
brew link --force libomp
brew install pkg-config  # for fftw
brew install open-mpi
brew install openblas    # for PSATD in RZ
brew install openpmd-api # for openPMD

If you also want to compile with PSATD in RZ, you need to manually install BLAS++ and LAPACK++:

sudo mkdir -p /usr/local/bin/
sudo curl -L -o /usr/local/bin/cmake-easyinstall https://raw.githubusercontent.com/ax3l/cmake-easyinstall/main/cmake-easyinstall
sudo chmod a+x /usr/local/bin/cmake-easyinstall

cmake-easyinstall --prefix=/usr/local git+https://github.com/icl-utk-edu/blaspp.git \
    -Duse_openmp=OFF -Dbuild_tests=OFF -DCMAKE_VERBOSE_MAKEFILE=ON
cmake-easyinstall --prefix=/usr/local git+https://github.com/icl-utk-edu/lapackpp.git \
    -Duse_cmake_find_lapack=ON -Dbuild_tests=OFF -DCMAKE_VERBOSE_MAKEFILE=ON

Compile WarpX with -DWarpX_MPI=ON. For pip, use export WARPX_MPI=ON.

APT (Debian/Ubuntu Linux)

The Advanced Package Tool (APT) is a system-level package manager on Debian-based Linux distributions, including Ubuntu.

# with MPI (recommended):
sudo apt update
sudo apt install build-essential ccache cmake g++ git libfftw3-mpi-dev libfftw3-dev libhdf5-openmpi-dev libopenmpi-dev pkg-config python3 python3-matplotlib python3-mpi4py python3-numpy python3-pandas python3-pip python3-scipy python3-venv

# optional:
# for CUDA, either install
#   https://developer.nvidia.com/cuda-downloads (preferred)
# or, if your Debian/Ubuntu is new enough, use the packages
#   sudo apt install nvidia-cuda-dev libcub-dev

# compile WarpX with -DWarpX_MPI=ON
# for pip, use: export WARPX_MPI=ON

# or, without MPI:
sudo apt update
sudo apt install build-essential ccache cmake g++ git libfftw3-dev libhdf5-dev pkg-config python3 python3-matplotlib python3-numpy python3-pandas python3-pip python3-scipy python3-venv

# optional:
# for CUDA, either install
#   https://developer.nvidia.com/cuda-downloads (preferred)
# or, if your Debian/Ubuntu is new enough, use the packages
#   sudo apt install nvidia-cuda-dev libcub-dev

# compile WarpX with -DWarpX_MPI=OFF
# for pip, use: export WARPX_MPI=OFF

Compile

From the base of the WarpX source directory, execute:

# find dependencies & configure
#   see additional options below, e.g.
#                   -DWarpX_PYTHON=ON
#                   -DCMAKE_INSTALL_PREFIX=$HOME/sw/warpx
cmake -S . -B build

# compile, here we use four threads
cmake --build build -j 4

That’s it! A 3D WarpX binary is now in build/bin/ and can be run with a 3D example inputs file. Most people execute the binary directly or copy it out.
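For example, a minimal first run could look like the following sketch (the inputs file name is a placeholder; the warpx symbolic link points to the last built executable, see the Run section below):

# serial run of the last built executable
./build/bin/warpx path/to/your_3d_inputs_file

# or, if WarpX was built with MPI support, e.g. on 4 processes
mpirun -np 4 ./build/bin/warpx path/to/your_3d_inputs_file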

If you want to install the executables in a programmatic way, run this:

# for default install paths, you will need administrator rights, e.g. with sudo:
cmake --build build --target install

You can inspect and modify build options after running cmake -S . -B build with either

ccmake build

or by adding arguments with -D<OPTION>=<VALUE> to the first CMake call. For example, this builds WarpX in all geometries, enables Python bindings and Nvidia GPU (CUDA) support:

cmake -S . -B build -DWarpX_DIMS="1;2;RZ;3" -DWarpX_PYTHON=ON -DWarpX_COMPUTE=CUDA

Build Options

CMake Option (Default & Values): Description

  • CMAKE_BUILD_TYPE (RelWithDebInfo/Release/Debug): Type of build, symbols & optimizations
  • CMAKE_INSTALL_PREFIX (system-dependent path): Install path prefix
  • CMAKE_VERBOSE_MAKEFILE (ON/OFF): Print all compiler commands to the terminal during build
  • PYINSTALLOPTIONS: Additional options for pip install, e.g., -v --user
  • WarpX_APP (ON/OFF): Build the WarpX executable application
  • WarpX_ASCENT (ON/OFF): Ascent in situ visualization
  • WarpX_COMPUTE (NOACC/OMP/CUDA/SYCL/HIP): On-node, accelerated computing backend
  • WarpX_DIMS (3/2/1/RZ): Simulation dimensionality. Use "1;2;RZ;3" for all.
  • WarpX_EB (ON/OFF): Embedded boundary support (not supported in RZ yet)
  • WarpX_IPO (ON/OFF): Compile WarpX with interprocedural optimization (aka LTO)
  • WarpX_LIB (ON/OFF): Build WarpX as a library, e.g., for PICMI Python
  • WarpX_MPI (ON/OFF): Multi-node support (message-passing)
  • WarpX_MPI_THREAD_MULTIPLE (ON/OFF): MPI thread-multiple support, i.e. for async_io
  • WarpX_OPENPMD (ON/OFF): openPMD I/O (HDF5, ADIOS)
  • WarpX_PRECISION (SINGLE/DOUBLE): Floating point precision (single/double)
  • WarpX_PARTICLE_PRECISION (SINGLE/DOUBLE): Particle floating point precision (single/double), defaults to WarpX_PRECISION value if not set
  • WarpX_PSATD (ON/OFF): Spectral solver
  • WarpX_PYTHON (ON/OFF): Python bindings
  • WarpX_QED (ON/OFF): QED support (requires PICSAR)
  • WarpX_QED_TABLE_GEN (ON/OFF): QED table generation support (requires PICSAR and Boost)
  • WarpX_QED_TOOLS (ON/OFF): Build external tool to generate QED lookup tables (requires PICSAR and Boost)
  • WarpX_QED_TABLES_GEN_OMP (AUTO/ON/OFF): Enables OpenMP support for QED lookup tables generation
  • WarpX_SENSEI (ON/OFF): SENSEI in situ visualization

WarpX can be configured in further detail with options from AMReX, which are documented in the AMReX manual.

Developers might be interested in additional options that control dependencies of WarpX. By default, the most important dependencies of WarpX are automatically downloaded for convenience:

CMake Option (Default & Values): Description

  • BUILD_SHARED_LIBS (ON/OFF): Build shared libraries for dependencies
  • WarpX_CCACHE (ON/OFF): Search and use CCache to speed up rebuilds
  • AMReX_CUDA_PTX_VERBOSE (ON/OFF): Print CUDA code generation statistics from ptxas
  • WarpX_amrex_src (None): Path to AMReX source directory (preferred if set)
  • WarpX_amrex_repo (https://github.com/AMReX-Codes/amrex.git): Repository URI to pull and build AMReX from
  • WarpX_amrex_branch (we set and maintain a compatible commit): Repository branch for WarpX_amrex_repo
  • WarpX_amrex_internal (ON/OFF): Needs a pre-installed AMReX library if set to OFF
  • WarpX_openpmd_src (None): Path to openPMD-api source directory (preferred if set)
  • WarpX_openpmd_repo (https://github.com/openPMD/openPMD-api.git): Repository URI to pull and build openPMD-api from
  • WarpX_openpmd_branch (0.15.2): Repository branch for WarpX_openpmd_repo
  • WarpX_openpmd_internal (ON/OFF): Needs a pre-installed openPMD-api library if set to OFF
  • WarpX_picsar_src (None): Path to PICSAR source directory (preferred if set)
  • WarpX_picsar_repo (https://github.com/ECP-WarpX/picsar.git): Repository URI to pull and build PICSAR from
  • WarpX_picsar_branch (we set and maintain a compatible commit): Repository branch for WarpX_picsar_repo
  • WarpX_picsar_internal (ON/OFF): Needs a pre-installed PICSAR library if set to OFF
  • WarpX_pyamrex_src (None): Path to pyAMReX source directory (preferred if set)
  • WarpX_pyamrex_repo (https://github.com/AMReX-Codes/pyamrex.git): Repository URI to pull and build pyAMReX from
  • WarpX_pyamrex_branch (we set and maintain a compatible commit): Repository branch for WarpX_pyamrex_repo
  • WarpX_pyamrex_internal (ON/OFF): Needs a pre-installed pyAMReX library if set to OFF
  • WarpX_PYTHON_IPO (ON/OFF): Build Python w/ interprocedural/link optimization (IPO/LTO)
  • WarpX_pybind11_src (None): Path to pybind11 source directory (preferred if set)
  • WarpX_pybind11_repo (https://github.com/pybind/pybind11.git): Repository URI to pull and build pybind11 from
  • WarpX_pybind11_branch (we set and maintain a compatible commit): Repository branch for WarpX_pybind11_repo
  • WarpX_pybind11_internal (ON/OFF): Needs a pre-installed pybind11 library if set to OFF

For example, one can also build against a local AMReX copy. Assuming the AMReX source is located in $HOME/src/amrex, add the cmake argument -DWarpX_amrex_src=$HOME/src/amrex. Relative paths are also supported, e.g. -DWarpX_amrex_src=../amrex.

Or build against a colleague's AMReX feature branch. Assuming your colleague pushed AMReX to https://github.com/WeiqunZhang/amrex/ in a branch new-feature, then pass to cmake the arguments -DWarpX_amrex_repo=https://github.com/WeiqunZhang/amrex.git -DWarpX_amrex_branch=new-feature. More details on this workflow are described here.
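Put together, such a configure call could look like this sketch (repository and branch are the illustrative values from above):

cmake -S . -B build \
    -DWarpX_amrex_repo=https://github.com/WeiqunZhang/amrex.git \
    -DWarpX_amrex_branch=new-feature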

You can speed up the install further if you pre-install these dependencies, e.g. with a package manager. Set -DWarpX_<dependency-name>_internal=OFF and add the installation prefix of the dependency to the environment variable CMAKE_PREFIX_PATH. Please see the introduction to CMake if this sounds new to you. More details on this workflow are described here.
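As a sketch, assuming openPMD-api was pre-installed under the hypothetical prefix $HOME/sw/openPMD-api:

export CMAKE_PREFIX_PATH=$HOME/sw/openPMD-api:$CMAKE_PREFIX_PATH
cmake -S . -B build -DWarpX_openpmd_internal=OFF
cmake --build build -j 4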

If you re-compile often, consider installing the Ninja build system. Pass -G Ninja to the CMake configuration call to speed up parallel compiles.

Configure your compiler

If you don’t want to use your default compiler, you can set the following environment variables. For example, to use Clang/LLVM:

export CC=$(which clang)
export CXX=$(which clang++)

If you also want to select a CUDA compiler:

export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=$(which clang++)

We also support adding additional compiler flags via environment variables such as CXXFLAGS/LDFLAGS:

# example: treat all compiler warnings as errors
export CXXFLAGS="-Werror"

Note

Please clean your build directory with rm -rf build/ after changing the compiler. Now call cmake -S . -B build (+ further options) again to re-initialize the build configuration.

Run

An executable WarpX binary with the current compile-time options encoded in its file name will be created in build/bin/. Note that you need separate binaries to run 1D, 2D, 3D, and RZ geometry inputs scripts. Additionally, a symbolic link named warpx can be found in that directory, which points to the last built WarpX executable.

More details on running simulations are in the section Run WarpX. Alternatively, read on and also build our PICMI Python interface.

PICMI Python Bindings

Note

Preparation: make sure you work with up-to-date Python tooling.

python3 -m pip install -U pip
python3 -m pip install -U build packaging setuptools wheel
python3 -m pip install -U cmake
python3 -m pip install -r requirements.txt

For PICMI Python bindings, configure WarpX to produce a library and call our pip_install CMake target:

# find dependencies & configure for all WarpX dimensionalities
cmake -S . -B build_py -DWarpX_DIMS="1;2;RZ;3" -DWarpX_PYTHON=ON


# build and then call "python3 -m pip install ..."
cmake --build build_py --target pip_install -j 4

That’s it! You can now run a first 3D PICMI script from our examples.
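For instance, a sketch of running such a script (the path is illustrative; pick any PICMI script from the Examples/ directory of the repository):

# illustrative path: any PICMI script from the repository's Examples/ directory works
python3 Examples/Physics_applications/laser_acceleration/PICMI_inputs_3d.py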

Developers could now change the WarpX source code and then call the build line again to refresh the Python installation.

Tip

If you do not develop with a user-level package manager, e.g., because you rely on an HPC system’s environment modules, then consider setting up a virtual environment via Python venv. Otherwise, without a virtual environment, you likely need to add the CMake option -DPYINSTALLOPTIONS="--user".
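A minimal sketch of such a virtual environment setup (the prefix path is just an example):

python3 -m venv $HOME/sw/venvs/warpx
source $HOME/sw/venvs/warpx/bin/activate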

Python Bindings (Package Management)

This section is relevant for Python package management, mainly for maintainers or people who prefer to interact only with pip.

One can build and install pywarpx from the root of the WarpX source tree:

python3 -m pip wheel -v .
python3 -m pip install pywarpx*whl

This will call the CMake logic above implicitly. Using this workflow has the advantage that it can build and package up multiple libraries with varying WarpX_DIMS into one pywarpx package.

Environment variables can be used to control the build step:

Environment Variable (Default & Values): Description

  • WARPX_COMPUTE (NOACC/OMP/CUDA/SYCL/HIP): On-node, accelerated computing backend
  • WARPX_DIMS ("1;2;3;RZ"): Simulation dimensionalities (semicolon-separated list)
  • WARPX_EB (ON/OFF): Embedded boundary support (not supported in RZ yet)
  • WARPX_MPI (ON/OFF): Multi-node support (message-passing)
  • WARPX_OPENPMD (ON/OFF): openPMD I/O (HDF5, ADIOS)
  • WARPX_PRECISION (SINGLE/DOUBLE): Floating point precision (single/double)
  • WARPX_PARTICLE_PRECISION (SINGLE/DOUBLE): Particle floating point precision (single/double), defaults to WarpX_PRECISION value if not set
  • WARPX_PSATD (ON/OFF): Spectral solver
  • WARPX_QED (ON/OFF): PICSAR QED (requires PICSAR)
  • WARPX_QED_TABLE_GEN (ON/OFF): QED table generation (requires PICSAR and Boost)
  • BUILD_PARALLEL (2): Number of threads to use for parallel builds
  • BUILD_SHARED_LIBS (ON/OFF): Build shared libraries for dependencies
  • HDF5_USE_STATIC_LIBRARIES (ON/OFF): Prefer static libraries for HDF5 dependency (openPMD)
  • ADIOS_USE_STATIC_LIBS (ON/OFF): Prefer static libraries for ADIOS1 dependency (openPMD)
  • WARPX_AMREX_SRC (None): Absolute path to AMReX source directory (preferred if set)
  • WARPX_AMREX_REPO (None, uses cmake default): Repository URI to pull and build AMReX from
  • WARPX_AMREX_BRANCH (None, uses cmake default): Repository branch for WARPX_AMREX_REPO
  • WARPX_AMREX_INTERNAL (ON/OFF): Needs a pre-installed AMReX library if set to OFF
  • WARPX_OPENPMD_SRC (None): Absolute path to openPMD-api source directory (preferred if set)
  • WARPX_OPENPMD_INTERNAL (ON/OFF): Needs a pre-installed openPMD-api library if set to OFF
  • WARPX_PICSAR_SRC (None): Absolute path to PICSAR source directory (preferred if set)
  • WARPX_PICSAR_INTERNAL (ON/OFF): Needs a pre-installed PICSAR library if set to OFF
  • WARPX_PYAMREX_SRC (None): Absolute path to pyAMReX source directory (preferred if set)
  • WARPX_PYAMREX_INTERNAL (ON/OFF): Needs a pre-installed pyAMReX library if set to OFF
  • WARPX_PYTHON_IPO (ON/OFF): Build Python w/ interprocedural/link optimization (IPO/LTO)
  • WARPX_PYBIND11_SRC (None): Absolute path to pybind11 source directory (preferred if set)
  • WARPX_PYBIND11_INTERNAL (ON/OFF): Needs a pre-installed pybind11 library if set to OFF
  • WARPX_CCACHE_PROGRAM (first found ccache executable): Set to NO to disable CCache
  • PYWARPX_LIB_DIR (None): If set, search for pre-built WarpX C++ libraries (see below)

Note that we currently and intentionally default WARPX_MPI to OFF, to simplify a first install from source.

Some hints and workflows follow. Developers who want to test a change of the source code without changing the pywarpx version number can force a reinstall via:

python3 -m pip install --force-reinstall --no-deps -v .

Some developers like to code directly against a local copy of AMReX, changing both code bases at the same time:

WARPX_AMREX_SRC=$PWD/../amrex python3 -m pip install --force-reinstall --no-deps -v .

Additional environment controls common for CMake (see above) can be set as well, e.g. CC, CXX, and CMAKE_PREFIX_PATH hints. So another, more sophisticated example might be: use Clang as the compiler, build with local source copies of PICSAR and AMReX, support the PSATD solver, MPI and openPMD, hint a parallel HDF5 installation in $HOME/sw/hdf5-parallel-1.10.4, and only build 2D and 3D geometry:

CC=$(which clang) CXX=$(which clang++) WARPX_AMREX_SRC=$PWD/../amrex WARPX_PICSAR_SRC=$PWD/../picsar WARPX_PSATD=ON WARPX_MPI=ON WARPX_DIMS="2;3" CMAKE_PREFIX_PATH=$HOME/sw/hdf5-parallel-1.10.4:$CMAKE_PREFIX_PATH python3 -m pip install --force-reinstall --no-deps -v .

Here we wrote this all in one line, but one can also set all environment variables in a development environment and keep the pip call nice and short as in the beginning. Note that you need to use absolute paths for external source trees, because pip builds in a temporary directory, e.g. export WARPX_AMREX_SRC=$HOME/src/amrex.

All of this can also be run from CMake. This is the workflow most developers will prefer as it allows rapid re-compiles:

# build WarpX executables and libraries
cmake -S . -B build_py -DWarpX_DIMS="1;2;RZ;3" -DWarpX_PYTHON=ON

# build & install Python only
cmake --build build_py -j 4 --target pip_install

There is also a --target pip_install_nodeps option that skips pip-based dependency checks.
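For example, reusing the build directory from above:

cmake --build build_py -j 4 --target pip_install_nodeps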

WarpX release managers might also want to generate a self-contained source package that can be distributed to exotic architectures:

python3 setup.py sdist --dist-dir .
python3 -m pip wheel -v pywarpx-*.tar.gz
python3 -m pip install *whl

The above steps can also be executed in one go to build from source on a machine:

python3 setup.py sdist --dist-dir .
python3 -m pip install -v pywarpx-*.tar.gz

Last but not least, you can uninstall pywarpx as usual with:

python3 -m pip uninstall pywarpx

HPC

On selected high-performance computing (HPC) systems, WarpX has documented or even pre-built installation routines. Follow the guide here instead of the generic installation routines for optimal stability and best performance.

warpx.profile

Use a warpx.profile file to set up your software environment without colliding with other software. Ideally, store that file directly in your $HOME/ and source it after connecting to the machine:

source $HOME/warpx.profile

We list example warpx.profile files below, which can be used to set up WarpX on various HPC systems.

HPC Machines

This section documents quick-start guides for a selection of supercomputers that WarpX users are active on.

Adastra (CINES)

The Adastra cluster is located at CINES (France). Each node contains 4 AMD MI250X GPUs, each with 2 Graphics Compute Dies (GCDs) for a total of 8 GCDs per node. You can think of the 8 GCDs as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).

Introduction

If you are new to this system, please see the following resources:

  • Adastra user guide

  • Batch system: Slurm

  • Production directories:

    • $SHAREDSCRATCHDIR: meant for short-term data storage, shared with all members of a project, purged every 30 days (17.6 TB default quota)

    • $SCRATCHDIR: meant for short-term data storage, single user, purged every 30 days

    • $SHAREDWORKDIR: meant for mid-term data storage, shared with all members of a project, never purged (4.76 TB default quota)

    • $WORKDIR: meant for mid-term data storage, single user, never purged

    • $STORE : meant for long term storage, single user, never purged, backed up

    • $SHAREDHOMEDIR : meant for scripts and tools, shared with all members of a project, never purged, backed up

    • $HOME : meant for scripts and tools, single user, never purged, backed up

Preparation

The following instructions will install WarpX in the $SHAREDHOMEDIR directory, which is shared among all the members of a given project. Due to the inode quota enforced for this machine, a shared installation of WarpX is advised.

Use the following commands to download the WarpX source code:

# If you have multiple projects, activate the project that you want to use with:
#
# myproject -a YOUR_PROJECT_NAME
#
git clone https://github.com/ECP-WarpX/WarpX.git $SHAREDHOMEDIR/src/warpx

We use system software modules, add environment hints and further dependencies via the file $SHAREDHOMEDIR/adastra_warpx.profile. Create it now:

cp $SHAREDHOMEDIR/src/warpx/Tools/machines/adastra-cines/adastra_warpx.profile.example $SHAREDHOMEDIR/adastra_warpx.profile
Script Details
# please set your project account and uncomment the following two lines
#export proj=your_project_id
#myproject -a $proj

# required dependencies
module purge
module load cpe/23.12
module load craype-accel-amd-gfx90a craype-x86-trento
module load PrgEnv-cray
module load CCE-GPU-3.0.0
module load amd-mixed/5.2.3

# optional: for PSATD in RZ geometry support
export CMAKE_PREFIX_PATH=${SHAREDHOMEDIR}/sw/adastra/gpu/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SHAREDHOMEDIR}/sw/adastra/gpu/lapackpp-master:$CMAKE_PREFIX_PATH
export LD_LIBRARY_PATH=${SHAREDHOMEDIR}/sw/adastra/gpu/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SHAREDHOMEDIR}/sw/adastra/gpu/lapackpp-master/lib64:$LD_LIBRARY_PATH

# optional: for QED lookup table generation support
module load boost/1.83.0-mpi-python3

# optional: for openPMD support
module load cray-hdf5-parallel
export CMAKE_PREFIX_PATH=${SHAREDHOMEDIR}/sw/adastra/gpu/c-blosc-1.21.1:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SHAREDHOMEDIR}/sw/adastra/gpu/adios2-2.8.3:$CMAKE_PREFIX_PATH

export PATH=${HOME}/sw/adastra/gpu/adios2-2.8.3/bin:${PATH}

# optional: for Python bindings or libEnsemble
module load cray-python/3.11.5

# fix system defaults: do not escape $ with a \ on tab completion
shopt -s direxpand

# make output group-readable by default
umask 0027

# an alias to request an interactive batch node for one hour
# for parallel execution, start on the batch node: srun <command>
alias getNode="salloc --account=$proj --job-name=warpx --constraint=MI250 --nodes=1 --ntasks-per-node=8 --cpus-per-task=8 --gpus-per-node=8 --threads-per-core=1 --exclusive --time=01:00:00"
# note: to access a compute node, first get its name (look at the `NODELIST` column)
#    $ squeue -u $USER
# and then to ssh into the node:
#    $ ssh node_name

# GPU-aware MPI
export MPICH_GPU_SUPPORT_ENABLED=1

# optimize ROCm/HIP compilation for MI250X
export AMREX_AMD_ARCH=gfx90a

# compiler environment hints
export CC=$(which cc)
export CXX=$(which CC)
export FC=$(which amdflang)

Using a text editor such as nano, emacs, or vim (all available by default on Adastra login nodes), edit the 2nd line of this script, which sets the proj variable to your project account, and uncomment the 3rd line (which sets $proj as the active project).

Important

Now, and as the first step on future logins to Adastra, activate these environment settings:

source $SHAREDHOMEDIR/adastra_warpx.profile

Finally, since Adastra does not yet provide software modules for some of our dependencies, install them once:

bash $SHAREDHOMEDIR/src/warpx/Tools/machines/adastra-cines/install_dependencies.sh
source $SHAREDHOMEDIR/sw/adastra/gpu/venvs/warpx-adastra/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl, Luca Fedeli
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was adastra_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your adastra_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Remove old dependencies #####################################################
#
SW_DIR="${SHAREDHOMEDIR}/sw/adastra/gpu"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# General extra dependencies ##################################################
#

# BLAS++ (for PSATD+RZ)
if [ -d $SHAREDHOMEDIR/src/blaspp ]
then
  cd $SHAREDHOMEDIR/src/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git $SHAREDHOMEDIR/src/blaspp
fi
rm -rf $SHAREDHOMEDIR/src/blaspp-adastra-gpu-build
CXX=$(which CC) cmake -S $SHAREDHOMEDIR/src/blaspp -B $SHAREDHOMEDIR/src/blaspp-adastra-gpu-build -Duse_openmp=OFF -Dgpu_backend=hip -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build $SHAREDHOMEDIR/src/blaspp-adastra-gpu-build --target install --parallel 16
rm -rf $SHAREDHOMEDIR/src/blaspp-adastra-gpu-build

# LAPACK++ (for PSATD+RZ)
if [ -d $SHAREDHOMEDIR/src/lapackpp ]
then
  cd $SHAREDHOMEDIR/src/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git $SHAREDHOMEDIR/src/lapackpp
fi
rm -rf $SHAREDHOMEDIR/src/lapackpp-adastra-gpu-build
CXX=$(which CC) CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S $SHAREDHOMEDIR/src/lapackpp -B $SHAREDHOMEDIR/src/lapackpp-adastra-gpu-build -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build $SHAREDHOMEDIR/src/lapackpp-adastra-gpu-build --target install --parallel 16
rm -rf $SHAREDHOMEDIR/src/lapackpp-adastra-gpu-build

# c-blosc (I/O compression, for OpenPMD)
if [ -d $SHAREDHOMEDIR/src/c-blosc ]
then
  # git repository is already there
  :
else
  git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git $SHAREDHOMEDIR/src/c-blosc
fi
rm -rf $SHAREDHOMEDIR/src/c-blosc-ad-build
cmake -S $SHAREDHOMEDIR/src/c-blosc -B $SHAREDHOMEDIR/src/c-blosc-ad-build -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/c-blosc-1.21.1
cmake --build $SHAREDHOMEDIR/src/c-blosc-ad-build --target install --parallel 16
rm -rf $SHAREDHOMEDIR/src/c-blosc-ad-build

# ADIOS2 v. 2.8.3 (for OpenPMD)
if [ -d $SHAREDHOMEDIR/src/adios2 ]
then
  # git repository is already there
  :
else
  git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git $SHAREDHOMEDIR/src/adios2
fi
rm -rf $SHAREDHOMEDIR/src/adios2-ad-build
cmake -S $SHAREDHOMEDIR/src/adios2 -B $SHAREDHOMEDIR/src/adios2-ad-build -DADIOS2_USE_Blosc=ON -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/adios2-2.8.3
cmake --build $SHAREDHOMEDIR/src/adios2-ad-build --target install -j 16
rm -rf $SHAREDHOMEDIR/src/adios2-ad-build


# Python ######################################################################
#
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade virtualenv
python3 -m pip cache purge
rm -rf ${SW_DIR}/venvs/warpx-adastra
python3 -m venv ${SW_DIR}/venvs/warpx-adastra
source ${SW_DIR}/venvs/warpx-adastra/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
MPICC="cc -shared" python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# install or update WarpX dependencies such as picmistandard
python3 -m pip install --upgrade -r $SHAREDHOMEDIR/src/warpx/requirements.txt
# optional: for libEnsemble
python3 -m pip install -r $SHAREDHOMEDIR/src/warpx/Tools/LibEnsemble/requirements.txt
# optional: for optimas (based on libEnsemble & ax->botorch->gpytorch->pytorch)
#python3 -m pip install --upgrade torch --index-url https://download.pytorch.org/whl/rocm5.4.2
#python3 -m pip install -r $SHAREDHOMEDIR/src/warpx/Tools/optimas/requirements.txt
Compilation

Use the following cmake commands to compile the application executable:

cd $SHAREDHOMEDIR/src/warpx
rm -rf build_adastra

cmake -S . -B build_adastra -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_adastra -j 16

The WarpX application executables are now in $SHAREDHOMEDIR/src/warpx/build_adastra/bin/. Additionally, the following commands will install WarpX as a Python module:

rm -rf build_adastra_py

cmake -S . -B build_adastra_py -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_adastra_py -j 16 --target pip_install

Now, you can submit Adastra compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit Adastra jobs (example inputs). For executables, you can reference their location in your job script.

Update WarpX & Dependencies

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

cd $SHAREDHOMEDIR/src/warpx

# read the output of this command - does it look ok?
git status

# get the latest WarpX source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log     # press q to exit

And, if needed, update your $SHAREDHOMEDIR/adastra_warpx.profile file and re-run the dependency install script above.

As a last step, clean the build directory rm -rf $SHAREDHOMEDIR/src/warpx/build_adastra and rebuild WarpX.

Running
MI250X GPUs (2x64 GB)

For non-interactive runs, use a Slurm batch script like the one below. You can copy this file from $SHAREDHOMEDIR/src/warpx/Tools/machines/adastra-cines/submit.sh.
#!/bin/bash
#SBATCH --account=<account_to_charge>
#SBATCH --job-name=warpx
#SBATCH --constraint=MI250
#SBATCH --nodes=2
#SBATCH --exclusive
#SBATCH --output=%x-%j.out
#SBATCH --time=00:10:00

module purge

# A CrayPE environment version
module load cpe/23.12
# An architecture
module load craype-accel-amd-gfx90a craype-x86-trento
# A compiler to target the architecture
module load PrgEnv-cray
# Some architecture related libraries and tools
module load CCE-GPU-3.0.0
module load amd-mixed/5.2.3

date
module list

export MPICH_GPU_SUPPORT_ENABLED=1

# note
# this environment setting is currently needed to work-around a
# known issue with Libfabric
#export FI_MR_CACHE_MAX_COUNT=0  # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks  # alternative cache monitor

# note
# On machines with similar architectures (Frontier, OLCF) these settings
# seem to prevent the following issue:
# OLCFDEV-1597: OFI Poll Failed UNDELIVERABLE Errors
# https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#olcfdev-1597-ofi-poll-failed-undeliverable-errors
export MPICH_SMP_SINGLE_COPY_MODE=NONE
export FI_CXI_RX_MATCH_MODE=software

# note
# this environment setting is needed to avoid that rocFFT writes a cache in
# the home directory, which does not scale.
export ROCFFT_RTC_CACHE_PATH=/dev/null

export OMP_NUM_THREADS=1
export WARPX_NMPI_PER_NODE=8
export TOTAL_NMPI=$(( ${SLURM_JOB_NUM_NODES} * ${WARPX_NMPI_PER_NODE} ))
srun -N${SLURM_JOB_NUM_NODES} -n${TOTAL_NMPI} --ntasks-per-node=${WARPX_NMPI_PER_NODE} \
     --cpus-per-task=8 --threads-per-core=1 --gpu-bind=closest \
    ./warpx inputs > output.txt
Post-Processing

Note

TODO: Document any Jupyter or data services.

Known System Issues

Warning

May 16th, 2022: There is a caching bug in Libfabric that causes WarpX simulations to occasionally hang on more than 1 node.

As a work-around, please export the following environment variable in your job scripts until the issue is fixed:

#export FI_MR_CACHE_MAX_COUNT=0  # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks  # alternative cache monitor

Warning

Sep 2nd, 2022: rocFFT in ROCm 5.1-5.3 tries to write to a cache in the home area by default. This does not scale, disable it via:

export ROCFFT_RTC_CACHE_PATH=/dev/null

Warning

January, 2023: We discovered a regression in AMD ROCm, leading to 2x slower current deposition (and other slowdowns) in ROCm 5.3 and 5.4. Reported to AMD and fixed for the next release of ROCm.

Stay with the ROCm 5.2 module to avoid this slowdown.

Crusher (OLCF)

The Crusher cluster is located at OLCF.

On Crusher, each compute node provides four AMD MI250X GPUs, each with two Graphics Compute Dies (GCDs) for a total of 8 GCDs per node. You can think of the 8 GCDs as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).

Introduction

If you are new to this system, please see the following resources:

  • Crusher user guide

  • Batch system: Slurm

  • Production directories:

    • $HOME: per-user directory, use only for inputs, source and scripts; backed up; mounted as read-only on compute nodes, which means you cannot run simulations in it (50 GB quota)

    • $PROJWORK/$proj/: shared with all members of a project, purged every 90 days, Lustre (recommended)

    • $MEMBERWORK/$proj/: single user, purged every 90 days, Lustre (usually smaller quota, 50TB default quota)

    • $WORLDWORK/$proj/: shared with all users, purged every 90 days, Lustre (50TB default quota)

Note: the Orion Lustre filesystem on Frontier and Crusher, and the older Alpine GPFS filesystem on Summit are not mounted on each other’s machines. Use Globus to transfer data between them if needed.

Preparation

Use the following commands to download the WarpX source code:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

We use system software modules, add environment hints and further dependencies via the file $HOME/crusher_warpx.profile. Create it now:

cp $HOME/src/warpx/Tools/machines/crusher-olcf/crusher_warpx.profile.example $HOME/crusher_warpx.profile
Script Details
# please set your project account
#    note: WarpX ECP members use aph114 or aph114_crusher
export proj=""  # change me!

# remembers the location of this script
export MY_PROFILE=$(cd $(dirname $BASH_SOURCE) && pwd)"/"$(basename $BASH_SOURCE)
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your $MY_PROFILE file! Please edit its line 2 to continue!"; return; fi

# required dependencies
module load cmake/3.23.2
module load craype-accel-amd-gfx90a
module load rocm/5.2.0  # waiting for 5.6 for next bump
module load cray-mpich
module load cce/15.0.0  # must be loaded after rocm

# optional: faster builds
module load ccache
module load ninja

# optional: just an additional text editor
module load nano

# optional: for PSATD in RZ geometry support
export CMAKE_PREFIX_PATH=${HOME}/sw/crusher/gpu/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${HOME}/sw/crusher/gpu/lapackpp-master:$CMAKE_PREFIX_PATH
export LD_LIBRARY_PATH=${HOME}/sw/crusher/gpu/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${HOME}/sw/crusher/gpu/lapackpp-master/lib64:$LD_LIBRARY_PATH

# optional: for QED lookup table generation support
module load boost/1.79.0-cxx17

# optional: for openPMD support
module load adios2/2.8.3
module load hdf5/1.14.0

# optional: for Python bindings or libEnsemble
module load cray-python/3.9.13.1

if [ -d "${HOME}/sw/crusher/gpu/venvs/warpx-crusher" ]
then
  source ${HOME}/sw/crusher/gpu/venvs/warpx-crusher/bin/activate
fi

# fix system defaults: do not escape $ with a \ on tab completion
shopt -s direxpand

# make output group-readable by default
umask 0027

# an alias to request an interactive batch node for one hour
#   for parallel execution, start on the batch node: srun <command>
alias getNode="salloc -A $proj -J warpx -t 01:00:00 -p batch -N 1"
# an alias to run a command on a batch node for up to 30min
#   usage: runNode <command>
alias runNode="srun -A $proj -J warpx -t 00:30:00 -p batch -N 1"

# GPU-aware MPI
export MPICH_GPU_SUPPORT_ENABLED=1

# optimize ROCm/HIP compilation for MI250X
export AMREX_AMD_ARCH=gfx90a

# compiler environment hints
export CC=$(which hipcc)
export CXX=$(which hipcc)
export FC=$(which ftn)
export CFLAGS="-I${ROCM_PATH}/include"
export CXXFLAGS="-I${ROCM_PATH}/include -Wno-pass-failed"
export LDFLAGS="-L${ROCM_PATH}/lib -lamdhip64 ${PE_MPICH_GTL_DIR_amd_gfx90a} -lmpi_gtl_hsa"

Edit the 2nd line of this script, which sets the export proj="" variable. For example, if you are a member of the project aph114, then run vi $HOME/crusher_warpx.profile. Enter the edit mode by typing i and edit line 2 to read:

export proj="aph114"

Exit the vi editor with Esc and then type :wq (write & quit).
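If you prefer not to use vi, the same edit can be made non-interactively, e.g. with sed (assuming project aph114):

sed -i 's/export proj=""/export proj="aph114"/' $HOME/crusher_warpx.profile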

Important

Now, and as the first step on future logins to Crusher, activate these environment settings:

source $HOME/crusher_warpx.profile

Finally, since Crusher does not yet provide software modules for some of our dependencies, install them once:

bash $HOME/src/warpx/Tools/machines/crusher-olcf/install_dependencies.sh
source $HOME/sw/crusher/gpu/venvs/warpx-crusher/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was crusher_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your crusher_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Check $proj variable is correct and has a corresponding CFS directory #######
#
if [ ! -d "${PROJWORK}/${proj}/" ]
then
    echo "WARNING: The directory $PROJWORK/$proj/ does not exist!"
    echo "Is the \$proj environment variable of value \"$proj\" correctly set? "
    echo "Please edit line 2 of your crusher_warpx.profile file to continue!"
    exit
fi


# Remove old dependencies #####################################################
#
SW_DIR="${HOME}/sw/crusher/gpu"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# General extra dependencies ##################################################
#

# BLAS++ (for PSATD+RZ)
if [ -d $HOME/src/blaspp ]
then
  cd $HOME/src/blaspp
  git fetch
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git $HOME/src/blaspp
fi
rm -rf $HOME/src/blaspp-crusher-gpu-build
CXX=$(which CC) cmake -S $HOME/src/blaspp -B $HOME/src/blaspp-crusher-gpu-build -Duse_openmp=OFF -Dgpu_backend=hip -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build $HOME/src/blaspp-crusher-gpu-build --target install --parallel 16
rm -rf $HOME/src/blaspp-crusher-gpu-build

# LAPACK++ (for PSATD+RZ)
if [ -d $HOME/src/lapackpp ]
then
  cd $HOME/src/lapackpp
  git fetch
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git $HOME/src/lapackpp
fi
rm -rf $HOME/src/lapackpp-crusher-gpu-build
CXX=$(which CC) CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S $HOME/src/lapackpp -B $HOME/src/lapackpp-crusher-gpu-build -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build $HOME/src/lapackpp-crusher-gpu-build --target install --parallel 16
rm -rf $HOME/src/lapackpp-crusher-gpu-build


# Python ######################################################################
#
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade virtualenv
python3 -m pip cache purge
rm -rf ${SW_DIR}/venvs/warpx-crusher
python3 -m venv ${SW_DIR}/venvs/warpx-crusher
source ${SW_DIR}/venvs/warpx-crusher/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
MPICC="cc -shared" python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# install or update WarpX dependencies such as picmistandard
python3 -m pip install --upgrade -r $HOME/src/warpx/requirements.txt
# optional: for libEnsemble
python3 -m pip install -r $HOME/src/warpx/Tools/LibEnsemble/requirements.txt
# optional: for optimas (based on libEnsemble & ax->botorch->gpytorch->pytorch)
#python3 -m pip install --upgrade torch --index-url https://download.pytorch.org/whl/rocm5.4.2
#python3 -m pip install -r $HOME/src/warpx/Tools/optimas/requirements.txt
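Once the install script has finished and the virtual environment is activated, a quick sanity check can confirm that Python packages resolve from the new environment (illustrative, optional):

which python3  # should point into $HOME/sw/crusher/gpu/venvs/warpx-crusher
python3 -c "import mpi4py, openpmd_api; print(mpi4py.__version__, openpmd_api.__version__)"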
Compilation

Use the following cmake commands to compile the application executable:

cd $HOME/src/warpx
rm -rf build_crusher

cmake -S . -B build_crusher -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_crusher -j 16

The WarpX application executables are now in $HOME/src/warpx/build_crusher/bin/. Additionally, the following commands will install WarpX as a Python module:

rm -rf build_crusher_py

cmake -S . -B build_crusher_py -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_crusher_py -j 16 --target pip_install

Now, you can submit Crusher compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit Crusher jobs (example inputs). For executables, you can reference their location in your job script or copy them to a location in $PROJWORK/$proj/.
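For example, staging the 3D executable in the project work area could look like this (the run directory name is illustrative):

mkdir -p $PROJWORK/$proj/warpx_runs/test_3d
cp $HOME/src/warpx/build_crusher/bin/warpx.3d $PROJWORK/$proj/warpx_runs/test_3d/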

Update WarpX & Dependencies

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

cd $HOME/src/warpx

# read the output of this command - does it look ok?
git status

# get the latest WarpX source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log     # press q to exit

And, if needed, update your $HOME/crusher_warpx.profile file and re-run the dependency install script (install_dependencies.sh) as described above.

As a last step, clean the build directory rm -rf $HOME/src/warpx/build_crusher and rebuild WarpX.

Running
MI250X GPUs (2x64 GB)

After requesting an interactive node with the getNode alias above, run a simulation like this, here using 8 MPI ranks and a single node:

runNode ./warpx inputs

Or in non-interactive runs:

You can copy this file from $HOME/src/warpx/Tools/machines/crusher-olcf/submit.sh.
#!/usr/bin/env bash

#SBATCH -A <project id>
#    note: WarpX ECP members use aph114
#SBATCH -J warpx
#SBATCH -o %x-%j.out
#SBATCH -t 00:10:00
#SBATCH -p batch
#SBATCH --ntasks-per-node=8
# Since 2022-12-29 Crusher is using a low-noise mode layout,
# making only 7 instead of 8 cores available per process
# https://docs.olcf.ornl.gov/systems/crusher_quick_start_guide.html#id6
#SBATCH --cpus-per-task=7
#SBATCH --gpus-per-task=1
#SBATCH --gpu-bind=closest
#SBATCH -N 1

# From the documentation:
# Each Crusher compute node consists of [1x] 64-core AMD EPYC 7A53
# "Optimized 3rd Gen EPYC" CPU (with 2 hardware threads per physical core) with
# access to 512 GB of DDR4 memory.
# Each node also contains [4x] AMD MI250X, each with 2 Graphics Compute Dies
# (GCDs) for a total of 8 GCDs per node. The programmer can think of the 8 GCDs
# as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).

# note (5-16-22, OLCFHELP-6888)
# this environment setting is currently needed on Crusher to work-around a
# known issue with Libfabric
#export FI_MR_CACHE_MAX_COUNT=0  # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks  # alternative cache monitor

# Seen since August 2023 on Frontier, adapting the same for Crusher
# OLCFDEV-1597: OFI Poll Failed UNDELIVERABLE Errors
# https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#olcfdev-1597-ofi-poll-failed-undeliverable-errors
export MPICH_SMP_SINGLE_COPY_MODE=NONE
export FI_CXI_RX_MATCH_MODE=software

# note (9-2-22, OLCFDEV-1079)
# this environment setting is needed to avoid that rocFFT writes a cache in
# the home directory, which does not scale.
export ROCFFT_RTC_CACHE_PATH=/dev/null

export OMP_NUM_THREADS=1
export WARPX_NMPI_PER_NODE=8
export TOTAL_NMPI=$(( ${SLURM_JOB_NUM_NODES} * ${WARPX_NMPI_PER_NODE} ))
srun -N${SLURM_JOB_NUM_NODES} -n${TOTAL_NMPI} --ntasks-per-node=${WARPX_NMPI_PER_NODE} \
    ./warpx.3d inputs > output_${SLURM_JOBID}.txt
Post-Processing

For post-processing, most users use Python via OLCF’s Jupyter service (Docs).

Please follow the same guidance as for OLCF Summit post-processing.

Known System Issues

Note

Please see the Frontier Known System Issues due to the similarity of the two systems.

Frontier (OLCF)

The Frontier cluster is located at OLCF.

On Frontier, each compute node provides four AMD MI250X GPUs, each with two Graphics Compute Dies (GCDs) for a total of 8 GCDs per node. You can think of the 8 GCDs as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).

Introduction

If you are new to this system, please see the following resources:

  • Frontier user guide

  • Batch system: Slurm

  • Filesystems:

    • $HOME: per-user directory, use only for inputs, source and scripts; backed up; mounted as read-only on compute nodes, which means you cannot run simulations in it (50 GB quota)

    • $PROJWORK/$proj/: shared with all members of a project, purged every 90 days, Lustre (recommended)

    • $MEMBERWORK/$proj/: single user, purged every 90 days, Lustre (usually smaller quota, 50TB default quota)

    • $WORLDWORK/$proj/: shared with all users, purged every 90 days, Lustre (50TB default quota)

Note: the Orion Lustre filesystem on Frontier and the older Alpine GPFS filesystem on Summit are not mounted on each other’s machines. Use Globus to transfer data between them if needed.

Preparation

Use the following commands to download the WarpX source code:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

We use system software modules, add environment hints and further dependencies via the file $HOME/frontier_warpx.profile. Create it now:

cp $HOME/src/warpx/Tools/machines/frontier-olcf/frontier_warpx.profile.example $HOME/frontier_warpx.profile
Script Details
# please set your project account
export proj=""  # change me!

# remembers the location of this script
export MY_PROFILE=$(cd $(dirname $BASH_SOURCE) && pwd)"/"$(basename $BASH_SOURCE)
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your $MY_PROFILE file! Please edit its line 2 to continue!"; return; fi

# required dependencies
module load cmake/3.23.2
module load craype-accel-amd-gfx90a
module load rocm/5.2.0  # waiting for 5.6 for next bump
module load cray-mpich
module load cce/15.0.0  # must be loaded after rocm

# optional: faster builds
module load ccache
module load ninja

# optional: just an additional text editor
module load nano

# optional: for PSATD in RZ geometry support
export CMAKE_PREFIX_PATH=${HOME}/sw/frontier/gpu/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${HOME}/sw/frontier/gpu/lapackpp-master:$CMAKE_PREFIX_PATH
export LD_LIBRARY_PATH=${HOME}/sw/frontier/gpu/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${HOME}/sw/frontier/gpu/lapackpp-master/lib64:$LD_LIBRARY_PATH

# optional: for QED lookup table generation support
module load boost/1.79.0-cxx17

# optional: for openPMD support
module load adios2/2.8.3
module load hdf5/1.14.0

# optional: for Python bindings or libEnsemble
module load cray-python/3.9.13.1

if [ -d "${HOME}/sw/frontier/gpu/venvs/warpx-frontier" ]
then
  source ${HOME}/sw/frontier/gpu/venvs/warpx-frontier/bin/activate
fi

# fix system defaults: do not escape $ with a \ on tab completion
shopt -s direxpand

# make output group-readable by default
umask 0027

# an alias to request an interactive batch node for one hour
#   for parallel execution, start on the batch node: srun <command>
alias getNode="salloc -A $proj -J warpx -t 01:00:00 -p batch -N 1"
# an alias to run a command on a batch node for up to 30min
#   usage: runNode <command>
alias runNode="srun -A $proj -J warpx -t 00:30:00 -p batch -N 1"

# GPU-aware MPI
export MPICH_GPU_SUPPORT_ENABLED=1

# optimize ROCm/HIP compilation for MI250X
export AMREX_AMD_ARCH=gfx90a

# compiler environment hints
export CC=$(which hipcc)
export CXX=$(which hipcc)
export FC=$(which ftn)
export CFLAGS="-I${ROCM_PATH}/include"
export CXXFLAGS="-I${ROCM_PATH}/include -Wno-pass-failed"
export LDFLAGS="-L${ROCM_PATH}/lib -lamdhip64 ${PE_MPICH_GTL_DIR_amd_gfx90a} -lmpi_gtl_hsa"

Edit the 2nd line of this script, which sets the export proj="" variable. For example, if you are a member of the project aph114, then run vi $HOME/frontier_warpx.profile. Enter the edit mode by typing i and edit line 2 to read:

export proj="aph114"

Exit the vi editor with Esc and then type :wq (write & quit).

Important

Now, and as the first step on future logins to Frontier, activate these environment settings:

source $HOME/frontier_warpx.profile

Finally, since Frontier does not yet provide software modules for some of our dependencies, install them once:

bash $HOME/src/warpx/Tools/machines/frontier-olcf/install_dependencies.sh
source $HOME/sw/frontier/gpu/venvs/warpx-frontier/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was frontier_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your frontier_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Check $proj variable is correct and has a corresponding CFS directory #######
#
if [ ! -d "${PROJWORK}/${proj}/" ]
then
    echo "WARNING: The directory $PROJWORK/$proj/ does not exist!"
    echo "Is the \$proj environment variable of value \"$proj\" correctly set? "
    echo "Please edit line 2 of your frontier_warpx.profile file to continue!"
    exit
fi


# Remove old dependencies #####################################################
#
SW_DIR="${HOME}/sw/frontier/gpu"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# General extra dependencies ##################################################
#

# BLAS++ (for PSATD+RZ)
if [ -d $HOME/src/blaspp ]
then
  cd $HOME/src/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git $HOME/src/blaspp
fi
rm -rf $HOME/src/blaspp-frontier-gpu-build
CXX=$(which CC) cmake -S $HOME/src/blaspp -B $HOME/src/blaspp-frontier-gpu-build -Duse_openmp=OFF -Dgpu_backend=hip -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build $HOME/src/blaspp-frontier-gpu-build --target install --parallel 16
rm -rf $HOME/src/blaspp-frontier-gpu-build

# LAPACK++ (for PSATD+RZ)
if [ -d $HOME/src/lapackpp ]
then
  cd $HOME/src/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git $HOME/src/lapackpp
fi
rm -rf $HOME/src/lapackpp-frontier-gpu-build
CXX=$(which CC) CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S $HOME/src/lapackpp -B $HOME/src/lapackpp-frontier-gpu-build -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build $HOME/src/lapackpp-frontier-gpu-build --target install --parallel 16
rm -rf $HOME/src/lapackpp-frontier-gpu-build


# Python ######################################################################
#
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade virtualenv
python3 -m pip cache purge
rm -rf ${SW_DIR}/venvs/warpx-frontier
python3 -m venv ${SW_DIR}/venvs/warpx-frontier
source ${SW_DIR}/venvs/warpx-frontier/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
# cupy and h5py need an older Cython
# https://github.com/cupy/cupy/issues/4610
# https://github.com/h5py/h5py/issues/2268
python3 -m pip install --upgrade "cython<3.0"
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
MPICC="cc -shared" python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# install or update WarpX dependencies such as picmistandard
python3 -m pip install --upgrade -r $HOME/src/warpx/requirements.txt
# cupy for ROCm
#   https://docs.cupy.dev/en/stable/install.html#building-cupy-for-rocm-from-source
#   https://github.com/cupy/cupy/issues/7830
CC=cc CXX=CC \
CUPY_INSTALL_USE_HIP=1  \
ROCM_HOME=${ROCM_PATH}  \
HCC_AMDGPU_TARGET=${AMREX_AMD_ARCH}  \
  python3 -m pip install -v cupy
# optional: for libEnsemble
python3 -m pip install -r $HOME/src/warpx/Tools/LibEnsemble/requirements.txt
# optional: for optimas (based on libEnsemble & ax->botorch->gpytorch->pytorch)
#python3 -m pip install --upgrade torch --index-url https://download.pytorch.org/whl/rocm5.4.2
#python3 -m pip install -r $HOME/src/warpx/Tools/optimas/requirements.txt
Compilation

Use the following cmake commands to compile the application executable:

cd $HOME/src/warpx
rm -rf build_frontier

cmake -S . -B build_frontier -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_frontier -j 16

The WarpX application executables are now in $HOME/src/warpx/build_frontier/bin/. Additionally, the following commands will install WarpX as a Python module:

rm -rf build_frontier_py

cmake -S . -B build_frontier_py -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_frontier_py -j 16 --target pip_install

Now, you can submit Frontier compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit Frontier jobs (example inputs). For executables, you can reference their location in your job script or copy them to a location in $PROJWORK/$proj/.
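If you installed the Python module, a quick import check from the activated virtual environment can verify the installation (illustrative, optional):

python3 -c "import pywarpx; print(pywarpx.__file__)"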

Update WarpX & Dependencies

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

cd $HOME/src/warpx

# read the output of this command - does it look ok?
git status

# get the latest WarpX source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log     # press q to exit

And, if needed, update your $HOME/frontier_warpx.profile file and re-run the dependency install script (install_dependencies.sh) as described above.

As a last step, clean the build directory rm -rf $HOME/src/warpx/build_frontier and rebuild WarpX.

Running
MI250X GPUs (2x64 GB)

After requesting an interactive node with the getNode alias above, run a simulation like this, here using 8 MPI ranks and a single node:

runNode ./warpx inputs

Or in non-interactive runs:

You can copy this file from $HOME/src/warpx/Tools/machines/frontier-olcf/submit.sh.
#!/usr/bin/env bash

#SBATCH -A <project id>
#SBATCH -J warpx
#SBATCH -o %x-%j.out
#SBATCH -t 00:10:00
#SBATCH -p batch
#SBATCH --ntasks-per-node=8
# Due to Frontier's Low-Noise Mode Layout only 7 instead of 8 cores are available per process
# https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#low-noise-mode-layout
#SBATCH --cpus-per-task=7
#SBATCH --gpus-per-task=1
#SBATCH --gpu-bind=closest
#SBATCH -N 20

# load cray libs and ROCm libs
#export LD_LIBRARY_PATH=${CRAY_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH}

# From the documentation:
# Each Frontier compute node consists of [1x] 64-core AMD EPYC 7A53
# "Optimized 3rd Gen EPYC" CPU (with 2 hardware threads per physical core) with
# access to 512 GB of DDR4 memory.
# Each node also contains [4x] AMD MI250X, each with 2 Graphics Compute Dies
# (GCDs) for a total of 8 GCDs per node. The programmer can think of the 8 GCDs
# as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).

# note (5-16-22 and 7-12-22)
# this environment setting is currently needed on Frontier to work-around a
# known issue with Libfabric (both in the May and June PE)
#export FI_MR_CACHE_MAX_COUNT=0  # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks  # alternative cache monitor

# Seen since August 2023
# OLCFDEV-1597: OFI Poll Failed UNDELIVERABLE Errors
# https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#olcfdev-1597-ofi-poll-failed-undeliverable-errors
export MPICH_SMP_SINGLE_COPY_MODE=NONE
export FI_CXI_RX_MATCH_MODE=software

# note (9-2-22, OLCFDEV-1079)
# this environment setting is needed to avoid that rocFFT writes a cache in
# the home directory, which does not scale.
export ROCFFT_RTC_CACHE_PATH=/dev/null

export OMP_NUM_THREADS=1
export WARPX_NMPI_PER_NODE=8
export TOTAL_NMPI=$(( ${SLURM_JOB_NUM_NODES} * ${WARPX_NMPI_PER_NODE} ))
srun -N${SLURM_JOB_NUM_NODES} -n${TOTAL_NMPI} --ntasks-per-node=${WARPX_NMPI_PER_NODE} \
    ./warpx inputs > output.txt
Post-Processing

For post-processing, most users use Python via OLCF’s Jupyter service (Docs).

Please follow the same guidance as for OLCF Summit post-processing.

Known System Issues

Warning

May 16th, 2022 (OLCFHELP-6888): There is a caching bug in Libfabric that causes WarpX simulations to occasionally hang on Frontier on more than 1 node.

As a work-around, please export the following environment variable in your job scripts until the issue is fixed:

#export FI_MR_CACHE_MAX_COUNT=0  # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks  # alternative cache monitor

Warning

Sep 2nd, 2022 (OLCFDEV-1079): rocFFT in ROCm 5.1-5.3 tries to write to a cache in the home area by default. This does not scale, disable it via:

export ROCFFT_RTC_CACHE_PATH=/dev/null

Warning

January, 2023 (OLCFDEV-1284, AMD Ticket: ORNLA-130): We discovered a regression in AMD ROCm, leading to 2x slower current deposition (and other slowdowns) in ROCm 5.3 and 5.4.

June, 2023: Although a fix was planned for ROCm 5.5, we still see the same issue in this release and continue to exchange with AMD and HPE on the issue.

Stay with the ROCm 5.2 module to avoid a 2x slowdown.

Warning

August, 2023 (OLCFDEV-1597, OLCFHELP-12850, OLCFHELP-14253): With runs above 500 nodes, we observed issues in MPI_Waitall calls of the kind OFI Poll Failed UNDELIVERABLE. According to the system known issues entry OLCFDEV-1597, we work around this by setting this environment variable in job scripts:

export MPICH_SMP_SINGLE_COPY_MODE=NONE
export FI_CXI_RX_MATCH_MODE=software

Warning

Checkpoints and I/O at scale seem to be slow with the default Lustre filesystem configuration. Please test checkpointing and I/O with short #SBATCH -q debug runs before running the full simulation. Execute lfs getstripe -d <dir> to show the default progressive file layout. Consider using lfs setstripe to change the striping for new files before you submit the run.

mkdir /lustre/orion/proj-shared/<your-project>/<path/to/new/sim/dir>
cd <new/sim/dir/above>
# create your diagnostics directory first
mkdir diags
# change striping for new files before you submit the simulation
#   this is an example, striping 10 MB blocks onto 32 nodes
lfs setstripe -S 10M -c 32 diags
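After adjusting the striping, you can confirm the new layout with the lfs getstripe command mentioned above:

lfs getstripe -d diags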

Additionally, other AMReX users reported good performance for plotfile checkpoint/restart when using

amr.plot_nfiles = -1
amr.checkpoint_nfiles = -1
amrex.async_out_nfiles = 4096  # set to number of GPUs used

Fugaku (Riken)

The Fugaku cluster is located at the Riken Center for Computational Science (Japan).

Introduction

If you are new to this system, please see the following resources:

Preparation

Use the following commands to download the WarpX source code and switch to the correct branch:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

Compiling WarpX on Fugaku is more practical on a compute node. Use the following command to acquire a compute node for two hours:

pjsub --interact -L "elapse=02:00:00" -L "node=1" --sparam "wait-time=300" --mpi "max-proc-per-node=48" --all-mount-gfscache

We use system software modules, add environment hints and further dependencies via the file $HOME/fugaku_warpx.profile. Create it now, modify it if needed, and source it (this will take a few minutes):

cp $HOME/src/warpx/Tools/machines/fugaku-riken/fugaku_warpx.profile.example $HOME/fugaku_warpx.profile
source $HOME/fugaku_warpx.profile
Script Details
. /vol0004/apps/oss/spack/share/spack/setup-env.sh

# required dependencies
spack load cmake@3.24.3%fj@4.10.0 arch=linux-rhel8-a64fx

# avoid harmless warning messages "[WARN] xos LPG [...]"
export LD_LIBRARY_PATH=/lib64:$LD_LIBRARY_PATH

# optional: faster builds
spack load ninja@1.11.1%fj@4.10.0

# optional: for PSATD
spack load fujitsu-fftw@1.1.0%fj@4.10.0

# optional: for QED lookup table generation support
spack load boost@1.80.0%fj@4.8.1/zc5pwgc

# optional: for openPMD support
spack load hdf5@1.12.2%fj@4.8.1/im6lxev
export CMAKE_PREFIX_PATH=${HOME}/sw/fugaku/a64fx/c-blosc-1.21.1-install:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${HOME}/sw/fugaku/a64fx/adios2-2.8.3-install:$CMAKE_PREFIX_PATH

# compiler environment hints
export CC=$(which mpifcc)
export CXX=$(which mpiFCC)
export FC=$(which mpifrt)
export CFLAGS="-O3 -Nclang -Nlibomp -Klib -g -DNDEBUG"
export CXXFLAGS="-O3 -Nclang -Nlibomp -Klib -g -DNDEBUG"

# avoid harmless warning messages "[WARN] xos LPG [...]"
export LD_LIBRARY_PATH=/lib64:$LD_LIBRARY_PATH

Finally, since Fugaku does not yet provide software modules for some of our dependencies, install them once:

bash $HOME/src/warpx/Tools/machines/fugaku-riken/install_dependencies.sh
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl, Luca Fedeli
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail

# Remove old dependencies #####################################################
#
SRC_DIR="${HOME}/src/"
SW_DIR="${HOME}/sw/fugaku/a64fx/"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}
mkdir -p ${SRC_DIR}

# General extra dependencies ##################################################
#

# c-blosc (I/O compression)
if [ -d ${SRC_DIR}/c-blosc ]
then
  cd ${SRC_DIR}/c-blosc
  git fetch --prune
  git checkout v1.21.1
  cd -
else
  git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git ${SRC_DIR}/c-blosc
fi
rm -rf ${SRC_DIR}/c-blosc-fugaku-build
cmake -S ${SRC_DIR}/c-blosc -B ${SRC_DIR}/c-blosc-fugaku-build -DBUILD_SHARED_LIBS=OFF -DBUILD_SHARED=OFF -DBUILD_STATIC=ON -DBUILD_TESTS=OFF -DBUILD_FUZZERS=OFF -DBUILD_BENCHMARKS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/c-blosc-1.21.1-install
cmake --build ${SRC_DIR}/c-blosc-fugaku-build --target install --parallel 48
rm -rf ${SRC_DIR}/c-blosc-fugaku-build

# ADIOS2 (I/O)
if [ -d ${SRC_DIR}/adios2 ]
then
  cd ${SRC_DIR}/adios2
  git fetch --prune
  git checkout v2.8.3
  cd -
else
  git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git ${SRC_DIR}/adios2
fi
rm -rf ${SRC_DIR}/adios2-fugaku-build
cmake -S ${SRC_DIR}/adios2 -B ${SRC_DIR}/adios2-fugaku-build -DBUILD_SHARED_LIBS=OFF -DADIOS2_USE_Blosc=ON -DBUILD_TESTING=OFF -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/adios2-2.8.3-install
cmake --build ${SRC_DIR}/adios2-fugaku-build --target install -j 48
rm -rf ${SRC_DIR}/adios2-fugaku-build
Compilation

Use the following cmake commands to compile the application executable:

cd $HOME/src/warpx
rm -rf build

export CC=$(which mpifcc)
export CXX=$(which mpiFCC)
export CFLAGS="-Nclang"
export CXXFLAGS="-Nclang"

cmake -S . -B build -DWarpX_COMPUTE=OMP \
    -DWarpX_DIMS="1;2;3" \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_CXX_FLAGS_RELEASE="-Ofast" \
    -DAMReX_DIFFERENT_COMPILER=ON \
    -DWarpX_MPI_THREAD_MULTIPLE=OFF

cmake --build build -j 48

That’s it! A 3D WarpX executable is now in build/bin/ and can be run with a 3D example inputs file.

Running
A64FX CPUs

For non-interactive runs, you can use pjsub submit.sh, where submit.sh can be adapted from the following:

You can copy this file from Tools/machines/fugaku-riken/submit.sh.
#!/bin/bash
#PJM -L "node=48"
#PJM -L "rscgrp=small"
#PJM -L "elapse=0:30:00"
#PJM -s
#PJM -L "freq=2200,eco_state=2"
#PJM --mpi "max-proc-per-node=12"
#PJM -x PJM_LLIO_GFSCACHE=/vol0004:/vol0003
#PJM --llio localtmp-size=10Gi
#PJM --llio sharedtmp-size=10Gi

export NODES=48
export MPI_RANKS=$((NODES * 12))
export OMP_NUM_THREADS=4

export EXE="./warpx"
export INPUT="i.3d"

export XOS_MMM_L_PAGING_POLICY=demand:demand:demand

# Add HDF5 library path to LD_LIBRARY_PATH
# This is done manually to avoid calling spack during the run,
# since this would take a significant amount of time.
export LD_LIBRARY_PATH=/vol0004/apps/oss/spack-v0.19/opt/spack/linux-rhel8-a64fx/fj-4.8.1/hdf5-1.12.2-im6lxevf76cu6cbzspi4itgz3l4gncjj/lib:$LD_LIBRARY_PATH

# Broadcast WarpX executable to all the nodes
llio_transfer ${EXE}

mpiexec -stdout-proc ./output.%j/%/1000r/stdout -stderr-proc ./output.%j/%/1000r/stderr -n ${MPI_RANKS} ${EXE} ${INPUT}

llio_transfer --purge ${EXE}

Note: the Boost Eco Mode set in this example increases the default frequency of the A64FX from 2 GHz to 2.2 GHz, while at the same time switching off one of the two floating-point arithmetic pipelines. Some preliminary tests with WarpX show that this mode achieves performance similar to the normal mode while reducing energy consumption by approximately 20%.

HPC3 (UCI)

The HPC3 supercomputer is located at University of California, Irvine.

Introduction

If you are new to this system, please see the following resources:

  • HPC3 user guide

  • Batch system: Slurm (notes)

  • Jupyter service

  • Filesystems:

    • $HOME: per-user directory, use only for inputs, source and scripts; backed up (40GB)

    • /pub/$USER: per-user production directory; fast and larger storage for parallel jobs (1TB default quota)

    • /dfsX/<lab-path>: lab group directory with a quota based on the PI’s purchase allocation. The storage owner (PI) can specify which users have read/write access on the specific filesystem.

Preparation

Use the following commands to download the WarpX source code:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

On HPC3, we recommend running on the fast GPU nodes with V100 GPUs.

We use system software modules, add environment hints and further dependencies via the file $HOME/hpc3_gpu_warpx.profile. Create it now:

cp $HOME/src/warpx/Tools/machines/hpc3-uci/hpc3_gpu_warpx.profile.example $HOME/hpc3_gpu_warpx.profile
Script Details
# please set your project account
export proj=""  # change me! GPU projects must end in "..._g"

# remembers the location of this script
export MY_PROFILE=$(cd $(dirname $BASH_SOURCE) && pwd)"/"$(basename $BASH_SOURCE)
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your $MY_PROFILE file! Please edit its line 2 to continue!"; return; fi

# required dependencies
module load cmake/3.22.1
module load gcc/11.2.0
module load cuda/11.7.1
module load openmpi/4.1.2/gcc.11.2.0

# optional: for QED support with detailed tables
module load boost/1.78.0/gcc.11.2.0

# optional: for openPMD and PSATD+RZ support
module load OpenBLAS/0.3.21
module load hdf5/1.13.1/gcc.11.2.0-openmpi.4.1.2
export CMAKE_PREFIX_PATH=${HOME}/sw/hpc3/gpu/c-blosc-1.21.1:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${HOME}/sw/hpc3/gpu/adios2-2.8.3:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${HOME}/sw/hpc3/gpu/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${HOME}/sw/hpc3/gpu/lapackpp-master:$CMAKE_PREFIX_PATH

export LD_LIBRARY_PATH=${HOME}/sw/hpc3/gpu/c-blosc-1.21.1/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${HOME}/sw/hpc3/gpu/adios2-2.8.3/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${HOME}/sw/hpc3/gpu/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${HOME}/sw/hpc3/gpu/lapackpp-master/lib64:$LD_LIBRARY_PATH

export PATH=${HOME}/sw/hpc3/gpu/adios2-2.8.3/bin:${PATH}

# optional: CCache
#module load ccache  # missing

# optional: for Python bindings
module load python/3.10.2

if [ -d "${HOME}/sw/hpc3/gpu/venvs/warpx-gpu" ]
then
  source ${HOME}/sw/hpc3/gpu/venvs/warpx-gpu/bin/activate
fi

# an alias to request an interactive batch node for up to 30 minutes
#   for parallel execution, start on the batch node: srun <command>
alias getNode="salloc -N 1 -t 0:30:00 --gres=gpu:V100:1 -p free-gpu"
# an alias to run a command on a batch node for up to 30min
#   usage: runNode <command>
alias runNode="srun -N 1 -t 0:30:00 --gres=gpu:V100:1 -p free-gpu"

# optimize CUDA compilation for V100
export AMREX_CUDA_ARCH=7.0

# compiler environment hints
export CXX=$(which g++)
export CC=$(which gcc)
export FC=$(which gfortran)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=${CXX}

Edit the 2nd line of this script, which sets the export proj="" variable. For example, if you are a member of the project plasma, then run vi $HOME/hpc3_gpu_warpx.profile. Enter the edit mode by typing i and edit line 2 to read:

export proj="plasma"

Exit the vi editor with Esc and then type :wq (write & quit).

Important

Now, and as the first step on future logins to HPC3, activate these environment settings:

source $HOME/hpc3_gpu_warpx.profile

Finally, since HPC3 does not yet provide software modules for some of our dependencies, install them once:

bash $HOME/src/warpx/Tools/machines/hpc3-uci/install_gpu_dependencies.sh
source $HOME/sw/hpc3/gpu/venvs/warpx-gpu/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Axel Huebl, Victor Flores
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was hpc3_gpu_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your hpc3_gpu_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Check $proj variable is correct and has a corresponding project directory ####
#
#if [ ! -d "/dfsX/${proj}/" ]
#then
#    echo "WARNING: The directory /dfsX/${proj}/ does not exist!"
#    echo "Is the \$proj environment variable of value \"$proj\" correctly set? "
#    echo "Please edit line 2 of your hpc3_gpu_warpx.profile file to continue!"
#    exit
#fi


# Remove old dependencies #####################################################
#
SW_DIR="${HOME}/sw/hpc3/gpu"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# General extra dependencies ##################################################
#

# c-blosc (I/O compression)
if [ -d $HOME/src/c-blosc ]
then
  cd $HOME/src/c-blosc
  git fetch --prune
  git checkout v1.21.1
  cd -
else
  git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git $HOME/src/c-blosc
fi
rm -rf $HOME/src/c-blosc-pm-gpu-build
cmake -S $HOME/src/c-blosc -B $HOME/src/c-blosc-pm-gpu-build -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/c-blosc-1.21.1
cmake --build $HOME/src/c-blosc-pm-gpu-build --target install --parallel 8
rm -rf $HOME/src/c-blosc-pm-gpu-build

# ADIOS2
if [ -d $HOME/src/adios2 ]
then
  cd $HOME/src/adios2
  git fetch --prune
  git checkout v2.8.3
  cd -
else
  git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git $HOME/src/adios2
fi
rm -rf $HOME/src/adios2-pm-gpu-build
cmake -S $HOME/src/adios2 -B $HOME/src/adios2-pm-gpu-build -DBUILD_TESTING=OFF -DADIOS2_BUILD_EXAMPLES=OFF -DADIOS2_USE_Blosc=ON -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_HDF5=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/adios2-2.8.3
cmake --build $HOME/src/adios2-pm-gpu-build --target install --parallel 8
rm -rf $HOME/src/adios2-pm-gpu-build

# BLAS++ (for PSATD+RZ)
if [ -d $HOME/src/blaspp ]
then
  cd $HOME/src/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git $HOME/src/blaspp
fi
rm -rf $HOME/src/blaspp-pm-gpu-build
cmake -S $HOME/src/blaspp -B $HOME/src/blaspp-pm-gpu-build -Duse_openmp=OFF -Dgpu_backend=cuda -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build $HOME/src/blaspp-pm-gpu-build --target install --parallel 8
rm -rf $HOME/src/blaspp-pm-gpu-build

# LAPACK++ (for PSATD+RZ)
if [ -d $HOME/src/lapackpp ]
then
  cd $HOME/src/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git $HOME/src/lapackpp
fi
rm -rf $HOME/src/lapackpp-pm-gpu-build
CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S $HOME/src/lapackpp -B $HOME/src/lapackpp-pm-gpu-build -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build $HOME/src/lapackpp-pm-gpu-build --target install --parallel 8
rm -rf $HOME/src/lapackpp-pm-gpu-build


# Python ######################################################################
#
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade virtualenv
python3 -m pip cache purge
rm -rf ${SW_DIR}/venvs/warpx-gpu
python3 -m venv ${SW_DIR}/venvs/warpx-gpu
source ${SW_DIR}/venvs/warpx-gpu/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# install or update WarpX dependencies such as picmistandard
python3 -m pip install --upgrade -r $HOME/src/warpx/requirements.txt
# optional: for libEnsemble
python3 -m pip install -r $HOME/src/warpx/Tools/LibEnsemble/requirements.txt
# optional: for optimas (based on libEnsemble & ax->botorch->gpytorch->pytorch)
python3 -m pip install --upgrade torch  # CUDA 11.7 compatible wheel
python3 -m pip install -r $HOME/src/warpx/Tools/optimas/requirements.txt
Compilation

Use the following cmake commands to compile the application executable:

cd $HOME/src/warpx
rm -rf build

cmake -S . -B build -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build -j 8

The WarpX application executables are now in $HOME/src/warpx/build/bin/. Additionally, the following commands will install WarpX as a Python module:

rm -rf build_py

cmake -S . -B build_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_py -j 8 --target pip_install

Now, you can submit HPC3 compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit HPC3 jobs (example inputs). For executables, you can reference their location in your job script or copy them to a location in /pub/$USER/.
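For example, staging a run in your production directory could look like this (the run directory name and the RZ executable are illustrative):

mkdir -p /pub/$USER/warpx_runs/test_rz
cp $HOME/src/warpx/build/bin/warpx.rz /pub/$USER/warpx_runs/test_rz/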

Update WarpX & Dependencies

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

cd $HOME/src/warpx

# read the output of this command - does it look ok?
git status

# get the latest WarpX source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log # press q to exit

And, if needed, update your $HOME/hpc3_gpu_warpx.profile file and re-run the dependency install script (install_gpu_dependencies.sh) as described above.

As a last step, clean the build directory rm -rf $HOME/src/warpx/build and rebuild WarpX.

Running

The batch script below can be used to run a WarpX simulation on multiple nodes (change -N accordingly) on the supercomputer HPC3 at UCI. This partition has up to 32 nodes with four V100 GPUs (16 GB each) per node.

Replace descriptions between chevrons <> by relevant values, for instance <proj> could be plasma. Note that we run one MPI rank per GPU.

You can copy this file from $HOME/src/warpx/Tools/machines/hpc3-uci/hpc3_gpu.sbatch.
#!/bin/bash -l

# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Axel Huebl, Victor Flores
# License: BSD-3-Clause-LBNL

#SBATCH --time=00:30:00
#SBATCH --nodes=1
#SBATCH -J WarpX
#S BATCH -A <proj>
# V100 GPU options: gpu, free-gpu, debug-gpu
#SBATCH -p free-gpu
# use all four GPUs per node
#SBATCH --ntasks-per-node=4
#SBATCH --gres=gpu:V100:4
#SBATCH --cpus-per-task=10
#S BATCH --mail-type=begin,end
#S BATCH --mail-user=<your-email>@uci.edu
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j

# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx.rz
INPUTS=inputs_rz

# OpenMP threads
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

# run
mpirun -np ${SLURM_NTASKS} bash -c "
    export CUDA_VISIBLE_DEVICES=\${SLURM_LOCALID};
    ${EXE} ${INPUTS}" \
  > output.txt

To run a simulation, copy the lines above to a file hpc3_gpu.sbatch and run

sbatch hpc3_gpu.sbatch

to submit the job.

Post-Processing

UCI provides a pre-configured Jupyter service that can be used for data-analysis.

We recommend installing at least the following pip packages for running Python3 Jupyter notebooks for WarpX data analysis: h5py ipympl ipywidgets matplotlib numpy openpmd-viewer openpmd-api pandas scipy yt
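A minimal sketch of installing these packages into the Python environment that backs your Jupyter kernels (adapt to your setup):

python3 -m pip install --upgrade h5py ipympl ipywidgets matplotlib numpy openpmd-viewer openpmd-api pandas scipy yt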

Juwels (JSC)

Note

For the moment, WarpX doesn’t run on Juwels with MPI_THREAD_MULTIPLE. Please compile with the CMake option -DWarpX_MPI_THREAD_MULTIPLE=OFF, as done in the commands below.

The Juwels supercomputer is located at JSC.

Introduction

If you are new to this system, please see the following resources:

See this page for a quick introduction. (Full user guide).

  • Batch system: Slurm

  • Production directories:

    • $SCRATCH: Scratch filesystem for temporary data (90 day purge)

    • $FASTDATA/: Storage location for large data (backed up)

    • Note that the $HOME directory is not designed for simulation runs and producing output there will impact performance.

Installation

Use the following commands to download the WarpX source code and switch to the correct branch:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

We use the following modules and environments on the system.

You can copy this file from Tools/machines/juwels-jsc/juwels_warpx.profile.example.
# please set your project account
#export proj=<yourProject>

# required dependencies
module load ccache
module load CMake
module load GCC
module load CUDA/11.3
module load OpenMPI
module load FFTW
module load HDF5
module load Python

# JUWELS' job scheduler may not map ranks to GPUs,
# so we give a hint to AMReX about the node layout.
# This is usually done in Make.<supercomputing center> files in AMReX
# but there is no such file for JSC yet.
export GPUS_PER_SOCKET=2
export GPUS_PER_NODE=4

# optimize CUDA compilation for V100 (7.0) or for A100 (8.0)
export AMREX_CUDA_ARCH=8.0

Note that, for now, WarpX must rely on OpenMPI instead of MVAPICH2, the recommended MPI implementation on this platform.

We recommend to store the above lines in a file, such as $HOME/juwels_warpx.profile, and load it into your shell after a login:

source $HOME/juwels_warpx.profile

Then, cd into the directory $HOME/src/warpx and use the following commands to compile:

cd $HOME/src/warpx
rm -rf build

cmake -S . -B build -DWarpX_DIMS="1;2;3" -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_MPI_THREAD_MULTIPLE=OFF
cmake --build build -j 16

The other general compile-time options apply as usual.

That’s it! A 3D WarpX executable is now in build/bin/ and can be run with a 3D example inputs file. Most people execute the binary directly or copy it out to a location in $SCRATCH.
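For example, copying the binary to a run directory on $SCRATCH could look like this (the run directory name is illustrative):

mkdir -p $SCRATCH/warpx_runs/test_3d
cp $HOME/src/warpx/build/bin/warpx.3d.MPI.CUDA.DP.OPMD.QED $SCRATCH/warpx_runs/test_3d/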

Note

Currently, if you want to use HDF5 output with openPMD, you need to add

export OMPI_MCA_io=romio321

in your job scripts, before running the srun command.

Running
Queue: gpus (4 x Nvidia V100 GPUs)

The Juwels GPUs are V100 (16GB) and A100 (40GB).

An example submission script reads

You can copy this file from Tools/machines/juwels-jsc/juwels.sbatch.
#!/bin/bash -l

#SBATCH -A $proj
#SBATCH --partition=booster
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --ntasks-per-node=4
#SBATCH --gres=gpu:4
#SBATCH --time=00:05:00
#SBATCH --job-name=warpx
#SBATCH --output=warpx-%j-%N.txt
#SBATCH --error=warpx-%j-%N.err

export OMP_NUM_THREADS=1
export OMPI_MCA_io=romio321  # for HDF5 support in openPMD

# you can comment this out if you sourced the warpx.profile
# files before running sbatch:
module load GCC
module load OpenMPI
module load CUDA/11.3
module load HDF5
module load Python

srun -n 8 --cpu_bind=sockets $HOME/src/warpx/build/bin/warpx.3d.MPI.CUDA.DP.OPMD.QED inputs
Queue: batch (2 x Intel Xeon Platinum 8168 CPUs, 24 Cores + 24 Hyperthreads/CPU)

todo

See the data analysis section for more information on how to visualize the simulation results.

Karolina (IT4I)

The Karolina cluster is located at IT4I, Technical University of Ostrava.

Introduction

If you are new to this system, please see the following resources:

  • IT4I user guide

  • Batch system: SLURM

  • Jupyter service: not provided/documented (yet)

  • Filesystems:

    • $HOME: per-user directory, use only for inputs, source and scripts; backed up (25GB default quota)

    • /scratch/: production directory; very fast for parallel jobs (10TB default)

    • /mnt/proj<N>/<proj>: per-project work directory, used for long term data storage (20TB default)

Installation

We show how to install from scratch all the dependencies using Spack.

For size reasons, it is not advisable to install WarpX in the $HOME directory; it should be installed in the “work directory”. For this purpose, we set an environment variable $WORK with the path to the “work directory”.

On Karolina, you can run either on GPU nodes with fast A100 GPUs (recommended) or on CPU nodes.

Profile file

You can use the pre-prepared karolina_warpx.profile script below: copy it to ${HOME}/karolina_warpx.profile, edit it as required, and then source it.

Script Details
Copy the contents of this file to ${HOME}/karolina_warpx.profile.
# please set your project account, e.g. DD-N-N
export proj="<proj_id>"  # change me!

# Name and Path of this Script ################### (DO NOT change!)
export MY_PROFILE=$(cd $(dirname $BASH_SOURCE) && pwd)"/"$(basename $BASH_SOURCE)

if [ -z ${proj-} ]; then
    echo "WARNING: The 'proj' variable is not yet set in your $MY_PROFILE file!"
    echo "Please edit its line 2 to continue!"
    return
fi

# set env variable storing the path to the work directory
# please check if your project ID belongs to proj1, proj2, proj3 etc
export WORK="/mnt/proj<N>/${proj,,}/${USER}"  # change me!
mkdir -p $WORK

# clone warpx
# you can also clone your own fork here, eg git@github.com:<user>/WarpX.git
if [ ! -d "$WORK/src/warpx" ]
then
    git clone https://github.com/ECP-WarpX/WarpX.git $WORK/src/warpx
fi

# load required modules
module purge
module load OpenMPI/4.1.4-GCC-11.3.0-CUDA-11.7.0

source $WORK/spack/share/spack/setup-env.sh && spack env activate warpx-karolina-cuda && {
    echo "Spack environment 'warpx-karolina-cuda' activated successfully."
} || {
    echo "Failed to activate Spack environment 'warpx-karolina-cuda'. Please run install_dependencies.sh."
}

# Text Editor for Tools ########################## (edit this line)
# examples: "nano", "vim", "emacs -nw" or without terminal: "gedit"
#export EDITOR="nano"  # change me!

# allocate an interactive shell for one hour
# usage: getNode 2  # allocates two interactive nodes (default: 1)
function getNode() {
    if [ -z "$1" ] ; then
        numNodes=1
    else
        numNodes=$1
    fi
    export OMP_NUM_THREADS=16
    srun --time=1:00:00 --nodes=$numNodes --ntasks=$((8 * $numNodes)) --ntasks-per-node=8 --cpus-per-task=16 --exclusive --gpus-per-node=8 -p qgpu -A $proj --pty bash
}

# Environment #####################################################
# optimize CUDA compilation for A100
export AMREX_CUDA_ARCH="8.0"
export SCRATCH="/scratch/project/${proj,,}/${USER}"

# optimize CPU microarchitecture for AMD EPYC 7763 (zen3)
export CFLAGS="-march=znver3"
export CXXFLAGS="-march=znver3"

# compiler environment hints
export CC=$(which gcc)
export CXX=$(which g++)
export FC=$(which gfortran)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=${CXX}

To have the environment activated on every login, add the following line to ${HOME}/.bashrc:

source $HOME/karolina_warpx.profile
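For instance, you can append it non-interactively:

echo 'source $HOME/karolina_warpx.profile' >> ${HOME}/.bashrc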

To install the spack environment and Python packages:

bash $WORK/src/warpx/Tools/machines/karolina-it4i/install_dependencies.sh
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl, Andrei Berceanu
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #################################
#
set -eu -o pipefail

# Check: ##########################################################
#
# Was karolina_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then
    echo "WARNING: The 'proj' variable is not yet set in your karolina_warpx.profile file!"
    echo "Please edit its line 2 to continue!"
    return
fi

# download and activate spack
# this might take about ~ 1 hour
if [ ! -d "$WORK/spack" ]
then
    git clone -c feature.manyFiles=true -b v0.21.0 https://github.com/spack/spack.git $WORK/spack
    source $WORK/spack/share/spack/setup-env.sh
else
    # If the directory exists, checkout v0.21.0 branch
    cd $WORK/spack
    git checkout v0.21.0
    git pull origin v0.21.0
    source $WORK/spack/share/spack/setup-env.sh

    # Delete spack env if present
    spack env deactivate || true
    spack env rm -y warpx-karolina-cuda || true

    cd -
fi

# create and activate the spack environment
spack env create warpx-karolina-cuda $WORK/src/warpx/Tools/machines/karolina-it4i/spack-karolina-cuda.yaml
spack env activate warpx-karolina-cuda
spack install

# Python ##########################################################
#
python -m pip install --user --upgrade pandas
python -m pip install --user --upgrade matplotlib
# optional
#python -m pip install --user --upgrade yt

# install or update WarpX dependencies
python -m pip install --user --upgrade picmistandard==0.28.0
python -m pip install --user --upgrade lasy

# optional: for optimas (based on libEnsemble & ax->botorch->gpytorch->pytorch)
# python -m pip install --user --upgrade -r $WORK/src/warpx/Tools/optimas/requirements.txt
Compilation

Use the following cmake commands to compile the application executable:

cd $WORK/src/warpx
rm -rf build_gpu

cmake -S . -B build_gpu -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_gpu -j 48

The WarpX application executables are now in $WORK/src/warpx/build_gpu/bin/. Additionally, the following commands will install WarpX as a Python module:

cd $WORK/src/warpx
rm -rf build_gpu_py

cmake -S . -B build_gpu_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_gpu_py -j 48 --target pip_install

Now, you can submit Karolina compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit Karolina jobs (example inputs). For executables, you can reference their location in your job script or copy them to a location in /scratch/.
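
For example, staging a run on /scratch could look like the following minimal sketch (the run directory name and the inputs file path are placeholders to adjust):

# create a run directory on the scratch filesystem (name is an example)
mkdir -p $SCRATCH/runs/warpx_rz_test
cd $SCRATCH/runs/warpx_rz_test

# copy the executable and an inputs file (adjust the inputs path to your own)
cp $WORK/src/warpx/build_gpu/bin/warpx.rz .
cp /path/to/your/inputs_rz ./inputs_rz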

Running

The batch script below can be used to run a WarpX simulation on multiple GPU nodes (change #SBATCH --nodes= accordingly) on the supercomputer Karolina at IT4I. This partition has up to 72 nodes. Every node has 8x A100 (40GB) GPUs and 2x AMD EPYC 7763, 64-core, 2.45 GHz processors.

Replace descriptions between chevrons <> with relevant values; for instance, <proj> could be DD-23-83. Note that we run one MPI rank per GPU.
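
As a quick illustration, filled-in header lines in the template below might read as follows (values are examples only):

#SBATCH --account=DD-23-83
#SBATCH --mail-user=your.name@example.com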

You can copy this file from $WORK/src/warpx/Tools/machines/karolina-it4i/karolina_gpu.sbatch.
#!/bin/bash -l

# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Axel Huebl, Andrei Berceanu
# License: BSD-3-Clause-LBNL

#SBATCH --account=<proj>
#SBATCH --partition=qgpu
#SBATCH --time=00:10:00
#SBATCH --job-name=WarpX
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=16
#SBATCH --gpus-per-node=8
#SBATCH --gpu-bind=single:1

#SBATCH --mail-type=ALL
# change me!
#SBATCH --mail-user=someone@example.com
#SBATCH --chdir=/scratch/project/<proj>/it4i-<user>/runs/warpx

#SBATCH -o stdout_%j
#SBATCH -e stderr_%j

# OpenMP threads per MPI rank
export OMP_NUM_THREADS=16
export SRUN_CPUS_PER_TASK=16

# set user rights to u=rwx;g=r-x;o=---
umask 0027

# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx.rz
INPUTS=./inputs_rz

# run
srun -K1 ${EXE} ${INPUTS}

To run a simulation, copy the lines above to a file karolina_gpu.sbatch and run

sbatch karolina_gpu.sbatch

to submit the job.

Post-Processing

Note

This section was not yet written. Usually, we document here how to use a Jupyter service.

Lassen (LLNL)

The Lassen V100 GPU cluster is located at LLNL.

Introduction

If you are new to this system, please see the following resources:

Login

Lassen is currently transitioning to RHEL8. During this transition, first SSH into lassen and then to the updated RHEL8/TOSS4 nodes.

ssh lassen.llnl.gov
ssh eatoss4

Approximately October/November 2023, the software environment on these nodes will become the new default.

ssh lassen.llnl.gov

Approximately October/November 2023, this partition will become TOSS4 (RHEL8) as well.

Preparation

Use the following commands to download the WarpX source code:

git clone https://github.com/ECP-WarpX/WarpX.git /usr/workspace/${USER}/lassen/src/warpx

We use system software modules, add environment hints and further dependencies via the file $HOME/lassen_v100_warpx.profile. Create it now:

cp /usr/workspace/${USER}/lassen/src/warpx/Tools/machines/lassen-llnl/lassen_v100_warpx.profile.example $HOME/lassen_v100_warpx.profile
Script Details
# please set your project account
#export proj="<yourProjectNameHere>"  # edit this and comment in

# required dependencies
module load cmake/3.23.1
module load gcc/11.2.1
module load cuda/12.0.0

# optional: for QED lookup table generation support
module load boost/1.70.0

# optional: for openPMD support
SRC_DIR="/usr/workspace/${USER}/lassen/src"
SW_DIR="/usr/workspace/${USER}/lassen/gpu"
export CMAKE_PREFIX_PATH=${SW_DIR}/c-blosc-1.21.1:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/hdf5-1.14.1.2:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/adios2-2.8.3:$CMAKE_PREFIX_PATH
export LD_LIBRARY_PATH=${SW_DIR}/c-blosc-1.21.1/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SW_DIR}/hdf5-1.14.1.2/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SW_DIR}/adios2-2.8.3/lib64:$LD_LIBRARY_PATH
export PATH=${SW_DIR}/hdf5-1.14.1.2/bin:${PATH}
export PATH=${SW_DIR}/adios2-2.8.3/bin:$PATH

# optional: for PSATD in RZ geometry support
export CMAKE_PREFIX_PATH=${SW_DIR}/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/lapackpp-master:$CMAKE_PREFIX_PATH
export LD_LIBRARY_PATH=${SW_DIR}/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SW_DIR}/lapackpp-master/lib64:$LD_LIBRARY_PATH

# optional: for Python bindings
module load python/3.8.2

if [ -d "${SW_DIR}/venvs/warpx-lassen" ]
then
    source ${SW_DIR}/venvs/warpx-lassen/bin/activate
fi

# optional: an alias to request an interactive node for two hours
alias getNode="bsub -G $proj -W 2:00 -nnodes 1 -Is /bin/bash"
# an alias to run a command on a batch node for up to 30min
#   usage: runNode <command>
alias runNode="bsub -q debug -P $proj -W 2:00 -nnodes 1 -I"

# fix system defaults: do not escape $ with a \ on tab completion
shopt -s direxpand

# optimize CUDA compilation for V100
export AMREX_CUDA_ARCH=7.0
export CUDAARCHS=70

# compiler environment hints
export CC=$(which gcc)
export CXX=$(which g++)
export FC=$(which gfortran)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=${CXX}

Edit the 2nd line of this script, which sets the export proj="" variable. For example, if you are a member of the project nsldt, then run vi $HOME/lassen_v100_warpx.profile. Enter the edit mode by typing i and edit line 2 to read:

export proj="nsldt"

Exit the vi editor with Esc and then type :wq (write & quit).

Important

Now, and as the first step on future logins to lassen, activate these environment settings:

source $HOME/lassen_v100_warpx.profile

We use system software modules, add environment hints and further dependencies via the file $HOME/lassen_v100_warpx_toss3.profile. Create it now:

cp /usr/workspace/${USER}/lassen/src/warpx/Tools/machines/lassen-llnl/lassen_v100_warpx_toss3.profile.example $HOME/lassen_v100_warpx_toss3.profile
Script Details
# please set your project account
#export proj="<yourProjectNameHere>"  # edit this and comment in

# required dependencies
module load cmake/3.23.1
module load gcc/11.2.1
module load cuda/12.0.0

# optional: for QED lookup table generation support
module load boost/1.70.0

# optional: for openPMD support
SRC_DIR="/usr/workspace/${USER}/lassen/src"
SW_DIR="/usr/workspace/${USER}/lassen-toss3/gpu"
export CMAKE_PREFIX_PATH=${SW_DIR}/c-blosc-1.21.1:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/hdf5-1.14.1.2:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/adios2-2.8.3:$CMAKE_PREFIX_PATH
export LD_LIBRARY_PATH=${SW_DIR}/c-blosc-1.21.1/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SW_DIR}/hdf5-1.14.1.2/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SW_DIR}/adios2-2.8.3/lib64:$LD_LIBRARY_PATH
export PATH=${SW_DIR}/hdf5-1.14.1.2/bin:${PATH}
export PATH=${SW_DIR}/adios2-2.8.3/bin:${PATH}

# optional: for PSATD in RZ geometry support
export CMAKE_PREFIX_PATH=${SW_DIR}/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/lapackpp-master:$CMAKE_PREFIX_PATH
export LD_LIBRARY_PATH=${SW_DIR}/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SW_DIR}/lapackpp-master/lib64:$LD_LIBRARY_PATH

# optional: for Python bindings
module load python/3.8.2

if [ -d "${SW_DIR}/venvs/warpx-lassen-toss3" ]
then
    source ${SW_DIR}/venvs/warpx-lassen-toss3/bin/activate
fi

# optional: an alias to request an interactive node for two hours
alias getNode="bsub -G $proj -W 2:00 -nnodes 1 -Is /bin/bash"
# an alias to run a command on a batch node for up to 30min
#   usage: runNode <command>
alias runNode="bsub -q debug -P $proj -W 2:00 -nnodes 1 -I"

# fix system defaults: do not escape $ with a \ on tab completion
shopt -s direxpand

# optimize CUDA compilation for V100
export AMREX_CUDA_ARCH=7.0
export CUDAARCHS=70

# compiler environment hints
export CC=$(which gcc)
export CXX=$(which g++)
export FC=$(which gfortran)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=${CXX}

Edit the 2nd line of this script, which sets the export proj="" variable. For example, if you are a member of the project nsldt, then run vi $HOME/lassen_v100_warpx_toss3.profile. Enter the edit mode by typing i and edit line 2 to read:

export proj="nsldt"

Exit the vi editor with Esc and then type :wq (write & quit).

Important

Now, and as the first step on future logins to lassen, activate these environment settings:

source $HOME/lassen_v100_warpx_toss3.profile

Finally, since lassen does not yet provide software modules for some of our dependencies, install them once:

bash /usr/workspace/${USER}/lassen/src/warpx/Tools/machines/lassen-llnl/install_v100_dependencies.sh
source /usr/workspace/${USER}/lassen/gpu/venvs/warpx-lassen/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was lassen_v100_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your lassen_v100_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Remove old dependencies #####################################################
#
SRC_DIR="/usr/workspace/${USER}/lassen/src"
SW_DIR="/usr/workspace/${USER}/lassen/gpu"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}
mkdir -p ${SRC_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# General extra dependencies ##################################################
#

# tmpfs build directory: avoids issues often seen with $HOME and is faster
build_dir=$(mktemp -d)

# c-blosc (I/O compression)
if [ -d ${SRC_DIR}/c-blosc ]
then
  cd ${SRC_DIR}/c-blosc
  git fetch --prune
  git checkout v1.21.1
  cd -
else
  git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git ${SRC_DIR}/c-blosc
fi
cmake -S ${SRC_DIR}/c-blosc -B ${build_dir}/c-blosc-lassen-build -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/c-blosc-1.21.1
cmake --build ${build_dir}/c-blosc-lassen-build --target install --parallel 10

# HDF5
if [ -d ${SRC_DIR}/hdf5 ]
then
  cd ${SRC_DIR}/hdf5
  git fetch --prune
  git checkout hdf5-1_14_1-2
  cd -
else
  git clone -b hdf5-1_14_1-2 https://github.com/HDFGroup/hdf5.git ${SRC_DIR}/hdf5
fi
cmake -S ${SRC_DIR}/hdf5 -B ${build_dir}/hdf5-lassen-build -DBUILD_TESTING=OFF -DHDF5_ENABLE_PARALLEL=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/hdf5-1.14.1.2
cmake --build ${build_dir}/hdf5-lassen-build --target install --parallel 10

# ADIOS2
if [ -d ${SRC_DIR}/adios2 ]
then
  cd ${SRC_DIR}/adios2
  git fetch --prune
  git checkout v2.8.3
  cd -
else
  git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git ${SRC_DIR}/adios2
fi
cmake -S ${SRC_DIR}/adios2 -B ${build_dir}/adios2-lassen-build -DBUILD_TESTING=OFF -DADIOS2_BUILD_EXAMPLES=OFF -DADIOS2_USE_Blosc=ON -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_SST=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/adios2-2.8.3
cmake --build ${build_dir}/adios2-lassen-build --target install -j 10

# BLAS++ (for PSATD+RZ)
if [ -d ${SRC_DIR}/blaspp ]
then
  cd ${SRC_DIR}/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git ${SRC_DIR}/blaspp
fi
cmake -S ${SRC_DIR}/blaspp -B ${build_dir}/blaspp-lassen-build -Duse_openmp=ON -Dgpu_backend=cuda -Duse_cmake_find_blas=ON -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build ${build_dir}/blaspp-lassen-build --target install --parallel 10

# LAPACK++ (for PSATD+RZ)
if [ -d ${SRC_DIR}/lapackpp ]
then
  cd ${SRC_DIR}/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git ${SRC_DIR}/lapackpp
fi
CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S ${SRC_DIR}/lapackpp -B ${build_dir}/lapackpp-lassen-build -Duse_cmake_find_lapack=ON -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master -DLAPACK_LIBRARIES=/usr/lib64/liblapack.so
cmake --build ${build_dir}/lapackpp-lassen-build --target install --parallel 10


# Python ######################################################################
#
# sometimes, the Lassen PIP Index is down
export PIP_EXTRA_INDEX_URL="https://pypi.org/simple"

python3 -m pip install --upgrade --user virtualenv
rm -rf ${SW_DIR}/venvs/warpx-lassen
python3 -m venv ${SW_DIR}/venvs/warpx-lassen
source ${SW_DIR}/venvs/warpx-lassen/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip cache purge
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
# Older Cython version for h5py
# https://github.com/h5py/h5py/issues/2268
python3 -m pip install --upgrade "cython<3"
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade -Ccompile-args="-j10" scipy
python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
CC=mpicc H5PY_SETUP_REQUIRES=0 HDF5_DIR=${SW_DIR}/hdf5-1.14.1.2 HDF5_MPI=ON python3 -m pip install --upgrade h5py --no-cache-dir --no-build-isolation --no-binary h5py
MPLLOCALFREETYPE=1 python3 -m pip install --upgrade matplotlib==3.2.2  # does not try to build freetype itself
echo "matplotlib==3.2.2" > ${build_dir}/constraints.txt
python3 -m pip install --upgrade -c ${build_dir}/constraints.txt yt

# install or update WarpX dependencies such as picmistandard
python3 -m pip install --upgrade -r ${SRC_DIR}/warpx/requirements.txt

# for ML dependencies, see install_v100_ml.sh


# remove build temporary directory
rm -rf ${build_dir}
AI/ML Dependencies (Optional)

If you plan to run AI/ML workflows depending on pyTorch, run the next step as well. This will take a while and should be skipped if not needed.

runNode bash /usr/workspace/${USER}/lassen/src/warpx/Tools/machines/lassen-llnl/install_v100_ml.sh
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was lassen_v100_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your lassen_v100_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Remove old dependencies #####################################################
#
SRC_DIR="/usr/workspace/${USER}/lassen/src"
mkdir -p ${SRC_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qqq -y torch 2>/dev/null || true


# Python ML ###################################################################
#
# for basic python dependencies, see install_v100_dependencies.sh

# optional: for libEnsemble - WIP: issues with nlopt
# python3 -m pip install -r ${SRC_DIR}/warpx/Tools/LibEnsemble/requirements.txt

# optional: for pytorch
if [ -d ${SRC_DIR}/pytorch ]
then
  cd ${SRC_DIR}/pytorch
  git fetch
  git checkout .
  git checkout v2.0.1
  git submodule update --init --recursive
  cd -
else
  git clone -b v2.0.1 --recurse-submodules https://github.com/pytorch/pytorch.git ${SRC_DIR}/pytorch
fi
cd ${SRC_DIR}/pytorch
rm -rf build

# see https://github.com/pytorch/pytorch/issues/97497#issuecomment-1499069641
#     https://github.com/pytorch/pytorch/pull/98511
wget -q -O - https://github.com/pytorch/pytorch/pull/98511.patch | git apply

python3 -m pip install -r requirements.txt

# see https://github.com/pytorch/pytorch/issues/108984#issuecomment-1712938737
LDFLAGS="-L${CUDA_HOME}/nvidia/targets/ppc64le-linux/lib/" \
USE_CUDA=1 BLAS=OpenBLAS MAX_JOBS=64 ATEN_AVX512_256=OFF BUILD_TEST=0 python3 setup.py develop
#   (optional) If using torch.compile with inductor/triton, install the matching version of triton
#make triton
rm -rf build
cd -

# optional: optimas dependencies (based on libEnsemble & ax->botorch->gpytorch->pytorch)
# TODO: scikit-learn needs a BLAS hint
#   commented because scikit-learn et al. compile > 2 hrs
#   please run manually on a login node if needed
#python3 -m pip install -r ${SRC_DIR}/warpx/Tools/optimas/requirements.txt

For optimas dependencies (incl. scikit-learn), plan another hour of build time:

python3 -m pip install -r /usr/workspace/${USER}/lassen/src/warpx/Tools/optimas/requirements.txt
For the older TOSS3 software environment (using lassen_v100_warpx_toss3.profile), run the corresponding install script and activate its virtual environment instead:

bash /usr/workspace/${USER}/lassen/src/warpx/Tools/machines/lassen-llnl/install_v100_dependencies_toss3.sh
source /usr/workspace/${USER}/lassen-toss3/gpu/venvs/warpx-lassen-toss3/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was lassen_v100_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your lassen_v100_warpx_toss3.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Remove old dependencies #####################################################
#
SRC_DIR="/usr/workspace/${USER}/lassen-toss3/src"
SW_DIR="/usr/workspace/${USER}/lassen-toss3/gpu"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}
mkdir -p ${SRC_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# General extra dependencies ##################################################
#

# tmpfs build directory: avoids issues often seen with $HOME and is faster
build_dir=$(mktemp -d)

# c-blosc (I/O compression)
if [ -d ${SRC_DIR}/c-blosc ]
then
  cd ${SRC_DIR}/c-blosc
  git fetch --prune
  git checkout v1.21.1
  cd -
else
  git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git ${SRC_DIR}/c-blosc
fi
cmake -S ${SRC_DIR}/c-blosc -B ${build_dir}/c-blosc-lassen-build -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/c-blosc-1.21.1
cmake --build ${build_dir}/c-blosc-lassen-build --target install --parallel 10

# HDF5
if [ -d ${SRC_DIR}/hdf5 ]
then
  cd ${SRC_DIR}/hdf5
  git fetch --prune
  git checkout hdf5-1_14_1-2
  cd -
else
  git clone -b hdf5-1_14_1-2 https://github.com/HDFGroup/hdf5.git ${SRC_DIR}/hdf5
fi
cmake -S ${SRC_DIR}/hdf5 -B ${build_dir}/hdf5-lassen-build -DBUILD_TESTING=OFF -DHDF5_ENABLE_PARALLEL=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/hdf5-1.14.1.2
cmake --build ${build_dir}/hdf5-lassen-build --target install --parallel 10

# ADIOS2
if [ -d ${SRC_DIR}/adios2 ]
then
  cd ${SRC_DIR}/adios2
  git fetch --prune
  git checkout v2.8.3
  cd -
else
  git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git ${SRC_DIR}/adios2
fi
cmake -S ${SRC_DIR}/adios2 -B ${build_dir}/adios2-lassen-build -DBUILD_TESTING=OFF -DADIOS2_BUILD_EXAMPLES=OFF -DADIOS2_USE_Blosc=ON -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_SST=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/adios2-2.8.3
cmake --build ${build_dir}/adios2-lassen-build --target install -j 10

# BLAS++ (for PSATD+RZ)
if [ -d ${SRC_DIR}/blaspp ]
then
  cd ${SRC_DIR}/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git ${SRC_DIR}/blaspp
fi
cmake -S ${SRC_DIR}/blaspp -B ${build_dir}/blaspp-lassen-build -Duse_openmp=ON -Dgpu_backend=cuda -Duse_cmake_find_blas=ON -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build ${build_dir}/blaspp-lassen-build --target install --parallel 10

# LAPACK++ (for PSATD+RZ)
if [ -d ${SRC_DIR}/lapackpp ]
then
  cd ${SRC_DIR}/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git ${SRC_DIR}/lapackpp
fi
CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S ${SRC_DIR}/lapackpp -B ${build_dir}/lapackpp-lassen-build -Duse_cmake_find_lapack=ON -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master -DLAPACK_LIBRARIES=/usr/lib64/liblapack.so
cmake --build ${build_dir}/lapackpp-lassen-build --target install --parallel 10


# Python ######################################################################
#
# sometimes, the Lassen PIP Index is down
export PIP_EXTRA_INDEX_URL="https://pypi.org/simple"

python3 -m pip install --upgrade --user virtualenv
rm -rf ${SW_DIR}/venvs/warpx-lassen-toss3
python3 -m venv ${SW_DIR}/venvs/warpx-lassen-toss3
source ${SW_DIR}/venvs/warpx-lassen-toss3/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip cache purge
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
# Older Cython version for h5py
# https://github.com/h5py/h5py/issues/2268
python3 -m pip install --upgrade "cython<3"
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
CMAKE_PREFIX_PATH=/usr/lib64:${CMAKE_PREFIX_PATH} python3 -m pip install --upgrade -Ccompile-args="-j10" -Csetup-args=-Dblas=BLAS -Csetup-args=-Dlapack=BLAS scipy
python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
CC=mpicc H5PY_SETUP_REQUIRES=0 HDF5_DIR=${SW_DIR}/hdf5-1.14.1.2 HDF5_MPI=ON python3 -m pip install --upgrade h5py --no-cache-dir --no-build-isolation --no-binary h5py
MPLLOCALFREETYPE=1 python3 -m pip install --upgrade matplotlib==3.2.2  # does not try to build freetype itself
echo "matplotlib==3.2.2" > ${build_dir}/constraints.txt
python3 -m pip install --upgrade -c ${build_dir}/constraints.txt yt

# install or update WarpX dependencies such as picmistandard
python3 -m pip install --upgrade -r ${SRC_DIR}/warpx/requirements.txt

# for ML dependencies, see install_v100_ml.sh


# remove build temporary directory
rm -rf ${build_dir}
Compilation

Use the following cmake commands to compile the application executable:

cd /usr/workspace/${USER}/lassen/src/warpx
rm -rf build_lassen

cmake -S . -B build_lassen -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_lassen -j 8

The WarpX application executables are now in /usr/workspace/${USER}/lassen/src/warpx/build_lassen/bin/. Additionally, the following commands will install WarpX as a Python module:

rm -rf build_lassen_py

cmake -S . -B build_lassen_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_lassen_py -j 8 --target pip_install

Now, you can submit lassen compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit lassen jobs (example inputs). For executables, you can reference their location in your job script or copy them to a location in $PROJWORK/$proj/.

Update WarpX & Dependencies

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

cd /usr/workspace/${USER}/lassen/src/warpx

# read the output of this command - does it look ok?
git status

# get the latest WarpX source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log     # press q to exit

And, if needed, update your $HOME/lassen_v100_warpx.profile file and re-run the dependency install script above.

As a last step, clean the build directory rm -rf /usr/workspace/${USER}/lassen/src/warpx/build_lassen and rebuild WarpX.

Running
V100 GPUs (16GB)

The batch script below can be used to run a WarpX simulation on 2 nodes on the supercomputer Lassen at LLNL. Replace descriptions between chevrons <> with relevant values; for instance, <input file> could be plasma_mirror_inputs. Note that the only option so far is to run with one MPI rank per GPU.

You can copy this file from Tools/machines/lassen-llnl/lassen_v100.bsub.
#!/bin/bash

# Copyright 2020-2023 Axel Huebl
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# Refs.:
#   https://jsrunvisualizer.olcf.ornl.gov/?s4f0o11n6c7g1r11d1b1l0=
#   https://hpc.llnl.gov/training/tutorials/using-lcs-sierra-system#quick16

#BSUB -G <allocation ID>
#BSUB -W 00:10
#BSUB -nnodes 2
#BSUB -alloc_flags smt4
#BSUB -J WarpX
#BSUB -o WarpXo.%J
#BSUB -e WarpXe.%J

# Work-around OpenMPI bug with chunked HDF5
#   https://github.com/open-mpi/ompi/issues/7795
export OMPI_MCA_io=ompio

# Work-around for broken IBM "libcollectives" MPI_Allgatherv
#   https://github.com/ECP-WarpX/WarpX/pull/2874
export OMPI_MCA_coll_ibm_skip_allgatherv=true

# ROMIO has a hint for GPFS named IBM_largeblock_io which optimizes I/O with operations on large blocks
export IBM_largeblock_io=true

# MPI-I/O: ROMIO hints for parallel HDF5 performance
export ROMIO_HINTS=./romio-hints
#   number of hosts: unique node names minus batch node
NUM_HOSTS=$(( $(echo $LSB_HOSTS | tr ' ' '\n' | uniq | wc -l) - 1 ))
cat > romio-hints << EOL
   romio_cb_write enable
   romio_ds_write enable
   cb_buffer_size 16777216
   cb_nodes ${NUM_HOSTS}
EOL

# OpenMPI file locks are slow and not needed
# https://github.com/open-mpi/ompi/issues/10053
export OMPI_MCA_sharedfp=^lockedfile,individual

# HDF5: disable slow locks (promise not to open half-written files)
export HDF5_USE_FILE_LOCKING=FALSE

# OpenMP: 1 thread per MPI rank
export OMP_NUM_THREADS=1

# store out task host mapping: helps identify broken nodes at scale
jsrun -r 4 -a1 -g 1 -c 7 -e prepended hostname > task_host_mapping.txt

# run WarpX
jsrun -r 4 -a 1 -g 1 -c 7 -l GPU-CPU -d packed -b rs -e prepended -M "-gpu" <path/to/executable> <input file> > output.txt

To run a simulation, copy the lines above to a file lassen_v100.bsub and run

bsub lassen_v100.bsub

to submit the job.

For a 3D simulation with a few (1-4) particles per cell using FDTD Maxwell solver on V100 GPUs for a well load-balanced problem (in our case laser wakefield acceleration simulation in a boosted frame in the quasi-linear regime), the following set of parameters provided good performance:

  • amr.max_grid_size=256 and amr.blocking_factor=128.

  • One MPI rank per GPU (e.g., 4 MPI ranks for the 4 GPUs on each Lassen node)

  • Two 128x128x128 grids per GPU, or one 128x128x256 grid per GPU.
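
WarpX, like other AMReX-based codes, also accepts inputs parameters appended to the command line, which override the corresponding values in the inputs file. A sketch of the run line from the template above with these settings (executable and inputs names are placeholders):

# append parameter overrides after the inputs file (paths/names are examples)
jsrun -r 4 -a 1 -g 1 -c 7 -l GPU-CPU -d packed -b rs -e prepended -M "-gpu" \
    ./warpx.3d inputs_3d amr.max_grid_size=256 amr.blocking_factor=128 > output.txt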

Known System Issues

Warning

Feb 17th, 2022 (INC0278922): The implementation of AllGatherv in IBM’s MPI optimization library “libcollectives” is broken and leads to HDF5 crashes for multi-node runs.

Our batch script templates above apply this work-around before the call to jsrun, which avoids the broken routines from IBM and trades them for an OpenMPI implementation of collectives:

export OMPI_MCA_coll_ibm_skip_allgatherv=true

As part of the same CORAL acquisition program, Lassen is very similar in design to Summit (OLCF). Thus, when encountering new issues, it is also worth checking the known Summit issues and work-arounds.

Lawrencium (LBNL)

The Lawrencium cluster is located at LBNL.

Introduction

If you are new to this system, please see the following resources:

Installation

Use the following commands to download the WarpX source code and switch to the correct branch:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

We use the following modules and environments on the system ($HOME/lawrencium_warpx.profile).

You can copy this file from Tools/machines/lawrencium-lbnl/lawrencium_warpx.profile.example.
# please set your project account
#export proj="<yourProject>"  # change me, e.g., ac_blast

# required dependencies
module load cmake/3.24.1
module load cuda/11.4
module load gcc/7.4.0
module load openmpi/4.0.1-gcc

# optional: for QED support with detailed tables
module load boost/1.70.0-gcc

# optional: for openPMD and PSATD+RZ support
module load hdf5/1.10.5-gcc-p
module load lapack/3.8.0-gcc
# CPU only:
#module load fftw/3.3.8-gcc

export CMAKE_PREFIX_PATH=$HOME/sw/v100/c-blosc-1.21.1:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=$HOME/sw/v100/adios2-2.8.3:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=$HOME/sw/v100/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=$HOME/sw/v100/lapackpp-master:$CMAKE_PREFIX_PATH

export PATH=$HOME/sw/v100/adios2-2.8.3/bin:$PATH

# optional: CCache
#module load ccache  # missing

# optional: for Python bindings or libEnsemble
module load python/3.8.8

if [ -d "$HOME/sw/v100/venvs/warpx" ]
then
  source $HOME/sw/v100/venvs/warpx/bin/activate
fi

# an alias to request an interactive batch node for one hour
#   for parallel execution, start on the batch node: srun <command>
alias getNode="salloc -N 1 -t 1:00:00 --qos=es_debug --partition=es1 --constraint=es1_v100 --gres=gpu:1 --cpus-per-task=4 -A $proj"
# an alias to run a command on a batch node for up to 30min
#   usage: runNode <command>
alias runNode="srun -N 1 -t 1:00:00 --qos=es_debug --partition=es1 --constraint=es1_v100 --gres=gpu:1 --cpus-per-task=4 -A $proj"

# optimize CUDA compilation for 1080 Ti (deprecated)
#export AMREX_CUDA_ARCH=6.1
# optimize CUDA compilation for V100
export AMREX_CUDA_ARCH=7.0
# optimize CUDA compilation for 2080 Ti
#export AMREX_CUDA_ARCH=7.5

# compiler environment hints
export CXX=$(which g++)
export CC=$(which gcc)
export FC=$(which gfortran)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=${CXX}

We recommend storing the above lines in a file, such as $HOME/lawrencium_warpx.profile, and loading it into your shell after login:

source $HOME/lawrencium_warpx.profile

And since Lawrencium does not yet provide a module for them, install ADIOS2, BLAS++ and LAPACK++:

# c-blosc (I/O compression)
git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git src/c-blosc
rm -rf src/c-blosc-v100-build
cmake -S src/c-blosc -B src/c-blosc-v100-build -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=$HOME/sw/v100/c-blosc-1.21.1
cmake --build src/c-blosc-v100-build --target install --parallel 12

# ADIOS2
git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git src/adios2
rm -rf src/adios2-v100-build
cmake -S src/adios2 -B src/adios2-v100-build -DADIOS2_USE_Blosc=ON -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=$HOME/sw/v100/adios2-2.8.3
cmake --build src/adios2-v100-build --target install -j 12

# BLAS++ (for PSATD+RZ)
git clone https://github.com/icl-utk-edu/blaspp.git src/blaspp
rm -rf src/blaspp-v100-build
cmake -S src/blaspp -B src/blaspp-v100-build -Duse_openmp=OFF -Dgpu_backend=cuda -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=$HOME/sw/v100/blaspp-master
cmake --build src/blaspp-v100-build --target install --parallel 12

# LAPACK++ (for PSATD+RZ)
git clone https://github.com/icl-utk-edu/lapackpp.git src/lapackpp
rm -rf src/lapackpp-v100-build
cmake -S src/lapackpp -B src/lapackpp-v100-build -DCMAKE_CXX_STANDARD=17 -Dgpu_backend=cuda -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=$HOME/sw/v100/lapackpp-master -Duse_cmake_find_lapack=ON -DBLAS_LIBRARIES=${LAPACK_DIR}/lib/libblas.a -DLAPACK_LIBRARIES=${LAPACK_DIR}/lib/liblapack.a
cmake --build src/lapackpp-v100-build --target install --parallel 12

Optionally, download and install Python packages for PICMI or dynamic ensemble optimizations (libEnsemble):

python3 -m pip install --user --upgrade pip
python3 -m pip install --user virtualenv
python3 -m pip cache purge
rm -rf $HOME/sw/v100/venvs/warpx
python3 -m venv $HOME/sw/v100/venvs/warpx
source $HOME/sw/v100/venvs/warpx/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
python3 -m pip install --upgrade mpi4py --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# optional: for libEnsemble
python3 -m pip install -r $HOME/src/warpx/Tools/LibEnsemble/requirements.txt

Then, cd into the directory $HOME/src/warpx and use the following commands to compile the application executable:

cd $HOME/src/warpx
rm -rf build

cmake -S . -B build -DWarpX_DIMS="1;2;RZ;3" -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON
cmake --build build -j 12

The general cmake compile-time options apply as usual.

That’s it! A 3D WarpX executable is now in build/bin/ and can be run with a 3D example inputs file. Most people execute the binary directly or copy it out to a location in /global/scratch/users/$USER/.

For a full PICMI install, follow the instructions for Python (PICMI) bindings:

# PICMI build
cd $HOME/src/warpx

# install or update dependencies
python3 -m pip install -r requirements.txt

# compile parallel PICMI interfaces in 3D, 2D, 1D and RZ
WARPX_MPI=ON WARPX_COMPUTE=CUDA WARPX_PSATD=ON BUILD_PARALLEL=12 python3 -m pip install --force-reinstall --no-deps -v .

Or, if you are developing, do a quick PICMI install of a single geometry (see: WarpX_DIMS) using:

# find dependencies & configure
cmake -S . -B build -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS=RZ

# build and then call "python3 -m pip install ..."
cmake --build build --target pip_install -j 12
Running
V100 GPUs (16 GB)

12 nodes, each with two NVIDIA V100 GPUs.

You can copy this file from Tools/machines/lawrencium-lbnl/lawrencium_v100.sbatch.
#!/bin/bash -l

# Copyright 2023 The WarpX Community
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

#SBATCH -t 00:10:00
#SBATCH -N 2
#SBATCH --job-name=WarpX
#SBATCH --account=<proj>
#SBATCH --qos=es_normal
# 2xV100 nodes
#SBATCH --partition=es1
#SBATCH --constraint=es1_v100
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j
#S BATCH --mail-type=all
#S BATCH --mail-user=yourmail@lbl.gov

# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=inputs_3d

srun ${EXE} ${INPUTS} \
  > output_${SLURM_JOB_ID}.txt

To run a simulation, copy the lines above to a file lawrencium_v100.sbatch and run

sbatch lawrencium_v100.sbatch
2080 Ti GPUs (10 GB)

18 nodes, each with four NVIDIA 2080 Ti GPUs. These are most interesting if you run in single precision.

Use --constraint=es1_2080ti --cpus-per-task=2 in the above template to run on those nodes.
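
A sketch of the only Slurm lines that change relative to lawrencium_v100.sbatch (everything else stays the same):

# 2080 Ti nodes instead of V100 nodes
#SBATCH --constraint=es1_2080ti
#SBATCH --cpus-per-task=2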

Leonardo (CINECA)

The Leonardo cluster is hosted at CINECA.

On Leonardo, each one of the 3456 compute nodes features a custom Atos Bull Sequana XH21355 “Da Vinci” blade, composed of:

  • 1 x CPU Intel Ice Lake Xeon 8358 32 cores 2.60 GHz

  • 512 (8 x 64) GB RAM DDR4 3200 MHz

  • 4 x NVidia custom Ampere A100 GPU 64GB HBM2

  • 2 x NVidia HDR 2×100 GB/s cards

Introduction

If you are new to this system, please see the following resources:

Storage organization:

  • $HOME: permanent, backed up, user specific (50 GB quota)

  • $CINECA_SCRATCH: temporary, user specific, no backup, a large disk for the storage of run time data and files, automatic cleaning procedure of data older than 40 days

  • $PUBLIC: permanent, no backup (50 GB quota)

  • $WORK: permanent, project specific, no backup

Preparation

Use the following commands to download the WarpX source code:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

We use system software modules, add environment hints and further dependencies via the file $HOME/leonardo_gpu_warpx.profile. Create it now:

cp $HOME/src/warpx/Tools/machines/leonardo-cineca/leonardo_gpu_warpx.profile.example $HOME/leonardo_gpu_warpx.profile
Script Details
# required dependencies
module load profile/base
module load cmake/3.24.3
module load gmp/6.2.1
module load mpfr/4.1.0
module load mpc/1.2.1
module load gcc/11.3.0
module load cuda/11.8
module load zlib/1.2.13--gcc--11.3.0
module load openmpi/4.1.4--gcc--11.3.0-cuda-11.8

# optional: for QED support with detailed tables
module load boost/1.80.0--openmpi--4.1.4--gcc--11.3.0

# optional: for openPMD and PSATD+RZ support
module load openblas/0.3.21--gcc--11.3.0
export CMAKE_PREFIX_PATH=/leonardo/prod/spack/03/install/0.19/linux-rhel8-icelake/gcc-11.3.0/c-blosc-1.21.1-aifmix6v5lwxgt7rigwoebalrgbcnv26:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=$HOME/sw/adios2-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=$HOME/sw/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=$HOME/sw/lapackpp-master:$CMAKE_PREFIX_PATH

export LD_LIBRARY_PATH=/leonardo/prod/spack/03/install/0.19/linux-rhel8-icelake/gcc-11.3.0/c-blosc-1.21.1-aifmix6v5lwxgt7rigwoebalrgbcnv26/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOME/sw/adios2-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOME/sw/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOME/sw/lapackpp-master/lib64:$LD_LIBRARY_PATH

export PATH=$HOME/sw/adios2-master/bin:$PATH

# optional: for Python bindings or libEnsemble
module load python/3.10.8--gcc--11.3.0

if [ -d "$HOME/sw/venvs/warpx" ]
then
  source $HOME/sw/venvs/warpx/bin/activate
fi

# optimize CUDA compilation for A100
export AMREX_CUDA_ARCH=8.0

# compiler environment hints
export CXX=$(which g++)
export CC=$(which gcc)
export FC=$(which gfortran)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=${CXX}

Important

Now, and as the first step on future logins to Leonardo, activate these environment settings:

source $HOME/leonardo_gpu_warpx.profile

Finally, since Leonardo does not yet provide software modules for some of our dependencies, install them once:

bash $HOME/src/warpx/Tools/machines/leonardo-cineca/install_gpu_dependencies.sh
source $HOME/sw/venvs/warpx/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl, Marta Galbiati
# License: BSD-3-Clause-LBNL

set -eu -o pipefail


# Check: ######################################################################
#
#   Was leonardo_gpu_warpx.profile sourced and configured correctly?
#


# Remove old dependencies #####################################################
#
SW_DIR="$HOME/sw"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}


# General extra dependencies ##################################################
#

# ADIOS2
if [ -d $HOME/src/adios2 ]
then
  cd $HOME/src/adios2
  git fetch
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/ornladios/ADIOS2.git $HOME/src/adios2
fi
rm -rf $HOME/src/adios2-gpu-build
cmake -S $HOME/src/adios2 -B $HOME/src/adios2-gpu-build -DADIOS2_USE_Blosc=ON -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/adios2-master
cmake --build $HOME/src/adios2-gpu-build --target install -j 16
rm -rf $HOME/src/adios2-gpu-build


# BLAS++ (for PSATD+RZ)
if [ -d $HOME/src/blaspp ]
then
  cd $HOME/src/blaspp
  git fetch
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git $HOME/src/blaspp
fi
rm -rf $HOME/src/blaspp-gpu-build
CXX=$(which g++) cmake -S $HOME/src/blaspp -B $HOME/src/blaspp-gpu-build -Duse_openmp=OFF -Dgpu_backend=cuda -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build $HOME/src/blaspp-gpu-build --target install --parallel 16
rm -rf $HOME/src/blaspp-gpu-build


# LAPACK++ (for PSATD+RZ)
if [ -d $HOME/src/lapackpp ]
then
  cd $HOME/src/lapackpp
  git fetch
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git $HOME/src/lapackpp
fi
rm -rf $HOME/src/lapackpp-gpu-build
CXX=$(which CC) CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S $HOME/src/lapackpp -B $HOME/src/lapackpp-gpu-build -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build $HOME/src/lapackpp-gpu-build --target install --parallel 16
rm -rf $HOME/src/lapackpp-gpu-build


# Python ######################################################################
#
rm -rf ${SW_DIR}/venvs/warpx
python3 -m venv ${SW_DIR}/venvs/warpx
source ${SW_DIR}/venvs/warpx/bin/activate
python3 -m ensurepip --upgrade
python3 -m pip cache purge
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
MPICC="gcc -shared" python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# install or update WarpX dependencies such as picmistandard
python3 -m pip install --upgrade -r $HOME/src/warpx/requirements.txt
# optional: for libEnsemble
python3 -m pip install -r $HOME/src/warpx/Tools/LibEnsemble/requirements.txt
# optional: for optimas (based on libEnsemble & ax->botorch->gpytorch->pytorch)
python3 -m pip install --upgrade torch  # CUDA 11.8 compatible wheel
python3 -m pip install -r $HOME/src/warpx/Tools/optimas/requirements.txt
Compilation

Use the following cmake commands to compile the application executable:

cd $HOME/src/warpx
rm -rf build_gpu

cmake -S . -B build_gpu -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_gpu -j 16

The WarpX application executables are now in $HOME/src/warpx/build_gpu/bin/. Additionally, the following commands will install WarpX as a Python module:

cd $HOME/src/warpx
rm -rf build_gpu_py

cmake -S . -B build_gpu_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_PYTHON=ON -DWarpX_APP=OFF -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_gpu_py -j 16 --target pip_install

Now, you can submit Leonardo compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit Leonardo jobs (example inputs). For executables, you can reference their location in your job script or copy them to a location in $CINECA_SCRATCH.

Update WarpX & Dependencies

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

cd $HOME/src/warpx

# read the output of this command - does it look ok?
git status

# get the latest WarpX source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log     # press q to exit

And, if needed, update your $HOME/leonardo_gpu_warpx.profile file and re-run the dependency install script above.

As a last step, clean the build directories rm -rf $HOME/src/warpx/build_gpu* and rebuild WarpX.

Running

The batch script below can be used to run a WarpX simulation on multiple nodes on Leonardo. Replace descriptions between chevrons <> with relevant values. Note that we run one MPI rank per GPU.

You can copy this file from $HOME/src/warpx/Tools/machines/leonardo-cineca/job.sh.
#!/usr/bin/bash
#SBATCH --time=02:00:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --ntasks-per-socket=4
#SBATCH --cpus-per-task=8
#SBATCH --gpus-per-node=4
#SBATCH --gpus-per-task=1
#SBATCH --mem=494000
#SBATCH --partition=boost_usr_prod
#SBATCH --job-name=<job name>
#SBATCH --gres=gpu:4
#SBATCH --err=job.err
#SBATCH --out=job.out
#SBATCH --account=<project id>
#SBATCH --mail-type=ALL
#SBATCH --mail-user=<mail>

cd /leonardo_scratch/large/userexternal/<username>/<directory>
srun /leonardo/home/userexternal/<username>/src/warpx/build_gpu/bin/warpx.2d <input file> > output.txt

To run a simulation, copy the lines above to a file job.sh and run

sbatch job.sh

to submit the job.

Post-Processing

For post-processing, activate the environment settings:

source $HOME/leonardo_gpu_warpx.profile

and run python scripts.

LUMI (CSC)

The LUMI cluster is located at CSC (Finland). Each node contains 4 AMD MI250X GPUs, each with 2 Graphics Compute Dies (GCDs) for a total of 8 GCDs per node. You can think of the 8 GCDs as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).

Introduction

If you are new to this system, please see the following resources:

Preparation

Use the following commands to download the WarpX source code:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

We use system software modules, add environment hints and further dependencies via the file $HOME/lumi_warpx.profile. Create it now:

cp $HOME/src/warpx/Tools/machines/lumi-csc/lumi_warpx.profile.example $HOME/lumi_warpx.profile
Script Details
# please set your project account
#export proj="project_..."

# required dependencies
module load LUMI/23.09  partition/G
module load rocm/5.2.3  # waiting for 5.5 for next bump
module load buildtools/23.09

# optional: just an additional text editor
module load nano

# optional: for PSATD in RZ geometry support
SW_DIR="${HOME}/sw/lumi/gpu"
export CMAKE_PREFIX_PATH=${SW_DIR}/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/lapackpp-master:$CMAKE_PREFIX_PATH
export LD_LIBRARY_PATH=${SW_DIR}/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SW_DIR}/lapackpp-master/lib64:$LD_LIBRARY_PATH

# optional: for QED lookup table generation support
module load Boost/1.82.0-cpeCray-23.09

# optional: for openPMD support
export CMAKE_PREFIX_PATH=${SW_DIR}/c-blosc-1.21.1:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/hdf5-1.14.1.2:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/adios2-2.8.3:$CMAKE_PREFIX_PATH
export PATH=${SW_DIR}/hdf5-1.14.1.2/bin:${PATH}
export PATH=${SW_DIR}/adios2-2.8.3/bin:${PATH}

# optional: for Python bindings or libEnsemble
module load cray-python/3.10.10

if [ -d "${SW_DIR}/venvs/warpx-lumi" ]
then
  source ${SW_DIR}/venvs/warpx-lumi/bin/activate
fi

# an alias to request an interactive batch node for one hour
#   for parallel execution, start on the batch node: srun <command>
alias getNode="salloc -A $proj -J warpx -t 01:00:00 -p dev-g -N 1 --ntasks-per-node=8 --gpus-per-task=1 --gpu-bind=closest"
# an alias to run a command on a batch node for up to 30min
#   usage: runNode <command>
alias runNode="srun -A $proj -J warpx -t 00:30:00 -p dev-g -N 1 --ntasks-per-node=8 --gpus-per-task=1 --gpu-bind=closest"

# GPU-aware MPI
export MPICH_GPU_SUPPORT_ENABLED=1

# optimize ROCm/HIP compilation for MI250X
export AMREX_AMD_ARCH=gfx90a

# compiler environment hints
# Warning: using the compiler wrappers cc and CC
#          instead of amdclang and amdclang++
#          currently results in a significant
#          loss of performances
export CC=$(which amdclang)
export CXX=$(which amdclang++)
export FC=$(which amdflang)
export CFLAGS="-I${ROCM_PATH}/include"
export CXXFLAGS="-I${ROCM_PATH}/include -Wno-pass-failed"
export LDFLAGS="-L${ROCM_PATH}/lib -lamdhip64 ${PE_MPICH_GTL_DIR_amd_gfx90a} -lmpi_gtl_hsa"

Edit the 2nd line of this script, which sets the export proj="project_..." variable, using a text editor such as nano, emacs, or vim (all available by default on LUMI login nodes). You can find out your project name by running lumi-ldap-userinfo on LUMI. For example, if you are a member of the project project_465000559, then run nano $HOME/lumi_warpx.profile and edit line 2 to read:

export proj="project_465000559"

Exit the nano editor with Ctrl + O (save) and then Ctrl + X (exit).

Important

Now, and as the first step on future logins to LUMI, activate these environment settings:

source $HOME/lumi_warpx.profile

Finally, since LUMI does not yet provide software modules for some of our dependencies, install them once:

bash $HOME/src/warpx/Tools/machines/lumi-csc/install_dependencies.sh
source $HOME/sw/lumi/gpu/venvs/warpx-lumi/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl, Luca Fedeli
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was lumi_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your lumi_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Remove old dependencies #####################################################
#
SRC_DIR="${HOME}/src"
SW_DIR="${HOME}/sw/lumi/gpu"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}
mkdir -p ${SRC_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# General extra dependencies ##################################################
#

# tmpfs build directory: avoids issues often seen with $HOME and is faster
build_dir=$(mktemp -d)

# BLAS++ (for PSATD+RZ)
if [ -d ${SRC_DIR}/blaspp ]
then
  cd ${SRC_DIR}/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git ${SRC_DIR}/blaspp
fi
rm -rf ${build_dir}/blaspp-lumi-gpu-build
CXX=$(which CC)                              \
cmake -S ${SRC_DIR}/blaspp                   \
      -B ${build_dir}/blaspp-lumi-gpu-build  \
      -Duse_openmp=OFF                       \
      -Dgpu_backend=hip                      \
      -DCMAKE_CXX_STANDARD=17                \
      -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build ${build_dir}/blaspp-lumi-gpu-build --target install --parallel 16
rm -rf ${build_dir}/blaspp-lumi-gpu-build

# LAPACK++ (for PSATD+RZ)
if [ -d ${SRC_DIR}/lapackpp ]
then
  cd ${SRC_DIR}/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git ${SRC_DIR}/lapackpp
fi
rm -rf ${build_dir}/lapackpp-lumi-gpu-build
CXX=$(which CC) CXXFLAGS="-DLAPACK_FORTRAN_ADD_" \
cmake -S ${SRC_DIR}/lapackpp                     \
      -B ${build_dir}/lapackpp-lumi-gpu-build    \
      -DCMAKE_CXX_STANDARD=17                    \
      -Dbuild_tests=OFF                          \
      -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON     \
      -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build ${build_dir}/lapackpp-lumi-gpu-build --target install --parallel 16
rm -rf ${build_dir}/lapackpp-lumi-gpu-build

# c-blosc (I/O compression, for openPMD)
if [ -d ${SRC_DIR}/c-blosc ]
then
  cd ${SRC_DIR}/c-blosc
  git fetch --prune
  git checkout v1.21.1
  cd -
else
  git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git ${SRC_DIR}/c-blosc
fi
rm -rf ${build_dir}/c-blosc-lu-build
cmake -S ${SRC_DIR}/c-blosc             \
      -B ${build_dir}/c-blosc-lu-build  \
      -DBUILD_TESTS=OFF                 \
      -DBUILD_BENCHMARKS=OFF            \
      -DDEACTIVATE_AVX2=OFF             \
      -DCMAKE_INSTALL_PREFIX=${HOME}/sw/lumi/gpu/c-blosc-1.21.1
cmake --build ${build_dir}/c-blosc-lu-build --target install --parallel 16
rm -rf ${build_dir}/c-blosc-lu-build

# HDF5 (for openPMD)
if [ -d ${SRC_DIR}/hdf5 ]
then
  cd ${SRC_DIR}/hdf5
  git fetch --prune
  git checkout hdf5-1_14_1-2
  cd -
else
  git clone -b hdf5-1_14_1-2 https://github.com/HDFGroup/hdf5.git ${SRC_DIR}/hdf5
fi
rm -rf ${build_dir}/hdf5-lu-build
cmake -S ${SRC_DIR}/hdf5          \
      -B ${build_dir}/hdf5-lu-build  \
      -DBUILD_TESTING=OFF         \
      -DHDF5_ENABLE_PARALLEL=ON   \
      -DCMAKE_INSTALL_PREFIX=${SW_DIR}/hdf5-1.14.1.2
cmake --build ${build_dir}/hdf5-lu-build --target install --parallel 10
rm -rf ${build_dir}/hdf5-lu-build

# ADIOS2 (for openPMD)
if [ -d ${SRC_DIR}/adios2 ]
then
  cd ${SRC_DIR}/adios2
  git fetch --prune
  git checkout v2.8.3
  cd -
else
  git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git ${SRC_DIR}/adios2
fi
rm -rf ${build_dir}/adios2-lu-build
cmake -S ${SRC_DIR}/adios2             \
      -B ${build_dir}/adios2-lu-build  \
      -DADIOS2_USE_Blosc=ON            \
      -DADIOS2_USE_Fortran=OFF         \
      -DADIOS2_USE_HDF5=OFF            \
      -DADIOS2_USE_Python=OFF          \
      -DADIOS2_USE_ZeroMQ=OFF          \
      -DCMAKE_INSTALL_PREFIX=${HOME}/sw/lumi/gpu/adios2-2.8.3
cmake --build ${build_dir}/adios2-lu-build --target install -j 16
rm -rf ${build_dir}/adios2-lu-build


# Python ######################################################################
#
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade virtualenv
python3 -m pip cache purge
rm -rf ${SW_DIR}/venvs/warpx-lumi
python3 -m venv ${SW_DIR}/venvs/warpx-lumi
source ${SW_DIR}/venvs/warpx-lumi/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
MPICC="cc -shared" python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# install or update WarpX dependencies such as picmistandard
python3 -m pip install --upgrade -r ${SRC_DIR}/warpx/requirements.txt
# optional: for libEnsemble
python3 -m pip install -r ${SRC_DIR}/warpx/Tools/LibEnsemble/requirements.txt
# optional: for optimas (based on libEnsemble & ax->botorch->gpytorch->pytorch)
#python3 -m pip install --upgrade torch --index-url https://download.pytorch.org/whl/rocm5.4.2
#python3 -m pip install -r ${SRC_DIR}/warpx/Tools/optimas/requirements.txt
Compilation

Use the following cmake commands to compile the application executable:

cd $HOME/src/warpx
rm -rf build_lumi

cmake -S . -B build_lumi -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_QED_TABLES_GEN_OMP=OFF -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_lumi -j 16

The WarpX application executables are now in $HOME/src/warpx/build_lumi/bin/. Additionally, the following commands will install WarpX as a Python module:

rm -rf build_lumi_py

cmake -S . -B build_lumi_py -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_QED_TABLES_GEN_OMP=OFF -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_lumi_py -j 16 --target pip_install
Update WarpX & Dependencies

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

cd $HOME/src/warpx

# read the output of this command - does it look ok?
git status

# get the latest WarpX source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log     # press q to exit

And, if needed, update your $HOME/lumi_warpx.profile file and re-run the dependency install script above.

As a last step, clean the build directory rm -rf $HOME/src/warpx/build_lumi and rebuild WarpX.

Running
MI250X GPUs (2x64 GB)

The GPU partition on the supercomputer LUMI at CSC has up to 2978 nodes, each with 8 Graphics Compute Dies (GCDs). WarpX runs one MPI rank per Graphics Compute Die.

For interactive runs, simply use the aliases getNode or runNode ....
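
For example (a sketch; the executable and inputs names are placeholders, and the aliases from lumi_warpx.profile are assumed to be loaded):

# request an interactive node for one hour
getNode

# or run a short test on a dev-g batch node for up to 30 minutes
runNode ./warpx.3d inputs_3d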

The batch script below can be used to run a WarpX simulation on multiple nodes (change #SBATCH --nodes= accordingly). Replace descriptions between chevrons <> with relevant values, for instance <project id> or the concrete inputs file. Copy the executable and inputs file into the run directory, or adjust their paths in the srun line accordingly.

You can copy this file from Tools/machines/lumi-csc/lumi.sbatch.
#!/bin/bash -l

#SBATCH -A <project id>
#SBATCH --job-name=warpx
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
#SBATCH --partition=standard-g
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --gpus-per-node=8
#SBATCH --time=00:10:00

date

# note (12-12-22)
# this environment setting is currently needed on LUMI to work-around a
# known issue with Libfabric
#export FI_MR_CACHE_MAX_COUNT=0  # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks  # alternative cache monitor

# Seen since August 2023 on OLCF (not yet seen on LUMI?)
# OLCFDEV-1597: OFI Poll Failed UNDELIVERABLE Errors
# https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#olcfdev-1597-ofi-poll-failed-undeliverable-errors
#export MPICH_SMP_SINGLE_COPY_MODE=NONE
#export FI_CXI_RX_MATCH_MODE=software

# note (9-2-22, OLCFDEV-1079)
# this environment setting is needed to avoid that rocFFT writes a cache in
# the home directory, which does not scale.
export ROCFFT_RTC_CACHE_PATH=/dev/null

# Seen since August 2023
# OLCFDEV-1597: OFI Poll Failed UNDELIVERABLE Errors
# https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#olcfdev-1597-ofi-poll-failed-undeliverable-errors
export MPICH_SMP_SINGLE_COPY_MODE=NONE
export FI_CXI_RX_MATCH_MODE=software

# LUMI documentation suggests using the following wrapper script
# to set the ROCR_VISIBLE_DEVICES to the value of SLURM_LOCALID
# see https://docs.lumi-supercomputer.eu/runjobs/scheduled-jobs/lumig-job/
cat << EOF > select_gpu
#!/bin/bash

export ROCR_VISIBLE_DEVICES=\$SLURM_LOCALID
exec \$*
EOF

chmod +x ./select_gpu

sleep 1

# LUMI documentation suggests using the following CPU bind
# in order to have 6 threads per GPU (blosc compression in adios2 uses threads)
# see https://docs.lumi-supercomputer.eu/runjobs/scheduled-jobs/lumig-job/
#
# WARNING: the following CPU_BIND options don't work on the dev-g partition.
#          If you want to run your simulation on dev-g, please comment them
#          out and replace them with CPU_BIND="map_cpu:49,57,17,25,1,9,33,41"
#
CPU_BIND="mask_cpu:7e000000000000,7e00000000000000"
CPU_BIND="${CPU_BIND},7e0000,7e000000"
CPU_BIND="${CPU_BIND},7e,7e00"
CPU_BIND="${CPU_BIND},7e00000000,7e0000000000"

export OMP_NUM_THREADS=6

export MPICH_GPU_SUPPORT_ENABLED=1

srun --cpu-bind=${CPU_BIND} ./select_gpu ./warpx inputs | tee outputs.txt
rm -rf ./select_gpu

To run a simulation, copy the lines above to a file lumi.sbatch and run

sbatch lumi.sbatch

to submit the job.
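After submission, the job can be followed with the usual Slurm tools, for example:

squeue --me          # list your pending and running jobs
tail -f outputs.txt  # follow the simulation output once the job runs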

Post-Processing

Note

TODO: Document any Jupyter or data services.

Known System Issues

Warning

December 12th, 2022: There is a caching bug in libFabric that causes WarpX simulations to occasionally hang on LUMI on more than 1 node.

As a work-around, please export the following environment variable in your job scripts until the issue is fixed:

#export FI_MR_CACHE_MAX_COUNT=0  # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks  # alternative cache monitor

Warning

January, 2023: We discovered a regression in AMD ROCm, leading to 2x slower current deposition (and other slowdowns) in ROCm 5.3 and 5.4.

June, 2023: Although a fix was planned for ROCm 5.5, we still see the same issue in this release and continue to exchange with AMD and HPE on the issue.

Stay with the ROCm 5.2 module to avoid a 2x slowdown.

Warning

May 2023: rocFFT in ROCm 5.1-5.3 tries to write to a cache in the home area by default. This does not scale; disable it via:

export ROCFFT_RTC_CACHE_PATH=/dev/null

LXPLUS (CERN)

The LXPLUS cluster is located at CERN.

Introduction

If you are new to this system, please see the following resources:

  • Lxplus documentation

  • Batch system: HTCondor

  • Filesystem locations:
    • User folder: /afs/cern.ch/user/<a>/<account> (10GByte)

    • Work folder: /afs/cern.ch/work/<a>/<account> (100GByte)

    • Eos storage: /eos/home-<a>/<account> (1 TByte)

Through LXPLUS we have access to CPU and GPU nodes (the latter equipped with NVIDIA V100 and T4 GPUs).

Installation

Only very little software is pre-installed on LXPLUS, so we show how to install all the dependencies from scratch using Spack.

For size reasons, it is not advisable to install WarpX in the $HOME directory; it should instead be installed in the “work directory”. For this purpose we set an environment variable with the path to the “work directory”:

export WORK=/afs/cern.ch/work/${USER:0:1}/$USER/

We clone WarpX in $WORK:

cd $WORK
git clone https://github.com/ECP-WarpX/WarpX.git warpx
Installation profile file

The easiest way to install the dependencies is to use the pre-prepared warpx.profile as follows:

cp $WORK/warpx/Tools/machines/lxplus-cern/lxplus_warpx.profile.example $WORK/lxplus_warpx.profile
source $WORK/lxplus_warpx.profile

When doing this, one can skip directly to the Building WarpX section.

To have the environment activated at every login, it is then possible to add the following lines to the .bashrc:

export WORK=/afs/cern.ch/work/${USER:0:1}/$USER/
source $WORK/lxplus_warpx.profile
GCC

The pre-installed GNU compiler is outdated, so we need a more recent compiler. Here we use gcc 11.2.0 from the LCG project, but other options are possible.

We activate it by doing

source /cvmfs/sft.cern.ch/lcg/releases/gcc/11.2.0-ad950/x86_64-centos7/setup.sh

To avoid mixing different compilers, this line can be added directly to the $HOME/.bashrc file.

Spack

We download and activate Spack in $WORK:

cd $WORK
git clone -c feature.manyFiles=true https://github.com/spack/spack.git
source spack/share/spack/setup-env.sh

Now we add our gcc 11.2.0 compiler to spack:

spack compiler find /cvmfs/sft.cern.ch/lcg/releases/gcc/11.2.0-ad950/x86_64-centos7/bin
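As an optional check, you can list the compilers that Spack now knows about:

spack compilers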
Installing the Dependencies

To install the dependencies, we create a Spack environment, which we call warpx-lxplus:

spack env create warpx-lxplus $WORK/warpx/Tools/machines/lxplus-cern/spack.yaml
spack env activate warpx-lxplus
spack install

If GPU support or the Python bindings are not needed, it is possible to skip their installation by setting the environment variables export SPACK_STACK_USE_CUDA=0 and/or export SPACK_STACK_USE_PYTHON=0, respectively, before running the previous commands.
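For example, a CPU-only environment without the Python bindings could be created like this (a sketch based on the variables above; the spack.yaml path follows the clone location used earlier):

export SPACK_STACK_USE_PYTHON=0
export SPACK_STACK_USE_CUDA=0
spack env create warpx-lxplus $WORK/warpx/Tools/machines/lxplus-cern/spack.yaml
spack env activate warpx-lxplus
spack install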

After the installation is done once, all we need to do in future sessions is just activate the environment again:

spack env activate warpx-lxplus

The environment warpx-lxplus (or -cuda or -cuda-py) must be reactivated every time we log in, so it can be a good idea to add the following lines to the .bashrc:

source $WORK/spack/share/spack/setup-env.sh
spack env activate -d warpx-lxplus
cd $HOME
Building WarpX

We prepare and load the Spack software environment as above. Then we build WarpX:

cmake -S . -B build -DWarpX_DIMS="1;2;RZ;3"
cmake --build build -j 6

Or if we need to compile with CUDA:

cmake -S . -B build -DWarpX_COMPUTE=CUDA -DWarpX_DIMS="1;2;RZ;3"
cmake --build build -j 6

That’s it! A 3D WarpX executable is now in build/bin/ and can be run with a 3D example inputs file. Most people execute the binary directly or copy it out to a location in $WORK.

Python Bindings

Here we assume that a Python interpreter has been set up as explained previously.

Now, ensure Python tooling is up-to-date:

python3 -m pip install -U pip
python3 -m pip install -U build packaging setuptools wheel
python3 -m pip install -U cmake

Then we compile WarpX as in the previous section (with or without CUDA), adding -DWarpX_PYTHON=ON, and install it into our Python environment:

cmake -S . -B build -DWarpX_COMPUTE=CUDA -DWarpX_DIMS="1;2;RZ;3" -DWarpX_APP=OFF -DWarpX_PYTHON=ON
cmake --build build --target pip_install -j 6

This builds and installs WarpX for the geometries listed in WarpX_DIMS.

Alternatively, if you would like to build WarpX for all geometries at once as a wheel, use:

BUILD_PARALLEL=6 python3 -m pip wheel .
python3 -m pip install pywarpx-*whl
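After the wheel build finishes, you can optionally check what was produced and installed, for example:

ls pywarpx-*.whl              # the wheel built in the current directory
python3 -m pip show pywarpx   # version and install location of the module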

Ookami (Stony Brook)

The Ookami cluster is located at Stony Brook University.

Introduction

If you are new to this system, please see the following resources:

We use Ookami as a development cluster for A64FX. The cluster also provides a few extra nodes, e.g., two ThunderX2 (ARM) nodes.

Installation

Use the following commands to download the WarpX source code and switch to the correct branch:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

We use the following modules and environments on the system ($HOME/warpx_gcc10.profile).

You can copy this file from Tools/machines/ookami-sbu/ookami_warpx.profile.example.
# please set your project account (not relevant yet)
#export proj=<yourProject>

# required dependencies
module load cmake/3.19.0
module load gcc/10.3.0
module load openmpi/gcc10/4.1.0

# optional: faster builds (not available yet)
#module load ccache
#module load ninja

# optional: for PSATD support (not available yet)
#module load fftw

# optional: for QED lookup table generation support (not available yet)
#module load boost

# optional: for openPMD support
#module load adios2  # not available yet
#module load hdf5    # only serial

# compiler environment hints
export CC=$(which gcc)
export CXX=$(which g++)
export FC=$(which gfortran)
export CXXFLAGS="-mcpu=a64fx"

We recommend storing the above lines in a file, such as $HOME/warpx_gcc10.profile, and loading it into your shell after login:

source $HOME/warpx_gcc10.profile

Then, cd into the directory $HOME/src/warpx and use the following commands to compile:

cd $HOME/src/warpx
rm -rf build

cmake -S . -B build -DWarpX_COMPUTE=OMP -DWarpX_DIMS="1;2;3"
cmake --build build -j 10

# or (currently better performance)
cmake -S . -B build -DWarpX_COMPUTE=NOACC -DWarpX_DIMS="1;2;3"
cmake --build build -j 10

The general cmake compile-time options apply as usual.

That’s it! A 3D WarpX executable is now in build/bin/ and can be run with a 3D example inputs file. Most people execute the binary directly or copy it out to a location in /lustre/scratch/<netid>.

Running

For running on 48 cores of a single node:

srun -p short -N 1 -n 48 --pty bash
OMP_NUM_THREADS=1 mpiexec -n 48 --map-by ppr:12:numa:pe=1 --report-bindings ./warpx inputs

# alternatively, using 4 MPI ranks with 12 threads each on a single node:
OMP_NUM_THREADS=12 mpiexec -n 4 --map-by ppr:4:numa:pe=12 --report-bindings ./warpx inputs

The Ookami HPE Apollo 80 system has 174 A64FX compute nodes each with 32GB of high-bandwidth memory.

Additional Compilers

This section is just a note for developers. We compiled with the Fujitsu compiler (Clang-based) using the following build string:

cmake -S . -B build                              \
   -DCMAKE_C_COMPILER=$(which mpifcc)            \
   -DCMAKE_C_COMPILER_ID="Clang"                 \
   -DCMAKE_C_COMPILER_VERSION=12.0               \
   -DCMAKE_C_STANDARD_COMPUTED_DEFAULT="11"      \
   -DCMAKE_CXX_COMPILER=$(which mpiFCC)          \
   -DCMAKE_CXX_COMPILER_ID="Clang"               \
   -DCMAKE_CXX_COMPILER_VERSION=12.0             \
   -DCMAKE_CXX_STANDARD_COMPUTED_DEFAULT="14"    \
   -DCMAKE_CXX_FLAGS="-Nclang"                   \
   -DAMReX_DIFFERENT_COMPILER=ON                 \
   -DAMReX_MPI_THREAD_MULTIPLE=FALSE             \
   -DWarpX_COMPUTE=OMP
cmake --build build -j 10

Note that the best performance for A64FX is currently achieved with the GCC or ARM compilers.

Perlmutter (NERSC)

The Perlmutter cluster is located at NERSC.

Introduction

If you are new to this system, please see the following resources:

Preparation

Use the following commands to download the WarpX source code:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

On Perlmutter, you can run either on GPU nodes with fast A100 GPUs (recommended) or CPU nodes.

We use system software modules, add environment hints and further dependencies via the file $HOME/perlmutter_gpu_warpx.profile. Create it now:

cp $HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_gpu_warpx.profile.example $HOME/perlmutter_gpu_warpx.profile
Script Details
# please set your project account
export proj=""  # change me! GPU projects must end in "..._g"

# remembers the location of this script
export MY_PROFILE=$(cd $(dirname $BASH_SOURCE) && pwd)"/"$(basename $BASH_SOURCE)
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your $MY_PROFILE file! Please edit its line 2 to continue!"; return; fi

# required dependencies
module load gpu
module load PrgEnv-gnu
module load craype
module load craype-x86-milan
module load craype-accel-nvidia80
module load cudatoolkit
module load cmake/3.24.3

# optional: for QED support with detailed tables
export BOOST_ROOT=/global/common/software/spackecp/perlmutter/e4s-23.05/default/spack/opt/spack/linux-sles15-zen3/gcc-11.2.0/boost-1.82.0-ow5r5qrgslcwu33grygouajmuluzuzv3

# optional: for openPMD and PSATD+RZ support
module load cray-hdf5-parallel/1.12.2.9
export CMAKE_PREFIX_PATH=${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/c-blosc-1.21.1:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/adios2-2.8.3:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/lapackpp-master:$CMAKE_PREFIX_PATH

export LD_LIBRARY_PATH=${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/c-blosc-1.21.1/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/adios2-2.8.3/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/lapackpp-master/lib64:$LD_LIBRARY_PATH

export PATH=${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/adios2-2.8.3/bin:${PATH}

# optional: CCache
export PATH=/global/common/software/spackecp/perlmutter/e4s-23.08/default/spack/opt/spack/linux-sles15-zen3/gcc-11.2.0/ccache-4.8.2-cvooxdw5wgvv2g3vjxjkrpv6dopginv6/bin:$PATH

# optional: for Python bindings or libEnsemble
module load cray-python/3.11.5

if [ -d "${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/venvs/warpx-gpu" ]
then
  source ${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/venvs/warpx-gpu/bin/activate
fi

# an alias to request an interactive batch node for one hour
#   for parallel execution, start on the batch node: srun <command>
alias getNode="salloc -N 1 --ntasks-per-node=4 -t 1:00:00 -q interactive -C gpu --gpu-bind=single:1 -c 32 -G 4 -A $proj"
# an alias to run a command on a batch node for up to 30min
#   usage: runNode <command>
alias runNode="srun -N 1 --ntasks-per-node=4 -t 0:30:00 -q interactive -C gpu --gpu-bind=single:1 -c 32 -G 4 -A $proj"

# necessary to use CUDA-Aware MPI and run a job
export CRAY_ACCEL_TARGET=nvidia80

# optimize CUDA compilation for A100
export AMREX_CUDA_ARCH=8.0

# optimize CPU microarchitecture for AMD EPYC 3rd Gen (Milan/Zen3)
# note: the cc/CC/ftn wrappers below add those
export CXXFLAGS="-march=znver3"
export CFLAGS="-march=znver3"

# compiler environment hints
export CC=cc
export CXX=CC
export FC=ftn
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=CC

Edit the 2nd line of this script, which sets the export proj="" variable. Perlmutter GPU projects must end in ..._g. For example, if you are a member of the project m3239, then run nano $HOME/perlmutter_gpu_warpx.profile and edit line 2 to read:

export proj="m3239_g"

Exit the nano editor with Ctrl + O (save) and then Ctrl + X (exit).

Important

Now, and as the first step on future logins to Perlmutter, activate these environment settings:

source $HOME/perlmutter_gpu_warpx.profile

Finally, since Perlmutter does not yet provide software modules for some of our dependencies, install them once:

bash $HOME/src/warpx/Tools/machines/perlmutter-nersc/install_gpu_dependencies.sh
source ${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/venvs/warpx-gpu/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was perlmutter_gpu_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your perlmutter_gpu_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Check $proj variable is correct and has a corresponding CFS directory #######
#
if [ ! -d "${CFS}/${proj%_g}/" ]
then
    echo "WARNING: The directory ${CFS}/${proj%_g}/ does not exist!"
    echo "Is the \$proj environment variable of value \"$proj\" correctly set? "
    echo "Please edit line 2 of your perlmutter_gpu_warpx.profile file to continue!"
    exit
fi


# Remove old dependencies #####################################################
#
SW_DIR="${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# General extra dependencies ##################################################
#

# tmpfs build directory: avoids issues often seen with $HOME and is faster
build_dir=$(mktemp -d)

# c-blosc (I/O compression)
if [ -d $HOME/src/c-blosc ]
then
  cd $HOME/src/c-blosc
  git fetch --prune
  git checkout v1.21.1
  cd -
else
  git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git $HOME/src/c-blosc
fi
rm -rf $HOME/src/c-blosc-pm-gpu-build
cmake -S $HOME/src/c-blosc -B ${build_dir}/c-blosc-pm-gpu-build -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/c-blosc-1.21.1
cmake --build ${build_dir}/c-blosc-pm-gpu-build --target install --parallel 16
rm -rf ${build_dir}/c-blosc-pm-gpu-build

# ADIOS2
if [ -d $HOME/src/adios2 ]
then
  cd $HOME/src/adios2
  git fetch --prune
  git checkout v2.8.3
  cd -
else
  git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git $HOME/src/adios2
fi
rm -rf $HOME/src/adios2-pm-gpu-build
cmake -S $HOME/src/adios2 -B ${build_dir}/adios2-pm-gpu-build -DADIOS2_USE_Blosc=ON -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/adios2-2.8.3
cmake --build ${build_dir}/adios2-pm-gpu-build --target install -j 16
rm -rf ${build_dir}/adios2-pm-gpu-build

# BLAS++ (for PSATD+RZ)
if [ -d $HOME/src/blaspp ]
then
  cd $HOME/src/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git $HOME/src/blaspp
fi
rm -rf $HOME/src/blaspp-pm-gpu-build
CXX=$(which CC) cmake -S $HOME/src/blaspp -B ${build_dir}/blaspp-pm-gpu-build -Duse_openmp=OFF -Dgpu_backend=cuda -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build ${build_dir}/blaspp-pm-gpu-build --target install --parallel 16
rm -rf ${build_dir}/blaspp-pm-gpu-build

# LAPACK++ (for PSATD+RZ)
if [ -d $HOME/src/lapackpp ]
then
  cd $HOME/src/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git $HOME/src/lapackpp
fi
rm -rf $HOME/src/lapackpp-pm-gpu-build
CXX=$(which CC) CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S $HOME/src/lapackpp -B ${build_dir}/lapackpp-pm-gpu-build -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build ${build_dir}/lapackpp-pm-gpu-build --target install --parallel 16
rm -rf ${build_dir}/lapackpp-pm-gpu-build


# Python ######################################################################
#
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade virtualenv
python3 -m pip cache purge
rm -rf ${SW_DIR}/venvs/warpx-gpu
python3 -m venv ${SW_DIR}/venvs/warpx-gpu
source ${SW_DIR}/venvs/warpx-gpu/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
MPICC="cc -target-accel=nvidia80 -shared" python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# install or update WarpX dependencies
python3 -m pip install --upgrade -r $HOME/src/warpx/requirements.txt
python3 -m pip install --upgrade cupy-cuda12x  # CUDA 12 compatible wheel
# optimas (based on libEnsemble & ax->botorch->gpytorch->pytorch)
python3 -m pip install --upgrade torch  # CUDA 12 compatible wheel
python3 -m pip install --upgrade optimas[all]


# remove build temporary directory
rm -rf ${build_dir}

We use system software modules, add environment hints and further dependencies via the file $HOME/perlmutter_cpu_warpx.profile. Create it now:

cp $HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_cpu_warpx.profile.example $HOME/perlmutter_cpu_warpx.profile
Script Details
# please set your project account
export proj=""  # change me!

# remembers the location of this script
export MY_PROFILE=$(cd $(dirname $BASH_SOURCE) && pwd)"/"$(basename $BASH_SOURCE)
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your $MY_PROFILE file! Please edit its line 2 to continue!"; return; fi

# required dependencies
module load cpu
module load cmake/3.24.3
module load cray-fftw/3.3.10.6

# optional: for QED support with detailed tables
export BOOST_ROOT=/global/common/software/spackecp/perlmutter/e4s-23.05/default/spack/opt/spack/linux-sles15-zen3/gcc-11.2.0/boost-1.82.0-ow5r5qrgslcwu33grygouajmuluzuzv3

# optional: for openPMD and PSATD+RZ support
module load cray-hdf5-parallel/1.12.2.9
export CMAKE_PREFIX_PATH=${CFS}/${proj}/${USER}/sw/perlmutter/cpu/c-blosc-1.21.1:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${CFS}/${proj}/${USER}/sw/perlmutter/cpu/adios2-2.8.3:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${CFS}/${proj}/${USER}/sw/perlmutter/cpu/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${CFS}/${proj}/${USER}/sw/perlmutter/cpu/lapackpp-master:$CMAKE_PREFIX_PATH

export LD_LIBRARY_PATH=${CFS}/${proj}/${USER}/sw/perlmutter/cpu/c-blosc-1.21.1/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${CFS}/${proj}/${USER}/sw/perlmutter/cpu/adios2-2.8.3/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${CFS}/${proj}/${USER}/sw/perlmutter/cpu/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${CFS}/${proj}/${USER}/sw/perlmutter/cpu/lapackpp-master/lib64:$LD_LIBRARY_PATH

export PATH=${CFS}/${proj}/${USER}/sw/perlmutter/cpu/adios2-2.8.3/bin:${PATH}

# optional: CCache
export PATH=/global/common/software/spackecp/perlmutter/e4s-23.08/default/spack/opt/spack/linux-sles15-zen3/gcc-11.2.0/ccache-4.8.2-cvooxdw5wgvv2g3vjxjkrpv6dopginv6/bin:$PATH

# optional: for Python bindings or libEnsemble
module load cray-python/3.11.5

if [ -d "${CFS}/${proj}/${USER}/sw/perlmutter/cpu/venvs/warpx-cpu" ]
then
  source ${CFS}/${proj}/${USER}/sw/perlmutter/cpu/venvs/warpx-cpu/bin/activate
fi

# an alias to request an interactive batch node for one hour
#   for parallel execution, start on the batch node: srun <command>
alias getNode="salloc --nodes 1 --qos interactive --time 01:00:00 --constraint cpu --account=$proj"
# an alias to run a command on a batch node for up to one hour
#   usage: runNode <command>
alias runNode="srun --nodes 1 --qos interactive --time 01:00:00 --constraint cpu --account=$proj"

# optimize CPU microarchitecture for AMD EPYC 3rd Gen (Milan/Zen3)
# note: the cc/CC/ftn wrappers below add those
export CXXFLAGS="-march=znver3"
export CFLAGS="-march=znver3"

# compiler environment hints
export CC=cc
export CXX=CC
export FC=ftn

Edit the 2nd line of this script, which sets the export proj="" variable. For example, if you are a member of the project m3239, then run nano $HOME/perlmutter_cpu_warpx.profile and edit line 2 to read:

export proj="m3239"

Exit the nano editor with Ctrl + O (save) and then Ctrl + X (exit).

Important

Now, and as the first step on future logins to Perlmutter, activate these environment settings:

source $HOME/perlmutter_cpu_warpx.profile

Finally, since Perlmutter does not yet provide software modules for some of our dependencies, install them once:

bash $HOME/src/warpx/Tools/machines/perlmutter-nersc/install_cpu_dependencies.sh
source ${CFS}/${proj}/${USER}/sw/perlmutter/cpu/venvs/warpx-cpu/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was perlmutter_cpu_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your perlmutter_cpu_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Check $proj variable is correct and has a corresponding CFS directory #######
#
if [ ! -d "${CFS}/${proj}/" ]
then
    echo "WARNING: The directory ${CFS}/${proj}/ does not exist!"
    echo "Is the \$proj environment variable of value \"$proj\" correctly set? "
    echo "Please edit line 2 of your perlmutter_cpu_warpx.profile file to continue!"
    exit
fi


# Remove old dependencies #####################################################
#
SW_DIR="${CFS}/${proj}/${USER}/sw/perlmutter/cpu"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# General extra dependencies ##################################################
#

# tmpfs build directory: avoids issues often seen with $HOME and is faster
build_dir=$(mktemp -d)

# c-blosc (I/O compression)
if [ -d $HOME/src/c-blosc ]
then
  cd $HOME/src/c-blosc
  git fetch --prune
  git checkout v1.21.1
  cd -
else
  git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git $HOME/src/c-blosc
fi
rm -rf $HOME/src/c-blosc-pm-cpu-build
cmake -S $HOME/src/c-blosc -B ${build_dir}/c-blosc-pm-cpu-build -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/c-blosc-1.21.1
cmake --build ${build_dir}/c-blosc-pm-cpu-build --target install --parallel 16
rm -rf ${build_dir}/c-blosc-pm-cpu-build

# ADIOS2
if [ -d $HOME/src/adios2 ]
then
  cd $HOME/src/adios2
  git fetch --prune
  git checkout v2.8.3
  cd -
else
  git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git $HOME/src/adios2
fi
rm -rf $HOME/src/adios2-pm-cpu-build
cmake -S $HOME/src/adios2 -B ${build_dir}/adios2-pm-cpu-build -DADIOS2_USE_Blosc=ON -DADIOS2_USE_CUDA=OFF -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/adios2-2.8.3
cmake --build ${build_dir}/adios2-pm-cpu-build --target install -j 16
rm -rf ${build_dir}/adios2-pm-cpu-build

# BLAS++ (for PSATD+RZ)
if [ -d $HOME/src/blaspp ]
then
  cd $HOME/src/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git $HOME/src/blaspp
fi
rm -rf $HOME/src/blaspp-pm-cpu-build
CXX=$(which CC) cmake -S $HOME/src/blaspp -B ${build_dir}/blaspp-pm-cpu-build -Duse_openmp=ON -Dgpu_backend=OFF -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build ${build_dir}/blaspp-pm-cpu-build --target install --parallel 16
rm -rf ${build_dir}/blaspp-pm-cpu-build

# LAPACK++ (for PSATD+RZ)
if [ -d $HOME/src/lapackpp ]
then
  cd $HOME/src/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git $HOME/src/lapackpp
fi
rm -rf $HOME/src/lapackpp-pm-cpu-build
CXX=$(which CC) CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S $HOME/src/lapackpp -B ${build_dir}/lapackpp-pm-cpu-build -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build ${build_dir}/lapackpp-pm-cpu-build --target install --parallel 16
rm -rf ${build_dir}/lapackpp-pm-cpu-build


# Python ######################################################################
#
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade virtualenv
python3 -m pip cache purge
rm -rf ${SW_DIR}/venvs/warpx-cpu
python3 -m venv ${SW_DIR}/venvs/warpx-cpu
source ${SW_DIR}/venvs/warpx-cpu/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
MPICC="cc -shared" python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# install or update WarpX dependencies
python3 -m pip install --upgrade -r $HOME/src/warpx/requirements.txt
# optimas (based on libEnsemble & ax->botorch->gpytorch->pytorch)
python3 -m pip install --upgrade torch --index-url https://download.pytorch.org/whl/cpu
python3 -m pip install --upgrade optimas[all]


# remove build temporary directory
rm -rf ${build_dir}
Compilation

Use the following cmake commands to compile the application executable:

cd $HOME/src/warpx
rm -rf build_pm_gpu

cmake -S . -B build_pm_gpu -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_gpu -j 16

The WarpX application executables are now in $HOME/src/warpx/build_pm_gpu/bin/. Additionally, the following commands will install WarpX as a Python module:

cd $HOME/src/warpx
rm -rf build_pm_gpu_py

cmake -S . -B build_pm_gpu_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_gpu_py -j 16 --target pip_install
cd $HOME/src/warpx
rm -rf build_pm_cpu

cmake -S . -B build_pm_cpu -DWarpX_COMPUTE=OMP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_cpu -j 16

The WarpX application executables are now in $HOME/src/warpx/build_pm_cpu/bin/. Additionally, the following commands will install WarpX as a Python module:

rm -rf build_pm_cpu_py

cmake -S . -B build_pm_cpu_py -DWarpX_COMPUTE=OMP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_cpu_py -j 16 --target pip_install

Now, you can submit Perlmutter compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit Perlmutter jobs (example inputs). For executables, you can reference their location in your job script or copy them to a location in $PSCRATCH.
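For example, staging a run directory in $PSCRATCH could look like this (hypothetical directory name; adjust the executable name to the one actually built in build_pm_gpu/bin/):

mkdir -p $PSCRATCH/warpx_runs/run_001
cp $HOME/src/warpx/build_pm_gpu/bin/warpx* $PSCRATCH/warpx_runs/run_001/
cp <your inputs file> $PSCRATCH/warpx_runs/run_001/inputs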

Update WarpX & Dependencies

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

cd $HOME/src/warpx

# read the output of this command - does it look ok?
git status

# get the latest WarpX source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log # press q to exit

And, if needed, update your perlmutter_gpu_warpx.profile or perlmutter_cpu_warpx.profile file and re-run the matching install_*_dependencies.sh script, so that the environment matches the new source.

As a last step, clean the build directory (rm -rf $HOME/src/warpx/build_pm_*) and rebuild WarpX.

Running

The batch script below can be used to run a WarpX simulation on multiple nodes (change -N accordingly) on the supercomputer Perlmutter at NERSC. This partition has up to 1536 nodes.

Replace descriptions between chevrons <> by relevant values, for instance <input file> could be plasma_mirror_inputs. Note that we run one MPI rank per GPU.

You can copy this file from $HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_gpu.sbatch.
#!/bin/bash -l

# Copyright 2021-2023 Axel Huebl, Kevin Gott
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL

#SBATCH -t 00:10:00
#SBATCH -N 2
#SBATCH -J WarpX
#    note: <proj> must end on _g
#SBATCH -A <proj>
#SBATCH -q regular
# A100 40GB (most nodes)
#SBATCH -C gpu
# A100 80GB (256 nodes)
#S BATCH -C gpu&hbm80g
#SBATCH --exclusive
# ideally single:1, but NERSC cgroups issue
#SBATCH --gpu-bind=none
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-node=4
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j

# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=inputs

# pin to closest NIC to GPU
export MPICH_OFI_NIC_POLICY=GPU

# threads for OpenMP and threaded compressors per MPI rank
#   note: 16 avoids hyperthreading (32 virtual cores, 16 physical)
export SRUN_CPUS_PER_TASK=16
export OMP_NUM_THREADS=${SRUN_CPUS_PER_TASK}

# GPU-aware MPI optimizations
GPU_AWARE_MPI="amrex.use_gpu_aware_mpi=1"

# CUDA visible devices are ordered inverse to local task IDs
#   Reference: nvidia-smi topo -m
srun --cpu-bind=cores bash -c "
    export CUDA_VISIBLE_DEVICES=\$((3-SLURM_LOCALID));
    ${EXE} ${INPUTS} ${GPU_AWARE_MPI}" \
  > output.txt

To run a simulation, copy the lines above to a file perlmutter_gpu.sbatch and run

sbatch perlmutter_gpu.sbatch

to submit the job.

Perlmutter has 256 nodes that provide 80 GB HBM per A100 GPU. In the A100 (40GB) batch script, replace -C gpu with -C gpu&hbm80g to use these large-memory GPUs.
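Concretely, this corresponds to toggling the constraint lines in perlmutter_gpu.sbatch, for instance (a sketch; Slurm only parses lines that start exactly with #SBATCH):

##SBATCH -C gpu
#SBATCH -C gpu&hbm80g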

The Perlmutter CPU partition has up to 3072 nodes, each with 2x AMD EPYC 7763 CPUs.

You can copy this file from $HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_cpu.sbatch.
#!/bin/bash -l

# Copyright 2021-2023 WarpX
#
# This file is part of WarpX.
#
# Authors: Axel Huebl
# License: BSD-3-Clause-LBNL

#SBATCH -t 00:10:00
#SBATCH -N 2
#SBATCH -J WarpX
#SBATCH -A <proj>
#SBATCH -q regular
#SBATCH -C cpu
#SBATCH --ntasks-per-node=16
#SBATCH --exclusive
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j

# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=inputs_small

# each CPU node on Perlmutter (NERSC) has 64 hardware cores with
# 2x Hyperthreading/SMP
# https://en.wikichip.org/wiki/amd/epyc/7763
# https://www.amd.com/en/products/cpu/amd-epyc-7763
# Each CPU is made up of 8 chiplets, each sharing 32MB L3 cache.
# This will be our MPI rank assignment (2x8 is 16 ranks/node).

# threads for OpenMP and threaded compressors per MPI rank
export SRUN_CPUS_PER_TASK=16  # 8 cores per chiplet, 2x SMP
export OMP_PLACES=threads
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=${SRUN_CPUS_PER_TASK}

srun --cpu-bind=cores \
  ${EXE} ${INPUTS} \
  > output.txt
Post-Processing

For post-processing, most users use Python via NERSC’s Jupyter service (documentation).

As a one-time preparatory setup, log into Perlmutter via SSH and do not source the WarpX profile script above. Create your own Conda environment and Jupyter kernel for post-processing:

module load python

conda config --set auto_activate_base false

# create conda environment
rm -rf $HOME/.conda/envs/warpx-pm-postproc
conda create --yes -n warpx-pm-postproc -c conda-forge mamba conda-libmamba-solver
conda activate warpx-pm-postproc
conda config --set solver libmamba
mamba install --yes -c conda-forge python ipykernel ipympl matplotlib numpy pandas yt openpmd-viewer openpmd-api h5py fast-histogram dask dask-jobqueue pyarrow

# create Jupyter kernel
rm -rf $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/
python -m ipykernel install --user --name warpx-pm-postproc --display-name WarpX-PM-PostProcessing
echo -e '#!/bin/bash\nmodule load python\nsource activate warpx-pm-postproc\nexec "$@"' > $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel-helper.sh
chmod a+rx $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel-helper.sh
KERNEL_STR=$(jq '.argv |= ["{resource_dir}/kernel-helper.sh"] + .' $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel.json | jq '.argv[1] = "python"')
echo ${KERNEL_STR} | jq > $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel.json

exit

When opening a Jupyter notebook on https://jupyter.nersc.gov, just select WarpX-PM-PostProcessing from the list of available kernels on the top right of the notebook.

Additional software can be installed later on, e.g., in a Jupyter cell using !mamba install -y -c conda-forge .... Software that is not available via conda can be installed via !python -m pip install ....
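For example, in a notebook cell using the WarpX-PM-PostProcessing kernel (the package names below are only illustrations):

!mamba install -y -c conda-forge scikit-image
!python -m pip install tqdm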

Polaris (ALCF)

The Polaris cluster is located at ALCF.

Introduction

If you are new to this system, please see the following resources:

Preparation

Use the following commands to download the WarpX source code:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

On Polaris, you can run either on GPU nodes with fast A100 GPUs (recommended) or CPU nodes.

We use system software modules, add environment hints and further dependencies via the file $HOME/polaris_gpu_warpx.profile. Create it now:

cp $HOME/src/warpx/Tools/machines/polaris-alcf/polaris_gpu_warpx.profile.example $HOME/polaris_gpu_warpx.profile
Script Details
# Set the project name
export proj=""  # change me!

# swap to GNU programming environment (with gcc 11.2)
module swap PrgEnv-nvhpc PrgEnv-gnu
module swap gcc/12.2.0 gcc/11.2.0
module load nvhpc-mixed/22.11

# swap to the Milan cray package
module swap craype-x86-rome craype-x86-milan

# required dependencies
module load cmake/3.23.2

# optional: for QED support with detailed tables
# module load boost/1.81.0

# optional: for openPMD and PSATD+RZ support
module load cray-hdf5-parallel/1.12.2.3
export CMAKE_PREFIX_PATH=/home/${USER}/sw/polaris/gpu/c-blosc-1.21.1:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=/home/${USER}/sw/polaris/gpu/adios2-2.8.3:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=/home/${USER}/sw/polaris/gpu/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=/home/${USER}/sw/polaris/gpu/lapackpp-master:$CMAKE_PREFIX_PATH

export LD_LIBRARY_PATH=/home/${USER}/sw/polaris/gpu/c-blosc-1.21.1/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/home/${USER}/sw/polaris/gpu/adios2-2.8.3/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/home/${USER}/sw/polaris/gpu/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/home/${USER}/sw/polaris/gpu/lapackpp-master/lib64:$LD_LIBRARY_PATH

export PATH=/home/${USER}/sw/polaris/gpu/adios2-2.8.3/bin:${PATH}

# optional: for Python bindings or libEnsemble
module load cray-python/3.9.13.1

if [ -d "/home/${USER}/sw/polaris/gpu/venvs/warpx" ]
then
  source /home/${USER}/sw/polaris/gpu/venvs/warpx/bin/activate
fi

# necessary to use CUDA-Aware MPI and run a job
export CRAY_ACCEL_TARGET=nvidia80

# optimize CUDA compilation for A100
export AMREX_CUDA_ARCH=8.0

# optimize CPU microarchitecture for AMD EPYC 3rd Gen (Milan/Zen3)
# note: the cc/CC/ftn wrappers below add those
export CXXFLAGS="-march=znver3"
export CFLAGS="-march=znver3"

# compiler environment hints
export CC=$(which gcc)
export CXX=$(which g++)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=${CXX}

Edit the 2nd line of this script, which sets the export proj="" variable. For example, if you are a member of the project proj_name, then run nano $HOME/polaris_gpu_warpx.profile and edit line 2 to read:

export proj="proj_name"

Exit the nano editor with Ctrl + O (save) and then Ctrl + X (exit).

Important

Now, and as the first step on future logins to Polaris, activate these environment settings:

source $HOME/polaris_gpu_warpx.profile

Finally, since Polaris does not yet provide software modules for some of our dependencies, install them once:

bash $HOME/src/warpx/Tools/machines/polaris-alcf/install_gpu_dependencies.sh
source /home/${USER}/sw/polaris/gpu/venvs/warpx/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2024 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Axel Huebl, Roelof Groenewald
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail

# Check: ######################################################################
#
#   Was polaris_gpu_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your polaris_gpu_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi

# Remove old dependencies #####################################################
#
SW_DIR="/home/${USER}/sw/polaris/gpu"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true

# General extra dependencies ##################################################
#

# c-blosc (I/O compression)
if [ -d $HOME/src/c-blosc ]
then
  cd $HOME/src/c-blosc
  git fetch --prune
  git checkout v1.21.1
  cd -
else
  git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git $HOME/src/c-blosc
fi
rm -rf $HOME/src/c-blosc-pm-gpu-build
cmake -S $HOME/src/c-blosc -B $HOME/src/c-blosc-pm-gpu-build -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/c-blosc-1.21.1
cmake --build $HOME/src/c-blosc-pm-gpu-build --target install --parallel 16
rm -rf $HOME/src/c-blosc-pm-gpu-build

# ADIOS2
if [ -d $HOME/src/adios2 ]
then
  cd $HOME/src/adios2
  git fetch --prune
  git checkout v2.8.3
  cd -
else
  git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git $HOME/src/adios2
fi
rm -rf $HOME/src/adios2-pm-gpu-build
cmake -S $HOME/src/adios2 -B $HOME/src/adios2-pm-gpu-build -DADIOS2_USE_Blosc=ON -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/adios2-2.8.3
cmake --build $HOME/src/adios2-pm-gpu-build --target install -j 16
rm -rf $HOME/src/adios2-pm-gpu-build

# BLAS++ (for PSATD+RZ)
if [ -d $HOME/src/blaspp ]
then
  cd $HOME/src/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git $HOME/src/blaspp
fi
rm -rf $HOME/src/blaspp-pm-gpu-build
CXX=$(which CC) cmake -S $HOME/src/blaspp -B $HOME/src/blaspp-pm-gpu-build -Duse_openmp=OFF -Dgpu_backend=cuda -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build $HOME/src/blaspp-pm-gpu-build --target install --parallel 16
rm -rf $HOME/src/blaspp-pm-gpu-build

# LAPACK++ (for PSATD+RZ)
if [ -d $HOME/src/lapackpp ]
then
  cd $HOME/src/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git $HOME/src/lapackpp
fi
rm -rf $HOME/src/lapackpp-pm-gpu-build
CXX=$(which CC) CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S $HOME/src/lapackpp -B $HOME/src/lapackpp-pm-gpu-build -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build $HOME/src/lapackpp-pm-gpu-build --target install --parallel 16
rm -rf $HOME/src/lapackpp-pm-gpu-build

# Python ######################################################################
#
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade virtualenv
python3 -m pip cache purge
rm -rf ${SW_DIR}/venvs/warpx
python3 -m venv ${SW_DIR}/venvs/warpx
source ${SW_DIR}/venvs/warpx/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
MPICC="CC -target-accel=nvidia80 -shared" python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# install or update WarpX dependencies such as picmistandard
python3 -m pip install --upgrade -r $HOME/src/warpx/requirements.txt
python3 -m pip install cupy-cuda11x  # CUDA 11.8 compatible wheel
# optional: for libEnsemble
python3 -m pip install -r $HOME/src/warpx/Tools/LibEnsemble/requirements.txt
# optional: for optimas (based on libEnsemble & ax->botorch->gpytorch->pytorch)
python3 -m pip install --upgrade torch  # CUDA 11.8 compatible wheel
python3 -m pip install -r $HOME/src/warpx/Tools/optimas/requirements.txt

Under construction

Compilation

Use the following cmake commands to compile the application executable:

cd $HOME/src/warpx
rm -rf build_pm_gpu

cmake -S . -B build_pm_gpu -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_gpu -j 16

The WarpX application executables are now in $HOME/src/warpx/build_pm_gpu/bin/. Additionally, the following commands will install WarpX as a Python module:

cd $HOME/src/warpx
rm -rf build_pm_gpu_py

cmake -S . -B build_pm_gpu_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_gpu_py -j 16 --target pip_install

Under construction

Now, you can submit Polaris compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit Polaris jobs (example inputs). For executables, you can reference their location in your job script or copy them to a location on the Eagle project filesystem (e.g., under /eagle/<project>/).

Update WarpX & Dependencies

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

cd $HOME/src/warpx

# read the output of this command - does it look ok?
git status

# get the latest WarpX source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log # press q to exit

And, if needed, update your polaris_gpu_warpx.profile file and re-run install_gpu_dependencies.sh, so that the environment matches the new source.

As a last step, clean the build directory (rm -rf $HOME/src/warpx/build_pm_*) and rebuild WarpX.

Running

The batch script below can be used to run a WarpX simulation on multiple nodes (change <NODES> accordingly) on the supercomputer Polaris at ALCF.

Replace descriptions between chevrons <> by relevant values, for instance <input file> could be plasma_mirror_inputs. Note that we run one MPI rank per GPU.

You can copy this file from $HOME/src/warpx/Tools/machines/polaris-alcf/polaris_gpu.pbs.
#!/bin/bash -l

#PBS -A <proj>
#PBS -l select=<NODES>:system=polaris
#PBS -l place=scatter
#PBS -l walltime=0:10:00
#PBS -l filesystems=home:eagle
#PBS -q debug
#PBS -N test_warpx

# Set required environment variables
# support gpu-aware-mpi
# export MPICH_GPU_SUPPORT_ENABLED=1

# Change to working directory
echo Working directory is $PBS_O_WORKDIR
cd ${PBS_O_WORKDIR}

echo Jobid: $PBS_JOBID
echo Running on host `hostname`
echo Running on nodes `cat $PBS_NODEFILE`

# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=input1d

# MPI and OpenMP settings
NNODES=`wc -l < $PBS_NODEFILE`
NRANKS_PER_NODE=4
NDEPTH=1
NTHREADS=1

NTOTRANKS=$(( NNODES * NRANKS_PER_NODE ))
echo "NUM_OF_NODES= ${NNODES} TOTAL_NUM_RANKS= ${NTOTRANKS} RANKS_PER_NODE= ${NRANKS_PER_NODE} THREADS_PER_RANK= ${NTHREADS}"

mpiexec -np ${NTOTRANKS} ${EXE} ${INPUTS} > output.txt

To run a simulation, copy the lines above to a file polaris_gpu.pbs and run

qsub polaris_gpu.pbs

to submit the job.
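Job status and output can then be checked with the usual PBS commands, for example:

qstat -u $USER      # list your queued and running jobs
tail -f output.txt  # follow the simulation output once the job starts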

Under construction

Quartz (LLNL)

The Quartz Intel CPU cluster is located at LLNL.

Introduction

If you are new to this system, please see the following resources:

Preparation

Use the following commands to download the WarpX source code:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

We use system software modules, add environment hints and further dependencies via the file $HOME/quartz_warpx.profile. Create it now:

cp $HOME/src/warpx/Tools/machines/quartz-llnl/quartz_warpx.profile.example $HOME/quartz_warpx.profile
Script Details
# please set your project account
#export proj="<yourProjectNameHere>"  # edit this and comment in

# required dependencies
module load cmake/3.23.1
module load clang/14.0.6-magic
module load mvapich2/2.3.7

# optional: for PSATD support
module load fftw/3.3.10

# optional: for QED lookup table generation support
module load boost/1.80.0

# optional: for openPMD support
module load hdf5-parallel/1.14.0

SW_DIR="/usr/workspace/${USER}/quartz"
export CMAKE_PREFIX_PATH=${SW_DIR}/c-blosc-1.21.1:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/adios2-2.8.3:$CMAKE_PREFIX_PATH
export PATH=${SW_DIR}/adios2-2.8.3/bin:${PATH}

# optional: for PSATD in RZ geometry support
export CMAKE_PREFIX_PATH=${SW_DIR}/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/lapackpp-master:$CMAKE_PREFIX_PATH
export LD_LIBRARY_PATH=${SW_DIR}/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SW_DIR}/lapackpp-master/lib64:$LD_LIBRARY_PATH

# optional: for Python bindings
module load python/3.9.12

if [ -d "${SW_DIR}/venvs/warpx-quartz" ]
then
    source ${SW_DIR}/venvs/warpx-quartz/bin/activate
fi

# optional: an alias to request an interactive node for 30 minutes
alias getNode="srun --time=0:30:00 --nodes=1 --ntasks-per-node=2 --cpus-per-task=18 -p pdebug --pty bash"
# an alias to run a command on a batch node for up to 30min
#   usage: runNode <command>
alias runNode="srun --time=0:30:00 --nodes=1 --ntasks-per-node=2 --cpus-per-task=18 -p pdebug"

# fix system defaults: do not escape $ with a \ on tab completion
shopt -s direxpand

# optimize CPU microarchitecture for Intel Xeon E5-2695 v4
# note: the cc/CC/ftn wrappers below add those
export CXXFLAGS="-march=broadwell"
export CFLAGS="-march=broadwell"

# compiler environment hints
export CC=$(which clang)
export CXX=$(which clang++)
export FC=$(which gfortran)

Edit the 2nd line of this script, which sets the export proj="" variable. For example, if you are a member of the project tps, then run vi $HOME/quartz_warpx.profile. Enter the edit mode by typing i and edit line 2 to read:

export proj="tps"

Exit the vi editor with Esc and then type :wq (write & quit).

Important

Now, and as the first step on future logins to Quartz, activate these environment settings:

source $HOME/quartz_warpx.profile

Finally, since Quartz does not yet provide software modules for some of our dependencies, install them once:

bash $HOME/src/warpx/Tools/machines/quartz-llnl/install_dependencies.sh
source /usr/workspace/${USER}/quartz/venvs/warpx-quartz/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was quartz_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your quartz_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Remove old dependencies #####################################################
#
SW_DIR="/usr/workspace/${USER}/quartz"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# General extra dependencies ##################################################
#

# tmpfs build directory: avoids issues often seen with ${HOME} and is faster
build_dir=$(mktemp -d)

# c-blosc (I/O compression)
if [ -d ${HOME}/src/c-blosc ]
then
  cd ${HOME}/src/c-blosc
  git fetch --prune
  git checkout v1.21.1
  cd -
else
  git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git ${HOME}/src/c-blosc
fi
cmake -S ${HOME}/src/c-blosc -B ${build_dir}/c-blosc-quartz-build -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/c-blosc-1.21.1
cmake --build ${build_dir}/c-blosc-quartz-build --target install --parallel 6

# ADIOS2
if [ -d ${HOME}/src/adios2 ]
then
  cd ${HOME}/src/adios2
  git fetch --prune
  git checkout v2.8.3
  cd -
else
  git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git ${HOME}/src/adios2
fi
cmake -S ${HOME}/src/adios2 -B ${build_dir}/adios2-quartz-build -DBUILD_TESTING=OFF -DADIOS2_BUILD_EXAMPLES=OFF -DADIOS2_USE_Blosc=ON -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_SST=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/adios2-2.8.3
cmake --build ${build_dir}/adios2-quartz-build --target install -j 6

# BLAS++ (for PSATD+RZ)
if [ -d ${HOME}/src/blaspp ]
then
  cd ${HOME}/src/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git ${HOME}/src/blaspp
fi
cmake -S ${HOME}/src/blaspp -B ${build_dir}/blaspp-quartz-build -Duse_openmp=ON -Duse_cmake_find_blas=ON -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build ${build_dir}/blaspp-quartz-build --target install --parallel 6

# LAPACK++ (for PSATD+RZ)
if [ -d ${HOME}/src/lapackpp ]
then
  cd ${HOME}/src/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git ${HOME}/src/lapackpp
fi
CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S ${HOME}/src/lapackpp -B ${build_dir}/lapackpp-quartz-build -Duse_cmake_find_lapack=ON -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build ${build_dir}/lapackpp-quartz-build --target install --parallel 6


# Python ######################################################################
#
python3 -m pip install --upgrade --user virtualenv
rm -rf ${SW_DIR}/venvs/warpx-quartz
python3 -m venv ${SW_DIR}/venvs/warpx-quartz
source ${SW_DIR}/venvs/warpx-quartz/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip cache purge
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt

# install or update WarpX dependencies such as picmistandard
python3 -m pip install --upgrade -r ${HOME}/src/warpx/requirements.txt

# ML dependencies
python3 -m pip install --upgrade torch


# remove build temporary directory ############################################
#
rm -rf ${build_dir}
Compilation

Use the following cmake commands to compile the application executable:

cd $HOME/src/warpx
rm -rf build_quartz

cmake -S . -B build_quartz -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_quartz -j 6

The WarpX application executables are now in $HOME/src/warpx/build_quartz/bin/. Additionally, the following commands will install WarpX as a Python module:

rm -rf build_quartz_py

cmake -S . -B build_quartz_py -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_quartz_py -j 6 --target pip_install

Now, you can submit Quartz compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit Quartz jobs (example inputs). For executables, you can reference their location in your job script or copy them to a location in your Lustre scratch space (e.g., /p/lustre1/${USER}/).

Update WarpX & Dependencies

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

cd $HOME/src/warpx

# read the output of this command - does it look ok?
git status

# get the latest WarpX source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log     # press q to exit

And, if needed, update your quartz_warpx.profile file and re-run install_dependencies.sh, so that the environment matches the new source.

As a last step, clean the build directory (rm -rf $HOME/src/warpx/build_quartz) and rebuild WarpX.

Running
Intel Xeon E5-2695 v4 CPUs

The batch script below can be used to run a WarpX simulation on 2 nodes on the supercomputer Quartz at LLNL. Replace descriptions between chevrons <> by relevant values, for instance <input file> could be plasma_mirror_inputs.

You can copy this file from Tools/machines/quartz-llnl/quartz.sbatch.
#!/bin/bash -l

# Just increase this number if you need more nodes.
#SBATCH -N 2
#SBATCH -t 24:00:00
#SBATCH -A <allocation ID>

#SBATCH -J WarpX
#SBATCH -q pbatch
#SBATCH --qos=normal
#SBATCH --license=lustre1,lustre2
#SBATCH --export=ALL
#SBATCH -e error.txt
#SBATCH -o output.txt
# one MPI rank per half-socket (see below)
#SBATCH --tasks-per-node=2
# request all logical (virtual) cores per half-socket
#SBATCH --cpus-per-task=18


# each Quartz node has 1 socket of Intel Xeon E5-2695 v4
# each Xeon CPU is divided into 2 bus rings that each have direct L3 access
export WARPX_NMPI_PER_NODE=2

# each MPI rank per half-socket has 9 physical cores
#   or 18 logical (virtual) cores
# over-subscribing each physical core with 2x
#   hyperthreading led to a slight (3.5%) speedup on Cori's Intel Xeon E5-2698 v3,
#   so we do the same here
# the settings below make sure threads are close to the
#   controlling MPI rank (process) per half socket and
#   distribute equally over close-by physical cores and,
#   for N>9, also equally over close-by logical cores
export OMP_PROC_BIND=spread
export OMP_PLACES=threads
export OMP_NUM_THREADS=18

EXE="<path/to/executable>"  # e.g. ./warpx

srun --cpu_bind=cores -n $(( ${SLURM_JOB_NUM_NODES} * ${WARPX_NMPI_PER_NODE} )) ${EXE} <input file>

To run a simulation, copy the lines above to a file quartz.sbatch and run

sbatch quartz.sbatch

to submit the job.
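
Afterwards, you can monitor the job with the usual Slurm commands (see also the Slurm quick reference near the end of this document), for example:

squeue -u $(whoami) -l     # is the job still pending or already running?
tail -f output.txt         # follow the output file configured via #SBATCH -o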

Spock (OLCF)

The Spock cluster is located at OLCF.

Introduction

If you are new to this system, please see the following resources:

  • Spock user guide

  • Batch system: Slurm

  • Production directories:

    • $PROJWORK/$proj/: shared with all members of a project (recommended)

    • $MEMBERWORK/$proj/: single user (usually smaller quota)

    • $WORLDWORK/$proj/: shared with all users

    • Note that the $HOME directory is mounted as read-only on compute nodes. That means you cannot run in your $HOME.

Installation

Use the following commands to download the WarpX source code and switch to the correct branch:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

We use the following modules and environments on the system ($HOME/spock_warpx.profile).

You can copy this file from Tools/machines/spock-olcf/spock_warpx.profile.example.
# please set your project account
#export proj=<yourProject>

# required dependencies
module load cmake/3.20.2
module load craype-accel-amd-gfx908
module load rocm/4.3.0

# optional: faster builds
module load ccache
module load ninja

# optional: just an additional text editor
module load nano

# optional: an alias to request an interactive node for one hour
alias getNode="salloc -A $proj -J warpx -t 01:00:00 -p ecp -N 1"

# fix system defaults: do not escape $ with a \ on tab completion
shopt -s direxpand

# optimize ROCm/HIP compilation for MI100
export AMREX_AMD_ARCH=gfx908

# compiler environment hints
export CC=$ROCM_PATH/llvm/bin/clang
export CXX=$(which hipcc)
export LDFLAGS="-L${CRAYLIBS_X86_64} $(CC --cray-print-opts=libs) -lmpi"
# GPU aware MPI: ${PE_MPICH_GTL_DIR_gfx908} -lmpi_gtl_hsa

We recommend storing the above lines in a file, such as $HOME/spock_warpx.profile, and loading it into your shell after a login:

source $HOME/spock_warpx.profile

Then, cd into the directory $HOME/src/warpx and use the following commands to compile:

cd $HOME/src/warpx
rm -rf build

cmake -S . -B build -DWarpX_DIMS="1;2;3" -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DAMReX_AMD_ARCH=gfx908 -DMPI_CXX_COMPILER=$(which CC) -DMPI_C_COMPILER=$(which cc) -DMPI_COMPILER_FLAGS="--cray-print-opts=all"
cmake --build build -j 10

The general cmake compile-time options apply as usual.

That’s it! A 3D WarpX executable is now in build/bin/ and can be run with a 3D example inputs file. Most people execute the binary directly or copy it out to a location in $PROJWORK/$proj/.
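
For example, copying the 3D executable to the shared project area could look like the following sketch (the exact binary name depends on the dimensions and options you compiled):

# stage the 3D executable in the shared project work area (adjust the name to your build)
cp $HOME/src/warpx/build/bin/warpx.3d $PROJWORK/$proj/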

Running
MI100 GPUs (32 GB)

After requesting an interactive node with the getNode alias above, run a simulation like this, here using 4 MPI ranks:

srun -n 4 -c 2 --ntasks-per-node=4 ./warpx inputs

Or in non-interactive runs started with sbatch:

You can copy this file from Tools/machines/spock-olcf/spock_mi100.sbatch.
#!/bin/bash

#SBATCH -A <project id>
#SBATCH -J warpx
#SBATCH -o %x-%j.out
#SBATCH -t 00:10:00
#SBATCH -p ecp
#SBATCH -N 1

export OMP_NUM_THREADS=1
srun -n 4 -c 2 --ntasks-per-node=4 ./warpx inputs > output.txt

We can currently use up to 4 nodes with 4 GPUs each (maximum: -N 4 -n 16).
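
For the largest job currently possible, a minimal sketch of the corresponding changes to the batch script above (only the node count and the srun invocation change) is:

#SBATCH -N 4

export OMP_NUM_THREADS=1
srun -n 16 -c 2 --ntasks-per-node=4 ./warpx inputs > output.txt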

Post-Processing

For post-processing, most users use Python via OLCF's Jupyter service (Docs).

Please follow the same guidance as for OLCF Summit post-processing.

Summit (OLCF)

The Summit cluster is located at OLCF.

On Summit, each compute node provides six V100 GPUs (16GB) and two Power9 CPUs.

Introduction

If you are new to this system, please see the following resources:

  • Summit user guide

  • Batch system: LSF

  • Jupyter service

  • Filesystems:

    • $HOME: per-user directory, use only for inputs, source and scripts; backed up; mounted as read-only on compute nodes, that means you cannot run in it (50 GB quota)

    • $PROJWORK/$proj/: shared with all members of a project, purged every 90 days, GPFS (recommended)

    • $MEMBERWORK/$proj/: single user, purged every 90 days, GPFS (usually smaller quota)

    • $WORLDWORK/$proj/: shared with all users, purged every 90 days, GPFS

    • /ccs/proj/$proj/: another, non-GPFS, file system for software and smaller data.

Note: the Alpine GPFS filesystem on Summit and the new Orion Lustre filesystem on Frontier are not mounted on each other's machines. Use Globus to transfer data between them if needed.

Preparation

Use the following commands to download the WarpX source code:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

We use system software modules, add environment hints and further dependencies via the file $HOME/summit_warpx.profile. Create it now:

cp $HOME/src/warpx/Tools/machines/summit-olcf/summit_warpx.profile.example $HOME/summit_warpx.profile
Script Details
# please set your project account
export proj=""  # change me!

# remembers the location of this script
export MY_PROFILE=$(cd $(dirname $BASH_SOURCE) && pwd)"/"$(basename $BASH_SOURCE)
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your $MY_PROFILE file! Please edit its line 2 to continue!"; return; fi

# optional: just an additional text editor
module load nano

# required dependencies
module load cmake/3.20.2
module load gcc/9.3.0
module load cuda/11.7.1

# optional: faster re-builds
module load ccache
module load ninja

# optional: for PSATD in RZ geometry support
export CMAKE_PREFIX_PATH=/ccs/proj/$proj/${USER}/sw/summit/gpu/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=/ccs/proj/$proj/${USER}/sw/summit/gpu/lapackpp-master:$CMAKE_PREFIX_PATH
export LD_LIBRARY_PATH=/ccs/proj/$proj/${USER}/sw/summit/gpu/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/ccs/proj/$proj/${USER}/sw/summit/gpu/lapackpp-master/lib64:$LD_LIBRARY_PATH

# optional: for QED lookup table generation support
module load boost/1.76.0

# optional: for openPMD support
module load adios2/2.8.1
module load hdf5/1.12.2

# optional: for openPMD support (GNUmake only)
#module load ums
#module load ums-aph114
#module load openpmd-api/0.15.1

# often unstable at runtime with dependencies
module unload darshan-runtime

# optional: Ascent in situ support
#   note: build WarpX with CMake
export Ascent_DIR=/sw/summit/ums/ums010/ascent/0.8.0_warpx/summit/cuda/gnu/ascent-install/

# optional: for Python bindings or libEnsemble
module load python/3.8.10
module load freetype/2.10.4     # matplotlib

# dependencies for numpy, blaspp & lapackpp
module load openblas/0.3.5-omp
export BLAS=${OLCF_OPENBLAS_ROOT}/lib/libopenblas.so
export LAPACK=${OLCF_OPENBLAS_ROOT}/lib/libopenblas.so

# dependency for pyTorch
module load magma

if [ -d "/ccs/proj/$proj/${USER}/sw/summit/gpu/venvs/warpx-summit" ]
then
  source /ccs/proj/$proj/${USER}/sw/summit/gpu/venvs/warpx-summit/bin/activate
fi

# an alias to request an interactive batch node for two hours
#   for parallel execution, start on the batch node: jsrun <command>
alias getNode="bsub -q debug -P $proj -W 2:00 -nnodes 1 -Is /bin/bash"
# an alias to run a command on a batch node for up to two hours
#   usage: runNode <command>
alias runNode="bsub -q debug -P $proj -W 2:00 -nnodes 1 -I"

# fix system defaults: do not escape $ with a \ on tab completion
shopt -s direxpand

# make output group-readable by default
umask 0027

# optimize CUDA compilation for V100
export AMREX_CUDA_ARCH=7.0

# compiler environment hints
export CC=$(which gcc)
export CXX=$(which g++)
export FC=$(which gfortran)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=$(which g++)

Edit the 2nd line of this script, which sets the export proj="" variable. For example, if you are a member of the project aph114, then run vi $HOME/summit_warpx.profile. Enter the edit mode by typing i and edit line 2 to read:

export proj="aph114"

Exit the vi editor with Esc and then type :wq (write & quit).

Important

Now, and as the first step on future logins to Summit, activate these environment settings:

source $HOME/summit_warpx.profile

Finally, since Summit does not yet provide software modules for some of our dependencies, install them once:

bash $HOME/src/warpx/Tools/machines/summit-olcf/install_gpu_dependencies.sh
source /ccs/proj/$proj/${USER}/sw/summit/gpu/venvs/warpx-summit/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was summit_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your summit_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Check $proj variable is correct and has a corresponding PROJWORK directory ##
#
if [ ! -d "${PROJWORK}/${proj}/" ]
then
    echo "WARNING: The directory $PROJWORK/$proj/ does not exist!"
    echo "Is the \$proj environment variable of value \"$proj\" correctly set? "
    echo "Please edit line 2 of your summit_warpx.profile file to continue!"
    exit
fi


# Check $proj variable is correct and has a corresponding Software directory ##
#
if [ ! -d "/ccs/proj/${proj}/" ]
then
    echo "WARNING: The directory /ccs/proj/$proj/ does not exist!"
    echo "Is the \$proj environment variable of value \"$proj\" correctly set? "
    echo "Please edit line 2 of your summit_warpx.profile file to continue!"
    exit
fi


# Remove old dependencies #####################################################
#
SW_DIR="/ccs/proj/${proj}/${USER}/sw/summit/gpu/"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# General extra dependencies ##################################################
#

# tmpfs build directory: avoids issues often seen with $HOME and is faster
build_dir=$(mktemp -d)

# BLAS++ (for PSATD+RZ)
if [ -d $HOME/src/blaspp ]
then
  cd $HOME/src/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git $HOME/src/blaspp
fi
cmake -S $HOME/src/blaspp -B ${build_dir}/blaspp-summit-build -Duse_openmp=ON -Dgpu_backend=cuda -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build ${build_dir}/blaspp-summit-build --target install --parallel 10

# LAPACK++ (for PSATD+RZ)
if [ -d $HOME/src/lapackpp ]
then
  cd $HOME/src/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git $HOME/src/lapackpp
fi
cmake -S $HOME/src/lapackpp -B ${build_dir}/lapackpp-summit-build -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build ${build_dir}/lapackpp-summit-build --target install --parallel 10

# remove build temporary directory
rm -rf ${build_dir}


# Python ######################################################################
#
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade virtualenv
python3 -m pip cache purge
rm -rf ${SW_DIR}/venvs/warpx-summit
python3 -m venv ${SW_DIR}/venvs/warpx-summit
source ${SW_DIR}/venvs/warpx-summit/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade -Ccompile-args="-j10" scipy
python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib==3.2.2  # does not try to build freetype itself
python3 -m pip install --upgrade yt

# install or update WarpX dependencies such as picmistandard
python3 -m pip install --upgrade -r $HOME/src/warpx/requirements.txt

# for ML dependencies, see install_gpu_ml.sh
AI/ML Dependencies (Optional)

If you plan to run AI/ML workflows that depend on PyTorch, run the next step as well. This will take a while and should be skipped if not needed.

runNode bash $HOME/src/warpx/Tools/machines/summit-olcf/install_gpu_ml.sh
Script Details
#!/bin/bash
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was summit_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your summit_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Check $proj variable is correct and has a corresponding PROJWORK directory ##
#
if [ ! -d "${PROJWORK}/${proj}/" ]
then
    echo "WARNING: The directory $PROJWORK/$proj/ does not exist!"
    echo "Is the \$proj environment variable of value \"$proj\" correctly set? "
    echo "Please edit line 2 of your summit_warpx.profile file to continue!"
    exit
fi


# Check $proj variable is correct and has a corresponding Software directory ##
#
if [ ! -d "/ccs/proj/${proj}/" ]
then
    echo "WARNING: The directory /ccs/proj/$proj/ does not exist!"
    echo "Is the \$proj environment variable of value \"$proj\" correctly set? "
    echo "Please edit line 2 of your summit_warpx.profile file to continue!"
    exit
fi


# Remove old dependencies #####################################################
#
# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qqq -y torch 2>/dev/null || true


# Python ML ###################################################################
#
# for basic python dependencies, see install_gpu_dependencies.sh

# optional: for libEnsemble - WIP: issues with nlopt
# python3 -m pip install -r $HOME/src/warpx/Tools/LibEnsemble/requirements.txt

# optional: for pytorch
if [ -d /ccs/proj/${proj}/${USER}/src/pytorch ]
then
  cd /ccs/proj/${proj}/${USER}/src/pytorch
  git fetch
  git checkout .
  git checkout v2.0.1
  git submodule update --init --recursive
  cd -
else
  git clone -b v2.0.1 --recurse-submodules https://github.com/pytorch/pytorch.git /ccs/proj/${proj}/${USER}/src/pytorch
fi
cd /ccs/proj/${proj}/${USER}/src/pytorch
rm -rf build
python3 -m pip install -r requirements.txt
#   patch to avoid compile issues
#   https://github.com/pytorch/pytorch/issues/97497#issuecomment-1499069641
#   https://github.com/pytorch/pytorch/pull/98511
wget -q -O - https://github.com/pytorch/pytorch/pull/98511.patch | git apply
USE_CUDA=1 BLAS=OpenBLAS MAX_JOBS=64 ATEN_AVX512_256=OFF BUILD_TEST=0 python3 setup.py develop
#   (optional) If using torch.compile with inductor/triton, install the matching version of triton
#make triton
rm -rf build
cd -

# optional: optimas dependencies (based on libEnsemble & ax->botorch->gpytorch->pytorch)
#   commented because scikit-learn et al. compile > 2 hrs
#   please run manually on a login node if needed
#python3 -m pip install -r $HOME/src/warpx/Tools/optimas/requirements.txt

For optimas dependencies (incl. scikit-learn), plan another hour of build time:

python3 -m pip install -r $HOME/src/warpx/Tools/optimas/requirements.txt
Compilation

Use the following cmake commands to compile the application executable:

cd $HOME/src/warpx
rm -rf build_summit

cmake -S . -B build_summit -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_summit -j 8

The WarpX application executables are now in $HOME/src/warpx/build_summit/bin/. Additionally, the following commands will install WarpX as a Python module:

rm -rf build_summit_py

cmake -S . -B build_summit_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_summit_py -j 8 --target pip_install

Now, you can submit Summit compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit Summit jobs (example inputs). For executables, you can reference their location in your job script or copy them to a location in $PROJWORK/$proj/.

Update WarpX & Dependencies

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

cd $HOME/src/warpx

# read the output of this command - does it look ok?
git status

# get the latest WarpX source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log     # press q to exit

And, if needed, update the machine profile and re-run the dependency installation steps documented above.

As a last step, clean the build directory (rm -rf $HOME/src/warpx/build_summit) and rebuild WarpX.

Running
V100 GPUs (16GB)

The batch script below can be used to run a WarpX simulation on 2 nodes on the supercomputer Summit at OLCF. Replace descriptions between chevrons <> by relevant values, for instance <input file> could be plasma_mirror_inputs. Note that WarpX runs with one MPI rank per GPU and there are 6 GPUs per node:

You can copy this file from Tools/machines/summit-olcf/summit_v100.bsub.
#!/bin/bash

# Copyright 2019-2020 Maxence Thevenet, Axel Huebl
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# Refs.:
#   https://jsrunvisualizer.olcf.ornl.gov/?s4f0o11n6c7g1r11d1b1l0=
#   https://docs.olcf.ornl.gov/systems/summit_user_guide.html#cuda-aware-mpi

#BSUB -P <allocation ID>
#BSUB -W 00:10
#BSUB -nnodes 2
#BSUB -alloc_flags smt4
#BSUB -J WarpX
#BSUB -o WarpXo.%J
#BSUB -e WarpXe.%J

# make output group-readable by default
umask 0027

# fix problems with collectives since RHEL8 update: OLCFHELP-3545
# disable all the IBM optimized barriers and drop back to HCOLL or OMPI's barrier implementations
export OMPI_MCA_coll_ibm_skip_barrier=true

# libfabric 1.6+: limit the visible devices
# Needed for ADIOS2 SST staging/streaming workflows since RHEL8 update
#   https://github.com/ornladios/ADIOS2/issues/2887
#export FABRIC_IFACE=mlx5_0   # ADIOS SST: select interface (1 NIC on Summit)
#export FI_OFI_RXM_USE_SRX=1  # libfabric: use shared receive context from MSG provider

# ROMIO has a hint for GPFS named IBM_largeblock_io which optimizes I/O with operations on large blocks
export IBM_largeblock_io=true

# MPI-I/O: ROMIO hints for parallel HDF5 performance
export OMPI_MCA_io=romio321
export ROMIO_HINTS=./romio-hints
#   number of hosts: unique node names minus batch node
NUM_HOSTS=$(( $(echo $LSB_HOSTS | tr ' ' '\n' | uniq | wc -l) - 1 ))
cat > romio-hints << EOL
   romio_cb_write enable
   romio_ds_write enable
   cb_buffer_size 16777216
   cb_nodes ${NUM_HOSTS}
EOL

# OpenMP: 1 thread per MPI rank
export OMP_NUM_THREADS=1

# run WarpX
jsrun -r 6 -a 1 -g 1 -c 7 -l GPU-CPU -d packed -b rs --smpiargs="-gpu" <path/to/executable> <input file> > output.txt

To run a simulation, copy the lines above to a file summit_v100.bsub and run

bsub summit_v100.bsub

to submit the job.
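
Once submitted, the job can be monitored with the usual LSF commands (see also the LSF quick reference near the end of this document), for example:

bjobs                 # list your jobs and their current state
bpeek -f <job id>     # follow stdout/stderr of a running job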

For a 3D simulation with a few (1-4) particles per cell using the FDTD Maxwell solver on Summit for a well load-balanced problem (in our case, a laser-wakefield acceleration simulation in a boosted frame in the quasi-linear regime), the following set of parameters provided good performance:

  • amr.max_grid_size=256 and amr.blocking_factor=128.

  • One MPI rank per GPU (e.g., 6 MPI ranks for the 6 GPUs on each Summit node)

  • Two `128x128x128` grids per GPU, or one `128x128x256` grid per GPU.

A batch script with more options for profiling on Summit can be found in the Summit batch script.

Power9 CPUs

Similar to above, the batch script below can be used to run a WarpX simulation on 1 node on the supercomputer Summit at OLCF, on Power9 CPUs (i.e., the GPUs are ignored).

You can copy this file from Tools/machines/summit-olcf/summit_power9.bsub.
#!/bin/bash

# Copyright 2019-2020 Maxence Thevenet, Axel Huebl, Michael Rowan
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# Refs.:
#   https://jsrunvisualizer.olcf.ornl.gov/?s1f0o121n2c21g0r11d1b1l0=

#BSUB -P <allocation ID>
#BSUB -W 00:10
#BSUB -nnodes 1
#BSUB -alloc_flags "smt1"
#BSUB -J WarpX
#BSUB -o WarpXo.%J
#BSUB -e WarpXe.%J

# make output group-readable by default
umask 0027

# fix problems with collectives since RHEL8 update: OLCFHELP-3545
# disable all the IBM optimized barriers and drop back to HCOLL or OMPI's barrier implementations
export OMPI_MCA_coll_ibm_skip_barrier=true

# libfabric 1.6+: limit the visible devices
# Needed for ADIOS2 SST staging/streaming workflows since RHEL8 update
#   https://github.com/ornladios/ADIOS2/issues/2887
#export FABRIC_IFACE=mlx5_0   # ADIOS SST: select interface (1 NIC on Summit)
#export FI_OFI_RXM_USE_SRX=1  # libfabric: use shared receive context from MSG provider

# ROMIO has a hint for GPFS named IBM_largeblock_io which optimizes I/O with operations on large blocks
export IBM_largeblock_io=true

# MPI-I/O: ROMIO hints for parallel HDF5 performance
export OMPI_MCA_io=romio321
export ROMIO_HINTS=./romio-hints
#   number of hosts: unique node names minus batch node
NUM_HOSTS=$(( $(echo $LSB_HOSTS | tr ' ' '\n' | uniq | wc -l) - 1 ))
cat > romio-hints << EOL
   romio_cb_write enable
   romio_ds_write enable
   cb_buffer_size 16777216
   cb_nodes ${NUM_HOSTS}
EOL

# OpenMP: 21 threads per MPI rank
export OMP_NUM_THREADS=21

# run WarpX
jsrun -n 2 -a 1 -c 21 -r 2 -l CPU-CPU -d packed -b rs <path/to/executable> <input file> > output.txt

For a 3D simulation with a few (1-4) particles per cell using the FDTD Maxwell solver on Summit for a well load-balanced problem, the following set of parameters provided good performance:

  • amr.max_grid_size=64 and amr.blocking_factor=64

  • Two MPI ranks per node (i.e. 2 resource sets per node; equivalently, 1 resource set per socket)

  • 21 physical CPU cores per MPI rank

  • 21 OpenMP threads per MPI rank (i.e. 1 OpenMP thread per physical core)

  • SMT 1 (Simultaneous Multithreading level 1)

  • Sixteen `64x64x64` grids per MPI rank (with default tiling in WarpX, this results in ~49 tiles per OpenMP thread)

I/O Performance Tuning
GPFS Large Block I/O

Setting IBM_largeblock_io to true disables data shipping, saving overhead when writing/reading large contiguous I/O chunks.

export IBM_largeblock_io=true
ROMIO MPI-IO Hints

You might notice some parallel HDF5 performance improvements on Summit by setting the appropriate ROMIO hints for MPI-IO operations.

export OMPI_MCA_io=romio321
export ROMIO_HINTS=./romio-hints

You can generate the romio-hints by issuing the following command. Remember to change the number of cb_nodes to match the number of compute nodes you are using (example here: 64).

cat > romio-hints << EOL
romio_cb_write enable
romio_ds_write enable
cb_buffer_size 16777216
cb_nodes 64
EOL

The romio-hints file contains key-value pairs of hints that enable and tune collective buffering of MPI-IO operations. Since Summit’s Alpine file system uses a 16 MB block size, you should set the collective buffer size to 16 MB (16777216 bytes) and set the number of aggregators (cb_nodes) to the number of compute nodes you are using, i.e., one aggregator per node.

Further details are available at Summit’s documentation page.

Known System Issues

Warning

Sep 16th, 2021 (OLCFHELP-3685): The Jupyter service cannot open HDF5 files without hanging, due to a filesystem mounting problem.

Please apply this work-around in a Jupyter cell before opening any HDF5 files for read:

import os
os.environ['HDF5_USE_FILE_LOCKING'] = "FALSE"

Warning

Aug 27th, 2021 (OLCFHELP-3442): Created simulation files and directories are no longer accessible by your team members, even if you create them on $PROJWORK. Setting the proper “user mask” (umask) does not yet work to fix this.

Please run the following commands after running a simulation to fix this. You can also append them to the end of your job scripts after the jsrun line:

# cd your-simulation-directory
find . -type d -exec chmod g+rwx {} \;
find . -type f -exec chmod g+rw {} \;

Warning

Sep 3rd, 2021 (OLCFHELP-3545): The implementation of barriers in IBM’s MPI fork is broken and leads to crashes at scale. This is seen with runs using 200 nodes and above.

Our batch script templates above apply this work-around before the call to jsrun, which avoids the broken routines from IBM and trades them for an OpenMPI implementation of collectives:

export OMPI_MCA_coll_ibm_skip_barrier=true

Warning

Sep 3rd, 2021 (OLCFHELP-3319): If you are an active developer and compile middleware libraries (e.g., ADIOS2) yourself that use MPI and/or infiniband, be aware of libfabric: IBM forks the open source version of this library and ships a patched version.

To avoid conflicts with mainline versions of this library in MPI, which lead to crashes at runtime, load the patched version alongside the system MPI module:

module load libfabric/1.12.1-sysrdma

For instance, if you compile large software stacks with Spack, make sure to register libfabric with that exact version as an external module.

If you load the documented ADIOS2 module above, this problem does not affect you, since the correct libfabric version is chosen for this one.

Warning

Related to the above issue, the fabric selection in ADIOS2 was designed for libfabric 1.6. With newer versions of libfabric, a workaround is needed to guide the selection of a functional fabric for RDMA support. Details are discussed in ADIOS2 issue #2887.

The following environment variables can be set as work-arounds, when working with ADIOS2 SST:

export FABRIC_IFACE=mlx5_0   # ADIOS SST: select interface (1 NIC on Summit)
export FI_OFI_RXM_USE_SRX=1  # libfabric: use shared receive context from MSG provider

Warning

Oct 12th, 2021 (OLCFHELP-4242): There is currently a problem with the pre-installed Jupyter extensions, which can lead to connection interruptions during long-running analysis sessions.

Work-around this issue by running in a single Jupyter cell, before starting analysis:

!jupyter serverextension enable --py --sys-prefix dask_labextension
Post-Processing

For post-processing, most users use Python via OLCF's Jupyter service (Docs).

We usually just install our software on-the-fly on Summit. When starting up a post-processing session, run this in your first cells:

Note

The following software packages are installed only into a temporary directory.

# work-around for OLCFHELP-4242
!jupyter serverextension enable --py --sys-prefix dask_labextension

# next Jupyter cell: the software you want
!mamba install --quiet -c conda-forge -y openpmd-api openpmd-viewer ipympl ipywidgets fast-histogram yt

# restart notebook

Taurus (ZIH)

The Taurus cluster is located at ZIH (TU Dresden).

The cluster has multiple partitions; this section describes how to use the AMD Rome CPU + NVIDIA A100 nodes.

Introduction

If you are new to this system, please see the following resources:

  • ZIH user guide

  • Batch system: Slurm

  • Jupyter service: Missing?

  • Production directories:

    • $PSCRATCH: per-user production directory, purged every 30 days (<TBD>TB)

    • /global/cscratch1/sd/m3239: shared production directory for users in the project m3239, purged every 30 days (50TB)

    • /global/cfs/cdirs/m3239/: community file system for users in the project m3239 (100TB)

Installation

Use the following commands to download the WarpX source code and switch to the correct branch:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

We use the following modules and environments on the system ($HOME/taurus_warpx.profile).

You can copy this file from Tools/machines/taurus-zih/taurus_warpx.profile.example.
# please set your project account
#export proj="<yourProject>"  # change me

# required dependencies
module load modenv/hiera
module load foss/2021b
module load CUDA/11.8.0
module load CMake/3.22.1

# optional: for QED support with detailed tables
#module load Boost  # TODO

# optional: for openPMD and PSATD+RZ support
module load HDF5/1.13.1

# optional: for Python bindings or libEnsemble
#module load python  # TODO
#
#if [ -d "$HOME/sw/taurus/venvs/warpx" ]
#then
#  source $HOME/sw/taurus/venvs/warpx/bin/activate
#fi

# an alias to request an interactive batch node for two hours
#   for parallel execution, start on the batch node: srun <command>
alias getNode="salloc --time=2:00:00 -N1 -n1 --cpus-per-task=6 --mem-per-cpu=2048 --gres=gpu:1 --gpu-bind=single:1 -p alpha-interactive --pty bash"
# an alias to run a command on a batch node for up to two hours
#   usage: runNode <command>
alias runNode="srun --time=2:00:00 -N1 -n1 --cpus-per-task=6 --mem-per-cpu=2048 --gres=gpu:1 --gpu-bind=single:1 -p alpha-interactive"

# optimize CUDA compilation for A100
export AMREX_CUDA_ARCH=8.0

# compiler environment hints
#export CC=$(which gcc)
#export CXX=$(which g++)
#export FC=$(which gfortran)
#export CUDACXX=$(which nvcc)
#export CUDAHOSTCXX=${CXX}

We recommend storing the above lines in a file, such as $HOME/taurus_warpx.profile, and loading it into your shell after a login:

source $HOME/taurus_warpx.profile

Then, cd into the directory $HOME/src/warpx and use the following commands to compile:

cd $HOME/src/warpx
rm -rf build

cmake -S . -B build -DWarpX_DIMS="1;2;3" -DWarpX_COMPUTE=CUDA
cmake --build build -j 16

The general cmake compile-time options apply as usual.

Running
A100 GPUs (40 GB)

The alpha partition has 34 nodes, each with 8 x NVIDIA A100-SXM4 Tensor Core GPUs and 2 x AMD EPYC 7352 CPUs (24 cores each) @ 2.3 GHz (multithreading disabled).

The batch script below can be used to run a WarpX simulation on multiple nodes (change -N accordingly). Replace descriptions between chevrons <> by relevant values, for instance <input file> could be plasma_mirror_inputs. Note that we run one MPI rank per GPU.

You can copy this file from Tools/machines/taurus-zih/taurus.sbatch.
#!/bin/bash -l

# Copyright 2023 Axel Huebl, Thomas Miethlinger
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL

#SBATCH -t 00:10:00
#SBATCH -N 1
#SBATCH -J WarpX
#SBATCH -p alpha
#SBATCH --exclusive
#SBATCH --cpus-per-task=6
#SBATCH --mem-per-cpu=2048
#SBATCH --gres=gpu:1
#SBATCH --gpu-bind=single:1
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j

# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=inputs_small

# run
srun ${EXE} ${INPUTS} \
  > output.txt

To run a simulation, copy the lines above to a file taurus.sbatch and run

sbatch taurus.sbatch

to submit the job.

Great Lakes (UMich)

The Great Lakes cluster is located at the University of Michigan. The cluster has various partitions, including GPU nodes and CPU nodes.

Introduction

If you are new to this system, please see the following resources:

Preparation

Use the following commands to download the WarpX source code:

git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx

On Great Lakes, you can run on GPU nodes with fast V100 GPUs (recommended), on the even faster A100 GPUs (only a few are available), or on CPU nodes.

We use system software modules, add environment hints and further dependencies via the file $HOME/greatlakes_v100_warpx.profile. Create it now:

cp $HOME/src/warpx/Tools/machines/greatlakes-umich/greatlakes_v100_warpx.profile.example $HOME/greatlakes_v100_warpx.profile
Script Details
# please set your project account
export proj=""  # change me!

# remembers the location of this script
export MY_PROFILE=$(cd $(dirname $BASH_SOURCE) && pwd)"/"$(basename $BASH_SOURCE)
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your $MY_PROFILE file! Please edit its line 2 to continue!"; return; fi

# required dependencies
module purge
module load gcc/10.3.0
module load cuda/12.1.1
module load cmake/3.26.3
module load openblas/0.3.23
module load openmpi/4.1.6-cuda

# optional: for QED support
module load boost/1.78.0

# optional: for openPMD and PSATD+RZ support
module load phdf5/1.12.1

SW_DIR="${HOME}/sw/greatlakes/v100"
export CMAKE_PREFIX_PATH=${SW_DIR}/c-blosc2-2.14.4:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/adios2-2.10.0:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=${SW_DIR}/lapackpp-master:$CMAKE_PREFIX_PATH

export LD_LIBRARY_PATH=${SW_DIR}/c-blosc2-2.14.4/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SW_DIR}/adios2-2.10.0/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SW_DIR}/blaspp-master/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${SW_DIR}/lapackpp-master/lib64:$LD_LIBRARY_PATH

export PATH=${SW_DIR}/adios2-2.10.0/bin:${PATH}

# optional: for Python bindings or libEnsemble
module load python/3.12.1

if [ -d "${SW_DIR}/venvs/warpx-v100" ]
then
  source ${SW_DIR}/venvs/warpx-v100/bin/activate
fi

# an alias to request an interactive batch node for one hour
#   for parallel execution, start on the batch node: srun <command>
alias getNode="salloc -N 1 --partition=gpu --ntasks-per-node=2 --cpus-per-task=20 --gpus-per-task=v100:1 -t 1:00:00 -A $proj"
# an alias to run a command on a batch node for up to 30min
#   usage: runNode <command>
alias runNode="srun -N 1 --partition=gpu --ntasks-per-node=2 --cpus-per-task=20 --gpus-per-task=v100:1 -t 1:00:00 -A $proj"

# optimize CUDA compilation for V100
export AMREX_CUDA_ARCH=7.0

# optimize CPU microarchitecture for Intel Xeon Gold 6148
export CXXFLAGS="-march=skylake-avx512"
export CFLAGS="-march=skylake-avx512"

# compiler environment hints
export CC=$(which gcc)
export CXX=$(which g++)
export FC=$(which gfortran)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=${CXX}

Edit the 2nd line of this script, which sets the export proj="" variable. For example, if you are a member of the project iloveplasma, then run nano $HOME/greatlakes_v100_warpx.profile and edit line 2 to read:

export proj="iloveplasma"

Exit the nano editor with Ctrl + O (save) and then Ctrl + X (exit).

Important

Now, and as the first step on future logins to Great Lakes, activate these environment settings:

source $HOME/greatlakes_v100_warpx.profile

Finally, since Great Lakes does not yet provide software modules for some of our dependencies, install them once:

bash $HOME/src/warpx/Tools/machines/greatlakes-umich/install_v100_dependencies.sh
source ${HOME}/sw/greatlakes/v100/venvs/warpx-v100/bin/activate
Script Details
#!/bin/bash
#
# Copyright 2024 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was greatlakes_v100_warpx.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your greatlakes_v100_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Remove old dependencies #####################################################
#
echo "Cleaning up prior installation directory... This may take several minutes."
SW_DIR="${HOME}/sw/greatlakes/v100"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# General extra dependencies ##################################################
#

# tmpfs build directory: avoids issues often seen with $HOME and is faster
build_dir=$(mktemp -d)

# c-blosc (I/O compression)
if [ -d $HOME/src/c-blosc2 ]
then
  cd $HOME/src/c-blosc2
  git fetch --prune
  git checkout v2.14.4
  cd -
else
  git clone -b v2.14.4 https://github.com/Blosc/c-blosc2.git $HOME/src/c-blosc2
fi
rm -rf $HOME/src/c-blosc2-v100-build
cmake -S $HOME/src/c-blosc2 -B ${build_dir}/c-blosc2-v100-build -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DBUILD_EXAMPLES=OFF -DBUILD_FUZZERS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/c-blosc2-2.14.4
cmake --build ${build_dir}/c-blosc2-v100-build --target install --parallel 8
rm -rf ${build_dir}/c-blosc2-v100-build

# ADIOS2
if [ -d $HOME/src/adios2 ]
then
  cd $HOME/src/adios2
  git fetch --prune
  git checkout v2.10.0
  cd -
else
  git clone -b v2.10.0 https://github.com/ornladios/ADIOS2.git $HOME/src/adios2
fi
rm -rf $HOME/src/adios2-v100-build
cmake                                \
  -S $HOME/src/adios2                \
  -B ${build_dir}/adios2-v100-build  \
  -DADIOS2_USE_Blosc2=ON             \
  -DADIOS2_USE_Campaign=OFF          \
  -DADIOS2_USE_Fortran=OFF           \
  -DADIOS2_USE_Python=OFF            \
  -DADIOS2_USE_ZeroMQ=OFF            \
  -DCMAKE_INSTALL_PREFIX=${SW_DIR}/adios2-2.10.0
cmake --build ${build_dir}/adios2-v100-build --target install -j 8
rm -rf ${build_dir}/adios2-v100-build

# BLAS++ (for PSATD+RZ)
if [ -d $HOME/src/blaspp ]
then
  cd $HOME/src/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git $HOME/src/blaspp
fi
rm -rf $HOME/src/blaspp-v100-build
cmake -S $HOME/src/blaspp -B ${build_dir}/blaspp-v100-build -Duse_openmp=OFF -Dgpu_backend=cuda -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build ${build_dir}/blaspp-v100-build --target install --parallel 8
rm -rf ${build_dir}/blaspp-v100-build

# LAPACK++ (for PSATD+RZ)
if [ -d $HOME/src/lapackpp ]
then
  cd $HOME/src/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git $HOME/src/lapackpp
fi
rm -rf $HOME/src/lapackpp-v100-build
CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S $HOME/src/lapackpp -B ${build_dir}/lapackpp-v100-build -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build ${build_dir}/lapackpp-v100-build --target install --parallel 8
rm -rf ${build_dir}/lapackpp-v100-build


# Python ######################################################################
#
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade virtualenv
python3 -m pip cache purge
rm -rf ${SW_DIR}/venvs/warpx-v100
python3 -m venv ${SW_DIR}/venvs/warpx-v100
source ${SW_DIR}/venvs/warpx-v100/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# install or update WarpX dependencies
python3 -m pip install --upgrade -r $HOME/src/warpx/requirements.txt
python3 -m pip install --upgrade cupy-cuda12x  # CUDA 12 compatible wheel
# optimas (based on libEnsemble & ax->botorch->gpytorch->pytorch)
python3 -m pip install --upgrade torch  # CUDA 12 compatible wheel
python3 -m pip install --upgrade optimas[all]


# remove build temporary directory
rm -rf ${build_dir}

Note

The corresponding preparation steps for the A100 GPU and CPU nodes are still TODO.

Compilation

Use the following cmake commands to compile the application executable:

cd $HOME/src/warpx
rm -rf build_v100

cmake -S . -B build_v100 -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_v100 -j 8

The WarpX application executables are now in $HOME/src/warpx/build_v100/bin/. Additionally, the following commands will install WarpX as a Python module:

cd $HOME/src/warpx
rm -rf build_v100_py

cmake -S . -B build_v100_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_v100_py -j 8 --target pip_install

Note

The corresponding compilation steps for the A100 GPU and CPU nodes are still TODO.

Now, you can submit Great Lakes compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit Great Lakes jobs (example inputs). For executables, you can reference their location in your job script or copy them to a location in /scratch.

Update WarpX & Dependencies

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

cd $HOME/src/warpx

# read the output of this command - does it look ok?
git status

# get the latest WarpX source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log # press q to exit

And, if needed, update the machine profile and re-run the dependency installation steps documented above.

As a last step, clean the build directory (rm -rf $HOME/src/warpx/build_*) and rebuild WarpX.

Running

V100 GPUs

The batch script below can be used to run a WarpX simulation on multiple nodes (change -N accordingly) on the supercomputer Great Lakes at the University of Michigan. This partition has 20 nodes, each with two V100 GPUs.

Replace descriptions between chevrons <> by relevant values, for instance <input file> could be plasma_mirror_inputs. Note that we run one MPI rank per GPU.

You can copy this file from $HOME/src/warpx/Tools/machines/greatlakes-umich/greatlakes_v100.sbatch.
#!/bin/bash -l

# Copyright 2024 The WarpX Community
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL

#SBATCH -t 00:10:00
#SBATCH -N 1
#SBATCH -J WarpX
#SBATCH -A <proj>
#SBATCH --partition=gpu
#SBATCH --exclusive
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=20
#SBATCH --gpus-per-task=v100:1
#SBATCH --gpu-bind=single:1
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j

# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=inputs

# threads for OpenMP and threaded compressors per MPI rank
#   per node are 2x 2.4 GHz Intel Xeon Gold 6148
#   note: the system seems to only expose cores (20 per socket),
#         not hyperthreads (40 per socket)
export SRUN_CPUS_PER_TASK=20
export OMP_NUM_THREADS=${SRUN_CPUS_PER_TASK}

# GPU-aware MPI optimizations
GPU_AWARE_MPI="amrex.use_gpu_aware_mpi=1"

# run WarpX
srun --cpu-bind=cores \
  ${EXE} ${INPUTS} ${GPU_AWARE_MPI} \
  > output.txt

To run a simulation, copy the lines above to a file greatlakes_v100.sbatch and run

sbatch greatlakes_v100.sbatch

to submit the job.

A100 GPUs

This partition has 2 nodes, each with four A100 GPUs that provide 80 GB HBM per A100 GPU. To the user, each node will appear as if it has 8 A100 GPUs with 40 GB memory each.

Note

This section is TODO.

CPU Nodes

The Great Lakes CPU partition has up to 455 nodes, each with 2x Intel Xeon Gold 6154 CPUs and 180 GB RAM.

Note

This section is TODO.

Post-Processing

For post-processing, many users prefer to use the online Jupyter service (documentation) that is directly connected to the cluster’s fast filesystem.

Note

This section is a stub and contributions are welcome. We can document further details, e.g., which recommended post-processing Python software to install or how to customize Jupyter kernels here.
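
As a starting point, the post-processing packages used for Summit above can also be installed into your Python environment here; this is a suggestion, not a requirement:

# typical WarpX post-processing packages (adjust to your needs)
python3 -m pip install --upgrade openpmd-api openpmd-viewer ipympl ipywidgets fast-histogram matplotlib yt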

Tip

Your HPC system is not in the list? Open an issue and together we can document it!

Batch Systems

HPC systems use a scheduling (“batch”) system for time sharing of computing resources. The batch system is used to request, queue, schedule and execute compute jobs asynchronously. The sections for the individual HPC machines above include example job submission scripts, which you can use as templates for your own modifications.

In this section, we provide a quick reference guide (or cheat sheet) for interacting in more detail with the various batch systems that you might encounter on different machines.

Slurm

Slurm is a modern and very popular batch system. It is used at NERSC and OLCF Frontier, among others.

Job Submission
  • sbatch your_job_script.sbatch

Job Control
  • interactive job:

    • salloc --time=1:00:00 --nodes=1 --ntasks-per-node=4 --cpus-per-task=8

      • e.g. srun "hostname"

    • GPU allocation on most machines requires additional flags, e.g. --gpus-per-task=1 or --gres=...

  • details for my jobs:

    • scontrol -d show job 12345 all details for job with <job id> 12345

    • squeue -u $(whoami) -l all jobs under my user name

  • details for queues:

    • squeue -p queueName -l list full queue

    • squeue -p queueName --start (show start times for pending jobs)

    • squeue -p queueName -l -t R (only show running jobs in queue)

    • sinfo -p queueName (show online/offline nodes in queue)

    • sview (alternative on taurus: module load llview and llview)

    • scontrol show partition queueName

  • communicate with job:

    • scancel <job id> abort job

    • scancel -s <signal number> <job id> send signal or signal name to job

    • scontrol update timelimit=4:00:00 jobid=12345 change the walltime of a job

    • scontrol update jobid=12345 dependency=afterany:54321 only start job 12345 after job with id 54321 has finished

    • scontrol hold <job id> prevent the job from starting

    • scontrol release <job id> release the job to be eligible for run (after it was set on hold)
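
Putting a few of these together, a typical Slurm round trip looks like the following sketch (the job id 12345 is a placeholder; use the id printed by sbatch):

sbatch your_job_script.sbatch     # submit the job and note the printed job id
squeue -u $(whoami) -l            # check whether it is pending or running
scontrol -d show job 12345        # inspect all details of the job
scancel 12345                     # abort it, e.g., if the configuration was wrong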

References

LSF

LSF (for Load Sharing Facility) is an IBM batch system. It is used at OLCF Summit, LLNL Lassen, and other IBM systems.

Job Submission
  • bsub your_job_script.bsub

Job Control
  • interactive job:

    • bsub -P $proj -W 2:00 -nnodes 1 -Is /bin/bash

  • details for my jobs:

    • bjobs 12345 all details for job with <job id> 12345

    • bjobs [-l] all jobs under my user name

    • jobstat -u $(whoami) job eligibility

    • bjdepinfo 12345 job dependencies on other jobs

  • details for queues:

    • bqueues list queues

  • communicate with job:

    • bkill <job id> abort job

    • bpeek [-f] <job id> peek into stdout/stderr of a job

    • bkill -s <signal number> <job id> send signal or signal name to job

    • bchkpnt and brestart checkpoint and restart job (untested/unimplemented)

    • bmod -W 1:30 12345 change the walltime of a job (currently not allowed)

    • bstop <job id> prevent the job from starting

    • bresume <job id> release the job to be eligible for run (after it was set on hold)

References

PBS

PBS (for Portable Batch System) is a popular HPC batch system. The OpenPBS project is related to PBS, PBS Pro and TORQUE.

Job Submission
  • qsub your_job_script.qsub

Job Control
  • interactive job:

    • qsub -I

  • details for my jobs:

    • qstat -f 12345 all details for job with <job id> 12345

    • qstat -u $(whoami) all jobs under my user name

  • details for queues:

    • qstat -a queueName show all jobs in a queue

    • pbs_free -l compact view on free and busy nodes

    • pbsnodes list all nodes and their detailed state (free, busy/job-exclusive, offline)

  • communicate with job:

    • qdel <job id> abort job

    • qsig -s <signal number> <job id> send signal or signal name to job

    • qalter -lwalltime=12:00:00 <job id> change the walltime of a job

    • qalter -Wdepend=afterany:54321 12345 only start job 12345 after job with id 54321 has finished

    • qhold <job id> prevent the job from starting

    • qrls <job id> release the job to be eligible for run (after it was set on hold)

References

PJM

PJM (probably for Parallel Job Manager?) is a Fujitsu batch system. It is used at RIKEN Fugaku and on other Fujitsu systems.

Note

This section is a stub and improvements to complete the (TODO) sections are welcome.

Job Submission
  • pjsub your_job_script.pjsub

Job Control
  • interactive job:

    • pjsub --interact

  • details for my jobs:

    • pjstat status of all jobs

    • (TODO) all details for job with <job id> 12345

    • (TODO) all jobs under my user name

  • details for queues:

    • (TODO) show all jobs in a queue

    • (TODO) compact view on free and busy nodes

    • (TODO) list all nodes and their detailed state (free, busy/job-exclusive, offline)

  • communicate with job:

    • pjdel <job id> abort job

    • (TODO) send signal or signal name to job

    • (TODO) change the walltime of a job

    • (TODO) only start job 12345 after job with id 54321 has finished

    • pjhold <job id> prevent the job from starting

    • pjrls <job id> release the job to be eligible for run (after it was set on hold)

References

Usage

Run WarpX

In order to run a new simulation:

  1. create a new directory, where the simulation will be run

  2. make sure the WarpX executable is either copied into this directory or in your PATH environment variable

  3. add an inputs file and on HPC systems a submission script to the directory

  4. run

1. Run Directory

On Linux/macOS, this is as easy as:

mkdir -p <run_directory>

where <run_directory> should be replaced by the actual path to the run directory.

2. Executable

If you installed WarpX with a package manager, a warpx-prefixed executable will be available to you as a regular system command. Depending on the chosen build options, the name is suffixed with more details. Try it like this:

warpx<TAB>

Hitting the <TAB> key will suggest available WarpX executables as found in your PATH environment variable.

Note

WarpX needs separate binaries for 1D, 2D, 3D, and RZ simulations. We encode the supported dimensionality in the binary file name.

If you compiled the code yourself, the WarpX executable is stored in the source folder under build/bin. We also create a symbolic link that is just called warpx that points to the last executable you built, which can be copied, too. Copy the executable to this directory:

cp build/bin/<warpx_executable> <run_directory>/

where <warpx_executable> should be replaced by the actual name of the executable (see above) and <run_directory> by the actual path to the run directory.

3. Inputs

Add an input file in the directory (see examples and parameters). This file contains the numerical and physical parameters that define the situation to be simulated.

On HPC systems, also copy and adjust a submission script that allocates computing nodes for you. Please reach out to us if you need help setting up a template that runs with ideal performance.

4. Run

Run the executable, e.g. with MPI:

cd <run_directory>

# run with an inputs file:
mpirun -np <n_ranks> ./warpx <input_file>

or

# run with a PICMI input script:
mpirun -np <n_ranks> python <python_script>

Here, <n_ranks> is the number of MPI ranks used, and <input_file> is the name of the input file (<python_script> is the name of the PICMI script). Note that the actual executable might have a longer name, depending on build options.

We used the copied executable in the current directory (./); if you installed with a package manager, skip the ./ because WarpX is in your PATH.

On an HPC system, you would instead submit the job script at this point, e.g. sbatch <submission_script> (SLURM on Cori/NERSC) or bsub <submission_script> (LSF on Summit/OLCF).

Tip

In the next sections, we will explain parameters of the <input_file>. You can overwrite all parameters inside this file also from the command line, e.g.:

mpirun -np 4 ./warpx <input_file> max_step=10 warpx.numprocs=1 2 2

5. Outputs

By default, WarpX will write a status update to the terminal (stdout). On HPC systems, we usually store a copy of this in a file called outputs.txt.

We also store by default an exact copy of all explicitly and implicitly used input parameters in a file called warpx_used_inputs (this file name can be changed). This is important for reproducibility, since, as noted in the previous paragraph, the options in the input file can be extended and overwritten from the command line.
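
A minimal sketch of choosing a different name for this copy, assuming the warpx.used_inputs_file runtime parameter described in the parameters section:

# write the copy of all used input parameters to a custom file name
mpirun -np <n_ranks> ./warpx <input_file> warpx.used_inputs_file=used_inputs.txt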

Further configured diagnostics are explained in the next sections. By default, they are written to a subdirectory in diags/ and can use various output formats.

Examples

This section allows you to download input files that correspond to different physical situations.

We provide two kinds of inputs: Python (PICMI) scripts and input files for the WarpX executable.

For a complete list of all example input files, also have a look at our Examples/ directory. It contains folders and subfolders with self-describing names that you can try. All these input files are automatically tested, so they should always be up-to-date.

Plasma-Based Acceleration

Laser-Wakefield Acceleration of Electrons

This example shows how to model a laser-wakefield accelerator (LWFA) [2, 3].

Laser-wakefield acceleration is best performed in 3D or quasi-cylindrical (RZ) geometry, in order to correctly capture some of the key physics (laser diffraction, beamloading, shape of the accelerating bubble in the blowout regime, etc.). For physical situations that have close-to-cylindrical symmetry, simulations in RZ geometry capture the relevant physics at a fraction of the computational cost of a 3D simulation. On the other hand, for physical situations with strong asymmetries (e.g., non-round laser driver, strong hosing of the accelerated beam, etc.), only 3D simulations are suitable.

For LWFA scenarios with long propagation lengths, use the boosted frame method. An example can be seen in the PWFA example.

Run

For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.

This example can be run either as:

  • Python script: python3 PICMI_inputs_3d.py or

  • WarpX executable using an input file: warpx.3d inputs_3d max_step=400
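
Combining this with the MPI prefix mentioned above, a 4-rank run of this example would look like the following (assuming mpiexec is your launcher):

# PICMI (Python) version
mpiexec -n 4 python3 PICMI_inputs_3d.py

# executable version
mpiexec -n 4 warpx.3d inputs_3d max_step=400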

You can copy this file from Examples/Physics_applications/laser_acceleration/PICMI_inputs_3d.py.
#!/usr/bin/env python3

from pywarpx import picmi

# Physical constants
c = picmi.constants.c
q_e = picmi.constants.q_e

# Number of time steps
max_steps = 100

# Number of cells
nx = 32
ny = 32
nz = 256

# Physical domain
xmin = -30e-06
xmax =  30e-06
ymin = -30e-06
ymax =  30e-06
zmin = -56e-06
zmax =  12e-06

# Domain decomposition
max_grid_size = 64
blocking_factor = 32

# Create grid
grid = picmi.Cartesian3DGrid(
    number_of_cells = [nx, ny, nz],
    lower_bound = [xmin, ymin, zmin],
    upper_bound = [xmax, ymax, zmax],
    lower_boundary_conditions = ['periodic', 'periodic', 'dirichlet'],
    upper_boundary_conditions = ['periodic', 'periodic', 'dirichlet'],
    lower_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
    upper_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
    moving_window_velocity = [0., 0., c],
    warpx_max_grid_size = max_grid_size,
    warpx_blocking_factor = blocking_factor)

# Particles: plasma electrons
plasma_density = 2e23
plasma_xmin = -20e-06
plasma_ymin = -20e-06
plasma_zmin = 0
plasma_xmax = 20e-06
plasma_ymax = 20e-06
plasma_zmax = None
uniform_distribution = picmi.UniformDistribution(
    density = plasma_density,
    lower_bound = [plasma_xmin, plasma_ymin, plasma_zmin],
    upper_bound = [plasma_xmax, plasma_ymax, plasma_zmax],
    fill_in = True)
electrons = picmi.Species(
    particle_type = 'electron',
    name = 'electrons',
    initial_distribution = uniform_distribution)

# Particles: beam electrons
q_tot = 1e-12
x_m = 0.
y_m = 0.
z_m = -28e-06
x_rms = 0.5e-06
y_rms = 0.5e-06
z_rms = 0.5e-06
ux_m = 0.
uy_m = 0.
uz_m = 500.
ux_th = 2.
uy_th = 2.
uz_th = 50.
gaussian_bunch_distribution = picmi.GaussianBunchDistribution(
    n_physical_particles = q_tot / q_e,
    rms_bunch_size = [x_rms, y_rms, z_rms],
    rms_velocity = [c*ux_th, c*uy_th, c*uz_th],
    centroid_position = [x_m, y_m, z_m],
    centroid_velocity = [c*ux_m, c*uy_m, c*uz_m])
beam = picmi.Species(
    particle_type = 'electron',
    name = 'beam',
    initial_distribution = gaussian_bunch_distribution)

# Laser
e_max = 16e12
position_z = 9e-06
profile_t_peak = 30.e-15
profile_focal_distance = 100e-06
laser = picmi.GaussianLaser(
    wavelength = 0.8e-06,
    waist = 5e-06,
    duration = 15e-15,
    focal_position = [0, 0, profile_focal_distance + position_z],
    centroid_position = [0, 0, position_z - c*profile_t_peak],
    propagation_direction = [0, 0, 1],
    polarization_direction = [0, 1, 0],
    E0 = e_max,
    fill_in = False)
laser_antenna = picmi.LaserAntenna(
    position = [0., 0., position_z],
    normal_vector = [0, 0, 1])

# Electromagnetic solver
solver = picmi.ElectromagneticSolver(
    grid = grid,
    method = 'Yee',
    cfl = 1.,
    divE_cleaning = 0)

# Diagnostics
diag_field_list = ['B', 'E', 'J', 'rho']
particle_diag = picmi.ParticleDiagnostic(
    name = 'diag1',
    period = 100,
    write_dir = '.',
    warpx_file_prefix = 'Python_LaserAcceleration_plt')
field_diag = picmi.FieldDiagnostic(
    name = 'diag1',
    grid = grid,
    period = 100,
    data_list = diag_field_list,
    write_dir = '.',
    warpx_file_prefix = 'Python_LaserAcceleration_plt')

# Set up simulation
sim = picmi.Simulation(
    solver = solver,
    max_steps = max_steps,
    verbose = 1,
    particle_shape = 'cubic',
    warpx_use_filter = 1,
    warpx_serialize_initial_conditions = 1,
    warpx_do_dynamic_scheduling = 0)

# Add plasma electrons
sim.add_species(
    electrons,
    layout = picmi.GriddedLayout(grid = grid, n_macroparticle_per_cell = [1, 1, 1]))

# Add beam electrons
sim.add_species(
    beam,
    layout = picmi.PseudoRandomLayout(grid = grid, n_macroparticles = 100))

# Add laser
sim.add_laser(
    laser,
    injection_method = laser_antenna)

# Add diagnostics
sim.add_diagnostic(particle_diag)
sim.add_diagnostic(field_diag)

# Write input file that can be used to run with the compiled version
sim.write_input_file(file_name = 'inputs_3d_picmi')

# Initialize inputs and WarpX instance
sim.initialize_inputs()
sim.initialize_warpx()

# Advance simulation until last time step
sim.step(max_steps)
You can copy this file from Examples/Physics_applications/laser_acceleration/inputs_3d.
#################################
####### GENERAL PARAMETERS ######
#################################
max_step = 100           # for production, run for longer time, e.g. max_step = 1000
amr.n_cell = 32 32 256   # for production, run with finer mesh, e.g. amr.n_cell = 64 64 512
amr.max_grid_size = 64   # maximum size of each AMReX box, used to decompose the domain
amr.blocking_factor = 32 # minimum size of each AMReX box, used to decompose the domain
geometry.dims = 3
geometry.prob_lo     = -30.e-6   -30.e-6   -56.e-6    # physical domain
geometry.prob_hi     =  30.e-6    30.e-6    12.e-6
amr.max_level = 0 # Maximum level in hierarchy (1 might be unstable, >1 is not supported)
# warpx.fine_tag_lo = -5.e-6   -5.e-6   -50.e-6
# warpx.fine_tag_hi =  5.e-6    5.e-6   -30.e-6

#################################
####### Boundary condition ######
#################################
boundary.field_lo = periodic periodic pec
boundary.field_hi = periodic periodic pec

#################################
############ NUMERICS ###########
#################################
warpx.verbose = 1
warpx.do_dive_cleaning = 0
warpx.use_filter = 1
warpx.cfl = 1. # if 1., the time step is set to its CFL limit
warpx.do_moving_window = 1
warpx.moving_window_dir = z
warpx.moving_window_v = 1.0 # units of speed of light
warpx.do_dynamic_scheduling = 0 # for production, set this to 1 (default)
warpx.serialize_initial_conditions = 1         # for production, set this to 0 (default)

# Order of particle shape factors
algo.particle_shape = 3

#################################
############ PLASMA #############
#################################
particles.species_names = electrons

electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 1 1 1
electrons.xmin = -20.e-6
electrons.xmax =  20.e-6
electrons.ymin = -20.e-6
electrons.ymax =  20.e-6
electrons.zmin =  0
electrons.profile = constant
electrons.density = 2.e23  # number of electrons per m^3
electrons.momentum_distribution_type = "at_rest"
electrons.do_continuous_injection = 1
electrons.addIntegerAttributes = regionofinterest
electrons.attribute.regionofinterest(x,y,z,ux,uy,uz,t) = "(z>12.0e-6) * (z<13.0e-6)"
electrons.addRealAttributes = initialenergy
electrons.attribute.initialenergy(x,y,z,ux,uy,uz,t) = " ux*ux + uy*uy + uz*uz"

#################################
############ LASER  #############
#################################
lasers.names        = laser1
laser1.profile      = Gaussian
laser1.position     = 0. 0. 9.e-6        # This point is on the laser plane
laser1.direction    = 0. 0. 1.           # The plane normal direction
laser1.polarization = 0. 1. 0.           # The main polarization vector
laser1.e_max        = 16.e12             # Maximum amplitude of the laser field (in V/m)
laser1.profile_waist = 5.e-6             # The waist of the laser (in m)
laser1.profile_duration = 15.e-15        # The duration of the laser (in s)
laser1.profile_t_peak = 30.e-15          # Time at which the laser reaches its peak (in s)
laser1.profile_focal_distance = 100.e-6  # Focal distance from the antenna (in m)
laser1.wavelength = 0.8e-6               # The wavelength of the laser (in m)

# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = 100
diag1.diag_type = Full
diag1.fields_to_plot = Ex Ey Ez Bx By Bz jx jy jz rho
diag1.format = openpmd

# Reduced Diagnostics
warpx.reduced_diags_names               = FP

FP.type = FieldProbe
FP.intervals = 10
FP.integrate = 0
FP.probe_geometry = Line
FP.x_probe = 0
FP.y_probe = 0
FP.z_probe = -56e-6
FP.x1_probe = 0
FP.y1_probe = 0
FP.z1_probe = 12e-6
FP.resolution = 300
FP.do_moving_window_FP = 1

In quasi-cylindrical (RZ) geometry, this example can be run either as:

  • Python script: python3 PICMI_inputs_rz.py or

  • WarpX executable using an input file: warpx.rz inputs_rz max_step=400

You can copy this file from Examples/Physics_applications/laser_acceleration/PICMI_inputs_rz.py.
#!/usr/bin/env python3

from pywarpx import picmi

# Physical constants
c = picmi.constants.c
q_e = picmi.constants.q_e

# Number of time steps
max_steps = 10

# Number of cells
nr = 64
nz = 512

# Physical domain
rmin =  0
rmax =  30e-06
zmin = -56e-06
zmax =  12e-06

# Domain decomposition
max_grid_size = 64
blocking_factor = 32

# Create grid
grid = picmi.CylindricalGrid(
    number_of_cells = [nr, nz],
    n_azimuthal_modes = 2,
    lower_bound = [rmin, zmin],
    upper_bound = [rmax, zmax],
    lower_boundary_conditions = ['none', 'dirichlet'],
    upper_boundary_conditions = ['dirichlet', 'dirichlet'],
    lower_boundary_conditions_particles = ['absorbing', 'absorbing'],
    upper_boundary_conditions_particles = ['absorbing', 'absorbing'],
    moving_window_velocity = [0., c],
    warpx_max_grid_size = max_grid_size,
    warpx_blocking_factor = blocking_factor)

# Particles: plasma electrons
plasma_density = 2e23
plasma_xmin = -20e-06
plasma_ymin = None
plasma_zmin = 10e-06
plasma_xmax = 20e-06
plasma_ymax = None
plasma_zmax = None
uniform_distribution = picmi.UniformDistribution(
    density = plasma_density,
    lower_bound = [plasma_xmin, plasma_ymin, plasma_zmin],
    upper_bound = [plasma_xmax, plasma_ymax, plasma_zmax],
    fill_in = True)
electrons = picmi.Species(
    particle_type = 'electron',
    name = 'electrons',
    initial_distribution = uniform_distribution)

# Particles: beam electrons
q_tot = 1e-12
x_m = 0.
y_m = 0.
z_m = -28e-06
x_rms = 0.5e-06
y_rms = 0.5e-06
z_rms = 0.5e-06
ux_m = 0.
uy_m = 0.
uz_m = 500.
ux_th = 2.
uy_th = 2.
uz_th = 50.
gaussian_bunch_distribution = picmi.GaussianBunchDistribution(
    n_physical_particles = q_tot / q_e,
    rms_bunch_size = [x_rms, y_rms, z_rms],
    rms_velocity = [c*ux_th, c*uy_th, c*uz_th],
    centroid_position = [x_m, y_m, z_m],
    centroid_velocity = [c*ux_m, c*uy_m, c*uz_m])
beam = picmi.Species(
    particle_type = 'electron',
    name = 'beam',
    initial_distribution = gaussian_bunch_distribution)

# Laser
e_max = 16e12
position_z = 9e-06
profile_t_peak = 30.e-15
profile_focal_distance = 100e-06
laser = picmi.GaussianLaser(
    wavelength = 0.8e-06,
    waist = 5e-06,
    duration = 15e-15,
    focal_position = [0, 0, profile_focal_distance + position_z],
    centroid_position = [0, 0, position_z - c*profile_t_peak],
    propagation_direction = [0, 0, 1],
    polarization_direction = [0, 1, 0],
    E0 = e_max,
    fill_in = False)
laser_antenna = picmi.LaserAntenna(
    position = [0., 0., position_z],
    normal_vector = [0, 0, 1])

# Electromagnetic solver
solver = picmi.ElectromagneticSolver(
    grid = grid,
    method = 'Yee',
    cfl = 1.,
    divE_cleaning = 0)

# Diagnostics
diag_field_list = ['B', 'E', 'J', 'rho']
field_diag = picmi.FieldDiagnostic(
    name = 'diag1',
    grid = grid,
    period = 10,
    data_list = diag_field_list,
    warpx_dump_rz_modes = 1,
    write_dir = '.',
    warpx_file_prefix = 'Python_LaserAccelerationRZ_plt')
diag_particle_list = ['weighting', 'momentum']
particle_diag = picmi.ParticleDiagnostic(
    name = 'diag1',
    period = 10,
    species = [electrons, beam],
    data_list = diag_particle_list,
    write_dir = '.',
    warpx_file_prefix = 'Python_LaserAccelerationRZ_plt')

# Set up simulation
sim = picmi.Simulation(
    solver = solver,
    max_steps = max_steps,
    verbose = 1,
    particle_shape = 'cubic',
    warpx_use_filter = 0)

# Add plasma electrons
sim.add_species(
    electrons,
    layout = picmi.GriddedLayout(grid = grid, n_macroparticle_per_cell = [1, 4, 1]))

# Add beam electrons
sim.add_species(
    beam,
    layout = picmi.PseudoRandomLayout(grid = grid, n_macroparticles = 100))

# Add laser
sim.add_laser(
    laser,
    injection_method = laser_antenna)

# Add diagnostics
sim.add_diagnostic(field_diag)
sim.add_diagnostic(particle_diag)

# Write input file that can be used to run with the compiled version
sim.write_input_file(file_name = 'inputs_rz_picmi')

# Initialize inputs and WarpX instance
sim.initialize_inputs()
sim.initialize_warpx()

# Advance simulation until last time step
sim.step(max_steps)
You can copy this file from Examples/Physics_applications/laser_acceleration/inputs_rz.
#################################
####### GENERAL PARAMETERS ######
#################################
max_step = 10
amr.n_cell =  64  512
amr.max_grid_size = 64   # maximum size of each AMReX box, used to decompose the domain
amr.blocking_factor = 32 # minimum size of each AMReX box, used to decompose the domain
geometry.dims = RZ
geometry.prob_lo     =   0.   -56.e-6    # physical domain
geometry.prob_hi     =  30.e-6    12.e-6
amr.max_level = 0 # Maximum level in hierarchy (1 might be unstable, >1 is not supported)

warpx.n_rz_azimuthal_modes = 2

boundary.field_lo = none pec
boundary.field_hi = pec pec

#################################
############ NUMERICS ###########
#################################
warpx.verbose = 1
warpx.do_dive_cleaning = 0
warpx.use_filter = 1
warpx.filter_npass_each_dir = 0 1
warpx.cfl = 1. # if 1., the time step is set to its CFL limit
warpx.do_moving_window = 1
warpx.moving_window_dir = z
warpx.moving_window_v = 1.0 # units of speed of light

# Order of particle shape factors
algo.particle_shape = 3

#################################
############ PLASMA #############
#################################
particles.species_names = electrons beam

electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 1 4 1
electrons.xmin = -20.e-6
electrons.xmax =  20.e-6
electrons.zmin =  10.e-6
electrons.profile = constant
electrons.density = 2.e23  # number of electrons per m^3
electrons.momentum_distribution_type = "at_rest"
electrons.do_continuous_injection = 1
electrons.addRealAttributes = orig_x orig_z
electrons.attribute.orig_x(x,y,z,ux,uy,uz,t) = "x"
electrons.attribute.orig_z(x,y,z,ux,uy,uz,t) = "z"

beam.charge = -q_e
beam.mass = m_e
beam.injection_style = "gaussian_beam"
beam.x_rms = .5e-6
beam.y_rms = .5e-6
beam.z_rms = .5e-6
beam.x_m = 0.
beam.y_m = 0.
beam.z_m = -28.e-6
beam.npart = 100
beam.q_tot = -1.e-12
beam.momentum_distribution_type = "gaussian"
beam.ux_m = 0.0
beam.uy_m = 0.0
beam.uz_m = 500.
beam.ux_th = 2.
beam.uy_th = 2.
beam.uz_th = 50.

#################################
############ LASER ##############
#################################
lasers.names        = laser1
laser1.profile      = Gaussian
laser1.position     = 0. 0. 9.e-6        # This point is on the laser plane
laser1.direction    = 0. 0. 1.           # The plane normal direction
laser1.polarization = 0. 1. 0.           # The main polarization vector
laser1.e_max        = 16.e12             # Maximum amplitude of the laser field (in V/m)
laser1.profile_waist = 5.e-6             # The waist of the laser (in m)
laser1.profile_duration = 15.e-15        # The duration of the laser (in s)
laser1.profile_t_peak = 30.e-15          # Time at which the laser reaches its peak (in s)
laser1.profile_focal_distance = 100.e-6  # Focal distance from the antenna (in m)
laser1.wavelength = 0.8e-6               # The wavelength of the laser (in m)

# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = 10
diag1.diag_type = Full
diag1.fields_to_plot = Er Et Ez Br Bt Bz jr jt jz rho
diag1.electrons.variables = w ux uy uz orig_x orig_z
diag1.beam.variables = w ux uy uz
Analyze

Note

This section is TODO.

Visualize

You can run the following script to visualize the beam evolution over time:

Script plot_3d.py
You can copy this file from Examples/Physics_applications/laser_acceleration/plot_3d.py and run it on a plotfile, e.g., python3 plot_3d.py diags/diag1000400/.
#!/usr/bin/env python3

# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Axel Huebl
# License: BSD-3-Clause-LBNL
#
# This script plots the wakefield of an LWFA simulation.

import sys

import matplotlib.pyplot as plt
import yt

yt.funcs.mylog.setLevel(50)


def plot_lwfa():
    # this will be the name of the plot file
    fn = sys.argv[1]

    # Read the file
    ds = yt.load(fn)

    # plot the laser field and absolute density
    fields = ["Ey", "rho"]
    normal = "y"
    sl = yt.SlicePlot(ds, normal=normal, fields=fields)
    for field in fields:
        sl.set_log(field, False)

    sl.set_figure_size((4, 8))
    fig = sl.export_to_mpl_figure(nrows_ncols=(2, 1))
    fig.tight_layout()
    plt.show()

if __name__ == "__main__":
    plot_lwfa()
(top) Electric field of the laser pulse and (bottom) absolute density.

Beam-Driven Wakefield Acceleration of Electrons

This example shows how to model a beam-driven plasma-wakefield accelerator (PWFA) [2, 3].

PWFA is best performed in 3D or quasi-cylindrical (RZ) geometry, in order to correctly capture some of the key physics (structure of the space-charge fields, beamloading, shape of the accelerating bubble in the blowout regime, etc.). For physical situations that have close-to-cylindrical symmetry, simulations in RZ geometry capture the relevant physics at a fraction of the computational cost of a 3D simulation. On the other hand, for physical situations with strong asymmetries (e.g., non-round driver, strong hosing of the accelerated beam, etc.), only 3D simulations are suitable.

Additionally, to speed up computation, this example uses the boosted frame method to effectively model long acceleration lengths.

Alternatively, another common approximation for PWFAs is quasi-static modeling, e.g., when effects such as self-injection can be ignored. In the Beam, Plasma & Accelerator Simulation Toolkit (BLAST), HiPACE++ provides such methods.

Note

TODO: The Python (PICMI) input file should use the boosted frame method, like the inputs_3d_boost file.

Run

This example can be run either as:

  • Python script: python3 PICMI_inputs_plasma_acceleration.py or

  • WarpX executable using an input file: warpx.3d inputs_3d_boost

For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.

Note

TODO: This input file should use the boosted frame method, like the inputs_3d_boost file.

You can copy this file from Examples/Physics_applications/plasma_acceleration/PICMI_inputs_plasma_acceleration.py.
#!/usr/bin/env python3

from pywarpx import picmi

#from warp import picmi

constants = picmi.constants

nx = 64
ny = 64
nz = 64

xmin = -200.e-6
xmax = +200.e-6
ymin = -200.e-6
ymax = +200.e-6
zmin = -200.e-6
zmax = +200.e-6

moving_window_velocity = [0., 0., constants.c]

number_per_cell_each_dim = [2, 2, 1]

max_steps = 10

grid = picmi.Cartesian3DGrid(number_of_cells = [nx, ny, nz],
                             lower_bound = [xmin, ymin, zmin],
                             upper_bound = [xmax, ymax, zmax],
                             lower_boundary_conditions = ['periodic', 'periodic', 'open'],
                             upper_boundary_conditions = ['periodic', 'periodic', 'open'],
                             lower_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
                             upper_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
                             moving_window_velocity = moving_window_velocity,
                             warpx_max_grid_size=32)

solver = picmi.ElectromagneticSolver(grid=grid, cfl=1)

beam_distribution = picmi.UniformDistribution(density = 1.e23,
                                              lower_bound = [-20.e-6, -20.e-6, -150.e-6],
                                              upper_bound = [+20.e-6, +20.e-6, -100.e-6],
                                              directed_velocity = [0., 0., 1.e9])

plasma_distribution = picmi.UniformDistribution(density = 1.e22,
                                                lower_bound = [-200.e-6, -200.e-6, 0.],
                                                upper_bound = [+200.e-6, +200.e-6, None],
                                                fill_in = True)

beam = picmi.Species(particle_type='electron', name='beam', initial_distribution=beam_distribution)
plasma = picmi.Species(particle_type='electron', name='plasma', initial_distribution=plasma_distribution)

sim = picmi.Simulation(solver = solver,
                       max_steps = max_steps,
                       verbose = 1,
                       warpx_current_deposition_algo = 'esirkepov',
                       warpx_use_filter = 0)

sim.add_species(beam, layout=picmi.GriddedLayout(grid=grid, n_macroparticle_per_cell=number_per_cell_each_dim))
sim.add_species(plasma, layout=picmi.GriddedLayout(grid=grid, n_macroparticle_per_cell=number_per_cell_each_dim))

field_diag = picmi.FieldDiagnostic(name = 'diag1',
                                   grid = grid,
                                   period = max_steps,
                                   data_list = ['Ex', 'Ey', 'Ez', 'Jx', 'Jy', 'Jz', 'part_per_cell'],
                                   write_dir = '.',
                                   warpx_file_prefix = 'Python_PlasmaAcceleration_plt')

part_diag = picmi.ParticleDiagnostic(name = 'diag1',
                                     period = max_steps,
                                     species = [beam, plasma],
                                     data_list = ['ux', 'uy', 'uz', 'weighting'])

sim.add_diagnostic(field_diag)
sim.add_diagnostic(part_diag)

# write_inputs will create an inputs file that can be used to run
# with the compiled version.
#sim.write_input_file(file_name = 'inputs_from_PICMI')

# Alternatively, sim.step will run WarpX, controlling it from Python
sim.step()
You can copy this file from Examples/Physics_applications/plasma_acceleration/inputs_3d_boost.
#################################
####### GENERAL PARAMETERS ######
#################################
stop_time = 3.93151387287e-11
amr.n_cell = 32 32 256
amr.max_grid_size = 64
amr.blocking_factor = 32
amr.max_level = 0
geometry.dims = 3
geometry.prob_lo = -0.00015 -0.00015 -0.00012
geometry.prob_hi = 0.00015 0.00015 1.e-06

#################################
####### Boundary condition ######
#################################
boundary.field_lo = periodic periodic pml
boundary.field_hi = periodic periodic pml

#################################
############ NUMERICS ###########
#################################
algo.maxwell_solver = ckc
warpx.verbose = 1
warpx.do_dive_cleaning = 0
warpx.use_filter = 1
warpx.cfl = .99
warpx.do_moving_window = 1
warpx.moving_window_dir = z
warpx.moving_window_v = 1. # in units of the speed of light
my_constants.lramp = 8.e-3
my_constants.dens  = 1e+23

# Order of particle shape factors
algo.particle_shape = 3

#################################
######### BOOSTED FRAME #########
#################################
warpx.gamma_boost = 10.0
warpx.boost_direction = z

#################################
############ PLASMA #############
#################################
particles.species_names = driver plasma_e plasma_p beam driverback
particles.use_fdtd_nci_corr = 1
particles.rigid_injected_species = driver beam

driver.charge = -q_e
driver.mass = 1.e10
driver.injection_style = "gaussian_beam"
driver.x_rms = 2.e-6
driver.y_rms = 2.e-6
driver.z_rms = 4.e-6
driver.x_m = 0.
driver.y_m = 0.
driver.z_m = -20.e-6
driver.npart = 1000
driver.q_tot = -1.e-9
driver.momentum_distribution_type = "gaussian"
driver.ux_m = 0.0
driver.uy_m = 0.0
driver.uz_m = 200000.
driver.ux_th = 2.
driver.uy_th = 2.
driver.uz_th = 20000.
driver.zinject_plane = 0.
driver.rigid_advance = true

driverback.charge = q_e
driverback.mass = 1.e10
driverback.injection_style = "gaussian_beam"
driverback.x_rms = 2.e-6
driverback.y_rms = 2.e-6
driverback.z_rms = 4.e-6
driverback.x_m = 0.
driverback.y_m = 0.
driverback.z_m = -20.e-6
driverback.npart = 1000
driverback.q_tot = 1.e-9
driverback.momentum_distribution_type = "gaussian"
driverback.ux_m = 0.0
driverback.uy_m = 0.0
driverback.uz_m = 200000.
driverback.ux_th = 2.
driverback.uy_th = 2.
driverback.uz_th = 20000.
driverback.do_backward_propagation = true

plasma_e.charge = -q_e
plasma_e.mass = m_e
plasma_e.injection_style = "NUniformPerCell"
plasma_e.zmin = -100.e-6 # 0.e-6
plasma_e.zmax = 0.2
plasma_e.xmin = -70.e-6
plasma_e.xmax =  70.e-6
plasma_e.ymin = -70.e-6
plasma_e.ymax =  70.e-6
# plasma_e.profile = constant
# plasma_e.density = 1.e23
plasma_e.profile = parse_density_function
plasma_e.density_function(x,y,z) = "(z<lramp)*0.5*(1-cos(pi*z/lramp))*dens+(z>lramp)*dens"
plasma_e.num_particles_per_cell_each_dim = 1 1 1
plasma_e.momentum_distribution_type = "at_rest"
plasma_e.do_continuous_injection = 1

plasma_p.charge = q_e
plasma_p.mass = m_p
plasma_p.injection_style = "NUniformPerCell"
plasma_p.zmin = -100.e-6 # 0.e-6
plasma_p.zmax = 0.2
# plasma_p.profile = "constant"
# plasma_p.density = 1.e23
plasma_p.profile = parse_density_function
plasma_p.density_function(x,y,z) = "(z<lramp)*0.5*(1-cos(pi*z/lramp))*dens+(z>lramp)*dens"
plasma_p.xmin = -70.e-6
plasma_p.xmax =  70.e-6
plasma_p.ymin = -70.e-6
plasma_p.ymax =  70.e-6
plasma_p.num_particles_per_cell_each_dim = 1 1 1
plasma_p.momentum_distribution_type = "at_rest"
plasma_p.do_continuous_injection = 1

beam.charge = -q_e
beam.mass = m_e
beam.injection_style = "gaussian_beam"
beam.x_rms = .5e-6
beam.y_rms = .5e-6
beam.z_rms = 1.e-6
beam.x_m = 0.
beam.y_m = 0.
beam.z_m = -100.e-6
beam.npart = 1000
beam.q_tot = -5.e-10
beam.momentum_distribution_type = "gaussian"
beam.ux_m = 0.0
beam.uy_m = 0.0
beam.uz_m = 2000.
beam.ux_th = 2.
beam.uy_th = 2.
beam.uz_th = 200.
beam.zinject_plane = .8e-3
beam.rigid_advance = true

# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = 10000
diag1.diag_type = Full
Analyze

Note

This section is TODO.

Visualize

Note

This section is TODO.

In-Depth: PWFA

As described in the Introduction, one of the key applications of the WarpX exascale computing platform is the modeling of future compact and economical plasma-based accelerators. In this section we describe the simulation setup of a realistic electron-beam-driven plasma wakefield accelerator (PWFA) configuration. For illustration purposes, the setup can be explored with WarpX using the example input file PWFA.

The simulation setup consists of 4 particle species: drive beam (driver), witness beam (beam), plasma electrons (plasma_e), and plasma ions (plasma_p). The species physical parameters are summarized in the following table.

Species      Parameters

driver       \(\gamma\) = 48923; N = 2x10^8; \(\sigma_z\) = 4.0 um; \(\sigma_x\) = 2.0 um

beam         \(\gamma\) = 48923; N = 6x10^5; \(\sigma_z\) = 1.0 um; \(\sigma_x\) = 0.5 um

plasma_e     n = 1x10^23 m^-3; w = 70 um; lr = 8 mm; L = 200 mm

plasma_p     n = 1x10^23 m^-3; w = 70 um; lr = 8 mm; L = 200 mm

Where \(\gamma\) is the beam relativistic Lorentz factor, N is the number of particles, and \(\sigma_x\), \(\sigma_y\), \(\sigma_z\) are the beam widths (root-mean-squares of particle positions) in the transverse (x,y) and longitudinal directions.

The plasma, of total length L, has a density profile consisting of an up-ramp of length lr that rises from 0 to the peak value n; the density is uniform within the transverse width w and remains at the peak value after the up-ramp.
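
As a concrete illustration, the boosted-frame input file listed above (inputs_3d_boost) implements this kind of profile with parse_density_function, using a half-cosine up-ramp followed by a uniform plateau. A minimal Python sketch of that expression, with the constants taken from my_constants.lramp and my_constants.dens in that file:

import numpy as np

# Values of my_constants.lramp and my_constants.dens in inputs_3d_boost
lramp = 8.0e-3   # [m] length of the density up-ramp
dens = 1.0e23    # [m^-3] plateau (peak) plasma density


def plasma_density(z):
    """Half-cosine up-ramp to `dens` over `lramp`, then a uniform plateau,
    mirroring the parse_density_function expression used for plasma_e/plasma_p."""
    z = np.asarray(z, dtype=float)
    ramp = 0.5 * (1.0 - np.cos(np.pi * z / lramp)) * dens
    return np.where(z < lramp, ramp, dens)


# Sample the ramp and the plateau (the species zmin/zmax bound the actual extent)
for z in (0.0, 2.0e-3, 4.0e-3, 8.0e-3, 0.1):
    print(f"z = {z:7.4f} m  ->  n = {plasma_density(z):.3e} m^-3")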

With this configuration the driver excites a nonlinear plasma wake and drives the bubble depleted of plasma electrons where the beam accelerates, as can be seen in Fig. [fig:PWFA].

[fig:PWFA] Plot of the driver (blue), beam (red) and plasma_e (green) electron macroparticle distributions at time step 1000 (dump 1000) of the 2D example simulation, overlaid on the 2D plot of the longitudinal electric field showing the accelerating/decelerating (red/blue) regions of the plasma bubble.

Listed below are the key arguments and best practices relevant for choosing the PWFA simulation parameters used in the example.

2D Geometry

2D Cartesian geometry simulations (with longitudinal direction z and transverse direction x) can give valuable physical and numerical insight into the simulation requirements and evolution, while being much less time consuming than simulations in the full 3D Cartesian or quasi-cylindrical geometries.

Finite Difference Time Domain

For standard plasma wakefield configurations, it is possible to model the physics correctly using the Particle-In-Cell (PIC) Finite Difference Time Domain (FDTD) algorithms. If the simulation contains localized, extremely high-intensity fields, however, numerical instabilities such as the numerical Cherenkov instability might arise (see Moving window and optimal Lorentz boosted frame). In that case, it is recommended to use the Pseudo-Spectral Analytical Time Domain (PSATD) or the Pseudo-Spectral Time Domain (PSTD) algorithms. In the example described here, FDTD is sufficient.

Cole-Karkkainen solver with Cowan coefficients

Two FDTD Maxwell field solvers are implemented in WarpX to compute the field push: the Yee solver and the Cole-Karkkainen solver with Cowan coefficients (CKC). The latter includes a modification that makes the numerical dispersion of light in vacuum exact, which is why CKC is chosen for this example.
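
In the PICMI scripts above, the solver is created with method='Yee'. Selecting CKC (the counterpart of algo.maxwell_solver = ckc in the native input file) is a one-line change; a minimal sketch, assuming the standard PICMI method name 'CKC' and reusing the grid parameters of the boosted-frame example:

from pywarpx import picmi

c = picmi.constants.c

# Minimal grid, reusing the numbers of the boosted-frame example (inputs_3d_boost)
grid = picmi.Cartesian3DGrid(
    number_of_cells=[32, 32, 256],
    lower_bound=[-150.e-6, -150.e-6, -120.e-6],
    upper_bound=[150.e-6, 150.e-6, 1.e-6],
    lower_boundary_conditions=['periodic', 'periodic', 'open'],
    upper_boundary_conditions=['periodic', 'periodic', 'open'],
    lower_boundary_conditions_particles=['periodic', 'periodic', 'absorbing'],
    upper_boundary_conditions_particles=['periodic', 'periodic', 'absorbing'],
    moving_window_velocity=[0., 0., c])

# 'CKC' selects the Cole-Karkkainen-Cowan solver; cfl = 0.99 as in inputs_3d_boost
solver = picmi.ElectromagneticSolver(grid=grid, method='CKC', cfl=0.99)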

Lorentz boosted frame

WarpX simulations can be run in the laboratory frame or in a Lorentz-boosted frame. In the laboratory frame, there is typically no need to model the plasma ion species, since they are essentially stationary over the short time scales associated with the motion of the plasma electrons. In the boosted frame, that argument is no longer valid, as the ions acquire relativistic velocities. Even so, the boosted frame results in a substantial reduction of the computational cost of the simulation.

Note

Even if the simulation uses the boosted frame, most of the input file parameters are defined with respect to the laboratory frame.

We recommend that you design your numerical setup so that the width of the box is not significantly narrower than the distance from 0 to its right edge (done, for example, by setting the right edge equal to 0).

Moving window

To avoid having to simulate the whole 0.2 m of plasma with the high resolution that is required to model the beam-plasma interaction correctly, we use the moving window. In this way, we define a simulation box (grid) of fixed size that travels at the speed of light (\(c\)), i.e., follows the beam.

Note

When using the moving window, continuous injection needs to be enabled for all species that are initialized outside of the initial simulation box (see the sketch below).
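
Both knobs appear in the PICMI scripts earlier on this page: the moving window is requested via moving_window_velocity on the grid, and continuous injection is enabled with fill_in=True on the plasma distribution (the counterpart of <species>.do_continuous_injection = 1 in the native input files). A condensed sketch of the particle side:

from pywarpx import picmi

# Continuous injection: fill_in=True keeps creating plasma macro-particles
# in the regions that newly enter the moving window
# (the grid itself carries moving_window_velocity = [0., 0., picmi.constants.c]).
plasma_distribution = picmi.UniformDistribution(
    density=1.e23,
    lower_bound=[-70.e-6, -70.e-6, 0.],   # plasma starts near z = 0
    upper_bound=[70.e-6, 70.e-6, None],   # no upper z limit: injected as the window moves
    fill_in=True)

plasma_e = picmi.Species(
    particle_type='electron',
    name='plasma_e',
    initial_distribution=plasma_distribution)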

Resolution

Longitudinal and transverse resolutions (i.e., the number and dimensions of the PIC grid cells) should be chosen to accurately describe the physical processes taking place in the simulation. Convergence scans, in which the resolution in both directions is gradually increased, should be used to determine the optimal configuration. Multiple cells per beam length and width are recommended (note that the resolution of this illustrative example is coarse).

Note

To avoid spurious effects in the boosted frame, we consider the constraint that the transverse cell size should be larger than the longitudinal one. Translating this condition to the transverse (\(d_{x}\)) and longitudinal (\(d_{z}\)) cell dimensions in the laboratory frame leads to: \(d_{x} > d_{z} (1+\beta_{b}) \gamma_{b}\), where \(\beta_{b}\) is the velocity of the boosted frame in units of \(c\).
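
This inequality is straightforward to check when setting up a run; a minimal sketch (the cell sizes below are placeholders, not taken from a specific input file):

import math


def transverse_cell_size_ok(dx_lab, dz_lab, gamma_b):
    """Check the constraint quoted above: d_x > d_z * (1 + beta_b) * gamma_b,
    with d_x and d_z the lab-frame transverse and longitudinal cell sizes."""
    beta_b = math.sqrt(1.0 - 1.0 / gamma_b**2)
    return dx_lab > dz_lab * (1.0 + beta_b) * gamma_b


# Placeholder lab-frame cell sizes with gamma_boost = 10, as in the example above
print(transverse_cell_size_ok(dx_lab=9.4e-6, dz_lab=4.0e-7, gamma_b=10.0))  # True
print(transverse_cell_size_ok(dx_lab=1.0e-6, dz_lab=4.0e-7, gamma_b=10.0))  # False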

Time step

The time step (\(dt\)) is used to iterate over the main PIC loop and is computed by WarpX differently depending on the Maxwell FDTD solver used:

  • For Yee, it is equal to the CFL parameter chosen in the input file (Parameters: Inputs File) times the Courant–Friedrichs–Lewy (CFL) limit, which follows the analytical expression given in Particle-in-Cell Method.

  • For CKC, it is equal to the CFL parameter times the smallest of the boosted-frame cell dimensions, divided by \(c\).

Here, CFL is chosen to be below unity and sets an optimal trade-off between making the simulation faster and avoiding the NCI and other spurious effects. Both rules are sketched in the code below.
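
A minimal sketch of both rules, using the standard 3D Yee Courant limit \(1/(c\sqrt{1/d_x^2 + 1/d_y^2 + 1/d_z^2})\); the cell sizes are placeholders:

import math

c = 299_792_458.0  # speed of light [m/s]


def dt_yee(cfl, dx, dy, dz):
    """Yee: CFL parameter times the 3D Courant limit of the Yee scheme."""
    return cfl / (c * math.sqrt(1.0 / dx**2 + 1.0 / dy**2 + 1.0 / dz**2))


def dt_ckc(cfl, dx, dy, dz):
    """CKC: CFL parameter times the smallest cell dimension, divided by c."""
    return cfl * min(dx, dy, dz) / c


# Placeholder (boosted-frame) cell sizes
dx = dy = 9.4e-6
dz = 4.7e-7
print(f"Yee dt = {dt_yee(0.99, dx, dy, dz):.3e} s")
print(f"CKC dt = {dt_ckc(0.99, dx, dy, dz):.3e} s")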

Duration of the simulation

To determine the total number of time steps of the simulation, we could either set the <zmax_plasma_to_compute_max_step> parameter to the end of the plasma (\(z_{\textrm{end}}\)), or compute it using:

  • boosted frame edge of the simulation box, \(\textrm{corner} = l_{e}/ ((1-\beta_{b}) \gamma_{b})\)

  • time of interaction in the boosted frame, \(T = \frac{z_{\textrm{end}}/\gamma_{b}-\textrm{corner}}{c (1+\beta_{b})}\)

  • total number of iterations, \(i_{\textrm{max}} = T/dt\)

where \(l_{e}\) is the position of the left edge of the simulation box (with respect to the propagation direction). These relations are chained in the short sketch below.
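
A minimal sketch chaining the three relations above; z_end, l_e, dt and \(\gamma_{b}\) are placeholder values:

import math

c = 299_792_458.0  # speed of light [m/s]


def max_step_boosted(z_end, l_e, dt, gamma_b):
    """Estimate the number of boosted-frame iterations from the relations above."""
    beta_b = math.sqrt(1.0 - 1.0 / gamma_b**2)
    corner = l_e / ((1.0 - beta_b) * gamma_b)              # boosted-frame box edge
    T = (z_end / gamma_b - corner) / (c * (1.0 + beta_b))  # interaction time
    return int(T / dt)                                     # total number of iterations


# Placeholders: 0.2 m of plasma, left box edge at -120 um, gamma_boost = 10
print(max_step_boosted(z_end=0.2, l_e=-120.e-6, dt=1.5e-15, gamma_b=10.0))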

Plotfiles and snapshots

WarpX allows the data to be stored in different formats, such as plotfiles (following the yt guidelines), HDF5 and openPMD (following its standard). In this example, we dump plotfiles with boosted-frame information on the simulation particles and fields. We also request back-transformed diagnostics that transform this information back to the laboratory frame. The back-transformed results are assembled into snapshots as the run progresses, so it is best to make sure that the run does not end before the final snapshot is filled.

Maximum grid size and blocking factor

These parameters are carefully chosen to improve the code parallelization, load balancing and performance (Parameters: Inputs File) for each numerical configuration. They set the largest (max_grid_size) and smallest (blocking_factor) number of cells that can be contained in each AMReX box used to decompose the domain, and are described in detail in the AMReX documentation.
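
As a rough illustration of how these two parameters split the domain (AMReX applies additional rules when chopping grids, so treat this only as an order-of-magnitude estimate):

import math


def estimate_num_boxes(n_cell, max_grid_size):
    """Rough estimate: each dimension is chopped into chunks of at most
    `max_grid_size` cells; chunk sizes must also be multiples of the
    blocking factor, which the values below already satisfy."""
    return math.prod(math.ceil(n / max_grid_size) for n in n_cell)


# inputs_3d_boost: amr.n_cell = 32 32 256, amr.max_grid_size = 64
print(estimate_num_boxes([32, 32, 256], 64))  # 1 * 1 * 4 = 4 boxes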

Laser-Plasma Interaction

Laser-Ion Acceleration with a Planar Target

This example shows how to model laser-ion acceleration with planar targets of solid density [4, 5, 6]. The acceleration mechanism in this scenario depends on target parameters.

Although laser-ion acceleration requires full 3D modeling for an adequate description of the acceleration dynamics, especially the acceleration field lengths and decay times, this example is set up in 2D. 2D modeling can often provide a qualitative overview of the dynamics, but its main benefit is the reduced computational cost, since the plasma frequency (and Debye length) of the target plasma determines the resolution needed in laser-solid interaction modeling.
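
For a rough sense of the scales that set this resolution requirement, one can estimate the plasma frequency and collisionless skin depth of the fully ionized target used below (n = 30 n_c); how many cells are needed per skin depth is a convergence question and is not prescribed here:

import math

# SI constants
c = 299_792_458.0          # speed of light [m/s]
q_e = 1.602176634e-19      # elementary charge [C]
m_e = 9.1093837015e-31     # electron mass [kg]
eps0 = 8.8541878128e-12    # vacuum permittivity [F/m]

# Target density from the example: 30 times the critical density of 0.8 um light
n_c = 1.742e27             # [m^-3] critical density (see the input file comments)
n_e = 30.0 * n_c           # [m^-3] electron density

omega_p = math.sqrt(n_e * q_e**2 / (eps0 * m_e))  # plasma frequency [rad/s]
skin_depth = c / omega_p                          # collisionless skin depth [m]

print(f"omega_p   = {omega_p:.3e} rad/s")
print(f"c/omega_p = {skin_depth * 1e9:.1f} nm")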

Note

The resolution of this 2D case is extremely low by default. This includes spatial and temporal resolution, but also the number of macro-particles per cell representing the target density for proper phase space sampling. You will need a computing cluster for adequate resolution of the target density, see comments in the input file.

Run

This example can be run either as:

  • Python script: mpiexec -n 2 python3 PICMI_inputs_2d.py or

  • WarpX executable using an input file: mpiexec -n 2 warpx.2d inputs_2d

Tip

For MPI-parallel runs on computing clusters, change the prefix to mpiexec -n <no. of MPI ranks> ... or srun -n <no. of MPI ranks> ..., depending on the system and number of MPI ranks you want to allocate.

The input option warpx_numprocs / warpx.numprocs needs to be adjusted for parallel domain decomposition, to match the number of MPI ranks used. In order to use dynamic load balancing, deactivate this option and use the more general method of setting blocks (amr.blocking_factor / amr.max_grid_size) instead.

You can copy this file from Examples/Physics_applications/laser_ion/PICMI_inputs_2d.py.
#!/usr/bin/env python3

from pywarpx import picmi

# Physical constants
c = picmi.constants.c
q_e = picmi.constants.q_e

# We only run 100 steps for tests
# Disable `max_step` below to run until the physical `stop_time`.
max_step = 100
# time-scale with highly kinetic dynamics
stop_time = 0.2e-12

# proper resolution for 30 n_c (dx<=3.33nm) incl. acc. length
# (>=6x V100)
# --> choose larger `max_grid_size` and `blocking_factor` for 1 to 8 grids per GPU accordingly
#nx = 7488
#nz = 14720

# Number of cells
nx = 384
nz = 512

# Domain decomposition (deactivate `warpx_numprocs` in `picmi.Simulation` for this to take effect)
max_grid_size = 64
blocking_factor = 32

# Physical domain
xmin = -7.5e-06
xmax = 7.5e-06
zmin = -5.0e-06
zmax = 25.0e-06

# Create grid
grid = picmi.Cartesian2DGrid(
    number_of_cells=[nx, nz],
    lower_bound=[xmin, zmin],
    upper_bound=[xmax, zmax],
    lower_boundary_conditions=['open', 'open'],
    upper_boundary_conditions=['open', 'open'],
    lower_boundary_conditions_particles=['absorbing', 'absorbing'],
    upper_boundary_conditions_particles=['absorbing', 'absorbing'],
    warpx_max_grid_size=max_grid_size,
    warpx_blocking_factor=blocking_factor)

# Particles: plasma parameters
# critical plasma density
nc = 1.742e27  # [m^-3]  1.11485e21 * 1.e6 / 0.8**2
# number density: "fully ionized" electron density as reference
#   [material 1] cryogenic H2
n0 = 30.0  # [n_c]
#   [material 2] liquid crystal
# n0 = 192
#   [material 3] PMMA
# n0 = 230
#   [material 4] Copper (ion density: 8.49e28/m^3; times ionization level)
# n0 = 1400
plasma_density = n0 * nc
preplasma_L = 0.05e-6  # [m] scale length (>0)
preplasma_Lcut = 2.0e-6  # [m] hard cutoff from surface
plasma_r0 = 2.5e-6  # [m] radius or half-thickness
plasma_eps_z = 0.05e-6  # [m] small offset in z to make zmin, zmax interval larger than 2*(r0 + Lcut)
plasma_creation_limit_z = plasma_r0 + preplasma_Lcut + plasma_eps_z  # [m] upper limit in z for particle creation

plasma_xmin = None
plasma_ymin = None
plasma_zmin = -plasma_creation_limit_z
plasma_xmax = None
plasma_ymax = None
plasma_zmax = plasma_creation_limit_z

density_expression_str = f'{plasma_density}*((abs(z)<={plasma_r0}) + (abs(z)<{plasma_r0}+{preplasma_Lcut}) * (abs(z)>{plasma_r0}) * exp(-(abs(z)-{plasma_r0})/{preplasma_L}))'

slab_with_ramp_dist_hydrogen = picmi.AnalyticDistribution(
    density_expression=density_expression_str,
    lower_bound=[plasma_xmin, plasma_ymin, plasma_zmin],
    upper_bound=[plasma_xmax, plasma_ymax, plasma_zmax]
)

# thermal velocity spread for electrons in gamma*beta
ux_th = .01
uz_th = .01

slab_with_ramp_dist_electrons = picmi.AnalyticDistribution(
    density_expression=density_expression_str,
    lower_bound=[plasma_xmin, plasma_ymin, plasma_zmin],
    upper_bound=[plasma_xmax, plasma_ymax, plasma_zmax],
    # if `momentum_expressions` and `momentum_spread_expressions` are unset,
    # a Gaussian momentum distribution is assumed given that `rms_velocity` has any non-zero elements
    rms_velocity=[c*ux_th, 0., c*uz_th]  # thermal velocity spread in m/s
)

# TODO: add additional attributes orig_x and orig_z
electrons = picmi.Species(
    particle_type='electron',
    name='electrons',
    initial_distribution=slab_with_ramp_dist_electrons,
)

# TODO: add additional attributes orig_x and orig_z
hydrogen = picmi.Species(
    particle_type='proton',
    name='hydrogen',
    initial_distribution=slab_with_ramp_dist_hydrogen
)

# Laser
# e_max = a0 * 3.211e12 / lambda_0[mu]
#   a0 = 16, lambda_0 = 0.8mu -> e_max = 64.22 TV/m
e_max = 64.22e12
position_z = -4.0e-06
profile_t_peak = 50.e-15
profile_focal_distance = 4.0e-06
laser = picmi.GaussianLaser(
    wavelength=0.8e-06,
    waist=4.e-06,
    duration=30.e-15,
    focal_position=[0, 0, profile_focal_distance + position_z],
    centroid_position=[0, 0, position_z - c * profile_t_peak],
    propagation_direction=[0, 0, 1],
    polarization_direction=[1, 0, 0],
    E0=e_max,
    fill_in=False)
laser_antenna = picmi.LaserAntenna(
    position=[0., 0., position_z],
    normal_vector=[0, 0, 1])

# Electromagnetic solver
solver = picmi.ElectromagneticSolver(
    grid=grid,
    method='Yee',
    cfl=0.999,
    divE_cleaning=0,
    #warpx_pml_ncell=10
)

# Diagnostics
particle_diag = picmi.ParticleDiagnostic(
    name='Python_LaserIonAcc2d_plt',
    period=100,
    write_dir='./diags',
    warpx_format='openpmd',
    warpx_openpmd_backend='h5',
    # demonstration of a spatial and momentum filter
    warpx_plot_filter_function='(uz>=0) * (x<1.0e-6) * (x>-1.0e-6)'
)
# reduce resolution of output fields
coarsening_ratio = [4, 4]
ncell_field = []
for (ncell_comp, cr) in zip([nx,nz], coarsening_ratio):
    ncell_field.append(int(ncell_comp/cr))
field_diag = picmi.FieldDiagnostic(
    name='Python_LaserIonAcc2d_plt',
    grid=grid,
    period=100,
    number_of_cells=ncell_field,
    data_list=['B', 'E', 'J', 'rho', 'rho_electrons', 'rho_hydrogen'],
    write_dir='./diags',
    warpx_format='openpmd',
    warpx_openpmd_backend='h5'
)

particle_fw_diag = picmi.ParticleDiagnostic(
    name='openPMDfw',
    period=100,
    write_dir='./diags',
    warpx_format='openpmd',
    warpx_openpmd_backend='h5',
    warpx_plot_filter_function='(uz>=0) * (x<1.0e-6) * (x>-1.0e-6)'
)

particle_bw_diag = picmi.ParticleDiagnostic(
    name='openPMDbw',
    period=100,
    write_dir='./diags',
    warpx_format='openpmd',
    warpx_openpmd_backend='h5',
    warpx_plot_filter_function='(uz<0)'
)

# histograms with 2.0 degree acceptance angle in fw direction
# 2 deg * pi / 180 : 0.03490658503 rad
# half-angle +/-   : 0.017453292515 rad
histuH_rdiag = picmi.ReducedDiagnostic(
    diag_type='ParticleHistogram',
    name='histuH',
    period=100,
    species=hydrogen,
    bin_number=1000,
    bin_min=0.0,
    bin_max=0.474,  # 100 MeV protons
    histogram_function='u2=ux*ux+uy*uy+uz*uz; if(u2>0, sqrt(u2), 0.0)',
    filter_function='u2=ux*ux+uy*uy+uz*uz; if(u2>0, abs(acos(uz / sqrt(u2))) <= 0.017453, 0)')

histue_rdiag = picmi.ReducedDiagnostic(
    diag_type='ParticleHistogram',
    name='histue',
    period=100,
    species=electrons,
    bin_number=1000,
    bin_min=0.0,
    bin_max=197.0,  # 100 MeV electrons
    histogram_function='u2=ux*ux+uy*uy+uz*uz; if(u2>0, sqrt(u2), 0.0)',
    filter_function='u2=ux*ux+uy*uy+uz*uz; if(u2>0, abs(acos(uz / sqrt(u2))) <= 0.017453, 0)')

# just a test entry to make sure that the histogram filter is purely optional:
# this one just records uz of all hydrogen ions, independent of their pointing
histuzAll_rdiag = picmi.ReducedDiagnostic(
    diag_type='ParticleHistogram',
    name='histuzAll',
    period=100,
    species=hydrogen,
    bin_number=1000,
    bin_min=-0.474,
    bin_max=0.474,
    histogram_function='uz')

field_probe_z_rdiag = picmi.ReducedDiagnostic(
    diag_type='FieldProbe',
    name='FieldProbe_Z',
    period=100,
    integrate=0,
    probe_geometry='Line',
    x_probe=0.0,
    z_probe=-5.0e-6,
    x1_probe=0.0,
    z1_probe=25.0e-6,
    resolution=3712)

field_probe_scat_point_rdiag = picmi.ReducedDiagnostic(
    diag_type='FieldProbe',
    name='FieldProbe_ScatPoint',
    period=1,
    integrate=0,
    probe_geometry='Point',
    x_probe=0.0,
    z_probe=15.0e-6)

field_probe_scat_line_rdiag = picmi.ReducedDiagnostic(
    diag_type='FieldProbe',
    name='FieldProbe_ScatLine',
    period=100,
    integrate=1,
    probe_geometry='Line',
    x_probe=-2.5e-6,
    z_probe=15.0e-6,
    x1_probe=2.5e-6,
    z1_probe=15e-6,
    resolution=201)

load_balance_costs_rdiag = picmi.ReducedDiagnostic(
    diag_type='LoadBalanceCosts',
    name='LBC',
    period=100)

# Set up simulation
sim = picmi.Simulation(
    solver=solver,
    max_time=stop_time,  # need to remove `max_step` to run this far
    verbose=1,
    particle_shape='cubic',
    warpx_numprocs=[1, 2],  # deactivate `numprocs` for dynamic load balancing
    warpx_use_filter=1,
    warpx_load_balance_intervals=100,
    warpx_load_balance_costs_update='heuristic'
)

# Add plasma electrons
sim.add_species(
    electrons,
    layout=picmi.GriddedLayout(grid=grid, n_macroparticle_per_cell=[2,2])
    # for more realistic simulations, try to avoid that macro-particles represent more than 1 n_c
    #layout=picmi.GriddedLayout(grid=grid, n_macroparticle_per_cell=[4,8])
)

# Add hydrogen ions
sim.add_species(
    hydrogen,
    layout=picmi.GriddedLayout(grid=grid, n_macroparticle_per_cell=[2,2])
    # for more realistic simulations, try to avoid that macro-particles represent more than 1 n_c
    #layout=picmi.GriddedLayout(grid=grid, n_macroparticle_per_cell=[4,8])
)

# Add laser
sim.add_laser(
    laser,
    injection_method=laser_antenna)

# Add full diagnostics
sim.add_diagnostic(particle_diag)
sim.add_diagnostic(field_diag)
sim.add_diagnostic(particle_fw_diag)
sim.add_diagnostic(particle_bw_diag)
# Add reduced diagnostics
sim.add_diagnostic(histuH_rdiag)
sim.add_diagnostic(histue_rdiag)
sim.add_diagnostic(histuzAll_rdiag)
sim.add_diagnostic(field_probe_z_rdiag)
sim.add_diagnostic(field_probe_scat_point_rdiag)
sim.add_diagnostic(field_probe_scat_line_rdiag)
sim.add_diagnostic(load_balance_costs_rdiag)
# TODO: make ParticleHistogram2D available

# Write input file that can be used to run with the compiled version
sim.write_input_file(file_name='inputs_2d_picmi')

# Initialize inputs and WarpX instance
sim.initialize_inputs()
sim.initialize_warpx()

# Advance simulation until last time step
sim.step(max_step)
You can copy this file from Examples/Physics_applications/laser_ion/inputs_2d.
#################################
# Domain, Resolution & Numerics
#

# We only run 100 steps for tests
# Disable `max_step` below to run until the physical `stop_time`.
max_step = 100
# time-scale with highly kinetic dynamics
stop_time = 0.2e-12            # [s]
# time-scale for converged ion energy
#   notes: - effective acc. time depends on laser pulse
#          - ions will start to leave the box
#stop_time = 1.0e-12           # [s]

# quick tests at ultra-low res. (for CI, and local computer)
amr.n_cell = 384 512

# proper resolution for 10 n_c excl. acc. length
# (>=1x V100)
#amr.n_cell = 2688 3712

# proper resolution for 30 n_c (dx<=3.33nm) incl. acc. length
# (>=6x V100)
#amr.n_cell = 7488 14720

# simulation box, no MR
#   note: increase z (space & cells) for converging ion energy
amr.max_level = 0
geometry.dims = 2
geometry.prob_lo = -7.5e-6 -5.e-6
geometry.prob_hi =  7.5e-6 25.e-6

# Boundary condition
boundary.field_lo = pml pml
boundary.field_hi = pml pml

# Order of particle shape factors
algo.particle_shape = 3

# improved plasma stability for 2D with very low initial target temperature
# when using Esirkepov current deposition with energy-conserving field gather
interpolation.galerkin_scheme = 0

# numerical tuning
warpx.cfl = 0.999
warpx.use_filter = 1          # bilinear current/charge filter


#################################
# Performance Tuning
#
# simple tuning:
#   the numprocs product must be equal to the number of MPI ranks and splits
#   the domain on the coarsest level equally into grids;
#   slicing in the 2nd dimension is preferred for ideal performance
warpx.numprocs = 1 2   # 2 MPI ranks
#warpx.numprocs = 1 4  # 4 MPI ranks

# detail tuning instead of warpx.numprocs:
#   It is important to have enough cells in a block & grid, otherwise
#   performance will suffer.
#   Use larger values for GPUs, try to fill a GPU well with memory and place
#   few large grids on each device (you can go as low as 1 large grid / device
#   if you do not need load balancing).
#   Slicing in the 2nd dimension is preferred for ideal performance
#amr.blocking_factor = 64
#amr.max_grid_size_x = 2688
#amr.max_grid_size_y = 128  # this is confusingly named and means z in 2D

# load balancing
#   The grid & block parameters above are needed for load balancing:
#   an average of ~10 grids per MPI rank (and device) are a good granularity
#   to allow efficient load-balancing as the simulation evolves
algo.load_balance_intervals = 100
algo.load_balance_costs_update = Heuristic

# particle bin-sorting on GPU (ideal defaults not investigated in 2D)
#   Try larger values than the defaults below and report back! :)
#warpx.sort_intervals = 4    # default on CPU: -1 (off); on GPU: 4
#warpx.sort_bin_size = 1 1 1


#################################
# Target Profile
#

#   definitions for target extent and pre-plasma
my_constants.L    = 0.05e-6            # [m] scale length (>0)
my_constants.Lcut = 2.0e-6             # [m] hard cutoff from surface
my_constants.r0 = 2.5e-6               # [m] radius or half-thickness
my_constants.eps_z = 0.05e-6           # [m] small offset in z to make zmin, zmax interval larger than 2*(r0 + Lcut)
my_constants.zmax = r0 + Lcut + eps_z  # [m] upper limit in z for particle creation

particles.species_names = electrons hydrogen

# particle species
hydrogen.species_type = hydrogen
hydrogen.injection_style = NUniformPerCell
hydrogen.num_particles_per_cell_each_dim = 2 2
# for more realistic simulations, try to avoid that macro-particles represent more than 1 n_c
#hydrogen.num_particles_per_cell_each_dim = 4 8
hydrogen.momentum_distribution_type = at_rest
# minimum and maximum z position between which particles are initialized
# --> should be set for dense targets limit memory consumption during initialization
hydrogen.zmin = -zmax
hydrogen.zmax = zmax
hydrogen.profile = parse_density_function
hydrogen.addRealAttributes = orig_x orig_z
hydrogen.attribute.orig_x(x,y,z,ux,uy,uz,t) = "x"
hydrogen.attribute.orig_z(x,y,z,ux,uy,uz,t) = "z"

electrons.species_type = electron
electrons.injection_style = NUniformPerCell
electrons.num_particles_per_cell_each_dim = 2 2
# for more realistic simulations, try to avoid that macro-particles represent more than 1 n_c
#electrons.num_particles_per_cell_each_dim = 4 8
electrons.momentum_distribution_type = "gaussian"
electrons.ux_th = .01
electrons.uz_th = .01
# minimum and maximum z position between which particles are initialized
# --> should be set for dense targets limit memory consumption during initialization
electrons.zmin = -zmax
electrons.zmax = zmax

# ionization physics (field ionization/ADK)
#   [i1] none (fully pre-ionized):
electrons.profile = parse_density_function
#   [i2] field ionization (ADK):
#hydrogen.do_field_ionization = 1
#hydrogen.physical_element = H
#hydrogen.ionization_initial_level = 0
#hydrogen.ionization_product_species = electrons
#electrons.profile = constant
#electrons.density = 0.0

# collisional physics (binary MC model after Nanbu/Perez)
#collisions.collision_names = c_eH c_ee c_HH
#c_eH.species = electrons hydrogen
#c_ee.species = electrons electrons
#c_HH.species = hydrogen hydrogen
#c_eH.CoulombLog = 15.9
#c_ee.CoulombLog = 15.9
#c_HH.CoulombLog = 15.9

# number density: "fully ionized" electron density as reference
#   [material 1] cryogenic H2
my_constants.nc    = 1.742e27  # [m^-3]  1.11485e21 * 1.e6 / 0.8**2
my_constants.n0    = 30.0      # [n_c]
#   [material 2] liquid crystal
#my_constants.n0    = 192
#   [material 3] PMMA
#my_constants.n0    = 230
#   [material 4] Copper (ion density: 8.49e28/m^3; times ionization level)
#my_constants.n0    = 1400

# density profiles (target extent, pre-plasma and cutoffs defined above particle species list)

# [target 1] flat foil (thickness = 2*r0)
electrons.density_function(x,y,z) = "nc*n0*(
    if(abs(z)<=r0, 1.0, if(abs(z)<r0+Lcut, exp((-abs(z)+r0)/L), 0.0)) )"
hydrogen.density_function(x,y,z) = "nc*n0*(
    if(abs(z)<=r0, 1.0, if(abs(z)<r0+Lcut, exp((-abs(z)+r0)/L), 0.0)) )"

# [target 2] cylinder
#electrons.density_function(x,y,z) = "nc*n0*(
#    ((x*x+z*z)<=(r0*r0)) +
#    (sqrt(x*x+z*z)>r0)*(sqrt(x*x+z*z)<r0+Lcut)*exp( (-sqrt(x*x+z*z)+r0)/L ) )"
#hydrogen.density_function(x,y,z) = "nc*n0*(
#    ((x*x+z*z)<=(r0*r0)) +
#    (sqrt(x*x+z*z)>r0)*(sqrt(x*x+z*z)<r0+Lcut)*exp( (-sqrt(x*x+z*z)+r0)/L ) )"

# [target 3] sphere
#electrons.density_function(x,y,z) = "nc*n0*(
#    ((x*x+y*y+z*z)<=(r0*r0)) +
#    (sqrt(x*x+y*y+z*z)>r0)*(sqrt(x*x+y*y+z*z)<r0+Lcut)*exp( (-sqrt(x*x+y*y+z*z)+r0)/L ) )"
#hydrogen.density_function(x,y,z) = "nc*n0*(
#    ((x*x+y*y+z*z)<=(r0*r0)) +
#    (sqrt(x*x+y*y+z*z)>r0)*(sqrt(x*x+y*y+z*z)<r0+Lcut)*exp( (-sqrt(x*x+y*y+z*z)+r0)/L ) )"


#################################
# Laser Pulse Profile
#
lasers.names        = laser1
laser1.position     = 0. 0. -4.0e-6     # point the laser plane (antenna)
laser1.direction    = 0. 0. 1.          # the plane's (antenna's) normal direction
laser1.polarization = 1. 0. 0.          # the main polarization vector
laser1.a0           = 16.0              # maximum amplitude of the laser field [V/m]
laser1.wavelength   = 0.8e-6            # central wavelength of the laser pulse [m]
laser1.profile      = Gaussian
laser1.profile_waist = 4.e-6            # beam waist (E(w_0)=E_0/e) [m]
laser1.profile_duration = 30.e-15       # pulse length (E(tau)=E_0/e; tau=tau_E=FWHM_I/1.17741) [s]
laser1.profile_t_peak = 50.e-15         # time until peak intensity reached at the laser plane [s]
laser1.profile_focal_distance = 4.0e-6  # focal distance from the antenna [m]

# e_max = a0 * 3.211e12 / lambda_0[mu]
#   a0 = 16, lambda_0 = 0.8mu -> e_max = 64.22 TV/m


#################################
# Diagnostics
#
diagnostics.diags_names = diag1 openPMDfw openPMDbw

diag1.intervals = 100
diag1.diag_type = Full
diag1.fields_to_plot = Ex Ey Ez Bx By Bz jx jy jz rho rho_electrons rho_hydrogen
# reduce resolution of output fields
diag1.coarsening_ratio = 4 4
# demonstration of a spatial and momentum filter
diag1.electrons.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz>=0) * (x<1.0e-6) * (x>-1.0e-6)
diag1.hydrogen.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz>=0) * (x<1.0e-6) * (x>-1.0e-6)
diag1.format = openpmd
diag1.openpmd_backend = h5

openPMDfw.intervals = 100
openPMDfw.diag_type = Full
openPMDfw.fields_to_plot = Ex Ey Ez Bx By Bz jx jy jz rho rho_electrons rho_hydrogen
# reduce resolution of output fields
openPMDfw.coarsening_ratio = 4 4
openPMDfw.format = openpmd
openPMDfw.openpmd_backend = h5
# demonstration of a spatial and momentum filter
openPMDfw.electrons.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz>=0) * (x<1.0e-6) * (x>-1.0e-6)
openPMDfw.hydrogen.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz>=0) * (x<1.0e-6) * (x>-1.0e-6)

openPMDbw.intervals = 100
openPMDbw.diag_type = Full
openPMDbw.fields_to_plot = rho_hydrogen
# reduce resolution of output fields
openPMDbw.coarsening_ratio = 4 4
openPMDbw.format = openpmd
openPMDbw.openpmd_backend = h5
# demonstration of a momentum filter
openPMDbw.electrons.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz<0)
openPMDbw.hydrogen.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz<0)


#################################
# Reduced Diagnostics
#

# histograms with 2.0 degree acceptance angle in fw direction
# 2 deg * pi / 180 : 0.03490658503 rad
# half-angle +/-   : 0.017453292515 rad
warpx.reduced_diags_names                   = histuH histue histuzAll FieldProbe_Z FieldProbe_ScatPoint FieldProbe_ScatLine LBC PhaseSpaceIons PhaseSpaceElectrons

histuH.type                                 = ParticleHistogram
histuH.intervals                            = 100
histuH.species                              = hydrogen
histuH.bin_number                           = 1000
histuH.bin_min                              =  0.0
histuH.bin_max                              =  0.474  # 100 MeV protons
histuH.histogram_function(t,x,y,z,ux,uy,uz) = "u2=ux*ux+uy*uy+uz*uz; if(u2>0, sqrt(u2), 0.0)"
histuH.filter_function(t,x,y,z,ux,uy,uz) = "u2=ux*ux+uy*uy+uz*uz; if(u2>0, abs(acos(uz / sqrt(u2))) <= 0.017453, 0)"

histue.type                                 = ParticleHistogram
histue.intervals                            = 100
histue.species                              = electrons
histue.bin_number                           = 1000
histue.bin_min                              = 0.0
histue.bin_max                              = 197  # 100 MeV electrons
histue.histogram_function(t,x,y,z,ux,uy,uz) = "u2=ux*ux+uy*uy+uz*uz; if(u2>0, sqrt(u2), 0.0)"
histue.filter_function(t,x,y,z,ux,uy,uz) = "u2=ux*ux+uy*uy+uz*uz; if(u2>0, abs(acos(uz / sqrt(u2))) <= 0.017453, 0)"

# just a test entry to make sure that the histogram filter is purely optional:
# this one just records uz of all hydrogen ions, independent of their pointing
histuzAll.type                                 = ParticleHistogram
histuzAll.intervals                            = 100
histuzAll.species                              = hydrogen
histuzAll.bin_number                           = 1000
histuzAll.bin_min                              = -0.474
histuzAll.bin_max                              =  0.474
histuzAll.histogram_function(t,x,y,z,ux,uy,uz) = "uz"

FieldProbe_Z.type = FieldProbe
FieldProbe_Z.intervals = 100
FieldProbe_Z.integrate = 0
FieldProbe_Z.probe_geometry = Line
FieldProbe_Z.x_probe = 0.0
FieldProbe_Z.z_probe = -5.0e-6
FieldProbe_Z.x1_probe = 0.0
FieldProbe_Z.z1_probe = 25.0e-6
FieldProbe_Z.resolution = 3712

FieldProbe_ScatPoint.type = FieldProbe
FieldProbe_ScatPoint.intervals = 1
FieldProbe_ScatPoint.integrate = 0
FieldProbe_ScatPoint.probe_geometry = Point
FieldProbe_ScatPoint.x_probe = 0.0
FieldProbe_ScatPoint.z_probe = 15e-6

FieldProbe_ScatLine.type = FieldProbe
FieldProbe_ScatLine.intervals = 100
FieldProbe_ScatLine.integrate = 1
FieldProbe_ScatLine.probe_geometry = Line
FieldProbe_ScatLine.x_probe = -2.5e-6
FieldProbe_ScatLine.z_probe = 15e-6
FieldProbe_ScatLine.x1_probe = 2.5e-6
FieldProbe_ScatLine.z1_probe = 15e-6
FieldProbe_ScatLine.resolution = 201

# check computational load per box
LBC.type = LoadBalanceCosts
LBC.intervals = 100

PhaseSpaceIons.type                                 = ParticleHistogram2D
PhaseSpaceIons.intervals                            = 100
PhaseSpaceIons.species                              = hydrogen
PhaseSpaceIons.bin_number_abs                       = 1000
PhaseSpaceIons.bin_number_ord                       = 1000
PhaseSpaceIons.bin_min_abs                          = -5.e-6
PhaseSpaceIons.bin_max_abs                          = 25.e-6
PhaseSpaceIons.bin_min_ord                          = -0.474
PhaseSpaceIons.bin_max_ord                          = 0.474
PhaseSpaceIons.histogram_function_abs(t,x,y,z,ux,uy,uz,w) = "z"
PhaseSpaceIons.histogram_function_ord(t,x,y,z,ux,uy,uz,w) = "uz"
PhaseSpaceIons.value_function(t,x,y,z,ux,uy,uz,w) = "w"
# PhaseSpaceIons.filter_function(t,x,y,z,ux,uy,uz,w) = "u2=ux*ux+uy*uy+uz*uz; if(u2>0, abs(acos(uz / sqrt(u2))) <= 0.017453, 0)"

PhaseSpaceElectrons.type                                 = ParticleHistogram2D
PhaseSpaceElectrons.intervals                            = 100
PhaseSpaceElectrons.species                              = electrons
PhaseSpaceElectrons.bin_number_abs                       = 1000
PhaseSpaceElectrons.bin_number_ord                       = 1000
PhaseSpaceElectrons.bin_min_abs                          = -5.e-6
PhaseSpaceElectrons.bin_max_abs                          = 25.e-6
PhaseSpaceElectrons.bin_min_ord                          = -197
PhaseSpaceElectrons.bin_max_ord                          = 197
PhaseSpaceElectrons.histogram_function_abs(t,x,y,z,ux,uy,uz,w) = "z"
PhaseSpaceElectrons.histogram_function_ord(t,x,y,z,ux,uy,uz,w) = "uz"
PhaseSpaceElectrons.value_function(t,x,y,z,ux,uy,uz,w) = "w"
PhaseSpaceElectrons.filter_function(t,x,y,z,ux,uy,uz,w) = "sqrt(x*x+y*y) < 1e-6"

#################################
# Physical Background
#
# This example is modeled after a target similar to the hydrogen jet here:
#   [1] https://doi.org/10.1038/s41598-017-10589-3
#   [2] https://arxiv.org/abs/1903.06428
#
authors = "Axel Huebl <axelhuebl@lbl.gov>"
Analyze
Longitudinal phase space of forward-moving electrons in a 2 degree opening angle.

Longitudinal phase space of forward-moving protons in a 2 degree opening angle.

Time-resolved electron phase space analysis as in Fig. 3 gives information about, e.g., how laser energy is locally converted into electron kinetic energy. Later in time, ion phase spaces like the one in Fig. 4 can reveal where accelerated ion populations originate.

Script analysis_histogram_2D.py
You can copy this file from Examples/Physics_applications/laser_ion/analysis_histogram_2D.py.
#!/usr/bin/env python3

# This script displays a 2D histogram.

import argparse

import matplotlib.colors as colors
import matplotlib.pyplot as plt
import numpy as np
from openpmd_viewer import OpenPMDTimeSeries

parser = argparse.ArgumentParser(description='Process a 2D histogram name and an integer.')
parser.add_argument("hist2D", help="Folder name of the reduced diagnostic.")
parser.add_argument("iter", help="Iteration number of the simulation that is plotted. Enter a number from the list of iterations or 'All' if you want all plots.")
args = parser.parse_args()

path = 'diags/reducedfiles/' + args.hist2D

ts = OpenPMDTimeSeries(path)

it = ts.iterations
data, info = ts.get_field(field="data", iteration=0, plot=True)
print('The available iterations of the simulation are:', it)
print('The axes of the histogram are (0: ordinate ; 1: abscissa):', info.axes)
print('The data shape is:', data.shape)

# Add the simulation time to the title once this information
# is available in the "info" FieldMetaInformation object.
if args.iter == 'All' :
    for it_idx, i in enumerate(it):
        plt.figure()
        data, info = ts.get_field(field="data", iteration=i, plot=False)
        abscissa_name = info.axes[1]  # This might be 'z' or something else
        abscissa_values = getattr(info, abscissa_name, None)
        ordinate_name = info.axes[0]  # This might be 'z' or something else
        ordinate_values = getattr(info, ordinate_name, None)

        plt.pcolormesh(abscissa_values/1e-6, ordinate_values, data, norm=colors.LogNorm(), rasterized=True)
        plt.title(args.hist2D + f" Time: {ts.t[it_idx]:.2e} s  (Iteration: {i:d})")
        plt.xlabel(info.axes[1]+r' ($\mu$m)')
        plt.ylabel(info.axes[0]+r' ($m_\mathrm{species} c$)')
        plt.colorbar()
        plt.tight_layout()
        plt.savefig('Histogram_2D_' + args.hist2D + '_iteration_' + str(i) + '.png')
else :
    i = int(args.iter)
    it_idx = np.where(i == it)[0][0]
    plt.figure()
    data, info = ts.get_field(field="data", iteration=i, plot=False)
    abscissa_name = info.axes[1]  # This might be 'z' or something else
    abscissa_values = getattr(info, abscissa_name, None)
    ordinate_name = info.axes[0]  # This might be 'z' or something else
    ordinate_values = getattr(info, ordinate_name, None)

    plt.pcolormesh(abscissa_values/1e-6, ordinate_values, data, norm=colors.LogNorm(), rasterized=True)
    plt.title(args.hist2D + f" Time: {ts.t[it_idx]:.2e} s  (Iteration: {i:d})")
    plt.xlabel(info.axes[1]+r' ($\mu$m)')
    plt.ylabel(info.axes[0]+r' ($m_\mathrm{species} c$)')
    plt.colorbar()
    plt.tight_layout()
    plt.savefig('Histogram_2D_' + args.hist2D + '_iteration_' + str(i) + '.png')
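
For example, assuming the reduced diagnostics are named as in the inputs file above (PhaseSpaceIons, PhaseSpaceElectrons), all iterations of the ion phase space can be plotted with:

python3 analysis_histogram_2D.py PhaseSpaceIons All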
Visualize

Note

The following images for densities and electromagnetic fields were created with a run on 64 NVIDIA A100 GPUs, using a total of nx = 8192 by nz = 16384 cells and 64 particles per cell per species.

Particle densities for electrons (top), protons (middle), and electrons again in logarithmic scale (bottom).

Particle density output illustrates the evolution of the target in time and space. Logarithmic scales can help to identify where the target becomes transparent to the laser pulse (bottom panel of the density figure above).
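For reference, the critical density used for normalization in the plotting script below is \(n_c = \epsilon_0 m_e \omega_L^2 / e^2 \approx 1.74\times 10^{27}\,\mathrm{m}^{-3}\) for the 800 nm laser wavelength assumed there; classically, plasma with electron density below \(n_c\) is underdense and transmits the laser.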

Electromagnetic field visualization for \(E_x\) (top), \(B_y\) (middle), and \(E_z\) (bottom).

Electromagnetic field output shows where the laser field is strongest at a given point in time, and where accelerating fields build up (Fig. 5).

Script plot_2d.py
You can copy this file from Examples/Physics_applications/laser_ion/plot_2d.py.
#!/usr/bin/env python3

# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Marco Garten
# License: BSD-3-Clause-LBNL
#
# This script plots the densities and fields of a 2D laser-ion acceleration simulation.


import argparse
import os
import re

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy.constants as sc
from matplotlib.colors import TwoSlopeNorm
from openpmd_viewer import OpenPMDTimeSeries

plt.rcParams.update({'font.size':16})

def create_analysis_dir(directory):
    if not os.path.exists(directory):
        os.makedirs(directory)


def visualize_density_iteration(ts, iteration, out_dir):
    """
    Visualize densities and fields of a single iteration.

    :param ts: OpenPMDTimeSeries
    :param iteration: Output iteration (simulation timestep)
    :param out_dir: Directory for PNG output
    :return:
    """
    # Physics parameters
    lambda_L = 800e-9  # Laser wavelength in meters
    omega_L = 2 * np.pi * sc.c / lambda_L  # Laser angular frequency in rad/s
    n_c = sc.m_e * sc.epsilon_0 * omega_L**2 / sc.elementary_charge**2  # Critical plasma density in meters^(-3)
    micron = 1e-6

    # Simulation parameters
    n_e0 = 30  # nominal target electron density, in units of the critical density n_c
    n_max = 2 * n_e0
    nr = 1  # Number to decrease resolution

    # Data fetching
    it = iteration
    ii = np.where(ts.iterations == it)[0][0]

    time = ts.t[ii]
    rho_e, rho_e_info = ts.get_field(field="rho_electrons", iteration=it)
    rho_d, rho_d_info = ts.get_field(field="rho_hydrogen", iteration=it)

    # Rescale to critical density
    rho_e = rho_e / (sc.elementary_charge * n_c)
    rho_d = rho_d / (sc.elementary_charge * n_c)

    # Axes setup
    fig, axs = plt.subplots(3, 1, figsize=(5, 8))
    xax, zax = rho_e_info.x, rho_e_info.z

    # Plotting
    # Electron density
    im0 = axs[0].pcolormesh(zax[::nr]/micron, xax[::nr]/micron, -rho_e.T[::nr, ::nr],
                            vmin=0, vmax=n_max, cmap="Reds", rasterized=True)
    plt.colorbar(im0, ax=axs[0], label=r"$n_\mathrm{\,e}\ (n_\mathrm{c})$")

    # Hydrogen density
    im1 = axs[1].pcolormesh(zax[::nr]/micron, xax[::nr]/micron, rho_d.T[::nr, ::nr],
                            vmin=0, vmax=n_max, cmap="Blues", rasterized=True)
    plt.colorbar(im1, ax=axs[1], label=r"$n_\mathrm{\,H}\ (n_\mathrm{c})$")

    # Masked electron density
    divnorm = TwoSlopeNorm(vmin=-7., vcenter=0., vmax=2)
    masked_data = np.ma.masked_where(rho_e.T == 0, rho_e.T)
    my_cmap = plt.cm.PiYG_r.copy()
    my_cmap.set_bad(color='black')
    im2 = axs[2].pcolormesh(zax[::nr]/micron, xax[::nr]/micron, np.log(-masked_data[::nr, ::nr]),
                            norm=divnorm, cmap=my_cmap, rasterized=True)
    plt.colorbar(im2, ax=axs[2], ticks=[-6, -3, 0, 1, 2], extend='both',
                       label=r"$\log n_\mathrm{\,e}\ (n_\mathrm{c})$")

    # Axis labels and title
    for ax in axs:
        ax.set_aspect(1.0)
        ax.set_ylabel(r"$x$ ($\mu$m)")
    for ax in axs[:-1]:
        ax.set_xticklabels([])
    axs[2].set_xlabel(r"$z$ ($\mu$m)")
    fig.suptitle(f"Iteration: {it}, Time: {time/1e-15:.1f} fs")

    plt.tight_layout()

    plt.savefig(f"{out_dir}/densities_{it:06d}.png")

def visualize_field_iteration(ts, iteration, out_dir):
    """
    Visualize electromagnetic fields of a single iteration.

    :param ts: OpenPMDTimeSeries
    :param iteration: Output iteration (simulation timestep)
    :param out_dir: Directory for PNG output
    :return:
    """
    # Additional parameters
    nr = 1  # Number to decrease resolution
    micron = 1e-6

    # Data fetching
    it = iteration
    ii = np.where(ts.iterations == it)[0][0]
    time = ts.t[ii]

    Ex, Ex_info = ts.get_field(field="E", coord="x", iteration=it)
    Exmax = np.max(np.abs([np.min(Ex),np.max(Ex)]))
    By, By_info = ts.get_field(field="B", coord="y", iteration=it)
    Bymax = np.max(np.abs([np.min(By),np.max(By)]))
    Ez, Ez_info = ts.get_field(field="E", coord="z", iteration=it)
    Ezmax = np.max(np.abs([np.min(Ez),np.max(Ez)]))

    # Axes setup
    fig,axs = plt.subplots(3, 1, figsize=(5, 8))
    xax, zax = Ex_info.x, Ex_info.z

    # Plotting
    im0 = axs[0].pcolormesh(
        zax[::nr]/micron,xax[::nr]/micron,Ex.T[::nr,::nr],
        vmin=-Exmax, vmax=Exmax,
        cmap="RdBu", rasterized=True)

    plt.colorbar(im0, ax=axs[0], label=r"$E_x$ (V/m)")

    im1 = axs[1].pcolormesh(
        zax[::nr]/micron,xax[::nr]/micron,By.T[::nr,::nr],
        vmin=-Bymax, vmax=Bymax,
        cmap="RdBu", rasterized=True)
    plt.colorbar(im1,ax=axs[1], label=r"$B_y$ (T)")

    im2 = axs[2].pcolormesh(
        zax[::nr]/micron,xax[::nr]/micron,Ez.T[::nr,::nr],
        vmin=-Ezmax, vmax=Ezmax,
        cmap="RdBu", rasterized=True)
    plt.colorbar(im2,ax=axs[2],label=r"$E_z$ (V/m)")

    # Axis labels and title
    for ax in axs:
        ax.set_aspect(1.0)
        ax.set_ylabel(r"$x$ ($\mu$m)")
    for ax in axs[:-1]:
        ax.set_xticklabels([])
    axs[2].set_xlabel(r"$z$ ($\mu$m)")
    fig.suptitle(f"Iteration: {it}, Time: {time/1e-15:.1f} fs")

    plt.tight_layout()

    plt.savefig(f"{out_dir}/fields_{it:06d}.png")

def visualize_particle_histogram_iteration(diag_name="histuH", species="hydrogen", iteration=1000, out_dir="./analysis"):

    it = iteration

    if species == "hydrogen":
        # proton rest energy in eV
        mc2 = sc.m_p/sc.electron_volt * sc.c**2
    elif species == "electron":
        mc2 = sc.m_e/sc.electron_volt * sc.c**2
    else:
        raise NotImplementedError("The only implemented presets for this analysis script are `electron` or `hydrogen`.")

    fs = 1.e-15
    MeV = 1.e6

    df = pd.read_csv(f"./diags/reducedfiles/{diag_name}.txt",delimiter=r'\s+')
    # the columns look like this:
    #     #[0]step() [1]time(s) [2]bin1=0.000220() [3]bin2=0.000660() [4]bin3=0.001100()

    # matches words, strings surrounded by " ' ", dots, minus signs and e for scientific notation in numbers
    nested_list = [re.findall(r"[\w'\.]+",col) for col in df.columns]

    index = pd.MultiIndex.from_tuples(nested_list, names=('column#', 'name', 'bin value'))

    df.columns = (index)

    steps = df.values[:, 0].astype(int)
    ii = np.where(steps == it)[0][0]
    time = df.values[:, 1]
    data = df.values[:, 2:]
    edge_vals = np.array([float(row[2]) for row in df.columns[2:]])
    edges_MeV = (np.sqrt(edge_vals**2 + 1)-1) * mc2 / MeV

    time_fs = time / fs

    fig,ax = plt.subplots(1,1)

    ax.plot(edges_MeV, data[ii, :])
    ax.set_yscale("log")
    ax.set_ylabel(r"d$N$/d$\mathcal{E}$ (arb. u.)")
    ax.set_xlabel(r"$\mathcal{E}$ (MeV)")

    fig.suptitle(f"{species} - Iteration: {it}, Time: {time_fs[ii]:.1f} fs")

    plt.tight_layout()
    plt.savefig(f"./{out_dir}/{diag_name}_{it:06d}.png")


if __name__ == "__main__":

    # Argument parsing
    parser = argparse.ArgumentParser(description='Visualize Laser-Ion Accelerator Densities and Fields')
    parser.add_argument('-d', '--diag_dir', type=str, default='./diags/diag1', help='Directory containing density and field diagnostics')
    parser.add_argument('-i', '--iteration', type=int, default=None, help='Specific iteration to visualize')
    parser.add_argument('-hn', '--histogram_name', type=str, default='histuH', help='Name of histogram diagnostic to visualize')
    parser.add_argument('-hs', '--histogram_species', type=str, default='hydrogen', help='Particle species in the visualized histogram diagnostic')
    args = parser.parse_args()

    # Create analysis directory
    analysis_dir = 'analysis'
    create_analysis_dir(analysis_dir)

    # Loading the time series
    ts = OpenPMDTimeSeries(args.diag_dir)

    if args.iteration is not None:
        visualize_density_iteration(ts, args.iteration, analysis_dir)
        visualize_field_iteration(ts, args.iteration, analysis_dir)
        visualize_particle_histogram_iteration(args.histogram_name, args.histogram_species, args.iteration, analysis_dir)
    else:
        for it in ts.iterations:
            visualize_density_iteration(ts, it, analysis_dir)
            visualize_field_iteration(ts, it, analysis_dir)
            visualize_particle_histogram_iteration(args.histogram_name, args.histogram_species, it, analysis_dir)

Plasma-Mirror

This example shows how to model a plasma mirror, using a planar target of solid density [7, 8].

Although laser-solid interaction modeling generally requires full 3D simulations to adequately capture the dynamics at play, this example is run in 2D. 2D modeling provides a qualitative overview of the dynamics while greatly reducing the computational cost, since the plasma frequency (and Debye length) of the surface plasma sets the resolution needed in laser-solid interaction modeling.
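
To get a feel for these resolution scales, the following short Python sketch (an illustrative aside, not part of the example) estimates the electron plasma frequency, skin depth, and Debye length for the peak target density of the inputs file below; the electron temperature is an assumed value chosen purely for illustration.

import numpy as np
import scipy.constants as sc

n_e = 2 * 1.74e27   # peak electron density in m^-3 (2*nc, cf. my_constants.nc below)
Te_eV = 1.0e3       # assumed electron temperature in eV -- illustrative only, not set by the example

w_pe = np.sqrt(n_e * sc.e**2 / (sc.epsilon_0 * sc.m_e))            # electron plasma frequency (rad/s)
skin_depth = sc.c / w_pe                                           # electron skin depth (m)
lambda_D = np.sqrt(sc.epsilon_0 * Te_eV * sc.e / (n_e * sc.e**2))  # Debye length (m)

print(f"omega_pe   = {w_pe:.2e} rad/s")
print(f"c/omega_pe = {skin_depth * 1e9:.1f} nm")
print(f"lambda_D   = {lambda_D * 1e9:.1f} nm")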

Note

TODO: The Python (PICMI) input file needs to be created.

Run

This example can be run either as:

  • Python script: (TODO) or

  • WarpX executable using an input file: warpx.2d inputs_2d

For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.

Note

TODO: This input file should be created following the inputs_2d file.

You can copy this file from Examples/Physics_applications/plasma_mirror/inputs_2d.
#################################
####### GENERAL PARAMETERS ######
#################################
max_step = 1000
amr.n_cell =  1024 512
amr.max_grid_size = 128
amr.blocking_factor = 32
amr.max_level = 0
geometry.dims = 2
geometry.prob_lo = -100.e-6   0.     # physical domain
geometry.prob_hi =  100.e-6   100.e-6
warpx.verbose = 1
warpx.serialize_initial_conditions = 1

#################################
####### Boundary condition ######
#################################
boundary.field_lo = pml pml
boundary.field_hi = pml pml

#################################
############ NUMERICS ###########
#################################
my_constants.zc    = 20.e-6
my_constants.zp    = 20.05545177444479562e-6
my_constants.lgrad = .08e-6
my_constants.nc    = 1.74e27
my_constants.zp2   = 24.e-6
my_constants.zc2   = 24.05545177444479562e-6
warpx.cfl = 1.0
warpx.use_filter = 1
algo.load_balance_intervals = 66

# Order of particle shape factors
algo.particle_shape = 3

#################################
############ PLASMA #############
#################################
particles.species_names = electrons ions

electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = NUniformPerCell
electrons.num_particles_per_cell_each_dim = 2 2
electrons.momentum_distribution_type = "gaussian"
electrons.ux_th = .01
electrons.uz_th = .01
electrons.zmin = "zc-lgrad*log(400)"
electrons.zmax = 25.47931e-6
electrons.profile = parse_density_function
electrons.density_function(x,y,z) = "if(z<zp, nc*exp((z-zc)/lgrad), if(z<=zp2, 2.*nc, nc*exp(-(z-zc2)/lgrad)))"

ions.charge = q_e
ions.mass = m_p
ions.injection_style = NUniformPerCell
ions.num_particles_per_cell_each_dim = 2 2
ions.momentum_distribution_type = "at_rest"
ions.zmin = 19.520e-6
ions.zmax = 25.47931e-6
ions.profile = parse_density_function
ions.density_function(x,y,z) = "if(z<zp, nc*exp((z-zc)/lgrad), if(z<=zp2, 2.*nc, nc*exp(-(z-zc2)/lgrad)))"

#################################
############# LASER #############
#################################
lasers.names        = laser1
laser1.position     = 0. 0. 5.e-6 # This point is on the laser plane
laser1.direction    = 0. 0. 1.     # The plane normal direction
laser1.polarization = 1. 0. 0.     # The main polarization vector
laser1.e_max        = 4.e12        # Maximum amplitude of the laser field (in V/m)
laser1.wavelength = 0.8e-6         # The wavelength of the laser (in meters)
laser1.profile      = Gaussian
laser1.profile_waist = 5.e-6      # The waist of the laser (in meters)
laser1.profile_duration = 15.e-15  # The duration of the laser (in seconds)
laser1.profile_t_peak = 25.e-15    # The time at which the laser reaches its peak (in seconds)
laser1.profile_focal_distance = 15.e-6  # Focal distance from the antenna (in meters)

# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = 10
diag1.diag_type = Full
Analyze

Note

This section is TODO.

Visualize

Note

This section is TODO.

Particle Accelerator & Beam Physics

Gaussian Beam

This example initializes a Gaussian beam distribution.

Run

This example can be run either as:

  • Python script: python3 PICMI_inputs_gaussian_beam.py or

  • WarpX executable using an input file: (TODO)

For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.

You can copy this file from Examples/Tests/gaussian_beam/PICMI_inputs_gaussian_beam.py.
#!/usr/bin/env python3

#from warp import picmi
import argparse

from pywarpx import picmi

parser = argparse.ArgumentParser(description="Gaussian beam PICMI example")

parser.add_argument('--diagformat', type=str,
                    help='Format of the full diagnostics (plotfile, openpmd, ascent, sensei, ...)',
                    default='plotfile')
parser.add_argument('--fields_to_plot', type=str,
                    help='List of fields to write to diagnostics',
                    default=['E', 'B', 'J', 'part_per_cell'],
                    nargs = '*')

args = parser.parse_args()

constants = picmi.constants

nx = 32
ny = 32
nz = 32

xmin = -2.
xmax = +2.
ymin = -2.
ymax = +2.
zmin = -2.
zmax = +2.

number_sim_particles = 32768
total_charge = 8.010883097437485e-07

beam_rms_size = 0.25
electron_beam_divergence = -0.04*constants.c

em_order = 3

grid = picmi.Cartesian3DGrid(number_of_cells = [nx, ny, nz],
                             lower_bound = [xmin, ymin, zmin],
                             upper_bound = [xmax, ymax, zmax],
                             lower_boundary_conditions = ['periodic', 'periodic', 'open'],
                             upper_boundary_conditions = ['periodic', 'periodic', 'open'],
                             lower_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
                             upper_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
                             warpx_max_grid_size=16)

solver = picmi.ElectromagneticSolver(grid = grid,
                                     cfl = 1.,
                                     stencil_order=[em_order,em_order,em_order])

electron_beam = picmi.GaussianBunchDistribution(n_physical_particles = total_charge/constants.q_e,
                                                rms_bunch_size = [beam_rms_size, beam_rms_size, beam_rms_size],
                                                velocity_divergence = [electron_beam_divergence, electron_beam_divergence, electron_beam_divergence])

proton_beam = picmi.GaussianBunchDistribution(n_physical_particles = total_charge/constants.q_e,
                                              rms_bunch_size = [beam_rms_size, beam_rms_size, beam_rms_size])

electrons = picmi.Species(particle_type='electron', name='electrons', initial_distribution=electron_beam)
protons = picmi.Species(particle_type='proton', name='protons', initial_distribution=proton_beam)

field_diag1 = picmi.FieldDiagnostic(name = 'diag1',
                                    grid = grid,
                                    period = 10,
                                    data_list = args.fields_to_plot,
                                    warpx_format = args.diagformat,
                                    write_dir = '.',
                                    warpx_file_prefix = 'Python_gaussian_beam_plt')

part_diag1 = picmi.ParticleDiagnostic(name = 'diag1',
                                      period = 10,
                                      species = [electrons, protons],
                                      data_list = ['weighting', 'momentum'],
                                      warpx_format = args.diagformat)

sim = picmi.Simulation(solver = solver,
                       max_steps = 10,
                       verbose = 1,
                       warpx_current_deposition_algo = 'direct',
                       warpx_use_filter = 0)

sim.add_species(electrons, layout=picmi.PseudoRandomLayout(n_macroparticles=number_sim_particles))
sim.add_species(protons, layout=picmi.PseudoRandomLayout(n_macroparticles=number_sim_particles))

sim.add_diagnostic(field_diag1)
sim.add_diagnostic(part_diag1)

# write_inputs will create an inputs file that can be used to run
# with the compiled version.
#sim.write_input_file(file_name = 'inputs_from_PICMI')

# Alternatively, sim.step will run WarpX, controlling it from Python
sim.step()

Note

TODO: This input file should be created following the PICMI_inputs_gaussian_beam.py file.

Analyze

Note

This section is TODO.

Visualize

Note

This section is TODO.

Beam-beam collision

This example shows how to simulate the collision between two ultra-relativistic particle beams. This is representative of what happens at the interaction point of a linear collider. We consider a right-propagating electron bunch colliding against a left-propagating positron bunch.

We turn on the Quantum Synchrotron QED module for photon emission (also known as beamstrahlung in the collider community) and the Breit-Wheeler QED module for the generation of electron-positron pairs (also known as coherent pair generation in the collider community).

To solve for the electromagnetic field we use the nodal version of the electrostatic relativistic solver. This solver computes the average velocity of each species and solves the corresponding relativistic Poisson equation (see the WarpX documentation for warpx.do_electrostatic = relativistic for more detail). This solver accurately reproduces the subtle cancellation that occurs for some components of the E + v x B terms, which is crucial in simulations of relativistic particles.

This example is based on the work of Yakimenko et al. [9].
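
Before looking at the full inputs file, the following short Python sketch (an illustrative aside, not part of the example) evaluates the main derived beam quantities that are defined via my_constants in the inputs file below, namely the normalized beam momentum, the peak bunch density, and the beam plasma frequency used to size the simulation box.

import numpy as np
import scipy.constants as sc

beam_energy = 125.0e9 * sc.e               # beam energy: 125 GeV, in joules
mc2 = sc.m_e * sc.c**2                     # electron rest energy
beam_uz = beam_energy / mc2                # normalized longitudinal momentum (~ Lorentz factor)

beam_charge = 0.14e-9                      # bunch charge (C)
sigmax = sigmay = sigmaz = 10.0e-9         # rms bunch sizes (m)
n0 = beam_charge / (sc.e * sigmax * sigmay * sigmaz * (2.0 * np.pi)**1.5)  # peak bunch density (m^-3)
omegab = np.sqrt(n0 * sc.e**2 / (sc.epsilon_0 * sc.m_e))                   # beam plasma frequency (rad/s)

print(f"beam_uz (~gamma) = {beam_uz:.3e}")
print(f"n0               = {n0:.3e} m^-3")
print(f"c/omegab         = {sc.c / omegab:.3e} m")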

Run

The PICMI input file is not available for this example yet.

For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.

You can copy this file from Examples/Physics_applications/beam-beam_collision/inputs.
#################################
########## MY CONSTANTS #########
#################################
my_constants.mc2 = m_e*clight*clight
my_constants.nano = 1.0e-9
my_constants.GeV = q_e*1.e9

# BEAMS
my_constants.beam_energy = 125.*GeV
my_constants.beam_uz = beam_energy/(mc2)
my_constants.beam_charge = 0.14*nano
my_constants.sigmax = 10*nano
my_constants.sigmay = 10*nano
my_constants.sigmaz = 10*nano
my_constants.beam_uth = 0.1/100.*beam_uz
my_constants.n0 = beam_charge / (q_e * sigmax * sigmay * sigmaz * (2.*pi)**(3./2.))
my_constants.omegab = sqrt(n0 * q_e**2 / (epsilon0*m_e))
my_constants.mux = 0.0
my_constants.muy = 0.0
my_constants.muz = -0.5*Lz+3.2*sigmaz

# BOX
my_constants.Lx = 100.0*clight/omegab
my_constants.Ly = 100.0*clight/omegab
my_constants.Lz = 180.0*clight/omegab

# for a full scale simulation use: nx, ny, nz = 512, 512, 1024
my_constants.nx = 64
my_constants.ny = 64
my_constants.nz = 128


# TIME
my_constants.T = 0.7*Lz/clight
my_constants.dt = sigmaz/clight/10.

# DIAGS
my_constants.every_red = 1.
warpx.used_inputs_file = warpx_used_inputs.txt

#################################
####### GENERAL PARAMETERS ######
#################################
stop_time = T
amr.n_cell = nx ny nz
amr.max_grid_size = 128
amr.blocking_factor = 2
amr.max_level = 0
geometry.dims = 3
geometry.prob_lo = -0.5*Lx -0.5*Ly -0.5*Lz
geometry.prob_hi =  0.5*Lx  0.5*Ly  0.5*Lz

#################################
######## BOUNDARY CONDITION #####
#################################
boundary.field_lo = PEC PEC PEC
boundary.field_hi = PEC PEC PEC
boundary.particle_lo = Absorbing Absorbing Absorbing
boundary.particle_hi = Absorbing Absorbing Absorbing

#################################
############ NUMERICS ###########
#################################
warpx.do_electrostatic = relativistic
warpx.const_dt = dt
warpx.grid_type = collocated
algo.particle_shape = 3
algo.load_balance_intervals=100
algo.particle_pusher = vay

#################################
########### PARTICLES ###########
#################################
particles.species_names = beam1 beam2 pho1 pho2 ele1 pos1 ele2 pos2
particles.photon_species = pho1 pho2

beam1.species_type = electron
beam1.injection_style = NUniformPerCell
beam1.num_particles_per_cell_each_dim = 1 1 1
beam1.profile = parse_density_function
beam1.density_function(x,y,z) = "n0 *  exp(-(x-mux)**2/(2*sigmax**2))  * exp(-(y-muy)**2/(2*sigmay**2)) * exp(-(z-muz)**2/(2*sigmaz**2))"
beam1.density_min = n0 / 1e3
beam1.momentum_distribution_type = gaussian
beam1.uz_m = beam_uz
beam1.uy_m = 0.0
beam1.ux_m = 0.0
beam1.ux_th = beam_uth
beam1.uy_th = beam_uth
beam1.uz_th = beam_uth
beam1.initialize_self_fields = 1
beam1.self_fields_required_precision = 5e-10
beam1.self_fields_max_iters = 10000
beam1.do_qed_quantum_sync = 1
beam1.qed_quantum_sync_phot_product_species = pho1
beam1.do_classical_radiation_reaction = 0

beam2.species_type = positron
beam2.injection_style = NUniformPerCell
beam2.num_particles_per_cell_each_dim = 1 1 1
beam2.profile = parse_density_function
beam2.density_function(x,y,z) = "n0 *  exp(-(x-mux)**2/(2*sigmax**2))  * exp(-(y-muy)**2/(2*sigmay**2)) * exp(-(z+muz)**2/(2*sigmaz**2))"
beam2.density_min = n0 / 1e3
beam2.momentum_distribution_type = gaussian
beam2.uz_m = -beam_uz
beam2.uy_m = 0.0
beam2.ux_m = 0.0
beam2.ux_th = beam_uth
beam2.uy_th = beam_uth
beam2.uz_th = beam_uth
beam2.initialize_self_fields = 1
beam2.self_fields_required_precision = 5e-10
beam2.self_fields_max_iters = 10000
beam2.do_qed_quantum_sync = 1
beam2.qed_quantum_sync_phot_product_species = pho2
beam2.do_classical_radiation_reaction = 0

pho1.species_type = photon
pho1.injection_style = none
pho1.do_qed_breit_wheeler = 1
pho1.qed_breit_wheeler_ele_product_species = ele1
pho1.qed_breit_wheeler_pos_product_species = pos1

pho2.species_type = photon
pho2.injection_style = none
pho2.do_qed_breit_wheeler = 1
pho2.qed_breit_wheeler_ele_product_species = ele2
pho2.qed_breit_wheeler_pos_product_species = pos2

ele1.species_type = electron
ele1.injection_style = none
ele1.self_fields_required_precision = 1e-11
ele1.self_fields_max_iters = 10000
ele1.do_qed_quantum_sync = 1
ele1.qed_quantum_sync_phot_product_species = pho1
ele1.do_classical_radiation_reaction = 0

pos1.species_type = positron
pos1.injection_style = none
pos1.self_fields_required_precision = 1e-11
pos1.self_fields_max_iters = 10000
pos1.do_qed_quantum_sync = 1
pos1.qed_quantum_sync_phot_product_species = pho1
pos1.do_classical_radiation_reaction = 0

ele2.species_type = electron
ele2.injection_style = none
ele2.self_fields_required_precision = 1e-11
ele2.self_fields_max_iters = 10000
ele2.do_qed_quantum_sync = 1
ele2.qed_quantum_sync_phot_product_species = pho2
ele2.do_classical_radiation_reaction = 0

pos2.species_type = positron
pos2.injection_style = none
pos2.self_fields_required_precision = 1e-11
pos2.self_fields_max_iters = 10000
pos2.do_qed_quantum_sync = 1
pos2.qed_quantum_sync_phot_product_species = pho2
pos2.do_classical_radiation_reaction = 0

#################################
############# QED ###############
#################################
qed_qs.photon_creation_energy_threshold = 0.

qed_qs.lookup_table_mode = builtin
qed_qs.chi_min = 1.e-3

qed_bw.lookup_table_mode = builtin
qed_bw.chi_min = 1.e-2

# for accurate results use the generated tables with
# the following parameters
# note: must compile with -DWarpX_QED_TABLE_GEN=ON
#qed_qs.lookup_table_mode = generate
#qed_bw.lookup_table_mode = generate
#qed_qs.tab_dndt_chi_min=1e-3
#qed_qs.tab_dndt_chi_max=2e3
#qed_qs.tab_dndt_how_many=512
#qed_qs.tab_em_chi_min=1e-3
#qed_qs.tab_em_chi_max=2e3
#qed_qs.tab_em_chi_how_many=512
#qed_qs.tab_em_frac_how_many=512
#qed_qs.tab_em_frac_min=1e-12
#qed_qs.save_table_in=my_qs_table.txt
#qed_bw.tab_dndt_chi_min=1e-2
#qed_bw.tab_dndt_chi_max=2e3
#qed_bw.tab_dndt_how_many=512
#qed_bw.tab_pair_chi_min=1e-2
#qed_bw.tab_pair_chi_max=2e3
#qed_bw.tab_pair_chi_how_many=512
#qed_bw.tab_pair_frac_how_many=512
#qed_bw.save_table_in=my_bw_table.txt

warpx.do_qed_schwinger = 0.

#################################
######### DIAGNOSTICS ###########
#################################
# FULL
diagnostics.diags_names = diag1

diag1.intervals = 0
diag1.diag_type = Full
diag1.write_species = 1
diag1.fields_to_plot = Ex Ey Ez Bx By Bz rho_beam1 rho_beam2 rho_ele1 rho_pos1 rho_ele2 rho_pos2
diag1.format = openpmd
diag1.dump_last_timestep = 1
diag1.species = pho1 pho2 ele1 pos1 ele2 pos2 beam1 beam2

# REDUCED
warpx.reduced_diags_names = ParticleNumber ColliderRelevant_beam1_beam2

ColliderRelevant_beam1_beam2.type = ColliderRelevant
ColliderRelevant_beam1_beam2.intervals = every_red
ColliderRelevant_beam1_beam2.species = beam1 beam2

ParticleNumber.type = ParticleNumber
ParticleNumber.intervals = every_red
Visualize

The figure below shows the number of photons emitted per beam particle (left) and the number of secondary pairs generated per beam particle (right).

We compare different results:

  • (red) the simplified WarpX simulation, as in the example stored in the directory /Examples/Physics_applications/beam-beam_collision;

  • (blue) a large-scale WarpX simulation (high resolution and ad hoc generated lookup tables);

  • (black) literature results from Yakimenko et al. [9].

The small-scale simulation has been performed with a resolution of nx = 64, ny = 64, nz = 128 grid cells, while the large-scale one has a much higher resolution of nx = 512, ny = 512, nz = 1024. Moreover, the large-scale simulation uses dedicated QED lookup tables instead of the builtin tables. To generate the tables within WarpX, the code must be compiled with the flag -DWarpX_QED_TABLE_GEN=ON. For the large-scale simulation we have used the following options:

qed_qs.lookup_table_mode = generate
qed_bw.lookup_table_mode = generate
qed_qs.tab_dndt_chi_min=1e-3
qed_qs.tab_dndt_chi_max=2e3
qed_qs.tab_dndt_how_many=512
qed_qs.tab_em_chi_min=1e-3
qed_qs.tab_em_chi_max=2e3
qed_qs.tab_em_chi_how_many=512
qed_qs.tab_em_frac_how_many=512
qed_qs.tab_em_frac_min=1e-12
qed_qs.save_table_in=my_qs_table.txt
qed_bw.tab_dndt_chi_min=1e-2
qed_bw.tab_dndt_chi_max=2e3
qed_bw.tab_dndt_how_many=512
qed_bw.tab_pair_chi_min=1e-2
qed_bw.tab_pair_chi_max=2e3
qed_bw.tab_pair_chi_how_many=512
qed_bw.tab_pair_frac_how_many=512
qed_bw.save_table_in=my_bw_table.txt

Beam-beam collision benchmark against Yakimenko et al. [9].

High Energy Astrophysical Plasma Physics

Ohm Solver: Magnetic Reconnection

Hybrid-PIC codes are often used to simulate magnetic reconnection in space plasmas. An example of magnetic reconnection from a force-free sheet is provided, based on the simulation described in Le et al. [10].
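
As background, and only as a generic textbook form rather than a statement of WarpX's exact discretization: in such hybrid models the ions are kinetic macroparticles while the electrons are treated as a massless, quasi-neutral fluid, and the electric field is obtained from a generalized Ohm's law of the form \(\vec{E} = -\vec{u}_i\times\vec{B} + \frac{\vec{J}\times\vec{B}}{e n_e} - \frac{\nabla p_e}{e n_e} + \eta\vec{J}\), with \(\vec{J} = \nabla\times\vec{B}/\mu_0\), where \(\vec{u}_i\) is the ion bulk velocity, \(p_e\) the electron pressure, and \(\eta\) the plasma resistivity (used here to damp the mode excitation, as noted in the script below).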

Run

The following Python script configures and launches the simulation.

Script PICMI_inputs.py
You can copy this file from Examples/Tests/ohm_solver_magnetic_reconnection/PICMI_inputs.py.
#!/usr/bin/env python3
#
# --- Test script for the kinetic-fluid hybrid model in WarpX wherein ions are
# --- treated as kinetic particles and electrons as an isothermal, inertialess
# --- background fluid. The script demonstrates the use of this model to
# --- simulate magnetic reconnection in a force-free sheet. The setup is based
# --- on the problem described in Le et al. (2016)
# --- https://aip.scitation.org/doi/10.1063/1.4943893.

import argparse
import shutil
import sys
from pathlib import Path

import dill
import numpy as np
from mpi4py import MPI as mpi

from pywarpx import callbacks, fields, libwarpx, picmi

constants = picmi.constants

comm = mpi.COMM_WORLD

simulation = picmi.Simulation(
    warpx_serialize_initial_conditions=True,
    verbose=0
)


class ForceFreeSheetReconnection(object):

    # B0 is chosen with all other quantities scaled by it
    B0 = 0.1 # Initial magnetic field strength (T)

    # Physical parameters
    m_ion = 400.0 # Ion mass (electron masses)

    beta_e = 0.1
    Bg = 0.3 # times B0 - guiding field
    dB = 0.01 # times B0 - initial perturbation to seed reconnection

    T_ratio = 5.0 # T_i / T_e

    # Domain parameters
    LX = 40 # ion skin depths
    LZ = 20 # ion skin depths

    LT = 50 # ion cyclotron periods
    DT = 1e-3 # ion cyclotron periods

    # Resolution parameters
    NX = 512
    NZ = 512

    # Starting number of particles per cell
    NPPC = 400

    # Plasma resistivity - used to dampen the mode excitation
    eta = 6e-3  # normalized resistivity
    # Number of substeps used to update B
    substeps = 20

    def __init__(self, test, verbose):

        self.test = test
        self.verbose = verbose or self.test

        # calculate various plasma parameters based on the simulation input
        self.get_plasma_quantities()

        self.Lx = self.LX * self.l_i
        self.Lz = self.LZ * self.l_i

        self.dt = self.DT * self.t_ci

        # run very low resolution as a CI test
        if self.test:
            self.total_steps = 20
            self.diag_steps = self.total_steps // 5
            self.NX = 128
            self.NZ = 128
        else:
            self.total_steps = int(self.LT / self.DT)
            self.diag_steps = self.total_steps // 200

        # Initial magnetic field
        self.Bg *= self.B0
        self.dB *= self.B0
        self.Bx = (
            f"{self.B0}*tanh(z*{1.0/self.l_i})"
            f"+{-self.dB*self.Lx/(2.0*self.Lz)}*cos({2.0*np.pi/self.Lx}*x)"
            f"*sin({np.pi/self.Lz}*z)"
        )
        self.By = (
            f"sqrt({self.Bg**2 + self.B0**2}-"
            f"({self.B0}*tanh(z*{1.0/self.l_i}))**2)"
        )
        self.Bz = f"{self.dB}*sin({2.0*np.pi/self.Lx}*x)*cos({np.pi/self.Lz}*z)"

        self.J0 = self.B0 / constants.mu0 / self.l_i

        # dump all the current attributes to a dill pickle file
        if comm.rank == 0:
            with open(f'sim_parameters.dpkl', 'wb') as f:
                dill.dump(self, f)

        # print out plasma parameters
        if comm.rank == 0:
            print(
                f"Initializing simulation with input parameters:\n"
                f"\tTi = {self.Ti*1e-3:.1f} keV\n"
                f"\tn0 = {self.n_plasma:.1e} m^-3\n"
                f"\tB0 = {self.B0:.2f} T\n"
                f"\tM/m = {self.m_ion:.0f}\n"
            )
            print(
                f"Plasma parameters:\n"
                f"\tl_i = {self.l_i:.1e} m\n"
                f"\tt_ci = {self.t_ci:.1e} s\n"
                f"\tv_ti = {self.vi_th:.1e} m/s\n"
                f"\tvA = {self.vA:.1e} m/s\n"
            )
            print(
                f"Numerical parameters:\n"
                f"\tdz = {self.Lz/self.NZ:.1e} m\n"
                f"\tdt = {self.dt:.1e} s\n"
                f"\tdiag steps = {self.diag_steps:d}\n"
                f"\ttotal steps = {self.total_steps:d}\n"
            )

        self.setup_run()

    def get_plasma_quantities(self):
        """Calculate various plasma parameters based on the simulation input."""

        # Ion mass (kg)
        self.M = self.m_ion * constants.m_e

        # Cyclotron angular frequency (rad/s) and period (s)
        self.w_ce = constants.q_e * abs(self.B0) / constants.m_e
        self.w_ci = constants.q_e * abs(self.B0) / self.M
        self.t_ci = 2.0 * np.pi / self.w_ci

        # Electron plasma frequency: w_pe / omega_ce = 2 is given
        self.w_pe = 2.0 * self.w_ce

        # calculate plasma density based on electron plasma frequency
        self.n_plasma = (
            self.w_pe**2 * constants.m_e * constants.ep0 / constants.q_e**2
        )

        # Ion plasma frequency (Hz)
        self.w_pi = np.sqrt(
            constants.q_e**2 * self.n_plasma / (self.M * constants.ep0)
        )

        # Ion skin depth (m)
        self.l_i = constants.c / self.w_pi

        # Alfven speed (m/s): vA = B / sqrt(mu0 * n * (M + m)) = c * omega_ci / w_pi
        self.vA = abs(self.B0) / np.sqrt(
            constants.mu0 * self.n_plasma * (constants.m_e + self.M)
        )

        # calculate Te based on beta
        self.Te = (
            self.beta_e * self.B0**2 / (2.0 * constants.mu0 * self.n_plasma)
            / constants.q_e
        )
        self.Ti = self.Te * self.T_ratio

        # calculate thermal speeds
        self.ve_th = np.sqrt(self.Te * constants.q_e / constants.m_e)
        self.vi_th = np.sqrt(self.Ti * constants.q_e / self.M)

        # Ion Larmor radius (m)
        self.rho_i = self.vi_th / self.w_ci

        # Reference resistivity (Malakit et al.)
        self.eta0 = self.l_i * self.vA / (constants.ep0 * constants.c**2)

    def setup_run(self):
        """Setup simulation components."""

        #######################################################################
        # Set geometry and boundary conditions                                #
        #######################################################################

        # Create grid
        self.grid = picmi.Cartesian2DGrid(
            number_of_cells=[self.NX, self.NZ],
            lower_bound=[-self.Lx/2.0, -self.Lz/2.0],
            upper_bound=[self.Lx/2.0, self.Lz/2.0],
            lower_boundary_conditions=['periodic', 'dirichlet'],
            upper_boundary_conditions=['periodic', 'dirichlet'],
            lower_boundary_conditions_particles=['periodic', 'reflecting'],
            upper_boundary_conditions_particles=['periodic', 'reflecting'],
            warpx_max_grid_size=self.NZ
        )
        simulation.time_step_size = self.dt
        simulation.max_steps = self.total_steps
        simulation.current_deposition_algo = 'direct'
        simulation.particle_shape = 1
        simulation.use_filter = False
        simulation.verbose = self.verbose

        #######################################################################
        # Field solver and external field                                     #
        #######################################################################

        self.solver = picmi.HybridPICSolver(
            grid=self.grid, gamma=1.0,
            Te=self.Te, n0=self.n_plasma, n_floor=0.1*self.n_plasma,
            plasma_resistivity=self.eta*self.eta0,
            substeps=self.substeps
        )
        simulation.solver = self.solver

        B_ext = picmi.AnalyticInitialField(
            Bx_expression=self.Bx,
            By_expression=self.By,
            Bz_expression=self.Bz
        )
        simulation.add_applied_field(B_ext)

        #######################################################################
        # Particle types setup                                                #
        #######################################################################

        self.ions = picmi.Species(
            name='ions', charge='q_e', mass=self.M,
            initial_distribution=picmi.UniformDistribution(
                density=self.n_plasma,
                rms_velocity=[self.vi_th]*3,
            )
        )
        simulation.add_species(
            self.ions,
            layout=picmi.PseudoRandomLayout(
                grid=self.grid,
                n_macroparticles_per_cell=self.NPPC
            )
        )

        #######################################################################
        # Add diagnostics                                                     #
        #######################################################################

        callbacks.installafterEsolve(self.check_fields)

        if self.test:
            particle_diag = picmi.ParticleDiagnostic(
                name='diag1',
                period=self.total_steps,
                write_dir='.',
                species=[self.ions],
                data_list=['ux', 'uy', 'uz', 'x', 'z', 'weighting'],
                warpx_file_prefix='Python_ohms_law_solver_magnetic_reconnection_2d_plt',
                # warpx_format='openpmd',
                # warpx_openpmd_backend='h5',
            )
            simulation.add_diagnostic(particle_diag)
            field_diag = picmi.FieldDiagnostic(
                name='diag1',
                grid=self.grid,
                period=self.total_steps,
                data_list=['Bx', 'By', 'Bz', 'Ex', 'Ey', 'Ez'],
                write_dir='.',
                warpx_file_prefix='Python_ohms_law_solver_magnetic_reconnection_2d_plt',
                # warpx_format='openpmd',
                # warpx_openpmd_backend='h5',
            )
            simulation.add_diagnostic(field_diag)


        # reduced diagnostics for reconnection rate calculation
        # create a 2 l_i box around the X-point on which to measure
        # magnetic flux changes
        plane = picmi.ReducedDiagnostic(
            diag_type="FieldProbe",
            name='plane',
            period=self.diag_steps,
            path='diags/',
            extension='dat',
            probe_geometry='Plane',
            resolution=60,
            x_probe=0.0, z_probe=0.0, detector_radius=self.l_i,
            target_up_x=0, target_up_z=1.0
        )
        simulation.add_diagnostic(plane)

        #######################################################################
        # Initialize                                                          #
        #######################################################################

        if comm.rank == 0:
            if Path.exists(Path("diags")):
                shutil.rmtree("diags")
            Path("diags/fields").mkdir(parents=True, exist_ok=True)

        # Initialize inputs and WarpX instance
        simulation.initialize_inputs()
        simulation.initialize_warpx()

    def check_fields(self):

        step = simulation.extension.warpx.getistep(lev=0) - 1

        if not (step == 1 or step%self.diag_steps == 0):
            return

        rho = fields.RhoFPWrapper(include_ghosts=False)[:,:]
        Jiy = fields.JyFPWrapper(include_ghosts=False)[...] / self.J0
        Jy = fields.JyFPAmpereWrapper(include_ghosts=False)[...] / self.J0
        Bx = fields.BxFPWrapper(include_ghosts=False)[...] / self.B0
        By = fields.ByFPWrapper(include_ghosts=False)[...] / self.B0
        Bz = fields.BzFPWrapper(include_ghosts=False)[...] / self.B0

        if libwarpx.amr.ParallelDescriptor.MyProc() != 0:
            return

        # save the fields to file
        with open(f"diags/fields/fields_{step:06d}.npz", 'wb') as f:
            np.savez(f, rho=rho, Jiy=Jiy, Jy=Jy, Bx=Bx, By=By, Bz=Bz)

##########################
# parse input parameters
##########################

parser = argparse.ArgumentParser()
parser.add_argument(
    '-t', '--test', help='toggle whether this script is run as a short CI test',
    action='store_true',
)
parser.add_argument(
    '-v', '--verbose', help='Verbose output', action='store_true',
)
args, left = parser.parse_known_args()
sys.argv = sys.argv[:1]+left

run = ForceFreeSheetReconnection(test=args.test, verbose=args.verbose)
simulation.step()

Running the full simulation should take about 4 hours if executed on 1 V100 GPU. For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.

python3 PICMI_inputs.py
Analyze

The following script extracts the reconnection rate as a function of time and animates the evolution of the magnetic field (as shown below).

Script analysis.py
You can copy this file from Examples/Tests/ohm_solver_magnetic_reconnection/analysis.py.
#!/usr/bin/env python3
#
# --- Analysis script for the hybrid-PIC example of magnetic reconnection.

import glob

import dill
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import colors

plt.rcParams.update({'font.size': 20})

# load simulation parameters
with open(f'sim_parameters.dpkl', 'rb') as f:
    sim = dill.load(f)

x_idx = 2
z_idx = 4
Ey_idx = 6
Bx_idx = 8

plane_data = np.loadtxt(f'diags/plane.dat', skiprows=1)

steps = np.unique(plane_data[:,0])
num_steps = len(steps)
num_cells = plane_data.shape[0]//num_steps

plane_data = plane_data.reshape((num_steps, num_cells, plane_data.shape[1]))

times = plane_data[:, 0, 1]
dt = np.mean(np.diff(times))

plt.plot(
    times / sim.t_ci,
    np.mean(plane_data[:,:,Ey_idx], axis=1) / (sim.vA * sim.B0),
    'o-'
)

plt.grid()
plt.xlabel(r'$t/\tau_{c,i}$')
plt.ylabel('$<E_y>/v_AB_0$')
plt.title("Reconnection rate")
plt.tight_layout()
plt.savefig("diags/reconnection_rate.png")

if not sim.test:
    from matplotlib.animation import FFMpegWriter, FuncAnimation
    from scipy import interpolate

    # Animate the magnetic reconnection
    fig, axes = plt.subplots(3, 1, sharex=True, figsize=(7, 9))

    for ax in axes.flatten():
        ax.set_aspect('equal')
        ax.set_ylabel('$z/l_i$')

    axes[2].set_xlabel('$x/l_i$')

    datafiles = sorted(glob.glob("diags/fields/*.npz"))
    num_steps = len(datafiles)

    data0 = np.load(datafiles[0])

    sX = axes[0].imshow(
        data0['Jy'].T, origin='lower',
        norm=colors.TwoSlopeNorm(vmin=-0.6, vcenter=0., vmax=1.6),
        extent=[0, sim.LX, -sim.LZ/2, sim.LZ/2],
        cmap=plt.cm.RdYlBu_r
    )
    # axes[0].set_ylim(-5, 5)
    cb = plt.colorbar(sX, ax=axes[0], label='$J_y/J_0$')
    cb.ax.set_yscale('linear')
    cb.ax.set_yticks([-0.5, 0.0, 0.75, 1.5])

    sY = axes[1].imshow(
        data0['By'].T, origin='lower', extent=[0, sim.LX, -sim.LZ/2, sim.LZ/2],
        cmap=plt.cm.plasma
    )
    # axes[1].set_ylim(-5, 5)
    cb = plt.colorbar(sY, ax=axes[1], label='$B_y/B_0$')
    cb.ax.set_yscale('linear')

    sZ = axes[2].imshow(
        data0['Bz'].T, origin='lower', extent=[0, sim.LX, -sim.LZ/2, sim.LZ/2],
        # norm=colors.TwoSlopeNorm(vmin=-0.02, vcenter=0., vmax=0.02),
        cmap=plt.cm.RdBu
    )
    cb = plt.colorbar(sZ, ax=axes[2], label='$B_z/B_0$')
    cb.ax.set_yscale('linear')

    # plot field lines
    x_grid = np.linspace(0, sim.LX, data0['Bx'][:-1].shape[0])
    z_grid = np.linspace(-sim.LZ/2.0, sim.LZ/2.0, data0['Bx'].shape[1])

    n_lines = 10
    start_x = np.zeros(n_lines)
    start_x[:n_lines//2] = sim.LX
    start_z = np.linspace(-sim.LZ/2.0*0.9, sim.LZ/2.0*0.9, n_lines)
    step_size = 1.0 / 100.0

    def get_field_lines(Bx, Bz):
        field_line_coords = []

        Bx_interp = interpolate.interp2d(x_grid, z_grid, Bx[:-1].T)
        Bz_interp = interpolate.interp2d(x_grid, z_grid, Bz[:,:-1].T)

        for kk, z in enumerate(start_z):
            path_x = [start_x[kk]]
            path_z = [z]

            ii = 0
            while ii < 10000:
                ii+=1
                Bx = Bx_interp(path_x[-1], path_z[-1])[0]
                Bz = Bz_interp(path_x[-1], path_z[-1])[0]

                # print(path_x[-1], path_z[-1], Bx, Bz)

                # normalize and scale
                B_mag = np.sqrt(Bx**2 + Bz**2)
                if B_mag == 0:
                    break

                dx = Bx / B_mag * step_size
                dz = Bz / B_mag * step_size

                x_new = path_x[-1] + dx
                z_new = path_z[-1] + dz

                if np.isnan(x_new) or x_new <= 0 or x_new > sim.LX or abs(z_new) > sim.LZ/2:
                    break

                path_x.append(x_new)
                path_z.append(z_new)

            field_line_coords.append([path_x, path_z])
        return field_line_coords

    field_lines = []
    for path in get_field_lines(data0['Bx'], data0['Bz']):
        path_x = path[0]
        path_z = path[1]
        l, = axes[2].plot(path_x, path_z, '--', color='k')
        # draws arrows on the field lines
        # if path_x[10] > path_x[0]:
        axes[2].arrow(
            path_x[50], path_z[50],
            path_x[250]-path_x[50], path_z[250]-path_z[50],
            shape='full', length_includes_head=True, lw=0, head_width=1.0,
            color='g'
        )

        field_lines.append(l)

    def animate(i):
        data = np.load(datafiles[i])
        sX.set_array(data['Jy'].T)
        sY.set_array(data['By'].T)
        sZ.set_array(data['Bz'].T)
        sZ.set_clim(-np.max(abs(data['Bz'])), np.max(abs(data['Bz'])))

        for ii, path in enumerate(get_field_lines(data['Bx'], data['Bz'])):
            path_x = path[0]
            path_z = path[1]
            field_lines[ii].set_data(path_x, path_z)

    anim = FuncAnimation(
        fig, animate, frames=num_steps-1, repeat=True
    )

    writervideo = FFMpegWriter(fps=14)
    anim.save('diags/mag_reconnection.mp4', writer=writervideo)

if sim.test:
    import os
    import sys
    sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
    import checksumAPI

    # this will be the name of the plot file
    fn = sys.argv[1]
    test_name = os.path.split(os.getcwd())[1]
    checksumAPI.evaluate_checksum(test_name, fn)

Magnetic reconnection from a force-free sheet.

Microelectronics

ARTEMIS (Adaptive mesh Refinement Time-domain ElectrodynaMIcs Solver) is based on WarpX and couples the Maxwell's equations implementation in WarpX with classical equations that describe quantum material behavior (such as the LLG equation for micromagnetics and the London equation for superconducting materials) in order to quantify the performance of next-generation microelectronics.
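
For reference, common textbook forms of these equations (quoted here only for orientation, not as ARTEMIS's exact implementation) are the Landau-Lifshitz-Gilbert equation, \(\partial_t\vec{M} = -\gamma\,\vec{M}\times\vec{H}_\mathrm{eff} + \frac{\alpha}{M_s}\,\vec{M}\times\partial_t\vec{M}\), with gyromagnetic ratio \(\gamma\), Gilbert damping \(\alpha\), and saturation magnetization \(M_s\), and the first London equation, \(\partial_t\vec{J}_s = \frac{n_s e^2}{m_e}\vec{E}\), with superconducting carrier density \(n_s\).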

Nuclear Fusion

Note

TODO

Fundamental Plasma Physics

Langmuir Waves

These are examples of plasma oscillations (Langmuir waves) in a uniform plasma in 1D, 2D, 3D, and RZ.

In each case, a uniform plasma is set up with a sinusoidal perturbation in the electron momentum along each axis. The plasma is followed for a short period of time, long enough for electric fields to develop. The resulting fields can be compared to the analytic solutions.
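
As a quick orientation (a minimal sketch mirroring the my_constants block of the 3D input file further below, not part of the example itself), the plasma frequency set by the combined electron and positron densities can be evaluated as:

import numpy as np
import scipy.constants as sc

n0 = 2.0e24  # electron density (positrons have the same density), #/m^3
wp = np.sqrt(2.0 * n0 * sc.e**2 / (sc.epsilon_0 * sc.m_e))  # plasma frequency (rad/s)
kp = wp / sc.c                                              # plasma wavenumber (1/m)

print(f"omega_p = {wp:.3e} rad/s")
print(f"T_p     = {2.0 * np.pi / wp:.3e} s")
print(f"k_p     = {kp:.3e} 1/m")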

Run

For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.

This example can be run as a Python script: python3 PICMI_inputs_3d.py.

You can copy this file from Examples/Tests/langmuir/PICMI_inputs_3d.py.
#!/usr/bin/env python3
#
# --- Simple example of Langmuir oscillations in a uniform plasma

from pywarpx import picmi

constants = picmi.constants

##########################
# physics parameters
##########################

plasma_density = 1.e25
plasma_xmin = 0.
plasma_x_velocity = 0.1*constants.c

##########################
# numerics parameters
##########################

# --- Number of time steps
max_steps = 40
diagnostic_interval = 10

# --- Grid
nx = 64
ny = 64
nz = 64

xmin = -20.e-6
ymin = -20.e-6
zmin = -20.e-6
xmax = +20.e-6
ymax = +20.e-6
zmax = +20.e-6

number_per_cell_each_dim = [2,2,2]

##########################
# physics components
##########################

uniform_plasma = picmi.UniformDistribution(density = 1.e25,
                                           upper_bound = [0., None, None],
                                           directed_velocity = [0.1*constants.c, 0., 0.])

electrons = picmi.Species(particle_type='electron', name='electrons', initial_distribution=uniform_plasma)

##########################
# numerics components
##########################

grid = picmi.Cartesian3DGrid(number_of_cells = [nx, ny, nz],
                             lower_bound = [xmin, ymin, zmin],
                             upper_bound = [xmax, ymax, zmax],
                             lower_boundary_conditions = ['periodic', 'periodic', 'periodic'],
                             upper_boundary_conditions = ['periodic', 'periodic', 'periodic'],
                             moving_window_velocity = [0., 0., 0.],
                             warpx_max_grid_size = 32)

solver = picmi.ElectromagneticSolver(grid=grid, cfl=1.)

##########################
# diagnostics
##########################

field_diag1 = picmi.FieldDiagnostic(name = 'diag1',
                                    grid = grid,
                                    period = diagnostic_interval,
                                    data_list = ['Ex', 'Jx'],
                                    write_dir = '.',
                                    warpx_file_prefix = 'Python_Langmuir_plt')

part_diag1 = picmi.ParticleDiagnostic(name = 'diag1',
                                      period = diagnostic_interval,
                                      species = [electrons],
                                      data_list = ['weighting', 'ux'])

##########################
# simulation setup
##########################

sim = picmi.Simulation(solver = solver,
                       max_steps = max_steps,
                       verbose = 1,
                       warpx_current_deposition_algo = 'direct')

sim.add_species(electrons,
                layout = picmi.GriddedLayout(n_macroparticle_per_cell=number_per_cell_each_dim, grid=grid))

sim.add_diagnostic(field_diag1)
sim.add_diagnostic(part_diag1)

##########################
# simulation run
##########################

# write_inputs will create an inputs file that can be used to run
# with the compiled version.
#sim.write_input_file(file_name = 'inputs_from_PICMI')

# Alternatively, sim.step will run WarpX, controlling it from Python
sim.step()

This example can be run as WarpX executable using an input file: warpx.3d inputs_3d

You can copy this file from Examples/Tests/langmuir/inputs_3d.
# Parameters for the plasma wave
my_constants.max_step = 40
my_constants.lx = 40.e-6 # length of sides
my_constants.dx = 6.25e-07 # grid cell size
my_constants.nx = lx/dx # number of cells in each dimension
my_constants.epsilon = 0.01
my_constants.n0 = 2.e24  # electron and positron densities, #/m^3
my_constants.wp = sqrt(2.*n0*q_e**2/(epsilon0*m_e))  # plasma frequency
my_constants.kp = wp/clight  # plasma wavenumber
my_constants.k = 2.*2.*pi/lx  # perturbation wavenumber
# Note: kp is calculated in SI for a density of 4e24 (i.e. 2e24 electrons + 2e24 positrons)
# k is calculated so as to have 2 periods within the 40e-6 wide box.

# Maximum number of time steps
max_step = max_step

# number of grid points
amr.n_cell =  nx nx nx

# Maximum allowable size of each subdomain in the problem domain;
#    this is used to decompose the domain for parallel calculations.
amr.max_grid_size = nx nx nx

# Maximum level in hierarchy (for now must be 0, i.e., one level in total)
amr.max_level = 0

# Geometry
geometry.dims = 3
geometry.prob_lo     = -lx/2.   -lx/2.   -lx/2.    # physical domain
geometry.prob_hi     =  lx/2.    lx/2.    lx/2.

# Boundary condition
boundary.field_lo = periodic periodic periodic
boundary.field_hi = periodic periodic periodic

warpx.serialize_initial_conditions = 1

# Verbosity
warpx.verbose = 1

# Algorithms
algo.current_deposition = esirkepov
algo.field_gathering = energy-conserving
warpx.use_filter = 0

# Order of particle shape factors
algo.particle_shape = 1

# CFL
warpx.cfl = 1.0

# Particles
particles.species_names = electrons positrons

electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 1 1 1
electrons.xmin = -20.e-6
electrons.xmax =  20.e-6
electrons.ymin = -20.e-6
electrons.ymax = 20.e-6
electrons.zmin = -20.e-6
electrons.zmax = 20.e-6

electrons.profile = constant
electrons.density = n0   # number of electrons per m^3
electrons.momentum_distribution_type = parse_momentum_function
electrons.momentum_function_ux(x,y,z) = "epsilon * k/kp * sin(k*x) * cos(k*y) * cos(k*z)"
electrons.momentum_function_uy(x,y,z) = "epsilon * k/kp * cos(k*x) * sin(k*y) * cos(k*z)"
electrons.momentum_function_uz(x,y,z) = "epsilon * k/kp * cos(k*x) * cos(k*y) * sin(k*z)"

positrons.charge = q_e
positrons.mass = m_e
positrons.injection_style = "NUniformPerCell"
positrons.num_particles_per_cell_each_dim = 1 1 1
positrons.xmin = -20.e-6
positrons.xmax =  20.e-6
positrons.ymin = -20.e-6
positrons.ymax = 20.e-6
positrons.zmin = -20.e-6
positrons.zmax = 20.e-6

positrons.profile = constant
positrons.density = n0   # number of positrons per m^3
positrons.momentum_distribution_type = parse_momentum_function
positrons.momentum_function_ux(x,y,z) = "-epsilon * k/kp * sin(k*x) * cos(k*y) * cos(k*z)"
positrons.momentum_function_uy(x,y,z) = "-epsilon * k/kp * cos(k*x) * sin(k*y) * cos(k*z)"
positrons.momentum_function_uz(x,y,z) = "-epsilon * k/kp * cos(k*x) * cos(k*y) * sin(k*z)"

# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = max_step
diag1.diag_type = Full
diag1.fields_to_plot = Ex Ey Ez Bx By Bz jx jy jz part_per_cell rho
diag1.electrons.variables = w ux
diag1.positrons.variables = uz

This example can be run as a Python script: python3 PICMI_inputs_2d.py.

You can copy this file from Examples/Tests/langmuir/PICMI_inputs_2d.py.
#!/usr/bin/env python3
#
# --- Simple example of Langmuir oscillations in a uniform plasma
# --- in two dimensions

from pywarpx import picmi

constants = picmi.constants

##########################
# physics parameters
##########################

plasma_density = 1.e25
plasma_xmin = 0.
plasma_x_velocity = 0.1*constants.c

##########################
# numerics parameters
##########################

# --- Number of time steps
max_steps = 40
diagnostic_intervals = "::10"

# --- Grid
nx = 64
ny = 64

xmin = -20.e-6
ymin = -20.e-6
xmax = +20.e-6
ymax = +20.e-6

number_per_cell_each_dim = [2,2]

##########################
# physics components
##########################

uniform_plasma = picmi.UniformDistribution(density = 1.e25,
                                           upper_bound = [0., None, None],
                                           directed_velocity = [0.1*constants.c, 0., 0.])

electrons = picmi.Species(particle_type='electron', name='electrons', initial_distribution=uniform_plasma)

##########################
# numerics components
##########################

grid = picmi.Cartesian2DGrid(number_of_cells = [nx, ny],
                             lower_bound = [xmin, ymin],
                             upper_bound = [xmax, ymax],
                             lower_boundary_conditions = ['periodic', 'periodic'],
                             upper_boundary_conditions = ['periodic', 'periodic'],
                             moving_window_velocity = [0., 0., 0.],
                             warpx_max_grid_size = 32)

solver = picmi.ElectromagneticSolver(grid=grid, cfl=1.)

##########################
# diagnostics
##########################

field_diag1 = picmi.FieldDiagnostic(name = 'diag1',
                                    grid = grid,
                                    period = diagnostic_intervals,
                                    data_list = ['Ex', 'Jx'],
                                    write_dir = '.',
                                    warpx_file_prefix = 'Python_Langmuir_2d_plt')

part_diag1 = picmi.ParticleDiagnostic(name = 'diag1',
                                      period = diagnostic_intervals,
                                      species = [electrons],
                                      data_list = ['weighting', 'ux'])

##########################
# simulation setup
##########################

sim = picmi.Simulation(solver = solver,
                       max_steps = max_steps,
                       verbose = 1,
                       warpx_current_deposition_algo = 'direct',
                       warpx_use_filter = 0)

sim.add_species(electrons,
                layout = picmi.GriddedLayout(n_macroparticle_per_cell=number_per_cell_each_dim, grid=grid))

sim.add_diagnostic(field_diag1)
sim.add_diagnostic(part_diag1)

##########################
# simulation run
##########################

# write_inputs will create an inputs file that can be used to run
# with the compiled version.
sim.write_input_file(file_name = 'inputs2d_from_PICMI')

# Alternatively, sim.step will run WarpX, controlling it from Python
sim.step()

This example can be run as a WarpX executable using an input file: warpx.2d inputs_2d

You can copy this file from Examples/Tests/langmuir/inputs_2d.
# Maximum number of time steps
max_step = 80

# number of grid points
amr.n_cell =   128  128

# Maximum allowable size of each subdomain in the problem domain;
#    this is used to decompose the domain for parallel calculations.
amr.max_grid_size = 64

# Maximum level in hierarchy (for now must be 0, i.e., one level in total)
amr.max_level = 0

# Geometry
geometry.dims = 2
geometry.prob_lo     = -20.e-6   -20.e-6    # physical domain
geometry.prob_hi     =  20.e-6    20.e-6

# Boundary condition
boundary.field_lo = periodic periodic
boundary.field_hi = periodic periodic

warpx.serialize_initial_conditions = 1

# Verbosity
warpx.verbose = 1

# Algorithms
algo.field_gathering = energy-conserving
warpx.use_filter = 0

# Order of particle shape factors
algo.particle_shape = 1

# CFL
warpx.cfl = 1.0

# Parameters for the plasma wave
my_constants.epsilon = 0.01
my_constants.n0 = 2.e24  # electron and positron densities, #/m^3
my_constants.wp = sqrt(2.*n0*q_e**2/(epsilon0*m_e))  # plasma frequency
my_constants.kp = wp/clight  # plasma wavenumber
my_constants.k = 2.*pi/20.e-6  # perturbation wavenumber
# Note: kp is calculated in SI for a density of 4e24 (i.e. 2e24 electrons + 2e24 positrons)
# k is calculated so as to have 2 periods within the 40e-6 wide box.

# Particles
particles.species_names = electrons positrons

electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 2 2
electrons.xmin = -20.e-6
electrons.xmax =  20.e-6
electrons.ymin = -20.e-6
electrons.ymax = 20.e-6
electrons.zmin = -20.e-6
electrons.zmax = 20.e-6

electrons.profile = constant
electrons.density = n0   # number of electrons per m^3
electrons.momentum_distribution_type = parse_momentum_function
electrons.momentum_function_ux(x,y,z) = "epsilon * k/kp * sin(k*x) * cos(k*y) * cos(k*z)"
electrons.momentum_function_uy(x,y,z) = "epsilon * k/kp * cos(k*x) * sin(k*y) * cos(k*z)"
electrons.momentum_function_uz(x,y,z) = "epsilon * k/kp * cos(k*x) * cos(k*y) * sin(k*z)"

positrons.charge = q_e
positrons.mass = m_e
positrons.injection_style = "NUniformPerCell"
positrons.num_particles_per_cell_each_dim = 2 2
positrons.xmin = -20.e-6
positrons.xmax =  20.e-6
positrons.ymin = -20.e-6
positrons.ymax = 20.e-6
positrons.zmin = -20.e-6
positrons.zmax = 20.e-6

positrons.profile = constant
positrons.density = n0   # number of positrons per m^3
positrons.momentum_distribution_type = parse_momentum_function
positrons.momentum_function_ux(x,y,z) = "-epsilon * k/kp * sin(k*x) * cos(k*y) * cos(k*z)"
positrons.momentum_function_uy(x,y,z) = "-epsilon * k/kp * cos(k*x) * sin(k*y) * cos(k*z)"
positrons.momentum_function_uz(x,y,z) = "-epsilon * k/kp * cos(k*x) * cos(k*y) * sin(k*z)"

# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = 40
diag1.diag_type = Full

This example can be run as a Python script: python3 PICMI_inputs_rz.py.

You can copy this file from Examples/Tests/langmuir/PICMI_inputs_rz.py.
#!/usr/bin/env python3
#
# This is a script that analyses the multimode simulation results.
# This simulates an RZ multimode periodic plasma wave.
# The electric field from the simulation is compared to the analytic value

import matplotlib

matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np

from pywarpx import fields, picmi

constants = picmi.constants

##########################
# physics parameters
##########################

density = 2.e24
epsilon0 = 0.001*constants.c
epsilon1 = 0.001*constants.c
epsilon2 = 0.001*constants.c
w0 = 5.e-6
n_osc_z = 3

# Plasma frequency
wp = np.sqrt((density*constants.q_e**2)/(constants.m_e*constants.ep0))
kp = wp/constants.c

##########################
# numerics parameters
##########################

nr = 64
nz = 200

rmin =  0.e0
zmin =  0.e0
rmax = +20.e-6
zmax = +40.e-6

# Wave vector of the wave
k0 = 2.*np.pi*n_osc_z/(zmax - zmin)

diagnostic_intervals = 40

##########################
# physics components
##########################

uniform_plasma = picmi.UniformDistribution(density = density,
                                           upper_bound = [+18e-6, +18e-6, None],
                                           directed_velocity = [0., 0., 0.])

momentum_expressions = ["""+ epsilon0/kp*2*x/w0**2*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
                           - epsilon1/kp*2/w0*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
                           + epsilon1/kp*4*x**2/w0**3*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
                           - epsilon2/kp*8*x/w0**2*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
                           + epsilon2/kp*8*x*(x**2-y**2)/w0**4*exp(-(x**2+y**2)/w0**2)*sin(k0*z)""",
                        """+ epsilon0/kp*2*y/w0**2*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
                           + epsilon1/kp*4*x*y/w0**3*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
                           + epsilon2/kp*8*y/w0**2*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
                           + epsilon2/kp*8*y*(x**2-y**2)/w0**4*exp(-(x**2+y**2)/w0**2)*sin(k0*z)""",
                        """- epsilon0/kp*k0*exp(-(x**2+y**2)/w0**2)*cos(k0*z)
                           - epsilon1/kp*k0*2*x/w0*exp(-(x**2+y**2)/w0**2)*cos(k0*z)
                           - epsilon2/kp*k0*4*(x**2-y**2)/w0**2*exp(-(x**2+y**2)/w0**2)*cos(k0*z)"""]

analytic_plasma = picmi.AnalyticDistribution(density_expression = density,
                                             upper_bound = [+18e-6, +18e-6, None],
                                             epsilon0 = epsilon0,
                                             epsilon1 = epsilon1,
                                             epsilon2 = epsilon2,
                                             kp = kp,
                                             k0 = k0,
                                             w0 = w0,
                                             momentum_expressions = momentum_expressions)

electrons = picmi.Species(particle_type='electron', name='electrons', initial_distribution=analytic_plasma)
protons = picmi.Species(particle_type='proton', name='protons', initial_distribution=uniform_plasma)

##########################
# numerics components
##########################

grid = picmi.CylindricalGrid(number_of_cells = [nr, nz],
                             n_azimuthal_modes = 3,
                             lower_bound = [rmin, zmin],
                             upper_bound = [rmax, zmax],
                             lower_boundary_conditions = ['none', 'periodic'],
                             upper_boundary_conditions = ['none', 'periodic'],
                             lower_boundary_conditions_particles = ['absorbing', 'periodic'],
                             upper_boundary_conditions_particles = ['absorbing', 'periodic'],
                             moving_window_velocity = [0.,0.],
                             warpx_max_grid_size=64)

solver = picmi.ElectromagneticSolver(grid=grid, cfl=1.)

##########################
# diagnostics
##########################

field_diag1 = picmi.FieldDiagnostic(name = 'diag1',
                                    grid = grid,
                                    period = diagnostic_intervals,
                                    data_list = ['Er', 'Ez', 'Bt', 'Jr', 'Jz', 'part_per_cell'],
                                    write_dir = '.',
                                    warpx_file_prefix = 'Python_Langmuir_rz_multimode_plt')

part_diag1 = picmi.ParticleDiagnostic(name = 'diag1',
                                      period = diagnostic_intervals,
                                      species = [electrons],
                                      data_list = ['weighting', 'momentum'])

##########################
# simulation setup
##########################

sim = picmi.Simulation(solver = solver,
                       max_steps = 40,
                       verbose = 1,
                       warpx_current_deposition_algo = 'esirkepov',
                       warpx_field_gathering_algo = 'energy-conserving',
                       warpx_particle_pusher_algo = 'boris',
                       warpx_use_filter = 0)

sim.add_species(electrons, layout=picmi.GriddedLayout(n_macroparticle_per_cell=[2,16,2], grid=grid))
sim.add_species(protons, layout=picmi.GriddedLayout(n_macroparticle_per_cell=[2,16,2], grid=grid))

sim.add_diagnostic(field_diag1)
sim.add_diagnostic(part_diag1)

##########################
# simulation run
##########################

# write_inputs will create an inputs file that can be used to run
# with the compiled version.
#sim.write_input_file(file_name='inputsrz_from_PICMI')

# Alternatively, sim.step will run WarpX, controlling it from Python
sim.step()


# Below is WarpX specific code to check the results.

def calcEr( z, r, k0, w0, wp, t, epsilons) :
    """
    Return the radial electric field as an array
    of the same length as z and r, in the half-plane theta=0
    """
    Er_array = (
        epsilons[0] * constants.m_e*constants.c/constants.q_e * 2*r/w0**2 *
            np.exp( -r**2/w0**2 ) * np.sin( k0*z ) * np.sin( wp*t )
        - epsilons[1] * constants.m_e*constants.c/constants.q_e * 2/w0 *
            np.exp( -r**2/w0**2 ) * np.sin( k0*z ) * np.sin( wp*t )
        + epsilons[1] * constants.m_e*constants.c/constants.q_e * 4*r**2/w0**3 *
            np.exp( -r**2/w0**2 ) * np.sin( k0*z ) * np.sin( wp*t )
        - epsilons[2] * constants.m_e*constants.c/constants.q_e * 8*r/w0**2 *
            np.exp( -r**2/w0**2 ) * np.sin( k0*z ) * np.sin( wp*t )
        + epsilons[2] * constants.m_e*constants.c/constants.q_e * 8*r**3/w0**4 *
            np.exp( -r**2/w0**2 ) * np.sin( k0*z ) * np.sin( wp*t ))
    return( Er_array )

def calcEz( z, r, k0, w0, wp, t, epsilons) :
    """
    Return the longitudinal electric field as an array
    of the same length as z and r, in the half-plane theta=0
    """
    Ez_array = (
        - epsilons[0] * constants.m_e*constants.c/constants.q_e * k0 *
            np.exp( -r**2/w0**2 ) * np.cos( k0*z ) * np.sin( wp*t )
        - epsilons[1] * constants.m_e*constants.c/constants.q_e * k0 * 2*r/w0 *
            np.exp( -r**2/w0**2 ) * np.cos( k0*z ) * np.sin( wp*t )
        - epsilons[2] * constants.m_e*constants.c/constants.q_e * k0 * 4*r**2/w0**2 *
            np.exp( -r**2/w0**2 ) * np.cos( k0*z ) * np.sin( wp*t ))
    return( Ez_array )

# Current time of the simulation
t0 = sim.extension.warpx.gett_new(0)

# Get the raw field data. Note that these are the real and imaginary
# parts of the fields for each azimuthal mode.
Ex_sim_wrap = fields.ExWrapper()
Ez_sim_wrap = fields.EzWrapper()
Ex_sim_modes = Ex_sim_wrap[...]
Ez_sim_modes = Ez_sim_wrap[...]

rr_Er = Ex_sim_wrap.mesh('r')
zz_Er = Ex_sim_wrap.mesh('z')
rr_Ez = Ez_sim_wrap.mesh('r')
zz_Ez = Ez_sim_wrap.mesh('z')

rr_Er = rr_Er[:,np.newaxis]*np.ones(zz_Er.shape[0])[np.newaxis,:]
zz_Er = zz_Er[np.newaxis,:]*np.ones(rr_Er.shape[0])[:,np.newaxis]
rr_Ez = rr_Ez[:,np.newaxis]*np.ones(zz_Ez.shape[0])[np.newaxis,:]
zz_Ez = zz_Ez[np.newaxis,:]*np.ones(rr_Ez.shape[0])[:,np.newaxis]

# Sum the real components to get the field along x-axis (theta = 0)
Er_sim = Ex_sim_modes[:,:,0] + np.sum(Ex_sim_modes[:,:,1::2], axis=2)
Ez_sim = Ez_sim_modes[:,:,0] + np.sum(Ez_sim_modes[:,:,1::2], axis=2)

# The analytical solutions
Er_th = calcEr(zz_Er, rr_Er, k0, w0, wp, t0, [epsilon0, epsilon1, epsilon2])
Ez_th = calcEz(zz_Ez, rr_Ez, k0, w0, wp, t0, [epsilon0, epsilon1, epsilon2])

max_error_Er = abs(Er_sim - Er_th).max()/abs(Er_th).max()
max_error_Ez = abs(Ez_sim - Ez_th).max()/abs(Ez_th).max()
print("Max error Er %e"%max_error_Er)
print("Max error Ez %e"%max_error_Ez)

# Plot the last field from the loop (Er at iteration 40)
fig, ax = plt.subplots(3)
im = ax[0].imshow( Er_sim, aspect='auto', origin='lower' )
fig.colorbar(im, ax=ax[0], orientation='vertical')
ax[0].set_title('Er, last iteration (simulation)')
ax[1].imshow( Er_th, aspect='auto', origin='lower' )
fig.colorbar(im, ax=ax[1], orientation='vertical')
ax[1].set_title('Er, last iteration (theory)')
im = ax[2].imshow( (Er_sim - Er_th)/abs(Er_th).max(), aspect='auto', origin='lower' )
fig.colorbar(im, ax=ax[2], orientation='vertical')
ax[2].set_title('Er, last iteration (difference)')
plt.savefig('langmuir_multi_rz_multimode_analysis_Er.png')

fig, ax = plt.subplots(3)
im = ax[0].imshow( Ez_sim, aspect='auto', origin='lower' )
fig.colorbar(im, ax=ax[0], orientation='vertical')
ax[0].set_title('Ez, last iteration (simulation)')
ax[1].imshow( Ez_th, aspect='auto', origin='lower' )
fig.colorbar(im, ax=ax[1], orientation='vertical')
ax[1].set_title('Ez, last iteration (theory)')
im = ax[2].imshow( (Ez_sim - Ez_th)/abs(Ez_th).max(), aspect='auto', origin='lower' )
fig.colorbar(im, ax=ax[2], orientation='vertical')
ax[2].set_title('Ez, last iteration (difference)')
plt.savefig('langmuir_multi_rz_multimode_analysis_Ez.png')

assert max(max_error_Er, max_error_Ez) < 0.02

This example can be run as a WarpX executable using an input file: warpx.rz inputs_rz

You can copy this file from Examples/Tests/langmuir/inputs_rz.
# Parameters for the plasma wave
my_constants.max_step = 80
my_constants.epsilon = 0.01
my_constants.n0 = 2.e24  # electron density, #/m^3
my_constants.wp = sqrt(n0*q_e**2/(epsilon0*m_e))  # plasma frequency
my_constants.kp = wp/clight  # plasma wavenumber
my_constants.k0 = 2.*pi/20.e-6  # longitudinal perturbation wavenumber
my_constants.w0 = 5.e-6  # transverse perturbation length
# Note: kp is calculated in SI for a density of 2e24
# k0 is calculated so as to have 2 periods within the 40e-6 wide box.

# Maximum number of time steps
max_step = max_step

# number of grid points
amr.n_cell =   64  128

# Maximum allowable size of each subdomain in the problem domain;
#    this is used to decompose the domain for parallel calculations.
amr.max_grid_size = 64

# Maximum level in hierarchy (for now must be 0, i.e., one level in total)
amr.max_level = 0

# Geometry
geometry.dims = RZ
geometry.prob_lo     =   0.e-6   -20.e-6    # physical domain
geometry.prob_hi     =  20.e-6    20.e-6
boundary.field_lo = none periodic
boundary.field_hi = none periodic

warpx.serialize_initial_conditions = 1

# Verbosity
warpx.verbose = 1

# Algorithms
algo.field_gathering = energy-conserving
algo.current_deposition = esirkepov
warpx.use_filter = 0

# Order of particle shape factors
algo.particle_shape = 1

# CFL
warpx.cfl = 1.0

# Having this turned on makes for a more sensitive test
warpx.do_dive_cleaning = 1

# Particles
particles.species_names = electrons ions

electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 2 2 2
electrons.xmin =   0.e-6
electrons.xmax =  18.e-6
electrons.zmin = -20.e-6
electrons.zmax = +20.e-6

electrons.profile = constant
electrons.density = n0   # number of electrons per m^3
electrons.momentum_distribution_type = parse_momentum_function
electrons.momentum_function_ux(x,y,z) = "epsilon/kp*2*x/w0**2*exp(-(x**2+y**2)/w0**2)*sin(k0*z)"
electrons.momentum_function_uy(x,y,z) = "epsilon/kp*2*y/w0**2*exp(-(x**2+y**2)/w0**2)*sin(k0*z)"
electrons.momentum_function_uz(x,y,z) = "-epsilon/kp*k0*exp(-(x**2+y**2)/w0**2)*cos(k0*z)"


ions.charge = q_e
ions.mass = m_p
ions.injection_style = "NUniformPerCell"
ions.num_particles_per_cell_each_dim = 2 2 2
ions.xmin =   0.e-6
ions.xmax =  18.e-6
ions.zmin = -20.e-6
ions.zmax = +20.e-6

ions.profile = constant
ions.density = n0   # number of ions per m^3
ions.momentum_distribution_type = at_rest

# Diagnostics
diagnostics.diags_names = diag1 diag_parser_filter diag_uniform_filter diag_random_filter
diag1.intervals = max_step/2
diag1.diag_type = Full
diag1.fields_to_plot = jr jz Er Ez Bt

## diag_parser_filter is a diag used to test the particle filter function.
diag_parser_filter.intervals = max_step:max_step:
diag_parser_filter.diag_type = Full
diag_parser_filter.species = electrons
diag_parser_filter.electrons.plot_filter_function(t,x,y,z,ux,uy,uz) = "(uy-uz < 0) *
                                                                 (sqrt(x**2+y**2)<10e-6) * (z > 0)"

## diag_uniform_filter is a diag used to test the particle uniform filter.
diag_uniform_filter.intervals = max_step:max_step:
diag_uniform_filter.diag_type = Full
diag_uniform_filter.species = electrons
diag_uniform_filter.electrons.uniform_stride = 3

## diag_random_filter is a diag used to test the particle random filter.
diag_random_filter.intervals = max_step:max_step:
diag_random_filter.diag_type = Full
diag_random_filter.species = electrons
diag_random_filter.electrons.random_fraction = 0.66

Note

TODO: A PICMI (Python) input file for this case still needs to be created, analogous to the inputs_1d file below.

This example can be run as a WarpX executable using an input file: warpx.1d inputs_1d

You can copy this file from Examples/Tests/langmuir/inputs_1d.
# Maximum number of time steps
max_step = 80

# number of grid points
amr.n_cell =  128

# Maximum allowable size of each subdomain in the problem domain;
#    this is used to decompose the domain for parallel calculations.
amr.max_grid_size = 64

# Maximum level in hierarchy (for now must be 0, i.e., one level in total)
amr.max_level = 0

# Geometry
geometry.dims = 1
geometry.prob_lo     = -20.e-6    # physical domain
geometry.prob_hi     =  20.e-6

# Boundary condition
boundary.field_lo = periodic
boundary.field_hi = periodic

warpx.serialize_initial_conditions = 1

# Verbosity
warpx.verbose = 1

# Algorithms
algo.field_gathering = energy-conserving
warpx.use_filter = 0

# Order of particle shape factors
algo.particle_shape = 1

# CFL
warpx.cfl = 0.8

# Parameters for the plasma wave
my_constants.epsilon = 0.01
my_constants.n0 = 2.e24  # electron and positron densities, #/m^3
my_constants.wp = sqrt(2.*n0*q_e**2/(epsilon0*m_e))  # plasma frequency
my_constants.kp = wp/clight  # plasma wavenumber
my_constants.k = 2.*pi/20.e-6  # perturbation wavenumber
# Note: kp is calculated in SI for a density of 4e24 (i.e. 2e24 electrons + 2e24 positrons)
# k is calculated so as to have 2 periods within the 40e-6 wide box.

# Particles
particles.species_names = electrons positrons

electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 2
electrons.zmin = -20.e-6
electrons.zmax = 20.e-6

electrons.profile = constant
electrons.density = n0   # number of electrons per m^3
electrons.momentum_distribution_type = parse_momentum_function
electrons.momentum_function_ux(x,y,z) = "epsilon * k/kp * sin(k*x) * cos(k*y) * cos(k*z)"
electrons.momentum_function_uy(x,y,z) = "epsilon * k/kp * cos(k*x) * sin(k*y) * cos(k*z)"
electrons.momentum_function_uz(x,y,z) = "epsilon * k/kp * cos(k*x) * cos(k*y) * sin(k*z)"

positrons.charge = q_e
positrons.mass = m_e
positrons.injection_style = "NUniformPerCell"
positrons.num_particles_per_cell_each_dim = 2
positrons.zmin = -20.e-6
positrons.zmax = 20.e-6

positrons.profile = constant
positrons.density = n0   # number of positrons per m^3
positrons.momentum_distribution_type = parse_momentum_function
positrons.momentum_function_ux(x,y,z) = "-epsilon * k/kp * sin(k*x) * cos(k*y) * cos(k*z)"
positrons.momentum_function_uy(x,y,z) = "-epsilon * k/kp * cos(k*x) * sin(k*y) * cos(k*z)"
positrons.momentum_function_uz(x,y,z) = "-epsilon * k/kp * cos(k*x) * cos(k*y) * sin(k*z)"

# Diagnostics
diagnostics.diags_names = diag1 openpmd
diag1.intervals = 40
diag1.diag_type = Full

openpmd.intervals = 40
openpmd.diag_type = Full
openpmd.format = openpmd
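
For reference, the plasma wave parameters defined through my_constants in the input files above can be checked with a short standalone Python snippet. This is only a sketch for orientation (it is not part of any input file); the variable names mirror the my_constants entries and SI values are taken from scipy.constants:

import numpy as np
from scipy.constants import c, e, epsilon_0, m_e

n0 = 2.e24                                 # electron (and positron) density, #/m^3
wp = np.sqrt(2.*n0*e**2/(epsilon_0*m_e))   # plasma frequency for the total density of 4e24
kp = wp/c                                  # plasma wavenumber
k  = 2.*np.pi/20.e-6                       # perturbation wavenumber

print(f"wp = {wp:.3e} rad/s, kp = {kp:.3e} 1/m")
print(f"perturbation periods in the 40e-6 wide box: {k*40.e-6/(2.*np.pi):.0f}")  # -> 2
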
Analyze

We run the following scripts to analyze correctness:

Script analysis_3d.py
You can copy this file from Examples/Tests/langmuir/analysis_3d.py.
#!/usr/bin/env python3

# Copyright 2019-2022 Jean-Luc Vay, Maxence Thevenet, Remi Lehe, Axel Huebl
#
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# This is a script that analyses the simulation results from
# the script `inputs.multi.rt`. This simulates a 3D periodic plasma wave.
# The electric field in the simulation is given (in theory) by:
# $$ E_x = \epsilon \,\frac{m_e c^2 k_x}{q_e}\sin(k_x x)\cos(k_y y)\cos(k_z z)\sin( \omega_p t)$$
# $$ E_y = \epsilon \,\frac{m_e c^2 k_y}{q_e}\cos(k_x x)\sin(k_y y)\cos(k_z z)\sin( \omega_p t)$$
# $$ E_z = \epsilon \,\frac{m_e c^2 k_z}{q_e}\cos(k_x x)\cos(k_y y)\sin(k_z z)\sin( \omega_p t)$$
import os
import re
import sys

import matplotlib.pyplot as plt
import yt
from mpl_toolkits.axes_grid1.axes_divider import make_axes_locatable

yt.funcs.mylog.setLevel(50)

import numpy as np
from scipy.constants import c, e, epsilon_0, m_e

sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI

# this will be the name of the plot file
fn = sys.argv[1]

# Parse test name and check if current correction (psatd.current_correction=1) is applied
current_correction = True if re.search( 'current_correction', fn ) else False

# Parse test name and check if Vay current deposition (algo.current_deposition=vay) is used
vay_deposition = True if re.search( 'Vay_deposition', fn ) else False

# Parse test name and check if div(E)/div(B) cleaning (warpx.do_div<e,b>_cleaning=1) is used
div_cleaning = True if re.search('div_cleaning', fn) else False

# Parameters (these parameters must match the parameters in `inputs.multi.rt`)
epsilon = 0.01
n = 4.e24
n_osc_x = 2
n_osc_y = 2
n_osc_z = 2
lo = [-20.e-6, -20.e-6, -20.e-6]
hi = [ 20.e-6,  20.e-6,  20.e-6]
Ncell = [64, 64, 64]

# Wave vector of the wave
kx = 2.*np.pi*n_osc_x/(hi[0]-lo[0])
ky = 2.*np.pi*n_osc_y/(hi[1]-lo[1])
kz = 2.*np.pi*n_osc_z/(hi[2]-lo[2])
# Plasma frequency
wp = np.sqrt((n*e**2)/(m_e*epsilon_0))

k = {'Ex':kx, 'Ey':ky, 'Ez':kz}
cos = {'Ex': (0,1,1), 'Ey':(1,0,1), 'Ez':(1,1,0)}

def get_contribution( is_cos, k, idim ):
    du = (hi[idim]-lo[idim])/Ncell[idim]
    u = lo[idim] + du*( 0.5 + np.arange(Ncell[idim]) )
    if is_cos[idim] == 1:
        return( np.cos(k*u) )
    else:
        return( np.sin(k*u) )

def get_theoretical_field( field, t ):
    amplitude = epsilon * (m_e*c**2*k[field])/e * np.sin(wp*t)
    cos_flag = cos[field]
    x_contribution = get_contribution( cos_flag, kx, 0 )
    y_contribution = get_contribution( cos_flag, ky, 1 )
    z_contribution = get_contribution( cos_flag, kz, 2 )

    E = amplitude * x_contribution[:, np.newaxis, np.newaxis] \
                  * y_contribution[np.newaxis, :, np.newaxis] \
                  * z_contribution[np.newaxis, np.newaxis, :]

    return( E )

# Read the file
ds = yt.load(fn)

# Check that the particle selective output worked:
species = 'electrons'
print('ds.field_list', ds.field_list)
for field in ['particle_weight',
              'particle_momentum_x']:
    print('assert that this is in ds.field_list', (species, field))
    assert (species, field) in ds.field_list
for field in ['particle_momentum_y',
              'particle_momentum_z']:
    print('assert that this is NOT in ds.field_list', (species, field))
    assert (species, field) not in ds.field_list
species = 'positrons'
for field in ['particle_momentum_x',
              'particle_momentum_y']:
    print('assert that this is NOT in ds.field_list', (species, field))
    assert (species, field) not in ds.field_list

t0 = ds.current_time.to_value()
data = ds.covering_grid(level = 0, left_edge = ds.domain_left_edge, dims = ds.domain_dimensions)
edge = np.array([(ds.domain_left_edge[2]).item(), (ds.domain_right_edge[2]).item(), \
                 (ds.domain_left_edge[0]).item(), (ds.domain_right_edge[0]).item()])

# Check the validity of the fields
error_rel = 0
for field in ['Ex', 'Ey', 'Ez']:
    E_sim = data[('mesh',field)].to_ndarray()
    E_th = get_theoretical_field(field, t0)
    max_error = abs(E_sim-E_th).max()/abs(E_th).max()
    print('%s: Max error: %.2e' %(field,max_error))
    error_rel = max( error_rel, max_error )

# Plot the last field from the loop (Ez at iteration 40)
fig, (ax1, ax2) = plt.subplots(1, 2, dpi = 100)
# First plot (slice at y=0)
E_plot = E_sim[:,Ncell[1]//2+1,:]
vmin = E_plot.min()
vmax = E_plot.max()
cax1 = make_axes_locatable(ax1).append_axes('right', size = '5%', pad = '5%')
im1 = ax1.imshow(E_plot, origin = 'lower', extent = edge, vmin = vmin, vmax = vmax)
cb1 = fig.colorbar(im1, cax = cax1)
ax1.set_xlabel(r'$z$')
ax1.set_ylabel(r'$x$')
ax1.set_title(r'$E_z$ (sim)')
# Second plot (slice at y=0)
E_plot = E_th[:,Ncell[1]//2+1,:]
vmin = E_plot.min()
vmax = E_plot.max()
cax2 = make_axes_locatable(ax2).append_axes('right', size = '5%', pad = '5%')
im2 = ax2.imshow(E_plot, origin = 'lower', extent = edge, vmin = vmin, vmax = vmax)
cb2 = fig.colorbar(im2, cax = cax2)
ax2.set_xlabel(r'$z$')
ax2.set_ylabel(r'$x$')
ax2.set_title(r'$E_z$ (theory)')
# Save figure
fig.tight_layout()
fig.savefig('Langmuir_multi_analysis.png', dpi = 200)

tolerance_rel = 5e-2

print("error_rel    : " + str(error_rel))
print("tolerance_rel: " + str(tolerance_rel))

assert( error_rel < tolerance_rel )

# Check relative L-infinity spatial norm of rho/epsilon_0 - div(E)
# with current correction (and periodic single box option) or with Vay current deposition
if current_correction:
    tolerance = 1e-9
elif vay_deposition:
    tolerance = 1e-3
if current_correction or vay_deposition:
    rho  = data[('boxlib','rho')].to_ndarray()
    divE = data[('boxlib','divE')].to_ndarray()
    error_rel = np.amax( np.abs( divE - rho/epsilon_0 ) ) / np.amax( np.abs( rho/epsilon_0 ) )
    print("Check charge conservation:")
    print("error_rel = {}".format(error_rel))
    print("tolerance = {}".format(tolerance))
    assert( error_rel < tolerance )

if div_cleaning:
    ds_old = yt.load('Langmuir_multi_psatd_div_cleaning_plt000038')
    ds_mid = yt.load('Langmuir_multi_psatd_div_cleaning_plt000039')
    ds_new = yt.load(fn) # this is the last plotfile

    ad_old = ds_old.covering_grid(level = 0, left_edge = ds_old.domain_left_edge, dims = ds_old.domain_dimensions)
    ad_mid = ds_mid.covering_grid(level = 0, left_edge = ds_mid.domain_left_edge, dims = ds_mid.domain_dimensions)
    ad_new = ds_new.covering_grid(level = 0, left_edge = ds_new.domain_left_edge, dims = ds_new.domain_dimensions)

    rho   = ad_mid['rho'].v.squeeze()
    divE  = ad_mid['divE'].v.squeeze()
    F_old = ad_old['F'].v.squeeze()
    F_new = ad_new['F'].v.squeeze()

    # Check max norm of error on dF/dt = div(E) - rho/epsilon_0
    # (the time interval between the old and new data is 2*dt)
    dt = 1.203645751e-15
    x = F_new - F_old
    y = (divE - rho/epsilon_0) * 2 * dt
    error_rel = np.amax(np.abs(x - y)) / np.amax(np.abs(y))
    tolerance = 1e-2
    print("Check div(E) cleaning:")
    print("error_rel = {}".format(error_rel))
    print("tolerance = {}".format(tolerance))
    assert(error_rel < tolerance)

test_name = os.path.split(os.getcwd())[1]

if re.search( 'single_precision', fn ):
    checksumAPI.evaluate_checksum(test_name, fn, rtol=1.e-3)
else:
    checksumAPI.evaluate_checksum(test_name, fn)
Script analysis_2d.py
You can copy this file from Examples/Tests/langmuir/analysis_2d.py.
#!/usr/bin/env python3

# Copyright 2019 Jean-Luc Vay, Maxence Thevenet, Remi Lehe
#
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# This is a script that analyses the simulation results from
# the script `inputs.multi.rt`. This simulates a 2D periodic plasma wave.
# The electric field in the simulation is given (in theory) by:
# $$ E_x = \epsilon \,\frac{m_e c^2 k_x}{q_e}\sin(k_x x)\cos(k_y y)\cos(k_z z)\sin( \omega_p t)$$
# $$ E_y = \epsilon \,\frac{m_e c^2 k_y}{q_e}\cos(k_x x)\sin(k_y y)\cos(k_z z)\sin( \omega_p t)$$
# $$ E_z = \epsilon \,\frac{m_e c^2 k_z}{q_e}\cos(k_x x)\cos(k_y y)\sin(k_z z)\sin( \omega_p t)$$
import os
import re
import sys

import matplotlib.pyplot as plt
import yt
from mpl_toolkits.axes_grid1.axes_divider import make_axes_locatable

yt.funcs.mylog.setLevel(50)

import numpy as np
from scipy.constants import c, e, epsilon_0, m_e

sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI

# this will be the name of the plot file
fn = sys.argv[1]

# Parse test name and check if current correction (psatd.current_correction=1) is applied
current_correction = True if re.search( 'current_correction', fn ) else False

# Parse test name and check if Vay current deposition (algo.current_deposition=vay) is used
vay_deposition = True if re.search( 'Vay_deposition', fn ) else False

# Parse test name and check if particle_shape = 4 is used
particle_shape_4 = True if re.search('particle_shape_4', fn) else False

# Parameters (these parameters must match the parameters in `inputs.multi.rt`)
epsilon = 0.01
n = 4.e24
n_osc_x = 2
n_osc_z = 2
xmin = -20e-6; xmax = 20.e-6; Nx = 128
zmin = -20e-6; zmax = 20.e-6; Nz = 128

# Wave vector of the wave
kx = 2.*np.pi*n_osc_x/(xmax-xmin)
kz = 2.*np.pi*n_osc_z/(zmax-zmin)
# Plasma frequency
wp = np.sqrt((n*e**2)/(m_e*epsilon_0))

k = {'Ex':kx, 'Ez':kz}
cos = {'Ex': (0,1,1), 'Ez':(1,1,0)}

def get_contribution( is_cos, k ):
    du = (xmax-xmin)/Nx
    u = xmin + du*( 0.5 + np.arange(Nx) )
    if is_cos == 1:
        return( np.cos(k*u) )
    else:
        return( np.sin(k*u) )

def get_theoretical_field( field, t ):
    amplitude = epsilon * (m_e*c**2*k[field])/e * np.sin(wp*t)
    cos_flag = cos[field]
    x_contribution = get_contribution( cos_flag[0], kx )
    z_contribution = get_contribution( cos_flag[2], kz )

    E = amplitude * x_contribution[:, np.newaxis ] \
                  * z_contribution[np.newaxis, :]

    return( E )

# Read the file
ds = yt.load(fn)
t0 = ds.current_time.to_value()
data = ds.covering_grid(level = 0, left_edge = ds.domain_left_edge, dims = ds.domain_dimensions)
edge = np.array([(ds.domain_left_edge[1]).item(), (ds.domain_right_edge[1]).item(), \
                 (ds.domain_left_edge[0]).item(), (ds.domain_right_edge[0]).item()])

# Check the validity of the fields
error_rel = 0
for field in ['Ex', 'Ez']:
    E_sim = data[('mesh',field)].to_ndarray()[:,:,0]
    E_th = get_theoretical_field(field, t0)
    max_error = abs(E_sim-E_th).max()/abs(E_th).max()
    print('%s: Max error: %.2e' %(field,max_error))
    error_rel = max( error_rel, max_error )

# Plot the last field from the loop (Ez at iteration 40)
fig, (ax1, ax2) = plt.subplots(1, 2, dpi = 100)
# First plot
vmin = E_sim.min()
vmax = E_sim.max()
cax1 = make_axes_locatable(ax1).append_axes('right', size = '5%', pad = '5%')
im1 = ax1.imshow(E_sim, origin = 'lower', extent = edge, vmin = vmin, vmax = vmax)
cb1 = fig.colorbar(im1, cax = cax1)
ax1.set_xlabel(r'$z$')
ax1.set_ylabel(r'$x$')
ax1.set_title(r'$E_z$ (sim)')
# Second plot
vmin = E_th.min()
vmax = E_th.max()
cax2 = make_axes_locatable(ax2).append_axes('right', size = '5%', pad = '5%')
im2 = ax2.imshow(E_th, origin = 'lower', extent = edge, vmin = vmin, vmax = vmax)
cb2 = fig.colorbar(im2, cax = cax2)
ax2.set_xlabel(r'$z$')
ax2.set_ylabel(r'$x$')
ax2.set_title(r'$E_z$ (theory)')
# Save figure
fig.tight_layout()
fig.savefig('Langmuir_multi_2d_analysis.png', dpi = 200)

if particle_shape_4:
    # lower fidelity, due to smoothing
    tolerance_rel = 0.07
else:
    tolerance_rel = 0.05

print("error_rel    : " + str(error_rel))
print("tolerance_rel: " + str(tolerance_rel))

assert( error_rel < tolerance_rel )

# Check relative L-infinity spatial norm of rho/epsilon_0 - div(E)
# with current correction (and periodic single box option) or with Vay current deposition
if current_correction:
    tolerance = 1e-9
elif vay_deposition:
    tolerance = 1e-3
if current_correction or vay_deposition:
    rho  = data[('boxlib','rho')].to_ndarray()
    divE = data[('boxlib','divE')].to_ndarray()
    error_rel = np.amax( np.abs( divE - rho/epsilon_0 ) ) / np.amax( np.abs( rho/epsilon_0 ) )
    print("Check charge conservation:")
    print("error_rel = {}".format(error_rel))
    print("tolerance = {}".format(tolerance))
    assert( error_rel < tolerance )

test_name = os.path.split(os.getcwd())[1]
checksumAPI.evaluate_checksum(test_name, fn)
Script analysis_rz.py
You can copy this file from Examples/Tests/langmuir/analysis_rz.py.
#!/usr/bin/env python3

# Copyright 2019 David Grote, Maxence Thevenet
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# This is a script that analyses the simulation results from
# the script `inputs.multi.rz.rt`. This simulates an RZ periodic plasma wave.
# The electric field in the simulation is given (in theory) by:
# $$ E_r = -\partial_r \phi = \epsilon \,\frac{mc^2}{e}\frac{2\,r}{w_0^2} \exp\left(-\frac{r^2}{w_0^2}\right) \sin(k_0 z) \sin(\omega_p t)$$
# $$ E_z = -\partial_z \phi = - \epsilon \,\frac{mc^2}{e} k_0 \exp\left(-\frac{r^2}{w_0^2}\right) \cos(k_0 z) \sin(\omega_p t)$$
# Unrelated to the Langmuir waves, we also test the plotfile particle filter function in this
# analysis script.
import os
import re
import sys

import matplotlib

matplotlib.use('Agg')
import matplotlib.pyplot as plt
import yt

yt.funcs.mylog.setLevel(50)

import numpy as np
import post_processing_utils
from scipy.constants import c, e, epsilon_0, m_e

sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI

# this will be the name of the plot file
fn = sys.argv[1]

test_name = os.path.split(os.getcwd())[1]

# Parse test name and check if current correction (psatd.current_correction) is applied
current_correction = True if re.search('current_correction', fn) else False

# Parameters (these parameters must match the parameters in `inputs.multi.rz.rt`)
epsilon = 0.01
n = 2.e24
w0 = 5.e-6
n_osc_z = 2
rmin =   0e-6; rmax = 20.e-6; Nr = 64
zmin = -20e-6; zmax = 20.e-6; Nz = 128

# Wave vector of the wave
k0 = 2.*np.pi*n_osc_z/(zmax-zmin)
# Plasma frequency
wp = np.sqrt((n*e**2)/(m_e*epsilon_0))
kp = wp/c

def Er( z, r, epsilon, k0, w0, wp, t) :
    """
    Return the radial electric field as an array
    of the same length as z and r, in the half-plane theta=0
    """
    Er_array = \
        epsilon * m_e*c**2/e * 2*r/w0**2 * \
            np.exp( -r**2/w0**2 ) * np.sin( k0*z ) * np.sin( wp*t )
    return( Er_array )

def Ez( z, r, epsilon, k0, w0, wp, t) :
    """
    Return the longitudinal electric field as an array
    of the same length as z and r, in the half-plane theta=0
    """
    Ez_array = \
        - epsilon * m_e*c**2/e * k0 * \
            np.exp( -r**2/w0**2 ) * np.cos( k0*z ) * np.sin( wp*t )
    return( Ez_array )

# Read the file
ds = yt.load(fn)
t0 = ds.current_time.to_value()
data = ds.covering_grid(level=0, left_edge=ds.domain_left_edge,
                        dims=ds.domain_dimensions)

# Get cell centered coordinates
dr = (rmax - rmin)/Nr
dz = (zmax - zmin)/Nz
coords = np.indices([Nr, Nz],'d')
rr = rmin + (coords[0] + 0.5)*dr
zz = zmin + (coords[1] + 0.5)*dz

# Check the validity of the fields
overall_max_error = 0
Er_sim = data[('boxlib','Er')].to_ndarray()[:,:,0]
Er_th = Er(zz, rr, epsilon, k0, w0, wp, t0)
max_error = abs(Er_sim-Er_th).max()/abs(Er_th).max()
print('Er: Max error: %.2e' %(max_error))
overall_max_error = max( overall_max_error, max_error )

Ez_sim = data[('boxlib','Ez')].to_ndarray()[:,:,0]
Ez_th = Ez(zz, rr, epsilon, k0, w0, wp, t0)
max_error = abs(Ez_sim-Ez_th).max()/abs(Ez_th).max()
print('Ez: Max error: %.2e' %(max_error))
overall_max_error = max( overall_max_error, max_error )

# Plot the last field from the loop (Ez at iteration 40)
plt.subplot2grid( (1,2), (0,0) )
plt.imshow( Ez_sim )
plt.colorbar()
plt.title('Ez, last iteration\n(simulation)')
plt.subplot2grid( (1,2), (0,1) )
plt.imshow( Ez_th )
plt.colorbar()
plt.title('Ez, last iteration\n(theory)')
plt.tight_layout()
plt.savefig(test_name+'_analysis.png')

error_rel = overall_max_error

tolerance_rel = 0.12

print("error_rel    : " + str(error_rel))
print("tolerance_rel: " + str(tolerance_rel))

assert( error_rel < tolerance_rel )

# Check charge conservation (relative L-infinity norm of error) with current correction
if current_correction:
    divE = data[('boxlib','divE')].to_ndarray()
    rho  = data[('boxlib','rho')].to_ndarray() / epsilon_0
    error_rel = np.amax(np.abs(divE - rho)) / max(np.amax(divE), np.amax(rho))
    tolerance = 1.e-9
    print("Check charge conservation:")
    print("error_rel = {}".format(error_rel))
    print("tolerance = {}".format(tolerance))
    assert( error_rel < tolerance )


## In the final part of the test, we verify that the diagnostic particle filter function works as
## expected in RZ geometry. For this, we only use the last simulation timestep.

dim = "rz"
species_name = "electrons"

parser_filter_fn = "diags/diag_parser_filter000080"
parser_filter_expression = "(py-pz < 0) * (r<10e-6) * (z > 0)"
post_processing_utils.check_particle_filter(fn, parser_filter_fn, parser_filter_expression,
                                            dim, species_name)

uniform_filter_fn = "diags/diag_uniform_filter000080"
uniform_filter_expression = "ids%3 == 0"
post_processing_utils.check_particle_filter(fn, uniform_filter_fn, uniform_filter_expression,
                                            dim, species_name)

random_filter_fn = "diags/diag_random_filter000080"
random_fraction = 0.66
post_processing_utils.check_random_filter(fn, random_filter_fn, random_fraction,
                                          dim, species_name)

checksumAPI.evaluate_checksum(test_name, fn)
Script analysis_1d.py
You can copy this file from Examples/Tests/langmuir/analysis_1d.py.
#!/usr/bin/env python3

# Copyright 2019-2022 Jean-Luc Vay, Maxence Thevenet, Remi Lehe, Prabhat Kumar, Axel Huebl
#
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# This is a script that analyses the simulation results from
# the script `inputs.multi.rt`. This simulates a 1D periodic plasma wave.
# The electric field in the simulation is given (in theory) by:
# $$ E_z = \epsilon \,\frac{m_e c^2 k_z}{q_e}\sin(k_z z)\sin( \omega_p t)$$
import os
import re
import sys

import matplotlib

matplotlib.use('Agg')
import matplotlib.pyplot as plt
import yt

yt.funcs.mylog.setLevel(50)

import numpy as np
from scipy.constants import c, e, epsilon_0, m_e

sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI

# this will be the name of the plot file
fn = sys.argv[1]

# Parse test name and check if current correction (psatd.current_correction=1) is applied
current_correction = True if re.search( 'current_correction', fn ) else False

# Parse test name and check if Vay current deposition (algo.current_deposition=vay) is used
vay_deposition = True if re.search( 'Vay_deposition', fn ) else False

# Parameters (these parameters must match the parameters in `inputs.multi.rt`)
epsilon = 0.01
n = 4.e24
n_osc_z = 2
zmin = -20e-6; zmax = 20.e-6; Nz = 128

# Wave vector of the wave
kz = 2.*np.pi*n_osc_z/(zmax-zmin)
# Plasma frequency
wp = np.sqrt((n*e**2)/(m_e*epsilon_0))

k = {'Ez':kz}
cos = {'Ez':(1,1,0)}

def get_contribution( is_cos, k ):
    du = (zmax-zmin)/Nz
    u = zmin + du*( 0.5 + np.arange(Nz) )
    if is_cos == 1:
        return( np.cos(k*u) )
    else:
        return( np.sin(k*u) )

def get_theoretical_field( field, t ):
    amplitude = epsilon * (m_e*c**2*k[field])/e * np.sin(wp*t)
    cos_flag = cos[field]
    z_contribution = get_contribution( cos_flag[2], kz )

    E = amplitude * z_contribution

    return( E )

# Read the file
ds = yt.load(fn)
t0 = ds.current_time.to_value()
data = ds.covering_grid(level=0, left_edge=ds.domain_left_edge,
                                    dims=ds.domain_dimensions)
# Check the validity of the fields
error_rel = 0
for field in ['Ez']:
    E_sim = data[('mesh',field)].to_ndarray()[:,0,0]
    E_th = get_theoretical_field(field, t0)
    max_error = abs(E_sim-E_th).max()/abs(E_th).max()
    print('%s: Max error: %.2e' %(field,max_error))
    error_rel = max( error_rel, max_error )

# Plot the last field from the loop (Ez at iteration 80)
plt.subplot2grid( (1,2), (0,0) )
plt.plot( E_sim )
#plt.colorbar()
plt.title('Ez, last iteration\n(simulation)')
plt.subplot2grid( (1,2), (0,1) )
plt.plot( E_th )
#plt.colorbar()
plt.title('Ez, last iteration\n(theory)')
plt.tight_layout()
plt.savefig('langmuir_multi_1d_analysis.png')

tolerance_rel = 0.05

print("error_rel    : " + str(error_rel))
print("tolerance_rel: " + str(tolerance_rel))

assert( error_rel < tolerance_rel )

# Check relative L-infinity spatial norm of rho/epsilon_0 - div(E) when
# current correction (psatd.do_current_correction=1) is applied or when
# Vay current deposition (algo.current_deposition=vay) is used
if current_correction or vay_deposition:
    rho  = data[('boxlib','rho')].to_ndarray()
    divE = data[('boxlib','divE')].to_ndarray()
    error_rel = np.amax( np.abs( divE - rho/epsilon_0 ) ) / np.amax( np.abs( rho/epsilon_0 ) )
    tolerance = 1.e-9
    print("Check charge conservation:")
    print("error_rel = {}".format(error_rel))
    print("tolerance = {}".format(tolerance))
    assert( error_rel < tolerance )

test_name = os.path.split(os.getcwd())[1]
checksumAPI.evaluate_checksum(test_name, fn)
Visualize

Note

This section is TODO.

Capacitive Discharge

The examples in this directory are based on the benchmark cases from Turner et al. (Phys. Plasmas 20, 013507, 2013) [11].

The Monte-Carlo collision (MCC) model can be used to simulate electron and ion collisions with a neutral background gas. In particular, this can be used to study capacitive discharges between parallel plates. The implementation has been tested against the benchmark results from Turner et al. [11].
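
As a quick orientation before the full script below, here is a minimal sketch of how an MCC collision block is declared through PICMI. The cross-section file path is a placeholder relative to the warpx-data repository mentioned in the note below, and the numbers correspond to case 1 of Turner et al.; the complete setup (elastic, excitation and ionization processes, plus the ion collisions) appears in the input script that follows:

from pywarpx import picmi

electrons = picmi.Species(particle_type='electron', name='electrons')

# Electron collisions with a neutral helium background (case 1 of Turner et al.:
# background density 9.64e20 m^-3 at 300 K; He mass 6.67e-27 kg).
mcc_electrons = picmi.MCCCollisions(
    name='coll_elec',
    species=electrons,
    background_density=9.64e20,
    background_temperature=300.0,
    background_mass=6.67e-27,
    scattering_processes={
        'elastic': {'cross_section': 'MCC_cross_sections/He/electron_scattering.dat'},
    },
)

# The collision block is then attached to the simulation via
# picmi.Simulation(..., warpx_collisions=[mcc_electrons]), as in the full script below.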

Note

This example needs additional calibration data for cross sections. Download this data alongside your inputs file and update the paths in the inputs file:

git clone https://github.com/ECP-WarpX/warpx-data.git
Run

The 1D PICMI input file can be used to reproduce the results from Turner et al. for a given case N (with N from 1 to 4) by executing python3 PICMI_inputs_1d.py -n N, e.g.,

python3 PICMI_inputs_1d.py -n 1

For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
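
For example, to run case 1 on 4 MPI ranks:

mpiexec -n 4 python3 PICMI_inputs_1d.py -n 1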

You can copy this file from Examples/Physics_applications/capacitive_discharge/PICMI_inputs_1d.py.
#!/usr/bin/env python3
#
# --- Copyright 2021 Modern Electron (DSMC test added in 2023 by TAE Technologies)
# --- Monte-Carlo Collision script to reproduce the benchmark tests from
# --- Turner et al. (2013) - https://doi.org/10.1063/1.4775084

import argparse
import sys

import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse import linalg as sla

from pywarpx import callbacks, fields, libwarpx, particle_containers, picmi

constants = picmi.constants


class PoissonSolver1D(picmi.ElectrostaticSolver):
    """This solver is maintained as an example of the use of Python callbacks.
       However, it is not necessarily needed since the 1D code has the direct tridiagonal
       solver implemented."""

    def __init__(self, grid, **kwargs):
        """Direct solver for the Poisson equation using superLU. This solver is
        useful for 1D cases.

        Arguments:
            grid (picmi.Cartesian1DGrid): Instance of the grid on which the
            solver will be installed.
        """
        # Sanity check that this solver is appropriate to use
        if not isinstance(grid, picmi.Cartesian1DGrid):
            raise RuntimeError('Direct solver can only be used on a 1D grid.')

        super(PoissonSolver1D, self).__init__(
            grid=grid, method=kwargs.pop('method', 'Multigrid'),
            required_precision=1, **kwargs
        )

    def solver_initialize_inputs(self):
        """Grab geometrical quantities from the grid. The boundary potentials
        are also obtained from the grid using 'warpx_potential_zmin' for the
        left_voltage and 'warpx_potential_zmax' for the right_voltage.
        These can be given as floats or strings that can be parsed by the
        WarpX parser.
        """
        # grab the boundary potentials from the grid object
        self.right_voltage = self.grid.potential_zmax

        # set WarpX boundary potentials to None since we will handle it
        # ourselves in this solver
        self.grid.potential_xmin = None
        self.grid.potential_xmax = None
        self.grid.potential_ymin = None
        self.grid.potential_ymax = None
        self.grid.potential_zmin = None
        self.grid.potential_zmax = None

        super(PoissonSolver1D, self).solver_initialize_inputs()

        self.nz = self.grid.number_of_cells[0]
        self.dz = (self.grid.upper_bound[0] - self.grid.lower_bound[0]) / self.nz

        self.nxguardphi = 1
        self.nzguardphi = 1

        self.phi = np.zeros(self.nz + 1 + 2*self.nzguardphi)

        self.decompose_matrix()

        callbacks.installpoissonsolver(self._run_solve)

    def decompose_matrix(self):
        """Function to build the superLU object used to solve the linear
        system."""
        self.nsolve = self.nz + 1

        # Set up the computation matrix in order to solve A*phi = rho
        A = np.zeros((self.nsolve, self.nsolve))
        idx = np.arange(self.nsolve)
        A[idx, idx] = -2.0
        A[idx[1:], idx[:-1]] = 1.0
        A[idx[:-1], idx[1:]] = 1.0

        A[0, 1] = 0.0
        A[-1, -2] = 0.0
        A[0, 0] = 1.0
        A[-1, -1] = 1.0

        A = csc_matrix(A, dtype=np.float64)
        self.lu = sla.splu(A)

    def _run_solve(self):
        """Function run on every step to perform the required steps to solve
        Poisson's equation."""
        # get rho from WarpX
        self.rho_data = fields.RhoFPWrapper(0, False)[...]
        # run superLU solver to get phi
        self.solve()
        # write phi to WarpX
        fields.PhiFPWrapper(0, True)[...] = self.phi[:]

    def solve(self):
        """The solution step. Includes getting the boundary potentials and
        calculating phi from rho."""

        left_voltage = 0.0
        right_voltage = eval(
            self.right_voltage, {
                't': self.sim.extension.warpx.gett_new(0),
                'sin': np.sin, 'pi': np.pi
            }
        )

        # Construct b vector
        rho = -self.rho_data / constants.ep0
        b = np.zeros(rho.shape[0], dtype=np.float64)
        b[:] = rho * self.dz**2

        b[0] = left_voltage
        b[-1] = right_voltage

        phi = self.lu.solve(b)

        self.phi[self.nzguardphi:-self.nzguardphi] = phi

        self.phi[:self.nzguardphi] = left_voltage
        self.phi[-self.nzguardphi:] = right_voltage


class CapacitiveDischargeExample(object):
    '''The following runs a simulation of a parallel plate capacitor seeded
    with a plasma in the spacing between the plates. A time varying voltage is
    applied across the capacitor. The groups of 4 values below correspond to
    the 4 cases simulated by Turner et al. (2013) in their benchmarks of
    PIC-MCC codes.
    '''

    gap = 0.067 # m

    freq = 13.56e6 # Hz
    voltage = [450.0, 200.0, 150.0, 120.0] # V

    gas_density = [9.64e20, 32.1e20, 96.4e20, 321e20] # m^-3
    gas_temp = 300.0 # K
    m_ion = 6.67e-27 # kg

    plasma_density = [2.56e14, 5.12e14, 5.12e14, 3.84e14] # m^-3
    elec_temp = 30000.0 # K

    seed_nppc = 16 * np.array([32, 16, 8, 4])

    nz = [128, 256, 512, 512]

    dt = 1.0 / (np.array([400, 800, 1600, 3200]) * freq)

    # Total simulation time in seconds
    total_time = np.array([1280, 5120, 5120, 15360]) / freq
    # Time (in seconds) between diagnostic evaluations
    diag_interval = 32 / freq

    def __init__(self, n=0, test=False, pythonsolver=False, dsmc=False):
        """Get input parameters for the specific case (n) desired."""
        self.n = n
        self.test = test
        self.pythonsolver = pythonsolver
        self.dsmc = dsmc

        # Case specific input parameters
        self.voltage = f"{self.voltage[n]}*sin(2*pi*{self.freq:.5e}*t)"

        self.gas_density = self.gas_density[n]
        self.plasma_density = self.plasma_density[n]
        self.seed_nppc = self.seed_nppc[n]

        self.nz = self.nz[n]

        self.dt = self.dt[n]
        self.max_steps = int(self.total_time[n] / self.dt)
        self.diag_steps = int(self.diag_interval / self.dt)

        if self.test:
            self.max_steps = 50
            self.diag_steps = 5
            self.mcc_subcycling_steps = 2
            self.rng = np.random.default_rng(23094290)
        else:
            self.mcc_subcycling_steps = None
            self.rng = np.random.default_rng()

        self.ion_density_array = np.zeros(self.nz + 1)

        self.setup_run()

    def setup_run(self):
        """Setup simulation components."""

        #######################################################################
        # Set geometry and boundary conditions                                #
        #######################################################################

        self.grid = picmi.Cartesian1DGrid(
            number_of_cells=[self.nz],
            warpx_max_grid_size=128,
            lower_bound=[0],
            upper_bound=[self.gap],
            lower_boundary_conditions=['dirichlet'],
            upper_boundary_conditions=['dirichlet'],
            lower_boundary_conditions_particles=['absorbing'],
            upper_boundary_conditions_particles=['absorbing'],
            warpx_potential_hi_z=self.voltage,
        )

        #######################################################################
        # Field solver                                                        #
        #######################################################################

        if self.pythonsolver:
            self.solver = PoissonSolver1D(grid=self.grid)
        else:
            # This will use the tridiagonal solver
            self.solver = picmi.ElectrostaticSolver(grid=self.grid)

        #######################################################################
        # Particle types setup                                                #
        #######################################################################

        self.electrons = picmi.Species(
            particle_type='electron', name='electrons',
            initial_distribution=picmi.UniformDistribution(
                density=self.plasma_density,
                rms_velocity=[np.sqrt(constants.kb * self.elec_temp / constants.m_e)]*3,
            )
        )
        self.ions = picmi.Species(
            particle_type='He', name='he_ions',
            charge='q_e', mass=self.m_ion,
            initial_distribution=picmi.UniformDistribution(
                density=self.plasma_density,
                rms_velocity=[np.sqrt(constants.kb * self.gas_temp / self.m_ion)]*3,
            )
        )
        if self.dsmc:
            self.neutrals = picmi.Species(
                particle_type='He', name='neutrals',
                charge=0, mass=self.m_ion,
                warpx_reflection_model_zlo=1.0,
                warpx_reflection_model_zhi=1.0,
                warpx_do_resampling=True,
                warpx_resampling_trigger_max_avg_ppc=int(self.seed_nppc*1.5),
                initial_distribution=picmi.UniformDistribution(
                    density=self.gas_density,
                    rms_velocity=[np.sqrt(constants.kb * self.gas_temp / self.m_ion)]*3,
                )
            )

        #######################################################################
        # Collision initialization                                            #
        #######################################################################

        cross_sec_direc = '../../../../warpx-data/MCC_cross_sections/He/'
        electron_colls = picmi.MCCCollisions(
            name='coll_elec',
            species=self.electrons,
            background_density=self.gas_density,
            background_temperature=self.gas_temp,
            background_mass=self.ions.mass,
            ndt=self.mcc_subcycling_steps,
            scattering_processes={
                'elastic' : {
                    'cross_section' : cross_sec_direc+'electron_scattering.dat'
                },
                'excitation1' : {
                    'cross_section': cross_sec_direc+'excitation_1.dat',
                    'energy' : 19.82
                },
                'excitation2' : {
                    'cross_section': cross_sec_direc+'excitation_2.dat',
                    'energy' : 20.61
                },
                'ionization' : {
                    'cross_section' : cross_sec_direc+'ionization.dat',
                    'energy' : 24.55,
                    'species' : self.ions
                },
            }
        )

        ion_scattering_processes={
            'elastic': {'cross_section': cross_sec_direc+'ion_scattering.dat'},
            'back': {'cross_section': cross_sec_direc+'ion_back_scatter.dat'},
            # 'charge_exchange': {'cross_section': cross_sec_direc+'charge_exchange.dat'}
        }
        if self.dsmc:
            ion_colls = picmi.DSMCCollisions(
                name='coll_ion',
                species=[self.ions, self.neutrals],
                ndt=5, scattering_processes=ion_scattering_processes
            )
        else:
            ion_colls = picmi.MCCCollisions(
                name='coll_ion',
                species=self.ions,
                background_density=self.gas_density,
                background_temperature=self.gas_temp,
                ndt=self.mcc_subcycling_steps,
                scattering_processes=ion_scattering_processes
            )

        #######################################################################
        # Initialize simulation                                               #
        #######################################################################

        self.sim = picmi.Simulation(
            solver=self.solver,
            time_step_size=self.dt,
            max_steps=self.max_steps,
            warpx_collisions=[electron_colls, ion_colls],
            verbose=self.test
        )
        self.solver.sim = self.sim

        self.sim.add_species(
            self.electrons,
            layout = picmi.GriddedLayout(
                n_macroparticle_per_cell=[self.seed_nppc], grid=self.grid
            )
        )
        self.sim.add_species(
            self.ions,
            layout = picmi.GriddedLayout(
                n_macroparticle_per_cell=[self.seed_nppc], grid=self.grid
            )
        )
        if self.dsmc:
            self.sim.add_species(
                self.neutrals,
                layout = picmi.GriddedLayout(
                    n_macroparticle_per_cell=[self.seed_nppc//2], grid=self.grid
                )
            )
        self.solver.sim_ext = self.sim.extension

        if self.dsmc:
            # Periodically reset neutral density to starting temperature
            callbacks.installbeforecollisions(self.rethermalize_neutrals)

        #######################################################################
        # Add diagnostics for the CI test to be happy                         #
        #######################################################################

        if self.dsmc:
            file_prefix = 'Python_dsmc_1d_plt'
        else:
            if self.pythonsolver:
                file_prefix = 'Python_background_mcc_1d_plt'
            else:
                file_prefix = 'Python_background_mcc_1d_tridiag_plt'

        species = [self.electrons, self.ions]
        if self.dsmc:
            species.append(self.neutrals)
        particle_diag = picmi.ParticleDiagnostic(
            species=species,
            name='diag1',
            period=0,
            write_dir='.',
            warpx_file_prefix=file_prefix
        )
        field_diag = picmi.FieldDiagnostic(
            name='diag1',
            grid=self.grid,
            period=0,
            data_list=['rho_electrons', 'rho_he_ions'],
            write_dir='.',
            warpx_file_prefix=file_prefix
        )
        self.sim.add_diagnostic(particle_diag)
        self.sim.add_diagnostic(field_diag)

    def rethermalize_neutrals(self):
        # When using DSMC the neutral temperature will change due to collisions
        # with the ions. This is not captured in the original MCC test.
        # Re-thermalize the neutrals every 1000 steps
        step = self.sim.extension.warpx.getistep(lev=0)
        if step % 1000 != 10:
            return

        if not hasattr(self, 'neutral_cont'):
            self.neutral_cont = particle_containers.ParticleContainerWrapper(
                self.neutrals.name
            )

        ux_arrays = self.neutral_cont.uxp
        uy_arrays = self.neutral_cont.uyp
        uz_arrays = self.neutral_cont.uzp

        vel_std = np.sqrt(constants.kb * self.gas_temp / self.m_ion)
        for ii in range(len(ux_arrays)):
            nps = len(ux_arrays[ii])
            ux_arrays[ii][:] = vel_std * self.rng.normal(size=nps)
            uy_arrays[ii][:] = vel_std * self.rng.normal(size=nps)
            uz_arrays[ii][:] = vel_std * self.rng.normal(size=nps)

    def _get_rho_ions(self):
        # deposit the ion density in rho_fp
        he_ions_wrapper = particle_containers.ParticleContainerWrapper('he_ions')
        he_ions_wrapper.deposit_charge_density(level=0)

        rho_data = self.rho_wrapper[...]
        self.ion_density_array += rho_data / constants.q_e / self.diag_steps

    def run_sim(self):

        self.sim.step(self.max_steps - self.diag_steps)

        self.rho_wrapper = fields.RhoFPWrapper(0, False)
        callbacks.installafterstep(self._get_rho_ions)

        self.sim.step(self.diag_steps)

        if self.pythonsolver:
            # confirm that the external solver was run
            assert hasattr(self.solver, 'phi')

        if libwarpx.amr.ParallelDescriptor.MyProc() == 0:
            np.save(f'ion_density_case_{self.n+1}.npy', self.ion_density_array)

        # query the particle z-coordinates if this is run during CI testing
        # to cover that functionality
        if self.test:
            he_ions_wrapper = particle_containers.ParticleContainerWrapper('he_ions')
            nparts = he_ions_wrapper.get_particle_count(local=True)
            z_coords = np.concatenate(he_ions_wrapper.zp)
            assert len(z_coords) == nparts
            assert np.all(z_coords >= 0.0) and np.all(z_coords <= self.gap)


##########################
# parse input parameters
##########################

parser = argparse.ArgumentParser()
parser.add_argument(
    '-t', '--test', help='toggle whether this script is run as a short CI test',
    action='store_true',
)
parser.add_argument(
    '-n', help='Test number to run (1 to 4)', required=False, type=int,
    default=1
)
parser.add_argument(
    '--pythonsolver', help='toggle whether to use the Python level solver',
    action='store_true'
)
parser.add_argument(
    '--dsmc', help='toggle whether to use DSMC for ions in place of MCC',
    action='store_true'
)
args, left = parser.parse_known_args()
sys.argv = sys.argv[:1]+left

if args.n < 1 or args.n > 4:
    raise AttributeError('Test number must be an integer from 1 to 4.')

run = CapacitiveDischargeExample(
    n=args.n-1, test=args.test, pythonsolver=args.pythonsolver, dsmc=args.dsmc
)
run.run_sim()
Analyze

Once the simulation completes, an output file ion_density_case_1.npy (numbered according to the chosen case, see the np.save call in the script above) is created, which can be compared to the literature results as in the plot below. Running case 1 on four CPU processes takes roughly 20 minutes to complete.
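
For a quick first look at the result, the saved profile can be loaded and plotted directly. The short sketch below assumes the case-1 output file produced by the script above (ion_density_case_1.npy); the normalization of the z-axis and the plotting choices are illustrative only.

import matplotlib.pyplot as plt
import numpy as np

# Load the time-averaged ion density written at the end of the run (case 1)
ion_density = np.load('ion_density_case_1.npy')

# The absolute grid coordinates depend on the chosen case, so simply plot
# against the normalized gap position here
z = np.linspace(0, 1, ion_density.size)
plt.plot(z, ion_density)
plt.xlabel('z / gap length')
plt.ylabel(r'ion density (m$^{-3}$)')
plt.tight_layout()
plt.savefig('ion_density_case_1.png')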

Visualize

The figure below compares the ion density as calculated in WarpX (in June 2022 with PR #3118) with the literature results (which can be found in the supplementary materials of Turner et al.).

MCC benchmark against Turner et al. [11].

Kinetic-fluid Hybrid Models

WarpX includes a reduced plasma model in which electrons are treated as a massless fluid while ions are kinetically evolved, and Ohm’s law is used to calculate the electric field. This model is appropriate for problems in which ion kinetics dominate (ion cyclotron waves, for instance). See the theory section for more details. Several examples and benchmarks of this kinetic-fluid hybrid model are provided below. A few of the examples replicate the verification tests described in Muñoz et al. [1]. The hybrid-PIC model was added to WarpX in PR #3665; the figures in the examples below were generated at that time.
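
Schematically, the electric field in such a quasi-neutral hybrid model follows from the massless-electron momentum balance. A generic textbook form is given below for orientation only; the exact expression, options and discretization that WarpX uses are documented in the theory section.

\[
\vec{E} \;=\; -\,\vec{u}_e \times \vec{B} \;-\; \frac{\nabla \cdot \mathbf{P}_e}{e\, n_e} \;+\; \eta\, \vec{J},
\qquad
\vec{u}_e = \vec{u}_i - \frac{\vec{J}}{e\, n_e},
\qquad
\vec{J} = \frac{\nabla \times \vec{B}}{\mu_0}.
\]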

Ohm solver: Electromagnetic modes

In this example, a simulation is seeded with a thermal plasma while an initial magnetic field is applied in either the \(z\) or \(x\) direction. The simulation is advanced for a large number of steps and the resulting fields are Fourier analyzed for Alfvén mode excitations.
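
The post-processing idea is a two-dimensional Fourier transform of a field component recorded as a function of \(z\) and \(t\); the power in the resulting \((k, \omega)\) plane then shows which dispersion branches are excited. The minimal sketch below uses a placeholder array in place of the recorded data (all names and values here are illustrative); the full analysis is in analysis.py further down.

import numpy as np

# field_zt stands in for a recorded field component of shape (n_steps, n_z);
# dz and dt are the cell size and output interval (placeholder values)
n_steps, n_z, dz, dt = 256, 128, 1.0, 1.0
field_zt = np.random.default_rng(0).standard_normal((n_steps, n_z))

# 2D FFT: time -> omega along axis 0, z -> k along axis 1
field_kw = np.fft.fftshift(np.fft.fft2(field_zt))
k = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(n_z, dz))
w = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(n_steps, dt))
# plotting abs(field_kw)**2 over the (k, w) plane reveals the excited modes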

Run

The same input script can be used for 1d, 2d or 3d Cartesian simulations, and can excite either the parallel-propagating or the ion-Bernstein modes, as indicated below.

Script PICMI_inputs.py
You can copy this file from Examples/Tests/ohm_solver_EM_modes/PICMI_inputs.py.
#!/usr/bin/env python3
#
# --- Test script for the kinetic-fluid hybrid model in WarpX wherein ions are
# --- treated as kinetic particles and electrons as an isothermal, inertialess
# --- background fluid. The script is set up to produce either parallel or
# --- perpendicular (Bernstein) EM modes and can be run in 1d, 2d or 3d
# --- Cartesian geometries. See Section 4.2 and 4.3 of Munoz et al. (2018).
# --- As a CI test only a small number of steps are taken using the 1d version.

import argparse
import os
import sys

import dill
import numpy as np
from mpi4py import MPI as mpi

from pywarpx import callbacks, fields, libwarpx, picmi

constants = picmi.constants

comm = mpi.COMM_WORLD

simulation = picmi.Simulation(
    warpx_serialize_initial_conditions=True,
    verbose=0
)


class EMModes(object):
    '''The following runs a simulation of an uniform plasma at a set
    temperature (Te = Ti) with an external magnetic field applied in either the
    z-direction (parallel to domain) or x-direction (perpendicular to domain).
    The analysis script (in this same directory) analyzes the output field data
    for EM modes. This input is based on the EM modes tests as described by
    Munoz et al. (2018) and tests done by Scott Nicks at TAE Technologies.
    '''
    # Applied field parameters
    B0          = 0.25 # Initial magnetic field strength (T)
    beta        = [0.01, 0.1] # Plasma beta, used to calculate temperature

    # Plasma species parameters
    m_ion       = [100.0, 400.0] # Ion mass (electron masses)
    vA_over_c  = [1e-4, 1e-3] # ratio of Alfven speed and the speed of light

    # Spatial domain
    Nz          = [1024, 1920] # number of cells in z direction
    Nx          = 8 # number of cells in x (and y) direction for >1 dimensions

    # Temporal domain (if not run as a CI test)
    LT          = 300.0 # Simulation temporal length (ion cyclotron periods)

    # Numerical parameters
    NPPC        = [1024, 256, 64] # Seed number of particles per cell
    DZ          = 1.0 / 10.0 # Cell size (ion skin depths)
    DT          = [5e-3, 4e-3] # Time step (ion cyclotron periods)

    # Plasma resistivity - used to dampen the mode excitation
    eta = [[1e-7, 1e-7], [1e-7, 1e-5], [1e-7, 1e-4]]
    # Number of substeps used to update B
    substeps = 20

    def __init__(self, test, dim, B_dir, verbose):
        """Get input parameters for the specific case desired."""
        self.test = test
        self.dim = int(dim)
        self.B_dir = B_dir
        self.verbose = verbose or self.test

        # sanity check
        assert (dim > 0 and dim < 4), f"{dim}-dimensions not a valid input"

        # get simulation parameters from the defaults given the direction of
        # the initial B-field and the dimensionality
        self.get_simulation_parameters()

        # calculate various plasma parameters based on the simulation input
        self.get_plasma_quantities()

        self.dz = self.DZ * self.l_i
        self.Lz = self.Nz * self.dz
        self.Lx = self.Nx * self.dz

        self.dt = self.DT * self.t_ci

        if not self.test:
            self.total_steps = int(self.LT / self.DT)
        else:
            # if this is a test case run for only a small number of steps
            self.total_steps = 250
        # output diagnostics 20 times per cyclotron period
        self.diag_steps = int(1.0/20 / self.DT)

        # dump all the current attributes to a dill pickle file
        if comm.rank == 0:
            with open('sim_parameters.dpkl', 'wb') as f:
                dill.dump(self, f)

        # print out plasma parameters
        if comm.rank == 0:
            print(
                f"Initializing simulation with input parameters:\n"
                f"\tT = {self.T_plasma:.3f} eV\n"
                f"\tn = {self.n_plasma:.1e} m^-3\n"
                f"\tB0 = {self.B0:.2f} T\n"
                f"\tM/m = {self.m_ion:.0f}\n"
            )
            print(
                f"Plasma parameters:\n"
                f"\tl_i = {self.l_i:.1e} m\n"
                f"\tt_ci = {self.t_ci:.1e} s\n"
                f"\tv_ti = {self.v_ti:.1e} m/s\n"
                f"\tvA = {self.vA:.1e} m/s\n"
            )
            print(
                f"Numerical parameters:\n"
                f"\tdz = {self.dz:.1e} m\n"
                f"\tdt = {self.dt:.1e} s\n"
                f"\tdiag steps = {self.diag_steps:d}\n"
                f"\ttotal steps = {self.total_steps:d}\n"
            )

        self.setup_run()

    def get_simulation_parameters(self):
        """Pick appropriate parameters from the defaults given the direction
        of the B-field and the simulation dimensionality."""
        if self.B_dir == 'z':
            idx = 0
            self.Bx = 0.0
            self.By = 0.0
            self.Bz = self.B0
        elif self.B_dir == 'y':
            idx = 1
            self.Bx = 0.0
            self.By = self.B0
            self.Bz = 0.0
        else:
            idx = 1
            self.Bx = self.B0
            self.By = 0.0
            self.Bz = 0.0

        self.beta = self.beta[idx]
        self.m_ion = self.m_ion[idx]
        self.vA_over_c = self.vA_over_c[idx]
        self.Nz = self.Nz[idx]
        self.DT = self.DT[idx]

        self.NPPC = self.NPPC[self.dim-1]
        self.eta = self.eta[self.dim-1][idx]

    def get_plasma_quantities(self):
        """Calculate various plasma parameters based on the simulation input."""
        # Ion mass (kg)
        self.M = self.m_ion * constants.m_e

        # Cyclotron angular frequency (rad/s) and period (s)
        self.w_ci = constants.q_e * abs(self.B0) / self.M
        self.t_ci = 2.0 * np.pi / self.w_ci

        # Alfven speed (m/s): vA = B / sqrt(mu0 * n * (M + m)) = c * omega_ci / w_pi
        self.vA = self.vA_over_c * constants.c
        self.n_plasma = (
            (self.B0 / self.vA)**2 / (constants.mu0 * (self.M + constants.m_e))
        )

        # Ion plasma frequency (Hz)
        self.w_pi = np.sqrt(
            constants.q_e**2 * self.n_plasma / (self.M * constants.ep0)
        )

        # Skin depth (m)
        self.l_i = constants.c / self.w_pi

        # Ion thermal velocity (m/s) from beta = 2 * (v_ti / vA)**2
        self.v_ti = np.sqrt(self.beta / 2.0) * self.vA

        # Temperature (eV) from thermal speed: v_ti = sqrt(kT / M)
        self.T_plasma = self.v_ti**2 * self.M / constants.q_e # eV

        # Larmor radius (m)
        self.rho_i = self.v_ti / self.w_ci

    def setup_run(self):
        """Setup simulation components."""

        #######################################################################
        # Set geometry and boundary conditions                                #
        #######################################################################

        if self.dim == 1:
            grid_object = picmi.Cartesian1DGrid
        elif self.dim == 2:
            grid_object = picmi.Cartesian2DGrid
        else:
            grid_object = picmi.Cartesian3DGrid

        self.grid = grid_object(
            number_of_cells=[self.Nx, self.Nx, self.Nz][-self.dim:],
            warpx_max_grid_size=self.Nz,
            lower_bound=[-self.Lx/2.0, -self.Lx/2.0, 0][-self.dim:],
            upper_bound=[self.Lx/2.0, self.Lx/2.0, self.Lz][-self.dim:],
            lower_boundary_conditions=['periodic']*self.dim,
            upper_boundary_conditions=['periodic']*self.dim
        )
        simulation.time_step_size = self.dt
        simulation.max_steps = self.total_steps
        simulation.current_deposition_algo = 'direct'
        simulation.particle_shape = 1
        simulation.verbose = self.verbose

        #######################################################################
        # Field solver and external field                                     #
        #######################################################################

        self.solver = picmi.HybridPICSolver(
            grid=self.grid,
            Te=self.T_plasma, n0=self.n_plasma, plasma_resistivity=self.eta,
            substeps=self.substeps
        )
        simulation.solver = self.solver

        B_ext = picmi.AnalyticInitialField(
            Bx_expression=self.Bx,
            By_expression=self.By,
            Bz_expression=self.Bz
        )
        simulation.add_applied_field(B_ext)

        #######################################################################
        # Particle types setup                                                #
        #######################################################################

        self.ions = picmi.Species(
            name='ions', charge='q_e', mass=self.M,
            initial_distribution=picmi.UniformDistribution(
                density=self.n_plasma,
                rms_velocity=[self.v_ti]*3,
            )
        )
        simulation.add_species(
            self.ions,
            layout=picmi.PseudoRandomLayout(
                grid=self.grid, n_macroparticles_per_cell=self.NPPC
            )
        )

        #######################################################################
        # Add diagnostics                                                     #
        #######################################################################

        if self.B_dir == 'z':
            self.output_file_name = 'par_field_data.txt'
        else:
            self.output_file_name = 'perp_field_data.txt'

        if self.test:
            particle_diag = picmi.ParticleDiagnostic(
                name='field_diag',
                period=self.total_steps,
                write_dir='.',
                warpx_file_prefix='Python_ohms_law_solver_EM_modes_1d_plt',
                # warpx_format = 'openpmd',
                # warpx_openpmd_backend = 'h5'
            )
            simulation.add_diagnostic(particle_diag)
            field_diag = picmi.FieldDiagnostic(
                name='field_diag',
                grid=self.grid,
                period=self.total_steps,
                data_list=['B', 'E', 'J_displacement'],
                write_dir='.',
                warpx_file_prefix='Python_ohms_law_solver_EM_modes_1d_plt',
                # warpx_format = 'openpmd',
                # warpx_openpmd_backend = 'h5'
            )
            simulation.add_diagnostic(field_diag)

        if self.B_dir == 'z' or self.dim == 1:
            line_diag = picmi.ReducedDiagnostic(
                diag_type='FieldProbe',
                probe_geometry='Line',
                z_probe=0,
                z1_probe=self.Lz,
                resolution=self.Nz - 1,
                name=self.output_file_name[:-4],
                period=self.diag_steps,
                path='diags/'
            )
            simulation.add_diagnostic(line_diag)
        else:
            # install a custom "reduced diagnostic" to save the average field
            callbacks.installafterEsolve(self._record_average_fields)
            try:
                os.mkdir("diags")
            except OSError:
                # diags directory already exists
                pass
            with open(f"diags/{self.output_file_name}", 'w') as f:
                f.write(
                   "[0]step() [1]time(s) [2]z_coord(m) "
                   "[3]Ez_lev0-(V/m) [4]Bx_lev0-(T) [5]By_lev0-(T)\n"
                )

        #######################################################################
        # Initialize simulation                                               #
        #######################################################################

        simulation.initialize_inputs()
        simulation.initialize_warpx()

    def _record_average_fields(self):
        """A custom reduced diagnostic to store the average E&M fields in a
        similar format as the reduced diagnostic so that the same analysis
        script can be used regardless of the simulation dimension.
        """
        step = simulation.extension.warpx.getistep(lev=0) - 1

        if step % self.diag_steps != 0:
            return

        Bx_warpx = fields.BxWrapper()[...]
        By_warpx = fields.ByWrapper()[...]
        Ez_warpx = fields.EzWrapper()[...]

        if libwarpx.amr.ParallelDescriptor.MyProc() != 0:
            return

        t = step * self.dt
        z_vals = np.linspace(0, self.Lz, self.Nz, endpoint=False)

        if self.dim == 2:
            Ez = np.mean(Ez_warpx[:-1], axis=0)
            Bx = np.mean(Bx_warpx[:-1], axis=0)
            By = np.mean(By_warpx[:-1], axis=0)
        else:
            Ez = np.mean(Ez_warpx[:-1, :-1], axis=(0, 1))
            Bx = np.mean(Bx_warpx[:-1], axis=(0, 1))
            By = np.mean(By_warpx[:-1], axis=(0, 1))

        with open(f"diags/{self.output_file_name}", 'a') as f:
            for ii in range(self.Nz):
                f.write(
                    f"{step:05d} {t:.10e} {z_vals[ii]:.10e} {Ez[ii]:+.10e} "
                    f"{Bx[ii]:+.10e} {By[ii]:+.10e}\n"
                )


##########################
# parse input parameters
##########################

parser = argparse.ArgumentParser()
parser.add_argument(
    '-t', '--test', help='toggle whether this script is run as a short CI test',
    action='store_true',
)
parser.add_argument(
    '-d', '--dim', help='Simulation dimension', required=False, type=int,
    default=1
)
parser.add_argument(
    '--bdir', help='Direction of the B-field', required=False,
    choices=['x', 'y', 'z'], default='z'
)
parser.add_argument(
    '-v', '--verbose', help='Verbose output', action='store_true',
)
args, left = parser.parse_known_args()
sys.argv = sys.argv[:1]+left

run = EMModes(test=args.test, dim=args.dim, B_dir=args.bdir, verbose=args.verbose)
simulation.step()

For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.

For parallel-propagating modes, execute:

python3 PICMI_inputs.py --dim {1/2/3} --bdir z

For perpendicular (ion-Bernstein) modes, execute:

python3 PICMI_inputs.py --dim {1/2/3} --bdir {x/y}
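
For example, a 1d run of the parallel-propagating case on four MPI ranks could be launched with (the rank count here is illustrative):

mpiexec -n 4 python3 PICMI_inputs.py --dim 1 --bdir z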
Analyze

The following script reads the simulation output from the above example, performs Fourier transforms of the field data and compares the calculated spectrum to the theoretical dispersions.

Script analysis.py
You can copy this file from Examples/Tests/ohm_solver_EM_modes/analysis.py.
#!/usr/bin/env python3
#
# --- Analysis script for the hybrid-PIC example producing EM modes.

import dill
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

from pywarpx import picmi

constants = picmi.constants

matplotlib.rcParams.update({'font.size': 20})

# load simulation parameters
with open('sim_parameters.dpkl', 'rb') as f:
    sim = dill.load(f)

if sim.B_dir == 'z':
    field_idx_dict = {'z': 4, 'Ez': 7, 'Bx': 8, 'By': 9}
    data = np.loadtxt("diags/par_field_data.txt", skiprows=1)
else:
    if sim.dim == 1:
        field_idx_dict = {'z': 4, 'Ez': 7, 'Bx': 8, 'By': 9}
    else:
        field_idx_dict = {'z': 2, 'Ez': 3, 'Bx': 4, 'By': 5}
    data = np.loadtxt("diags/perp_field_data.txt", skiprows=1)

# step, t, z, Ez, Bx, By = raw_data.T
step = data[:,0]

num_steps = len(np.unique(step))

# get the spatial resolution
resolution = len(np.where(step == 0)[0]) - 1

# reshape to separate spatial and time coordinates
sim_data = data.reshape((num_steps, resolution+1, data.shape[1]))

z_grid = sim_data[1, :, field_idx_dict['z']]
idx = np.argsort(z_grid)[1:]
dz = np.mean(np.diff(z_grid[idx]))
dt = np.mean(np.diff(sim_data[:,0,1]))

data = np.zeros((num_steps, resolution, 3))
for i in range(num_steps):
    data[i,:,0] = sim_data[i,idx,field_idx_dict['Bx']]
    data[i,:,1] = sim_data[i,idx,field_idx_dict['By']]
    data[i,:,2] = sim_data[i,idx,field_idx_dict['Ez']]

print(f"Data file contains {num_steps} time snapshots.")
print(f"Spatial resolution is {resolution}")

def get_analytic_R_mode(w):
    return w / np.sqrt(1.0 + abs(w))

def get_analytic_L_mode(w):
    return w / np.sqrt(1.0 - abs(w))

if sim.B_dir == 'z':
    global_norm = (
        1.0 / (2.0*constants.mu0)
        / ((3.0/2)*sim.n_plasma*sim.T_plasma*constants.q_e)
    )
else:
    global_norm = (
        constants.ep0 / 2.0
        / ((3.0/2)*sim.n_plasma*sim.T_plasma*constants.q_e)
    )

if sim.B_dir == 'z':
    Bl = (data[:, :, 0] + 1.0j * data[:, :, 1]) / np.sqrt(2.0)
    field_kw = np.fft.fftshift(np.fft.fft2(Bl))
else:
    field_kw = np.fft.fftshift(np.fft.fft2(data[:, :, 2]))

w_norm = sim.w_ci
if sim.B_dir == 'z':
    k_norm = 1.0 / sim.l_i
else:
    k_norm = 1.0 / sim.rho_i

k = 2*np.pi * np.fft.fftshift(np.fft.fftfreq(resolution, dz)) / k_norm
w = 2*np.pi * np.fft.fftshift(np.fft.fftfreq(num_steps, dt)) / w_norm
w = -np.flipud(w)

# aspect = (xmax-xmin)/(ymax-ymin) / aspect_true
extent = [k[0], k[-1], w[0], w[-1]]

fig, ax1 = plt.subplots(1, 1, figsize=(10, 7.25))

if sim.B_dir == 'z' and sim.dim == 1:
    vmin = -3
    vmax = 3.5
else:
    vmin = None
    vmax = None

im = ax1.imshow(
    np.log10(np.abs(field_kw**2) * global_norm), extent=extent,
    aspect="equal", cmap='inferno', vmin=vmin, vmax=vmax
)

# Colorbars
fig.subplots_adjust(right=0.5)
cbar_ax = fig.add_axes([0.525, 0.15, 0.03, 0.7])
fig.colorbar(im, cax=cbar_ax, orientation='vertical')

#cbar_lab = r'$\log_{10}(\frac{|B_{R/L}|^2}{2\mu_0}\frac{2}{3n_0k_BT_e})$'
if sim.B_dir == 'z':
    cbar_lab = r'$\log_{10}(\beta_{R/L})$'
else:
    cbar_lab = r'$\log_{10}(\varepsilon_0|E_z|^2/(3n_0k_BT_e))$'
cbar_ax.set_ylabel(cbar_lab, rotation=270, labelpad=30)

if sim.B_dir == 'z':
    # plot the L mode
    ax1.plot(get_analytic_L_mode(w), np.abs(w), c='limegreen', ls='--', lw=1.25,
            label='L mode:\n'+r'$(kl_i)^2=\frac{(\omega/\Omega_i)^2}{1-\omega/\Omega_i}$')
    # plot the R mode
    ax1.plot(get_analytic_R_mode(w), -np.abs(w), c='limegreen', ls='-.', lw=1.25,
        label='R mode:\n'+r'$(kl_i)^2=\frac{(\omega/\Omega_i)^2}{1+\omega/\Omega_i}$')

    ax1.plot(k,1.0+3.0*sim.v_ti/w_norm*k*k_norm, c='limegreen', ls=':', lw=1.25, label = r'$\omega = \Omega_i + 3v_{th,i} k$')
    ax1.plot(k,1.0-3.0*sim.v_ti/w_norm*k*k_norm, c='limegreen', ls=':', lw=1.25)

else:
    # digitized values from Munoz et al. (2018)
    x = [0.006781609195402272, 0.1321379310344828, 0.2671034482758621, 0.3743678160919539, 0.49689655172413794, 0.6143908045977011, 0.766022988505747, 0.885448275862069, 1.0321149425287355, 1.193862068965517, 1.4417701149425288, 1.7736781609195402]
    y = [-0.033194664836814436, 0.5306857657503109, 1.100227301968521, 1.5713856842646996, 2.135780760818287, 2.675601492473303, 3.3477291246729854, 3.8469357121413563, 4.4317021915340735, 5.1079898786293265, 6.10275764463696, 7.310074194793499]
    ax1.plot(x, y, c='limegreen', ls='-.', lw=1.5, label="X mode")

    x = [3.9732873563218387, 3.6515862068965514, 3.306275862068966, 2.895655172413793, 2.4318850574712645, 2.0747586206896553, 1.8520229885057473, 1.6589195402298849, 1.4594942528735633, 1.2911724137931033, 1.1551264367816092, 1.0335402298850576, 0.8961149425287356, 0.7419770114942528, 0.6141379310344828, 0.4913103448275862]
    y = [1.1145945018655916, 1.1193978642192393, 1.1391259596002916, 1.162971222713042, 1.1986533430544237, 1.230389844319595, 1.2649997855641806, 1.3265857528841618, 1.3706737573444268, 1.4368486511986962, 1.4933310460179268, 1.5485268259210019, 1.6386327572157655, 1.7062658146416778, 1.7828194021529358, 1.8533687867221342]
    ax1.plot(x, y, c='limegreen', ls=':', lw=2, label="Bernstein modes")

    x = [3.9669885057471266, 3.6533333333333333, 3.3213563218390805, 2.9646896551724136, 2.6106436781609195, 2.2797011494252875, 1.910919540229885, 1.6811724137931034, 1.4499540229885057, 1.2577011494252872, 1.081057471264368, 0.8791494252873564, 0.7153103448275862]
    y = [2.2274306300124374, 2.2428271218424327, 2.272505039241755, 2.3084873697302397, 2.3586224642964364, 2.402667581592829, 2.513873997512545, 2.5859673199811297, 2.6586610627439207, 2.7352146502551786, 2.8161427284813656, 2.887850066475104, 2.9455761890466183]
    ax1.plot(x, y, c='limegreen', ls=':', lw=2)

    x = [3.9764137931034487, 3.702022988505747, 3.459793103448276, 3.166712643678161, 2.8715862068965516, 2.5285057471264367, 2.2068505747126435, 1.9037011494252871, 1.6009885057471265, 1.3447816091954023, 1.1538850574712645, 0.9490114942528736]
    y = [3.3231976669382854, 3.34875841660591, 3.378865205643951, 3.424454260839731, 3.474160483767209, 3.522194107303684, 3.6205343740618434, 3.7040356821203417, 3.785435519149119, 3.868851052879873, 3.9169704507440923, 3.952481022429987]
    ax1.plot(x, y, c='limegreen', ls=':', lw=2)

    x = [3.953609195402299, 3.7670114942528734, 3.5917471264367817, 3.39735632183908, 3.1724137931034484, 2.9408045977011494, 2.685977011494253, 2.4593563218390804, 2.2203218390804595, 2.0158850574712646, 1.834183908045977, 1.6522758620689655, 1.4937471264367814, 1.3427586206896551, 1.2075402298850575]
    y = [4.427971008277223, 4.458335120298495, 4.481579963117039, 4.495861388686366, 4.544581206844791, 4.587425483552773, 4.638160998413175, 4.698631899472488, 4.757987734271133, 4.813955483123902, 4.862332203971352, 4.892481880173264, 4.9247759145687695, 4.947934983059571, 4.953124329888064]
    ax1.plot(x, y, c='limegreen', ls=':', lw=2)

# ax1.legend(loc='upper left')
fig.legend(loc=7, fontsize=18)

if sim.B_dir == 'z':
    ax1.set_xlabel(r'$k l_i$')
    ax1.set_title('$B_{R/L} = B_x \pm iB_y$')
    fig.suptitle("Parallel EM modes")
    ax1.set_xlim(-3, 3)
    ax1.set_ylim(-6, 3)
    dir_str = 'par'
else:
    ax1.set_xlabel(r'$k \rho_i$')
    ax1.set_title('$E_z(k, \omega)$')
    fig.suptitle(f"Perpendicular EM modes (ion Bernstein) - {sim.dim}D")
    ax1.set_xlim(-3, 3)
    ax1.set_ylim(0, 8)
    dir_str = 'perp'

ax1.set_ylabel(r'$\omega / \Omega_i$')

plt.savefig(
    f"spectrum_{dir_str}_{sim.dim}d_{sim.substeps}_substeps_{sim.eta}_eta.png",
    bbox_inches='tight'
)
if not sim.test:
    plt.show()

if sim.test:
    import os
    import sys
    sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
    import checksumAPI

    # this will be the name of the plot file
    fn = sys.argv[1]
    test_name = os.path.split(os.getcwd())[1]
    checksumAPI.evaluate_checksum(test_name, fn)

Right- and left-circularly polarized electromagnetic waves are supported through the cyclotron motion of the ions, except in a region of thermal resonances, as indicated in the plot below.
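
The theoretical curves overlaid on the figure are the cold-plasma R- and L-mode dispersion relations together with the thermal-resonance lines, in the normalization used by the analysis script above:

\[
(k l_i)^2 = \frac{(\omega/\Omega_i)^2}{1 + \omega/\Omega_i} \;\;\text{(R mode)},
\qquad
(k l_i)^2 = \frac{(\omega/\Omega_i)^2}{1 - \omega/\Omega_i} \;\;\text{(L mode)},
\qquad
\omega = \Omega_i \pm 3 v_{th,i} k .
\]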

Parallel EM modes in thermal ion plasma

Calculated Alfvén wave spectrum with the theoretical dispersions overlaid.

Perpendicularly propagating modes are also supported, commonly referred to as ion-Bernstein modes.

Perpendicular modes in thermal ion plasma

Calculated ion Bernstein waves spectrum with the theoretical dispersion overlaid.

Ohm solver: Cylindrical normal modes

An RZ-geometry example case for normal modes propagating along an applied magnetic field in a cylinder is also available. The analytical solution for these modes is described in Stix [12], Chapter 6, Sec. 2.
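
The fast and slow branches plotted by the analysis script below are evaluated, for each radial mode number \(m\), from

\[
R^2 = \tfrac{1}{2}\left[\nu_m^2\left(1 + \kappa^2\right) + k^2\left(\kappa^2 + 2\right)\right],
\qquad
P^4 = k^2\left(\nu_m^2 + k^2\right),
\qquad
\omega_{\mathrm{fast/slow}} = v_A \sqrt{R^2 \pm \sqrt{R^4 - P^4}},
\]

where \(\kappa = k l_i\) and \(\nu_m\) is the \(m\)-th zero of \(J_1\) divided by the cylinder radius; this is a direct transcription of the expressions used in analysis_rz.py.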

Run

The following script initializes a thermal plasma in a metallic cylinder with periodic boundaries at the cylinder ends.

Script PICMI_inputs_rz.py
You can copy this file from Examples/Tests/ohm_solver_EM_modes/PICMI_inputs_rz.py.
#!/usr/bin/env python3
#
# --- Test script for the kinetic-fluid hybrid model in WarpX wherein ions are
# --- treated as kinetic particles and electrons as an isothermal, inertialess
# --- background fluid. The script is set up to produce parallel normal EM modes
# --- in a metallic cylinder and is run in RZ geometry.
# --- As a CI test only a small number of steps are taken.

import argparse
import sys

import dill
import numpy as np
from mpi4py import MPI as mpi

from pywarpx import picmi

constants = picmi.constants

comm = mpi.COMM_WORLD

simulation = picmi.Simulation(verbose=0)


class CylindricalNormalModes(object):
    '''The following runs a simulation of an uniform plasma at a set ion
    temperature (and Te = 0) with an external magnetic field applied in the
    z-direction (parallel to domain).
    The analysis script (in this same directory) analyzes the output field
    data for EM modes.
    '''
    # Applied field parameters
    B0          = 0.5 # Initial magnetic field strength (T)
    beta        = 0.01 # Plasma beta, used to calculate temperature

    # Plasma species parameters
    m_ion       = 400.0 # Ion mass (electron masses)
    vA_over_c   = 5e-3 # ratio of Alfven speed and the speed of light

    # Spatial domain
    Nz          = 512 # number of cells in z direction
    Nr          = 128 # number of cells in r direction

    # Temporal domain (if not run as a CI test)
    LT          = 800.0 # Simulation temporal length (ion cyclotron periods)

    # Numerical parameters
    NPPC        = 8000 # Seed number of particles per cell
    DZ          = 0.4 # Cell size (ion skin depths)
    DR          = 0.4 # Cell size (ion skin depths)
    DT          = 0.02 # Time step (ion cyclotron periods)

    # Plasma resistivity - used to dampen the mode excitation
    eta = 5e-4
    # Number of substeps used to update B
    substeps = 20

    def __init__(self, test, verbose):
        """Get input parameters for the specific case desired."""
        self.test = test
        self.verbose = verbose or self.test

        # calculate various plasma parameters based on the simulation input
        self.get_plasma_quantities()

        if not self.test:
            self.total_steps = int(self.LT / self.DT)
        else:
            # if this is a test case run for only a small number of steps
            self.total_steps = 100
            # and make the grid and particle count smaller
            self.Nz = 128
            self.Nr = 64
            self.NPPC = 200
        # output diagnostics 5 times per cyclotron period
        self.diag_steps = max(10, int(1.0 / 5 / self.DT))

        self.Lz = self.Nz * self.DZ * self.l_i
        self.Lr = self.Nr * self.DR * self.l_i

        self.dt = self.DT * self.t_ci

        # dump all the current attributes to a dill pickle file
        if comm.rank == 0:
            with open('sim_parameters.dpkl', 'wb') as f:
                dill.dump(self, f)

        # print out plasma parameters
        if comm.rank == 0:
            print(
                f"Initializing simulation with input parameters:\n"
                f"\tT = {self.T_plasma:.3f} eV\n"
                f"\tn = {self.n_plasma:.1e} m^-3\n"
                f"\tB0 = {self.B0:.2f} T\n"
                f"\tM/m = {self.m_ion:.0f}\n"
            )
            print(
                f"Plasma parameters:\n"
                f"\tl_i = {self.l_i:.1e} m\n"
                f"\tt_ci = {self.t_ci:.1e} s\n"
                f"\tv_ti = {self.v_ti:.1e} m/s\n"
                f"\tvA = {self.vA:.1e} m/s\n"
            )
            print(
                f"Numerical parameters:\n"
                f"\tdt = {self.dt:.1e} s\n"
                f"\tdiag steps = {self.diag_steps:d}\n"
                f"\ttotal steps = {self.total_steps:d}\n",
                flush=True
            )
        self.setup_run()

    def get_plasma_quantities(self):
        """Calculate various plasma parameters based on the simulation input."""
        # Ion mass (kg)
        self.M = self.m_ion * constants.m_e

        # Cyclotron angular frequency (rad/s) and period (s)
        self.w_ci = constants.q_e * abs(self.B0) / self.M
        self.t_ci = 2.0 * np.pi / self.w_ci

        # Alfven speed (m/s): vA = B / sqrt(mu0 * n * (M + m)) = c * omega_ci / w_pi
        self.vA = self.vA_over_c * constants.c
        self.n_plasma = (
            (self.B0 / self.vA)**2 / (constants.mu0 * (self.M + constants.m_e))
        )

        # Ion plasma frequency (Hz)
        self.w_pi = np.sqrt(
            constants.q_e**2 * self.n_plasma / (self.M * constants.ep0)
        )

        # Skin depth (m)
        self.l_i = constants.c / self.w_pi

        # Ion thermal velocity (m/s) from beta = 2 * (v_ti / vA)**2
        self.v_ti = np.sqrt(self.beta / 2.0) * self.vA

        # Temperature (eV) from thermal speed: v_ti = sqrt(kT / M)
        self.T_plasma = self.v_ti**2 * self.M / constants.q_e # eV

        # Larmor radius (m)
        self.rho_i = self.v_ti / self.w_ci

    def setup_run(self):
        """Setup simulation components."""

        #######################################################################
        # Set geometry and boundary conditions                                #
        #######################################################################

        self.grid = picmi.CylindricalGrid(
            number_of_cells=[self.Nr, self.Nz],
            warpx_max_grid_size=self.Nz,
            lower_bound=[0, -self.Lz/2.0],
            upper_bound=[self.Lr, self.Lz/2.0],
            lower_boundary_conditions = ['none', 'periodic'],
            upper_boundary_conditions = ['dirichlet', 'periodic'],
            lower_boundary_conditions_particles = ['absorbing', 'periodic'],
            upper_boundary_conditions_particles = ['reflecting', 'periodic']
        )
        simulation.time_step_size = self.dt
        simulation.max_steps = self.total_steps
        simulation.current_deposition_algo = 'direct'
        simulation.particle_shape = 1
        simulation.verbose = self.verbose

        #######################################################################
        # Field solver and external field                                     #
        #######################################################################

        self.solver = picmi.HybridPICSolver(
            grid=self.grid,
            Te=0.0, n0=self.n_plasma, plasma_resistivity=self.eta,
            substeps=self.substeps,
            n_floor=self.n_plasma*0.05
        )
        simulation.solver = self.solver

        B_ext = picmi.AnalyticInitialField(
            Bz_expression=self.B0
        )
        simulation.add_applied_field(B_ext)

        #######################################################################
        # Particle types setup                                                #
        #######################################################################

        self.ions = picmi.Species(
            name='ions', charge='q_e', mass=self.M,
            initial_distribution=picmi.UniformDistribution(
                density=self.n_plasma,
                rms_velocity=[self.v_ti]*3,
            )
        )
        simulation.add_species(
            self.ions,
            layout=picmi.PseudoRandomLayout(
                grid=self.grid, n_macroparticles_per_cell=self.NPPC
            )
        )

        #######################################################################
        # Add diagnostics                                                     #
        #######################################################################

        field_diag = picmi.FieldDiagnostic(
            name='field_diag',
            grid=self.grid,
            period=self.diag_steps,
            data_list=['B', 'E'],
            write_dir='diags',
            warpx_file_prefix='field_diags',
            warpx_format='openpmd',
            warpx_openpmd_backend='h5',
        )
        simulation.add_diagnostic(field_diag)

        # add particle diagnostic for checksum
        if self.test:
            part_diag = picmi.ParticleDiagnostic(
                name='diag1',
                period=self.total_steps,
                species=[self.ions],
                data_list=['ux', 'uy', 'uz', 'weighting'],
                write_dir='.',
                warpx_file_prefix='Python_ohms_law_solver_EM_modes_rz_plt'
            )
            simulation.add_diagnostic(part_diag)


##########################
# parse input parameters
##########################

parser = argparse.ArgumentParser()
parser.add_argument(
    '-t', '--test', help='toggle whether this script is run as a short CI test',
    action='store_true',
)
parser.add_argument(
    '-v', '--verbose', help='Verbose output', action='store_true',
)
args, left = parser.parse_known_args()
sys.argv = sys.argv[:1]+left

run = CylindricalNormalModes(test=args.test, verbose=args.verbose)
simulation.step()

The example can be executed using:

python3 PICMI_inputs_rz.py
Analyze

After the simulation completes, the following script can be used to analyze the field evolution and extract the normal-mode dispersion relation. It performs a standard Fourier transform along the cylinder axis and a Hankel transform in the radial direction.
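
Schematically, the azimuthal electric field is expanded in a Fourier-Bessel series,

\[
E_\theta(r, z, t) \;\approx\; \sum_m a_m(z, t)\, J_1\!\left(\nu_m \frac{r}{R}\right),
\qquad
a_m(z, t) \;\propto\; \int_0^R E_\theta(r, z, t)\, J_1\!\left(\nu_m \frac{r}{R}\right) r \,\mathrm{d}r,
\]

with \(R\) the cylinder radius and \(\nu_m\) the zeros of \(J_1\). The discrete version of this projection is encoded in the matrix A constructed with scipy.special in the script below, while the \(z\) and \(t\) directions are handled with standard FFTs.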

Script analysis_rz.py
You can copy this file from Examples/Tests/ohm_solver_EM_modes/analysis_rz.py.
#!/usr/bin/env python3
#
# --- Analysis script for the hybrid-PIC example producing EM modes.

import dill
import matplotlib.pyplot as plt
import numpy as np
import scipy.fft as fft
from matplotlib import colors
from openpmd_viewer import OpenPMDTimeSeries
from scipy.interpolate import RegularGridInterpolator
from scipy.special import j1, jn, jn_zeros

from pywarpx import picmi

constants = picmi.constants

# load simulation parameters
with open('sim_parameters.dpkl', 'rb') as f:
    sim = dill.load(f)

diag_dir = "diags/field_diags"

ts = OpenPMDTimeSeries(diag_dir, check_all_files=True)

def transform_spatially(data_for_transform):
    # interpolate from regular r-grid to special r-grid
    interp = RegularGridInterpolator(
        (info.z, info.r), data_for_transform,
        method='linear'
    )
    data_interp = interp((zg, rg))

    # Applying manual hankel in r
    # Fmz = np.sum(proj*data_for_transform, axis=(2,3))
    Fmz = np.einsum('ijkl,kl->ij', proj, data_interp)
    # Standard fourier in z
    Fmn = fft.fftshift(fft.fft(Fmz, axis=1), axes=1)
    return Fmn

def process(it):
    print(f"Processing iteration {it}", flush=True)
    field, info = ts.get_field('E', 'y', iteration=it)
    F_k = transform_spatially(field)
    return F_k

# grab the first iteration to get the grids
Bz, info = ts.get_field('B', 'z', iteration=0)

nr = len(info.r)
nz = len(info.z)

nkr = 12 # number of radial modes to solve for

r_max = np.max(info.r)

# create r-grid with points spaced out according to zeros of the Bessel function
r_grid = jn_zeros(1, nr) / jn_zeros(1, nr)[-1] * r_max

zg, rg = np.meshgrid(info.z, r_grid)

# Setup Hankel Transform
j_1M = jn_zeros(1, nr)[-1]
r_modes = np.arange(nkr)

A = (
    4.0 * np.pi * r_max**2 / j_1M**2
    * j1(np.outer(jn_zeros(1, max(r_modes)+1)[r_modes], jn_zeros(1, nr)) / j_1M)
    / jn(2, jn_zeros(1, nr))**2
)

# No transformation for z
B = np.identity(nz)

# combine projection arrays
proj = np.einsum('ab,cd->acbd', A, B)

results = np.zeros((len(ts.t), nkr, nz), dtype=complex)
for ii, it in enumerate(ts.iterations):
    results[ii] = process(it)

# now Fourier transform in time
F_kw = fft.fftshift(fft.fft(results, axis=0), axes=0)

dz = info.z[1] - info.z[0]
kz = 2*np.pi*fft.fftshift(fft.fftfreq(F_kw[0].shape[1], dz))
dt = ts.iterations[1] - ts.iterations[0]
omega = 2*np.pi*fft.fftshift(fft.fftfreq(F_kw.shape[0], sim.dt*dt))

# Save data for future plotting purposes
np.savez(
    "diags/spectrograms.npz",
    F_kw=F_kw, dz=dz, kz=kz, dt=dt, omega=omega
)

# plot the resulting dispersions
k = np.linspace(0, 250, 500)
kappa = k * sim.l_i

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(6.75, 5))

vmin = [2e-3, 1.5e-3, 7.5e-4, 5e-4]
vmax = 1.0

# plot a selection of radial modes (m = 1, 3, 6, 8)
for ii, m in enumerate([1, 3, 6, 8]):
    ax = axes.flatten()[ii]
    ax.set_title(f"m = {m}", fontsize=11)
    m -= 1
    pm1 = ax.pcolormesh(
        kz*sim.l_i, omega/sim.w_ci,
        abs(F_kw[:, m, :])/np.max(abs(F_kw[:, m, :])),
        norm=colors.LogNorm(vmin=vmin[ii], vmax=vmax),
        cmap='inferno'
    )
    cb = fig.colorbar(pm1, ax=ax)
    cb.set_label(r'Normalized $E_\theta(k_z, m, \omega)$')

    # Get dispersion relation - see for example
    # T. Stix, Waves in Plasmas (American Inst. of Physics, 1992), Chap 6, Sec 2
    nu_m = jn_zeros(1, m+1)[-1] / sim.Lr
    R2 = 0.5 * (nu_m**2 * (1.0 + kappa**2) + k**2 * (kappa**2 + 2.0))
    P4 = k**2 * (nu_m**2 + k**2)
    omega_fast = sim.vA * np.sqrt(R2 + np.sqrt(R2**2 - P4))
    omega_slow = sim.vA * np.sqrt(R2 - np.sqrt(R2**2 - P4))
    # Upper right corner
    ax.plot(k*sim.l_i, omega_fast/sim.w_ci, 'w--', label = f"$\omega_{{fast}}$")
    ax.plot(k*sim.l_i, omega_slow/sim.w_ci, color='white', linestyle='--', label = f"$\omega_{{slow}}$")
    # Thermal resonance
    thermal_res = sim.w_ci + 3*sim.v_ti*k
    ax.plot(k*sim.l_i, thermal_res/sim.w_ci, color='magenta', linestyle='--', label = "$\omega = \Omega_i + 3v_{th,i}k$")
    ax.plot(-k*sim.l_i, thermal_res/sim.w_ci, color='magenta', linestyle='--', label = "")
    thermal_res = sim.w_ci - 3*sim.v_ti*k
    ax.plot(k*sim.l_i, thermal_res/sim.w_ci, color='magenta', linestyle='--', label = "$\omega = \Omega_i - 3v_{th,i}k$")
    ax.plot(-k*sim.l_i, thermal_res/sim.w_ci, color='magenta', linestyle='--', label = "")


for ax in axes.flatten():
    ax.set_xlim(-1.75, 1.75)
    ax.set_ylim(0, 1.6)

axes[0, 0].set_ylabel('$\omega/\Omega_{ci}$')
axes[1, 0].set_ylabel('$\omega/\Omega_{ci}$')
axes[1, 0].set_xlabel('$k_zl_i$')
axes[1, 1].set_xlabel('$k_zl_i$')

plt.savefig('normal_modes_disp.png', dpi=600)
if not sim.test:
    plt.show()
else:
    plt.close()

    # check if power spectrum sampling match earlier results
    amps = np.abs(F_kw[2, 1, len(kz)//2-2:len(kz)//2+2])
    print("Amplitude sample: ", amps)
    assert np.allclose(
        amps, np.array([ 61.02377286,  19.80026021, 100.47687017,  10.83331295])
    )

if sim.test:
    import os
    import sys
    sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
    import checksumAPI

    # this will be the name of the plot file
    fn = sys.argv[1]
    test_name = os.path.split(os.getcwd())[1]
    checksumAPI.evaluate_checksum(test_name, fn, rtol=1e-6)

The following figure was produced with the above analysis script, showing excellent agreement between the calculated and theoretical dispersion relations.

Normal EM modes in a metallic cylinder

Cylindrical normal mode dispersion comparing the calculated spectrum with the theoretical one.

Ohm solver: Ion Beam R Instability

In this example, a low-density ion beam interacts with a “core” plasma population, which induces an instability. Depending on the relative density of the beam and the core plasma, either a resonant or a non-resonant condition can be accessed.
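
In the input script below, the beam and core drift velocities are chosen so that the total ion momentum vanishes while the relative drift equals \(U_{bc} v_A\):

\[
u_b = \frac{U_{bc}\, v_A}{1 + n_b/n_c},
\qquad
u_c = -\,\frac{n_b/n_c}{1 + n_b/n_c}\, U_{bc}\, v_A,
\]

so that \(n_b u_b + n_c u_c = 0\) and \(u_b - u_c = U_{bc} v_A\), where \(n_b/n_c\) is the beam-to-core density ratio (0.02 for the resonant case and 0.1 for the non-resonant case).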

Run

The same input script can be used for 1d, 2d or 3d simulations, and can reproduce either the resonant or the non-resonant condition, as indicated below.

Script PICMI_inputs.py
You can copy this file from Examples/Tests/ohm_solver_ion_beam_instability/PICMI_inputs.py.
#!/usr/bin/env python3
#
# --- Test script for the kinetic-fluid hybrid model in WarpX wherein ions are
# --- treated as kinetic particles and electrons as an isothermal, inertialess
# --- background fluid. The script simulates an ion beam instability wherein a
# --- low density ion beam interacts with background plasma. See Section 6.5 of
# --- Matthews (1994) and Section 4.4 of Munoz et al. (2018).

import argparse
import os
import sys
import time

import dill
import numpy as np
from mpi4py import MPI as mpi

from pywarpx import callbacks, fields, libwarpx, particle_containers, picmi

constants = picmi.constants

comm = mpi.COMM_WORLD

simulation = picmi.Simulation(
    warpx_serialize_initial_conditions=True,
    verbose=0
)


class HybridPICBeamInstability(object):
    '''This input is based on the ion beam R instability test as described by
    Munoz et al. (2018).
    '''
    # Applied field parameters
    B0          = 0.25 # Initial magnetic field strength (T)
    beta        = 1.0 # Plasma beta, used to calculate temperature

    # Plasma species parameters
    m_ion      = 100.0 # Ion mass (electron masses)
    vA_over_c  = 1e-4 # ratio of Alfven speed and the speed of light

    # Spatial domain
    Nz          = 1024 # number of cells in z direction
    Nx          = 8 # number of cells in x (and y) direction for >1 dimensions

    # Temporal domain (if not run as a CI test)
    LT          = 120.0 # Simulation temporal length (ion cyclotron periods)

    # Numerical parameters
    NPPC        = [1024, 256, 64] # Seed number of particles per cell
    DZ          = 1.0 / 4.0 # Cell size (ion skin depths)
    DT          = 0.01 # Time step (ion cyclotron periods)

    # Plasma resistivity - used to dampen the mode excitation
    eta = 1e-7
    # Number of substeps used to update B
    substeps = 10

    # Beam parameters
    n_beam = [0.02, 0.1]
    U_bc = 10.0 # relative drifts between beam and core in Alfven speeds

    def __init__(self, test, dim, resonant, verbose):
        """Get input parameters for the specific case desired."""
        self.test = test
        self.dim = int(dim)
        self.resonant = resonant
        self.verbose = verbose or self.test

        # sanity check
        assert (dim > 0 and dim < 4), f"{dim}-dimensions not a valid input"

        # calculate various plasma parameters based on the simulation input
        self.get_plasma_quantities()

        self.n_beam = self.n_beam[1 - int(resonant)]
        self.u_beam = 1.0 / (1.0 + self.n_beam) * self.U_bc * self.vA
        self.u_c = -1.0 * self.n_beam / (1.0 + self.n_beam) * self.U_bc * self.vA
        self.n_beam = self.n_beam * self.n_plasma

        self.dz = self.DZ * self.l_i
        self.Lz = self.Nz * self.dz
        self.Lx = self.Nx * self.dz

        if self.dim == 3:
            self.volume = self.Lx * self.Lx * self.Lz
            self.N_cells = self.Nx * self.Nx * self.Nz
        elif self.dim == 2:
            self.volume = self.Lx * self.Lz
            self.N_cells = self.Nx * self.Nz
        else:
            self.volume = self.Lz
            self.N_cells = self.Nz

        diag_period = 1 / 4.0 # Output interval (ion cyclotron periods)
        self.diag_steps = int(diag_period / self.DT)

        # if this is a test case run for only 25 cyclotron periods
        if self.test:
            self.LT = 25.0

        self.total_steps = int(np.ceil(self.LT / self.DT))

        self.dt = self.DT / self.w_ci

        # dump all the current attributes to a dill pickle file
        if comm.rank == 0:
            with open('sim_parameters.dpkl', 'wb') as f:
                dill.dump(self, f)

        # print out plasma parameters
        if comm.rank == 0:
            print(
                f"Initializing simulation with input parameters:\n"
                f"\tT = {self.T_plasma*1e-3:.1f} keV\n"
                f"\tn = {self.n_plasma:.1e} m^-3\n"
                f"\tB0 = {self.B0:.2f} T\n"
                f"\tM/m = {self.m_ion:.0f}\n"
            )
            print(
                f"Plasma parameters:\n"
                f"\tl_i = {self.l_i:.1e} m\n"
                f"\tt_ci = {self.t_ci:.1e} s\n"
                f"\tv_ti = {self.v_ti:.1e} m/s\n"
                f"\tvA = {self.vA:.1e} m/s\n"
            )
            print(
                f"Numerical parameters:\n"
                f"\tdz = {self.dz:.1e} m\n"
                f"\tdt = {self.dt:.1e} s\n"
                f"\tdiag steps = {self.diag_steps:d}\n"
                f"\ttotal steps = {self.total_steps:d}\n"
            )

        self.setup_run()

    def get_plasma_quantities(self):
        """Calculate various plasma parameters based on the simulation input."""
        # Ion mass (kg)
        self.M = self.m_ion * constants.m_e

        # Cyclotron angular frequency (rad/s) and period (s)
        self.w_ci = constants.q_e * abs(self.B0) / self.M
        self.t_ci = 2.0 * np.pi / self.w_ci

        # Alfven speed (m/s): vA = B / sqrt(mu0 * n * (M + m)) = c * omega_ci / w_pi
        self.vA = self.vA_over_c * constants.c
        self.n_plasma = (
            (self.B0 / self.vA)**2 / (constants.mu0 * (self.M + constants.m_e))
        )

        # Ion plasma frequency (Hz)
        self.w_pi = np.sqrt(
            constants.q_e**2 * self.n_plasma / (self.M * constants.ep0)
        )

        # Skin depth (m)
        self.l_i = constants.c / self.w_pi

        # Ion thermal velocity (m/s) from beta = 2 * (v_ti / vA)**2
        self.v_ti = np.sqrt(self.beta / 2.0) * self.vA

        # Temperature (eV) from thermal speed: v_ti = sqrt(kT / M)
        self.T_plasma = self.v_ti**2 * self.M / constants.q_e # eV

        # Larmor radius (m)
        self.rho_i = self.v_ti / self.w_ci

    def setup_run(self):
        """Setup simulation components."""

        #######################################################################
        # Set geometry and boundary conditions                                #
        #######################################################################

        if self.dim == 1:
            grid_object = picmi.Cartesian1DGrid
        elif self.dim == 2:
            grid_object = picmi.Cartesian2DGrid
        else:
            grid_object = picmi.Cartesian3DGrid

        self.grid = grid_object(
            number_of_cells=[self.Nx, self.Nx, self.Nz][-self.dim:],
            warpx_max_grid_size=self.Nz,
            lower_bound=[-self.Lx/2.0, -self.Lx/2.0, 0][-self.dim:],
            upper_bound=[self.Lx/2.0, self.Lx/2.0, self.Lz][-self.dim:],
            lower_boundary_conditions=['periodic']*self.dim,
            upper_boundary_conditions=['periodic']*self.dim
        )
        simulation.time_step_size = self.dt
        simulation.max_steps = self.total_steps
        simulation.current_deposition_algo = 'direct'
        simulation.particle_shape = 1
        simulation.verbose = self.verbose

        #######################################################################
        # Field solver and external field                                     #
        #######################################################################

        self.solver = picmi.HybridPICSolver(
            grid=self.grid, gamma=1.0,
            Te=self.T_plasma/10.0,
            n0=self.n_plasma+self.n_beam,
            plasma_resistivity=self.eta, substeps=self.substeps
        )
        simulation.solver = self.solver

        B_ext = picmi.AnalyticInitialField(
            Bx_expression=0.0,
            By_expression=0.0,
            Bz_expression=self.B0
        )
        simulation.add_applied_field(B_ext)

        #######################################################################
        # Particle types setup                                                #
        #######################################################################

        self.ions = picmi.Species(
            name='ions', charge='q_e', mass=self.M,
            initial_distribution=picmi.UniformDistribution(
                density=self.n_plasma,
                rms_velocity=[self.v_ti]*3,
                directed_velocity=[0, 0, self.u_c]
            )
        )
        simulation.add_species(
            self.ions,
            layout=picmi.PseudoRandomLayout(
                grid=self.grid, n_macroparticles_per_cell=self.NPPC[self.dim-1]
            )
        )
        self.beam_ions = picmi.Species(
            name='beam_ions', charge='q_e', mass=self.M,
            initial_distribution=picmi.UniformDistribution(
                density=self.n_beam,
                rms_velocity=[self.v_ti]*3,
                directed_velocity=[0, 0, self.u_beam]
            )
        )
        simulation.add_species(
            self.beam_ions,
            layout=picmi.PseudoRandomLayout(
                grid=self.grid,
                n_macroparticles_per_cell=self.NPPC[self.dim-1]/2
            )
        )

        #######################################################################
        # Add diagnostics                                                     #
        #######################################################################

        callbacks.installafterstep(self.energy_diagnostic)
        callbacks.installafterstep(self.text_diag)

        if self.test:
            part_diag = picmi.ParticleDiagnostic(
                name='diag1',
                period=1250,
                species=[self.ions, self.beam_ions],
                data_list = ['ux', 'uy', 'uz', 'z', 'weighting'],
                write_dir='.',
                warpx_file_prefix='Python_ohms_law_solver_ion_beam_1d_plt',
            )
            simulation.add_diagnostic(part_diag)
            field_diag = picmi.FieldDiagnostic(
                name='diag1',
                grid=self.grid,
                period=1250,
                data_list = ['Bx', 'By', 'Bz', 'Ex', 'Ey', 'Ez', 'Jx', 'Jy', 'Jz'],
                write_dir='.',
                warpx_file_prefix='Python_ohms_law_solver_ion_beam_1d_plt',
            )
            simulation.add_diagnostic(field_diag)

        # output the full particle data at t*w_ci = 40
        step = int(40.0 / self.DT)
        parts_diag = picmi.ParticleDiagnostic(
            name='parts_diag',
            period=f"{step}:{step}",
            species=[self.ions, self.beam_ions],
            write_dir='diags',
            warpx_file_prefix='Python_hybrid_PIC_plt',
            warpx_format = 'openpmd',
            warpx_openpmd_backend = 'h5'
        )
        simulation.add_diagnostic(parts_diag)

        self.output_file_name = 'field_data.txt'
        if self.dim == 1:
            line_diag = picmi.ReducedDiagnostic(
                diag_type='FieldProbe',
                probe_geometry='Line',
                z_probe=0,
                z1_probe=self.Lz,
                resolution=self.Nz - 1,
                name=self.output_file_name[:-4],
                period=self.diag_steps,
                path='diags/'
            )
            simulation.add_diagnostic(line_diag)
        else:
            # install a custom "reduced diagnostic" to save the average field
            callbacks.installafterEsolve(self._record_average_fields)
            try:
                os.mkdir("diags")
            except OSError:
                # diags directory already exists
                pass
            with open(f"diags/{self.output_file_name}", 'w') as f:
                f.write("[0]step() [1]time(s) [2]z_coord(m) [3]By_lev0-(T)\n")


        #######################################################################
        # Initialize simulation                                               #
        #######################################################################

        simulation.initialize_inputs()
        simulation.initialize_warpx()

        # create particle container wrapper for the ion species to access
        # particle data
        self.ion_container_wrapper = particle_containers.ParticleContainerWrapper(
            self.ions.name
        )
        self.beam_ion_container_wrapper = particle_containers.ParticleContainerWrapper(
            self.beam_ions.name
        )

    def _create_data_arrays(self):
        self.prev_time = time.time()
        self.start_time = self.prev_time
        self.prev_step = 0

        if libwarpx.amr.ParallelDescriptor.MyProc() == 0:
            # allocate arrays for storing energy values
            self.energy_vals = np.zeros((self.total_steps//self.diag_steps, 4))

    def text_diag(self):
        """Diagnostic function to print out timing data and particle numbers."""
        step = simulation.extension.warpx.getistep(lev=0) - 1

        if not hasattr(self, "prev_time"):
            self._create_data_arrays()

        if step % (self.total_steps // 10) != 0:
            return

        wall_time = time.time() - self.prev_time
        steps = step - self.prev_step
        step_rate = steps / wall_time

        status_dict = {
            'step': step,
            'nplive beam ions': self.beam_ion_container_wrapper.nps,
            'nplive ions': self.ion_container_wrapper.nps,
            'wall_time': wall_time,
            'step_rate': step_rate,
            "diag_steps": self.diag_steps,
            'iproc': None
        }

        diag_string = (
            "Step #{step:6d}; "
            "{nplive beam ions} beam ions; "
            "{nplive ions} core ions; "
            "{wall_time:6.1f} s wall time; "
            "{step_rate:4.2f} steps/s"
        )

        if libwarpx.amr.ParallelDescriptor.MyProc() == 0:
            print(diag_string.format(**status_dict))

        self.prev_time = time.time()
        self.prev_step = step

    def energy_diagnostic(self):
        """Diagnostic to get the total, magnetic and kinetic energies in the
        simulation."""
        step = simulation.extension.warpx.getistep(lev=0) - 1

        if step % self.diag_steps != 1:
            return

        idx = (step - 1) // self.diag_steps

        if not hasattr(self, "prev_time"):
            self._create_data_arrays()

        # get the simulation energies
        Ec_par, Ec_perp = self._get_kinetic_energy(self.ion_container_wrapper)
        Eb_par, Eb_perp = self._get_kinetic_energy(self.beam_ion_container_wrapper)

        if libwarpx.amr.ParallelDescriptor.MyProc() != 0:
            return

        self.energy_vals[idx, 0] = Ec_par
        self.energy_vals[idx, 1] = Ec_perp
        self.energy_vals[idx, 2] = Eb_par
        self.energy_vals[idx, 3] = Eb_perp

        if step == self.total_steps:
            np.save('diags/energies.npy', self.energy_vals)

    def _get_kinetic_energy(self, container_wrapper):
        """Utility function to retrieve the total kinetic energy in the
        simulation."""
        try:
            ux = np.concatenate(container_wrapper.get_particle_ux())
            uy = np.concatenate(container_wrapper.get_particle_uy())
            uz = np.concatenate(container_wrapper.get_particle_uz())
            w = np.concatenate(container_wrapper.get_particle_weight())
        except ValueError:
            return 0.0, 0.0

        my_E_perp = 0.5 * self.M * np.sum(w * (ux**2 + uy**2))
        E_perp = comm.allreduce(my_E_perp, op=mpi.SUM)

        my_E_par = 0.5 * self.M * np.sum(w * uz**2)
        E_par = comm.allreduce(my_E_par, op=mpi.SUM)

        return E_par, E_perp

    def _record_average_fields(self):
        """A custom reduced diagnostic to store the average E&M fields in a
        similar format as the reduced diagnostic so that the same analysis
        script can be used regardless of the simulation dimension.
        """
        step = simulation.extension.warpx.getistep(lev=0) - 1

        if step % self.diag_steps != 0:
            return

        By_warpx = fields.ByWrapper()[...]

        if libwarpx.amr.ParallelDescriptor.MyProc() != 0:
            return

        t = step * self.dt
        z_vals = np.linspace(0, self.Lz, self.Nz, endpoint=False)

        if self.dim == 2:
            By = np.mean(By_warpx[:-1], axis=0)
        else:
            By = np.mean(By_warpx[:-1], axis=(0, 1))

        with open(f"diags/{self.output_file_name}", 'a') as f:
            for ii in range(self.Nz):
                f.write(
                    f"{step:05d} {t:.10e} {z_vals[ii]:.10e} {By[ii]:+.10e}\n"
                )


##########################
# parse input parameters
##########################

parser = argparse.ArgumentParser()
parser.add_argument(
    '-t', '--test', help='toggle whether this script is run as a short CI test',
    action='store_true',
)
parser.add_argument(
    '-d', '--dim', help='Simulation dimension', required=False, type=int,
    default=1
)
parser.add_argument(
    '-r', '--resonant', help='Run the resonant case', required=False,
    action='store_true',
)
parser.add_argument(
    '-v', '--verbose', help='Verbose output', action='store_true',
)
args, left = parser.parse_known_args()
sys.argv = sys.argv[:1]+left

run = HybridPICBeamInstability(
    test=args.test, dim=args.dim, resonant=args.resonant, verbose=args.verbose
)
simulation.step()

For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.

Execute (resonant case):

python3 PICMI_inputs.py --dim {1/2/3} --resonant

Execute (non-resonant case):

python3 PICMI_inputs.py --dim {1/2/3}
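
For example, a 2D resonant run on four MPI ranks could be launched as follows (adjust the launcher and rank count to your system):

mpiexec -n 4 python3 PICMI_inputs.py --dim 2 --resonant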
Analyze

The following script reads the simulation output from the above example, performs Fourier transforms of the field data and outputs the figures shown below.

Script analysis.py
You can copy this file from Examples/Tests/ohm_solver_ion_beam_instability/analysis.py.
#!/usr/bin/env python3
#
# --- Analysis script for the hybrid-PIC example of ion beam R instability.

import dill
import h5py
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

from pywarpx import picmi

constants = picmi.constants

matplotlib.rcParams.update({'font.size': 20})

# load simulation parameters
with open('sim_parameters.dpkl', 'rb') as f:
    sim = dill.load(f)

if sim.resonant:
    resonant_str = 'resonant'
else:
    resonant_str = 'non resonant'

data = np.loadtxt("diags/field_data.txt", skiprows=1)
if sim.dim == 1:
    field_idx_dict = {'z': 4, 'By': 8}
else:
    field_idx_dict = {'z': 2, 'By': 3}

step = data[:,0]

num_steps = len(np.unique(step))

# get the spatial resolution
resolution = len(np.where(step == 0)[0]) - 1

# reshape to separate spatial and time coordinates
sim_data = data.reshape((num_steps, resolution+1, data.shape[1]))

z_grid = sim_data[1, :, field_idx_dict['z']]
idx = np.argsort(z_grid)[1:]
dz = np.mean(np.diff(z_grid[idx]))
dt = np.mean(np.diff(sim_data[:,0,1]))

data = np.zeros((num_steps, resolution))
for i in range(num_steps):
    data[i,:] = sim_data[i,idx,field_idx_dict['By']]

print(f"Data file contains {num_steps} time snapshots.")
print(f"Spatial resolution is {resolution}")

# Create the stack time plot
fig, ax1 = plt.subplots(1, 1, figsize=(10, 5))

max_val = np.max(np.abs(data[:,:]/sim.B0))

extent = [0, sim.Lz/sim.l_i, 0, num_steps*dt*sim.w_ci] # num_steps*dt/sim.t_ci]
im = ax1.imshow(
    data[:,:]/sim.B0, extent=extent, origin='lower',
    cmap='seismic', vmin=-max_val, vmax=max_val, aspect="equal",
)

# Colorbar
fig.subplots_adjust(right=0.825)
cbar_ax = fig.add_axes([0.85, 0.2, 0.03, 0.6])
fig.colorbar(im, cax=cbar_ax, orientation='vertical', label='$B_y/B_0$')

ax1.set_xlabel("$x/l_i$")
ax1.set_ylabel(r"$t \Omega_i$ (rad)")

ax1.set_title(f"Ion beam R instability - {resonant_str} case")
plt.savefig(f"diags/ion_beam_R_instability_{resonant_str}_eta_{sim.eta}_substeps_{sim.substeps}.png")
plt.close()

if sim.resonant:

    # Plot the 4th, 5th and 6th Fourier modes
    field_kt = np.fft.fft(data[:, :], axis=1)
    k = 2*np.pi * np.fft.fftfreq(resolution, dz) * sim.l_i

    t_grid = np.arange(num_steps)*dt*sim.w_ci
    plt.plot(t_grid, np.abs(field_kt[:, 4] / sim.B0), 'r', label=f'm = 4, $kl_i={k[4]:.2f}$')
    plt.plot(t_grid, np.abs(field_kt[:, 5] / sim.B0), 'b', label=f'm = 5, $kl_i={k[5]:.2f}$')
    plt.plot(t_grid, np.abs(field_kt[:, 6] / sim.B0), 'k', label=f'm = 6, $kl_i={k[6]:.2f}$')

    # The theoretical growth rates for the 4th, 5th and 6th Fourier modes of
    # the By-field were obtained from Fig. 12a of Munoz et al.
    # Note the rates here are gamma / w_ci
    gamma4 = 0.1915611861780133
    gamma5 = 0.20087036355662818
    gamma6 = 0.17123024228396777

    # Draw the line of best fit with the theoretical growth rate (slope) in the
    # window t*w_ci between 10 and 40
    idx = np.where((t_grid > 10) & (t_grid < 40))
    t_points = t_grid[idx]

    A4 = np.exp(np.mean(np.log(np.abs(field_kt[idx, 4] / sim.B0)) - t_points*gamma4))
    plt.plot(t_points, A4*np.exp(t_points*gamma4), 'r--', lw=3)
    A5 = np.exp(np.mean(np.log(np.abs(field_kt[idx, 5] / sim.B0)) - t_points*gamma5))
    plt.plot(t_points, A5*np.exp(t_points*gamma5), 'b--', lw=3)
    A6 = np.exp(np.mean(np.log(np.abs(field_kt[idx, 6] / sim.B0)) - t_points*gamma6))
    plt.plot(t_points, A6*np.exp(t_points*gamma6), 'k--', lw=3)

    plt.grid()
    plt.legend()
    plt.yscale('log')
    plt.ylabel('$|B_y/B_0|$')
    plt.xlabel(r'$t\Omega_i$ (rad)')
    plt.tight_layout()
    plt.savefig(f"diags/ion_beam_R_instability_{resonant_str}_eta_{sim.eta}_substeps_{sim.substeps}_low_modes.png")
    plt.close()

    # check if the growth rate matches expectation
    m4_rms_error = np.sqrt(np.mean(
        (np.abs(field_kt[idx, 4] / sim.B0) - A4*np.exp(t_points*gamma4))**2
    ))
    m5_rms_error = np.sqrt(np.mean(
        (np.abs(field_kt[idx, 5] / sim.B0) - A5*np.exp(t_points*gamma5))**2
    ))
    m6_rms_error = np.sqrt(np.mean(
        (np.abs(field_kt[idx, 6] / sim.B0) - A6*np.exp(t_points*gamma6))**2
    ))
    print("Growth rate RMS errors:")
    print(f"    m = 4: {m4_rms_error:.3e}")
    print(f"    m = 5: {m5_rms_error:.3e}")
    print(f"    m = 6: {m6_rms_error:.3e}")

if not sim.test:
    with h5py.File('diags/Python_hybrid_PIC_plt/openpmd_004000.h5', 'r') as data:

        timestep = str(np.squeeze([key for key in data['data'].keys()]))

        z = np.array(data['data'][timestep]['particles']['ions']['position']['z'])
        vy = np.array(data['data'][timestep]['particles']['ions']['momentum']['y'])
        w = np.array(data['data'][timestep]['particles']['ions']['weighting'])

    fig, ax1 = plt.subplots(1, 1, figsize=(10, 5))

    im = ax1.hist2d(
        z/sim.l_i, vy/sim.M/sim.vA, weights=w, density=True,
        range=[[0, 250], [-10, 10]], bins=250, cmin=1e-5
    )

    # Colorbar
    fig.subplots_adjust(bottom=0.15, right=0.815)
    cbar_ax = fig.add_axes([0.83, 0.2, 0.03, 0.6])
    fig.colorbar(im[3], cax=cbar_ax, orientation='vertical', format='%.0e', label='$f(z, v_y)$')

    ax1.set_xlabel("$x/l_i$")
    ax1.set_ylabel("$v_{y}/v_A$")

    ax1.set_title(f"Ion beam R instability - {resonant_str} case")
    plt.savefig(f"diags/ion_beam_R_instability_{resonant_str}_eta_{sim.eta}_substeps_{sim.substeps}_core_phase_space.png")
    plt.close()

    with h5py.File('diags/Python_hybrid_PIC_plt/openpmd_004000.h5', 'r') as data:

        timestep = str(np.squeeze([key for key in data['data'].keys()]))

        z = np.array(data['data'][timestep]['particles']['beam_ions']['position']['z'])
        vy = np.array(data['data'][timestep]['particles']['beam_ions']['momentum']['y'])
        w = np.array(data['data'][timestep]['particles']['beam_ions']['weighting'])

    fig, ax1 = plt.subplots(1, 1, figsize=(10, 5))

    im = ax1.hist2d(
        z/sim.l_i, vy/sim.M/sim.vA, weights=w, density=True,
        range=[[0, 250], [-10, 10]], bins=250, cmin=1e-5
    )

    # Colorbar
    fig.subplots_adjust(bottom=0.15, right=0.815)
    cbar_ax = fig.add_axes([0.83, 0.2, 0.03, 0.6])
    fig.colorbar(im[3], cax=cbar_ax, orientation='vertical', format='%.0e', label='$f(z, v_y)$')

    ax1.set_xlabel("$x/l_i$")
    ax1.set_ylabel("$v_{y}/v_A$")

    ax1.set_title(f"Ion beam R instability - {resonant_str} case")
    plt.savefig(f"diags/ion_beam_R_instability_{resonant_str}_eta_{sim.eta}_substeps_{sim.substeps}_beam_phase_space.png")
    plt.show()

if sim.test:

    # physics based check - these error tolerances are not set from theory
    # but from the errors that were present when the test was created. If these
    # asserts fail, the full benchmark should be rerun (same as the test but
    # without the `--test` argument) and the growth rates (up to saturation)
    # compared to the theoretical ones to determine if the physics test passes.
    # At creation, the full test (3d) had the following errors (ran on 1 V100):
    # m4_rms_error = 3.329; m5_rms_error = 1.052; m6_rms_error = 2.583
    assert np.isclose(m4_rms_error, 1.515, atol=0.01)
    assert np.isclose(m5_rms_error, 0.718, atol=0.01)
    assert np.isclose(m6_rms_error, 0.357, atol=0.01)

    # checksum check
    import os
    import sys
    sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
    import checksumAPI

    # this will be the name of the plot file
    fn = sys.argv[1]
    test_name = os.path.split(os.getcwd())[1]
    checksumAPI.evaluate_checksum(test_name, fn)

The figures below show the evolution of the y-component of the magnetic field as the beam and core plasma interact.

Resonant ion beam R instability
Non-resonant ion beam R instability

Evolution of \(B_y\) for resonant (top) and non-resonant (bottom) conditions.

The growth rates of the strongest growing modes for the resonant case are compared to theory (dashed lines) in the figure below.

Resonant ion beam R instability growth rates

Time series of the mode amplitudes for m = 4, 5, 6 from the simulation. The theoretical growth rates for these modes are also shown as dashed lines.

Ohm solver: Ion Landau Damping

Landau damping is a well-known process in which electrostatic (acoustic) waves are damped by transferring energy to particles satisfying a resonance condition. The process can be simulated by seeding a plasma with a specific acoustic mode (density perturbation) and tracking the strength of the mode as a function of time.
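
In the input script below, the seeded perturbation takes the form \(n(z) = n_0\left[1 + \epsilon\cos(k_m z)\right]\) with \(k_m = 2\pi m/L_z\), and the analysis tracks the Fourier amplitude of that mode, which is expected to decay as \(|E_z(k_m, t)| \propto e^{-\gamma t}\), where \(\gamma\) is the Landau damping rate.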

Run

The same input script can be used for 1d, 2d or 3d simulations and to sweep different temperature ratios.

Script PICMI_inputs.py
You can copy this file from Examples/Tests/ohm_solver_ion_Landau_damping/PICMI_inputs.py.
#!/usr/bin/env python3
#
# --- Test script for the kinetic-fluid hybrid model in WarpX wherein ions are
# --- treated as kinetic particles and electrons as an isothermal, inertialess
# --- background fluid. The script simulates ion Landau damping as described
# --- in section 4.5 of Munoz et al. (2018).

import argparse
import os
import sys
import time

import dill
import numpy as np
from mpi4py import MPI as mpi

from pywarpx import callbacks, fields, libwarpx, particle_containers, picmi

constants = picmi.constants

comm = mpi.COMM_WORLD

simulation = picmi.Simulation(
    warpx_serialize_initial_conditions=True,
    verbose=0
)


class IonLandauDamping(object):
    '''This input is based on the ion Landau damping test as described by
    Munoz et al. (2018).
    '''
    # Applied field parameters
    B0          = 0.1 # Initial magnetic field strength (T)
    beta        = 2.0 # Plasma beta, used to calculate temperature

    # Plasma species parameters
    m_ion      = 100.0 # Ion mass (electron masses)
    vA_over_c  = 1e-3 # ratio of Alfven speed and the speed of light

    # Spatial domain
    Nz          = 256 # number of cells in z direction
    Nx          = 4 # number of cells in x (and y) direction for >1 dimensions

    # Temporal domain (if not run as a CI test)
    LT          = 40.0 # Simulation temporal length (ion cyclotron periods)

    # Numerical parameters
    NPPC        = [8192, 4096, 1024] # Seed number of particles per cell
    DZ          = 1.0 / 6.0 # Cell size (ion skin depths)
    DT          = 1e-3 # Time step (ion cyclotron periods)

    # density perturbation strength
    epsilon = 0.03

    # Plasma resistivity - used to dampen the mode excitation
    eta = 1e-7
    # Number of substeps used to update B
    substeps = 10


    def __init__(self, test, dim, m, T_ratio, verbose):
        """Get input parameters for the specific case desired."""
        self.test = test
        self.dim = int(dim)
        self.m = m
        self.T_ratio = T_ratio
        self.verbose = verbose or self.test

        # sanity check
        assert (dim > 0 and dim < 4), f"{dim}-dimensions not a valid input"

        # calculate various plasma parameters based on the simulation input
        self.get_plasma_quantities()

        self.dz = self.DZ * self.l_i
        self.Lz = self.Nz * self.dz
        self.Lx = self.Nx * self.dz

        diag_period = 1 / 16.0 # Output interval (ion cyclotron periods)
        self.diag_steps = int(diag_period / self.DT)

        self.total_steps = int(np.ceil(self.LT / self.DT))
        # if this is a test case run for only 100 steps
        if self.test:
            self.total_steps = 100

        self.dt = self.DT / self.w_ci # self.DT * self.t_ci

        # dump all the current attributes to a dill pickle file
        if comm.rank == 0:
            with open('sim_parameters.dpkl', 'wb') as f:
                dill.dump(self, f)

        # print out plasma parameters
        if comm.rank == 0:
            print(
                f"Initializing simulation with input parameters:\n"
                f"\tT = {self.T_plasma*1e-3:.1f} keV\n"
                f"\tn = {self.n_plasma:.1e} m^-3\n"
                f"\tB0 = {self.B0:.2f} T\n"
                f"\tM/m = {self.m_ion:.0f}\n"
            )
            print(
                f"Plasma parameters:\n"
                f"\tl_i = {self.l_i:.1e} m\n"
                f"\tt_ci = {self.t_ci:.1e} s\n"
                f"\tv_ti = {self.v_ti:.1e} m/s\n"
                f"\tvA = {self.vA:.1e} m/s\n"
            )
            print(
                f"Numerical parameters:\n"
                f"\tdz = {self.dz:.1e} m\n"
                f"\tdt = {self.dt:.1e} s\n"
                f"\tdiag steps = {self.diag_steps:d}\n"
                f"\ttotal steps = {self.total_steps:d}\n"
            )

        self.setup_run()

    def get_plasma_quantities(self):
        """Calculate various plasma parameters based on the simulation input."""
        # Ion mass (kg)
        self.M = self.m_ion * constants.m_e

        # Cyclotron angular frequency (rad/s) and period (s)
        self.w_ci = constants.q_e * abs(self.B0) / self.M
        self.t_ci = 2.0 * np.pi / self.w_ci

        # Alfven speed (m/s): vA = B / sqrt(mu0 * n * (M + m)) = c * omega_ci / w_pi
        self.vA = self.vA_over_c * constants.c
        self.n_plasma = (
            (self.B0 / self.vA)**2 / (constants.mu0 * (self.M + constants.m_e))
        )

        # Ion plasma frequency (Hz)
        self.w_pi = np.sqrt(
            constants.q_e**2 * self.n_plasma / (self.M * constants.ep0)
        )

        # Skin depth (m)
        self.l_i = constants.c / self.w_pi

        # Ion thermal velocity (m/s) from beta = 2 * (v_ti / vA)**2
        self.v_ti = np.sqrt(self.beta / 2.0) * self.vA

        # Temperature (eV) from thermal speed: v_ti = sqrt(kT / M)
        self.T_plasma = self.v_ti**2 * self.M / constants.q_e # eV

        # Larmor radius (m)
        self.rho_i = self.v_ti / self.w_ci

    def setup_run(self):
        """Setup simulation components."""

        #######################################################################
        # Set geometry and boundary conditions                                #
        #######################################################################

        if self.dim == 1:
            grid_object = picmi.Cartesian1DGrid
        elif self.dim == 2:
            grid_object = picmi.Cartesian2DGrid
        else:
            grid_object = picmi.Cartesian3DGrid

        self.grid = grid_object(
            number_of_cells=[self.Nx, self.Nx, self.Nz][-self.dim:],
            warpx_max_grid_size=self.Nz,
            lower_bound=[-self.Lx/2.0, -self.Lx/2.0, 0][-self.dim:],
            upper_bound=[self.Lx/2.0, self.Lx/2.0, self.Lz][-self.dim:],
            lower_boundary_conditions=['periodic']*self.dim,
            upper_boundary_conditions=['periodic']*self.dim,
            warpx_blocking_factor=4
        )
        simulation.time_step_size = self.dt
        simulation.max_steps = self.total_steps
        simulation.current_deposition_algo = 'direct'
        simulation.particle_shape = 1
        simulation.verbose = self.verbose

        #######################################################################
        # Field solver and external field                                     #
        #######################################################################

        self.solver = picmi.HybridPICSolver(
            grid=self.grid, gamma=1.0,
            Te=self.T_plasma/self.T_ratio,
            n0=self.n_plasma,
            plasma_resistivity=self.eta, substeps=self.substeps
        )
        simulation.solver = self.solver

        #######################################################################
        # Particle types setup                                                #
        #######################################################################

        k_m = 2.0*np.pi*self.m / self.Lz
        self.ions = picmi.Species(
            name='ions', charge='q_e', mass=self.M,
            initial_distribution=picmi.AnalyticDistribution(
                density_expression=f"{self.n_plasma}*(1+{self.epsilon}*cos({k_m}*z))",
                rms_velocity=[self.v_ti]*3
            )
        )
        simulation.add_species(
            self.ions,
            layout=picmi.PseudoRandomLayout(
                grid=self.grid, n_macroparticles_per_cell=self.NPPC[self.dim-1]
            )
        )

        #######################################################################
        # Add diagnostics                                                     #
        #######################################################################

        callbacks.installafterstep(self.text_diag)

        if self.test:
            particle_diag = picmi.ParticleDiagnostic(
                name='diag1',
                period=100,
                write_dir='.',
                species=[self.ions],
                data_list = ['ux', 'uy', 'uz', 'x', 'z', 'weighting'],
                warpx_file_prefix=f'Python_ohms_law_solver_landau_damping_{self.dim}d_plt',
            )
            simulation.add_diagnostic(particle_diag)
            field_diag = picmi.FieldDiagnostic(
                name='diag1',
                grid=self.grid,
                period=100,
                write_dir='.',
                data_list = ['Bx', 'By', 'Bz', 'Ex', 'Ey', 'Ez', 'Jx', 'Jy', 'Jz'],
                warpx_file_prefix=f'Python_ohms_law_solver_landau_damping_{self.dim}d_plt',
            )
            simulation.add_diagnostic(field_diag)

        self.output_file_name = 'field_data.txt'
        # install a custom "reduced diagnostic" to save the average field
        callbacks.installafterEsolve(self._record_average_fields)
        try:
            os.mkdir("diags")
        except OSError:
            # diags directory already exists
            pass
        with open(f"diags/{self.output_file_name}", 'w') as f:
            f.write("[0]step() [1]time(s) [2]z_coord(m) [3]Ez_lev0-(V/m)\n")

        self.prev_time = time.time()
        self.start_time = self.prev_time
        self.prev_step = 0

        #######################################################################
        # Initialize simulation                                               #
        #######################################################################

        simulation.initialize_inputs()
        simulation.initialize_warpx()

        # get ion particle container wrapper
        self.ion_part_container = particle_containers.ParticleContainerWrapper(
            'ions'
        )

    def text_diag(self):
        """Diagnostic function to print out timing data and particle numbers."""
        step = simulation.extension.warpx.getistep(lev=0) - 1

        if step % (self.total_steps // 10) != 0:
            return

        wall_time = time.time() - self.prev_time
        steps = step - self.prev_step
        step_rate = steps / wall_time

        status_dict = {
            'step': step,
            'nplive ions': self.ion_part_container.nps,
            'wall_time': wall_time,
            'step_rate': step_rate,
            "diag_steps": self.diag_steps,
            'iproc': None
        }

        diag_string = (
            "Step #{step:6d}; "
            "{nplive ions} core ions; "
            "{wall_time:6.1f} s wall time; "
            "{step_rate:4.2f} steps/s"
        )

        if libwarpx.amr.ParallelDescriptor.MyProc() == 0:
            print(diag_string.format(**status_dict))

        self.prev_time = time.time()
        self.prev_step = step

    def _record_average_fields(self):
        """A custom reduced diagnostic to store the average E&M fields in a
        similar format as the reduced diagnostic so that the same analysis
        script can be used regardless of the simulation dimension.
        """
        step = simulation.extension.warpx.getistep(lev=0) - 1

        if step % self.diag_steps != 0:
            return

        Ez_warpx = fields.EzWrapper()[...]

        if libwarpx.amr.ParallelDescriptor.MyProc() != 0:
            return

        t = step * self.dt
        z_vals = np.linspace(0, self.Lz, self.Nz, endpoint=False)

        if self.dim == 1:
            Ez = Ez_warpx
        elif self.dim == 2:
            Ez = np.mean(Ez_warpx, axis=0)
        else:
            Ez = np.mean(Ez_warpx, axis=(0, 1))

        with open(f"diags/{self.output_file_name}", 'a') as f:
            for ii in range(self.Nz):
                f.write(
                    f"{step:05d} {t:.10e} {z_vals[ii]:.10e} {Ez[ii]:+.10e}\n"
                )


##########################
# parse input parameters
##########################

parser = argparse.ArgumentParser()
parser.add_argument(
    '-t', '--test', help='toggle whether this script is run as a short CI test',
    action='store_true',
)
parser.add_argument(
    '-d', '--dim', help='Simulation dimension', required=False, type=int,
    default=1
)
parser.add_argument(
    '-m', help='Mode number to excite', required=False, type=int,
    default=4
)
parser.add_argument(
    '--temp_ratio', help='Ratio of ion to electron temperature', required=False,
    type=float, default=1.0/3
)
parser.add_argument(
    '-v', '--verbose', help='Verbose output', action='store_true',
)
args, left = parser.parse_known_args()
sys.argv = sys.argv[:1]+left

run = IonLandauDamping(
    test=args.test, dim=args.dim, m=args.m, T_ratio=args.temp_ratio,
    verbose=args.verbose
)
simulation.step()

For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.

Execute:

python3 PICMI_inputs.py --dim {1/2/3} --temp_ratio {value}
Analyze

The following script extracts the amplitude of the seeded mode as a function of time and compares it to the theoretical damping rate.

Script analysis.py
You can copy this file from Examples/Tests/ohm_solver_ion_Landau_damping/analysis.py.
#!/usr/bin/env python3
#
# --- Analysis script for the hybrid-PIC example of ion Landau damping.

import dill
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

from pywarpx import picmi

constants = picmi.constants

matplotlib.rcParams.update({'font.size': 20})

# load simulation parameters
with open('sim_parameters.dpkl', 'rb') as f:
    sim = dill.load(f)

# theoretical damping rates were taken from Fig. 14b of Munoz et al.
theoretical_damping_rate = np.array([
    [0.09456706, 0.05113443], [0.09864177, 0.05847507],
    [0.10339559, 0.0659153 ], [0.10747029, 0.07359366],
    [0.11290323, 0.08256106], [0.11833616, 0.09262114],
    [0.12580645, 0.10541121], [0.13327674, 0.11825558],
    [0.14006791, 0.13203098], [0.14889643, 0.14600538],
    [0.15772496, 0.16379615], [0.16791171, 0.18026693],
    [0.17606112, 0.19650209], [0.18828523, 0.21522808],
    [0.19983022, 0.23349062], [0.21273345, 0.25209216],
    [0.22835314, 0.27877403], [0.24465195, 0.30098317],
    [0.25959253, 0.32186286], [0.27657046, 0.34254601],
    [0.29626486, 0.36983567], [0.3139219 , 0.38984826],
    [0.33157895, 0.40897973], [0.35195246, 0.43526107],
    [0.37368421, 0.45662113], [0.39745331, 0.47902942],
    [0.44974533, 0.52973074], [0.50747029, 0.57743925],
    [0.57334465, 0.63246726], [0.64193548, 0.67634255]
])

expected_gamma = np.interp(
    sim.T_ratio, theoretical_damping_rate[:, 0], theoretical_damping_rate[:, 1]
)

data = np.loadtxt("diags/field_data.txt", skiprows=1)
field_idx_dict = {'z': 2, 'Ez': 3}

step = data[:,0]

num_steps = len(np.unique(step))

# get the spatial resolution
resolution = len(np.where(step == 0)[0]) - 1

# reshape to separate spatial and time coordinates
sim_data = data.reshape((num_steps, resolution+1, data.shape[1]))

z_grid = sim_data[1, :, field_idx_dict['z']]
idx = np.argsort(z_grid)[1:]
dz = np.mean(np.diff(z_grid[idx]))
dt = np.mean(np.diff(sim_data[:,0,1]))

data = np.zeros((num_steps, resolution))
for i in range(num_steps):
    data[i,:] = sim_data[i,idx,field_idx_dict['Ez']]

print(f"Data file contains {num_steps} time snapshots.")
print(f"Spatial resolution is {resolution}")

field_kt = np.fft.fft(data[:, :], axis=1)

t_norm = 2.0 * np.pi * sim.m / sim.Lz * sim.v_ti

# Plot the 4th Fourier mode
fig, ax1 = plt.subplots(1, 1, figsize=(10, 5))

t_points = np.arange(num_steps)*dt*t_norm
ax1.plot(
    t_points, np.abs(field_kt[:, sim.m] / field_kt[0, sim.m]), 'r',
    label=f'$T_i/T_e$ = {sim.T_ratio:.2f}'
)

# Plot a line showing the expected damping rate
t_points = t_points[np.where(t_points < 8)]
ax1.plot(
    t_points, np.exp(-t_points*expected_gamma), 'k--', lw=2
)

ax1.grid()
ax1.legend()
ax1.set_yscale('log')
ax1.set_ylabel('$|E_z|/E_0$')
ax1.set_xlabel('t $(k_mv_{th,i})$')
ax1.set_xlim(0, 18)

ax1.set_title(f"Ion Landau damping - {sim.dim}d")
plt.tight_layout()
plt.savefig(f"diags/ion_Landau_damping_T_ratio_{sim.T_ratio}.png")

if sim.test:
    import os
    import sys
    sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
    import checksumAPI

    # this will be the name of the plot file
    fn = sys.argv[1]
    test_name = os.path.split(os.getcwd())[1]
    checksumAPI.evaluate_checksum(test_name, fn)

The figure below shows a set of such simulations with parameters matching those described in section 4.5 of Muñoz et al. [1]. The straight lines show the theoretical damping rate for the given temperature ratios.

Ion Landau damping

Decay of the seeded modes as a function of time for different ion-to-electron temperature ratios. The theoretical damping rates of the given modes are shown as dashed lines.

High-Performance Computing and Numerics

The following examples are commonly used to study the performance of WarpX, e.g., its computational efficiency, scalability, and I/O patterns. While all prior examples are used for such studies as well, the examples here need less explanation of the physics, less detailed tuning of load balancing, and often simply scale (weakly or strongly) by changing the number of cells, the AMReX block size, and the number of compute units.
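
As a hedged illustration based on the 3D input file below (baseline amr.n_cell = 64 32 32), one weak-scaling step could double the number of cells along x while also doubling the number of MPI ranks, keeping amr.max_grid_size and thus the work per rank roughly fixed:

# weak-scaling step: 2x cells along x, intended to be run on 2x the MPI ranks
amr.n_cell = 128 32 32
amr.max_grid_size = 32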

Uniform Plasma

This example evolves a uniformly distributed, hot plasma over time.

Run

For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.

Note

TODO: This input file should be created following the inputs_3d file.

This example can be run as a WarpX executable using an input file: warpx.3d inputs_3d

You can copy this file from usage/examples/lwfa/inputs_3d.
#################################
####### GENERAL PARAMETERS ######
#################################
max_step = 10
amr.n_cell =  64 32 32
amr.max_grid_size = 32
amr.blocking_factor = 16
amr.max_level = 0
geometry.dims = 3
geometry.prob_lo = -20.e-6   -20.e-6   -20.e-6    # physical domain
geometry.prob_hi =  20.e-6    20.e-6    20.e-6

#################################
####### Boundary condition ######
#################################
boundary.field_lo = periodic periodic periodic
boundary.field_hi = periodic periodic periodic

#################################
############ NUMERICS ###########
#################################
warpx.serialize_initial_conditions = 1
warpx.verbose = 1
warpx.cfl = 1.0

# Order of particle shape factors
algo.particle_shape = 1

#################################
############ PLASMA #############
#################################
particles.species_names = electrons

electrons.species_type = electron
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 1 1 2
electrons.profile = constant
electrons.density = 1.e25  # number of electrons per m^3
electrons.momentum_distribution_type = "gaussian"
electrons.ux_th  = 0.01 # uth the std of the (unitless) momentum
electrons.uy_th  = 0.01 # uth the std of the (unitless) momentum
electrons.uz_th  = 0.01 # uth the std of the (unitless) momentum

# Diagnostics
diagnostics.diags_names = diag1 chk
diag1.intervals = 4
diag1.diag_type = Full
diag1.electrons.variables = ux uy uz w
diag1.fields_to_plot = Bx By Bz Ex Ey Ez jx jy jz rho

chk.intervals = 6
chk.diag_type = Full
chk.format = checkpoint

Note

TODO: This input file should be created following the inputs_2d file.

This example can be run as a WarpX executable using an input file: warpx.2d inputs_2d

You can copy this file from usage/examples/lwfa/inputs_2d.
#################################
####### GENERAL PARAMETERS ######
#################################
max_step = 10
amr.n_cell =  128 128
amr.max_grid_size = 64
amr.blocking_factor = 32
amr.max_level = 0
geometry.dims = 2
geometry.prob_lo = -20.e-6   -20.e-6    # physical domain
geometry.prob_hi =  20.e-6    20.e-6

#################################
####### Boundary condition ######
#################################
boundary.field_lo = periodic periodic
boundary.field_hi = periodic periodic

#################################
############ NUMERICS ###########
#################################
warpx.serialize_initial_conditions = 1
warpx.verbose = 1
warpx.cfl = 1.0
warpx.use_filter = 0

# Order of particle shape factors
algo.particle_shape = 1

#################################
############ PLASMA #############
#################################
particles.species_names = electrons

electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 2 2
electrons.profile = constant
electrons.density = 1.e25  # number of electrons per m^3
electrons.momentum_distribution_type = "gaussian"
electrons.ux_th  = 0.01 # uth the std of the (unitless) momentum
electrons.uy_th  = 0.01 # uth the std of the (unitless) momentum
electrons.uz_th  = 0.01 # uth the std of the (unitless) momentum

# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = 10
diag1.diag_type = Full
Analyze

Note

This section is TODO.

Visualize

Note

This section is TODO.

Manipulating fields via Python

Note

TODO: The section needs to be sorted into either science cases (above) or later sections (workflows and Python API details).

An example of using Python to access the simulation charge density, solve the Poisson equation (using superLU) and write the resulting electrostatic potential back to the simulation is given in the input file below. This example uses the fields.py module included in the pywarpx library.
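
The snippet below is only a minimal sketch of that field-access pattern, not the full example file. It assumes an already-initialized simulation and the RhoFPWrapper / PhiFPWrapper accessors from pywarpx.fields (which follow the same numpy-style indexing as the BxWrapper/EzWrapper objects used in the scripts above); the Poisson solve itself is left as a placeholder.

from pywarpx import fields
import numpy as np

# assumed accessor names: wrappers around the charge density and potential data
rho_wrap = fields.RhoFPWrapper()
phi_wrap = fields.PhiFPWrapper()

# gather the charge density as a numpy array
rho = rho_wrap[...]

# ... solve the Poisson equation for phi from rho here (e.g., with superLU) ...
phi = np.zeros_like(rho)  # placeholder result

# write the electrostatic potential back into the simulation
phi_wrap[...] = phi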

A second example initializes the fields by accessing their data through Python, advances the simulation for a chosen number of time steps, and then plots the fields again through Python. The simulation runs with 128 regular cells, 8 guard cells, and 10 PML cells in each direction. Moreover, it uses div(E) and div(B) cleaning both on the regular grid and in the PML, and initializes all available electromagnetic fields (E, B, F, G) identically.

Many Further Examples, Demos and Tests

WarpX runs over 200 integration tests on a variety of modeling cases, which validate and demonstrate its functionality. Please see the Examples/Tests/ directory for many more examples.

Example References

[1]

P. A. Muñoz, N. Jain, P. Kilian, and J. Büchner. A new hybrid code (CHIEF) implementing the inertial electron fluid equation without approximation. Computer Physics Communications, 224:245–264, 2018. URL: https://www.sciencedirect.com/science/article/pii/S0010465517303521, doi:10.1016/j.cpc.2017.10.012.

[2]

T. Tajima and J. M. Dawson. Laser accelerator by plasma waves. AIP Conference Proceedings, 91(1):69–93, Sep 1982. URL: https://doi.org/10.1063/1.33805, doi:10.1063/1.33805.

[3]

E. Esarey, P. Sprangle, J. Krall, and A. Ting. Overview of plasma-based accelerator concepts. IEEE Transactions on Plasma Science, 24(2):252–288, 1996. doi:10.1109/27.509991.

[4]

S. C. Wilks, A. B. Langdon, T. E. Cowan, M. Roth, M. Singh, S. Hatchett, M. H. Key, D. Pennington, A. MacKinnon, and R. A. Snavely. Energetic proton generation in ultra-intense laser–solid interactions. Physics of Plasmas, 8(2):542–549, Feb 2001. URL: https://doi.org/10.1063/1.1333697, arXiv:https://pubs.aip.org/aip/pop/article-pdf/8/2/542/12669088/542_1_online.pdf, doi:10.1063/1.1333697.

[5]

S. S. Bulanov, A. Brantov, V. Yu. Bychenkov, V. Chvykov, G. Kalinchenko, T. Matsuoka, P. Rousseau, S. Reed, V. Yanovsky, D. W. Litzenberg, K. Krushelnick, and A. Maksimchuk. Accelerating monoenergetic protons from ultrathin foils by flat-top laser pulses in the directed-Coulomb-explosion regime. Phys. Rev. E, 78:026412, Aug 2008. URL: https://link.aps.org/doi/10.1103/PhysRevE.78.026412, doi:10.1103/PhysRevE.78.026412.

[6]

A. Macchi, M. Borghesi, and M. Passoni. Ion acceleration by superintense laser-plasma interaction. Rev. Mod. Phys., 85:751–793, May 2013. URL: https://link.aps.org/doi/10.1103/RevModPhys.85.751, doi:10.1103/RevModPhys.85.751.

[7]

B. Dromey, S. Kar, M. Zepf, and P. Foster. The plasma mirror—A subpicosecond optical switch for ultrahigh power lasers. Review of Scientific Instruments, 75(3):645–649, Feb 2004. URL: https://doi.org/10.1063/1.1646737, arXiv:https://pubs.aip.org/aip/rsi/article-pdf/75/3/645/8814694/645_1_online.pdf, doi:10.1063/1.1646737.

[8]

C. Rödel, M. Heyer, M. Behmke, M. Kübel, O. Jäckel, W. Ziegler, D. Ehrt, M. C. Kaluza, and G. G. Paulus. High repetition rate plasma mirror for temporal contrast enhancement of terawatt femtosecond laser pulses by three orders of magnitude. Applied Physics B, 103(2):295–302, Nov 2010. URL: http://dx.doi.org/10.1007/s00340-010-4329-7, doi:10.1007/s00340-010-4329-7.

[9]

V. Yakimenko, S. Meuren, F. Del Gaudio, C. Baumann, A. Fedotov, F. Fiuza, T. Grismayer, M. J. Hogan, A. Pukhov, L. O. Silva, and G. White. Prospect of studying nonperturbative qed with beam-beam collisions. Phys. Rev. Lett., 122:190404, May 2019. doi:10.1103/PhysRevLett.122.190404.

[10]

A. Le, W. Daughton, H. Karimabadi, and J. Egedal. Hybrid simulations of magnetic reconnection with kinetic ions and fluid electron pressure anisotropy. Physics of Plasmas, Mar 2016. 032114. URL: https://doi.org/10.1063/1.4943893, doi:10.1063/1.4943893.

[11]

M. M. Turner, A. Derzsi, Z. Donkó, D. Eremin, S. J. Kelly, T. Lafleur, and T. Mussenbrock. Simulation benchmarks for low-pressure plasmas: Capacitive discharges. Physics of Plasmas, Jan 2013. 013507. URL: https://doi.org/10.1063/1.4775084, doi:10.1063/1.4775084.

[12]

T. H. Stix. Waves in Plasmas. American Inst. of Physics, 1992. ISBN 978-0-88318-859-0. URL: https://books.google.com/books?id=OsOWJ8iHpmMC.

Parameters: Python (PICMI)

This section documents how to use WarpX as a Python script (e.g., python3 PICMI_script.py).

WarpX uses the PICMI standard for its Python input files. Complete example input files can be found in the examples section.

In the input file, instances of classes are created defining the various aspects of the simulation. A variable of type pywarpx.picmi.Simulation is the central object to which all other options are passed, defining the simulation time, field solver, registered species, etc.

Once the simulation is fully configured, it can be used in one of two modes. Interactive use is the most common and can be extended with custom runtime functionality:

step(): run directly from Python

write_input_file(): generate a text input file for the WarpX executable (documented below)

When run directly from Python, one can also extend WarpX with further custom user logic. See the detailed workflow page on how to extend WarpX from Python.
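
As a minimal sketch (the grid size, plasma density, and thermal spread below are placeholder values, not a recommended setup), a PICMI script typically builds a grid and field solver, registers species with a particle layout, and then either steps the simulation or writes an input file:

from pywarpx import picmi

constants = picmi.constants

# placeholder 2D grid with periodic boundaries
grid = picmi.Cartesian2DGrid(
    number_of_cells=[64, 64],
    lower_bound=[-20e-6, -20e-6], upper_bound=[20e-6, 20e-6],
    lower_boundary_conditions=['periodic', 'periodic'],
    upper_boundary_conditions=['periodic', 'periodic'],
)
solver = picmi.ElectromagneticSolver(grid=grid, cfl=1.0)

# a uniform electron population with a small thermal spread
electrons = picmi.Species(
    particle_type='electron', name='electrons',
    initial_distribution=picmi.UniformDistribution(
        density=1e25, rms_velocity=[0.01*constants.c]*3,
    ),
)

sim = picmi.Simulation(solver=solver, max_steps=10, verbose=1)
sim.add_species(
    electrons,
    layout=picmi.PseudoRandomLayout(grid=grid, n_macroparticles_per_cell=2),
)

sim.step()                                    # run directly from Python, or ...
# sim.write_input_file(file_name='inputs_2d')  # ... generate an input file instead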

Simulation and Grid Setup

class pywarpx.picmi.Simulation(solver=None, time_step_size=None, max_steps=None, max_time=None, verbose=None, particle_shape='linear', gamma_boost=None, cpu_split=None, load_balancing=None, **kw)[source]

Creates a Simulation object

Parameters:
  • solver (field solver instance) – This is the field solver to be used in the simulation. It should be an instance of field solver classes.

  • time_step_size (float) – Absolute time step size of the simulation [s]. Needed if the CFL is not specified elsewhere.

  • max_steps (integer) – Maximum number of time steps. Specify either this, or max_time, or use the step function directly.

  • max_time (float) – Maximum physical time to run the simulation [s]. Specify either this, or max_steps, or use the step function directly.

  • verbose (integer, optional) – Verbosity flag. A larger integer results in more verbose output

  • particle_shape ({'NGP', 'linear', 'quadratic', 'cubic'}) – Default particle shape for species added to this simulation

  • gamma_boost (float, optional) – Lorentz factor of the boosted simulation frame. Note that all input values should be in the lab frame.

Implementation specific documentation

See Input Parameters for more information.

Parameters:
  • warpx_current_deposition_algo ({'direct', 'esirkepov', and 'vay'}, optional) – Current deposition algorithm. The default depends on conditions.

  • warpx_charge_deposition_algo ({'standard'}, optional) – Charge deposition algorithm.

  • warpx_field_gathering_algo ({'energy-conserving', 'momentum-conserving'}, optional) – Field gathering algorithm. The default depends on conditions.

  • warpx_particle_pusher_algo ({'boris', 'vay', 'higuera'}, default='boris') – Particle pushing algorithm.

  • warpx_use_filter (bool, optional) – Whether to use filtering. The default depends on the conditions.

  • warpx_do_multi_J (bool, default=0) – Whether to use the multi-J algorithm, where current deposition and field update are performed multiple times within each time step.

  • warpx_do_multi_J_n_depositions (integer) – Number of sub-steps to use with the multi-J algorithm, when warpx_do_multi_J=1. Note that this input parameter is not optional and must always be set in all input files where warpx.do_multi_J=1. No default value is provided automatically.

  • warpx_grid_type ({'collocated', 'staggered', 'hybrid'}, default='staggered') – Whether to use a collocated grid (all fields defined at the cell nodes), a staggered grid (fields defined on a Yee grid), or a hybrid grid (fields and currents are interpolated back and forth between a staggered grid and a collocated grid, must be used with momentum-conserving field gathering algorithm).

  • warpx_do_current_centering (bool, optional) – If true, the current is deposited on a nodal grid and then centered to a staggered grid (Yee grid), using finite-order interpolation. Default: warpx.do_current_centering=0 with collocated or staggered grids, warpx.do_current_centering=1 with hybrid grids.

  • warpx_field_centering_nox/noy/noz (integer, optional) – The order of interpolation used with staggered or hybrid grids (warpx_grid_type=staggered or warpx_grid_type=hybrid) and momentum-conserving field gathering (warpx_field_gathering_algo=momentum-conserving) to interpolate the electric and magnetic fields from the cell centers to the cell nodes, before gathering the fields from the cell nodes to the particle positions. Default: warpx_field_centering_no<x,y,z>=2 with staggered grids, warpx_field_centering_no<x,y,z>=8 with hybrid grids (typically necessary to ensure stability in boosted-frame simulations of relativistic plasmas and beams).

  • warpx_current_centering_nox/noy/noz (integer, optional) – The order of interpolation used with hybrid grids (warpx_grid_type=hybrid) to interpolate the currents from the cell nodes to the cell centers when warpx_do_current_centering=1, before pushing the Maxwell fields on staggered grids. Default: warpx_current_centering_no<x,y,z>=8 with hybrid grids (typically necessary to ensure stability in boosted-frame simulations of relativistic plasmas and beams).

  • warpx_serialize_initial_conditions (bool, default=False) – Controls the random numbers used for initialization. This parameter should only be used for testing and continuous integration.

  • warpx_random_seed (string or int, optional) – (See documentation)

  • warpx_do_dynamic_scheduling (bool, default=True) – Whether to do dynamic scheduling with OpenMP

  • warpx_load_balance_intervals (string, default='0') – The intervals for doing load balancing

  • warpx_load_balance_efficiency_ratio_threshold (float, default=1.1) – (See documentation)

  • warpx_load_balance_with_sfc (bool, default=0) – (See documentation)

  • warpx_load_balance_knapsack_factor (float, default=1.24) – (See documentation)

  • warpx_load_balance_costs_update ({'heuristic' or 'timers'}, optional) – (See documentation)

  • warpx_costs_heuristic_particles_wt (float, optional) – (See documentation)

  • warpx_costs_heuristic_cells_wt (float, optional) – (See documentation)

  • warpx_use_fdtd_nci_corr (bool, optional) – Whether to use the NCI correction when using the FDTD solver

  • warpx_amr_check_input (bool, optional) – Whether AMReX should perform checks on the input (primarily related to the max grid size and blocking factors)

  • warpx_amr_restart (string, optional) – The name of the restart to use

  • warpx_amrex_the_arena_is_managed (bool, optional) – Whether to use managed memory in the AMReX Arena

  • warpx_amrex_the_arena_init_size (long int, optional) – The amount of memory in bytes to allocate in the Arena.

  • warpx_amrex_use_gpu_aware_mpi (bool, optional) – Whether to use GPU-aware MPI communications

  • warpx_zmax_plasma_to_compute_max_step (float, optional) – Sets the simulation run time based on the maximum z value

  • warpx_compute_max_step_from_btd (bool, default=0) – If specified, automatically calculates the number of iterations required in the boosted frame for all back-transformed diagnostics to be completed.

  • warpx_collisions (collision instance, optional) – The collision instance specifying the particle collisions

  • warpx_embedded_boundary (embedded boundary instance, optional) –

  • warpx_break_signals (list of strings) – Signals on which to break

  • warpx_checkpoint_signals (list of strings) – Signals on which to write out a checkpoint

  • warpx_numprocs (list of ints (1 in 1D, 2 in 2D, 3 in 3D)) – Domain decomposition on the coarsest level. The domain will be chopped into the exact number of pieces in each dimension as specified by this parameter. https://warpx.readthedocs.io/en/latest/usage/parameters.html#distribution-across-mpi-ranks-and-parallelization https://warpx.readthedocs.io/en/latest/usage/domain_decomposition.html#simple-method

  • warpx_sort_intervals (string, optional (defaults: -1 on CPU; 4 on GPU)) – Using the Intervals parser syntax, this string defines the timesteps at which particles are sorted. If <=0, do not sort particles. It is turned on on GPUs for performance reasons (to improve memory locality).

  • warpx_sort_particles_for_deposition (bool, optional (default: true for the CUDA backend, otherwise false)) – This option controls the type of sorting used if particle sorting is turned on, i.e. if sort_intervals is not <=0. If true, particles will be sorted by cell to optimize deposition with many particles per cell, in the order x -> y -> z -> ppc. If false, particles will be sorted by bin, using the sort_bin_size parameter below, in the order ppc -> x -> y -> z. true is recommended for best performance on NVIDIA GPUs, especially if there are many particles per cell.

  • warpx_sort_idx_type (list of int, optional (default: 0 0 0)) –

    This controls the type of grid used to sort the particles when sort_particles_for_deposition is true. Possible values are:

    • idx_type = {0, 0, 0}: Sort particles to a cell centered grid,

    • idx_type = {1, 1, 1}: Sort particles to a node centered grid,

    • idx_type = {2, 2, 2}: Compromise between a cell and node centered grid.

    In 2D (XZ and RZ), only the first two elements are read. In 1D, only the first element is read.

  • warpx_sort_bin_size (list of int, optional (default 1 1 1)) – If sort_intervals is activated and sort_particles_for_deposition is false, particles are sorted in bins of sort_bin_size cells. In 2D, only the first two elements are read.

  • warpx_used_inputs_file (string, optional) – The name of the text file that the used input parameters are written to

add_applied_field(applied_field)

Add an applied field

Parameters:

applied_field (applied field instance) – One of the applied field instance. Specifies the properties of the applied field.

add_laser(laser, injection_method)

Add a laser pulse to be injected in the simulation

Parameters:
  • laser_profile (laser instance) – One of laser profile instances. Specifies the physical properties of the laser pulse (e.g. spatial and temporal profile, wavelength, amplitude, etc.).

  • injection_method (laser injection instance, optional) – Specifies how the laser is injected (numerically) into the simulation (e.g. through a laser antenna, or directly added to the mesh). This argument describes an algorithm, not a physical object. It is up to each code to define the default method of injection, if the user does not provide injection_method.

add_species(species, layout, initialize_self_field=None)

Add species to be used in the simulation

Parameters:
  • species (species instance) – An instance of one of the PICMI species objects. Defines species to be added from the physical point of view (e.g. charge, mass, initial distribution of particles).

  • layout (layout instance) – An instance of one of the PICMI particle layout objects. Defines how particles are added into the simulation, from the numerical point of view.

  • initialize_self_field (bool, optional) – Whether the initial space-charge fields of this species is calculated and added to the simulation

step(nsteps=None, mpi_comm=None)[source]

Run the simulation for nsteps timesteps

Parameters:

nsteps (integer, default=1) – The number of timesteps

write_input_file(file_name='inputs')[source]

Write the parameters of the simulation, as defined in the PICMI input, into a code-specific input file.

This can be used for codes that are not Python-driven (e.g. compiled, pure C++ or Fortran codes) and expect a text input in a given format.

Parameters:

file_name (string) – The path to the file that will be created

class pywarpx.picmi.Cartesian3DGrid(number_of_cells=None, lower_bound=None, upper_bound=None, lower_boundary_conditions=None, upper_boundary_conditions=None, nx=None, ny=None, nz=None, xmin=None, xmax=None, ymin=None, ymax=None, zmin=None, zmax=None, bc_xmin=None, bc_xmax=None, bc_ymin=None, bc_ymax=None, bc_zmin=None, bc_zmax=None, moving_window_velocity=None, refined_regions=[], lower_bound_particles=None, upper_bound_particles=None, xmin_particles=None, xmax_particles=None, ymin_particles=None, ymax_particles=None, zmin_particles=None, zmax_particles=None, lower_boundary_conditions_particles=None, upper_boundary_conditions_particles=None, bc_xmin_particles=None, bc_xmax_particles=None, bc_ymin_particles=None, bc_ymax_particles=None, bc_zmin_particles=None, bc_zmax_particles=None, guard_cells=None, pml_cells=None, **kw)[source]

Three-dimensional Cartesian grid. Parameters can be specified either as vectors or separately. (If both are specified, the vector is used.)

Parameters:
  • number_of_cells (vector of integers) – Number of cells along each axis (number of nodes is number_of_cells+1)

  • lower_bound (vector of floats) – Position of the node at the lower bound [m]

  • upper_bound (vector of floats) – Position of the node at the upper bound [m]

  • lower_boundary_conditions (vector of strings) – Conditions at lower boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • upper_boundary_conditions (vector of strings) – Conditions at upper boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • nx (integer) – Number of cells along X (number of nodes=nx+1)

  • ny (integer) – Number of cells along Y (number of nodes=ny+1)

  • nz (integer) – Number of cells along Z (number of nodes=nz+1)

  • xmin (float) – Position of first node along X [m]

  • xmax (float) – Position of last node along X [m]

  • ymin (float) – Position of first node along Y [m]

  • ymax (float) – Position of last node along Y [m]

  • zmin (float) – Position of first node along Z [m]

  • zmax (float) – Position of last node along Z [m]

  • bc_xmin (string) – Boundary condition at min X: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • bc_xmax (string) – Boundary condition at max X: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • bc_ymin (string) – Boundary condition at min Y: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • bc_ymax (string) – Boundary condition at max Y: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • bc_zmin (string) – Boundary condition at min Z: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • bc_zmax (string) – Boundary condition at max Z: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • moving_window_velocity (vector of floats, optional) – Moving frame velocity [m/s]

  • refined_regions (list of lists, optional) – List of refined regions, each element being a list of the format [level, lo, hi, refinement_factor], with level being the refinement level, with 1 being the first level of refinement, 2 being the second etc, lo and hi being vectors of length 3 specifying the extent of the region, and refinement_factor defaulting to [2,2,2] (relative to next lower level)

  • lower_bound_particles (vector of floats, optional) – Position of particle lower bound [m]

  • upper_bound_particles (vector of floats, optional) – Position of particle upper bound [m]

  • xmin_particles (float, optional) – Position of min particle boundary along X [m]

  • xmax_particles (float, optional) – Position of max particle boundary along X [m]

  • ymin_particles (float, optional) – Position of min particle boundary along Y [m]

  • ymax_particles (float, optional) – Position of max particle boundary along Y [m]

  • zmin_particles (float, optional) – Position of min particle boundary along Z [m]

  • zmax_particles (float, optional) – Position of max particle boundary along Z [m]

  • lower_boundary_conditions_particles (vector of strings, optional) – Conditions at lower boundaries for particles, periodic, absorbing, reflect or thermal

  • upper_boundary_conditions_particles (vector of strings, optional) – Conditions at upper boundaries for particles, periodic, absorbing, reflect or thermal

  • bc_xmin_particles (string, optional) – Boundary condition at min X for particles: One of periodic, absorbing, reflect, thermal

  • bc_xmax_particles (string, optional) – Boundary condition at max X for particles: One of periodic, absorbing, reflect, thermal

  • bc_ymin_particles (string, optional) – Boundary condition at min Y for particles: One of periodic, absorbing, reflect, thermal

  • bc_ymax_particles (string, optional) – Boundary condition at max Y for particles: One of periodic, absorbing, reflect, thermal

  • bc_zmin_particles (string, optional) – Boundary condition at min Z for particles: One of periodic, absorbing, reflect, thermal

  • bc_zmax_particles (string, optional) – Boundary condition at max Z for particles: One of periodic, absorbing, reflect, thermal

  • guard_cells (vector of integers, optional) – Number of guard cells used along each direction

  • pml_cells (vector of integers, optional) – Number of Perfectly Matched Layer (PML) cells along each direction

References

  • absorbing_silver_mueller: A local absorbing boundary condition that works best at normal incidence. It is based on the Silver-Mueller radiation condition; see, e.g., A. K. Belhora and L. Pichon, "Maybe Efficient Absorbing Boundary Conditions for the Finite Element Solution of 3D Scattering Problems," 1995.

Implementation specific documentation

See Input Parameters for more information.

Parameters:
  • warpx_max_grid_size (integer, default=32) – Maximum block size in either direction

  • warpx_max_grid_size_x (integer, optional) – Maximum block size in x direction

  • warpx_max_grid_size_y (integer, optional) – Maximum block size in y direction

  • warpx_max_grid_size_z (integer, optional) – Maximum block size in z direction

  • warpx_blocking_factor (integer, optional) – Blocking factor (which controls the block size)

  • warpx_blocking_factor_x (integer, optional) – Blocking factor (which controls the block size) in the x direction

  • warpx_blocking_factor_y (integer, optional) – Blocking factor (which controls the block size) in the y direction

  • warpx_blocking_factor_z (integer, optional) – Blocking factor (which controls the block size) in the z direction

  • warpx_potential_lo_x (float, default=0.) – Electrostatic potential on the lower x boundary

  • warpx_potential_hi_x (float, default=0.) – Electrostatic potential on the upper x boundary

  • warpx_potential_lo_y (float, default=0.) – Electrostatic potential on the lower y boundary

  • warpx_potential_hi_y (float, default=0.) – Electrostatic potential on the upper y boundary

  • warpx_potential_lo_z (float, default=0.) – Electrostatic potential on the lower z boundary

  • warpx_potential_hi_z (float, default=0.) – Electrostatic potential on the upper z boundary

  • warpx_start_moving_window_step (int, default=0) – The timestep at which the moving window starts

  • warpx_end_moving_window_step (int, default=-1) – The timestep at which the moving window ends. If -1, the moving window will continue until the end of the simulation.

  • warpx_boundary_u_th (dict, default=None) – If a thermal boundary is used for particles, this dictionary should specify the thermal speed for each species in the form {<species>: u_th}. Note: u_th = sqrt(T*q_e/mass)/clight with T in eV.
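
As an illustration of how the parameters above fit together, here is a minimal, hypothetical sketch of a 3D grid setup (all cell counts, bounds and boundary-condition choices are only examples):

    from pywarpx import picmi

    # Illustrative 64x64x64 grid: periodic transversely, open longitudinally
    grid = picmi.Cartesian3DGrid(
        number_of_cells=[64, 64, 64],
        lower_bound=[-30.e-6, -30.e-6, -60.e-6],
        upper_bound=[30.e-6, 30.e-6, 0.],
        lower_boundary_conditions=['periodic', 'periodic', 'open'],
        upper_boundary_conditions=['periodic', 'periodic', 'open'],
        lower_boundary_conditions_particles=['periodic', 'periodic', 'absorbing'],
        upper_boundary_conditions_particles=['periodic', 'periodic', 'absorbing'],
        warpx_max_grid_size=32)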

class pywarpx.picmi.Cartesian2DGrid(number_of_cells=None, lower_bound=None, upper_bound=None, lower_boundary_conditions=None, upper_boundary_conditions=None, nx=None, ny=None, xmin=None, xmax=None, ymin=None, ymax=None, bc_xmin=None, bc_xmax=None, bc_ymin=None, bc_ymax=None, moving_window_velocity=None, refined_regions=[], lower_bound_particles=None, upper_bound_particles=None, xmin_particles=None, xmax_particles=None, ymin_particles=None, ymax_particles=None, lower_boundary_conditions_particles=None, upper_boundary_conditions_particles=None, bc_xmin_particles=None, bc_xmax_particles=None, bc_ymin_particles=None, bc_ymax_particles=None, guard_cells=None, pml_cells=None, **kw)[source]

Two-dimensional Cartesian grid. Parameters can be specified either as vectors or separately. (If both are specified, the vector is used.)

Parameters:
  • number_of_cells (vector of integers) – Number of cells along each axis (number of nodes is number_of_cells+1)

  • lower_bound (vector of floats) – Position of the node at the lower bound [m]

  • upper_bound (vector of floats) – Position of the node at the upper bound [m]

  • lower_boundary_conditions (vector of strings) – Conditions at lower boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • upper_boundary_conditions (vector of strings) – Conditions at upper boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • nx (integer) – Number of cells along X (number of nodes=nx+1)

  • ny (integer) – Number of cells along Y (number of nodes=ny+1)

  • xmin (float) – Position of first node along X [m]

  • xmax (float) – Position of last node along X [m]

  • ymin (float) – Position of first node along Y [m]

  • ymax (float) – Position of last node along Y [m]

  • bc_xmin (string) – Boundary condition at min X: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • bc_xmax (string) – Boundary condition at max X: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • bc_ymin (string) – Boundary condition at min Y: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • bc_ymax (string) – Boundary condition at max Y: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • moving_window_velocity (vector of floats, optional) – Moving frame velocity [m/s]

  • refined_regions (list of lists, optional) – List of refined regions, each element being a list of the format [level, lo, hi, refinement_factor], with level being the refinement level, with 1 being the first level of refinement, 2 being the second etc, lo and hi being vectors of length 2 specifying the extent of the region, and refinement_factor defaulting to [2,2] (relative to next lower level)

  • lower_bound_particles (vector of floats, optional) – Position of particle lower bound [m]

  • upper_bound_particles (vector of floats, optional) – Position of particle upper bound [m]

  • xmin_particles (float, optional) – Position of min particle boundary along X [m]

  • xmax_particles (float, optional) – Position of max particle boundary along X [m]

  • ymin_particles (float, optional) – Position of min particle boundary along Y [m]

  • ymax_particles (float, optional) – Position of max particle boundary along Y [m]

  • lower_boundary_conditions_particles (vector of strings, optional) – Conditions at lower boundaries for particles, periodic, absorbing, reflect or thermal

  • upper_boundary_conditions_particles (vector of strings, optional) – Conditions at upper boundaries for particles, periodic, absorbing, reflect or thermal

  • bc_xmin_particles (string, optional) – Boundary condition at min X for particles: One of periodic, absorbing, reflect, thermal

  • bc_xmax_particles (string, optional) – Boundary condition at max X for particles: One of periodic, absorbing, reflect, thermal

  • bc_ymin_particles (string, optional) – Boundary condition at min Y for particles: One of periodic, absorbing, reflect, thermal

  • bc_ymax_particles (string, optional) – Boundary condition at max Y for particles: One of periodic, absorbing, reflect, thermal

  • guard_cells (vector of integers, optional) – Number of guard cells used along each direction

  • pml_cells (vector of integers, optional) – Number of Perfectly Matched Layer (PML) cells along each direction

References

  • absorbing_silver_mueller: A local absorbing boundary condition that works best under normal incidence angle. Based on the Silver-Mueller Radiation Condition, e.g., in A. K. Belhora and L. Pichon, “Efficient Absorbing Boundary Conditions for the Finite Element Solution of 3D Scattering Problems,” 1995.

Implementation specific documentation

See Input Parameters for more information.

Parameters:
  • warpx_max_grid_size (integer, default=32) – Maximum block size in either direction

  • warpx_max_grid_size_x (integer, optional) – Maximum block size in x direction

  • warpx_max_grid_size_y (integer, optional) – Maximum block size in z direction

  • warpx_blocking_factor (integer, optional) – Blocking factor (which controls the block size)

  • warpx_blocking_factor_x (integer, optional) – Blocking factor (which controls the block size) in the x direction

  • warpx_blocking_factor_y (integer, optional) – Blocking factor (which controls the block size) in the z direction

  • warpx_potential_lo_x (float, default=0.) – Electrostatic potential on the lower x boundary

  • warpx_potential_hi_x (float, default=0.) – Electrostatic potential on the upper x boundary

  • warpx_potential_lo_z (float, default=0.) – Electrostatic potential on the lower z boundary

  • warpx_potential_hi_z (float, default=0.) – Electrostatic potential on the upper z boundary

  • warpx_start_moving_window_step (int, default=0) – The timestep at which the moving window starts

  • warpx_end_moving_window_step (int, default=-1) – The timestep at which the moving window ends. If -1, the moving window will continue until the end of the simulation.

  • warpx_boundary_u_th (dict, default=None) – If a thermal boundary is used for particles, this dictionary should specify the thermal speed for each species in the form {<species>: u_th}. Note: u_th = sqrt(T*q_e/mass)/clight with T in eV.

class pywarpx.picmi.Cartesian1DGrid(number_of_cells=None, lower_bound=None, upper_bound=None, lower_boundary_conditions=None, upper_boundary_conditions=None, nx=None, xmin=None, xmax=None, bc_xmin=None, bc_xmax=None, moving_window_velocity=None, refined_regions=[], lower_bound_particles=None, upper_bound_particles=None, xmin_particles=None, xmax_particles=None, lower_boundary_conditions_particles=None, upper_boundary_conditions_particles=None, bc_xmin_particles=None, bc_xmax_particles=None, guard_cells=None, pml_cells=None, **kw)[source]

One-dimensional Cartesian grid. Parameters can be specified either as vectors or separately. (If both are specified, the vector is used.)

Parameters:
  • number_of_cells (vector of integers) – Number of cells along each axis (number of nodes is number_of_cells+1)

  • lower_bound (vector of floats) – Position of the node at the lower bound [m]

  • upper_bound (vector of floats) – Position of the node at the upper bound [m]

  • lower_boundary_conditions (vector of strings) – Conditions at lower boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • upper_boundary_conditions (vector of strings) – Conditions at upper boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • nx (integer) – Number of cells along X (number of nodes=nx+1)

  • xmin (float) – Position of first node along X [m]

  • xmax (float) – Position of last node along X [m]

  • bc_xmin (string) – Boundary condition at min X: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • bc_xmax (string) – Boundary condition at max X: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • moving_window_velocity (vector of floats, optional) – Moving frame velocity [m/s]

  • refined_regions (list of lists, optional) – List of refined regions, each element being a list of the format [level, lo, hi, refinement_factor], with level being the refinement level, with 1 being the first level of refinement, 2 being the second etc, lo and hi being vectors of length 2 specifying the extent of the region, and refinement_factor defaulting to [2,2] (relative to next lower level)

  • lower_bound_particles (vector of floats, optional) – Position of particle lower bound [m]

  • upper_bound_particles (vector of floats, optional) – Position of particle upper bound [m]

  • xmin_particles (float, optional) – Position of min particle boundary along X [m]

  • xmax_particles (float, optional) – Position of max particle boundary along X [m]

  • lower_boundary_conditions_particles (vector of strings, optional) – Conditions at lower boundaries for particles, periodic, absorbing, reflect or thermal

  • upper_boundary_conditions_particles (vector of strings, optional) – Conditions at upper boundaries for particles, periodic, absorbing, reflect or thermal

  • bc_xmin_particles (string, optional) – Boundary condition at min X for particles: One of periodic, absorbing, reflect, thermal

  • bc_xmax_particles (string, optional) – Boundary condition at max X for particles: One of periodic, absorbing, reflect, thermal

  • guard_cells (vector of integers, optional) – Number of guard cells used along each direction

  • pml_cells (vector of integers, optional) – Number of Perfectly Matched Layer (PML) cells along each direction

References

  • absorbing_silver_mueller: A local absorbing boundary condition that works best under normal incidence angle. Based on the Silver-Mueller Radiation Condition, e.g., in A. K. Belhora and L. Pichon, “Efficient Absorbing Boundary Conditions for the Finite Element Solution of 3D Scattering Problems,” 1995.

Implementation specific documentation

See Input Parameters for more information.

Parameters:
  • warpx_max_grid_size (integer, default=32) – Maximum block size in either direction

  • warpx_max_grid_size_x (integer, optional) – Maximum block size in longitudinal direction

  • warpx_blocking_factor (integer, optional) – Blocking factor (which controls the block size)

  • warpx_blocking_factor_x (integer, optional) – Blocking factor (which controls the block size) in the longitudinal direction

  • warpx_potential_lo_z (float, default=0.) – Electrostatic potential on the lower longitudinal boundary

  • warpx_potential_hi_z (float, default=0.) – Electrostatic potential on the upper longitudinal boundary

  • warpx_start_moving_window_step (int, default=0) – The timestep at which the moving window starts

  • warpx_end_moving_window_step (int, default=-1) – The timestep at which the moving window ends. If -1, the moving window will continue until the end of the simulation.

  • warpx_boundary_u_th (dict, default=None) – If a thermal boundary is used for particles, this dictionary should specify the thermal speed for each species in the form {<species>: u_th}. Note: u_th = sqrt(T*q_e/mass)/clight with T in eV.

class pywarpx.picmi.CylindricalGrid(number_of_cells=None, lower_bound=None, upper_bound=None, lower_boundary_conditions=None, upper_boundary_conditions=None, nr=None, nz=None, n_azimuthal_modes=None, rmin=None, rmax=None, zmin=None, zmax=None, bc_rmin=None, bc_rmax=None, bc_zmin=None, bc_zmax=None, moving_window_velocity=None, refined_regions=[], lower_bound_particles=None, upper_bound_particles=None, rmin_particles=None, rmax_particles=None, zmin_particles=None, zmax_particles=None, lower_boundary_conditions_particles=None, upper_boundary_conditions_particles=None, bc_rmin_particles=None, bc_rmax_particles=None, bc_zmin_particles=None, bc_zmax_particles=None, guard_cells=None, pml_cells=None, **kw)[source]

Axisymmetric, cylindrical grid. Parameters can be specified either as vectors or separately. (If both are specified, the vector is used.)

Parameters:
  • number_of_cells (vector of integers) – Number of cells along each axis (number of nodes is number_of_cells+1)

  • lower_bound (vector of floats) – Position of the node at the lower bound [m]

  • upper_bound (vector of floats) – Position of the node at the upper bound [m]

  • lower_boundary_conditions (vector of strings) – Conditions at lower boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • upper_boundary_conditions (vector of strings) – Conditions at upper boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • nr (integer) – Number of cells along R (number of nodes=nr+1)

  • nz (integer) – Number of cells along Z (number of nodes=nz+1)

  • n_azimuthal_modes (integer) – Number of azimuthal modes

  • rmin (float) – Position of first node along R [m]

  • rmax (float) – Position of last node along R [m]

  • zmin (float) – Position of first node along Z [m]

  • zmax (float) – Position of last node along Z [m]

  • bc_rmin (string) – Boundary condition at min R: One of open, dirichlet, absorbing_silver_mueller, or neumann

  • bc_rmax (string) – Boundary condition at max R: One of open, dirichlet, absorbing_silver_mueller, or neumann

  • bc_zmin (string) – Boundary condition at min Z: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • bc_zmax (string) – Boundary condition at max Z: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann

  • moving_window_velocity (vector of floats, optional) – Moving frame velocity [m/s]

  • refined_regions (list of lists, optional) – List of refined regions, each element being a list of the format [level, lo, hi, refinement_factor], with level being the refinement level, with 1 being the first level of refinement, 2 being the second etc, lo and hi being vectors of length 2 specifying the extent of the region, and refinement_factor defaulting to [2,2] (relative to next lower level)

  • lower_bound_particles (vector of floats, optional) – Position of particle lower bound [m]

  • upper_bound_particles (vector of floats, optional) – Position of particle upper bound [m]

  • rmin_particles (float, optional) – Position of min particle boundary along R [m]

  • rmax_particles (float, optional) – Position of max particle boundary along R [m]

  • zmin_particles (float, optional) – Position of min particle boundary along Z [m]

  • zmax_particles (float, optional) – Position of max particle boundary along Z [m]

  • lower_boundary_conditions_particles (vector of strings, optional) – Conditions at lower boundaries for particles, periodic, absorbing, reflect or thermal

  • upper_boundary_conditions_particles (vector of strings, optional) – Conditions at upper boundaries for particles, periodic, absorbing, reflect or thermal

  • bc_rmin_particles (string, optional) – Boundary condition at min R for particles: One of periodic, absorbing, reflect, thermal

  • bc_rmax_particles (string, optional) – Boundary condition at max R for particles: One of periodic, absorbing, reflect, thermal

  • bc_zmin_particles (string, optional) – Boundary condition at min Z for particles: One of periodic, absorbing, reflect, thermal

  • bc_zmax_particles (string, optional) – Boundary condition at max Z for particles: One of periodic, absorbing, reflect, thermal

  • guard_cells (vector of integers, optional) – Number of guard cells used along each direction

  • pml_cells (vector of integers, optional) – Number of Perfectly Matched Layer (PML) cells along each direction

References

  • absorbing_silver_mueller: A local absorbing boundary condition that works best under normal incidence angle. Based on the Silver-Mueller Radiation Condition, e.g., in A. K. Belhora and L. Pichon, “Efficient Absorbing Boundary Conditions for the Finite Element Solution of 3D Scattering Problems,” 1995.

Implementation specific documentation

This assumes that WarpX was compiled with USE_RZ = TRUE

See Input Parameters for more information.

Parameters:
  • warpx_max_grid_size (integer, default=32) – Maximum block size in either direction

  • warpx_max_grid_size_x (integer, optional) – Maximum block size in radial direction

  • warpx_max_grid_size_y (integer, optional) – Maximum block size in longitudinal direction

  • warpx_blocking_factor (integer, optional) – Blocking factor (which controls the block size)

  • warpx_blocking_factor_x (integer, optional) – Blocking factor (which controls the block size) in the radial direction

  • warpx_blocking_factor_y (integer, optional) – Blocking factor (which controls the block size) in the longitudinal direction

  • warpx_potential_lo_r (float, default=0.) – Electrostatic potential on the lower radial boundary

  • warpx_potential_hi_r (float, default=0.) – Electrostatic potential on the upper radial boundary

  • warpx_potential_lo_z (float, default=0.) – Electrostatic potential on the lower longitudinal boundary

  • warpx_potential_hi_z (float, default=0.) – Electrostatic potential on the upper longitudinal boundary

  • warpx_reflect_all_velocities (bool, default=False) – Whether the sign of all of the particle velocities is changed upon reflection on a boundary, or only the velocity normal to the surface

  • warpx_start_moving_window_step (int, default=0) – The timestep at which the moving window starts

  • warpx_end_moving_window_step (int, default=-1) – The timestep at which the moving window ends. If -1, the moving window will continue until the end of the simulation.

  • warpx_boundary_u_th (dict, default=None) – If a thermal boundary is used for particles, this dictionary should specify the thermal speed for each species in the form {<species>: u_th}. Note: u_th = sqrt(T*q_e/mass)/clight with T in eV.
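
A minimal, hypothetical RZ grid sketch (values are illustrative; the 'none' condition on the axis follows common WarpX RZ usage rather than the generic options listed above):

    from pywarpx import picmi

    # Illustrative axisymmetric grid with two azimuthal modes
    rz_grid = picmi.CylindricalGrid(
        number_of_cells=[64, 512],
        n_azimuthal_modes=2,
        lower_bound=[0., -50.e-6],
        upper_bound=[40.e-6, 10.e-6],
        lower_boundary_conditions=['none', 'open'],   # 'none' on the axis (r=0)
        upper_boundary_conditions=['open', 'open'],
        warpx_max_grid_size=64)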

class pywarpx.picmi.EmbeddedBoundary(implicit_function=None, stl_file=None, stl_scale=None, stl_center=None, stl_reverse_normal=False, potential=None, cover_multiple_cuts=None, **kw)[source]

Custom class to handle set up of embedded boundaries specific to WarpX. If embedded boundary initialization is added to picmistandard this can be changed to inherit that functionality. The geometry can be specified either as an implicit function or as an STL file (ASCII or binary). In the latter case the geometry specified in the STL file can be scaled, translated and inverted.

Parameters:
  • implicit_function (string) – Analytic expression describing the embedded boundary

  • stl_file (string) – STL file path (string), file contains the embedded boundary geometry

  • stl_scale (float) – Factor by which the STL geometry is scaled

  • stl_center (vector of floats) – Vector by which the STL geometry is translated (in meters)

  • stl_reverse_normal (bool) – If True inverts the orientation of the STL geometry

  • potential (string, default=0.) – Analytic expression defining the potential. Can only be specified when the solver is electrostatic.

  • cover_multiple_cuts (bool, default=None) – Whether to cover cells with multiple cuts. (If False, this will raise an error if some cells have multiple cuts)

Parameters used in the analytic expressions should be given as additional keyword arguments.
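
For example, a hypothetical embedded boundary defined by an implicit function, with the radius passed as an extra keyword argument (the expression and values are illustrative):

    from pywarpx import picmi

    # Illustrative cylindrical surface of radius 'radius'; 'radius' is an extra
    # keyword argument used inside the expression.
    embedded_boundary = picmi.EmbeddedBoundary(
        implicit_function="(x**2 + y**2 - radius**2)",
        potential="0.",   # only allowed with an electrostatic solver
        radius=0.5e-3)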

Field solvers define the updates of electric and magnetic fields.

class pywarpx.picmi.ElectromagneticSolver(grid, method=None, stencil_order=None, cfl=None, source_smoother=None, field_smoother=None, subcycling=None, galilean_velocity=None, divE_cleaning=None, divB_cleaning=None, pml_divE_cleaning=None, pml_divB_cleaning=None, **kw)[source]

Electromagnetic field solver

Parameters:
  • grid (grid instance) – Grid object for the diagnostic

  • method ({'Yee', 'CKC', 'Lehe', 'PSTD', 'PSATD', 'GPSTD', 'DS', 'ECT'}) –

    The advance method used to solve Maxwell’s equations. The default method is code dependent.

  • stencil_order (vector of integers) – Order of stencil for each axis (-1=infinite)

  • cfl (float, optional) – Fraction of the Courant-Friedrichs-Lewy criterion [1]

  • source_smoother (smoother instance, optional) – Smoother object to apply to the sources

  • field_smoother (smoother instance, optional) – Smoother object to apply to the fields

  • subcycling (integer, optional) – Level of subcycling for the GPSTD solver

  • galilean_velocity (vector of floats, optional) – Velocity of Galilean reference frame [m/s]

  • divE_cleaning (bool, optional) – Solver uses div(E) cleaning if True

  • divB_cleaning (bool, optional) – Solver uses div(B) cleaning if True

  • pml_divE_cleaning (bool, optional) – Solver uses div(E) cleaning in the PML if True

  • pml_divB_cleaning (bool, optional) – Solver uses div(B) cleaning in the PML if True

Implementation specific documentation

See Input Parameters for more information.

Parameters:
  • warpx_pml_ncell (integer, optional) – The depth of the PML, in number of cells

  • warpx_periodic_single_box_fft (bool, default=False) – Whether to do the spectral solver FFTs assuming a single simulation block

  • warpx_current_correction (bool, default=True) – Whether to do the current correction for the spectral solver. See documentation for exceptions to the default value.

  • warpx_psatd_update_with_rho (bool, optional) – Whether to update with the actual rho for the spectral solver

  • warpx_psatd_do_time_averaging (bool, optional) – Whether to do the time averaging for the spectral solver

  • warpx_psatd_J_in_time ({'constant', 'linear'}, default='constant') – This determines whether the current density is assumed to be constant or linear in time, within the time step over which the electromagnetic fields are evolved.

  • warpx_psatd_rho_in_time ({'linear'}, default='linear') – This determines whether the charge density is assumed to be linear in time, within the time step over which the electromagnetic fields are evolved.

  • warpx_do_pml_in_domain (bool, default=False) – Whether to do the PML boundaries within the domain (versus in the guard cells)

  • warpx_pml_has_particles (bool, default=False) – Whether to allow particles in the PML region

  • warpx_do_pml_j_damping (bool, default=False) – Whether to do damping of J in the PML
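
A minimal sketch of an electromagnetic solver setup (the grid object is assumed to have been created as in the grid sketches above; the numerical values are illustrative):

    from pywarpx import picmi

    # 'grid' is a previously created picmi grid object (assumed)
    solver = picmi.ElectromagneticSolver(
        grid=grid,
        method='Yee',
        cfl=0.999,
        warpx_pml_ncell=10)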

class pywarpx.picmi.ElectrostaticSolver(grid, method=None, required_precision=None, maximum_iterations=None, **kw)[source]

Electrostatic field solver

Parameters:
  • grid (grid instance) – Grid object for the diagnostic

  • method (string) – One of ‘FFT’ or ‘Multigrid’

  • required_precision (float, optional) – Level of precision required for iterative solvers

  • maximum_iterations (integer, optional) – Maximum number of iterations for iterative solvers

Implementation specific documentation

See Input Parameters for more information.

Parameters:
  • warpx_relativistic (bool, default=False) – Whether to use the relativistic solver or lab frame solver

  • warpx_absolute_tolerance (float, default=0.) – Absolute tolerance on the lab frame solver

  • warpx_self_fields_verbosity (integer, default=2) – Level of verbosity for the lab frame solver
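
A minimal sketch of an electrostatic solver setup (the grid object and values are assumed/illustrative):

    from pywarpx import picmi

    # 'grid' is a previously created picmi grid object (assumed)
    es_solver = picmi.ElectrostaticSolver(
        grid=grid,
        method='Multigrid',
        required_precision=1.e-7,
        warpx_self_fields_verbosity=0)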

Constants

For convenience, the PICMI interface defines the following constants, which can be used directly inside any PICMI script. The values are in SI units.

  • picmi.constants.c: The speed of light in vacuum.

  • picmi.constants.ep0: The vacuum permittivity \(\epsilon_0\)

  • picmi.constants.mu0: The vacuum permeability \(\mu_0\)

  • picmi.constants.q_e: The elementary charge (absolute value of the charge of an electron).

  • picmi.constants.m_e: The electron mass

  • picmi.constants.m_p: The proton mass
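
For example, the constants can be used directly when computing derived quantities in a PICMI script; a small sketch (the plasma density is an arbitrary example value):

    from pywarpx import picmi

    constants = picmi.constants

    plasma_density = 1.e24  # [m^-3], arbitrary example value
    # Electron plasma frequency computed from the PICMI constants
    plasma_frequency = (plasma_density * constants.q_e**2
                        / (constants.m_e * constants.ep0))**0.5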

Applied fields

Instances of the classes below need to be passed to the method add_applied_field of the Simulation class.

class pywarpx.picmi.AnalyticInitialField(Ex_expression=None, Ey_expression=None, Ez_expression=None, Bx_expression=None, By_expression=None, Bz_expression=None, lower_bound=[None, None, None], upper_bound=[None, None, None], **kw)[source]

Describes an analytic applied field

The expressions should be in terms of the position and time, written as ‘x’, ‘y’, ‘z’, ‘t’. Parameters can be used in the expression with the values given as additional keyword arguments. Expressions should be relative to the lab frame.

Parameters:
  • Ex_expression (string, optional) – Analytic expression describing Ex field [V/m]

  • Ey_expression (string, optional) – Analytic expression describing Ey field [V/m]

  • Ez_expression (string, optional) – Analytic expression describing Ez field [V/m]

  • Bx_expression (string, optional) – Analytic expression describing Bx field [T]

  • By_expression (string, optional) – Analytic expression describing By field [T]

  • Bz_expression (string, optional) – Analytic expression describing Bz field [T]

  • lower_bound (vector, optional) – Lower bound of the region where the field is applied [m].

  • upper_bound (vector, optional) – Upper bound of the region where the field is applied [m]

class pywarpx.picmi.ConstantAppliedField(Ex=None, Ey=None, Ez=None, Bx=None, By=None, Bz=None, lower_bound=[None, None, None], upper_bound=[None, None, None], **kw)[source]

Describes a constant applied field

Parameters:
  • Ex (float, default=0.) – Constant Ex field [V/m]

  • Ey (float, default=0.) – Constant Ey field [V/m]

  • Ez (float, default=0.) – Constant Ez field [V/m]

  • Bx (float, default=0.) – Constant Bx field [T]

  • By (float, default=0.) – Constant By field [T]

  • Bz (float, default=0.) – Constant Bz field [T]

  • lower_bound (vector, optional) – Lower bound of the region where the field is applied [m].

  • upper_bound (vector, optional) – Upper bound of the region where the field is applied [m]

class pywarpx.picmi.AnalyticAppliedField(Ex_expression=None, Ey_expression=None, Ez_expression=None, Bx_expression=None, By_expression=None, Bz_expression=None, lower_bound=[None, None, None], upper_bound=[None, None, None], **kw)[source]

Describes an analytic applied field

The expressions should be in terms of the position and time, written as ‘x’, ‘y’, ‘z’, ‘t’. Parameters can be used in the expression with the values given as additional keyword arguments. Expressions should be relative to the lab frame.

Parameters:
  • Ex_expression (string, optional) – Analytic expression describing Ex field [V/m]

  • Ey_expression (string, optional) – Analytic expression describing Ey field [V/m]

  • Ez_expression (string, optional) – Analytic expression describing Ez field [V/m]

  • Bx_expression (string, optional) – Analytic expression describing Bx field [T]

  • By_expression (string, optional) – Analytic expression describing By field [T]

  • Bz_expression (string, optional) – Analytic expression describing Bz field [T]

  • lower_bound (vector, optional) – Lower bound of the region where the field is applied [m].

  • upper_bound (vector, optional) – Upper bound of the region where the field is applied [m]
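
A hypothetical sketch of an analytic applied field, with the parameters B0 and w0 passed as extra keyword arguments and the field restricted to a region along z (expression and values are illustrative):

    from pywarpx import picmi

    # Illustrative Gaussian Bz profile; B0 and w0 are parameters used in the expression
    external_field = picmi.AnalyticAppliedField(
        Bz_expression="B0*exp(-(x**2 + y**2)/w0**2)",
        B0=1.0, w0=50.e-6,
        lower_bound=[None, None, 0.],
        upper_bound=[None, None, 1.e-3])

The instance is then passed to the Simulation object via add_applied_field, as noted at the top of this section.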

class pywarpx.picmi.PlasmaLens(period, starts, lengths, strengths_E=None, strengths_B=None, **kw)[source]

Custom class to set up a plasma lens lattice. The applied fields depend only on the transverse position.

Parameters:
  • period (float) – Periodicity of the lattice (in lab frame, in meters)

  • starts (list of floats) – The start of each lens relative to the periodic repeat

  • lengths (list of floats) – The length of each lens

  • strengths_E (list of floats, default=0.) – The electric field strength of each lens

  • strengths_B (list of floats, default=0.) – The magnetic field strength of each lens

The field that is applied depends on the transverse position of the particle, (x, y):

  • Ex = x*strengths_E

  • Ey = y*strengths_E

  • Bx = +y*strengths_B

  • By = -x*strengths_B
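
A hypothetical plasma-lens lattice sketch with two lenses per period (all values are illustrative):

    from pywarpx import picmi

    plasma_lens = picmi.PlasmaLens(
        period=0.5,              # lattice period in the lab frame [m]
        starts=[0.1, 0.3],       # start of each lens within the period
        lengths=[0.05, 0.05],    # length of each lens
        strengths_B=[1.0, 1.0])  # magnetic field strength of each lens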

class pywarpx.picmi.Mirror(x_front_location=None, y_front_location=None, z_front_location=None, depth=None, number_of_cells=None, **kw)[source]

Describes a perfectly reflecting mirror, where the E and B fields are zeroed out in a plane of finite thickness.

Parameters:
  • x_front_location (float, optional (see comment below)) – Location in x of the front of the mirror [m]

  • y_front_location (float, optional (see comment below)) – Location in y of the front of the mirror [m]

  • z_front_location (float, optional (see comment below)) – Location in z of the front of the mirror [m]

  • depth (float, optional (see comment below)) – Depth of the mirror [m]

  • number_of_cells (integer, optional (see comment below)) – Minimum number of cells zeroed out

Only one of the [x,y,z]_front_location should be specified. The mirror will be set perpendicular to the respective direction and infinite in the others. The depth of the mirror will be the maximum of the specified depth and number_of_cells, or the code’s default value if neither are specified.
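
A minimal sketch of a mirror placed perpendicular to z (location and thickness are illustrative):

    from pywarpx import picmi

    mirror = picmi.Mirror(
        z_front_location=10.e-6,  # front of the mirror along z [m]
        number_of_cells=4)        # zero the fields over at least 4 cells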

Diagnostics

class pywarpx.picmi.ParticleDiagnostic(period, species=None, data_list=None, write_dir=None, step_min=None, step_max=None, parallelio=None, name=None, **kw)[source]

Defines the particle diagnostics in the simulation frame

Parameters:
  • period (integer) – Period of time steps that the diagnostic is performed

  • species (species instance or list of species instances, optional) – Species to write out. If not specified, all species are written. Note that the name attribute must be defined for the species.

  • data_list (list of strings, optional) – The data to be written out. Possible values ‘position’, ‘momentum’, ‘weighting’. Defaults to the output list of the implementing code.

  • write_dir (string, optional) – Directory where data is to be written

  • step_min (integer, default=0) – Minimum step at which diagnostics could be written

  • step_max (integer, default=unbounded) – Maximum step at which diagnostics could be written

  • parallelio (bool, optional) – If set to True, particle diagnostics are dumped in parallel

  • name (string, optional) – Sets the base name for the diagnostic output files

Implementation specific documentation

See Input Parameters for more information.

Parameters:
  • warpx_format ({plotfile, checkpoint, openpmd, ascent, sensei}, optional) – Diagnostic file format

  • warpx_openpmd_backend ({bp, h5, json}, optional) – Openpmd backend file format

  • warpx_file_prefix (string, optional) – Prefix on the diagnostic file name

  • warpx_file_min_digits (integer, optional) – Minimum number of digits for the time step number in the file name

  • warpx_random_fraction (float, optional) – Random fraction of particles to include in the diagnostic

  • warpx_uniform_stride (integer, optional) – Stride used to down-select the particles to include in the diagnostic

  • warpx_plot_filter_function (string, optional) – Analytic expression used to down-select the particles to include in the diagnostic
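
A hypothetical particle diagnostic sketch, assuming a species object named electrons has been defined elsewhere in the script:

    from pywarpx import picmi

    # 'electrons' is assumed to be a previously created picmi.Species instance
    part_diag = picmi.ParticleDiagnostic(
        name='diag_particles',
        period=100,
        species=[electrons],
        data_list=['position', 'momentum', 'weighting'],
        warpx_format='openpmd',
        warpx_openpmd_backend='h5')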

class pywarpx.picmi.FieldDiagnostic(grid, period, data_list=None, write_dir=None, step_min=None, step_max=None, number_of_cells=None, lower_bound=None, upper_bound=None, parallelio=None, name=None, **kw)[source]

Defines the electromagnetic field diagnostics in the simulation frame

Parameters:
  • grid (grid instance) – Grid object for the diagnostic

  • period (integer) – Period of time steps that the diagnostic is performed

  • data_list (list of strings, optional) – List of quantities to write out. Possible values ‘rho’, ‘E’, ‘B’, ‘J’, ‘Ex’ etc. Defaults to the output list of the implementing code.

  • write_dir (string, optional) – Directory where data is to be written

  • step_min (integer, default=0) – Minimum step at which diagnostics could be written

  • step_max (integer, default=unbounded) – Maximum step at which diagnostics could be written

  • number_of_cells (vector of integers, optional) – Number of cells in each dimension. If not given, will be obtained from grid.

  • lower_bound (vector of floats, optional) – Lower corner of diagnostics box in each direction. If not given, will be obtained from grid.

  • upper_bound (vector of floats, optional) – Higher corner of diagnostics box in each direction. If not given, will be obtained from grid.

  • parallelio (bool, optional) – If set to True, field diagnostics are dumped in parallel

  • name (string, optional) – Sets the base name for the diagnostic output files

Implementation specific documentation

See Input Parameters for more information.

Parameters:
  • warpx_plot_raw_fields (bool, optional) – Flag whether to dump the raw fields

  • warpx_plot_raw_fields_guards (bool, optional) – Flag whether the raw fields should include the guard cells

  • warpx_format ({plotfile, checkpoint, openpmd, ascent, sensei}, optional) – Diagnostic file format

  • warpx_openpmd_backend ({bp, h5, json}, optional) – Openpmd backend file format

  • warpx_file_prefix (string, optional) – Prefix on the diagnostic file name

  • warpx_file_min_digits (integer, optional) – Minimum number of digits for the time step number in the file name

  • warpx_dump_rz_modes (bool, optional) – Flag whether to dump the data for all RZ modes

  • warpx_particle_fields_to_plot (list of ParticleFieldDiagnostics) – List of ParticleFieldDiagnostic classes to install in the simulation. Error checking is handled in the class itself.

  • warpx_particle_fields_species (list of strings, optional) – Species for which to calculate particle_fields_to_plot functions. Fields will be calculated separately for each specified species. If not passed, default is all of the available particle species.
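
A hypothetical field diagnostic sketch (the grid object is assumed, and the output options are illustrative):

    from pywarpx import picmi

    # 'grid' is the grid object used by the simulation (assumed)
    field_diag = picmi.FieldDiagnostic(
        name='diag_fields',
        grid=grid,
        period=100,
        data_list=['E', 'B', 'rho'],
        warpx_format='openpmd')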

pywarpx.picmi.ElectrostaticFieldDiagnostic

alias of FieldDiagnostic

class pywarpx.picmi.Checkpoint(period=1, write_dir=None, name=None, **kw)[source]

Sets up checkpointing of the simulation, allowing for later restarts

See Input Parameters for more information.

Parameters:
  • warpx_file_prefix (string) – The prefix to the checkpoint directory names

  • warpx_file_min_digits (integer) – Minimum number of digits for the time step number in the checkpoint directory name.
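
A minimal checkpoint sketch, writing restart data every 1000 steps (values illustrative):

    from pywarpx import picmi

    checkpoint = picmi.Checkpoint(
        period=1000,
        warpx_file_prefix='checkpoints')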

class pywarpx.picmi.ReducedDiagnostic(diag_type, name=None, period=1, path=None, extension=None, separator=None, **kw)[source]

Sets up a reduced diagnostic in the simulation.

See Input Parameters for more information.

Parameters:
  • diag_type (string) – The type of reduced diagnostic. See the link above for all the different types of reduced diagnostics available.

  • name (string) – The name of this diagnostic which will also be the name of the data file written to disk.

  • period (integer) – The simulation step interval at which to output this diagnostic.

  • path (string) – The file path in which the diagnostic file should be written.

  • extension (string) – The file extension used for the diagnostic output.

  • separator (string) – The separator between row values in the output file.

  • species (species instance) – The name of the species for which to calculate the diagnostic, required for diagnostic types ‘BeamRelevant’, ‘ParticleHistogram’, and ‘ParticleExtrema’

  • bin_number (integer) – For diagnostic type ‘ParticleHistogram’, the number of bins used for the histogram

  • bin_max (float) – For diagnostic type ‘ParticleHistogram’, the maximum value of the bins

  • bin_min (float) – For diagnostic type ‘ParticleHistogram’, the minimum value of the bins

  • normalization ({'unity_particle_weight', 'max_to_unity', 'area_to_unity'}, optional) – For diagnostic type ‘ParticleHistogram’, normalization method of the histogram.

  • histogram_function (string) – For diagnostic type ‘ParticleHistogram’, the function evaluated to produce the histogram data

  • filter_function (string, optional) – For diagnostic type ‘ParticleHistogram’, the function to filter whether particles are included in the histogram

  • reduced_function (string) – For diagnostic type ‘FieldReduction’, the function of the fields to evaluate

  • weighting_function (string, optional) – For diagnostic type ‘ChargeOnEB’, the function to weight contributions to the total charge

  • reduction_type ({'Maximum', 'Minimum', or 'Integral'}) – For diagnostic type ‘FieldReduction’, the type of reduction

  • probe_geometry ({'Point', 'Line', 'Plane'}, default='Point') – For diagnostic type ‘FieldProbe’, the geometry of the probe

  • integrate (bool, default=False) – For diagnostic type ‘FieldProbe’, whether the field is integrated

  • do_moving_window_FP (bool, default=False) – For diagnostic type ‘FieldProbe’, whether the moving window is followed

  • x_probe (floats) – For diagnostic type ‘FieldProbe’, a probe location. For ‘Point’, the location of the point. For ‘Line’, the start of the line. For ‘Plane’, the center of the square detector.

  • y_probe (floats) – For diagnostic type ‘FieldProbe’, a probe location. For ‘Point’, the location of the point. For ‘Line’, the start of the line. For ‘Plane’, the center of the square detector.

  • z_probe (floats) – For diagnostic type ‘FieldProbe’, a probe location. For ‘Point’, the location of the point. For ‘Line’, the start of the line. For ‘Plane’, the center of the square detector.

  • interp_order (integer) – For diagnostic type ‘FieldProbe’, the interpolation order for ‘Line’ and ‘Plane’

  • resolution (integer) – For diagnostic type ‘FieldProbe’, the number of points along the ‘Line’ or along each edge of the square ‘Plane’

  • x1_probe (floats) – For diagnostic type ‘FieldProbe’, the end point for ‘Line’

  • y1_probe (floats) – For diagnostic type ‘FieldProbe’, the end point for ‘Line’

  • z1_probe (floats) – For diagnostic type ‘FieldProbe’, the end point for ‘Line’

  • detector_radius (float) – For diagnostic type ‘FieldProbe’, the detector “radius” (half edge length) of the ‘Plane’

  • target_normal_x (floats) – For diagnostic type ‘FieldProbe’, the normal vector to the ‘Plane’. Only applicable in 3D

  • target_normal_y (floats) – For diagnostic type ‘FieldProbe’, the normal vector to the ‘Plane’. Only applicable in 3D

  • target_normal_z (floats) – For diagnostic type ‘FieldProbe’, the normal vector to the ‘Plane’. Only applicable in 3D

  • target_up_x (floats) – For diagnostic type ‘FieldProbe’, the vector specifying up in the ‘Plane’

  • target_up_y (floats) – For diagnostic type ‘FieldProbe’, the vector specifying up in the ‘Plane’

  • target_up_z (floats) – For diagnostic type ‘FieldProbe’, the vector specifying up in the ‘Plane’
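
Two hypothetical reduced-diagnostic sketches, one beam diagnostic and one particle histogram (the species object, histogram function and bin ranges are assumptions):

    from pywarpx import picmi

    # 'beam' is assumed to be a previously created picmi.Species instance
    beam_stats = picmi.ReducedDiagnostic(
        diag_type='BeamRelevant',
        name='beam_stats',
        period=50,
        species=beam)

    uz_histogram = picmi.ReducedDiagnostic(
        diag_type='ParticleHistogram',
        name='uz_histogram',
        period=50,
        species=beam,
        histogram_function='uz',   # assumed histogram function
        bin_number=100,
        bin_min=0.,
        bin_max=10.)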

Lab-frame diagnostics are used when running boosted-frame simulations.

class pywarpx.picmi.LabFrameParticleDiagnostic(grid, num_snapshots, dt_snapshots, data_list=None, time_start=0.0, species=None, write_dir=None, parallelio=None, name=None, **kw)[source]

Defines the particle diagnostics in the lab frame

Parameters:
  • grid (grid instance) – Grid object for the diagnostic

  • num_snapshots (integer) – Number of lab frame snapshots to make

  • dt_snapshots (float) – Time between each snapshot in lab frame

  • species (species instance or list of species instances, optional) – Species to write out. If not specified, all species are written. Note that the name attribute must be defined for the species.

  • data_list (list of strings, optional) – The data to be written out. Possible values ‘position’, ‘momentum’, ‘weighting’. Defaults to the output list of the implementing code.

  • time_start (float, default=0) – Time for the first snapshot in lab frame

  • write_dir (string, optional) – Directory where data is to be written

  • parallelio (bool, optional) – If set to True, particle diagnostics are dumped in parallel

  • name (string, optional) – Sets the base name for the diagnostic output files

Implementation specific documentation

See Input Parameters for more information.

Parameters:
  • warpx_format (string, optional) – Passed to <diagnostic name>.format

  • warpx_openpmd_backend (string, optional) – Passed to <diagnostic name>.openpmd_backend

  • warpx_file_prefix (string, optional) – Passed to <diagnostic name>.file_prefix

  • warpx_intervals (integer or string) – Selects the snapshots to be made; if specified, “num_snapshots” (which makes all snapshots) is ignored.

  • warpx_file_min_digits (integer, optional) – Passed to <diagnostic name>.file_min_digits

  • warpx_buffer_size (integer, optional) – Passed to <diagnostic name>.buffer_size

class pywarpx.picmi.LabFrameFieldDiagnostic(grid, num_snapshots, dt_snapshots, data_list=None, z_subsampling=1, time_start=0.0, write_dir=None, parallelio=None, name=None, **kw)[source]

Defines the electromagnetic field diagnostics in the lab frame

Parameters:
  • grid (grid instance) – Grid object for the diagnostic

  • num_snapshots (integer) – Number of lab frame snapshots to make

  • dt_snapshots (float) – Time between each snapshot in lab frame

  • data_list (list of strings, optional) – List of quantities to write out. Possible values ‘rho’, ‘E’, ‘B’, ‘J’, ‘Ex’ etc. Defaults to the output list of the implementing code.

  • z_subsampling (integer, default=1) – A factor which is applied on the resolution of the lab frame reconstruction

  • time_start (float, default=0) – Time for the first snapshot in lab frame

  • write_dir (string, optional) – Directory where data is to be written

  • parallelio (bool, optional) – If set to True, field diagnostics are dumped in parallel

  • name (string, optional) – Sets the base name for the diagnostic output files

Implementation specific documentation

See Input Parameters for more information.

Parameters:
  • warpx_format (string, optional) – Passed to <diagnostic name>.format

  • warpx_openpmd_backend (string, optional) – Passed to <diagnostic name>.openpmd_backend

  • warpx_file_prefix (string, optional) – Passed to <diagnostic name>.file_prefix

  • warpx_intervals (integer or string) – Selects the snapshots to be made; if specified, “num_snapshots” (which makes all snapshots) is ignored.

  • warpx_file_min_digits (integer, optional) – Passed to <diagnostic name>.file_min_digits

  • warpx_buffer_size (integer, optional) – Passed to <diagnostic name>.buffer_size

  • warpx_lower_bound (vector of floats, optional) – Passed to <diagnostic name>.lower_bound

  • warpx_upper_bound (vector of floats, optional) – Passed to <diagnostic name>.upper_bound
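
A minimal back-transformed field diagnostic sketch for a boosted-frame run (snapshot count and spacing are illustrative; the grid object is assumed):

    from pywarpx import picmi

    # 'grid' is the grid object used by the simulation (assumed)
    btd_fields = picmi.LabFrameFieldDiagnostic(
        name='lab_fields',
        grid=grid,
        num_snapshots=20,
        dt_snapshots=1.e-13,
        data_list=['E', 'B'],
        warpx_format='openpmd')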

Particles

Species objects are a collection of particles with similar properties. For instance, background plasma electrons, background plasma ions and an externally injected beam could each be their own particle species.

class pywarpx.picmi.Species(particle_type=None, name=None, charge_state=None, charge=None, mass=None, initial_distribution=None, particle_shape=None, density_scale=None, method=None, **kw)[source]

Sets up the species to be simulated. The species charge and mass can be specified by setting the particle type or by setting them directly. If the particle type is specified, the charge or mass can be set to override the value from the type.

Parameters:
  • particle_type (string, optional) – A string specifying an elementary particle, atom, or other, as defined in the openPMD 2 species type extension, openPMD-standard/EXT_SpeciesType.md

  • name (string, optional) – Name of the species

  • method ({'Boris', 'Vay', 'Higuera-Cary', 'Li' , 'free-streaming', 'LLRK4'}) –

    The particle advance method to use. A code-specific method can be specified using ‘other:<method>’. The default is code dependent.

    • ’Boris’: Standard “leap-frog” Boris advance

    • ’Vay’:

    • ’Higuera-Cary’:

    • ’Li’ :

    • ’free-streaming’: Advance with no fields

    • ’LLRK4’: Landau-Lifschitz radiation reaction formula with RK-4

  • charge_state (float, optional) – Charge state of the species (applies only to atoms) [1]

  • charge (float, optional) – Particle charge, required when type is not specified, otherwise determined from type [C]

  • mass (float, optional) – Particle mass, required when type is not specified, otherwise determined from type [kg]

  • initial_distribution (distribution instance) – The initial distribution loaded at t=0. Must be one of the standard distributions implemented.

  • density_scale (float, optional) – A scale factor on the density given by the initial_distribution

  • particle_shape ({'NGP', 'linear', 'quadratic', 'cubic'}) – Particle shape used for deposition and gather. If not specified, the value from the Simulation object will be used. Other values may be specified that are code dependent.

Implementation specific documentation

See Input Parameters for more information.

Parameters:
  • warpx_boost_adjust_transverse_positions (bool, default=False) – Whether to adjust transverse positions when applying the boost to the simulation frame

  • warpx_self_fields_required_precision (float, default=1.e-11) – Relative precision on the electrostatic solver (when using the relativistic solver)

  • warpx_self_fields_absolute_tolerance (float, default=0.) – Absolute precision on the electrostatic solver (when using the relativistic solver)

  • warpx_self_fields_max_iters (integer, default=200) – Maximum number of iterations for the electrostatic solver for the species

  • warpx_self_fields_verbosity (integer, default=2) – Level of verbosity for the electrostatic solver

  • warpx_save_previous_position (bool, default=False) – Whether to save the old particle positions

  • warpx_do_not_deposit (bool, default=False) – Whether or not to deposit the charge and current density for this species

  • warpx_do_not_push (bool, default=False) – Whether or not to push this species

  • warpx_do_not_gather (bool, default=False) – Whether or not to gather the fields from grids for this species

  • warpx_random_theta (bool, default=True) – Whether or not to add a random angle to the particles in theta when in RZ mode.

  • warpx_reflection_model_xlo (string, default='0.') – Expression (in terms of the velocity “v”) specifying the probability that the particle will reflect on the lower x boundary

  • warpx_reflection_model_xhi (string, default='0.') – Expression (in terms of the velocity “v”) specifying the probability that the particle will reflect on the upper x boundary

  • warpx_reflection_model_ylo (string, default='0.') – Expression (in terms of the velocity “v”) specifying the probability that the particle will reflect on the lower y boundary

  • warpx_reflection_model_yhi (string, default='0.') – Expression (in terms of the velocity “v”) specifying the probability that the particle will reflect on the upper y boundary

  • warpx_reflection_model_zlo (string, default='0.') – Expression (in terms of the velocity “v”) specifying the probability that the particle will reflect on the lower z boundary

  • warpx_reflection_model_zhi (string, default='0.') – Expression (in terms of the velocity “v”) specifying the probability that the particle will reflect on the upper z boundary

  • warpx_save_particles_at_xlo (bool, default=False) – Whether to save particles lost at the lower x boundary

  • warpx_save_particles_at_xhi (bool, default=False) – Whether to save particles lost at the upper x boundary

  • warpx_save_particles_at_ylo (bool, default=False) – Whether to save particles lost at the lower y boundary

  • warpx_save_particles_at_yhi (bool, default=False) – Whether to save particles lost at the upper y boundary

  • warpx_save_particles_at_zlo (bool, default=False) – Whether to save particles lost at the lower z boundary

  • warpx_save_particles_at_zhi (bool, default=False) – Whether to save particles lost at the upper z boundary

  • warpx_save_particles_at_eb (bool, default=False) – Whether to save particles lost at the embedded boundary

  • warpx_do_resampling (bool, default=False) – Whether particles will be resampled

  • warpx_resampling_min_ppc (int, default=1) – Cells with fewer particles than this number will be skipped during resampling.

  • warpx_resampling_trigger_intervals (integer or string, default=0) – Timesteps at which to resample

  • warpx_resampling_trigger_max_avg_ppc (int, default=infinity) – Resampling will be done when the average number of particles per cell exceeds this number

  • warpx_resampling_algorithm (str, default="leveling_thinning") – Resampling algorithm to use.

  • warpx_resampling_algorithm_velocity_grid_type (str, default="spherical") – Type of grid to use when clustering particles in velocity space. Only applicable with the velocity_coincidence_thinning algorithm.

  • warpx_resampling_algorithm_delta_ur (float) – Size of velocity window used for clustering particles during grid-based merging, with velocity_grid_type == “spherical”.

  • warpx_resampling_algorithm_n_theta (int) – Number of bins to use in theta when clustering particle velocities during grid-based merging, with velocity_grid_type == “spherical”.

  • warpx_resampling_algorithm_n_phi (int) – Number of bins to use in phi when clustering particle velocities during grid-based merging, with velocity_grid_type == “spherical”.

  • warpx_resampling_algorithm_delta_u (array of floats or float) – Size of velocity window used in ux, uy and uz for clustering particles during grid-based merging, with velocity_grid_type == “cartesian”. If a single number is given the same du value will be used in all three directions.
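
A minimal species sketch (the initial distribution object is assumed to be defined elsewhere; see the distribution classes below):

    from pywarpx import picmi

    # 'plasma_dist' is assumed to be a distribution instance (see below)
    electrons = picmi.Species(
        particle_type='electron',
        name='electrons',
        initial_distribution=plasma_dist,
        warpx_save_particles_at_zhi=True)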

class pywarpx.picmi.MultiSpecies(particle_types=None, names=None, charge_states=None, charges=None, masses=None, proportions=None, initial_distribution=None, particle_shape=None, **kw)[source]

INCOMPLETE: the proportions argument is not implemented. Multiple species that are initialized with the same distribution. Each parameter can be a list, giving a value for each species, or a single value which is given to all species. The species charge and mass can be specified by setting the particle type or by setting them directly. If the particle type is specified, the charge or mass can be set to override the value from the type.

Parameters:
  • particle_types (list of strings, optional) – A string specifying an elementary particle, atom, or other, as defined in the openPMD 2 species type extension, openPMD-standard/EXT_SpeciesType.md

  • names (list of strings, optional) – Names of the species

  • charge_states (list of floats, optional) – Charge states of the species (applies only to atoms)

  • charges (list of floats, optional) – Particle charges, required when type is not specified, otherwise determined from type [C]

  • masses (list of floats, optional) – Particle masses, required when type is not specified, otherwise determined from type [kg]

  • proportions (list of floats, optional) – Proportions of the initial distribution made up by each species

  • initial_distribution (distribution instance) – Initial particle distribution, applied to all species

  • particle_shape ({'NGP', 'linear', 'quadratic', 'cubic'}) – Particle shape used for deposition and gather. If not specified, the value from the Simulation object will be used. Other values may be specified that are code dependent.

Particle distributions can be used to initialize particles in a particle species.

class pywarpx.picmi.GaussianBunchDistribution(n_physical_particles, rms_bunch_size, rms_velocity=[0.0, 0.0, 0.0], centroid_position=[0.0, 0.0, 0.0], centroid_velocity=[0.0, 0.0, 0.0], velocity_divergence=[0.0, 0.0, 0.0], **kw)[source]

Describes a Gaussian distribution of particles

Parameters:
  • n_physical_particles (integer) – Number of physical particles in the bunch

  • rms_bunch_size (vector of length 3 of floats) – RMS bunch size at t=0 [m]

  • rms_velocity (vector of length 3 of floats, default=[0.,0.,0.]) – RMS velocity spread at t=0 [m/s]

  • centroid_position (vector of length 3 of floats, default=[0.,0.,0.]) – Position of the bunch centroid at t=0 [m]

  • centroid_velocity (vector of length 3 of floats, default=[0.,0.,0.]) – Velocity (gamma*V) of the bunch centroid at t=0 [m/s]

  • velocity_divergence (vector of length 3 of floats, default=[0.,0.,0.]) – Expansion rate of the bunch at t=0 [m/s/m]

class pywarpx.picmi.UniformDistribution(density, lower_bound=[None, None, None], upper_bound=[None, None, None], rms_velocity=[0.0, 0.0, 0.0], directed_velocity=[0.0, 0.0, 0.0], fill_in=None, **kw)[source]

Describes a uniform density distribution of particles

Parameters:
  • density (float) – Physical number density [m^-3]

  • lower_bound (vector of length 3 of floats, optional) – Lower bound of the distribution [m]

  • upper_bound (vector of length 3 of floats, optional) – Upper bound of the distribution [m]

  • rms_velocity (vector of length 3 of floats, default=[0.,0.,0.]) – Thermal velocity spread [m/s]

  • directed_velocity (vector of length 3 of floats, default=[0.,0.,0.]) – Directed, average, velocity [m/s]

  • fill_in (bool, optional) – Flags whether to fill in the empty space opened up when the grid moves
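
A hypothetical uniform plasma sketch, bounded only from below in z and refilling the space opened by a moving grid (values illustrative):

    from pywarpx import picmi

    plasma_dist = picmi.UniformDistribution(
        density=1.e24,                     # [m^-3]
        lower_bound=[None, None, 10.e-6],  # only bound the distribution in z
        fill_in=True)                      # refill space opened up by the moving grid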

class pywarpx.picmi.AnalyticDistribution(density_expression, momentum_expressions=[None, None, None], lower_bound=[None, None, None], upper_bound=[None, None, None], rms_velocity=[0.0, 0.0, 0.0], directed_velocity=[0.0, 0.0, 0.0], fill_in=None, **kw)[source]

Describes a particle distribution with an analytically specified density

Parameters:
  • density_expression (string) – Analytic expression describing physical number density (string) [m^-3]. Expression should be in terms of the position, written as ‘x’, ‘y’, and ‘z’. Parameters can be used in the expression with the values given as keyword arguments.

  • momentum_expressions (list of strings) – Analytic expressions describing the gamma*velocity for each axis [m/s]. Expressions should be in terms of the position, written as ‘x’, ‘y’, and ‘z’. Parameters can be used in the expression with the values given as keyword arguments. For any axis not supplied (set to None), directed_velocity will be used.

  • lower_bound (vector of length 3 of floats, optional) – Lower bound of the distribution [m]

  • upper_bound (vector of length 3 of floats, optional) – Upper bound of the distribution [m]

  • rms_velocity (vector of length 3 of floats, default=[0.,0.,0.]) – Thermal velocity spread [m/s]

  • directed_velocity (vector of length 3 of floats, default=[0.,0.,0.]) – Directed, average, velocity [m/s]

  • fill_in (bool, optional) – Flags whether to fill in the empty space opened up when the grid moves

This example will create a distribution where the density is n0 below rmax and zero elsewhere:

    dist = AnalyticDistribution(density_expression='((x**2+y**2)<rmax**2)*n0',
                                rmax=1., n0=1.e20, …)

Implementation specific documentation

Parameters:
  • warpx_momentum_spread_expressions (list of strings) – Analytic expressions describing the gamma*velocity spread for each axis [m/s]. Expressions should be in terms of the position, written as ‘x’, ‘y’, and ‘z’. Parameters can be used in the expression with the values given as keyword arguments. For any axis not supplied (set to None), zero will be used.

class pywarpx.picmi.ParticleListDistribution(x=0.0, y=0.0, z=0.0, ux=0.0, uy=0.0, uz=0.0, weight=0.0, **kw)[source]

Load particles at the specified positions and velocities

Parameters:
  • x (float, default=0.) – List of x positions of the particles [m]

  • y (float, default=0.) – List of y positions of the particles [m]

  • z (float, default=0.) – List of z positions of the particles [m]

  • ux (float, default=0.) – List of ux values of the particles (ux = gamma*vx) [m/s]

  • uy (float, default=0.) – List of uy values of the particles (uy = gamma*vy) [m/s]

  • uz (float, default=0.) – List of uz values of the particles (uz = gamma*vz) [m/s]

  • weight (float) – Particle weight or list of weights, number of real particles per simulation particle
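
For illustration, a hedged sketch that loads two macroparticles at explicit positions and momenta (assuming from pywarpx import picmi; lists are accepted, as described above, and all values are placeholders):

dist = picmi.ParticleListDistribution(x=[0., 1.e-6], y=[0., 0.], z=[0., 0.],
                                      ux=[0., 0.], uy=[0., 0.], uz=[1.e8, 2.e8],
                                      weight=[1.e10, 1.e10])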

Particle layouts determine how to microscopically place macro particles in a grid cell.

class pywarpx.picmi.GriddedLayout(n_macroparticle_per_cell, grid=None, **kw)[source]

Specifies a gridded layout of particles

Parameters:
  • n_macroparticle_per_cell (vector of integers) – Number of particles per cell along each axis

  • grid (grid instance, optional) – Grid object specifying the grid to follow. If not specified, the underlying grid of the code is used.

class pywarpx.picmi.PseudoRandomLayout(n_macroparticles=None, n_macroparticles_per_cell=None, seed=None, grid=None, **kw)[source]

Specifies a pseudo-random layout of the particles

Parameters:
  • n_macroparticles (integer) – Total number of macroparticles to load. Either this argument or n_macroparticles_per_cell should be supplied.

  • n_macroparticles_per_cell (integer) – Number of macroparticles to load per cell. Either this argument or n_macroparticles should be supplied.

  • seed (integer, optional) – Pseudo-random number generator seed

  • grid (grid instance, optional) – Grid object specifying the grid to follow for n_macroparticles_per_cell. If not specified, the underlying grid of the code is used.
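
For illustration, hedged sketches of both layouts (assuming from pywarpx import picmi and that grid is a previously created grid instance; the numbers are placeholders):

gridded_layout = picmi.GriddedLayout(n_macroparticle_per_cell=[2, 2, 2], grid=grid)
random_layout = picmi.PseudoRandomLayout(n_macroparticles_per_cell=8, seed=42, grid=grid)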

Other operations related to particles:

class pywarpx.picmi.CoulombCollisions(name, species, CoulombLog=None, ndt=None, **kw)[source]

Custom class to handle setup of binary Coulomb collisions in WarpX. If collision initialization is added to picmistandard this can be changed to inherit that functionality.

Parameters:
  • name (string) – Name of instance (used in the inputs file)

  • species (list of species instances) – The species involved in the collision. Must be of length 2.

  • CoulombLog (float, optional) – Value of the Coulomb log to use in the collision cross section. If not supplied, it is calculated from the local conditions.

  • ndt (integer, optional) – The collisions will be applied every “ndt” steps. Must be 1 or larger.
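
For illustration, a hedged sketch of setting up such a collision object (assuming from pywarpx import picmi and that electrons and ions are previously created species instances; the values are placeholders):

collisions = picmi.CoulombCollisions(name='coll_ei',
                                     species=[electrons, ions],
                                     CoulombLog=10.,
                                     ndt=1)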

class pywarpx.picmi.DSMCCollisions(name, species, scattering_processes, ndt=None, **kw)[source]

Custom class to handle setup of DSMC collisions in WarpX. If collision initialization is added to picmistandard this can be changed to inherit that functionality.

Parameters:
  • name (string) – Name of instance (used in the inputs file)

  • species (species instance) – The species involved in the collision

  • scattering_processes (dictionary) – The scattering processes to use and any needed information

  • ndt (integer, optional) – The collisions will be applied every “ndt” steps. Must be 1 or larger.

class pywarpx.picmi.MCCCollisions(name, species, background_density, background_temperature, scattering_processes, background_mass=None, max_background_density=None, ndt=None, **kw)[source]

Custom class to handle setup of MCC collisions in WarpX. If collision initialization is added to picmistandard this can be changed to inherit that functionality.

Parameters:
  • name (string) – Name of instance (used in the inputs file)

  • species (species instance) – The species involved in the collision

  • background_density (float or string) – The density of the background. A string expression as a function of (x, y, z, t) can be used.

  • background_temperature (float or string) – The temperature of the background. A string expression as a function of (x, y, z, t) can be used.

  • scattering_processes (dictionary) – The scattering processes to use and any needed information

  • background_mass (float, optional) – The mass of the background particle. If not supplied, the default depends on the type of scattering process.

  • max_background_density (float) – The maximum background density. When the background_density is an expression, this must also be specified.

  • ndt (integer, optional) – The collisions will be applied every “ndt” steps. Must be 1 or larger.
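
For illustration, a hedged sketch of an MCC setup (assuming from pywarpx import picmi and that electrons is a previously created species instance; the dictionary key 'cross_section' and the file name follow the WarpX examples but should be checked against the examples section, and all values are placeholders):

mcc = picmi.MCCCollisions(name='coll_elec',
                          species=electrons,
                          background_density=1.e22,
                          background_temperature=300.,
                          scattering_processes={'elastic': {'cross_section': 'electron_elastic.dat'}},
                          ndt=4)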

class pywarpx.picmi.FieldIonization(model, ionized_species, product_species, **kw)[source]

Field ionization on an ion species

Parameters:
  • model (string) – Ionization model, e.g. “ADK”

  • ionized_species (species instance) – Species that is ionized

  • product_species (species instance) – Species in which ionized electrons are stored.

Implementation specific documentation

WarpX currently only has the ADK ionization model implemented.
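
For illustration, a hedged sketch (assuming from pywarpx import picmi and that ions and electrons are previously created species instances):

ionization = picmi.FieldIonization(model='ADK',
                                   ionized_species=ions,
                                   product_species=electrons)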

Laser Pulses

Laser profiles can be used to initialize laser pulses in the simulation.

class pywarpx.picmi.GaussianLaser(wavelength, waist, duration, propagation_direction, polarization_direction, focal_position, centroid_position, a0=None, E0=None, phi0=None, zeta=None, beta=None, phi2=None, name=None, fill_in=True, **kw)[source]

Specifies a Gaussian laser distribution.

More precisely, the electric field near the focal plane is given by:

\[E(\boldsymbol{x},t) = a_0\times E_0\, \exp\left( -\frac{r^2}{w_0^2} - \frac{(z-z_0-ct)^2}{c^2\tau^2} \right) \cos[ k_0( z - z_0 - ct ) - \phi_{cep} ]\]

where \(k_0 = 2\pi/\lambda_0\) is the wavevector and where \(E_0 = m_e c^2 k_0 / q_e\) is the field amplitude for \(a_0=1\).

Note

The additional terms that arise far from the focal plane (Gouy phase, wavefront curvature, …) are not included in the above formula for simplicity, but are of course taken into account by the code, when initializing the laser pulse away from the focal plane.

Parameters:
  • wavelength (float) – Laser wavelength [m], defined as \(\lambda_0\) in the above formula

  • waist (float) – Waist of the Gaussian pulse at focus [m], defined as \(w_0\) in the above formula

  • duration (float) – Duration of the Gaussian pulse [s], defined as \(\tau\) in the above formula

  • propagation_direction (unit vector of length 3 of floats) – Direction of propagation [1]

  • polarization_direction (unit vector of length 3 of floats) – Direction of polarization [1]

  • focal_position (vector of length 3 of floats) – Position of the laser focus [m]

  • centroid_position (vector of length 3 of floats) – Position of the laser centroid at time 0 [m]

  • a0 (float) – Normalized vector potential at focus. Specify either a0 or E0 (E0 takes precedence).

  • E0 (float) – Maximum amplitude of the laser field [V/m]. Specify either a0 or E0 (E0 takes precedence).

  • phi0 (float) – Carrier envelope phase (CEP) [rad]

  • zeta (float) – Spatial chirp at focus (in the lab frame) [m.s]

  • beta (float) – Angular dispersion at focus (in the lab frame) [rad.s]

  • phi2 (float) – Temporal chirp at focus (in the lab frame) [s^2]

  • fill_in (bool, default=True) – Flags whether to fill in the empty space opened up when the grid moves

  • name (string, optional) – Optional name of the laser
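
For illustration, a hedged sketch of a Gaussian pulse definition (assuming from pywarpx import picmi; all values are placeholders):

laser = picmi.GaussianLaser(wavelength=0.8e-6,
                            waist=5.e-6,
                            duration=30.e-15,
                            focal_position=[0., 0., 10.e-6],
                            centroid_position=[0., 0., -5.e-6],
                            propagation_direction=[0., 0., 1.],
                            polarization_direction=[1., 0., 0.],
                            a0=3.)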

class pywarpx.picmi.AnalyticLaser(field_expression, wavelength, propagation_direction, polarization_direction, amax=None, Emax=None, name=None, fill_in=True, **kw)[source]

Specifies a laser with an analytically described distribution

Parameters:
  • name (string, optional) – Optional name of the laser

  • field_expression (string) – Analytic expression describing the electric field of the laser [V/m]. The expression should be in terms of the position, ‘X’, ‘Y’, in the plane orthogonal to the propagation direction, and ‘t’ the time. The expression should describe the full field, including the oscillatory component. Parameters can be used in the expression with the values given as keyword arguments.

  • wavelength (float) – Laser wavelength. This should be built into the expression, but some codes require a specified value for numerical purposes.

  • propagation_direction (unit vector of length 3 of floats) – Direction of propagation [1]

  • polarization_direction (unit vector of length 3 of floats) – Direction of polarization [1]

  • amax (float, optional) – Maximum normalized vector potential. Specify either amax or Emax (Emax takes precedence). This should be built into the expression, but some codes require a specified value for numerical purposes.

  • Emax (float, optional) – Maximum amplitude of the laser field [V/m]. Specify either amax or Emax (Emax takes precedence). This should be built into the expression, but some codes require a specified value for numerical purposes.

  • fill_in (bool, default=True) – Flags whether to fill in the empty space opened up when the grid moves

Laser injectors control where to initialize laser pulses on the simulation grid.

class pywarpx.picmi.LaserAntenna(position, normal_vector=None, **kw)[source]

Specifies the laser antenna injection method

Parameters:
  • position (vector of length 3 of floats) – Position of the antenna launching the laser [m]

  • normal_vector (vector of length 3 of floats, optional) – Vector normal to the antenna plane, defaults to the laser direction of propagation [1]
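
For illustration, a hedged sketch that attaches the Gaussian laser defined above to an antenna and registers it via the PICMI Simulation class’s add_laser method (assuming sim is an existing picmi.Simulation instance and laser is the GaussianLaser from the previous example; positions are placeholders):

antenna = picmi.LaserAntenna(position=[0., 0., -4.e-6],
                             normal_vector=[0., 0., 1.])
sim.add_laser(laser, injection_method=antenna)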

Parameters: Inputs File

This section documents how to use WarpX with an inputs file (e.g., warpx.3d input_3d).

Complete example input files can be found in the examples section.

Note

WarpX input options are read via AMReX ParmParse.

Note

The AMReX parser (see Math parser and user-defined constants) is used for the right-hand-side of all input parameters that consist of one or more integers or floats, so expressions like <species_name>.density_max = "2.+1." and/or using user-defined constants are accepted.

Overall simulation parameters

  • authors (string: e.g. "Jane Doe <jane@example.com>, Jimmy Joe <jimmy@example.com>")

    Authors of an input file / simulation setup. When provided, this information is added as metadata to (openPMD) output files.

  • max_step (integer)

    The number of PIC cycles to perform.

  • stop_time (float; in seconds)

    The maximum physical time of the simulation. Can be provided instead of max_step. If both max_step and stop_time are provided, both criteria are used and the simulation stops when the first criterion is hit.

    Note: in boosted-frame simulations, stop_time refers to the time in the boosted frame.

  • warpx.used_inputs_file (string; default: warpx_used_inputs)

    Name of a file that WarpX writes to archive the used inputs. The content of this file is an exact copy of all explicitly and implicitly used input parameters, including those extended and overwritten from the command line.

  • warpx.gamma_boost (float)

    The Lorentz factor of the boosted frame in which the simulation is run. (The corresponding Lorentz transformation is assumed to be along warpx.boost_direction.)

    When using this parameter, the input parameters are interpreted as in the lab-frame and automatically converted to the boosted frame. (See the corresponding documentation of each input parameter for exceptions.)

  • warpx.boost_direction (string: x, y or z)

    The direction of the Lorentz-transform for boosted-frame simulations (The direction y cannot be used in 2D simulations.)

  • warpx.zmax_plasma_to_compute_max_step (float) optional

    Can be useful when running in a boosted frame. If specified, automatically calculates the number of iterations required in the boosted frame for the lower z end of the simulation domain to reach warpx.zmax_plasma_to_compute_max_step (typically the plasma end, given in the lab frame). The value of max_step is overwritten, and printed to standard output. Currently only works if the Lorentz boost and the moving window are along the z direction.

  • warpx.compute_max_step_from_btd (integer; 0 by default) optional

    Can be useful when computing back-transformed diagnostics. If specified, automatically calculates the number of iterations required in the boosted frame for all back-transformed diagnostics to be completed. If max_step, stop_time, or warpx.zmax_plasma_to_compute_max_step are not specified, or the current values of max_step and/or stop_time are too low to fill all BTD snapshots, the values of max_step and/or stop_time are overwritten with the new values and printed to standard output.

  • warpx.random_seed (string or int > 0) optional

    If warpx.random_seed = random is provided, the random seed is determined using std::random_device and std::clock(), so that every simulation run produces different random numbers. If warpx.random_seed = n is provided, where n > 0 is required, the random seed for each MPI rank is (mpi_rank+1) * n, with mpi_rank starting from 0. Both n = 1 and warpx.random_seed = default produce the default random seed. Note that when GPU threading is used, one should not expect to obtain the same random numbers, even if a fixed warpx.random_seed is provided.

  • algo.evolve_scheme (string, default: explicit)

    Specifies the evolve scheme used by WarpX.

    • explicit: Use an explicit solver, such as the standard FDTD or PSATD

    • implicit_picard: Use an implicit solver with exact energy conservation that uses a Picard iteration to solve the system. Note that this method is for demonstration only. It is inefficient and does not work well when \(\omega_{pe} \Delta t\) is close to or greater than one. The method is described in Angus et al., On numerical energy conservation for an implicit particle-in-cell method coupled with a binary Monte-Carlo algorithm for Coulomb collisions. The version implemented is an updated version that is relativistically correct, including the relativistic gamma factor for the particles. For exact energy conservation, algo.current_deposition = direct must be used with interpolation.galerkin_scheme = 0, and algo.current_deposition = Esirkepov must be used with interpolation.galerkin_scheme = 1 (which is the default, in which case charge will also be conserved).

    • semi_implicit_picard: Use an energy conserving semi-implicit solver that uses a Picard iteration to solve the system. Note that this method has the CFL limitation \(\Delta t < 1/\left(c\sqrt{\sum_i 1/\Delta x_i^2}\right)\). It is inefficient and does not work well or at all when \(\omega_{pe} \Delta t\) is close to or greater than one. The method is described in Chen et al., A semi-implicit, energy- and charge-conserving particle-in-cell algorithm for the relativistic Vlasov-Maxwell equations. For energy conservation, algo.current_deposition = direct must be used with interpolation.galerkin_scheme = 0, and algo.current_deposition = Esirkepov must be used with interpolation.galerkin_scheme = 1 (which is the default, in which case charge will also be conserved).

  • algo.max_picard_iterations (integer, default: 10)

    When algo.evolve_scheme is either implicit_picard or semi_implicit_picard, this sets the maximum number of Picard iterations that are done each time step.

  • algo.picard_iteration_tolerance (float, default: 1.e-7)

    When algo.evolve_scheme is either implicit_picard or semi_implicit_picard, this sets the convergence tolerance of the iterations, the maximum of the relative change of the L2 norm of the field from one iteration to the next. If this is set to zero, the maximum number of iterations will always be done with the change only calculated on the last iteration (for a slight optimization).

  • algo.require_picard_convergence (bool, default: 1)

    When algo.evolve_scheme is either implicit_picard or semi_implicit_picard, this sets whether the iteration each step is required to converge. If convergence is required but not reached, the code aborts and exits. If it is not required, a warning is issued instead and the calculation continues.

  • warpx.do_electrostatic (string) optional (default none)

    Specifies the electrostatic mode. When turned on, instead of updating the fields at each iteration with the full Maxwell equations, the fields are recomputed at each iteration from the Poisson equation. There is no limitation on the timestep in this case, but electromagnetic effects (e.g. propagation of radiation, lasers, etc.) are not captured. There are several options:

    • labframe: Poisson’s equation is solved in the lab frame with the charge density of all species combined. More specifically, the code solves:

      \[\boldsymbol{\nabla}^2 \phi = - \rho/\epsilon_0 \qquad \boldsymbol{E} = - \boldsymbol{\nabla}\phi\]
    • labframe-electromagnetostatic: Poisson’s equation is solved in the lab frame with the charge density of all species combined. Additionally the 3-component vector potential is solved in the Coulomb Gauge with the current density of all species combined to include self magnetic fields. More specifically, the code solves:

      \[\begin{split}\boldsymbol{\nabla}^2 \phi = - \rho/\epsilon_0 \qquad \boldsymbol{E} = - \boldsymbol{\nabla}\phi \\ \boldsymbol{\nabla}^2 \boldsymbol{A} = - \mu_0 \boldsymbol{j} \qquad \boldsymbol{B} = \boldsymbol{\nabla}\times\boldsymbol{A}\end{split}\]
    • relativistic: Poisson’s equation is solved for each species in their respective rest frame. The corresponding field is mapped back to the simulation frame and will produce both E and B fields. More specifically, in the simulation frame, this is equivalent to solving for each species

      \[\boldsymbol{\nabla}^2 - (\boldsymbol{\beta}\cdot\boldsymbol{\nabla})^2\phi = - \rho/\epsilon_0 \qquad \boldsymbol{E} = -\boldsymbol{\nabla}\phi + \boldsymbol{\beta}(\boldsymbol{\beta} \cdot \boldsymbol{\nabla}\phi) \qquad \boldsymbol{B} = -\frac{1}{c}\boldsymbol{\beta}\times\boldsymbol{\nabla}\phi\]

      where \(\boldsymbol{\beta}\) is the average (normalized) velocity of the considered species (which can be relativistic). See, e.g., Vay [1] for more information.

  • warpx.poisson_solver (string) optional (default multigrid)

    • multigrid: Poisson’s equation is solved using an iterative multigrid (MLMG) solver.

      See the AMReX documentation for details of the MLMG solver (the default solver used with electrostatic simulations). The default behavior of the code is to check whether there is non-zero charge density in the system and if so force the MLMG solver to use the solution max norm when checking convergence. If there is no charge density, the MLMG solver will switch to using the initial guess max norm error when evaluating convergence and an absolute error tolerance of \(10^{-6}\) \(\mathrm{V/m}^2\) will be used (unless a different non-zero value is specified by the user via warpx.self_fields_absolute_tolerance).

    • fft: Poisson’s equation is solved using an Integrated Green Function method (which requires FFT calculations).

      See the cited references for more details. It only works in 3D and it requires the compilation flag -DWarpX_PSATD=ON. If mesh refinement is enabled, this solver only works on the coarsest level. On the refined patches, the Poisson equation is solved with the multigrid solver. In electrostatic mode, this solver requires open field boundary conditions (boundary.field_lo,hi = open). In electromagnetic mode, this solver can be used to initialize the species’ self fields (<species_name>.initialize_self_fields=1) provided that the field BCs are PML (boundary.field_lo,hi = PML).

  • warpx.self_fields_required_precision (float, default: 1.e-11)

    The relative precision with which the electrostatic space-charge fields should be calculated. More specifically, the space-charge fields are computed with an iterative Multi-Level Multi-Grid (MLMG) solver. This solver can fail to reach the default precision within a reasonable time. This only applies when warpx.do_electrostatic = labframe.

  • warpx.self_fields_absolute_tolerance (float, default: 0.0)

    The absolute tolerance with which the space-charge fields should be calculated in units of \(\mathrm{V/m}^2\). More specifically, the acceptable residual with which the solution can be considered converged. In general this should be left as the default, but in cases where the simulation state changes very little between steps it can occur that the initial guess for the MLMG solver is so close to the converged value that it fails to improve that solution sufficiently to reach the self_fields_required_precision value.

  • warpx.self_fields_max_iters (integer, default: 200)

    Maximum number of iterations used by the MLMG solver for the space-charge field calculation. If the MLMG solver fails to reach the desired self_fields_required_precision within this number of iterations, this parameter may be increased. This only applies when warpx.do_electrostatic = labframe.

  • warpx.self_fields_verbosity (integer, default: 2)

    The verbosity used by the MLMG solver for the space-charge field calculation. The MLMG solver currently accepts verbosity levels from 0 to 5. A higher number results in more verbose output.

  • amrex.abort_on_out_of_gpu_memory (0 or 1; default is 1 for true)

    When running on GPUs, memory that does not fit on the device will be automatically swapped to host memory when this option is set to 0. This will cause severe performance drops. Note that even with this set to 1 WarpX will not catch all out-of-memory events yet when operating close to maximum device memory. Please also see the documentation in AMReX.

  • amrex.the_arena_is_managed (0 or 1; default is 0 for false)

    When running on GPUs, device memory that is accessed from the host will automatically be transferred with managed memory. This is convenient during development, but can have severe performance and memory-footprint implications if relied on (and can expose vendor bugs). For all regular WarpX operations, we therefore do explicit memory transfers without the need for managed memory and thus changed the AMReX default to false. Please also see the documentation in AMReX.

  • amrex.omp_threads (system, nosmt or positive integer; default is nosmt)

    An integer number can be set in lieu of the OMP_NUM_THREADS environment variable to control the number of OpenMP threads to use for the OMP compute backend on CPUs. By default, we use the nosmt option, which overwrites the OpenMP default of spawning one thread per logical CPU core, and instead only spawns a number of threads equal to the number of physical CPU cores on the machine. If set, the environment variable OMP_NUM_THREADS takes precedence over system and nosmt, but not over integer numbers set in this option.
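
As an illustration, a few of the overall simulation parameters above might be combined in an inputs file as follows (the values are placeholders, not recommendations):

authors = "Jane Doe <jane@example.com>"
max_step = 1000
stop_time = 1.e-12
warpx.random_seed = 42
warpx.do_electrostatic = labframe
warpx.self_fields_required_precision = 1.e-11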

Signal Handling

WarpX can handle Unix (Linux/macOS) process signals. This can be useful to configure jobs on HPC and cloud systems to shut down cleanly when they are close to reaching their allocated walltime or to steer the simulation behavior interactively.

Allowed signal names are documented in the C++ standard and POSIX. We follow the same naming, but remove the SIG prefix, e.g., the WarpX signal configuration name for SIGINT is INT.

  • warpx.break_signals (array of string, separated by spaces) optional

    A list of signal names or numbers that the simulation should handle by cleanly terminating at the next timestep

  • warpx.checkpoint_signals (array of string, separated by spaces) optional

    A list of signal names or numbers that the simulation should handle by outputting a checkpoint at the next timestep. A diagnostic of type checkpoint must be configured.

Note

Certain signals are only available on specific platforms, please see the links above for details. Typically supported on Linux and macOS are HUP, INT, QUIT, ABRT, USR1, USR2, TERM, TSTP, URG, and IO among others.

Signals to think about twice before overwriting in interactive simulations: Note that INT (interrupt) is the signal that Ctrl+C sends on the terminal, which most people use to abort a process; once overwritten you need to abort interactive jobs with, e.g., Ctrl+\ (QUIT) or by sending the KILL signal. The TSTP (terminal stop) signal is sent interactively from Ctrl+Z to temporarily send a process to sleep (until it is sent to the background with commands such as bg or continued with fg); overwriting it would thus disable that functionality. The signals KILL and STOP cannot be used.

The FPE and ILL signals should not be overwritten in WarpX, as they are controlled by AMReX for debug workflows that catch invalid floating-point operations.

Tip

For example, the following logic can be added to Slurm batch scripts (signal name to number mapping here) to gracefully shut down 6 min prior to walltime. If you have a checkpoint diagnostic in your inputs file, this will automatically write a checkpoint due to the default <diag_name>.dump_last_timestep = 1 option in WarpX.

#SBATCH --signal=1@360

srun ...                   \
  warpx.break_signals=HUP  \
  > output.txt

For LSF batch systems, the equivalent job script lines are:

#BSUB -wa 'HUP' -wt '6'

jsrun ...                  \
  warpx.break_signals=HUP  \
  > output.txt

Setting up the field mesh

  • amr.n_cell (2 integers in 2D, 3 integers in 3D)

    The number of grid points along each direction (on the coarsest level)

  • amr.max_level (integer, default: 0)

    When using mesh refinement, the number of refinement levels that will be used.

    Use 0 in order to disable mesh refinement. Note: currently, 0 and 1 are supported.

  • amr.ref_ratio (integer per refined level, default: 2)

    When using mesh refinement, this is the refinement ratio per level. With this option, all directions are refined by the same ratio.

  • amr.ref_ratio_vect (3 integers for x,y,z per refined level)

    When using mesh refinement, this can be used to set the refinement ratio per direction and level, relative to the previous level.

    Example: for three levels, a value of 2 2 4 8 8 16 refines the first level by 2-fold in x and y and 4-fold in z compared to the coarsest level (level 0/mother grid); compared to the first level, the second level is refined 8-fold in x and y and 16-fold in z.

  • geometry.dims (string)

    The dimensions of the simulation geometry. Supported values are 1, 2, 3, RZ. For 3, a cartesian geometry of x, y, z is modeled. For 2, the axes are x and z and all physics in y is assumed to be translation symmetric. For 1, the only axis is z and the dimensions x and y are translation symmetric. For RZ, we apply an azimuthal mode decomposition, with warpx.n_rz_azimuthal_modes providing further control.

    Note that this value has to match the WarpX_DIMS compile-time option. If you installed WarpX from a package manager, then pick the right executable by name.

  • warpx.n_rz_azimuthal_modes (integer; 1 by default)

    When using the RZ version, this is the number of azimuthal modes. The default is 1, which corresponds to a perfectly axisymmetric simulation.

  • geometry.prob_lo and geometry.prob_hi (2 floats in 2D, 3 floats in 3D; in meters)

    The extent of the full simulation box. This box is rectangular, and thus its extent is given here by the coordinates of the lower corner (geometry.prob_lo) and upper corner (geometry.prob_hi). The first axis of the coordinates is x (or r with cylindrical) and the last is z.

  • warpx.do_moving_window (integer; 0 by default)

    Whether to use a moving window for the simulation

  • warpx.moving_window_dir (either x, y or z)

    The direction of the moving window.

  • warpx.moving_window_v (float)

    The speed of moving window, in units of the speed of light (i.e. use 1.0 for a moving window that moves exactly at the speed of light)

  • warpx.start_moving_window_step (integer; 0 by default)

    The timestep at which the moving window starts.

  • warpx.end_moving_window_step (integer; default is -1 for false)

    The timestep at which the moving window ends.

  • warpx.fine_tag_lo and warpx.fine_tag_hi (2 floats in 2D, 3 floats in 3D; in meters) optional

    When using static mesh refinement with 1 level, the extent of the refined patch. This patch is rectangular, and thus its extent is given here by the coordinates of the lower corner (warpx.fine_tag_lo) and upper corner (warpx.fine_tag_hi).

  • warpx.ref_patch_function(x,y,z) (string) optional

    A function of x, y, z that defines the extent of the refined patch when using static mesh refinement with amr.max_level>0. Note that the function can be used to define distinct regions for refinement; however, the refined regions should be defined such that the PML layers surrounding the patches do not overlap. For this reason, when defining distinct patches, please ensure that they are sufficiently separated.

  • warpx.refine_plasma (integer) optional (default 0)

    Increase the number of macro-particles that are injected “ahead” of a mesh refinement patch in a moving window simulation.

    Note: in development; only works with static mesh-refinement, specific to moving window plasma injection, and requires a single refined level.

  • warpx.n_current_deposition_buffer (integer)

    When using mesh refinement: the particles that are located inside a refinement patch, but within n_current_deposition_buffer cells of the edge of this patch, will deposit their charge and current to the lower refinement level, instead of depositing to the refinement patch itself. See the mesh-refinement section for more details. If this variable is not explicitly set in the input script, n_current_deposition_buffer is automatically set so as to be large enough to hold the particle shape on the fine grid.

  • warpx.n_field_gather_buffer (integer, optional)

    Default: warpx.n_field_gather_buffer = n_current_deposition_buffer + 1 (one cell larger than n_current_deposition_buffer on the fine grid).

    When using mesh refinement, particles that are located inside a refinement patch, but within n_field_gather_buffer cells of the edge of the patch, gather the fields from the lower refinement level, instead of gathering the fields from the refinement patch itself. This avoids some of the spurious effects that can occur inside the refinement patch, close to its edge. See the section Mesh refinement for more details.

  • warpx.do_single_precision_comms (integer; 0 by default)

    Perform MPI communications for field guard regions in single precision. Only meaningful for WarpX_PRECISION=DOUBLE.

  • particles.deposit_on_main_grid (list of strings)

    When using mesh refinement: the particle species whose names are included in the list will deposit their charge/current directly on the main grid (i.e. the coarsest level), even if they are inside a refinement patch.

  • particles.gather_from_main_grid (list of strings)

    When using mesh refinement: the particle species whose names are included in the list will gather their fields from the main grid (i.e. the coarsest level), even if they are inside a refinement patch.
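
As an illustration, a typical 3D mesh setup with a moving window along z might look like the following in an inputs file (values are placeholders):

geometry.dims = 3
amr.n_cell = 64 64 128
amr.max_level = 0
geometry.prob_lo = -30.e-6 -30.e-6 -60.e-6
geometry.prob_hi =  30.e-6  30.e-6   0.
warpx.do_moving_window = 1
warpx.moving_window_dir = z
warpx.moving_window_v = 1.0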

Domain Boundary Conditions

  • boundary.field_lo and boundary.field_hi (2 strings for 2D, 3 strings for 3D, pml by default)

    Boundary conditions applied to fields at the lower and upper domain boundaries. Options are:

    • Periodic: This option can be used to set periodic domain boundaries. Note that if the fields for lo in a certain dimension are set to periodic, then the corresponding upper boundary must also be set to periodic. If particle boundaries are not specified in the input file, then particle boundaries will be set to periodic by default. If particle boundaries are specified, then they must be set to periodic corresponding to the periodic field boundaries.

    • pml (default): This option can be used to add Perfectly Matched Layers (PML) around the simulation domain. See the PML theory section for more details. Additional pml algorithms can be explored using the parameters warpx.do_pml_in_domain, warpx.pml_has_particles, and warpx.do_pml_j_damping.

    • absorbing_silver_mueller: This option can be used to set the Silver-Mueller absorbing boundary conditions. These boundary conditions are simpler and less computationally expensive than the pml, but are also less effective at absorbing the field. They only work with the Yee Maxwell solver.

    • damped: This is the recommended option in the moving direction when using the spectral solver with moving window (currently only supported along z). This boundary condition applies a damping factor to the electric and magnetic fields in the outer half of the guard cells, using a sine squared profile. As the spectral solver is by nature periodic, the damping prevents fields from wrapping around to the other end of the domain when the periodicity is not desired. This boundary condition is only valid when using the spectral solver.

    • pec: This option can be used to set a Perfect Electric Conductor at the simulation boundary. Please see the PEC theory section for more details. Note that the PEC boundary is invalid at r=0 for the RZ solver; please use the none option there instead. This boundary condition does not work with the spectral solver.

    • none: No boundary condition is applied to the fields with the electromagnetic solver. This option must be used for the RZ-solver at r=0.

    • neumann: For the electrostatic multigrid solver, a Neumann boundary condition (with gradient of the potential equal to 0) will be applied on the specified boundary.

    • open: For the electrostatic Poisson solver based on an Integrated Green Function method.

  • boundary.potential_lo_x/y/z and boundary.potential_hi_x/y/z (default 0)

    Gives the value of the electric potential at the boundaries, for pec boundaries. With electrostatic solvers (i.e., with warpx.do_electrostatic = ...), this is used in order to compute the potential in the simulation volume at each timestep. When using other solvers (e.g. Maxwell solver), setting these variables will trigger an electrostatic solve at t=0, to compute the initial electric field produced by the boundaries.

  • boundary.particle_lo and boundary.particle_hi (2 strings for 2D, 3 strings for 3D, absorbing by default)

    Options are:

    • Absorbing: Particles leaving the boundary will be deleted.

    • Periodic: Particles leaving the boundary will re-enter from the opposite boundary. The field boundary condition must be consistently set to periodic and both lower and upper boundaries must be periodic.

    • Reflecting: Particles leaving the boundary are reflected from the boundary back into the domain. When boundary.reflect_all_velocities is false, only the sign of the normal velocity is changed; otherwise the signs of all velocity components are changed.

    • Thermal: Particles leaving the boundary are reflected from the boundary back into the domain and their velocities are thermalized. The tangential velocity components are sampled from a Gaussian distribution and the component normal to the boundary is sampled from a Gaussian flux distribution. The standard deviation for these distributions should be provided for each species using boundary.<species>.u_th. The same standard deviation is used to sample all components.

  • boundary.reflect_all_velocities (bool) optional (default false)

    For a reflecting boundary condition, this flags whether the sign of only the normal velocity is changed or all velocities.

  • boundary.verboncoeur_axis_correction (bool) optional (default true)

    Whether to apply the Verboncoeur correction on the charge and current density on axis when using RZ. For nodal values (rho and Jz), the cell volume for values on axis is calculated as \(\pi\Delta r^2/4\). In Verboncoeur [2], it is shown that using \(\pi\Delta r^2/3\) instead will give a uniform density if the particle density is uniform.
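
As an illustration, a hedged example of the domain boundary settings above for a 3D simulation (values are placeholders); with pec field boundaries, the corresponding boundary.potential_lo/hi_* parameters could additionally be set:

boundary.field_lo = pml pml pml
boundary.field_hi = pml pml pml
boundary.particle_lo = absorbing absorbing absorbing
boundary.particle_hi = absorbing absorbing absorbing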

Additional PML parameters

  • warpx.pml_ncell (int; default: 10)

    The depth of the PML, in number of cells.

  • do_similar_dm_pml (int; default: 1)

    Whether or not to use an amrex::DistributionMapping for the PML grids that is similar to the mother grids, meaning that the mapping will be computed to minimize the communication costs between the PML and the mother grids.

  • warpx.pml_delta (int; default: 10)

    The characteristic depth, in number of cells, over which the absorption coefficients of the PML increase.

  • warpx.do_pml_in_domain (int; default: 0)

    Whether to create the PML inside the simulation area or outside. If inside, it allows the user to propagate particles in the PML and to use extended PML.

  • warpx.pml_has_particles (int; default: 0)

    Whether to propagate particles in PML or not. Can only be done if PML are in simulation domain, i.e. if warpx.do_pml_in_domain = 1.

  • warpx.do_pml_j_damping (int; default: 0)

    Whether to damp current in PML. Can only be used if particles are propagated in PML, i.e. if warpx.pml_has_particles = 1.

  • warpx.v_particle_pml (float; default: 1)

    When warpx.do_pml_j_damping = 1, the assumed velocity of the particles to be absorbed in the PML, in units of the speed of light c.

  • warpx.do_pml_dive_cleaning (bool)

    Whether to use divergence cleaning for E in the PML region. The value must match warpx.do_pml_divb_cleaning (either both false or both true). This option seems to be necessary in order to avoid strong Nyquist instabilities in 3D simulations with the PSATD solver, open boundary conditions and PML in all directions. 2D simulations and 3D simulations with open boundary conditions and PML only in one direction might run well even without divergence cleaning. This option is implemented only for the Cartesian PSATD solver; it is turned on by default in this case.

  • warpx.do_pml_divb_cleaning (bool)

    Whether to use divergence cleaning for B in the PML region. The value must match warpx.do_pml_dive_cleaning (either both false or both true). This option seems to be necessary in order to avoid strong Nyquist instabilities in 3D simulations with the PSATD solver, open boundary conditions and PML in all directions. 2D simulations and 3D simulations with open boundary conditions and PML only in one direction might run well even without divergence cleaning. This option is implemented only for the Cartesian PSATD solver; it is turned on by default in this case.
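
As an illustration, a hedged PML configuration combining several of the parameters above (values are placeholders):

warpx.pml_ncell = 10
warpx.do_pml_in_domain = 1
warpx.pml_has_particles = 1
warpx.do_pml_j_damping = 1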

Embedded Boundary Conditions

  • warpx.eb_implicit_function (string)

    A function of x, y, z that defines the surface of the embedded boundary. That surface lies where the function value is 0; the physics simulation area is where the function value is negative; the interior of the embedded boundary is where the function value is positive.

  • warpx.eb_potential(x,y,z,t) (string)

    Gives the value of the electric potential at the surface of the embedded boundary, as a function of x, y, z and t. With electrostatic solvers (i.e., with warpx.do_electrostatic = ...), this is used in order to compute the potential in the simulation volume at each timestep. When using other solvers (e.g. Maxwell solver), setting this variable will trigger an electrostatic solve at t=0, to compute the initial electric field produced by the boundaries. Note that this function is also evaluated inside the embedded boundary. For this reason, it is important to define this function in such a way that it is constant inside the embedded boundary.
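
As an illustration, a hedged sketch restricting the simulation region to the inside of a cylinder of radius 10 microns around the z axis, with the embedded boundary held at a fixed potential (values are placeholders):

warpx.eb_implicit_function = "(x**2+y**2-(10.e-6)**2)"
warpx.eb_potential(x,y,z,t) = "100."

The function is negative inside the cylinder (the simulation area) and positive outside it (the interior of the embedded boundary), consistent with the sign convention above.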

Distribution across MPI ranks and parallelization

  • warpx.numprocs (2 ints for 2D, 3 ints for 3D) optional (default none)

    This optional parameter can be used to control the domain decomposition on the coarsest level. The domain will be chopped into the exact number of pieces in each dimension as specified by this parameter. If it’s not specified, the domain decomposition will be determined by the parameters that will be discussed below. If specified, the product of the numbers must be equal to the number of MPI processes.

  • amr.max_grid_size (integer) optional (default 128)

    Maximum allowable size of each subdomain (expressed in number of grid points, in each direction). Each subdomain has its own ghost cells, and can be handled by a different MPI rank; several OpenMP threads can work simultaneously on the same subdomain.

    If max_grid_size is such that the total number of subdomains is larger than the number of MPI ranks used, then some MPI ranks will handle several subdomains, thereby providing additional flexibility for load balancing.

    When using mesh refinement, this number applies to the subdomains of the coarsest level, but also to any of the finer levels.

  • algo.load_balance_intervals (string) optional (default 0)

    Using the Intervals parser syntax, this string defines the timesteps at which WarpX should try to redistribute the work across MPI ranks, in order to have better load balancing. Use 0 to disable load_balancing.

    When performing load balancing, WarpX measures the wall time for computational parts of the PIC cycle. It then uses this data to decide how to redistribute the subdomains across MPI ranks. (Each subdomain is unchanged, but its owner is changed in order to have better performance.) This relies on each MPI rank handling several (in fact many) subdomains (see max_grid_size).

  • algo.load_balance_efficiency_ratio_threshold (float) optional (default 1.1)

    Controls whether to adopt a proposed distribution mapping computed during a load balance. If the ratio of the proposed to current distribution mapping efficiency (i.e., average cost per MPI process; efficiency is a number in the range [0, 1]) is greater than the threshold value, the proposed distribution mapping is adopted. The suggested range of values is algo.load_balance_efficiency_ratio_threshold >= 1, which ensures that the new distribution mapping is adopted only if doing so would improve the load balance efficiency. The higher the threshold value, the more conservative is the criterion for adoption of a proposed distribution; for example, with algo.load_balance_efficiency_ratio_threshold = 1, the proposed distribution is adopted any time it improves load balancing; if instead algo.load_balance_efficiency_ratio_threshold = 2, the proposed distribution is adopted only if doing so would yield a 100% improvement in the load balance efficiency (with this threshold value, if the current efficiency is 0.45, the new distribution would only be adopted if the proposed efficiency were greater than 0.9).

  • algo.load_balance_with_sfc (0 or 1) optional (default 0)

    If this is 1: use a Space-Filling Curve (SFC) algorithm in order to perform load-balancing of the simulation. If this is 0: the Knapsack algorithm is used instead.

  • algo.load_balance_knapsack_factor (float) optional (default 1.24)

    Controls the maximum number of boxes that can be assigned to a rank during load balance when using the ‘knapsack’ policy for update of the distribution mapping; the maximum is load_balance_knapsack_factor*(average number of boxes per rank). For example, if there are 4 boxes per rank and load_balance_knapsack_factor=2, no more than 8 boxes can be assigned to any rank.

  • algo.load_balance_costs_update (heuristic or timers) optional (default timers)

    If this is heuristic: load balance costs are updated according to a measure of particles and cells assigned to each box of the domain. The cost \(c\) is computed as

    \[c = n_{\text{particle}} \cdot w_{\text{particle}} + n_{\text{cell}} \cdot w_{\text{cell}},\]

    where \(n_{\text{particle}}\) is the number of particles on the box, \(w_{\text{particle}}\) is the particle cost weight factor (controlled by algo.costs_heuristic_particles_wt), \(n_{\text{cell}}\) is the number of cells on the box, and \(w_{\text{cell}}\) is the cell cost weight factor (controlled by algo.costs_heuristic_cells_wt).

    If this is timers: costs are updated according to in-code timers.

  • algo.costs_heuristic_particles_wt (float) optional

    Particle weight factor used in the Heuristic strategy for costs update; if running on GPU, the particle weight is set to a value determined from single-GPU tests on Summit, depending on the choice of solver (FDTD or PSATD) and the order of the particle shape. If running on CPU, the default value is 0.9. If running on GPU, the default value is:

    Particle shape factor   1       2       3
    FDTD/CKC                0.599   0.732   0.855
    PSATD                   0.425   0.595   0.75

  • algo.costs_heuristic_cells_wt (float) optional

    Cell weight factor used in the Heuristic strategy for costs update; if running on GPU, the cell weight is set to a value determined from single-GPU tests on Summit, depending on the choice of solver (FDTD or PSATD) and the order of the particle shape. If running on CPU, the default value is 0.1. If running on GPU, the default value is:

    Particle shape factor   1       2       3
    FDTD/CKC                0.401   0.268   0.145
    PSATD                   0.575   0.405   0.25

  • warpx.do_dynamic_scheduling (0 or 1) optional (default 1)

    Whether to activate OpenMP dynamic scheduling.
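
As an illustration, a hedged load-balancing configuration using several of the parameters above (values are placeholders):

amr.max_grid_size = 64
algo.load_balance_intervals = 100
algo.load_balance_with_sfc = 1
algo.load_balance_efficiency_ratio_threshold = 1.1
algo.load_balance_costs_update = timers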

Math parser and user-defined constants

WarpX uses AMReX’s math parser, which reads expressions in the input file. It can be used in all input parameters that consist of one or more integers or floats. Integer inputs expecting a boolean value, 0 or 1, are not parsed. Note that when multiple values are expected, the expressions are space delimited. For integer input values, the expressions are evaluated as real numbers and the final result rounded to the nearest integer. See this section of the AMReX documentation for a complete list of functions supported by the math parser.

WarpX constants

WarpX provides a few pre-defined constants, that can be used for any parameter that consists of one or more floats.

q_e

elementary charge

m_e

electron mass

m_p

proton mass

m_u

unified atomic mass unit (Dalton)

epsilon0

vacuum permittivity

mu0

vacuum permeability

clight

speed of light

kb

Boltzmann’s constant (J/K)

pi

math constant pi

See Source/Utils/WarpXConst.H for the values.

User-defined constants

Users can define their own constants in the input file. These constants can be used for any parameter that consists of one or more integers or floats. User-defined constant names can contain only letters, numbers and the character _. The name of each constant has to begin with a letter. The following names are used by WarpX, and cannot be used as user-defined constants: x, y, z, X, Y, t. The values of the constants can include the predefined WarpX constants listed above as well as other user-defined constants. For example:

  • my_constants.a0 = 3.0

  • my_constants.z_plateau = 150.e-6

  • my_constants.n0 = 1.e22

  • my_constants.wp = sqrt(n0*q_e**2/(epsilon0*m_e))

Coordinates

In addition, for profiles that depend on spatial coordinates (the plasma momentum distribution or the laser field, see below Particle initialization and Laser initialization), the parser will interpret some variables as spatial coordinates. These are specified in the name of the input parameter, e.g., density_function(x,y,z) and field_function(X,Y,t).

The parser reads python-style expressions between double quotes, for instance "a0*x**2 * (1-y*1.e2) * (x>0)" is a valid expression where a0 is a user-defined constant (see above) and x and y are spatial coordinates. The names are case sensitive. The factor (x>0) is 1 where x>0 and 0 where x<=0. It allows the user to define functions by intervals. Alternatively the expression above can be written as if(x>0, a0*x**2 * (1-y*1.e2), 0).
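
For example, user-defined constants can be combined with the spatial coordinates in a parsed profile. A hedged sketch using the density_function(x,y,z) parameter described under Particle initialization below (together with the corresponding profile option for that species; the constants and expression are placeholders):

my_constants.n0 = 1.e22
my_constants.w0 = 20.e-6
<species_name>.density_function(x,y,z) = "n0 * exp(-(x**2+y**2)/w0**2) * (z>0)"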

Particle initialization

  • particles.species_names (strings, separated by spaces)

    The name of each species. This is then used in the rest of the input deck; in this documentation we use <species_name> as a placeholder.

  • particles.photon_species (strings, separated by spaces)

    List of species that are photon species, if any. This is required when compiling with QED=TRUE.

  • particles.use_fdtd_nci_corr (0 or 1) optional (default 0)

    Whether to activate the FDTD Numerical Cherenkov Instability corrector. Not currently available in the RZ configuration.

  • particles.rigid_injected_species (strings, separated by spaces)

    List of species injected using the rigid injection method. The rigid injection method is useful when injecting a relativistic particle beam in boosted-frame simulations; see the input-output section for more details. For species injected using this method, particles are translated along the +z axis with constant velocity as long as their z coordinate verifies z<zinject_plane. When z>zinject_plane, particles are pushed in a standard way, using the specified pusher. (see the parameter <species_name>.zinject_plane below)

  • particles.do_tiling (bool) optional (default false if WarpX is compiled for GPUs, true otherwise)

    Controls whether tiling (‘cache blocking’) transformation is used for particles. Tiling should be on when using OpenMP and off when using GPUs.

  • <species_name>.species_type (string) optional (default unspecified)

    Type of physical species. Currently, the accepted species are "electron", "positron", "muon", "antimuon", "photon", "neutron", "proton" , "alpha", "hydrogen1" (a.k.a. "protium"), "hydrogen2" (a.k.a. "deuterium"), "hydrogen3" (a.k.a. "tritium"), "helium", "helium3", "helium4", "lithium", "lithium6", "lithium7", "beryllium", "beryllium9", "boron", "boron10", "boron11", "carbon", "carbon12", "carbon13", "carbon14", "nitrogen", "nitrogen14", "nitrogen15", "oxygen", "oxygen16", "oxygen17", "oxygen18", "fluorine", "fluorine19", "neon", "neon20", "neon21", "neon22", "aluminium", "argon", "copper", "xenon" and "gold". The difference between "proton" and "hydrogen1" is that the mass of the latter includes also the mass of the bound electron (same for "alpha" and "helium4"). When only the name of an element is specified, the mass is a weighted average of the masses of the stable isotopes. For all the elements with Z < 11 we provide also the stable isotopes as an option for species_type (e.g., "helium3" and "helium4"). Either species_type or both mass and charge have to be specified.

  • <species_name>.charge (float) optional (default NaN)

    The charge of one physical particle of this species. If species_type is specified, the charge will be set to the physical value and charge is optional. When <species>.do_field_ionization = 1, the physical particle charge is equal to ionization_initial_level * charge, so the latter parameter should be equal to q_e (which is defined in WarpX as the elementary charge in coulombs).

  • <species_name>.mass (float) optional (default NaN)

    The mass of one physical particle of this species. If species_type is specified, the mass will be set to the physical value and mass is optional.

  • <species_name>.xmin,ymin,zmin and <species_name>.xmax,ymax,zmax (float) optional (default unlimited)

    When <species_name>.xmin and <species_name>.xmax are set, they delimit the region within which particles are injected. If periodic boundary conditions are used in direction i, then the default range (i.e. if the range is not specified) will be the simulation box, [geometry.prob_lo[i], geometry.prob_hi[i]].

  • <species_name>.injection_sources (list of strings) optional

    Names of additional injection sources. By default, WarpX assumes one injection source per species, hence all of the input parameters below describing the injection are parameters directly of the species. However, this option allows additional sources, the names of which are specified here. For each source, the name of the source is added to the input parameters below. For instance, with <species_name>.injection_sources = source1 source2 there can be the two input parameters <species_name>.source1.injection_style and <species_name>.source2.injection_style. For the parameters of each source, the parameter with the name of the source will be used. If it is not given, the value of the parameter without the source name will be used. This allows parameters used for all sources to be specified once. For example, if the source1 and source2 have the same value of uz_m, then it can be set using <species_name>.uz_m instead of setting it for each source. Note that since by default <species_name>.injection_style = none, all injection sources can be input this way. Note that if a moving window is used, the bulk velocity of all of the sources must be the same since it is used when updating the window.

  • <species_name>.injection_style (string; default: none)

    Determines how the (macro-)particles will be injected in the simulation. The number of particles per cell is always given with respect to the coarsest level (level 0/mother grid), even if particles are immediately assigned to a refined patch.

    The options are:

    • NUniformPerCell: injection with a fixed number of evenly-spaced particles per cell. This requires the additional parameter <species_name>.num_particles_per_cell_each_dim.

    • NRandomPerCell: injection with a fixed number of randomly-distributed particles per cell. This requires the additional parameter <species_name>.num_particles_per_cell.

    • SingleParticle: Inject a single macroparticle. This requires the additional parameters:

      • <species_name>.single_particle_pos (3 doubles, particle 3D position [meter])

      • <species_name>.single_particle_u (3 doubles, particle 3D normalized momentum, i.e. \(\gamma \beta\))

      • <species_name>.single_particle_weight ( double, macroparticle weight, i.e. number of physical particles it represents)

    • MultipleParticles: Inject multiple macroparticles. This requires the additional parameters:

      • <species_name>.multiple_particles_pos_x (list of doubles, X positions of the particles [meter])

      • <species_name>.multiple_particles_pos_y (list of doubles, Y positions of the particles [meter])

      • <species_name>.multiple_particles_pos_z (list of doubles, Z positions of the particles [meter])

      • <species_name>.multiple_particles_ux (list of doubles, X normalized momenta of the particles, i.e. \(\gamma \beta_x\))

      • <species_name>.multiple_particles_uy (list of doubles, Y normalized momenta of the particles, i.e. \(\gamma \beta_y\))

      • <species_name>.multiple_particles_uz (list of doubles, Z normalized momenta of the particles, i.e. \(\gamma \beta_z\))

      • <species_name>.multiple_particles_weight (list of doubles, macroparticle weights, i.e. number of physical particles each represents)

    • gaussian_beam: Inject particle beam with gaussian distribution in space in all directions. This requires additional parameters:

      • <species_name>.q_tot (beam charge),

      • <species_name>.npart (number of macroparticles in the beam),

      • <species_name>.x/y/z_m (average position in x/y/z),

      • <species_name>.x/y/z_rms (standard deviation in x/y/z),

      There are additional optional parameters:

      • <species_name>.x/y/z_cut (optional, particles with abs(x-x_m) > x_cut*x_rms are not injected, same for y and z. <species_name>.q_tot is the charge of the un-cut beam, so that cutting the distribution is likely to result in a lower total charge),

      • <species_name>.do_symmetrize (optional, whether to symmetrize the beam)

      • <species_name>.symmetrization_order (order of symmetrization, default is 4, can be 4 or 8).

      If <species_name>.do_symmetrize is 0, no symmetrization occurs. If <species_name>.do_symmetrize is 1, then the beam is symmetrized according to the value of <species_name>.symmetrization_order. If set to 4, symmetrization is in the x and y direction, (x,y) (-x,y) (x,-y) (-x,-y). If set to 8, symmetrization is also done with x and y exchanged, (y,x), (-y,x), (y,-x), (-y,-x).

      • <species_name>.focal_distance (optional, distance between the beam centroid and the position of the focal plane of the beam, along the direction of the beam mean velocity; space charge is ignored in the initialization of the particles)

      If <species_name>.focal_distance is specified, x_rms, y_rms and z_rms are the sizes of the beam in the focal plane. Since the beam is not necessarily initialized close to its focal plane, the initial size of the beam will differ from x_rms, y_rms, z_rms.

      Usually, in accelerator physics the operative quantities are the normalized emittances \(\epsilon_{x,y}\) and beta functions \(\beta_{x,y}\). We assume that the beam travels along \(z\) and we mark the quantities evaluated at the focal plane with a \(*\). Therefore, the normalized transverse emittances and beta functions are related to the focal distance \(f = z - z^*\), the beam sizes \(\sigma_{x,y}\) (which in the code are x_rms, y_rms), the beam relativistic Lorentz factor \(\gamma\), and the normalized momentum spread \(\Delta u_{x,y}\) according to the equations below (Wiedemann [3]).

      \[\begin{aligned} \Delta u_{x,y} &= \frac{\epsilon^*_{x,y}}{\sigma^*_{x,y}}, \\ \sigma^*_{x,y} &= \sqrt{\frac{\epsilon^*_{x,y}\,\beta^*_{x,y}}{\gamma}}, \\ \sigma_{x,y}(z) &= \sigma^*_{x,y}\,\sqrt{1 + \left(\frac{z - z^*}{\beta^*_{x,y}}\right)^2} \end{aligned}\]
    • external_file: Inject macroparticles with properties (mass, charge, position, and momentum - \(\gamma \beta m c\)) read from an external openPMD file. With it users can specify the additional arguments:

      • <species_name>.injection_file (string) openPMD file name and

      • <species_name>.charge (double) optional (default is read from openPMD file) when set this will be the charge of the physical particle represented by the injected macroparticles.

      • <species_name>.mass (double) optional (default is read from openPMD file) when set this will be the mass of the physical particle represented by the injected macroparticles.

      • <species_name>.z_shift (double) optional (default is no shift) when set this value will be added to the longitudinal, z, position of the particles.

      • <species_name>.impose_t_lab_from_file (bool) optional (default is false) only read if warpx.gamma_boost > 1., it allows setting t_lab for the Lorentz Transform as being the time stored in the openPMD file.

      Warning: q_tot!=0 is not supported with the external_file injection style. If a value is provided, it is ignored and no re-scaling is done. The external file must include the species openPMD::Record labeled position and momentum (double arrays), with dimensionality and units set via openPMD::setUnitDimension and setUnitSI. If the external file also contains openPMD::Records for mass and charge (constant double scalars) then the species will use these, unless overwritten in the input file (see <species_name>.mass, <species_name>.charge or <species_name>.species_type). The external_file option is currently implemented for 2D, 3D and RZ geometries, with record components in the cartesian coordinates (x,y,z) for 3D and RZ, and (x,z) for 2D. For more information on the openPMD format and how to build WarpX with it, please visit the install section.
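
      As an illustration, a minimal sketch of an external-file injection might look as follows (the species name beam and the file name beam.h5 are hypothetical placeholders; charge and mass are read from the file unless overridden):

        beam.injection_style = external_file
        beam.injection_file = beam.h5        # hypothetical openPMD file
        beam.z_shift = -1.e-3                # optional longitudinal shift [m]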

    • NFluxPerCell: Continuously inject a flux of macroparticles from a planar surface. This requires the additional parameters:

      • <species_name>.flux_profile (see the description of this parameter further below)

      • <species_name>.surface_flux_pos (double, location of the injection plane [meter])

      • <species_name>.flux_normal_axis (x, y, or z for 3D, x or z for 2D, or r, t, or z for RZ. When flux_normal_axis is r or t, the x and y components of the user-specified momentum distribution are interpreted as the r and t components respectively)

      • <species_name>.flux_direction (-1 or +1, direction of flux relative to the plane)

      • <species_name>.num_particles_per_cell (double)

      • <species_name>.flux_tmin (double, Optional time at which the flux will be turned on. Ignored when negative.)

      • <species_name>.flux_tmax (double, Optional time at which the flux will be turned off. Ignored when negative.)
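
      A minimal sketch of a flux injection setup might look as follows (the species name electrons and all numerical values are illustrative placeholders; the gaussianflux momentum distribution described further below is a natural companion to this injection style):

        electrons.injection_style = NFluxPerCell
        electrons.num_particles_per_cell = 2
        electrons.surface_flux_pos = 0.      # injection plane at z = 0 [m]
        electrons.flux_normal_axis = z
        electrons.flux_direction = +1
        electrons.flux_profile = constant
        electrons.flux = 1.e24               # [m^-2.s^-1]
        electrons.momentum_distribution_type = gaussianflux
        electrons.uz_th = 0.01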

    • none: Do not inject macro-particles (for example, in a simulation that starts with neutral, ionizable atoms, one may want to create the electrons species – where ionized electrons can be stored later on – without injecting electron macro-particles).

  • <species_name>.num_particles_per_cell_each_dim (3 integers in 3D and RZ, 2 integers in 2D)

    With the NUniformPerCell injection style, this specifies the number of particles along each axis within a cell. Note that for RZ, the three axes are radius, theta, and z, and that the recommended number of particles per theta is at least two times the number of azimuthal modes requested. (It is recommended to do a convergence scan of the number of particles per theta.)

  • <species_name>.random_theta (bool) optional (default 1)

    When using RZ geometry, whether to randomize the azimuthal position of particles. This is used when <species_name>.injection_style = NUniformPerCell.

  • <species_name>.do_splitting (bool) optional (default 0)

    Split particles of the species when crossing the boundary from a lower resolution domain to a higher resolution domain.

    Currently implemented on CPU only.

  • <species_name>.do_continuous_injection (0 or 1)

    Whether to inject particles during the simulation, and not only at initialization. This can be required with a moving window and/or when running in a boosted frame.

  • <species_name>.initialize_self_fields (0 or 1)

    Whether to calculate the space-charge fields associated with this species at the beginning of the simulation. The fields are calculated for the mean gamma of the species.

  • <species_name>.self_fields_required_precision (float, default: 1.e-11)

    The relative precision with which the initial space-charge fields should be calculated. More specifically, the initial space-charge fields are computed with an iterative Multi-Level Multi-Grid (MLMG) solver. For highly-relativistic beams, this solver can fail to reach the default precision within a reasonable time; in that case, users can set a relaxed precision requirement through self_fields_required_precision.

  • <species_name>.self_fields_absolute_tolerance (float, default: 0.0)

    The absolute tolerance with which the space-charge fields should be calculated in units of \(\mathrm{V/m}^2\). More specifically, the acceptable residual with which the solution can be considered converged. In general this should be left as the default, but in cases where the simulation state changes very little between steps it can occur that the initial guess for the MLMG solver is so close to the converged value that it fails to improve that solution sufficiently to reach the self_fields_required_precision value.

  • <species_name>.self_fields_max_iters (integer, default: 200)

    Maximum number of iterations used for the MLMG solver for the initial space-charge field calculation. If the MLMG solver converges but fails to reach the desired self_fields_required_precision, this parameter may be increased.

  • <species_name>.profile (string)

    Density profile for this species. The options are:

    • constant: Constant density profile within the box, or between <species_name>.xmin and <species_name>.xmax (and same in all directions). This requires the additional parameter <species_name>.density, i.e., the plasma density in \(m^{-3}\).

    • predefined: Predefined density profile. This requires additional parameters <species_name>.predefined_profile_name and <species_name>.predefined_profile_params. Currently, only a parabolic channel density profile is implemented.

    • parse_density_function: the density is given by a function in the input file. It requires additional argument <species_name>.density_function(x,y,z), which is a mathematical expression for the density of the species, e.g. electrons.density_function(x,y,z) = "n0+n0*x**2*1.e12" where n0 is a user-defined constant, see above. WARNING: where density_function(x,y,z) is close to zero, particles will still be injected between xmin and xmax etc., with a null weight. This is undesirable because it results in useless computing. To avoid this, see option density_min below.
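
      As an illustration, a parsed density profile might be set up as follows (the species name electrons, the constant n0 and the profile shape are placeholders):

        my_constants.n0 = 1.e24
        electrons.profile = parse_density_function
        electrons.density_function(x,y,z) = "(n0 + n0*x**2*1.e12)*(z>0.)"
        electrons.density_min = 1.e18        # do not inject particles where the profile is negligible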

  • <species_name>.flux_profile (string)

    Defines the expression of the flux when using <species_name>.injection_style=NFluxPerCell.

    • constant: Constant flux. This requires the additional parameter <species_name>.flux, i.e., the injection flux in \(m^{-2}.s^{-1}\).

    • parse_flux_function: the flux is given by a function in the input file. It requires the additional argument <species_name>.flux_function(x,y,z,t), which is a mathematical expression for the flux of the species.

  • <species_name>.density_min (float) optional (default 0.)

    Minimum plasma density. No particle is injected where the density is below this value.

  • <species_name>.density_max (float) optional (default infinity)

    Maximum plasma density. The density at each point is the minimum between the value given in the profile, and density_max.

  • <species_name>.radially_weighted (bool) optional (default true)

    Whether particles' weights vary with their radius. This only applies to cylindrical geometry. The only valid value is true.

  • <species_name>.momentum_distribution_type (string)

    Distribution of the normalized momentum (u=p/mc) for this species. The options are:

    • at_rest: Particles are initialized with zero momentum.

    • constant: constant momentum profile. This can be controlled with the additional parameters <species_name>.ux, <species_name>.uy and <species_name>.uz, the normalized momenta in the x, y and z direction respectively, which are all 0. by default.

    • uniform: uniform probability distribution between a minimum and a maximum value. The x, y and z directions are sampled independently and the final momentum space is a cuboid. The parameters that control the minimum and maximum domain of the distribution are <species_name>.u<x,y,z>_min and <species_name>.u<x,y,z>_max in each direction respectively (e.g., <species_name>.uz_min = 0.2 and <species_name>.uz_max = 0.4 to control the generation along the z direction). All the parameters default to 0.

    • gaussian: gaussian momentum distribution in all 3 directions. This can be controlled with the additional arguments for the average momenta along each direction <species_name>.ux_m, <species_name>.uy_m and <species_name>.uz_m as well as standard deviations along each direction <species_name>.ux_th, <species_name>.uy_th and <species_name>.uz_th. These 6 parameters are all 0. by default.

    • gaussianflux: Gaussian momentum flux distribution, which is Gaussian in the plane and v*Gaussian normal to the plane. It can only be used when injection_style = NFluxPerCell. This can be controlled with the additional arguments to specify the plane’s orientation, <species_name>.flux_normal_axis and <species_name>.flux_direction, for the average momenta along each direction <species_name>.ux_m, <species_name>.uy_m and <species_name>.uz_m, as well as standard deviations along each direction <species_name>.ux_th, <species_name>.uy_th and <species_name>.uz_th. ux_m, uy_m, uz_m, ux_th, uy_th and uz_th are all 0. by default.

    • maxwell_boltzmann: Maxwell-Boltzmann distribution that takes a dimensionless temperature parameter \(\theta\) as an input, where \(\theta = \frac{k_\mathrm{B} \cdot T}{m \cdot c^2}\), \(T\) is the temperature in Kelvin, \(k_\mathrm{B}\) is the Boltzmann constant, \(c\) is the speed of light, and \(m\) is the mass of the species. Theta is specified by a combination of <species_name>.theta_distribution_type, <species_name>.theta, and <species_name>.theta_function(x,y,z) (see below). For values of \(\theta > 0.01\), errors due to ignored relativistic terms exceed 1%. Temperatures less than zero are not allowed. The plasma can be initialized to move at a bulk velocity \(\beta = v/c\). The speed is specified by the parameters <species_name>.beta_distribution_type, <species_name>.beta, and <species_name>.beta_function(x,y,z) (see below). \(\beta\) can be positive or negative and is limited to the range \(-1 < \beta < 1\). The direction of the velocity field is given by <species_name>.bulk_vel_dir = (+/-) 'x', 'y', 'z', and must be the same across the domain. Please leave no whitespace between the sign and the character on input. A direction without a sign will be treated as positive. The MB distribution is initialized in the drifting frame by sampling three Gaussian distributions in each dimension using the Box-Muller method, and then the distribution is transformed to the simulation frame using the flipping method. The flipping method can be found in Zenitani 2015 section III. B. (Phys. Plasmas 22, 042116). By default, beta is equal to 0. and bulk_vel_dir is +x.

      Note that though the particles may move at relativistic speeds in the simulation frame, they are not relativistic in the drift frame. This is in contrast to the Maxwell-Juttner setting, which initializes particles with relativistic momenta in their drifting frame.

    • maxwell_juttner: Maxwell-Juttner distribution for high temperature plasma that takes a dimensionless temperature parameter \(\theta\) as an input, where \(\theta = \frac{k_\mathrm{B} \cdot T}{m \cdot c^2}\), \(T\) is the temperature in Kelvin, \(k_\mathrm{B}\) is the Boltzmann constant, and \(m\) is the mass of the species. Theta is specified by a combination of <species_name>.theta_distribution_type, <species_name>.theta, and <species_name>.theta_function(x,y,z) (see below). The Sobol method used to generate the distribution will not terminate for \(\theta \lesssim 0.1\), and the code will abort if it encounters a temperature below that threshold. The Maxwell-Boltzmann distribution is recommended for temperatures in the range \(0.01 < \theta < 0.1\). Errors due to relativistic effects can be expected to be approximately between 1% and 10%. The plasma can be initialized to move at a bulk velocity \(\beta = v/c\). The speed is specified by the parameters <species_name>.beta_distribution_type, <species_name>.beta, and <species_name>.beta_function(x,y,z) (see below). \(\beta\) can be positive or negative and is limited to the range \(-1 < \beta < 1\). The direction of the velocity field is given by <species_name>.bulk_vel_dir = (+/-) 'x', 'y', 'z', and must be the same across the domain. Please leave no whitespace between the sign and the character on input. A direction without a sign will be treated as positive. The MJ distribution will be initialized in the moving frame using the Sobol method, and then the distribution will be transformed to the simulation frame using the flipping method. Both the Sobol and the flipping method can be found in Zenitani 2015 (Phys. Plasmas 22, 042116). By default, beta is equal to 0. and bulk_vel_dir is +x.

      Please note that particles initialized with this setting can be relativistic in two ways. In the simulation frame, they can drift with a relativistic speed beta. Then, in the drifting frame they are still moving with relativistic speeds due to high temperature. This is in contrast to the Maxwell-Boltzmann setting, which initializes a plasma that is non-relativistic in its (possibly relativistic) drifting frame.

    • radial_expansion: the momentum depends linearly on the radial coordinate. This can be controlled with the additional parameter u_over_r, which is the slope (0. by default).

    • parse_momentum_function: the momentum \(u = (u_{x},u_{y},u_{z})=(\gamma v_{x}/c,\gamma v_{y}/c,\gamma v_{z}/c)\) is given by a function in the input file. It requires additional arguments <species_name>.momentum_function_ux(x,y,z), <species_name>.momentum_function_uy(x,y,z) and <species_name>.momentum_function_uz(x,y,z), which give the distribution of each component of the momentum as a function of space.

    • gaussian_parse_momentum_function: Gaussian momentum distribution where the mean and the standard deviation are given by functions of position in the input file. Both are assumed to be non-relativistic. The mean is the normalized momentum, \(u_m = \gamma v_m/c\). The standard deviation is normalized, \(u_{th} = v_{th}/c\). For example, this might be u_th = sqrt(T*q_e/mass)/clight given the temperature (in eV) and mass. It requires the following arguments:

      • <species_name>.momentum_function_ux_m(x,y,z): mean \(u_{x}\)

      • <species_name>.momentum_function_uy_m(x,y,z): mean \(u_{y}\)

      • <species_name>.momentum_function_uz_m(x,y,z): mean \(u_{z}\)

      • <species_name>.momentum_function_ux_th(x,y,z): standard deviation of \(u_{x}\)

      • <species_name>.momentum_function_uy_th(x,y,z): standard deviation of \(u_{y}\)

      • <species_name>.momentum_function_uz_th(x,y,z): standard deviation of \(u_{z}\)
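
      A minimal sketch using this option is given below; it assumes the predefined parser constants q_e, m_e and clight (as in the u_th example above), and the species name electrons and the temperature value are placeholders:

        my_constants.T_e = 10.               # electron temperature [eV]
        electrons.momentum_distribution_type = gaussian_parse_momentum_function
        electrons.momentum_function_ux_m(x,y,z) = "0."
        electrons.momentum_function_uy_m(x,y,z) = "0."
        electrons.momentum_function_uz_m(x,y,z) = "0."
        electrons.momentum_function_ux_th(x,y,z) = "sqrt(T_e*q_e/m_e)/clight"
        electrons.momentum_function_uy_th(x,y,z) = "sqrt(T_e*q_e/m_e)/clight"
        electrons.momentum_function_uz_th(x,y,z) = "sqrt(T_e*q_e/m_e)/clight"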

  • <species_name>.theta_distribution_type (string) optional (default constant)

    Only read if <species_name>.momentum_distribution_type is maxwell_boltzmann or maxwell_juttner. See documentation for these distributions (above) for constraints on values of theta. Temperatures less than zero are not allowed.

    • If constant, use a constant temperature, given by the required float parameter <species_name>.theta.

    • If parser, use a spatially-dependent analytic parser function, given by the required parameter <species_name>.theta_function(x,y,z).

  • <species_name>.beta_distribution_type (string) optional (default constant)

    Only read if <species_name>.momentum_distribution_type is maxwell_boltzmann or maxwell_juttner. See documentation for these distributions (above) for constraints on values of beta.

    • If constant, use a constant speed, given by the required float parameter <species_name>.beta.

    • If parser, use a spatially-dependent analytic parser function, given by the required parameter <species_name>.beta_function(x,y,z).
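
    Combining the momentum-distribution parameters above, a drifting thermal species might, for instance, be set up as follows (the species name electrons and the numerical values are illustrative placeholders):

      electrons.momentum_distribution_type = maxwell_boltzmann
      electrons.theta_distribution_type = constant
      electrons.theta = 1.e-3                # k_B*T/(m*c^2)
      electrons.beta_distribution_type = constant
      electrons.beta = 0.1                   # bulk drift velocity v/c
      electrons.bulk_vel_dir = +z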

  • <species_name>.zinject_plane (float)

    Only read if <species_name> is in particles.rigid_injected_species. Injection plane when using the rigid injection method. See particles.rigid_injected_species above.

  • <species_name>.rigid_advance (bool)

    Only read if <species_name> is in particles.rigid_injected_species.

    • If false, each particle is advanced with its own velocity vz until it reaches zinject_plane.

    • If true, each particle is advanced with the average speed of the species vzbar until it reaches zinject_plane.

  • <species_name>.predefined_profile_name (string)

    Only read if <species_name>.profile is predefined.

    • If parabolic_channel, the plasma profile is a parabolic profile with cosine-like ramps at the beginning and the end of the profile. The density is given by

      \[n = n_0 n(x,y) n(z-z_0)\]

      with

      \[n(x,y) = 1 + 4\frac{x^2+y^2}{k_p^2 R_c^4}\]

      where \(k_p\) is the plasma wavenumber associated with density \(n_0\). Here, with \(z_0\) as the start of the plasma, \(n(z-z_0)\) is a cosine-like up-ramp from \(0\) to \(L_{ramp,up}\), constant to \(1\) from \(L_{ramp,up}\) to \(L_{ramp,up} + L_{plateau}\) and a cosine-like down-ramp from \(L_{ramp,up} + L_{plateau}\) to \(L_{ramp,up} + L_{plateau}+L_{ramp,down}\). All parameters are given in predefined_profile_params.

  • <species_name>.predefined_profile_params (list of float)

    Parameters for the predefined profiles.

    • If <species_name>.predefined_profile_name is parabolic_channel, predefined_profile_params contains a space-separated list of the following parameters, in this order: \(z_0\) \(L_{ramp,up}\) \(L_{plateau}\) \(L_{ramp,down}\) \(R_c\) \(n_0\)

  • <species_name>.do_backward_propagation (bool)

    Inject a backward-propagating beam to reduce the effect of charge-separation fields when running in the boosted frame. See examples.

  • <species_name>.split_type (int) optional (default 0)

    Splitting technique. When 0, particles are split along the simulation axes (4 particles in 2D, 6 particles in 3D). When 1, particles are split along the diagonals (4 particles in 2D, 8 particles in 3D).

  • <species_name>.do_not_deposit (0 or 1 optional; default 0)

    If 1 is given, neither charge deposition nor current deposition is done, so that this species does not contribute to the fields.

  • <species_name>.do_not_gather (0 or 1 optional; default 0)

    If 1 is given, the field gather from the grids is not done, so that this species is not affected by the fields on the grids.

  • <species_name>.do_not_push (0 or 1 optional; default 0)

    If 1 is given, this species will not be pushed by any pusher during the simulation.

  • <species_name>.addIntegerAttributes (list of string)

    User-defined integer particle attributes for species <species_name>. These integer attributes will be initialized with user-defined functions when the particles are generated. If the user-defined integer attribute is <int_attrib_name>, then the following required parameter must be specified to initialize the attribute.

    • <species_name>.attribute.<int_attrib_name>(x,y,z,ux,uy,uz,t) (string) t represents the physical time in seconds during the simulation. x, y, z represent particle positions in units of meters. ux, uy, uz represent the particle momenta in units of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor and \(v/c\) is the particle velocity normalized by the speed of light. E.g., if electrons.addIntegerAttributes = upstream and electrons.attribute.upstream(x,y,z,ux,uy,uz,t) = (x>0.0)*1 are provided, then an integer attribute upstream is added to all electron particles and, when these particles are generated, the particles with position x > 0 are assigned a value of 1 (and 0 otherwise).

  • <species_name>.addRealAttributes (list of string)

    User-defined real particle attributes for species <species_name>. These real attributes will be initialized with user-defined functions when the particles are generated. If the user-defined real attribute is <real_attrib_name>, then the following required parameter must be specified to initialize the attribute.

    • <species_name>.attribute.<real_attrib_name>(x,y,z,ux,uy,uz,t) (string) t represents the physical time in seconds during the simulation. x, y, z represent particle positions in units of meters. ux, uy, uz represent the particle momenta in units of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor and \(v/c\) is the particle velocity normalized by the speed of light.
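
    For example, a real attribute that records the initial longitudinal position of each particle might be added as follows (the attribute name orig_z and the species name electrons are hypothetical placeholders):

      electrons.addRealAttributes = orig_z
      electrons.attribute.orig_z(x,y,z,ux,uy,uz,t) = "z"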

  • <species>.save_particles_at_xlo/ylo/zlo, <species>.save_particles_at_xhi/yhi/zhi and <species>.save_particles_at_eb (0 or 1 optional, default 0)

    If 1, particles of this species will be copied to the scraped particle buffer for the specified boundary if they leave the simulation domain in the specified direction. If USE_EB=TRUE the save_particles_at_eb flag can be set to 1 to also save particle data for the particles of this species that impact the embedded boundary. The scraped particle buffer can be used to track particle fluxes out of the simulation. The particle data can be written out by setting up a BoundaryScrapingDiagnostic. It is also accessible via the Python interface. The function get_particle_boundary_buffer, found in the picmi.Simulation class as sim.extension.get_particle_boundary_buffer(), can be used to access the scraped particle buffer. An entry is included for every particle in the buffer of the timestep at which the particle was scraped. This can be accessed by passing the argument comp_name="step_scraped" to the above mentioned function.

    Note

    When accessing the data via Python, the scraped particle buffer relies on the user to clear the buffer after processing the data. The buffer will grow unbounded as particles are scraped and therefore could lead to memory issues if not periodically cleared. To clear the buffer call clear_buffer().

  • <species>.do_field_ionization (0 or 1) optional (default 0)

    Do field ionization for this species (using the ADK theory).

  • <species>.do_adk_correction (0 or 1) optional (default 0)

    Whether to apply the correction to the ADK theory proposed by Zhang, Lan and Lu in Q. Zhang et al. (Phys. Rev. A 90, 043410, 2014). If so, the probability of ionization is modified using an empirical model that should be more accurate in the regime of high electric fields. Currently, this is only implemented for Hydrogen, although Argon is also available in the same reference.

  • <species>.physical_element (string)

    Only read if do_field_ionization = 1. Symbol of chemical element for this species. Example: for Helium, use physical_element = He. All the elements up to atomic number Z=100 (Fermium) are supported.

  • <species>.ionization_product_species (string)

    Only read if do_field_ionization = 1. Name of species in which ionized electrons are stored. This species must be created as a regular species in the input file (in particular, it must be in particles.species_names).

  • <species>.ionization_initial_level (int) optional (default 0)

    Only read if do_field_ionization = 1. Initial ionization level of the species (must be smaller than the atomic number of chemical element given in physical_element).
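
    A minimal sketch of a field-ionizable species might look as follows (the species names nitrogen and electrons are placeholders; both species must also be listed in particles.species_names and initialized as usual):

      nitrogen.do_field_ionization = 1
      nitrogen.physical_element = N
      nitrogen.ionization_initial_level = 2    # start as N2+
      nitrogen.ionization_product_species = electrons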

  • <species>.do_classical_radiation_reaction (int) optional (default 0)

    Enables Radiation Reaction (or Radiation Friction) for the species. Species must be either electrons or positrons. Boris pusher must be used for the simulation. If both <species>.do_classical_radiation_reaction and <species>.do_qed_quantum_sync are enabled, then the classical module will be used when the particle’s chi parameter is below qed_qs.chi_min, the discrete quantum module otherwise.

  • <species>.do_qed_quantum_sync (int) optional (default 0)

    Enables Quantum synchrotron emission for this species. A Quantum synchrotron lookup table should be either generated or loaded from disk to enable this process (see “Lookup tables for QED modules” section below). <species> must be either an electron or a positron species. This feature requires compiling with QED=TRUE.

  • <species>.do_qed_breit_wheeler (int) optional (default 0)

    Enables the non-linear Breit-Wheeler process for this species. A Breit-Wheeler lookup table should be either generated or loaded from disk to enable this process (see “Lookup tables for QED modules” section below). <species> must be a photon species. This feature requires compiling with QED=TRUE.

  • <species>.qed_quantum_sync_phot_product_species (string)

    If an electron or a positron species has the Quantum synchrotron process, a photon product species must be specified (the name of an existing photon species must be provided). This feature requires compiling with QED=TRUE.

  • <species>.qed_breit_wheeler_ele_product_species (string)

    If a photon species has the Breit-Wheeler process, an electron product species must be specified (the name of an existing electron species must be provided). This feature requires compiling with QED=TRUE.

  • <species>.qed_breit_wheeler_pos_product_species (string)

    If a photon species has the Breit-Wheeler process, a positron product species must be specified (the name of an existing positron species must be provided). This feature requires compiling with QED=TRUE.

  • <species>.do_resampling (0 or 1) optional (default 0)

    If 1, resampling is performed for this species. This means that the number of macroparticles will be reduced at specific timesteps while preserving the distribution function as much as possible (details depend on the chosen resampling algorithm). This can be useful in situations with continuous creation of particles (e.g. with ionization or with QED effects). At least one resampling trigger (see below) must be specified to actually perform resampling.

  • <species>.resampling_algorithm (string) optional (default leveling_thinning)

    The algorithm used for resampling:

    • leveling_thinning This algorithm is defined in Muraviev et al. [4]. It has one parameter:

      • <species>.resampling_algorithm_target_ratio (float) optional (default 1.5)

        This roughly corresponds to the ratio between the number of particles before and after resampling.

    • velocity_coincidence_thinning The particles are sorted into phase space cells and merged, similar to the approach described in Vranic et al. [5]. It has three parameters:

      • <species>.resampling_algorithm_delta_ur (float)

        The width of momentum cells used in clustering particles, in m/s.

      • <species>.resampling_algorithm_n_theta (int)

        The number of cell divisions to use in the \(\theta\) direction when clustering the particle velocities.

      • <species>.resampling_algorithm_n_phi (int)

        The number of cell divisions to use in the \(\phi\) direction when clustering the particle velocities.

  • <species>.resampling_min_ppc (int) optional (default 1)

    Resampling is not performed in cells with a number of macroparticles strictly smaller than this parameter.

  • <species>.resampling_trigger_intervals (string) optional (default 0)

    Using the Intervals parser syntax, this string defines timesteps at which resampling is performed.

  • <species>.resampling_trigger_max_avg_ppc (float) optional (default infinity)

    Resampling is performed every time the number of macroparticles per cell of the species averaged over the whole simulation domain exceeds this parameter.
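
    Putting the resampling parameters together, a minimal sketch might look as follows (the species name electrons and the numerical values are illustrative placeholders):

      electrons.do_resampling = 1
      electrons.resampling_algorithm = leveling_thinning
      electrons.resampling_algorithm_target_ratio = 1.5
      electrons.resampling_trigger_intervals = 100    # resample every 100 steps
      electrons.resampling_trigger_max_avg_ppc = 50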

Cold Relativistic Fluid initialization

  • fluids.species_names (strings, separated by spaces)

    Defines the names of each fluid species. It is a required input to create and evolve fluid species using the cold relativistic fluid equations. Most of the parameters described in the section “Particle initialization” can also be used to initialize fluid properties (e.g. initial density distribution). For fluid-specific inputs we use <fluid_species_name> as a placeholder. Also see external fields for how to specify these for fluids as the function names differ.

Laser initialization

  • lasers.names (list of string)

    Name of each laser. This is then used in the rest of the input deck; in this documentation we use <laser_name> as a placeholder. The parameters below must be provided for each laser pulse.

  • <laser_name>.position (3 floats in 3D and 2D ; in meters)

    The coordinates of one of the point of the antenna that will emit the laser. The plane of the antenna is entirely defined by <laser_name>.position and <laser_name>.direction.

    <laser_name>.position also corresponds to the origin of the coordinate system for the laser transverse profile. For instance, for a Gaussian laser profile, the peak of intensity will be at the position given by <laser_name>.position. This variable can thus be used to shift the position of the laser pulse transversally.

    Note

    In 2D, <laser_name>.position is still given by 3 numbers, but the second number is ignored.

    When running a boosted-frame simulation, provide the value of <laser_name>.position in the laboratory frame, and use warpx.gamma_boost to automatically perform the conversion to the boosted frame. Note that, in this case, the laser antenna will be moving in the boosted frame.

  • <laser_name>.polarization (3 floats in 3D and 2D)

    The coordinates of a vector that points in the direction of polarization of the laser. The norm of this vector is unimportant, only its direction matters.

    Note

    Even in 2D, all 3 components of this vector are important (i.e. the polarization can be orthogonal to the plane of the simulation).

  • <laser_name>.direction (3 floats in 3D)

    The coordinates of a vector that points in the propagation direction of the laser. The norm of this vector is unimportant, only its direction matters.

    The plane of the antenna that will emit the laser is orthogonal to this vector.

    Warning

    When running boosted-frame simulations, <laser_name>.direction should be parallel to warpx.boost_direction, for now.

  • <laser_name>.e_max (float ; in V/m)

    Peak amplitude of the laser field, in the focal plane.

    For a laser with a wavelength \(\lambda = 0.8\,\mu m\), the peak amplitude is related to \(a_0\) by:

    \[E_{max} = a_0 \frac{2 \pi m_e c^2}{e\lambda} = a_0 \times (4.0 \cdot 10^{12} \;V.m^{-1})\]

    When running a boosted-frame simulation, provide the value of <laser_name>.e_max in the laboratory frame, and use warpx.gamma_boost to automatically perform the conversion to the boosted frame.

  • <laser_name>.a0 (float ; dimensionless)

    Peak normalized amplitude of the laser field, in the focal plane (given in the lab frame, just as e_max above). See the description of <laser_name>.e_max for the conversion between a0 and e_max. Either a0 or e_max must be specified.

  • <laser_name>.wavelength (float; in meters)

    The wavelength of the laser in vacuum.

    When running a boosted-frame simulation, provide the value of <laser_name>.wavelength in the laboratory frame, and use warpx.gamma_boost to automatically perform the conversion to the boosted frame.

  • <laser_name>.profile (string)

    The spatio-temporal shape of the laser. The options that are currently implemented are:

    • "Gaussian": The transverse and longitudinal profiles are Gaussian.

    • "parse_field_function": the laser electric field is given by a function in the input file. It requires the additional argument <laser_name>.field_function(X,Y,t), which is a mathematical expression, e.g. <laser_name>.field_function(X,Y,t) = "a0*X**2 * (X>0) * cos(omega0*t)" where a0 and omega0 are user-defined constants, see above. The profile passed here is the full profile, not only the laser envelope. t is time and X and Y are coordinates orthogonal to <laser_name>.direction (not necessarily the x and y coordinates of the simulation). All parameters above are required, but none of the parameters below are used when <laser_name>.parse_field_function=1. Even though <laser_name>.wavelength and <laser_name>.e_max should be included in the laser function, they still have to be specified as they are used for numerical purposes.

    • "from_file": the electric field of the laser is read from an external file. Currently both the lasy format as well as a custom binary format are supported. It requires providing the name of the file to load by setting the additional parameter <laser_name>.binary_file_name or <laser_name>.lasy_file_name (string). It accepts an optional parameter <laser_name>.time_chunk_size (int), supported for both lasy and binary files; this allows reading only time_chunk_size timesteps from the file. New timesteps are read as soon as they are needed.

      The default value is automatically set to the number of timesteps contained in the file (i.e. only one read is performed at the beginning of the simulation). It also accepts the optional parameter <laser_name>.delay (float; in seconds), which allows delaying (delay > 0) or anticipating (delay < 0) the laser by the specified amount of time.

      Details about the usage of the lasy format: lasy can produce either 3D Cartesian files or RZ files. WarpX can read both types of files independently of the geometry in which it was compiled (e.g. WarpX compiled with WarpX_DIMS=RZ can read 3D Cartesian lasy files). In the case where WarpX is compiled in 2D (or 1D) Cartesian, the laser antenna will emit the field values that correspond to the slice y=0 in the lasy file (and x=0 in the 1D case). One can generate a lasy file from Python, see an example at Examples/Tests/laser_injection_from_file.

      Details about the usage of the binary format: The external binary file should provide E(x,y,t) on a rectangular (necessarily uniform) grid. The code performs a bi-linear (in 2D) or tri-linear (in 3D) interpolation to set the field values. x,y,t are meant to be in S.I. units, while the field value is meant to be multiplied by <laser_name>.e_max (i.e. in most cases the maximum of abs(E(x,y,t)) should be 1, so that the maximum field intensity can be set straightforwardly with <laser_name>.e_max). The binary file has to respect the following format:

      • flag to indicate the grid is uniform (1 byte, 0 means non-uniform, !=0 means uniform) - only uniform is supported

      • nt, number of timesteps (uint32_t, must be >=2)

      • nx, number of points along x (uint32_t, must be >=2)

      • ny, number of points along y (uint32_t, must be 1 for 2D simulations and >=2 for 3D simulations)

      • timesteps (double[2]=[t_min,t_max])

      • x_coords (double[2]=[x_min,x_max])

      • y_coords (double[1] in 2D, double[2]=[y_min,y_max] in 3D)

      • field_data (double[nt * nx * ny], with nt being the slowest coordinate).

      A binary file can be generated from Python, see an example at Examples/Tests/laser_injection_from_file

  • <laser_name>.profile_t_peak (float; in seconds)

    The time at which the laser reaches its peak intensity, at the position given by <laser_name>.position (only used for the "gaussian" profile)

    When running a boosted-frame simulation, provide the value of <laser_name>.profile_t_peak in the laboratory frame, and use warpx.gamma_boost to automatically perform the conversion to the boosted frame.

  • <laser_name>.profile_duration (float ; in seconds)

    The duration of the laser pulse for the "gaussian" profile, defined as \(\tau\) below:

    \[E(\boldsymbol{x},t) \propto \exp\left( -\frac{(t-t_{peak})^2}{\tau^2} \right)\]

    Note that \(\tau\) relates to the full width at half maximum (FWHM) of intensity, which is closer to pulse length measurements in experiments, as \(\tau = \mathrm{FWHM}_I / \sqrt{2\ln(2)}\) \(\approx \mathrm{FWHM}_I / 1.1774\).

    For a chirped laser pulse (i.e. with a non-zero <laser_name>.phi2), profile_duration is the Fourier-limited duration of the pulse, not the actual duration of the pulse. See the documentation for <laser_name>.phi2 for more detail.

    When running a boosted-frame simulation, provide the value of <laser_name>.profile_duration in the laboratory frame, and use warpx.gamma_boost to automatically perform the conversion to the boosted frame.

  • <laser_name>.profile_waist (float ; in meters)

    The waist of the transverse Gaussian \(w_0\), i.e. defined such that the electric field of the laser pulse in the focal plane is of the form:

    \[E(\boldsymbol{x},t) \propto \exp\left( -\frac{\boldsymbol{x}_\perp^2}{w_0^2} \right)\]
  • <laser_name>.profile_focal_distance (float; in meters)

    The distance from <laser_name>.position to the focal plane, where the distance is measured along the direction given by <laser_name>.direction.

    Use a negative number for a defocussing laser instead of a focussing laser.

    When running a boosted-frame simulation, provide the value of <laser_name>.profile_focal_distance in the laboratory frame, and use warpx.gamma_boost to automatically perform the conversion to the boosted frame.
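
    A minimal sketch of a Gaussian laser pulse combining the parameters above might look as follows (the laser name laser1 and all numerical values are illustrative placeholders):

      lasers.names = laser1
      laser1.profile = Gaussian
      laser1.position = 0. 0. 0.             # antenna position [m]
      laser1.direction = 0. 0. 1.            # propagation along z
      laser1.polarization = 1. 0. 0.         # linear polarization along x
      laser1.a0 = 2.                         # alternatively, set laser1.e_max in V/m
      laser1.wavelength = 0.8e-6             # [m]
      laser1.profile_waist = 5.e-6           # [m]
      laser1.profile_duration = 15.e-15      # [s]
      laser1.profile_t_peak = 30.e-15        # [s]
      laser1.profile_focal_distance = 100.e-6  # [m]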

  • <laser_name>.phi0 (float; in radians) optional (default 0.)

    The Carrier Envelope Phase, i.e. the phase of the laser oscillation, at the position where the laser envelope is maximum (only used for the "gaussian" profile)

  • <laser_name>.stc_direction (3 floats) optional (default 1. 0. 0.)

    Direction of laser spatio-temporal couplings. See definition in Akturk et al. [6].

  • <laser_name>.zeta (float; in meters.seconds) optional (default 0.)

    Spatial chirp at focus in direction <laser_name>.stc_direction. See definition in Akturk et al. [6].

  • <laser_name>.beta (float; in seconds) optional (default 0.)

    Angular dispersion (or angular chirp) at focus in direction <laser_name>.stc_direction. See definition in Akturk et al. [6].

  • <laser_name>.phi2 (float; in seconds**2) optional (default 0.)

    The amount of temporal chirp \(\phi^{(2)}\) at focus (in the lab frame). Namely, a wave packet centered on the frequency \((\omega_0 + \delta \omega)\) will reach its peak intensity at \(z(\delta \omega) = z_0 - c \phi^{(2)} \, \delta \omega\). Thus, a positive \(\phi^{(2)}\) corresponds to positive chirp, i.e. red part of the spectrum in the front of the pulse and blue part of the spectrum in the back. More specifically, the electric field in the focal plane is of the form:

    \[E(\boldsymbol{x},t) \propto Re\left[ \exp\left( -\frac{(t-t_{peak})^2}{\tau^2 + 2i\phi^{(2)}} + i\omega_0 (t-t_{peak}) + i\phi_0 \right) \right]\]

    where \(\tau\) is given by <laser_name>.profile_duration and represents the Fourier-limited duration of the laser pulse. Thus, the actual duration of the chirped laser pulse is:

    \[\tau' = \sqrt{ \tau^2 + 4 (\phi^{(2)})^2/\tau^2 }\]

    See also the definition in Akturk et al. [6].

  • <laser_name>.do_continuous_injection (0 or 1) optional (default 0).

    Whether or not to use continuous injection. If the antenna starts outside of the simulation domain but enters it at some point (due to moving window or moving antenna in the boosted frame), use this so that the laser antenna is injected when it reaches the box boundary. If running in a boosted frame, this requires the boost direction, moving window direction and laser propagation direction to be along z. If not running in a boosted frame, this requires the moving window and laser propagation directions to be the same (x, y or z).

  • <laser_name>.min_particles_per_mode (int) optional (default 4)

    When using the RZ version, this specifies the minimum number of particles per angular mode. The laser particles are loaded into radial spokes, with the number of spokes given by min_particles_per_mode*(warpx.n_rz_azimuthal_modes-1).

  • lasers.deposit_on_main_grid (int) optional (default 0)

    When using mesh refinement, whether the antenna that emits the laser deposits charge/current only on the main grid (i.e. level 0), or also on the higher mesh-refinement levels.

  • warpx.num_mirrors (int) optional (default 0)

    Users can add perfect mirror conditions inside the simulation domain. The number of mirrors is given by warpx.num_mirrors. The mirrors are orthogonal to the z direction. The following parameters are required when warpx.num_mirrors is >0.

  • warpx.mirror_z (list of float) required if warpx.num_mirrors>0

    z location of the front of the mirrors.

  • warpx.mirror_z_width (list of float) required if warpx.num_mirrors>0

    z width of the mirrors.

  • warpx.mirror_z_npoints (list of int) required if warpx.num_mirrors>0

    In the boosted frame, depending on gamma_boost, warpx.mirror_z_width can be smaller than the cell size, so that the mirror would not work. This parameter is the minimum number of points for the mirror. If mirror_z_width < dz/cell_size, the upper bound of the mirror is increased so that it contains at least mirror_z_npoints.
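
    As an illustration, a single mirror might be specified as follows (all numerical values are placeholders):

      warpx.num_mirrors = 1
      warpx.mirror_z = 200.e-6               # z of the front of the mirror [m]
      warpx.mirror_z_width = 10.e-6          # [m]
      warpx.mirror_z_npoints = 4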

External fields

Applied to the grid

The external fields defined with input parameters that start with warpx.B_ext_grid_init_ or warpx.E_ext_grid_init_ are applied to the grid directly. In particular, these fields can be seen in the diagnostics that output the fields on the grid.

  • When using an electromagnetic field solver, these fields are applied to the grid at the beginning of the simulation, and serve as initial condition for the Maxwell solver.

  • When using an electrostatic or magnetostatic field solver, these fields are added to the fields computed by the Poisson solver, at each timestep.

  • warpx.B_ext_grid_init_style (string) optional

    This parameter determines the type of initialization for the external magnetic field. By default, the external magnetic field (Bx,By,Bz) is initialized to (0.0, 0.0, 0.0). The string can be set to “constant” if a constant magnetic field is required to be set at initialization. If set to “constant”, then an additional parameter, namely, warpx.B_external_grid must be specified. If set to parse_B_ext_grid_function, then a mathematical expression can be used to initialize the external magnetic field on the grid. It requires additional parameters in the input file, namely, warpx.Bx_external_grid_function(x,y,z), warpx.By_external_grid_function(x,y,z), warpx.Bz_external_grid_function(x,y,z) to initialize the external magnetic field for each of the three components on the grid. Constants required in the expression can be set using my_constants. For example, if warpx.Bx_external_grid_function(x,y,z)=Bo*x + delta*(y + z) then the constants Bo and delta required in the above equation can be set using my_constants.Bo= and my_constants.delta= in the input file. For a two-dimensional simulation, it is assumed that the first dimension is x and the second dimension is z, and the value of y is set to zero. Note that the current implementation of the parser for external B-field does not work with RZ and the code will abort with an error message.

    If B_ext_grid_init_style is set to be read_from_file, an additional parameter, indicating the path of an openPMD data file, warpx.read_fields_from_path must be specified, from which external B field data can be loaded into WarpX. One can refer to input files in Examples/Tests/LoadExternalField for more information. Regarding how to prepare the openPMD data file, one can refer to the openPMD-example-datasets.
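
    A minimal sketch of a parsed external magnetic field, following the example expression above, might look as follows (the values of Bo and delta are placeholders):

      my_constants.Bo = 1.0
      my_constants.delta = 0.1
      warpx.B_ext_grid_init_style = parse_B_ext_grid_function
      warpx.Bx_external_grid_function(x,y,z) = "Bo*x + delta*(y + z)"
      warpx.By_external_grid_function(x,y,z) = "0."
      warpx.Bz_external_grid_function(x,y,z) = "Bo"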

  • warpx.E_ext_grid_init_style (string) optional

    This parameter determines the type of initialization for the external electric field. By default, the external electric field (Ex,Ey,Ez) is initialized to (0.0, 0.0, 0.0). The string can be set to “constant” if a constant electric field is required to be set at initialization. If set to “constant”, then an additional parameter, namely, warpx.E_external_grid must be specified in the input file. If set to parse_E_ext_grid_function, then a mathematical expression can be used to initialize the external electric field on the grid. It requires additional parameters in the input file, namely, warpx.Ex_external_grid_function(x,y,z), warpx.Ey_external_grid_function(x,y,z), warpx.Ez_external_grid_function(x,y,z) to initialize the external electric field for each of the three components on the grid. Constants required in the expression can be set using my_constants. For example, if warpx.Ex_external_grid_function(x,y,z)=Eo*x + delta*(y + z) then the constants Eo and delta required in the above equation can be set using my_constants.Eo= and my_constants.delta= in the input file. For a two-dimensional simulation, it is assumed that the first dimension is x and the second dimension is z, and the value of y is set to zero. Note that the current implementation of the parser for external E-field does not work with RZ and the code will abort with an error message.

    If E_ext_grid_init_style is set to be read_from_file, an additional parameter, indicating the path of an openPMD data file, warpx.read_fields_from_path must be specified, from which external E field data can be loaded into WarpX. One can refer to input files in Examples/Tests/LoadExternalField for more information. Regarding how to prepare the openPMD data file, one can refer to the openPMD-example-datasets. Note that if both B_ext_grid_init_style and E_ext_grid_init_style are set to read_from_file, the openPMD file specified by warpx.read_fields_from_path should contain both B and E external fields data.

  • warpx.E_external_grid & warpx.B_external_grid (list of 3 floats)

    required when warpx.E_ext_grid_init_style="constant" and when warpx.B_ext_grid_init_style="constant", respectively. External uniform and constant electrostatic and magnetostatic fields added to the grid at initialization. Use with caution as these fields are used for the field solver. In particular, do not use any other boundary condition than periodic.

  • warpx.maxlevel_extEMfield_init (default is maximum number of levels in the simulation)

    With this parameter, the externally applied electric and magnetic fields will not be applied for levels greater than warpx.maxlevel_extEMfield_init. For some mesh-refinement simulations, the external fields are only applied to the parent grid and not the refined patches. In such cases, warpx.maxlevel_extEMfield_init can be set to 0. In that case, the other levels have external field values of 0.

Applied to Particles

The external fields defined with input parameters that start with particles.B_ext_particle_init_ or particles.E_ext_particle_init_ are applied to the particles directly, at each timestep. As a result, these fields cannot be seen in the diagnostics that output the fields on the grid.

  • particles.E_ext_particle_init_style & particles.B_ext_particle_init_style (string) optional (default “none”)

    These parameters determine the type of the external electric and magnetic fields respectively that are applied directly to the particles at every timestep. The field values are specified in the lab frame. With the default none style, no field is applied. Possible values are constant, parse_E_ext_particle_function or parse_B_ext_particle_function, or repeated_plasma_lens.

    • constant: a constant field is applied, given by the input parameters particles.E_external_particle or particles.B_external_particle, which are lists of the field components.

    • parse_E_ext_particle_function or parse_B_ext_particle_function: the field is specified as an analytic expression that is a function of space (x,y,z) and time (t), relative to the lab frame. The E-field is specified by the input parameters:

      • particles.Ex_external_particle_function(x,y,z,t)

      • particles.Ey_external_particle_function(x,y,z,t)

      • particles.Ez_external_particle_function(x,y,z,t)

      The B-field is specified by the input parameters:

      • particles.Bx_external_particle_function(x,y,z,t)

      • particles.By_external_particle_function(x,y,z,t)

      • particles.Bz_external_particle_function(x,y,z,t)

      Note that the position is defined in Cartesian coordinates, as a function of (x,y,z), even for RZ.

    • repeated_plasma_lens: apply a series of plasma lenses. The properties of the lenses are defined in the lab frame by the input parameters:

      • repeated_plasma_lens_period, the period length of the repeat, a single float number,

      • repeated_plasma_lens_starts, the start of each lens relative to the period, an array of floats,

      • repeated_plasma_lens_lengths, the length of each lens, an array of floats,

      • repeated_plasma_lens_strengths_E, the electric focusing strength of each lens, an array of floats, when particles.E_ext_particle_init_style is set to repeated_plasma_lens.

      • repeated_plasma_lens_strengths_B, the magnetic focusing strength of each lens, an array of floats, when particles.B_ext_particle_init_style is set to repeated_plasma_lens.

      The repeated lenses are only defined for \(z > 0\). Once the number of lenses specified in the input is exceeded, the repeated lens stops.

      The applied field is uniform longitudinally (along z) with a hard edge, where residence corrections are used for more accurate field calculation. On the time step when a particle enters or leaves each lens, the field applied is scaled by the fraction of the time step spent within the lens. The fields are of the form \(E_x = \mathrm{strength} \cdot x\), \(E_y = \mathrm{strength} \cdot y\), and \(E_z = 0\), and \(B_x = \mathrm{strength} \cdot y\), \(B_y = -\mathrm{strength} \cdot x\), and \(B_z = 0\).
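
      A minimal sketch of a repeated plasma lens setup might look as follows (the numerical values are placeholders, and it is assumed here that the repeated_plasma_lens_* parameters take the same particles. prefix as the init-style parameter):

        particles.E_ext_particle_init_style = repeated_plasma_lens
        particles.repeated_plasma_lens_period = 0.5           # [m]
        particles.repeated_plasma_lens_starts = 0.1 0.3       # [m], relative to the period
        particles.repeated_plasma_lens_lengths = 0.02 0.02    # [m]
        particles.repeated_plasma_lens_strengths_E = 1.e5 2.e5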

Applied to Cold Relativistic Fluids

The external fields defined with input parameters that start with <fluid_species_name>.B_ext_init_ or <fluid_species_name>.E_ext_init_ are applied to the fluids directly, at each timestep. As a result, these fields cannot be seen in the diagnostics that output the fields on the grid.

  • <fluid_species_name>.E_ext_init_style & <fluid_species_name>.B_ext_init_style (string) optional (default “none”)

    These parameters determine the type of the external electric and magnetic fields respectively that are applied directly to the cold relativistic fluids at every timestep. The field values are specified in the lab frame. With the default none style, no field is applied. Possible values are parse_E_ext_function or parse_B_ext_function.

    • parse_E_ext_function or parse_B_ext_function: the field is specified as an analytic expression that is a function of space (x,y,z) and time (t), relative to the lab frame. The E-field is specified by the input parameters:

      • <fluid_species_name>.Ex_external_function(x,y,z,t)

      • <fluid_species_name>.Ey_external_function(x,y,z,t)

      • <fluid_species_name>.Ez_external_function(x,y,z,t)

      The B-field is specified by the input parameters:

      • <fluid_species_name>.Bx_external_function(x,y,z,t)

      • <fluid_species_name>.By_external_function(x,y,z,t)

      • <fluid_species_name>.Bz_external_function(x,y,z,t)

      Note that the position is defined in Cartesian coordinates, as a function of (x,y,z), even for RZ.

Accelerator Lattice

Several accelerator lattice elements can be defined as described below. The elements are defined relative to the z axis and in the lab frame, starting at z = 0. They are described using a simplified MAD-like syntax. Note that elements of the same type cannot overlap each other.

  • lattice.elements (list of strings) optional (default: no elements)

    A list of names (one name per lattice element), in the order that they appear in the lattice.

  • lattice.reverse (boolean) optional (default: false)

    Reverse the list of elements in the lattice.

  • <element_name>.type (string)

    Indicates the element type for this lattice element. This should be one of:

    • drift for free drift. This requires this additional parameter:

      • <element_name>.ds (float, in meters) the segment length

    • quad for a hard edged quadrupole. This applies a quadrupole field that is uniform within the z extent of the element with a sharp cut off at the ends. This uses residence corrections, with the field scaled by the amount of time within the element for particles entering or leaving it, to increase the accuracy. This requires these additional parameters:

      • <element_name>.ds (float, in meters) the segment length

      • <element_name>.dEdx (float, in volts/meter^2) optional (default: 0.) the electric quadrupole field gradient. The field applied to the particles will be Ex = dEdx*x and Ey = -dEdx*y.

      • <element_name>.dBdx (float, in Tesla/meter) optional (default: 0.) the magnetic quadrupole field gradient. The field applied to the particles will be Bx = dBdx*y and By = dBdx*x.

    • plasmalens for a field modeling a plasma lens. This applies a radially directed plasma lens field that is uniform within the z extent of the element with a sharp cut off at the ends. This uses residence corrections, with the field scaled by the amount of time within the element for particles entering or leaving it, to increase the accuracy. This requires these additional parameters:

      • <element_name>.ds (float, in meters) the segment length

      • <element_name>.dEdx (float, in volts/meter^2) optional (default: 0.) the electric field gradient. The field applied to the particles will be Ex = dEdx*x and Ey = dEdx*y.

      • <element_name>.dBdx (float, in Tesla/meter) optional (default: 0.) the magnetic field gradient. The field applied to the particles will be Bx = dBdx*y and By = -dBdx*x.

    • line a sub-lattice (line) of elements to append to the lattice.

      • <element_name>.elements (list of strings) optional (default: no elements) A list of names (one name per lattice element), in the order that they appear in the lattice.

      • <element_name>.reverse (boolean) optional (default: false) Reverse the list of elements in the line before appending to the lattice.
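
    A minimal sketch of a short lattice combining the elements above might look as follows (the element names and numerical values are illustrative placeholders):

      lattice.elements = drift1 quad1 drift2
      drift1.type = drift
      drift1.ds = 0.1                        # [m]
      quad1.type = quad
      quad1.ds = 0.02                        # [m]
      quad1.dBdx = 10.                       # [T/m]
      drift2.type = drift
      drift2.ds = 0.1                        # [m]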

Collision models

WarpX provides several particle collision models, using varying degrees of approximation. Details about the collision models can be found in the theory section.

  • collisions.collision_names (strings, separated by spaces)

    The name of each collision type. This is then used in the rest of the input deck; in this documentation we use <collision_name> as a placeholder.

  • <collision_name>.type (string) optional

    The type of collision. The types implemented are:

    • pairwisecoulomb for pair-wise Coulomb collisions, the default if unspecified. This provides a pair-wise relativistic elastic Monte Carlo binary Coulomb collision model, following the algorithm given by Pérez et al. [7]. When the RZ mode is used, warpx.n_rz_azimuthal_modes must be set to 1 at the moment, since the current implementation of the collision module assumes axisymmetry.

    • nuclearfusion for fusion reactions. This implements the pair-wise fusion model by Higginson et al. [8]. Currently, WarpX supports deuterium-deuterium, deuterium-tritium, deuterium-helium and proton-boron fusion. When initializing the reactant and product species, you need to use species_type (see the documentation for this parameter), so that WarpX can identify the type of reaction to use. (e.g. <species_name>.species_type = 'deuterium')

    • dsmc for pair-wise, non-Coulomb collisions between kinetic species. This is a “direct simulation Monte Carlo” treatment of collisions between kinetic species. See DSMC section.

    • background_mcc for collisions between particles and a neutral background. This is a relativistic Monte Carlo treatment for particles colliding with a neutral background gas. See MCC section.

    • background_stopping for slowing of ions due to collisions with electrons or ions. This implements the approximate formulae as derived in Introduction to Plasma Physics, from Goldston and Rutherford, section 14.2.

  • <collision_name>.species (strings)

    If using dsmc, pairwisecoulomb or nuclearfusion, this should be the name(s) of the species between which the collision will be considered. (Provide only one name for intra-species collisions.) If using the background_mcc or background_stopping type, this should be the name of the species for which collisions with a background will be included. In this case, only one species name should be given.

  • <collision_name>.product_species (strings)

    Only for nuclearfusion. The name(s) of the species in which to add the new macroparticles created by the reaction.

  • <collision_name>.ndt (int) optional

    Execute collision every # time steps. The default value is 1.

  • <collision_name>.CoulombLog (float) optional

    Only for pairwisecoulomb. A fixed Coulomb logarithm to be used for the collision type <collision_name>. For example, a typical Coulomb logarithm has the form \(\ln(\lambda_D/R)\), where \(\lambda_D\) is the Debye length and \(R\approx1.4A^{1/3}\) is the effective Coulombic radius of the nucleus, with \(A\) the mass number. If this is not provided, or if a non-positive value is provided, a Coulomb logarithm will be computed automatically according to the algorithm in Pérez et al. [7].
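
    A minimal sketch of a pair-wise Coulomb collision setup using the parameters above (the species names electrons and ions are placeholders for species defined elsewhere in the input deck, and the values are illustrative):

    collisions.collision_names = coll_ei
    coll_ei.type = pairwisecoulomb
    coll_ei.species = electrons ions
    coll_ei.CoulombLog = 10.
    coll_ei.ndt = 1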

  • <collision_name>.fusion_multiplier (float) optional.

    Only for nuclearfusion. Increasing fusion_multiplier creates more macroparticles of fusion products, but with lower weight (in such a way that the corresponding total number of physical particles remains the same). This can improve the statistics of the simulation, in the case where fusion reactions are very rare. More specifically, in a fusion reaction between two macroparticles with weight w_1 and w_2, the weight of the product macroparticles will be min(w_1,w_2)/fusion_multiplier. (The weights of the reactant macroparticles are reduced correspondingly after the reaction.) See Higginson et al. [8] for more details. The default value of fusion_multiplier is 1.

  • <collision_name>.fusion_probability_threshold (float) optional.

    Only for nuclearfusion. If the fusion multiplier is too high and results in a fusion probability that approaches 1 (for a given collision between two macroparticles), then there is a risk of underestimating the total fusion yield. In these cases, WarpX reduces the fusion multiplier used in that given collision. fusion_probability_threshold is the fusion probability threshold above which WarpX reduces the fusion multiplier.

  • <collision_name>.fusion_probability_target_value (float) optional.

    Only for nuclearfusion. When the probability of fusion for a given collision exceeds fusion_probability_threshold, WarpX reduces the fusion multiplier for that collision such that the fusion probability approaches fusion_probability_target_value.
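
    A minimal sketch of a deuterium-tritium fusion setup using the parameters above (the species names deuterium, tritium, helium4 and neutrons are placeholders for species defined elsewhere in the input deck, with the appropriate species_type set for each reactant and product; the multiplier value is illustrative):

    collisions.collision_names = dt_fusion
    dt_fusion.type = nuclearfusion
    dt_fusion.species = deuterium tritium
    dt_fusion.product_species = helium4 neutrons
    dt_fusion.fusion_multiplier = 1.e3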

  • <collision_name>.background_density (float)

    Only for background_mcc and background_stopping. The density of the background in \(m^{-3}\). Can also provide <collision_name>.background_density(x,y,z,t) using the parser initialization style for spatially and temporally varying density. With background_mcc, if a function is used for the background density, the input parameter <collision_name>.max_background_density must also be provided to calculate the maximum collision probability.

  • <collision_name>.background_temperature (float)

    Only for background_mcc and background_stopping. The temperature of the background in Kelvin. Can also provide <collision_name>.background_temperature(x,y,z,t) using the parser initialization style for spatially and temporally varying temperature.

  • <collision_name>.background_mass (float) optional

    Only for background_mcc and background_stopping. The mass of the background gas in kg. With background_mcc, if not given, the mass of the colliding species will be used, unless ionization is included, in which case the mass of the product species will be used. With background_stopping and background_type set to electrons, if not given, this defaults to the electron mass. With background_type set to ions, the mass must be given.

  • <collision_name>.background_charge_state (float)

    Only for background_stopping, where it is required when background_type is set to ions. This specifies the charge state of the background ions.

  • <collision_name>.background_type (string)

    Only for background_stopping, where it is required. The type of the background; the possible values are electrons and ions. When electrons, equation 14.12 from Goldston and Rutherford is used. This formula is based on Coulomb collisions with the approximations that \(M_b \gg m_e\) and \(V \ll v_{thermal\_e}\), and the assumption that the electrons have a Maxwellian distribution with temperature \(T_e\).

    \[\frac{dV}{dt} = - \frac{2^{1/2}n_eZ_b^2e^4m_e^{1/2}\log\Lambda}{12\pi^{3/2}\epsilon_0M_bT_e^{3/2}}V\]

    where \(V\) is each velocity component, \(n_e\) is the background density, \(Z_b\) is the ion charge state, \(e\) is the electron charge, \(m_e\) is the background electron mass, \(\log\Lambda=\log((12\pi/Z_b)(n_e\lambda_{de}^3))\), \(\lambda_{de}\) is the Debye length, and \(M_b\) is the ion mass. The equation is integrated over a time step, giving \(V(t+dt) = V(t)\,\exp(-\alpha\,dt)\) where \(\alpha\) is the factor multiplying \(V\).

    When ions, equation 14.20 is used. This formula is based on Coulomb collisions with the approximations that \(M_b \gg M\) and \(V \gg v_{thermal\_i}\). The background ion temperature only appears in the \(\log\Lambda\) term.

    \[\frac{dW_b}{dt} = - \frac{2^{1/2}n_iZ^2Z_b^2e^4M_b^{1/2}\log\Lambda}{8\pi\epsilon_0MW_b^{1/2}}\]

    where \(W_b\) is the ion energy, \(n_i\) is the background density, \(Z\) is the charge state of the background ions, \(Z_b\) is the ion charge state, \(e\) is the electron charge, \(M_b\) is the ion mass, \(\log\Lambda=\log((12\pi/Z_b)(n_i\lambda_{di}^3))\), \(\lambda_{di}\) is the Debye length, and \(M\) is the background ion mass. The equation is integrated over a time step, giving \(W_b(t+dt) = \left(W_b(t)^{3/2} - \tfrac{3}{2}\beta\,dt\right)^{2/3}\) where \(\beta\) is the coefficient on the right-hand side, excluding the factor \(W_b^{1/2}\) in the denominator.

  • <collision_name>.scattering_processes (strings separated by spaces)

    Only for dsmc and background_mcc. The scattering processes that should be included. Available options are elastic, back & charge_exchange for ions and elastic, excitationX & ionization for electrons. Multiple excitation events can be included for electrons, corresponding to excitation to different levels; the X above can be changed to a unique identifier for each excitation process. For each scattering process specified, a path to a cross-section data file must also be given. We use <scattering_process> as a placeholder going forward.

  • <collision_name>.<scattering_process>_cross_section (string)

    Only for dsmc and background_mcc. Path to the file containing cross-section data for the given scattering processes. The cross-section file must have exactly 2 columns of data, the first containing equally spaced energies in eV and the second the corresponding cross-section in \(m^2\). The energy column should represent the kinetic energy of the colliding particles in the center-of-mass frame.

  • <collision_name>.<scattering_process>_energy (float)

    Only for background_mcc. If the scattering process is either excitationX or ionization the energy cost of that process must be given in eV.

  • <collision_name>.ionization_species (string)

    Only for background_mcc. If the scattering process is ionization, the produced species must also be given. For example, if the background gas has the properties of argon, a species of argon ions should be specified here.
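
    A minimal sketch of an MCC setup for electrons colliding with a neutral argon background, combining the parameters above (the species names electrons and ar_ions, the cross-section file paths, and all numerical values are placeholders):

    collisions.collision_names = mcc_electrons
    mcc_electrons.type = background_mcc
    mcc_electrons.species = electrons
    mcc_electrons.background_density = 1.e22
    mcc_electrons.background_temperature = 300.
    mcc_electrons.scattering_processes = elastic excitation1 ionization
    mcc_electrons.elastic_cross_section = cross_sections/elastic.dat
    mcc_electrons.excitation1_energy = 11.5
    mcc_electrons.excitation1_cross_section = cross_sections/excitation1.dat
    mcc_electrons.ionization_energy = 15.8
    mcc_electrons.ionization_cross_section = cross_sections/ionization.dat
    mcc_electrons.ionization_species = ar_ions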

Numerics and algorithms

This section describes the input parameters used to select numerical methods and algorithms for your simulation setup.

Time step

  • warpx.cfl (float) optional (default 0.999)

    The ratio between the actual timestep that is used in the simulation and the Courant-Friedrichs-Lewy (CFL) limit. (e.g. for warpx.cfl=1, the timestep will be exactly equal to the CFL limit.) This parameter will only be used with the electromagnetic solver.

  • warpx.const_dt (float)

    Allows direct specification of the time step size, in units of seconds. When the electrostatic solver is being used, this must be supplied. This can be used with the electromagnetic solver, overriding warpx.cfl, but it is up to the user to ensure that the CFL condition is met.
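
    For example (the values are illustrative), one would set either

    warpx.cfl = 0.999         # electromagnetic solver: fraction of the CFL limit

    or

    warpx.const_dt = 5.e-13   # electrostatic solver: time step in seconds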

Filtering

  • warpx.use_filter (0 or 1; default: 1, except for RZ FDTD)

    Whether to smooth the charge and currents on the mesh, after depositing them from the macro-particles. This uses a bilinear filter (see the filtering section). The default is 1 in all cases, except for simulations in RZ geometry using the FDTD solver. With the RZ PSATD solver, the filtering is done in \(k\)-space.

    Warning

    Known bug: filter currently not working with FDTD solver in RZ geometry (see https://github.com/ECP-WarpX/WarpX/issues/1943).

  • warpx.filter_npass_each_dir (3 int) optional (default 1 1 1)

    Number of passes along each direction for the bilinear filter. In 2D simulations, only the first two values are read.

  • warpx.use_filter_compensation (0 or 1; default: 0)

    Whether to add compensation when applying filtering. This is only supported with the RZ spectral solver.

Particle push, charge and current deposition, field gathering

  • algo.current_deposition (string, optional)

    This parameter selects the algorithm for the deposition of the current density. Available options are: direct, esirkepov, and vay. The default choice is esirkepov for FDTD Maxwell solvers, but direct for the standard or Galilean PSATD solver (i.e. with algo.maxwell_solver = psatd), for the hybrid-PIC solver (i.e. with algo.maxwell_solver = hybrid), and for diagnostics output with the electrostatic solvers (i.e., with warpx.do_electrostatic = ...). Note that vay is only available for algo.maxwell_solver = psatd.

    1. direct

      The current density is deposited as described in the section Current deposition. This deposition scheme does not conserve charge.

    2. esirkepov

      The current density is deposited as described in Esirkepov [9]. This deposition scheme guarantees charge conservation for shape factors of arbitrary order.

    3. vay

      The current density is deposited as described in Vay et al. [10] (see section Current deposition for more details). This option guarantees charge conservation only when used in combination with psatd.periodic_single_box_fft=1, that is, only for periodic single-box simulations with global FFTs without guard cells. The implementation for domain decomposition with local FFTs over guard cells is planned but not yet completed.

  • algo.charge_deposition (string, optional)

    The algorithm for the charge density deposition. Available options are:

  • algo.field_gathering (string, optional)

    The algorithm for field gathering. Available options are:

    • energy-conserving: gathers directly from the grid points (either staggered or nodal grid points depending on warpx.grid_type).

    • momentum-conserving: first average the fields from the grid points to the nodes, and then gather from the nodes.

    Default: algo.field_gathering = energy-conserving with collocated or staggered grids (note that energy-conserving and momentum-conserving are equivalent with collocated grids), algo.field_gathering = momentum-conserving with hybrid grids.

  • algo.particle_pusher (string, optional)

    The algorithm for the particle pusher. Available options are:

    • boris: Boris pusher.

    • vay: Vay pusher (see Vay [1])

    • higuera: Higuera-Cary pusher (see Higuera and Cary [11])

    If algo.particle_pusher is not specified, boris is the default.

  • algo.particle_shape (integer; 1, 2, 3, or 4)

    The order of the shape factors (splines) for the macro-particles along all spatial directions: 1 for linear, 2 for quadratic, 3 for cubic, 4 for quartic. Low-order shape factors result in faster simulations, but may lead to more noisy results. High-order shape factors are computationally more expensive, but may increase the overall accuracy of the results. For production runs it is generally safer to use high-order shape factors, such as cubic order.

    Note that this input parameter is not optional and must always be set in all input files provided that there is at least one particle species (set in input as particles.species_names) or one laser species (set in input as lasers.names) in the simulation. No default value is provided automatically.
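
    A minimal sketch of a typical choice for these algorithms, combining the parameters above with the Maxwell solver selection described in the next subsection (the values are illustrative, not recommendations for every setup):

    algo.maxwell_solver = yee
    algo.current_deposition = esirkepov
    algo.field_gathering = energy-conserving
    algo.particle_pusher = boris
    algo.particle_shape = 3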

Maxwell solver

Two families of Maxwell solvers are implemented in WarpX, based on the Finite-Difference Time-Domain method (FDTD) or the Pseudo-Spectral Analytical Time-Domain method (PSATD), respectively.

  • algo.maxwell_solver (string, optional)

    The algorithm for the Maxwell field solver. Available options are:

    • yee: Yee FDTD solver.

    • ckc: (not available in RZ geometry) Cole-Karkkainen solver with Cowan coefficients (see Cowan et al. [12]).

    • psatd: Pseudo-spectral solver (see theory).

    • ect: Enlarged cell technique (conformal finite difference solver. See Xiao and Liu [13]).

    • hybrid: The E-field will be solved using Ohm’s law and a kinetic-fluid hybrid model (see theory).

    • none: No field solve will be performed.

    If algo.maxwell_solver is not specified, yee is the default.

  • algo.em_solver_medium (string, optional)

    The medium for evaluating the Maxwell solver. Available options are :

    • vacuum: vacuum properties are used in the Maxwell solver.

    • macroscopic: macroscopic Maxwell equation is evaluated. If this option is selected, then the corresponding properties of the medium must be provided using macroscopic.sigma, macroscopic.epsilon, and macroscopic.mu for each case where the initialization style is constant. Otherwise if the initialization style uses the parser, macroscopic.sigma_function(x,y,z), macroscopic.epsilon_function(x,y,z) and/or macroscopic.mu_function(x,y,z) must be provided using the parser initialization style for spatially varying macroscopic properties.

    If algo.em_solver_medium is not specified, vacuum is the default.

Maxwell solver: PSATD method

  • psatd.nox, psatd.noy, psatd.noz (integer) optional (default 16 for all)

    The order of accuracy of the spatial derivatives, when using the code compiled with a PSATD solver. If psatd.periodic_single_box_fft is used, these can be set to inf for infinite-order PSATD.

  • psatd.nx_guard, psatd.ny_guard, psatd.nz_guard (integer) optional

    The number of guard cells to use with the PSATD solver. If not set by the user, these values are determined automatically (based on empirical testing): they equal the order of the solver for collocated grids and half the order of the solver for staggered grids.

  • psatd.periodic_single_box_fft (0 or 1; default: 0)

    If true, this will not incorporate the guard cells into the box over which FFTs are performed. This is only valid when WarpX is run with periodic boundaries and a single box. In this case, using psatd.periodic_single_box_fft is equivalent to using a global FFT over the whole domain. Therefore, all the approximations that are usually made when using local FFTs with guard cells (for problems with multiple boxes) become exact in the case of the periodic, single-box FFT without guard cells.

  • psatd.current_correction (0 or 1; default: 1, with the exceptions mentioned below)

    If true, a current correction scheme in Fourier space is applied in order to guarantee charge conservation. The default value is psatd.current_correction=1, unless a charge-conserving current deposition scheme is used (by setting algo.current_deposition=esirkepov or algo.current_deposition=vay) or unless the div(E) cleaning scheme is used (by setting warpx.do_dive_cleaning=1).

    If psatd.v_galilean is zero, the spectral solver used is the standard PSATD scheme described in Vay et al. [10] and the current correction reads

    \[\widehat{\boldsymbol{J}}^{\,n+1/2}_{\mathrm{correct}} = \widehat{\boldsymbol{J}}^{\,n+1/2} - \bigg(\boldsymbol{k}\cdot\widehat{\boldsymbol{J}}^{\,n+1/2} - i \frac{\widehat{\rho}^{n+1} - \widehat{\rho}^{n}}{\Delta{t}}\bigg) \frac{\boldsymbol{k}}{k^2}\]

    If psatd.v_galilean is non-zero, the spectral solver used is the Galilean PSATD scheme described in Lehe et al. [14] and the current correction reads

    \[\widehat{\boldsymbol{J}}^{\,n+1/2}_{\mathrm{correct}} = \widehat{\boldsymbol{J}}^{\,n+1/2} - \bigg(\boldsymbol{k}\cdot\widehat{\boldsymbol{J}}^{\,n+1/2} - (\boldsymbol{k}\cdot\boldsymbol{v}_G) \,\frac{\widehat\rho^{n+1} - \widehat\rho^{n}\theta^2}{1 - \theta^2}\bigg) \frac{\boldsymbol{k}}{k^2}\]

    where \(\theta=\exp(i\,\boldsymbol{k}\cdot\boldsymbol{v}_G\,\Delta{t}/2)\).

    This option is currently implemented only for the standard PSATD, Galilean PSATD, and averaged Galilean PSATD schemes, while it is not yet available for the multi-J algorithm.

  • psatd.update_with_rho (0 or 1)

    If true, the update equation for the electric field is expressed in terms of both the current density and the charge density, namely \(\widehat{\boldsymbol{J}}^{\,n+1/2}\), \(\widehat\rho^{n}\), and \(\widehat\rho^{n+1}\). If false, instead, the update equation for the electric field is expressed in terms of the current density \(\widehat{\boldsymbol{J}}^{\,n+1/2}\) only. If charge is expected to be conserved (by setting, for example, psatd.current_correction=1), then the two formulations are expected to be equivalent.

    If psatd.v_galilean is zero, the spectral solver used is the standard PSATD scheme described in Vay et al. [10]:

    1. if psatd.update_with_rho=0, the update equation for the electric field reads

    \[\begin{split}\begin{split} \widehat{\boldsymbol{E}}^{\,n+1}= & \: C \widehat{\boldsymbol{E}}^{\,n} + i \, \frac{S c}{k} \boldsymbol{k}\times\widehat{\boldsymbol{B}}^{\,n} - \frac{S}{\epsilon_0 c \, k} \widehat{\boldsymbol{J}}^{\,n+1/2} \\[0.2cm] & +\frac{1-C}{k^2} (\boldsymbol{k}\cdot\widehat{\boldsymbol{E}}^{\,n}) \boldsymbol{k} + \frac{1}{\epsilon_0 k^2} \left(\frac{S}{c \, k}-\Delta{t}\right) (\boldsymbol{k}\cdot\widehat{\boldsymbol{J}}^{\,n+1/2}) \boldsymbol{k} \end{split}\end{split}\]
    2. if psatd.update_with_rho=1, the update equation for the electric field reads

    \[\begin{split}\begin{split} \widehat{\boldsymbol{E}}^{\,n+1}= & \: C\widehat{\boldsymbol{E}}^{\,n} + i \, \frac{S c}{k} \boldsymbol{k}\times\widehat{\boldsymbol{B}}^{\,n} - \frac{S}{\epsilon_0 c \, k} \widehat{\boldsymbol{J}}^{\,n+1/2} \\[0.2cm] & + \frac{i}{\epsilon_0 k^2} \left(C-\frac{S}{c\,k}\frac{1}{\Delta{t}}\right) \widehat{\rho}^{n} \boldsymbol{k} - \frac{i}{\epsilon_0 k^2} \left(1-\frac{S}{c \, k} \frac{1}{\Delta{t}}\right)\widehat{\rho}^{n+1} \boldsymbol{k} \end{split}\end{split}\]

    The coefficients \(C\) and \(S\) are defined in Vay et al. [10].

    If psatd.v_galilean is non-zero, the spectral solver used is the Galilean PSATD scheme described in Lehe et al. [14]:

    1. if psatd.update_with_rho=0, the update equation for the electric field reads

    \[\begin{split}\begin{split} \widehat{\boldsymbol{E}}^{\,n+1} = & \: \theta^{2} C \widehat{\boldsymbol{E}}^{\,n} + i \, \theta^{2} \frac{S c}{k} \boldsymbol{k}\times\widehat{\boldsymbol{B}}^{\,n} + \frac{i \, \nu \, \theta \, \chi_1 - \theta^{2} S}{\epsilon_0 c \, k} \widehat{\boldsymbol{J}}^{\,n+1/2} \\[0.2cm] & + \theta^{2} \frac{\chi_2-\chi_3}{k^{2}} (\boldsymbol{k}\cdot\widehat{\boldsymbol{E}}^{\,n}) \boldsymbol{k} + i \, \frac{\chi_2\left(\theta^{2}-1\right)}{\epsilon_0 c \, k^{3} \nu} (\boldsymbol{k}\cdot\widehat{\boldsymbol{J}}^{\,n+1/2}) \boldsymbol{k} \end{split}\end{split}\]
    2. if psatd.update_with_rho=1, the update equation for the electric field reads

    \[\begin{split}\begin{split} \widehat{\boldsymbol{E}}^{\,n+1} = & \: \theta^{2} C \widehat{\boldsymbol{E}}^{\,n} + i \, \theta^{2} \frac{S c}{k} \boldsymbol{k}\times\widehat{\boldsymbol{B}}^{\,n} + \frac{i \, \nu \, \theta \, \chi_1 - \theta^{2} S}{\epsilon_0 c \, k} \widehat{\boldsymbol{J}}^{\,n+1/2} \\[0.2cm] & + i \, \frac{\theta^{2} \chi_3}{\epsilon_0 k^{2}} \widehat{\rho}^{\,n} \boldsymbol{k} - i \, \frac{\chi_2}{\epsilon_0 k^{2}} \widehat{\rho}^{\,n+1} \boldsymbol{k} \end{split}\end{split}\]

    The coefficients \(C\), \(S\), \(\theta\), \(\nu\), \(\chi_1\), \(\chi_2\), and \(\chi_3\) are defined in Lehe et al. [14].

    The default value for psatd.update_with_rho is 1 if psatd.v_galilean is non-zero and 0 otherwise. The option psatd.update_with_rho=0 is not implemented with the following algorithms: comoving PSATD (psatd.v_comoving), time averaging (psatd.do_time_averaging=1), div(E) cleaning (warpx.do_dive_cleaning=1), and multi-J (warpx.do_multi_J=1).

    Note that the update with and without rho is also supported in RZ geometry.

  • psatd.J_in_time (constant or linear; default constant)

    This determines whether the current density is assumed to be constant or linear in time, within the time step over which the electromagnetic fields are evolved.

  • psatd.rho_in_time (linear; default linear)

    This determines whether the charge density is assumed to be linear in time, within the time step over which the electromagnetic fields are evolved.

  • psatd.v_galilean (3 floats, in units of the speed of light; default 0. 0. 0.)

    Defines the Galilean velocity. A non-zero velocity activates the Galilean algorithm, which suppresses numerical Cherenkov instabilities (NCI) in boosted-frame simulations (see the section Numerical Stability and alternate formulation in a Galilean frame for more information). This requires the code to be compiled with the spectral solver. It also requires the use of the direct current deposition algorithm (by setting algo.current_deposition = direct).

  • psatd.use_default_v_galilean (0 or 1; default: 0)

    This can be used in boosted-frame simulations only and sets the Galilean velocity along the \(z\) direction automatically as \(v_{G} = -\sqrt{1-1/\gamma^2}\), where \(\gamma\) is the Lorentz factor of the boosted frame (set by warpx.gamma_boost). See the section Numerical Stability and alternate formulation in a Galilean frame for more information on the Galilean algorithm for boosted-frame simulations.
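
    A minimal sketch of a boosted-frame setup using the Galilean PSATD scheme with the automatic choice of Galilean velocity (this assumes that warpx.gamma_boost is set elsewhere in the input deck and that WarpX is compiled with the spectral solver):

    algo.maxwell_solver = psatd
    algo.current_deposition = direct
    psatd.use_default_v_galilean = 1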

  • psatd.v_comoving (3 floating-point values, in units of the speed of light; default 0. 0. 0.)

    Defines the comoving velocity in the comoving PSATD scheme. A non-zero comoving velocity selects the comoving PSATD algorithm, which suppresses the numerical Cherenkov instability (NCI) in boosted-frame simulations, under certain assumptions. This option requires that WarpX is compiled with USE_PSATD = TRUE. It also requires the use of direct current deposition (algo.current_deposition = direct) and has been neither implemented nor tested with other current deposition schemes.

  • psatd.do_time_averaging (0 or 1; default: 0)

    Whether to use an averaged Galilean PSATD algorithm or standard Galilean PSATD.

  • warpx.do_multi_J (0 or 1; default: 0)

    Whether to use the multi-J algorithm, where current deposition and field update are performed multiple times within each time step. The number of sub-steps is determined by the input parameter warpx.do_multi_J_n_depositions. Unlike sub-cycling, field gathering is performed only once per time step, as in regular PIC cycles. When warpx.do_multi_J = 1, we perform linear interpolation of two distinct currents deposited at the beginning and the end of the time step, instead of using one single current deposited at half time. For simulations with strong numerical Cherenkov instability (NCI), it is recommended to use the multi-J algorithm in combination with psatd.do_time_averaging = 1.

  • warpx.do_multi_J_n_depositions (integer)

    Number of sub-steps to use with the multi-J algorithm, when warpx.do_multi_J = 1. Note that this input parameter is not optional and must always be set in all input files where warpx.do_multi_J = 1. No default value is provided automatically.
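
    A minimal sketch of a multi-J setup with time averaging, using the parameters above (the number of depositions is illustrative):

    warpx.do_multi_J = 1
    warpx.do_multi_J_n_depositions = 2
    psatd.do_time_averaging = 1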

Maxwell solver: macroscopic media

  • algo.macroscopic_sigma_method (string, optional)

    The algorithm for updating electric field when algo.em_solver_medium is macroscopic. Available options are:

    • backwardeuler is a fully-implicit, first-order in time scheme for E-update (default).

    • laxwendroff is the semi-implicit, second order in time scheme for E-update.

    Comparing the two methods, Lax-Wendroff is more prone to developing oscillations and requires a smaller timestep for stability. On the other hand, Backward Euler is more robust, but it is only first-order accurate in time, whereas Lax-Wendroff is second-order accurate.

  • macroscopic.sigma_function(x,y,z), macroscopic.epsilon_function(x,y,z), macroscopic.mu_function(x,y,z) (string)

    To initialize spatially varying conductivity, permittivity, and permeability, respectively, using a mathematical function in the input. Constants required in the mathematical expression can be set using my_constants. These parameters are parsed if algo.em_solver_medium=macroscopic.

  • macroscopic.sigma, macroscopic.epsilon, macroscopic.mu (double)

    To initialize a constant conductivity, permittivity, and permeability of the computational medium, respectively. The default values are the corresponding values in vacuum.
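
    A minimal sketch selecting the macroscopic solver with constant medium properties (the values shown are simply the vacuum values, for illustration only):

    algo.em_solver_medium = macroscopic
    algo.macroscopic_sigma_method = backwardeuler
    macroscopic.sigma = 0.
    macroscopic.epsilon = 8.8541878128e-12
    macroscopic.mu = 1.25663706212e-6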

Maxwell solver: kinetic-fluid hybrid

  • hybrid_pic_model.elec_temp (float)

    If algo.maxwell_solver is set to hybrid, this sets the electron temperature, in eV, used to calculate the electron pressure (see here).

  • hybrid_pic_model.n0_ref (float)

    If algo.maxwell_solver is set to hybrid, this sets the reference density, in \(m^{-3}\), used to calculate the electron pressure (see here).

  • hybrid_pic_model.gamma (float) optional (default 5/3)

    If algo.maxwell_solver is set to hybrid, this sets the exponent used to calculate the electron pressure (see here).

  • hybrid_pic_model.plasma_resistivity(rho,J) (float or str) optional (default 0)

    If algo.maxwell_solver is set to hybrid, this sets the plasma resistivity in \(\Omega m\).

  • hybrid_pic_model.plasma_hyper_resistivity (float or str) optional (default 0)

    If algo.maxwell_solver is set to hybrid, this sets the plasma hyper-resistivity in \(\Omega m^3\).

  • hybrid_pic_model.J[x/y/z]_external_grid_function(x, y, z, t) (float or str) optional (default 0)

    If algo.maxwell_solver is set to hybrid, this sets the external current (on the grid) in \(A/m^2\).

  • hybrid_pic_model.n_floor (float) optional (default 1)

    If algo.maxwell_solver is set to hybrid, this sets the plasma density floor, in \(m^{-3}\), which is useful since the generalized Ohm’s law used to calculate the E-field includes a \(1/n\) term.

  • hybrid_pic_model.substeps (int) optional (default 10)

    If algo.maxwell_solver is set to hybrid, this sets the number of sub-steps to take during the B-field update.
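
    A minimal sketch of a kinetic-fluid hybrid setup, using the parameters above (the values are illustrative; the linear particle shape is chosen following the note below):

    algo.maxwell_solver = hybrid
    algo.particle_shape = 1
    hybrid_pic_model.elec_temp = 10.   # eV
    hybrid_pic_model.n0_ref = 1.e19    # m^-3
    hybrid_pic_model.substeps = 10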

Note

Based on results from Stanier et al. [15] it is recommended to use linear particles when using the hybrid-PIC model.

Grid types (collocated, staggered, hybrid)

  • warpx.grid_type (string, collocated, staggered or hybrid)

    Whether to use a collocated grid (all fields defined at the cell nodes), a staggered grid (fields defined on a Yee grid), or a hybrid grid (fields and currents are interpolated back and forth between a staggered grid and a nodal grid, must be used with momentum-conserving field gathering algorithm, algo.field_gathering = momentum-conserving). The option hybrid is currently not supported in RZ geometry.

    Default: warpx.grid_type = staggered.

  • interpolation.galerkin_scheme (0 or 1)

    Whether to use a Galerkin scheme when gathering fields to particles. When set to 1, the interpolation orders used for field-gathering are reduced for certain field components along certain directions. For example, \(E_z\) is gathered using algo.particle_shape along \((x,y)\) and algo.particle_shape - 1 along \(z\). See equations (21)-(23) of Godfrey and Vay [16] and associated references for details.

    Default: interpolation.galerkin_scheme = 0 with collocated grids and/or momentum-conserving field gathering, interpolation.galerkin_scheme = 1 otherwise.

    Warning

    The default behavior should not normally be changed. At present, this parameter is intended mainly for testing and development purposes.

  • warpx.field_centering_nox, warpx.field_centering_noy, warpx.field_centering_noz (integer, optional)

    The order of interpolation used with staggered or hybrid grids (warpx.grid_type = staggered or warpx.grid_type = hybrid) and momentum-conserving field gathering (algo.field_gathering = momentum-conserving) to interpolate the electric and magnetic fields from the cell centers to the cell nodes, before gathering the fields from the cell nodes to the particle positions.

    Default: warpx.field_centering_no<x,y,z> = 2 with staggered grids, warpx.field_centering_no<x,y,z> = 8 with hybrid grids (typically necessary to ensure stability in boosted-frame simulations of relativistic plasmas and beams).

  • warpx.current_centering_nox, warpx.current_centering_noy, warpx.current_centering_noz (integer, optional)

    The order of interpolation used with hybrid grids (warpx.grid_type = hybrid) to interpolate the currents from the cell nodes to the cell centers when warpx.do_current_centering = 1, before pushing the Maxwell fields on staggered grids.

    Default: warpx.current_centering_no<x,y,z> = 8 with hybrid grids (typically necessary to ensure stability in boosted-frame simulations of relativistic plasmas and beams).

  • warpx.do_current_centering (bool, 0 or 1)

    If true, the current is deposited on a nodal grid and then centered to a staggered grid (Yee grid), using finite-order interpolation.

    Default: warpx.do_current_centering = 0 with collocated or staggered grids, warpx.do_current_centering = 1 with hybrid grids.

Additional parameters

  • warpx.do_dive_cleaning (0 or 1 ; default: 0)

    Whether to use modified Maxwell equations that progressively eliminate the error in \(\nabla\cdot\boldsymbol{E} - \rho/\epsilon_0\). This can be useful when using a current deposition algorithm that is not strictly charge-conserving, or when using mesh refinement. These modified Maxwell equations will cause the error to propagate (at the speed of light) to the boundaries of the simulation domain, where it can be absorbed.

  • warpx.do_subcycling (0 or 1; default: 0)

    Whether or not to use sub-cycling. Different refinement levels have a different cell size, which results in different Courant–Friedrichs–Lewy (CFL) limits for the time step. By default, when using mesh refinement, the same time step is used for all levels. This time step is taken as the CFL limit of the finest level. Hence, for coarser levels, the timestep is only a fraction of the CFL limit for this level, which may lead to numerical artifacts. With sub-cycling, each level evolves with its own time step, set to its own CFL limit. In practice, it means that when level 0 performs one iteration, level 1 performs two iterations. Currently, this option is only supported when amr.max_level = 1. More information can be found at https://ieeexplore.ieee.org/document/8659392.

  • warpx.override_sync_intervals (string) optional (default 1)

    Using the Intervals parser syntax, this string defines the timesteps at which synchronization of sources (rho and J) and fields (E and B) on grid nodes at box boundaries is performed. Since the grid nodes at the interface between two neighbor boxes are duplicated in both boxes, an instability can occur if they have too different values. This option makes sure that they are synchronized periodically. Note that if Perfectly Matched Layers (PML) are used, synchronization of the E and B fields is performed at every timestep regardless of this parameter.

  • warpx.use_hybrid_QED (bool; default: 0)

    Will use the Hybrid QED Maxwell solver when pushing fields: a QED correction is added to the field solver to solve non-linear Maxwell’s equations, according to Grismayer et al. [17]. Note that this option can only be used with the PSATD build. Furthermore, one must set warpx.grid_type = collocated (which otherwise would be staggered by default).

  • warpx.quantum_xi (float; default: 1.3050122e-52)

    Overwrites the actual quantum parameter used in Maxwell’s QED equations. Assigning a value here will make the simulation unphysical, but will allow QED effects to become more apparent. Note that this option will only have an effect if the warpx.use_hybrid_QED flag is also triggered.

  • warpx.do_device_synchronize (bool) optional (default 1)

    When running on an accelerated platform, whether to call amrex::Gpu::synchronize() around profiling regions. This allows the profiler to give meaningful timers, but only slightly slows down the simulation.

  • warpx.sort_intervals (string) optional (defaults: -1 on CPU; 4 on GPU)

    Using the Intervals parser syntax, this string defines the timesteps at which particles are sorted. If <=0, do not sort particles. It is turned on on GPUs for performance reasons (to improve memory locality).

  • warpx.sort_particles_for_deposition (bool) optional (default: true for the CUDA backend, otherwise false)

    This option controls the type of sorting used if particle sorting is turned on, i.e. if sort_intervals is not <=0. If true, particles will be sorted by cell to optimize deposition with many particles per cell, in the order x -> y -> z -> ppc. If false, particles will be sorted by bin, using the sort_bin_size parameter below, in the order ppc -> x -> y -> z. true is recommended for best performance on NVIDIA GPUs, especially if there are many particles per cell.

  • warpx.sort_idx_type (list of int) optional (default: 0 0 0)

    This controls the type of grid used to sort the particles when sort_particles_for_deposition is true. Possible values are:

    • idx_type = {0, 0, 0}: sort particles to a cell centered grid

    • idx_type = {1, 1, 1}: sort particles to a node centered grid

    • idx_type = {2, 2, 2}: compromise between a cell and node centered grid

    In 2D (XZ and RZ), only the first two elements are read. In 1D, only the first element is read.

  • warpx.sort_bin_size (list of int) optional (default 1 1 1)

    If sort_intervals is activated and sort_particles_for_deposition is false, particles are sorted in bins of sort_bin_size cells. In 2D, only the first two elements are read.

  • warpx.do_shared_mem_charge_deposition (bool) optional (default false)

    If activated, charge deposition will allocate and use small temporary buffers on which to accumulate deposited charge values from particles. On GPUs these buffers will reside in __shared__ memory, which is faster than the usual __global__ memory. Performance impact will depend on the relative overhead of assigning the particles to bins small enough to fit in the space available for the temporary buffers.

  • warpx.do_shared_mem_current_deposition (bool) optional (default false)

    If activated, current deposition will allocate and use small temporary buffers on which to accumulate deposited current values from particles. On GPUs these buffers will reside in __shared__ memory, which is faster than the usual __global__ memory. Performance impact will depend on the relative overhead of assigning the particles to bins small enough to fit in the space available for the temporary buffers. Performance is mostly improved when there is a lot of contention between particles writing to the same cell (e.g. when there are many particles per cell). This feature is only available for CUDA and HIP, and is only recommended for 3D or 2D.

  • warpx.shared_tilesize (list of int) optional (default 6 6 8 in 3D; 14 14 in 2D; 1s otherwise)

    Used to tune performance when do_shared_mem_current_deposition or do_shared_mem_charge_deposition is enabled. shared_tilesize is the size of the temporary buffer allocated in shared memory for a threadblock. A larger tilesize requires more shared memory, but gives more work to each threadblock, which can lead to higher occupancy, and allows for more buffered writes to __shared__ instead of __global__. The defaults in 2D and 3D are chosen from experimentation, but can be improved upon for specific problems. The other defaults are not optimized and should always be fine tuned for the problem.

  • warpx.shared_mem_current_tpb (int) optional (default 128)

    Used to tune performance when do_shared_mem_current_deposition is enabled. shared_mem_current_tpb controls the number of threads per block (tpb), i.e. the number of threads operating on a shared buffer.

Diagnostics and output

In-situ visualization

WarpX has four types of diagnostics: FullDiagnostics consist of dumps of fields and particles at given iterations, BackTransformedDiagnostics are used when running a simulation in a boosted frame, to reconstruct output data to the lab frame, BoundaryScrapingDiagnostics are used to collect the particles that are absorbed at the boundary, throughout the simulation, and ReducedDiags allow the user to compute some reduced quantity (particle temperature, max of a field) and write a small amount of data to text files. Similar to what is done for physical species, WarpX has a class Diagnostics that allows users to initialize different diagnostics, each of them with different fields, resolution and period. This currently applies to standard diagnostics, but should be extended to back-transformed diagnostics and reduced diagnostics (and others) in the near future.

Full Diagnostics

FullDiagnostics consist of dumps of fields and particles at given iterations. Similar to what is done for physical species, WarpX has a class Diagnostics that allows users to initialize different diagnostics, each of them with different fields, resolution and period. The user specifies the number of diagnostics and the name of each of them, and then specifies options for each of them separately. Note that some parameters (those that do not start with a <diag_name>. prefix) apply to all diagnostics. This should be changed in the future. In-situ capabilities can be used by turning on Sensei or Ascent (provided they are installed) through the output format, see below.

  • diagnostics.enable (0 or 1, optional, default 1)

    Whether to enable or disable diagnostics. This flag overwrites all other diagnostics input parameters.

  • diagnostics.diags_names (list of string optional, default empty)

    Name of each diagnostic. example: diagnostics.diags_names = diag1 my_second_diag.

  • <diag_name>.intervals (string)

    Using the Intervals parser syntax, this string defines the timesteps at which data is dumped. Use a negative number or 0 to disable data dumping. example: diag1.intervals = 10,20:25:1. Note that by default the last timestep is dumped regardless of this parameter. This can be changed using the parameter <diag_name>.dump_last_timestep described below.

  • <diag_name>.dump_last_timestep (bool optional, default 1)

    If this is 1, the last timestep is dumped regardless of <diag_name>.intervals.

  • <diag_name>.diag_type (string)

    Type of diagnostic. Possible values are Full, BackTransformed, and BoundaryScraping. example: diag1.diag_type = Full or diag1.diag_type = BackTransformed

  • <diag_name>.format (string optional, default plotfile)

    Flush format. Possible values are:

    • plotfile for native AMReX format.

    • checkpoint for a checkpoint file, only works with <diag_name>.diag_type = Full.

    • openpmd for the openPMD format. Requires building WarpX with USE_OPENPMD=TRUE (see instructions).

    • ascent for in-situ visualization using Ascent.

    • sensei for in-situ visualization using Sensei.

    example: diag1.format = openpmd.

  • <diag_name>.sensei_config (string)

    Only read if <diag_name>.format = sensei. Points to the SENSEI XML file which selects and configures the desired back end.

  • <diag_name>.sensei_pin_mesh (integer; 0 by default)

    Only read if <diag_name>.format = sensei. When set to 1, the lower left corner of the mesh is pinned to 0., 0., 0.

  • <diag_name>.openpmd_backend (bp, h5 or json) optional, only used if <diag_name>.format = openpmd

    I/O backend for openPMD data dumps. bp is the ADIOS I/O library, h5 is the HDF5 format, and json is a simple text format. json only works with serial/single-rank jobs. When WarpX is compiled with openPMD support, the first available backend in the order given above is taken.

  • <diag_name>.openpmd_encoding (optional, v (variable based), f (file based) or g (group based) ) only read if <diag_name>.format = openpmd.

    openPMD file output encoding. File based: one file per timestep (slower); group/variable based: one file for all steps (faster). variable based is an experimental feature with ADIOS2 and not supported for back-transformed diagnostics. Default: f (full diagnostics)

  • <diag_name>.adios2_operator.type (zfp, blosc) optional,

    ADIOS2 I/O operator type for openPMD data dumps.

  • <diag_name>.adios2_operator.parameters.* optional,

    ADIOS2 I/O operator parameters for openPMD data dumps.

    A typical example for ADIOS2 output using lossless compression with blosc using the zstd compressor and 6 CPU threads per MPI Rank (e.g. for a GPU run with spare CPU resources):

    <diag_name>.adios2_operator.type = blosc
    <diag_name>.adios2_operator.parameters.compressor = zstd
    <diag_name>.adios2_operator.parameters.clevel = 1
    <diag_name>.adios2_operator.parameters.doshuffle = BLOSC_BITSHUFFLE
    <diag_name>.adios2_operator.parameters.threshold = 2048
    <diag_name>.adios2_operator.parameters.nthreads = 6  # per MPI rank (and thus per GPU)
    

    or for the lossy ZFP compressor using very strong compression per scalar:

    <diag_name>.adios2_operator.type = zfp
    <diag_name>.adios2_operator.parameters.precision = 3
    
  • <diag_name>.adios2_engine.type (bp4, sst, ssc, dataman) optional,

    ADIOS2 Engine type for openPMD data dumps. See full list of engines at ADIOS2 readthedocs

  • <diag_name>.adios2_engine.parameters.* optional,

    ADIOS2 Engine parameters for openPMD data dumps.

    Examples of parameters for the BP engine are setting the number of writers (NumAggregators) or transparently redirecting data to burst buffers. A detailed list of engine-specific parameters is available in the official ADIOS2 documentation:

    <diag_name>.adios2_engine.parameters.NumAggregators = 2048
    <diag_name>.adios2_engine.parameters.BurstBufferPath="/mnt/bb/username"
    
  • <diag_name>.fields_to_plot (list of strings, optional)

    Fields written to output. Possible scalar fields: part_per_cell rho phi F part_per_grid divE divB and rho_<species_name>, where <species_name> must match the name of one of the available particle species. Note that phi will only be written out when do_electrostatic==labframe. Also, note that for <diag_name>.diag_type = BackTransformed, the only scalar field currently supported is rho. Possible vector field components in Cartesian geometry: Ex Ey Ez Bx By Bz jx jy jz. Possible vector field components in RZ geometry: Er Et Ez Br Bt Bz jr jt jz. The default is <diag_name>.fields_to_plot = Ex Ey Ez Bx By Bz jx jy jz in Cartesian geometry and <diag_name>.fields_to_plot = Er Et Ez Br Bt Bz jr jt jz in RZ geometry. When the special value none is specified, no fields are written out. Note that the fields are averaged on the cell centers before they are written to file. Otherwise, we reconstruct a 2D Cartesian slice of the fields for output at \(\theta=0\).

  • <diag_name>.dump_rz_modes (0 or 1) optional (default 0)

    Whether to save all modes when in RZ. When <diag_name>.format = openpmd, this parameter is ignored and all modes are saved.

  • <diag_name>.particle_fields_to_plot (list of strings, optional)

    Names of per-cell diagnostics of particle properties to calculate and output as additional fields. Note that the deposition onto the grid does not respect the particle shape factor, but instead uses nearest-grid point interpolation. Default is none. Parser functions for these field names are specified by <diag_name>.particle_fields.<field_name>(x,y,z,ux,uy,uz). Also, note that this option is only available for <diag_name>.diag_type = Full

  • <diag_name>.particle_fields_species (list of strings, optional)

    Species for which to calculate particle_fields_to_plot. Fields will be calculated separately for each specified species. The default is a list of all of the available particle species.

  • <diag_name>.particle_fields.<field_name>.do_average (0 or 1) optional (default 1)

    Whether the diagnostic is an average or a sum. With an average, the sum over the specified function is divided by the sum of the particle weights in each cell.

  • <diag_name>.particle_fields.<field_name>(x,y,z,ux,uy,uz) (parser string)

    Parser function to be calculated for each particle per cell. The averaged field written is

    \[\texttt{<field_name>_<species>} = \frac{\sum_{i=1}^N w_i \, f(x_i,y_i,z_i,u_{x,i},u_{y,i},u_{z,i})}{\sum_{i=1}^N w_i}\]

    where \(w_i\) is the particle weight, \(f()\) is the parser function, and \((x_i,y_i,z_i)\) are particle positions in units of a meter. The sums are over all particles of type <species> in a cell (ignoring the particle shape factor) that satisfy <diag_name>.particle_fields.<field_name>.filter(x,y,z,ux,uy,uz). When <diag_name>.particle_fields.<field_name>.do_average is 0, the division by the sum over particle weights is not done. In 1D or 2D, the particle coordinates will follow the WarpX convention. \((u_{x,i},u_{y,i},u_{z,i})\) are components of the particle four-momentum. \(u = \gamma v/c\), \(\gamma\) is the Lorentz factor, \(v\) is the particle velocity and \(c\) is the speed of light. For photons, we use the standardized momentum \(u = p/(m_{e}c)\), where \(p\) is the momentum of the photon and \(m_{e}\) the mass of an electron.

  • <diag_name>.particle_fields.<field_name>.filter(x,y,z,ux,uy,uz) (parser string, optional)

    Parser function returning a boolean for whether to include a particle in the diagnostic. If not specified, all particles will be included (see above). The function arguments are the same as above.
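
    A minimal sketch of a per-cell particle diagnostic, using the parameters above to record the weighted average of uz for forward-moving particles (the diagnostic name diag1, the field name uz_avg and the species name electrons are placeholders):

    diag1.particle_fields_to_plot = uz_avg
    diag1.particle_fields_species = electrons
    diag1.particle_fields.uz_avg(x,y,z,ux,uy,uz) = uz
    diag1.particle_fields.uz_avg.filter(x,y,z,ux,uy,uz) = (uz > 0)
    diag1.particle_fields.uz_avg.do_average = 1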

  • <diag_name>.plot_raw_fields (0 or 1) optional (default 0)

    By default, the fields written in the plot files are averaged on the cell centers. When <diag_name>.plot_raw_fields = 1, then the raw (i.e. non-averaged) fields are also saved in the output files. Only works with <diag_name>.format = plotfile. See this section in the yt documentation for more details on how to view raw fields.

  • <diag_name>.plot_raw_fields_guards (0 or 1) optional (default 0)

    Only used when <diag_name>.plot_raw_fields = 1. Whether to include the guard cells in the output of the raw fields. Only works with <diag_name>.format = plotfile.

  • <diag_name>.coarsening_ratio (list of int) optional (default 1 1 1)

    Reduce size of the selected diagnostic fields output by this ratio in each dimension. (For a ratio of N, this is done by averaging the fields over N or (N+1) points depending on the staggering). If blocking_factor and max_grid_size are used for the domain decomposition, as detailed in the domain decomposition section, coarsening_ratio should be an integer divisor of blocking_factor. If warpx.numprocs is used instead, the total number of cells in a given dimension must be a multiple of the coarsening_ratio multiplied by numprocs in that dimension.

  • <diag_name>.file_prefix (string) optional (default diags/<diag_name>)

    Root for output file names. Supports sub-directories.

  • <diag_name>.file_min_digits (int) optional (default 6)

    The minimum number of digits used for the iteration number appended to the diagnostic file names.

  • <diag_name>.diag_lo (list float, 1 per dimension) optional (default -infinity -infinity -infinity)

    Lower corner of the output fields (if smaller than warpx.dom_lo, then set to warpx.dom_lo). Currently, when the diag_lo is different from warpx.dom_lo, particle output is disabled.

  • <diag_name>.diag_hi (list float, 1 per dimension) optional (default +infinity +infinity +infinity)

    Higher corner of the output fields (if larger than warpx.dom_hi, then set to warpx.dom_hi). Currently, when the diag_hi is different from warpx.dom_hi, particle output is disabled.

  • <diag_name>.write_species (0 or 1) optional (default 1)

    Whether to write species output or not. For checkpoint format, always set this parameter to 1.

  • <diag_name>.species (list of string, default all physical species in the simulation)

    Which species are dumped in this diagnostic.

  • <diag_name>.<species_name>.variables (list of strings separated by spaces, optional)

    List of particle quantities to write to output. Choices are w for the particle weight and ux uy uz for the particle momenta. When using the lab-frame electrostatic solver, phi (electrostatic potential, on the macroparticles) is also available. By default, all particle quantities (except phi) are written. If <diag_name>.<species_name>.variables = none, no particle data are written, except for particle positions, which are always included.

  • <diag_name>.<species_name>.random_fraction (float) optional

    If provided <diag_name>.<species_name>.random_fraction = a, only a fraction of the particle data of this species will be dumped randomly in diag <diag_name>, i.e. if rand() < a, this particle will be dumped, where rand() denotes a random number generator. The value a provided should be between 0 and 1.

  • <diag_name>.<species_name>.uniform_stride (int) optional

    If provided <diag_name>.<species_name>.uniform_stride = n, every n-th particle of this species will be dumped, selected uniformly. The value provided should be an integer greater than or equal to 0.

  • <diag_name>.<species_name>.plot_filter_function(t,x,y,z,ux,uy,uz) (string) optional

    Users can provide an expression returning a boolean for whether a particle is dumped. t represents the physical time in seconds during the simulation. x, y, z represent particle positions in units of meters. ux, uy, uz represent particle momenta in units of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor and \(v/c\) is the particle velocity normalized by the speed of light. E.g., if (x>0.0)*(uz<10.0) is provided, only particles located at positions x greater than 0 and having momentum uz less than 10 will be dumped.
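
    A minimal sketch of a full diagnostic combining several of the parameters above (the diagnostic name diag1 and the species name electrons are placeholders, and the values are illustrative):

    diagnostics.diags_names = diag1
    diag1.intervals = 100
    diag1.diag_type = Full
    diag1.format = openpmd
    diag1.fields_to_plot = Ex Ey Ez rho
    diag1.species = electrons
    diag1.electrons.variables = w ux uy uz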

  • amrex.async_out (0 or 1) optional (default 0)

    Whether to use asynchronous IO when writing plotfiles. This only has an effect when using the AMReX plotfile format. Please see the data analysis section for more information.

  • amrex.async_out_nfiles (int) optional (default 64)

    The maximum number of files to write to when using asynchronous IO. To use asynchronous IO with more than amrex.async_out_nfiles MPI ranks, WarpX must be configured with -DWarpX_MPI_THREAD_MULTIPLE=ON. Please see the data analysis section for more information.

  • warpx.field_io_nfiles and warpx.particle_io_nfiles (int) optional (default 1024)

    The maximum number of files to use when writing field and particle data to plotfile directories.

  • warpx.mffile_nstreams (int) optional (default 4)

    Limit the number of concurrent readers per file.

BackTransformed Diagnostics

The BackTransformed diagnostic type is used when running a simulation in a boosted frame, to reconstruct output data to the lab frame. This option can be set using <diag_name>.diag_type = BackTransformed. We support the following list of options from Full Diagnostics:

<diag_name>.format, <diag_name>.openpmd_backend, <diag_name>.dump_rz_modes, <diag_name>.file_prefix, <diag_name>.diag_lo, <diag_name>.diag_hi, <diag_name>.write_species, <diag_name>.species.

Additional options for this diagnostic include:

  • <diag_name>.num_snapshots_lab (integer)

    Only used when <diag_name>.diag_type is BackTransformed. The number of lab-frame snapshots that will be written. Only this option or intervals should be specified; a run-time error occurs if the user attempts to set both num_snapshots_lab and intervals.

  • <diag_name>.intervals (string)

    Only used when <diag_name>.diag_type is BackTransformed. Using the Intervals parser syntax, this string defines the lab frame times at which data is dumped, given as multiples of the step size dt_snapshots_lab or dz_snapshots_lab described below. Example: btdiag1.intervals = 10:11,20:24:2 and btdiag1.dt_snapshots_lab = 1.e-12 indicate to dump at lab times 1e-11, 1.1e-11, 2e-11, 2.2e-11, and 2.4e-11 seconds. Note that the stop interval, the second number in the slice, must always be specified. Only this option or num_snapshots_lab should be specified; a run-time error occurs if the user attempts to set both num_snapshots_lab and intervals.

  • <diag_name>.dt_snapshots_lab (float, in seconds)

    Only used when <diag_name>.diag_type is BackTransformed. The time interval in between the lab-frame snapshots (where this time interval is expressed in the laboratory frame).

  • <diag_name>.dz_snapshots_lab (float, in meters)

    Only used when <diag_name>.diag_type is BackTransformed. Distance between the lab-frame snapshots (expressed in the laboratory frame). dt_snapshots_lab is then computed by dt_snapshots_lab = dz_snapshots_lab/c. Either dt_snapshots_lab or dz_snapshots_lab is required.

  • <diag_name>.buffer_size (integer)

    Only used when <diag_name>.diag_type is BackTransformed. The default size of the back transformed diagnostic buffers used to generate lab-frame data is 256. That is, when the multifab with lab-frame data has 256 z-slices, the data will be flushed out. However, if many lab-frame snapshots are required for diagnostics and visualization, the GPU may run out of memory with many large boxes with a size of 256 in the z-direction. This input parameter can then be used to set a smaller buffer size, preferably a multiple of 8, such that a large number of lab-frame snapshots can be generated without running out of GPU memory. The downside of using a small buffer size is that the I/O time may increase due to frequent flushes of the lab-frame data. The other option is to keep the default value for the buffer size and use slices to reduce the memory footprint and maintain optimum I/O performance.

  • <diag_name>.do_back_transformed_fields (0 or 1) optional (default 1)

    Only used when <diag_name>.diag_type is BackTransformed. Whether to back-transform the fields or not. Note that for BackTransformed diagnostics, at least one of the options <diag_name>.do_back_transformed_fields or <diag_name>.do_back_transformed_particles must be 1.

  • <diag_name>.do_back_transformed_particles (0 or 1) optional (default 1)

    Only used when <diag_name>.diag_type is BackTransformed. Whether to back-transform the particle data or not. Note that for BackTransformed diagnostics, at least one of the options <diag_name>.do_back_transformed_fields or <diag_name>.do_back_transformed_particles must be 1. If <diag_name>.write_species = 0, then <diag_name>.do_back_transformed_particles will be set to 0 in the simulation and particles will not be back-transformed.
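
    A minimal sketch of a back-transformed diagnostic, using the parameters above (the diagnostic name btdiag1 is a placeholder and the values are illustrative):

    diagnostics.diags_names = btdiag1
    btdiag1.diag_type = BackTransformed
    btdiag1.format = openpmd
    btdiag1.num_snapshots_lab = 20
    btdiag1.dt_snapshots_lab = 1.e-12
    btdiag1.do_back_transformed_fields = 1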

Boundary Scraping Diagnostics

BoundaryScrapingDiagnostics are used to collect the particles that are absorbed at the boundaries, throughout the simulation. This diagnostic type is specified by setting <diag_name>.diag_type = BoundaryScraping. Currently, the only supported output format is openPMD, so the user also needs to set <diag>.format=openpmd and WarpX must be compiled with openPMD turned on. The data that is to be collected and recorded is controlled per species and per boundary by setting one or more of the flags to 1: <species>.save_particles_at_xlo/ylo/zlo, <species>.save_particles_at_xhi/yhi/zhi, and <species>.save_particles_at_eb. (Note that this diagnostic does not save any fields; it only saves particles.)

The data collected at each boundary is written out to a subdirectory of the diagnostics directory with the name of the boundary, for example, particles_at_xlo, particles_at_zhi, or particles_at_eb. By default, all of the collected particle data is written out at the end of the simulation. Optionally, the <diag_name>.intervals parameter can be given to specify writing out the data more often. This can be important if a large number of particles are lost, avoiding filling up memory with the accumulated lost particle data.

In addition to their usual attributes, the saved particles have:

  • an integer attribute stepScraped, which indicates the PIC iteration at which each particle was absorbed at the boundary,

  • a real attribute deltaTimeScraped, which indicates the time between the time associated with stepScraped and the exact time when each particle hits the boundary,

  • 3 real attributes nx, ny, nz, which represent the three components of the normal to the boundary at the point of contact of the particles (not saved if they reach non-EB boundaries).

BoundaryScrapingDiagnostics can be used with <diag_name>.<species>.random_fraction, <diag_name>.<species>.uniform_stride, and <diag_name>.<species>.plot_filter_function, which have the same behavior as for FullDiagnostics. For BoundaryScrapingDiagnostics, these filters are applied at the time the data is written to file. An implication of this is that more particles may initially be accumulated in memory than are ultimately written. t in plot_filter_function refers to the time the diagnostic is written rather than the time the particle crossed the boundary.
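
A minimal sketch of a boundary-scraping setup (the diagnostic name scrape and the species name electrons are placeholders; here the particles absorbed at the upper z boundary and at the embedded boundary are recorded, and flushed every 1000 steps):

    diagnostics.diags_names = scrape
    scrape.diag_type = BoundaryScraping
    scrape.format = openpmd
    scrape.intervals = 1000
    electrons.save_particles_at_zhi = 1
    electrons.save_particles_at_eb = 1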

Reduced Diagnostics

ReducedDiags allow the user to compute some reduced quantity (e.g., a particle temperature or the maximum of a field) and write a small amount of data to text files.

  • warpx.reduced_diags_names (strings, separated by spaces)

    The names, given by the user, of simple reduced diagnostics; these are also the names of the output .txt files. These reduced diagnostics aim to produce simple outputs of the time history of some physical quantities. If warpx.reduced_diags_names is not provided in the input file, no reduced diagnostics are performed. The names are then used in the rest of the input deck; in this documentation we use <reduced_diags_name> as a placeholder. A complete example input snippet is given at the end of this section.

  • <reduced_diags_name>.type (string)

    The type of reduced diagnostics associated with this <reduced_diags_name>. For example, ParticleEnergy, FieldEnergy, etc. All available types are described below in detail. For all reduced diagnostics, the first and the second columns in the output file are the time step and the corresponding physical time in seconds, respectively.

    • ParticleEnergy

      This type computes the total and mean relativistic particle kinetic energy among all species:

      \[E_p = \sum_{i=1}^N w_i \, \left( \sqrt{|\boldsymbol{p}_i|^2 c^2 + m_0^2 c^4} - m_0 c^2 \right)\]

      where \(\boldsymbol{p}_i\) is the relativistic momentum of the \(i\)-th particle, \(c\) is the speed of light, \(m_0\) is the rest mass, \(N\) is the number of particles, and \(w_i\) is the weight of the \(i\)-th particle.

      The output columns are the total energy of all species, the total energy per species, the total mean energy \(E_p / \sum_i w_i\) of all species, and the total mean energy per species.

    • ParticleMomentum

      This type computes the total and mean relativistic particle momentum among all species:

      \[\boldsymbol{P}_p = \sum_{i=1}^N w_i \, \boldsymbol{p}_i\]

      where \(\boldsymbol{p}_i\) is the relativistic momentum of the \(i\)-th particle, \(N\) is the number of particles, and \(w_i\) is the weight of the \(i\)-th particle.

      The output columns are the components of the total momentum of all species, the total momentum per species, the total mean momentum \(\boldsymbol{P}_p / \sum_i w_i\) of all species, and the total mean momentum per species.

    • FieldEnergy

      This type computes the electromagnetic field energy

      \[E_f = \frac{1}{2} \sum_{\text{cells}} \left( \varepsilon_0 |\boldsymbol{E}|^2 + \frac{|\boldsymbol{B}|^2}{\mu_0} \right) \Delta V\]

      where \(\boldsymbol{E}\) is the electric field, \(\boldsymbol{B}\) is the magnetic field, \(\varepsilon_0\) is the vacuum permittivity, \(\mu_0\) is the vacuum permeability, \(\Delta V\) is the cell volume (or cell area in 2D), and the sum is over all cells.

      The output columns are the total field energy \(E_f\), the \(\boldsymbol{E}\) field energy, and the \(\boldsymbol{B}\) field energy, at each mesh refinement level.

    • FieldMomentum

      This type computes the electromagnetic field momentum

      \[\boldsymbol{P}_f = \varepsilon_0 \sum_{\text{cells}} \left( \boldsymbol{E} \times \boldsymbol{B} \right) \Delta V\]

      where \(\boldsymbol{E}\) is the electric field, \(\boldsymbol{B}\) is the magnetic field, \(\varepsilon_0\) is the vacuum permittivity, \(\Delta V\) is the cell volume (or cell area in 2D), and the sum is over all cells.

      The output columns are the components of the total field momentum \(\boldsymbol{P}_f\) at each mesh refinement level.

      Note that the fields are not averaged on the cell centers before the field momentum is computed.

    • FieldMaximum

      This type computes the maximum value of each component of the electric and magnetic fields and of the norms of the electric and magnetic field vectors. Be aware that measuring maximum fields in a plasma can be very noisy in PIC simulations; this diagnostic is better suited to scenarios such as an electromagnetic wave propagating in vacuum.

      The output columns are the maximum value of the \(E_x\) field, the maximum value of the \(E_y\) field, the maximum value of the \(E_z\) field, the maximum value of the norm \(|E|\) of the electric field, the maximum value of the \(B_x\) field, the maximum value of the \(B_y\) field, the maximum value of the \(B_z\) field and the maximum value of the norm \(|B|\) of the magnetic field, at mesh refinement levels from 0 to \(n\).

      Note that the fields are averaged on the cell centers before their maximum values are computed.

    • FieldProbe

      This type computes the value of each component of the electric and magnetic fields and of the Poynting vector (a measure of electromagnetic flux) at points in the domain.

      Multiple geometries for point probes can be specified via <reduced_diags_name>.probe_geometry = ...:

      • Point (default): a single point

      • Line: a line of points with equal spacing

      • Plane: a plane of points with equal spacing

      Point: The point where the fields are measured is specified through the input parameters <reduced_diags_name>.x_probe, <reduced_diags_name>.y_probe and <reduced_diags_name>.z_probe.

      Line: probe a 1 dimensional line of points to create a line detector. Initial input parameters x_probe, y_probe, and z_probe designate one end of the line detector, while the far end is specified via <reduced_diags_name>.x1_probe, <reduced_diags_name>.y1_probe, <reduced_diags_name>.z1_probe. Additionally, <reduced_diags_name>.resolution must be defined to give the number of detector points along the line (equally spaced) to probe.

      Plane: probe a 2 dimensional plane of points to create a square plane detector. Initial input parameters x_probe, y_probe, and z_probe designate the center of the detector. The detector plane is normal to a vector specified by <reduced_diags_name>.target_normal_x, <reduced_diags_name>.target_normal_y, and <reduced_diags_name>.target_normal_z. Note that it is not necessary to specify the target_normal vector in a 2D simulation (the only supported normal is in y). The top of the plane is perpendicular to an “up” vector denoted by <reduced_diags_name>.target_up_x, <reduced_diags_name>.target_up_y, and <reduced_diags_name>.target_up_z. The detector has a half-width (“square radius”) set by <reduced_diags_name>.detector_radius. Similarly to the line detector, the plane detector requires a resolution <reduced_diags_name>.resolution, which denotes the number of detector points along each side of the square detector.

      The output columns are the value of the \(E_x\) field, the value of the \(E_y\) field, the value of the \(E_z\) field, the value of the \(B_x\) field, the value of the \(B_y\) field, the value of the \(B_z\) field and the value of the Poynting Vector \(|S|\) of the electromagnetic fields, at mesh refinement levels from 0 to \(n\), at point (\(x\), \(y\), \(z\)).

      The fields are always interpolated to the measurement point. The interpolation order can be set by specifying <reduced_diags_name>.interp_order, defaulting to 1. In RZ geometry, this only saves the 0’th azimuthal mode component of the fields. Time integrated electric and magnetic field components can instead be obtained by specifying <reduced_diags_name>.integrate = true. The integration is done every time step even when the data is written out less often. In a moving window simulation, the FieldProbe can be set to follow the moving frame by specifying <reduced_diags_name>.do_moving_window_FP = 1 (default 0).

      Warning

      The FieldProbe reduced diagnostic does not yet add a Lorentz back transformation for boosted frame simulations. Thus, it records field data in the boosted frame, not (yet) in the lab frame.

    • RhoMaximum

      This type computes the maximum and minimum values of the total charge density as well as the maximum absolute value of the charge density of each charged species. Please be aware that measuring maximum charge densities might be very noisy in PIC simulations.

      The output columns are the maximum value of the \(\rho\) field, the minimum value of the \(\rho\) field, and the maximum value of the absolute value \(|\rho|\) of each charged species.

      Note that the charge densities are averaged on the cell centers before their maximum values are computed.

    • FieldReduction

      This type computes an arbitrary reduction of the positions, the current density, and the electromagnetic fields.

      • <reduced_diags_name>.reduced_function(x,y,z,Ex,Ey,Ez,Bx,By,Bz,jx,jy,jz) (string)

        An analytic function to be reduced must be provided, using the math parser.

      • <reduced_diags_name>.reduction_type (string)

        The type of reduction to be performed. It must be either Maximum, Minimum or Integral. Integral computes the spatial integral of the function defined in the parser by summing its value on all grid points and multiplying the result by the volume of a cell. Please be also aware that measuring maximum quantities might be very noisy in PIC simulations.

      The only output column is the reduced value.

      Note that the fields are averaged on the cell centers before the reduction is performed.

    • ParticleNumber

      This type computes the total number of macroparticles and of physical particles (i.e. the sum of their weights) in the whole simulation domain (for each species and summed over all species). It can be useful in particular for simulations with creation (ionization, QED processes) or removal (resampling) of particles.

      The output columns are total number of macroparticles summed over all species, total number of macroparticles of each species, sum of the particles’ weight summed over all species, sum of the particles’ weight of each species.

    • BeamRelevant

      This type computes properties of a particle beam relevant for particle accelerators, like position, momentum, emittance, etc.

      <reduced_diags_name>.species must be provided, such that the diagnostics are done for this (beam-like) species only.

      The output columns (for 3D-XYZ) are the following, where the average is done over the whole species (typical usage: the particle beam is in a separate species):

      [0]: simulation step (iteration).

      [1]: time (s).

      [2], [3], [4]: The mean values of beam positions (m) \(\langle x \rangle\), \(\langle y \rangle\), \(\langle z \rangle\).

      [5], [6], [7]: The mean values of beam relativistic momenta (kg m/s) \(\langle p_x \rangle\), \(\langle p_y \rangle\), \(\langle p_z \rangle\).

      [8]: The mean Lorentz factor \(\langle \gamma \rangle\).

      [9], [10], [11]: The RMS values of beam positions (m) \(\delta_x = \sqrt{ \langle (x - \langle x \rangle)^2 \rangle }\), \(\delta_y = \sqrt{ \langle (y - \langle y \rangle)^2 \rangle }\), \(\delta_z = \sqrt{ \langle (z - \langle z \rangle)^2 \rangle }\).

      [12], [13], [14]: The RMS values of beam relativistic momenta (kg m/s) \(\delta_{px} = \sqrt{ \langle (p_x - \langle p_x \rangle)^2 \rangle }\), \(\delta_{py} = \sqrt{ \langle (p_y - \langle p_y \rangle)^2 \rangle }\), \(\delta_{pz} = \sqrt{ \langle (p_z - \langle p_z \rangle)^2 \rangle }\).

      [15]: The RMS value of the Lorentz factor \(\sqrt{ \langle (\gamma - \langle \gamma \rangle)^2 \rangle }\).

      [16], [17], [18]: beam projected transverse RMS normalized emittance (m) \(\epsilon_x = \dfrac{1}{mc} \sqrt{\delta_x^2 \delta_{px}^2 - \Big\langle (x-\langle x \rangle) (p_x-\langle p_x \rangle) \Big\rangle^2}\), \(\epsilon_y = \dfrac{1}{mc} \sqrt{\delta_y^2 \delta_{py}^2 - \Big\langle (y-\langle y \rangle) (p_y-\langle p_y \rangle) \Big\rangle^2}\), \(\epsilon_z = \dfrac{1}{mc} \sqrt{\delta_z^2 \delta_{pz}^2 - \Big\langle (z-\langle z \rangle) (p_z-\langle p_z \rangle) \Big\rangle^2}\).

      [19], [20]: Twiss alpha for the transverse directions \(\alpha_x = - \Big\langle (x-\langle x \rangle) (p_x-\langle p_x \rangle) \Big\rangle \Big/ \epsilon_x\), \(\alpha_y = - \Big\langle (y-\langle y \rangle) (p_y-\langle p_y \rangle) \Big\rangle \Big/ \epsilon_y\).

      [21], [22]: beta function for the transverse directions (m) \(\beta_x = \dfrac{{\delta_x}^2}{\epsilon_x}\), \(\beta_y = \dfrac{{\delta_y}^2}{\epsilon_y}\).

      [23]: The charge of the beam (C).

      For 2D-XZ, \(\langle y \rangle\), \(\delta_y\), and \(\epsilon_y\) are not output.

    • LoadBalanceCosts

      This type computes the cost, used in load balancing, for each box on the domain. The cost \(c\) is computed as

      \[c = n_{\text{particle}} \cdot w_{\text{particle}} + n_{\text{cell}} \cdot w_{\text{cell}},\]

      where \(n_{\text{particle}}\) is the number of particles on the box, \(w_{\text{particle}}\) is the particle cost weight factor (controlled by algo.costs_heuristic_particles_wt), \(n_{\text{cell}}\) is the number of cells on the box, and \(w_{\text{cell}}\) is the cell cost weight factor (controlled by algo.costs_heuristic_cells_wt).

    • LoadBalanceEfficiency

      This type computes the load balance efficiency, given the present costs and distribution mapping. Load balance efficiency is computed as the mean cost over all ranks, divided by the maximum cost over all ranks. Until costs are recorded, load balance efficiency is output as -1; at earliest, the load balance efficiency can be output starting at step 2, since costs are not recorded until step 1.

    • ParticleHistogram

      This type computes a user defined particle histogram.

      • <reduced_diags_name>.species (string)

        A species name must be provided, such that the diagnostics are done for this species.

      • <reduced_diags_name>.histogram_function(t,x,y,z,ux,uy,uz) (string)

        A histogram function must be provided. t represents the physical time in seconds during the simulation. x, y, z represent particle positions in units of meters. ux, uy, uz represent the particle momenta in units of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor and \(v/c\) is the particle velocity normalized by the speed of light. For example, x produces the position (density) distribution in x, ux produces the momentum distribution in x, and sqrt(ux*ux+uy*uy+uz*uz) produces the speed distribution. The default value of the histogram without normalization is \(f = \sum\limits_{i=1}^N w_i\), where \(\sum\limits_{i=1}^N\) is the sum over the \(N\) particles in that bin and \(w_i\) denotes the weight of the \(i\)-th particle.

      • <reduced_diags_name>.bin_number (int > 0)

        This is the number of bins used for the histogram.

      • <reduced_diags_name>.bin_max (float)

        This is the maximum value of the bins.

      • <reduced_diags_name>.bin_min (float)

        This is the minimum value of the bins.

      • <reduced_diags_name>.normalization (optional)

        This provides options to normalize the histogram:

        unity_particle_weight uses unity particle weight to compute the histogram, such that the values of the histogram are the number of counted macroparticles in that bin, i.e. \(f = \sum\limits_{i=1}^N 1\), where \(N\) is the number of particles in that bin.

        max_to_unity will normalize the histogram such that its maximum value is one.

        area_to_unity will normalize the histogram such that the area under the histogram is one, so the histogram is also the probability density function.

        If nothing is provided, the macroparticle weight will be used to compute the histogram, and no normalization will be done.

      • <reduced_diags_name>.filter_function(t,x,y,z,ux,uy,uz) (string) optional

        Users can provide an expression returning a boolean for whether a particle is taken into account when calculating the histogram. t represents the physical time in seconds during the simulation. x, y, z represent particle positions in units of meters. ux, uy, uz represent particle momenta in units of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor and \(v/c\) is the particle velocity normalized by the speed of light. For example, if (x>0.0)*(uz<10.0) is provided, only particles located at positions x greater than 0 and with momentum uz less than 10 are taken into account when calculating the histogram.

      The output columns are the values of the 1st bin, the 2nd bin, …, the nth bin. An example input file and a Python script that loads the output of the histogram reduced diagnostics are given in Examples/Tests/initial_distribution/.

    • ParticleHistogram2D

      This type computes a user defined, 2D particle histogram.

      • <reduced_diags_name>.species (string)

        A species name must be provided, such that the diagnostics are done for this species.

      • <reduced_diags_name>.file_min_digits (int) optional (default 6)

        The minimum number of digits used for the iteration number appended to the diagnostic file names.

      • <reduced_diags_name>.histogram_function_abs(t,x,y,z,ux,uy,uz,w) (string)

        A histogram function must be provided for the abscissa axis. t represents the physical time in seconds during the simulation. x, y, z represent particle positions in the unit of meter. ux, uy, uz represent the particle velocities in the unit of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor, \(v/c\) is the particle velocity normalized by the speed of light. w represents the weight.

      • <reduced_diags_name>.histogram_function_ord(t,x,y,z,ux,uy,uz,w) (string)

        A histogram function must be provided for the ordinate axis.

      • <reduced_diags_name>.bin_number_abs (int > 0) and <reduced_diags_name>.bin_number_ord (int > 0)

        These are the number of bins used for the histogram for the abscissa and ordinate axis respectively.

      • <reduced_diags_name>.bin_max_abs (float) and <reduced_diags_name>.bin_max_ord (float)

        These are the maximum value of the bins for the abscissa and ordinate axis respectively. Particles with values outside of these ranges are discarded.

      • <reduced_diags_name>.bin_min_abs (float) and <reduced_diags_name>.bin_min_ord (float)

        These are the minimum value of the bins for the abscissa and ordinate axis respectively. Particles with values outside of these ranges are discarded.

      • <reduced_diags_name>.filter_function(t,x,y,z,ux,uy,uz,w) (string) optional

        Users can provide an expression returning a boolean for whether a particle is taken into account when calculating the histogram. t represents the physical time in seconds during the simulation. x, y, z represent particle positions in the unit of meter. ux, uy, uz represent particle velocities in the unit of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor, \(v/c\) is the particle velocity normalized by the speed of light. w represents the weight.

      • <reduced_diags_name>.value_function(t,x,y,z,ux,uy,uz,w) (string) optional

        Users can provide an expression for the weight used to calculate the number of particles per cell associated with the selected abscissa and ordinate functions and/or the filter function. t represents the physical time in seconds during the simulation. x, y, z represent particle positions in the unit of meter. ux, uy, uz represent particle velocities in the unit of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor, \(v/c\) is the particle velocity normalized by the speed of light. w represents the weight.

      The output is a <reduced_diags_name> folder containing a set of openPMD files. An example input file and a Python script that loads the output of the 2D histogram reduced diagnostics are given in Examples/Tests/histogram2D/.

    • ParticleExtrema

      This type computes the minimum and maximum values of particle position, momentum, gamma, weight, and the \(\chi\) parameter for QED species.

      <reduced_diags_name>.species must be provided, such that the diagnostics are done for this species only.

      The output columns are minimum and maximum position \(x\), \(y\), \(z\); minimum and maximum momentum \(p_x\), \(p_y\), \(p_z\); minimum and maximum gamma \(\gamma\); minimum and maximum weight \(w\); minimum and maximum \(\chi\).

      Note that when the QED parameter \(\chi\) is computed, a field gather is carried out at every output, so this diagnostic can take a long time depending on the simulation size.

    • ChargeOnEB

      This type computes the total surface charge on the embedded boundary (in Coulombs), by using the formula

      \[Q_{tot} = \epsilon_0 \iint \boldsymbol{E} \cdot d\boldsymbol{S}\]

      where the integral is performed over the surface of the embedded boundary.

      When providing <reduced_diags_name>.weighting_function(x,y,z), the computed integral is weighted:

      \[Q = \epsilon_0 \iint \left( \boldsymbol{E} \cdot d\boldsymbol{S} \right) \, weighting(x, y, z)\]

      In particular, by choosing a weighting function which returns either 1 or 0, it is possible to compute the charge on only some part of the embedded boundary.

    • ColliderRelevant

      This diagnostic computes properties of two colliding beams that are relevant for particle colliders. Two species must be specified. Photon species are not supported yet. It is assumed that the two species propagate and collide along the z direction. The output columns (for 3D-XYZ) are the following, where the minimum, average and maximum are taken over the whole species:

      [0]: simulation step (iteration).

      [1]: time (s).

      [2]: time derivative of the luminosity (\(m^{-2}s^{-1}\)) defined as:

      \[\frac{dL}{dt} = 2 c \iiint n_1(x,y,z) n_2(x,y,z) dx dy dz\]

      where \(n_1\), \(n_2\) are the number densities of the two colliding species.

      [3], [4], [5]: If QED is enabled, the minimum, average and maximum values of the quantum parameter \(\chi\) of species 1: \(\chi_{min}\), \(\langle \chi \rangle\), \(\chi_{max}\). If QED is not enabled, these numbers are not computed.

      [6], [7]: The average and standard deviation of the values of the transverse coordinate \(x\) (m) of species 1: \(\langle x \rangle\), \(\sqrt{\langle (x- \langle x \rangle)^2 \rangle}\).

      [8], [9]: The average and standard deviation of the values of the transverse coordinate \(y\) (m) of species 1: \(\langle y \rangle\), \(\sqrt{\langle (y- \langle y \rangle)^2 \rangle}\).

      [10], [11], [12], [13]: The minimum, average, maximum and standard deviation of the angle \(\theta_x = \angle (u_x, u_z)\) (rad) of species 1: \({\theta_x}_{min}\), \(\langle \theta_x \rangle\), \({\theta_x}_{max}\), \(\sqrt{\langle (\theta_x- \langle \theta_x \rangle)^2 \rangle}\).

      [14], [15], [16], [17]: The minimum, average, maximum and standard deviation of the angle \(\theta_y = \angle (u_y, u_z)\) (rad) of species 1: \({\theta_y}_{min}\), \(\langle \theta_y \rangle\), \({\theta_y}_{max}\), \(\sqrt{\langle (\theta_y- \langle \theta_y \rangle)^2 \rangle}\).

      [18], …, [32]: Analogous quantities for species 2.

      For 2D-XZ, \(y\)-related quantities are not output. For 1D-Z, \(x\)-related and \(y\)-related quantities are not output. RZ geometry is not supported yet.

  • <reduced_diags_name>.intervals (string)

    Using the Intervals Parser syntax, this string defines the timesteps at which reduced diagnostics are written to file.

  • <reduced_diags_name>.path (string) optional (default ./diags/reducedfiles/)

    The path where the output file will be stored.

  • <reduced_diags_name>.extension (string) optional (default txt)

    The extension of the output file.

  • <reduced_diags_name>.separator (string) optional (default a whitespace)

    The separator between row values in the output file. The default separator is a whitespace.

  • <reduced_diags_name>.precision (integer) optional (default 14)

    The precision used when writing out the data to the text files.
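For illustration, the following input-file sketch combines the parameters above to record the total particle energy every 100 steps and a 1D histogram of the z positions of an electron species. The diagnostic names ke and histz, the species name electrons, and the bin settings are hypothetical values:

warpx.reduced_diags_names = ke histz
ke.type = ParticleEnergy
ke.intervals = 100
histz.type = ParticleHistogram
histz.intervals = 100
histz.species = electrons
histz.histogram_function(t,x,y,z,ux,uy,uz) = "z"
histz.bin_number = 100
histz.bin_min = 0.0
histz.bin_max = 200.e-6
histz.path = ./diags/reducedfiles/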

Lookup tables and other settings for QED modules

Lookup tables store pre-computed values for functions used by the QED modules. This feature requires compiling with QED=TRUE (and also with QED_TABLE_GEN=TRUE for table generation). An example input snippet is given at the end of this section.

  • qed_bw.lookup_table_mode (string)

    There are three options to prepare the lookup table required by the Breit-Wheeler module:

    • builtin: a built-in table is used (Warning: the table gives reasonable results but its resolution is quite low).

    • generate: a new table is generated. This option requires the Boost Math library (version >= 1.66) and compiling with QED_TABLE_GEN=TRUE. All the following parameters must be specified (table 1 is used to evolve the optical depth of the photons, while table 2 is used for pair generation):

      • qed_bw.tab_dndt_chi_min (float): minimum chi parameter for lookup table 1 (used for the evolution of the optical depth of the photons)

      • qed_bw.tab_dndt_chi_max (float): maximum chi parameter for lookup table 1

      • qed_bw.tab_dndt_how_many (int): number of points to be used for lookup table 1

      • qed_bw.tab_pair_chi_min (float): minimum chi parameter for lookup table 2 (used for pair generation)

      • qed_bw.tab_pair_chi_max (float): maximum chi parameter for lookup table 2

      • qed_bw.tab_pair_chi_how_many (int): number of points to be used for chi axis in lookup table 2

      • qed_bw.tab_pair_frac_how_many (int): number of points to be used for the second axis in lookup table 2 (the second axis is the ratio between the quantum parameter of the less energetic particle of the pair and the quantum parameter of the photon).

      • qed_bw.save_table_in (string): where to save the lookup table

      Alternatively, the lookup table can be generated using a standalone tool (see qed tools section).

    • load: a lookup table is loaded from a pre-generated binary file. The following parameter must be specified:

      • qed_bw.load_table_from (string): name of the lookup table file to read from.

  • qed_qs.lookup_table_mode (string)

    There are three options to prepare the lookup table required by the Quantum Synchrotron module:

    • builtin: a built-in table is used (Warning: the table gives reasonable results but its resolution is quite low).

    • generate: a new table is generated. This option requires the Boost Math library (version >= 1.66) and compiling with QED_TABLE_GEN=TRUE. All the following parameters must be specified (table 1 is used to evolve the optical depth of the particles, while table 2 is used for photon emission):

      • qed_qs.tab_dndt_chi_min (float): minimum chi parameter for lookup table 1 (used for the evolution of the optical depth of electrons and positrons)

      • qed_qs.tab_dndt_chi_max (float): maximum chi parameter for lookup table 1

      • qed_qs.tab_dndt_how_many (int): number of points to be used for lookup table 1

      • qed_qs.tab_em_chi_min (float): minimum chi parameter for lookup table 2 (used for photon emission)

      • qed_qs.tab_em_chi_max (float): maximum chi parameter for lookup table 2

      • qed_qs.tab_em_chi_how_many (int): number of points to be used for chi axis in lookup table 2

      • qed_qs.tab_em_frac_how_many (int): number of points to be used for the second axis in lookup table 2 (the second axis is the ratio between the quantum parameter of the photon and the quantum parameter of the charged particle).

      • qed_qs.tab_em_frac_min (float): minimum value to be considered for the second axis of lookup table 2

      • qed_qs.save_table_in (string): where to save the lookup table

      Alternatively, the lookup table can be generated using a standalone tool (see qed tools section).

    • load: a lookup table is loaded from a pre-generated binary file. The following parameter must be specified:

      • qed_qs.load_table_from (string): name of the lookup table file to read from.

  • qed_bw.chi_min (float): minimum chi parameter to be considered by the Breit-Wheeler engine

    (suggested value: 0.01)

  • qed_qs.chi_min (float): minimum chi parameter to be considered by the Quantum Synchrotron engine

    (suggested value: 0.001)

  • qed_qs.photon_creation_energy_threshold (float) optional (default 2)

    Energy threshold for photon particle creation in \(m_e c^2\) units.

  • warpx.do_qed_schwinger (bool) optional (default 0)

    If this is 1, Schwinger electron-positron pairs can be generated in vacuum in the cells where the EM field is high enough. Activating the Schwinger process requires the code to be compiled with QED=TRUE and PICSAR. If warpx.do_qed_schwinger = 1, Schwinger product species must be specified with qed_schwinger.ele_product_species and qed_schwinger.pos_product_species. The Schwinger process requires either warpx.grid_type = collocated or algo.field_gathering = momentum-conserving (so that different field components are computed at the same location in the grid) and does not currently support mesh refinement, cylindrical coordinates, or single precision.

  • qed_schwinger.ele_product_species (string)

    If Schwinger process is activated, an electron product species must be specified (the name of an existing electron species must be provided).

  • qed_schwinger.pos_product_species (string)

    If Schwinger process is activated, a positron product species must be specified (the name of an existing positron species must be provided).

  • qed_schwinger.y_size (float; in meters)

    If Schwinger process is activated with DIM=2D, a transverse size must be specified. It is used to convert the pair production rate per unit volume into an actual number of created particles. This value should correspond to the typical transverse extent for which the EM field has a very high value (e.g. the beam waist for a focused laser beam).

  • qed_schwinger.xmin,ymin,zmin and qed_schwinger.xmax,ymax,zmax (float) optional (default unlimited)

    When qed_schwinger.xmin and qed_schwinger.xmax are set, they delimit the region within which Schwinger pairs can be created. The same is applicable in the other directions.

  • qed_schwinger.threshold_poisson_gaussian (integer) optional (default 25)

    If the expected number of physical pairs created in a cell at a given timestep is smaller than this threshold, a Poisson distribution is used to draw the actual number of physical pairs created. Otherwise a Gaussian distribution is used. Note that, regardless of this parameter, the number of macroparticles created is at most one per cell per timestep per species (with a weight corresponding to the number of physical pairs created).
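As a sketch, the snippet below loads pre-generated lookup tables for both modules (the table file names are hypothetical) and sets the suggested minimum chi values from above:

qed_bw.lookup_table_mode = load
qed_bw.load_table_from = bw_table
qed_bw.chi_min = 0.01
qed_qs.lookup_table_mode = load
qed_qs.load_table_from = qs_table
qed_qs.chi_min = 0.001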

Checkpoints and restart

WarpX supports checkpoints/restart via AMReX. The checkpoint capability can be turned on with regular diagnostics: <diag_name>.format = checkpoint.

  • amr.restart (string)

    Name of the checkpoint file to restart from. Returns an error if the folder does not exist or if it is not properly formatted.

  • warpx.write_diagnostics_on_restart (bool) optional (default false)

    When true, the diagnostics are written immediately after the restart, at the restart time.
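For example, assuming a diagnostic named chk is used to write checkpoints every 1000 steps (using the usual <diag_name>.intervals parameter), a later run could restart from one of the written checkpoint folders. The names and the checkpoint folder path shown here are hypothetical:

chk.format = checkpoint
chk.intervals = 1000
# on the restarted run:
amr.restart = diags/chk001000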

Intervals parser

WarpX can parse time step interval expressions of the form start:stop:period, e.g. 1:2:3, 4::, 5:6, :, ::10. A comma is used as a separator between groups of intervals, which we call slices. The resulting time steps are the union of all given slices. White spaces are ignored. A single slice can have 0, 1 or 2 colons :, just like numpy slices, but with an inclusive upper bound for stop.

  • For 0 colon the given value is the period

  • For 1 colon the given string is of the type start:stop

  • For 2 colons the given string is of the type start:stop:period

Any value that is not given is set to default. Default is 0 for the start, std::numeric_limits<int>::max() for the stop and 1 for the period. For the 1 and 2 colon syntax, actually having values in the string is optional (this means that ::5, 100 ::10 and 100 : are all valid syntaxes).

All values can be expressions that will be parsed in the same way as other integer input parameters.

Examples

  • something_intervals = 50 -> do something at timesteps 0, 50, 100, 150, etc. (equivalent to something_intervals = ::50)

  • something_intervals = 300:600:100 -> do something at timesteps 300, 400, 500 and 600.

  • something_intervals = 300::50 -> do something at timesteps 300, 350, 400, 450, etc.

  • something_intervals = 105:108,205:208 -> do something at timesteps 105, 106, 107, 108, 205, 206, 207 and 208. (equivalent to something_intervals = 105 : 108 : , 205 : 208 :)

  • something_intervals = : or something_intervals = :: -> do something at every timestep.

  • something_intervals = 167:167,253:253,275:425:50 -> do something at timesteps 167, 253, 275, 325, 375 and 425.

This is essentially the Python slicing syntax, except that the stop is inclusive (0:100 contains 100) and that no colon means that the given value is the period.

Note that if a given period is zero or negative, the corresponding slice is disregarded. For example, something_intervals = -1 deactivates something and something_intervals = ::-1,100:1000:25 is equivalent to something_intervals = 100:1000:25.
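To make these rules concrete, here is a small illustrative Python sketch of the slice semantics described above. It is not WarpX's actual parser; in particular, the stop default is passed in as a max_step argument instead of std::numeric_limits<int>::max(), and parser expressions are not evaluated.

def slice_steps(spec, max_step):
    """Return the set of time steps selected by a single slice string."""
    parts = [p.strip() for p in spec.split(":")]
    if len(parts) == 1:                    # "50" -> the value is the period
        start, stop, period = 0, max_step, int(parts[0])
    else:                                  # "start:stop" or "start:stop:period"
        start = int(parts[0]) if parts[0] else 0
        stop = int(parts[1]) if parts[1] else max_step
        period = int(parts[2]) if len(parts) > 2 and parts[2] else 1
    if period <= 0:                        # zero or negative period: slice is ignored
        return set()
    return set(range(start, stop + 1, period))   # inclusive upper bound

def interval_steps(expression, max_step):
    """Union of all comma-separated slices, e.g. "300:600:100" -> {300, 400, 500, 600}."""
    return set().union(*(slice_steps(s, max_step) for s in expression.split(",")))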

Testing and Debugging

When developing, testing and debugging WarpX, the following options can be considered.

  • warpx.verbose (0 or 1; default is 1 for true)

    Controls how much information is printed to the terminal when running WarpX.

  • warpx.always_warn_immediately (0 or 1; default is 0 for false)

    If set to 1, WarpX immediately prints every warning message as soon as it is generated. It is mainly intended for debug purposes, in case a simulation crashes before a global warning report can be printed.

  • warpx.abort_on_warning_threshold (string: low, medium or high) optional

    Optional threshold to abort as soon as a warning is raised. If the threshold is set, warning messages with priority greater than or equal to the threshold trigger an immediate abort. It is mainly intended for debug purposes, and is best used with warpx.always_warn_immediately=1.

  • amrex.abort_on_unused_inputs (0 or 1; default is 0 for false)

    When set to 1, this option causes the simulation to fail after completion if there were unused parameters. It is mainly intended for continuous integration and automated testing to check that all tests and inputs are adapted to API changes.

  • amrex.use_profiler_syncs (0 or 1; default is 0 for false)

    Adds a synchronization at the start of communication, so any load imbalance will be caught there (the timer is called SyncBeforeComms); then the communication operation runs. This will slow down the run.

  • warpx.serialize_initial_conditions (0 or 1) optional (default 0)

    Serialize the initial conditions for reproducible testing, e.g., in our continuous integration tests. This mainly controls whether OpenMP threading is used for particle initialization.

  • warpx.safe_guard_cells (0 or 1) optional (default 0)

    Run in safe mode, exchanging more guard cells, and more often in the PIC loop (for debugging).

  • ablastr.fillboundary_always_sync (0 or 1) optional (default 0)

    Run all FillBoundary operations on MultiFab to force-synchronize shared nodal points. This slightly increases communication cost and can help to spot missing nodal_sync flags in these operations.
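For instance, a debugging-oriented run might combine several of these flags; the values shown are illustrative:

warpx.verbose = 1
warpx.always_warn_immediately = 1
warpx.abort_on_warning_threshold = medium
amrex.abort_on_unused_inputs = 1
warpx.safe_guard_cells = 1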


Workflows

This section collects typical user workflows and best practices for WarpX.

Extend a Simulation with Python

When running WarpX directly from Python it is possible to interact with the simulation.

For instance, with the step() method of the simulation class, one could run sim.step(nsteps=1) in a loop:

# Preparation: set up the simulation
#   sim = picmi.Simulation(...)
#   ...

steps = 1000
for _ in range(steps):
    sim.step(nsteps=1)

    # do something custom with the sim object

As a more flexible alternative, one can install callback functions, which will execute a given Python function at a specific location in the WarpX simulation loop.

Callback Locations

These are the functions which allow installing user created functions so that they are called at various places along the time step.

The following three functions allow the user to install, uninstall and verify the different callback types.

These functions all take a callback location name (string) and function or instance method as an argument. Note that if an instance method is used, an extra reference to the method’s object is saved.

Functions can be called at the following times:

  • beforeInitEsolve: before the initial solve for the E fields (i.e. before the PIC loop starts)

  • afterinit: immediately after the init is complete

  • beforeEsolve: before the solve for E fields

  • poissonsolver: In place of the computePhi call but only in an electrostatic simulation

  • afterEsolve: after the solve for E fields

  • afterBpush: after the B field advance for electromagnetic solvers

  • afterEpush: after the E field advance for electromagnetic solvers

  • beforedeposition: before the particle deposition (for charge and/or current)

  • afterdeposition: after particle deposition (for charge and/or current)

  • beforestep: before the time step

  • afterstep: after the time step

  • afterdiagnostics: after diagnostic output

  • oncheckpointsignal: on a checkpoint signal

  • onbreaksignal: on a break signal. These callbacks will be the last ones executed before the simulation ends.

  • particlescraper: just after the particle boundary conditions are applied but before lost particles are processed

  • particleloader: at the time that the standard particle loader is called

  • particleinjection: called when particle injection happens, after the position advance and before deposition is called, allowing a user defined particle distribution to be injected each time step

Example that calls the Python function myplots after each step:

from pywarpx.callbacks import installcallback

def myplots():
    # do something here
    pass

installcallback('afterstep', myplots)

# run simulation
sim.step(nsteps=100)

The install can also be done using a Python decorator, which has the prefix callfrom. To use a decorator, the syntax is as follows. This will install the function myplots to be called after each step. The above example is equivalent to the following:

from pywarpx.callbacks import callfromafterstep

@callfromafterstep
def myplots():
    # do something here
    pass

# run simulation
sim.step(nsteps=100)
pywarpx.callbacks.installcallback(name, f)[source]

Installs a function to be called at that specified time.

Adds a function to the list of functions called by this callback.

pywarpx.callbacks.isinstalled(name, f)[source]

Checks if a function is installed for this callback.

pywarpx.callbacks.uninstallcallback(name, f)[source]

Uninstalls the function (so it won’t be called anymore).

Removes the function from the list of functions called by this callback.
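A short sketch using these three functions together (the callback body is a placeholder):

from pywarpx.callbacks import installcallback, isinstalled, uninstallcallback

def my_diagnostic():
    pass  # placeholder: inspect or modify the simulation here

installcallback('afterstep', my_diagnostic)
assert isinstalled('afterstep', my_diagnostic)

# ... run some steps, then stop calling the function ...

uninstallcallback('afterstep', my_diagnostic)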

pyAMReX

Many of the following classes are provided through pyAMReX. After the simulation is initialized, the pyAMReX module can be accessed via

from pywarpx import picmi, libwarpx

# ... simulation definition ...

# equivalent to
#   import amrex.space3d as amr
# for a 3D simulation
amr = libwarpx.amr  # picks the right 1d, 2d or 3d variant

Full details for pyAMReX APIs are documented in the pyAMReX documentation.

Data Access

While the simulation is running, callbacks can have read and write access to the WarpX simulation data in situ.

An important object in the pywarpx.picmi module for data access is Simulation.extension.warpx, which is available only during the simulation run. This object is the Python equivalent to the C++ WarpX simulation class.

class pywarpx.callbacks.WarpX
getistep(lev: int)

Get the current step on mesh-refinement level lev.

gett_new(lev: int)

Get the current physical time on mesh-refinement level lev.

getdt(lev: int)

Get the current physical time step size on mesh-refinement level lev.

multifab(multifab_name: str)

Return MultiFabs by name, e.g., "Efield_aux[x][level=0]", "Efield_cp[x][level=0]", …

The physical fields in WarpX have the following naming:

  • _fp are the “fine” patches, the regular resolution of a current mesh-refinement level

  • _aux are temporary (auxiliary) patches at the same resolution as _fp. They usually include contributions from other levels and can be interpolated for the gather routines of particles.

  • _cp are “coarse” patches, at the same resolution (but not necessarily the same values) as the _fp of level - 1 (only for level 1 and higher).

multi_particle_container()
get_particle_boundary_buffer()
set_potential_on_domain_boundary(potential_[lo/hi]_[x/y/z]: str)

The potential on the domain boundaries can be modified when using the electrostatic solver. This function updates the strings and function parsers which set the domain boundary potentials during the Poisson solve.

set_potential_on_eb(potential: str)

The embedded boundary (EB) conditions can be modified when using the electrostatic solver. This sets the EB potential string and updates the function parser.

evolve(numsteps=-1)

Evolve the simulation the specified number of steps.

finalize(finalize_mpi=1)

Call finalize for WarpX and AMReX. Registered to run at program exit.
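For example, a callback can query the current iteration, physical time, and time step size through these methods. This is a minimal sketch that assumes sim is the picmi.Simulation object set up earlier:

from pywarpx.callbacks import callfromafterstep

@callfromafterstep
def print_step_info():
    warpx = sim.extension.warpx
    lev = 0  # mesh-refinement level
    print(f"step={warpx.getistep(lev)}, t={warpx.gett_new(lev)} s, dt={warpx.getdt(lev)} s")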

The WarpX class also provides read and write access to field MultiFab and ParticleContainer data, shown in the following examples.

Fields

This example accesses the \(E_x(x,y,z)\) field on each mesh-refinement level after every time step and overwrites its values, as a simple demonstration of write access.

from pywarpx import picmi
from pywarpx.callbacks import callfromafterstep

# Preparation: set up the simulation
#   sim = picmi.Simulation(...)
#   ...


@callfromafterstep
def set_E_x():
    warpx = sim.extension.warpx

    # compute
    # iterate over mesh-refinement levels
    for lev in range(warpx.finest_level + 1):
        # data access: fetch the "fine" patch of E_x on this level
        E_x_mf = warpx.multifab(f"Efield_fp[x][level={lev}]")

        # grow (aka guard/ghost/halo) regions
        ngv = E_x_mf.n_grow_vect

        # get every local block of the field
        for mfi in E_x_mf:
            # global index space box, including guards
            bx = mfi.tilebox().grow(ngv)
            print(bx)  # note: global index space of this block

            # numpy representation: non-copying view, including the
            # guard/ghost region;     .to_cupy() for GPU!
            E_x_np = E_x_mf.array(mfi).to_numpy()

            # notes on indexing in E_x_np:
            # - numpy uses locally zero-based indexing
            # - layout is F_CONTIGUOUS by default, just like AMReX

            # notes:
            # Only the next lines are the "HOT LOOP" of the computation.
            # For efficiency, use numpy array operation for speed on CPUs.
            # For GPUs use .to_cupy() above and compute with cupy or numba.
            E_x_np[()] = 42.0


sim.step(nsteps=100)

For further details on how to access GPU data or compute on E_x, please see the pyAMReX documentation.

High-Level Field Wrapper

Note

TODO

Note

TODO: What are the benefits of using the high-level wrapper? TODO: What are the limitations (e.g., in memory usage or compute scalability) of using the high-level wrapper?

Particles
from pywarpx import picmi
from pywarpx.callbacks import callfromafterstep

# Preparation: set up the simulation
#   sim = picmi.Simulation(...)
#   ...

@callfromafterstep
def my_after_step_callback():
    warpx = sim.extension.warpx
    Config = sim.extension.Config

    # data access
    multi_pc = warpx.multi_particle_container()
    pc = multi_pc.get_particle_container_from_name("electrons")

    # compute
    # iterate over mesh-refinement levels
    for lvl in range(pc.finest_level + 1):
        # get every local chunk of particles
        for pti in pc.iterator(pc, level=lvl):
            # compile-time and runtime attributes in SoA format
            soa = pti.soa().to_cupy() if Config.have_gpu else \
                  pti.soa().to_numpy()

            # notes:
            # Only the next lines are the "HOT LOOP" of the computation.
            # For speed, use array operation.

            # write to all particles in the chunk
            # note: careful, if you change particle positions, you might need to
            #       redistribute particles before continuing the simulation step
            soa.real[0][()] = 0.30  # x
            soa.real[1][()] = 0.35  # y
            soa.real[2][()] = 0.40  # z

            # all other attributes: weight, momentum x, y, z, ...
            for soa_real in soa.real[3:]:
                soa_real[()] = 42.0

            # by default empty unless ionization or QED physics is used
            # or other runtime attributes were added manually
            for soa_int in soa.int:
                soa_int[()] = 12


sim.step(nsteps=100)

For further details on how to access GPU data or compute on electrons, please see the pyAMReX documentation.

High-Level Particle Wrapper

Note

TODO: What are the benefits of using the high-level wrapper? TODO: What are the limitations (e.g., in memory usage or compute scalability) of using the high-level wrapper?

Particles can be added to the simulation at specific positions and with specific attribute values:

from pywarpx import particle_containers, picmi

# ...

electron_wrapper = particle_containers.ParticleContainerWrapper("electrons")
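Continuing the snippet above, the wrapper can then be used, for instance, to inject a few particles through the add_particles() method documented below; the positions, proper velocities and weights here are arbitrary illustration values:

import numpy as np

electron_wrapper.add_particles(
    x=np.zeros(5),
    y=np.zeros(5),
    z=np.linspace(0.0, 1.0e-6, 5),
    ux=0.0, uy=0.0, uz=1.0e7,  # proper velocities (m/s)
    w=1.0e5,                   # particle weights
)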
class pywarpx.particle_containers.ParticleContainerWrapper(species_name)[source]

Wrapper around particle containers. This provides a convenient way to query and set data in the particle containers.

Parameters:

species_name (string) – The name of the species to be accessed.

add_particles(x=None, y=None, z=None, ux=None, uy=None, uz=None, w=None, unique_particles=True, **kwargs)[source]

A function for adding particles to the WarpX simulation.

Parameters:
  • species_name (str) – The type of species for which particles will be added

  • x (arrays or scalars) – The particle positions (m) (default = 0.)

  • y (arrays or scalars) – The particle positions (m) (default = 0.)

  • z (arrays or scalars) – The particle positions (m) (default = 0.)

  • ux (arrays or scalars) – The particle proper velocities (m/s) (default = 0.)

  • uy (arrays or scalars) – The particle proper velocities (m/s) (default = 0.)

  • uz (arrays or scalars) – The particle proper velocities (m/s) (default = 0.)

  • w (array or scalars) – Particle weights (default = 0.)

  • unique_particles (bool) – True means the added particles are duplicated by each process; False means the number of added particles is independent of the number of processes (default = True)

  • kwargs (dict) – Containing an entry for all the extra particle attribute arrays. If an attribute is not given it will be set to 0.

add_real_comp(pid_name, comm=True)[source]

Add a real component to the particle data array.

Parameters:
  • pid_name (str) – Name that can be used to identify the new component

  • comm (bool) – Should the component be communicated

deposit_charge_density(level, clear_rho=True, sync_rho=True)[source]

Deposit this species’ charge density in rho_fp in order to access that data via pywarpx.fields.RhoFPWrapper().

Parameters:
  • species_name (str) – The species name that will be deposited.

  • level (int) – The AMR level on which to deposit the charge density.

  • clear_rho (bool) – If True, zero out rho_fp before deposition.

  • sync_rho (bool) – If True, perform MPI exchange and properly set boundary cells for rho_fp.

get_particle_count(local=False)[source]

Get the number of particles of this species in the simulation.

Parameters:

local (bool) – If True the particle count on this processor will be returned. Default False.

Returns:

An integer count of the number of particles

Return type:

int

get_particle_cpu(level=0, copy_to_host=False)[source]

Return a list of numpy or cupy arrays containing the particle ‘cpu’ numbers on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle cpus

Return type:

List of arrays

get_particle_id(level=0, copy_to_host=False)[source]

Return a list of numpy or cupy arrays containing the particle ‘id’ numbers on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle ids

Return type:

List of arrays

get_particle_idcpu(level=0, copy_to_host=False)[source]

Return a list of numpy or cupy arrays containing the particle ‘idcpu’ numbers on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle idcpu

Return type:

List of arrays

get_particle_idcpu_arrays(level, copy_to_host=False)[source]

This returns a list of numpy or cupy arrays containing the particle idcpu data on each tile for this process.

Unless copy_to_host is specified, the data for the arrays are not copied, but share the underlying memory buffer with WarpX. The arrays are fully writeable.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle array data

Return type:

List of arrays

get_particle_int_arrays(comp_name, level, copy_to_host=False)[source]

This returns a list of numpy or cupy arrays containing the particle int array data on each tile for this process.

Unless copy_to_host is specified, the data for the arrays are not copied, but share the underlying memory buffer with WarpX. The arrays are fully writeable.

Parameters:
  • comp_name (str) – The component of the array data that will be returned

  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle array data

Return type:

List of arrays

get_particle_r(level=0, copy_to_host=False)[source]

Return a list of numpy or cupy arrays containing the particle ‘r’ positions on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle r position

Return type:

List of arrays

get_particle_real_arrays(comp_name, level, copy_to_host=False)[source]

This returns a list of numpy or cupy arrays containing the particle real array data on each tile for this process.

Unless copy_to_host is specified, the data for the arrays are not copied, but share the underlying memory buffer with WarpX. The arrays are fully writeable.

Parameters:
  • comp_name (str) – The component of the array data that will be returned

  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle array data

Return type:

List of arrays

get_particle_theta(level=0, copy_to_host=False)[source]

Return a list of numpy or cupy arrays containing the particle theta on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle theta position

Return type:

List of arrays

get_particle_ux(level=0, copy_to_host=False)[source]

Return a list of numpy or cupy arrays containing the particle x momentum on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle x momentum

Return type:

List of arrays

get_particle_uy(level=0, copy_to_host=False)[source]

Return a list of numpy or cupy arrays containing the particle y momentum on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle y momentum

Return type:

List of arrays

get_particle_uz(level=0, copy_to_host=False)[source]

Return a list of numpy or cupy arrays containing the particle z momentum on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle z momentum

Return type:

List of arrays

get_particle_weight(level=0, copy_to_host=False)[source]

Return a list of numpy or cupy arrays containing the particle weight on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle weight

Return type:

List of arrays

get_particle_x(level=0, copy_to_host=False)[source]

Return a list of numpy or cupy arrays containing the particle ‘x’ positions on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle x position

Return type:

List of arrays

get_particle_y(level=0, copy_to_host=False)[source]

Return a list of numpy or cupy arrays containing the particle ‘y’ positions on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle y position

Return type:

List of arrays

get_particle_z(level=0, copy_to_host=False)[source]

Return a list of numpy or cupy arrays containing the particle ‘z’ positions on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle z position

Return type:

List of arrays

get_species_charge_sum(local=False)[source]

Returns the total charge in the simulation due to the given species.

Parameters:

local (bool) – If True return total charge per processor

property idcpu

Return a list of numpy or cupy arrays containing the particle ‘idcpu’ numbers on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle idcpu

Return type:

List of arrays

property nps

Get the number of particles of this species in the simulation.

Parameters:

local (bool) – If True the particle count on this processor will be returned. Default False.

Returns:

An integer count of the number of particles

Return type:

int

property rp

Return a list of numpy or cupy arrays containing the particle ‘r’ positions on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle r position

Return type:

List of arrays

property thetap

Return a list of numpy or cupy arrays containing the particle theta on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle theta position

Return type:

List of arrays

property uxp

Return a list of numpy or cupy arrays containing the particle x momentum on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle x momentum

Return type:

List of arrays

property uyp

Return a list of numpy or cupy arrays containing the particle y momentum on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle y momentum

Return type:

List of arrays

property uzp

Return a list of numpy or cupy arrays containing the particle z momentum on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle z momentum

Return type:

List of arrays

property wp

Return a list of numpy or cupy arrays containing the particle weight on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle weight

Return type:

List of arrays

property xp

Return a list of numpy or cupy arrays containing the particle ‘x’ positions on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle x position

Return type:

List of arrays

property yp

Return a list of numpy or cupy arrays containing the particle ‘y’ positions on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle y position

Return type:

List of arrays

property zp

Return a list of numpy or cupy arrays containing the particle ‘z’ positions on each tile.

Parameters:
  • level (int) – The refinement level to reference (default=0)

  • copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.

Returns:

The requested particle z position

Return type:

List of arrays

The get_particle_real_arrays(), get_particle_int_arrays() and get_particle_idcpu_arrays() functions are called by several utility functions of the form get_particle_{comp_name} where comp_name is one of x, y, z, r, theta, id, cpu, weight, ux, uy or uz.
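
As an illustrative sketch, these getters can be used from Python, for example after the simulation has been initialized, to compute a weighted mean position. The wrapper class name and its constructor (taking a species name) are assumed here to match the particle container wrapper documented above:

import numpy as np

from pywarpx import particle_containers

# Assumed wrapper for the particle container of a species named 'electrons'
elec_wrapper = particle_containers.ParticleContainerWrapper('electrons')

# Per-tile positions and weights on this process; copy_to_host=True forces a
# device-to-host copy on GPU runs, so the result is always numpy arrays
x_tiles = elec_wrapper.get_particle_x(copy_to_host=True)
w_tiles = elec_wrapper.get_particle_weight(copy_to_host=True)

# Concatenate the per-tile arrays and compute the weighted mean position
x = np.concatenate(x_tiles) if x_tiles else np.zeros(0)
w = np.concatenate(w_tiles) if w_tiles else np.zeros(0)
if x.size > 0:
    print('weighted <x> =', np.average(x, weights=w))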

Diagnostics

Various diagnostics are also accessible from Python. This includes getting the deposited or total charge density from a given species as well as accessing the scraped particle buffer. See the example in Examples/Tests/ParticleBoundaryScrape for a reference on how to interact with scraped particle data.

class pywarpx.particle_containers.ParticleBoundaryBufferWrapper[source]

Wrapper around particle boundary buffer containers. This provides a convenient way to query data in the particle boundary buffer containers.

clear_buffer()[source]

Clear the buffer that holds the particles lost at the boundaries.

get_particle_boundary_buffer(species_name, boundary, comp_name, level)[source]

This returns a list of numpy or cupy arrays containing the particle array data for a species that has been scraped by a specific simulation boundary.

The data for the arrays are not copied, but share the underlying memory buffer with WarpX. The arrays are fully writeable.

A simple example of particle-boundary interaction (reflection) can be found at https://github.com/ECP-WarpX/WarpX/blob/319e55b10ad4f7c71b84a4fb21afbafe1f5b65c2/Examples/Tests/particle_boundary_interaction/PICMI_inputs_rz.py.

Parameters:
  • species_name (str) – The species name that the data will be returned for.

  • boundary (str) – The boundary from which to get the scraped particle data in the form x/y/z_hi/lo or eb.

  • comp_name (str) – The component of the array data that will be returned: “x”, “y”, “z”, “ux”, “uy”, “uz”, “w”, “stepScraped”, “deltaTimeScraped”, and, if boundary=’eb’, “nx”, “ny”, “nz”

  • level (int) – Which AMR level to retrieve scraped particle data from.

get_particle_boundary_buffer_size(species_name, boundary, local=False)[source]

This returns the number of particles that have been scraped so far in the simulation from the specified boundary and of the specified species.

Parameters:
  • species_name (str) – Return the number of scraped particles of this species

  • boundary (str) – The boundary from which to get the scraped particle data in the form x/y/z_hi/lo

  • local (bool) – Whether to only return the number of particles in the current processor’s buffer
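
As a hypothetical usage sketch (assuming the wrapper is constructed without arguments and that the simulation scrapes a species named 'electrons' at the upper z boundary), the buffer could be queried as follows:

from pywarpx import particle_containers

# Assumed construction of the boundary buffer wrapper (no arguments)
buffer_wrapper = particle_containers.ParticleBoundaryBufferWrapper()

# Number of 'electrons' scraped so far at the upper z boundary (all ranks)
n_scraped = buffer_wrapper.get_particle_boundary_buffer_size('electrons', 'z_hi')

# z positions of the scraped particles on AMR level 0; these arrays share
# memory with WarpX and are not copies
z_scraped = buffer_wrapper.get_particle_boundary_buffer('electrons', 'z_hi', 'z', 0)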

Modify Solvers

From Python, one can also replace numerical solvers in the PIC loop or add new physical processes into the time step loop. Examples:

  • Capacitive Discharge: replaces the Poisson solver of an electrostatic simulation (default: MLMG) with a Python function that uses SuperLU to directly solve the Poisson equation.

Domain Decomposition

WarpX relies on a spatial domain decomposition for MPI parallelization. It provides two different ways for users to specify this decomposition, a simple way recommended for most users, and a flexible way recommended if more control is desired. The flexible method is required for dynamic load balancing to be useful.

1. Simple Method

The first and simplest method is to provide the warpx.numprocs = nx ny nz parameter, either at the command line or somewhere in your inputs deck. In this case, WarpX will split up the overall problem domain into exactly the specified number of subdomains, or Boxes in the AMReX terminology, with the data defined on each Box having its own guard cells. The product of nx, ny, and nz must be exactly the desired number of MPI ranks. Note that, because there is exactly one Box per MPI rank when run this way, dynamic load balancing will not be possible, as there is no way of shifting Boxes around to achieve a more even load. This is the approach recommended for new users as it is the easiest to use.
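
For example, a minimal inputs sketch for a run on 16 MPI ranks (the particular split below is purely illustrative) could be:

# 2 x 2 x 4 = 16 Boxes, exactly one per MPI rank
warpx.numprocs = 2 2 4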

Note

If warpx.numprocs is not specified, WarpX will fall back on using the amr.max_grid_size and amr.blocking_factor parameters, described below.

2. More General Method

The second way of specifying the domain decomposition provides greater flexibility and enables dynamic load balancing, but is not as easy to use. In this method, the user specifies inputs parameters amr.max_grid_size and amr.blocking_factor, which can be thought of as the maximum and minimum allowed Box sizes. Now, the overall problem domain (specified by the amr.ncell input parameter) will be broken up into some number of Boxes with the specified characteristics. By default, WarpX will make the Boxes as big as possible given the constraints.

For example, if amr.ncell = 768 768 768, amr.max_grid_size =  128, and amr.blocking_factor = 32, then AMReX will make 6 Boxes in each direction, for a total of 216 (the amr.blocking_factor does not factor in yet; however, see the section on mesh refinement below). If this problem is then run on 54 MPI ranks, there will be 4 boxes per rank initially. This problem could be run on as many as 216 ranks without performing any splitting.
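
The inputs lines corresponding to this example read:

amr.ncell = 768 768 768
amr.max_grid_size = 128
amr.blocking_factor = 32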

Note

Both amr.ncell and amr.max_grid_size must be divisible by amr.blocking_factor, in each direction.

When WarpX is run using this approach to domain decomposition, the number of MPI ranks does not need to be exactly equal to the number of Boxes. Note also that if you run WarpX with more MPI ranks than there are Boxes on the base level, WarpX will attempt to split the available Boxes until there is at least one for each rank to work on; this may cause it to violate the constraints of amr.max_grid_size and amr.blocking_factor.

Note

The AMReX documentation on Grid Creation may also be helpful.

You can also specify a separate max_grid_size and blocking_factor for each direction, using the parameters amr.max_grid_size_x, amr.max_grid_size_y, etc. This allows you to request, for example, a “pencil”-type domain decomposition that is long in one direction, as sketched below. Note that, in RZ geometry, the parameters corresponding to the longitudinal direction are amr.max_grid_size_y and amr.blocking_factor_y.
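
For instance, a hypothetical pencil decomposition elongated along z could be requested with something like the following (the values are purely illustrative):

# small transverse Boxes, elongated along z
amr.max_grid_size_x = 64
amr.max_grid_size_y = 64
amr.max_grid_size_z = 1024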

3. Performance Considerations

In terms of performance, there is in general a trade-off. Having many small Boxes provides flexibility for load balancing; however, the cost is increased time spent in communication, due to surface-to-volume effects, and increased kernel launch overhead when running on GPUs. The ideal number of Boxes per rank depends on how important dynamic load balancing is for your problem. If your problem is intrinsically well-balanced, as in a uniform plasma, then having a few large Boxes is best. But if the problem is non-uniform and achieving a good load balance is critical for performance, having more, smaller Boxes can be worth it. In general, we find that running with something in the range of 4-8 Boxes per process is a good compromise for most problems.

Note

For specific information on the dynamic load balancer used in WarpX, visit the Load Balancing page on the AMReX documentation.

The best values for these parameters can also depend strongly on a number of numerical parameters:

  • Algorithms used (Maxwell/spectral field solver, filters, order of the particle shape factor)

  • Number of guard cells (that depends on the particle shape factor and the type and order of the Maxwell solver, the filters used, etc.)

  • Number of particles per cell, and the number of species

and the details of the on-node parallelization and computer architecture used for the run:

  • GPU or CPU

  • Number of OpenMP threads

  • Amount of high-bandwidth memory.

Because these parameters put additional constraints on the domain size for a simulation, it can be cumbersome to calculate the number of cells and the physical size of the computational domain for a given resolution. This Python script does it automatically.

When using the RZ spectral solver, the values of amr.max_grid_size and amr.blocking_factor are constrained, since the solver requires that the full radial extent be contained within each block. For the radial direction, any input is ignored and the max grid size and blocking factor are both set equal to the number of radial cells. For the longitudinal direction, the blocking factor has a minimum size of 8, allowing the computational domain of each block to be large enough relative to the guard cells for reasonable performance; however, the max grid size and blocking factor must also be small enough so that there will be at least one block per processor. If the max grid size and/or blocking factor are too large, they will be silently reduced as needed. If there are too many processors, so that there are not enough blocks for the number of processors, WarpX will abort.

4. Mesh Refinement

With mesh refinement, the above picture is more complicated, as in general the number of Boxes cannot be predicted at the start of the simulation. The decomposition of the base level will proceed as outlined above. The refined region, however, will be covered by some number of Boxes whose sizes are consistent with amr.max_grid_size and amr.blocking_factor. With mesh refinement, the blocking factor is important, as WarpX may decide to use Boxes smaller than amr.max_grid_size so as not to over-refine outside of the requested area. Note that you can specify a vector of values to make these parameters vary by level. For example, amr.max_grid_size = 128 64 will make the max grid size 128 on level 0 and 64 on level 1.

In general, the above performance considerations apply: varying these values such that there are 4-8 Boxes per rank on each level is a good guideline.

Visualizing a distribution mapping

Via its reduced diagnostics, WarpX provides an output called LoadBalanceCosts, which allows for visualization of a simulation’s distribution mapping and computational costs. Here we demonstrate the workflow for generating these data and using them to plot distribution mappings and load balance costs.

Generating the data

To generate ‘Load Balance Costs’ reduced diagnostics output, WarpX should be run with the following lines added to the input file (the name of the reduced diagnostics file, LBC, and interval in steps to output reduced diagnostics data, 100, may be changed as needed):

warpx.reduced_diags_names = LBC
LBC.type = LoadBalanceCosts
LBC.intervals = 100

The line warpx.reduced_diags_names = LBC sets the name of the reduced diagnostics output file to LBC. The next line, LBC.type = LoadBalanceCosts, tells WarpX that the reduced diagnostic is of type LoadBalanceCosts and instructs WarpX to record costs and rank layouts. The final line, LBC.intervals = 100, controls the interval for output of this reduced diagnostic’s data.

Loading and plotting the data

After generating data (called LBC_knapsack.txt and LBC_sfc.txt in the example below), the following Python code, along with the helper class in plot_distribution_mapping.py, can be used to read the data:

# Math
import numpy as np
import random

# Plotting
import matplotlib.pyplot as plt
import matplotlib as mpl
from matplotlib.colors import ListedColormap
from mpl_toolkits.axes_grid1 import make_axes_locatable

# Data handling
import plot_distribution_mapping as pdm

sim_knapsack = pdm.SimData('LBC_knapsack.txt', # Data directory
                           [2800],             # Files to process
                           is_3D=False         # if this is a 2D sim
                          )
sim_sfc = pdm.SimData('LBC_sfc.txt', [2800])

# Set reduced diagnostics data for step 2800
for sim in [sim_knapsack, sim_sfc]: sim(2800)

For 2D data, the following function can be used for visualization of distribution mappings:

# Plotting -- we know beforehand the data is 2D
def plot(sim):
    """
    Plot MPI rank layout for a set of `LoadBalanceCosts` reduced diagnostics
    (2D) data.

    Arguments:
    sim -- SimData class with data (2D) loaded for desired iteration
    """
    # Make first cmap
    cmap = plt.cm.nipy_spectral
    cmaplist = [cmap(i) for i in range(cmap.N)][::-1]
    unique_ranks = np.unique(sim.rank_arr)
    sz = len(unique_ranks)
    cmap = mpl.colors.LinearSegmentedColormap.from_list(
        'my_cmap', cmaplist, sz) # create the new map

    # Make cmap from 1 --> 96 then randomize
    cmaplist= [cmap(i) for i in range(sz)]
    random.Random(6).shuffle(cmaplist)
    cmap = mpl.colors.LinearSegmentedColormap.from_list(
        'my_cmap', cmaplist, sz) # create the new map

    # Define the bins and normalize
    bounds = np.linspace(0, sz, sz + 1)
    norm = mpl.colors.BoundaryNorm(bounds, sz)

    my, mx = sim.rank_arr.shape
    xcoord, ycoord = np.linspace(0,mx,mx+1), np.linspace(0,my,my+1)
    im = plt.pcolormesh(xcoord, ycoord, sim.rank_arr,
                        cmap=cmap, norm=norm)

    # Grid lines
    plt.ylabel('$j$')
    plt.xlabel('$i$')
    plt.minorticks_on()
    plt.hlines(ycoord, xcoord[0], xcoord[-1],
               alpha=0.7, linewidth=0.3, color='lightgrey')
    plt.vlines(xcoord, ycoord[0], ycoord[-1],
               alpha=0.7, linewidth=0.3, color='lightgrey')

    plt.gca().set_aspect('equal')

    # Center rank label
    for j in range(my):
        for i in range(mx):
            text = plt.gca().text(i+0.5, j+0.5, int(sim.rank_arr[j][i]),
                                  ha="center", va="center",
                                  color="w", fontsize=8)

    # Colorbar
    divider = make_axes_locatable(plt.gca())
    cax = divider.new_horizontal(size="5%", pad=0.05)
    plt.gcf().add_axes(cax)
    cb=plt.gcf().colorbar(im, label='rank', cax=cax, orientation="vertical")
    minorticks = np.linspace(0, 1, len(unique_ranks) + 1)
    cb.ax.yaxis.set_ticks(minorticks, minor=True)

The function can be used as follows:

fig, axs = plt.subplots(1, 2, figsize=(12, 6))
plt.sca(axs[0])
plt.title('Knapsack')
plot(sim_knapsack)
plt.sca(axs[1])
plt.title('SFC')
plot(sim_sfc)
plt.tight_layout()

This generates plots like in [fig:knapsack_sfc_distribution_mapping_2D]:

Sample distribution mappings from simulations with knapsack (left) and space-filling curve (right) policies for update of the distribution mapping when load balancing.

Similarly, the computational costs per box can be plotted with the following code:

fig, axs = plt.subplots(1, 2, figsize=(12, 6))
plt.sca(axs[0])
plt.title('Knapsack')
plt.pcolormesh(sim_knapsack.cost_arr)
plt.sca(axs[1])
plt.title('SFC')
plt.pcolormesh(sim_sfc.cost_arr)

for ax in axs:
    plt.sca(ax)
    plt.ylabel('$j$')
    plt.xlabel('$i$')
    ax.set_aspect('equal')

plt.tight_layout()

This generates plots like in [fig:knapsack_sfc_costs_2D]:

Sample computational cost per box from simulations with knapsack (left) and space-filling curve (right) policies for update of the distribution mapping when load balancing.

Loading 3D data works the same as loading 2D data, but this time the cost and rank arrays will be 3 dimensional. Here we load and plot some example 3D data (LBC_3D.txt) from a simulation run on 4 MPI ranks. Particles fill the box from \(k=0\) to \(k=1\).

sim_3D = pdm.SimData('LBC_3D.txt', [1,2,3])
sim_3D(1)

# Plotting -- we know beforehand the data is 3D
def plot_3D(sim, j0):
    """
    Plot MPI rank layout for a set of `LoadBalanceCosts` reduced diagnostics
    (3D) data.

    Arguments:
    sim -- SimData class with data (3D) loaded for desired iteration
    j0 -- slice along j direction to plot ik slice
    """
    # Make first cmap
    cmap = plt.cm.viridis
    cmaplist = [cmap(i) for i in range(cmap.N)][::-1]
    unique_ranks = np.unique(sim.rank_arr)
    sz = len(unique_ranks)
    cmap = mpl.colors.LinearSegmentedColormap.from_list(
        'my_cmap', cmaplist, sz) # create the new map

    # Make cmap from 1 --> 96 then randomize
    cmaplist= [cmap(i) for i in range(sz)]
    random.Random(6).shuffle(cmaplist)
    cmap = mpl.colors.LinearSegmentedColormap.from_list(
        'my_cmap', cmaplist, sz) # create the new map

    # Define the bins and normalize
    bounds = np.linspace(0, sz, sz + 1)
    norm = mpl.colors.BoundaryNorm(bounds, sz)

    mz, my, mx = sim.rank_arr.shape
    xcoord, ycoord, zcoord = (np.linspace(0,mx,mx+1), np.linspace(0,my,my+1),
                              np.linspace(0,mz,mz+1))
    im = plt.pcolormesh(xcoord, zcoord, sim.rank_arr[:,j0,:],
                        cmap=cmap, norm=norm)

    # Grid lines
    plt.ylabel('$k$')
    plt.xlabel('$i$')
    plt.minorticks_on()
    plt.hlines(zcoord, xcoord[0], xcoord[-1],
               alpha=0.7, linewidth=0.3, color='lightgrey')
    plt.vlines(xcoord, zcoord[0], zcoord[-1],
               alpha=0.7, linewidth=0.3, color='lightgrey')

    plt.gca().set_aspect('equal')

    # Center rank label
    for k in range(mz):
        for i in range(mx):
            text = plt.gca().text(i+0.5, k+0.5, int(sim.rank_arr[k][j0][i]),
                                  ha="center", va="center",
                                  color="red", fontsize=8)

    # Colorbar
    divider = make_axes_locatable(plt.gca())
    cax = divider.new_horizontal(size="5%", pad=0.05)
    plt.gcf().add_axes(cax)
    cb=plt.gcf().colorbar(im, label='rank', cax=cax, orientation="vertical")
    ticks = np.linspace(0, 1, len(unique_ranks)+1)
    cb.ax.yaxis.set_ticks(ticks)
    cb.ax.yaxis.set_ticklabels([0, 1, 2, 3, " "])

fig, axs = plt.subplots(2, 2, figsize=(8, 8))
for j,ax in enumerate(axs.flatten()):
    plt.sca(ax)
    plt.title('j={}'.format(j))
    plot_3D(sim_3D, j)
    plt.tight_layout()

This generates plots like in [fig:distribution_mapping_3D]:

Sample distribution mappings from 3D simulations, visualized as slices in the \(ik\) plane along \(j\).

Debugging the code

Sometimes, the code does not give you the result that you are expecting. This can be due to a variety of reasons, from misunderstandings of or changes in the input parameters, to system-specific quirks, to bugs. You might also want to debug your code as you implement new features in WarpX during development.

This section gives a step-by-step guidance on how to systematically check what might be going wrong.

Debugging Workflow

Try the following steps to debug a simulation:

  1. Check the output text file, usually called output.txt: are there warnings or errors present?

  2. On an HPC system, look for the job output and error files, usually called WarpX.e... and WarpX.o.... Read long messages from the top and follow potential guidance.

  3. If your simulation already created output data files: Check if they look reasonable before the problem occurred; are the initial conditions of the simulation as you expected? Do you spot numerical artifacts or instabilities that could point to missing resolution or unexpected/incompatible numerical parameters?

  4. Did the job output files indicate a crash? Check the Backtrace.<mpirank> files for the location of the code that triggered the crash. Backtraces are read from bottom (high-level) to top (most specific line that crashed).

    1. Was this a segmentation fault in C++, but the run was controlled from Python (PICMI)? To get the last called Python line for the backtrace, run again and add the Python faulthandler, e.g., with python3 -X faulthandler PICMI_your_script_here.py.

  5. Try to make the reproducible scenario as small as possible by modifying the inputs file. Reduce number of cells, particles and MPI processes to something as small and as quick to execute as possible. The next steps in debugging will increase runtime, so you will benefit from a fast reproducer.

  6. Consider adding runtime debug options that can narrow down typical causes in numerical implementations.

  7. In case of a crash, backtraces can be more detailed if you re-compile with debug flags: for example, try compiling with -DCMAKE_BUILD_TYPE=RelWithDebInfo (some slowdown) or even -DCMAKE_BUILD_TYPE=Debug (this will make the simulation much slower) and rerun; see the example configure commands after this list.

  8. If debug builds are too costly, try instead compiling with -DAMReX_ASSERTIONS=ON to activate more checks and rerun.

  9. If the problem looks like a memory violation, this could be from an invalid field or particle index access. Try compiling with -DAMReX_BOUND_CHECK=ON (this will make the simulation very slow), and rerun.

  10. If the problem looks like random memory might be used, try initializing memory with signaling Not-a-Number (NaN) values through the runtime option fab.init_snan = 1. Further useful runtime options are amrex.fpe_trap_invalid, amrex.fpe_trap_zero and amrex.fpe_trap_overflow (see details in the AMReX link below).

  11. On Nvidia GPUs, if you suspect the problem might be a race condition due to a missing host / device synchronization, set the environment variable export CUDA_LAUNCH_BLOCKING=1 and rerun.

  12. Consider simplifying your input options and re-adding more options after having found a working baseline.
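
As a sketch of the recompilation steps mentioned in items 7-9 above, assuming a standard out-of-source CMake build of WarpX run from the source directory:

# item 7: reconfigure with debug information, then rebuild
cmake -S . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build build -j 4

# items 8-9: alternatively, enable extra AMReX checks, then rebuild
cmake -S . -B build -DAMReX_ASSERTIONS=ON -DAMReX_BOUND_CHECK=ON
cmake --build build -j 4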

For more information, see also the AMReX Debugging Manual.

Last but not least: the community of WarpX developers and users can help if you get stuck. Collect your findings from the steps above; describe where and what you are running and how you installed the code; and describe the issue you are seeing, including details, the input files used, and what you already tried. Can you reproduce the problem with a smaller setup (less parallelism and/or less resolution)? Report these details in a WarpX GitHub issue.

Debuggers

See the AMReX debugger section on additional runtime parameters to

  • disable backtraces

  • rethrow exceptions

  • avoid AMReX-level signal handling

You will need to set those runtime options to work directly with debuggers.

Typical Error Messages

By default, the code is run in Release mode (see the compilation options). That means code errors will likely show up as symptoms of earlier errors in the code, instead of directly pointing to the underlying line that caused the error.

For instance, we have the following checks in Release mode:

Particles shape does not fit within tile (CPU) or guard cells (GPU) used for charge deposition
Particles shape does not fit within tile (CPU) or guard cells (GPU) used for current deposition

which prevent particles whose positions violate the local definitions of guard cells from causing confusing errors in charge/current deposition.

In such a case, as described above, rebuild and rerun in Debug mode before searching further for the bug. Usually, the bug is from NaN or infinite numbers assigned to particles or fields earlier in the code or from ill-defined guard sizes. Building in debug mode will likely move the first thrown error to an earlier location in the code, which is then closer to the underlying cause.

Then, continue following the workflow above, adding more compilation guards and runtime flags that can trap array bound violations and invalid floating point values.

Generate QED lookup tables using the standalone tool

We provide tools to generate the QED lookup tables and to convert them into a human-readable format. These tools can be compiled with CMake by setting the flag WarpX_QED_TOOLS=ON (this requires both the PICSAR and Boost libraries). The tools are compiled alongside the WarpX executable in the folder bin. We report here the help messages displayed by the tools:

$ ./qed_table_reader -h
### QED Table Reader ###
Command line options:
-h [NO ARG] Prints all command line arguments
-i [STRING] Name of the file to open
--table [STRING] Either BW (Breit-Wheeler) or QS (Quantum Synchrotron)
--mode [STRING] Precision of the calculations: either DP (double) or SP (single)
-o [STRING] filename to save the lookup table in human-readable format

$ ./qed_table_generator -h
### QED Table Generator ###
Command line options:
-h [NO ARG] Prints all command line arguments
--table [STRING] Either BW (Breit-Wheeler) or QS (Quantum Synchrotron)
--mode [STRING] Precision of the calculations: either DP (double) or SP (single)
--dndt_chi_min [DOUBLE] minimum chi parameter for the dNdt table
--dndt_chi_max [DOUBLE] maximum chi parameter for the dNdt table
--dndt_how_many [INTEGR] number of points in the dNdt table
--pair_chi_min [DOUBLE] minimum chi for the pair production table (BW only)
--pair_chi_max [DOUBLE] maximum chi for the pair production table (BW only)
--pair_chi_how_many [INTEGR] number of chi points in the pair production table (BW only)
--pair_frac_how_many [INTEGR] number of frac points in the pair production table (BW only)
--em_chi_min [DOUBLE] minimum chi for the photon emission table (QS only)
--em_chi_max [DOUBLE] maximum chi for the photon emission production table (QS only)
--em_frac_min [DOUBLE] minimum frac for the photon emission production table (QS only)
--em_chi_how_many [INTEGR] number of chi points in the photon emission table (QS only)
--em_frac_how_many [INTEGR] number of frac points in the photon emission table (QS only)
-o [STRING] filename to save the lookup table

These tools are meant to be compatible with WarpX: qed_table_generator should generate tables that can be loaded into WarpX, and qed_table_reader should be able to read tables generated with WarpX. It is not safe to use these tools to generate a table on a machine whose endianness differs from that of the machine where the table is used.
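
For instance, a hypothetical invocation that generates a Breit-Wheeler table in double precision and then converts it to a human-readable file (all numerical values below are purely illustrative) could look like:

$ ./qed_table_generator --table BW --mode DP \
    --dndt_chi_min 0.01 --dndt_chi_max 100 --dndt_how_many 256 \
    --pair_chi_min 0.01 --pair_chi_max 100 --pair_chi_how_many 256 \
    --pair_frac_how_many 256 -o bw_table
$ ./qed_table_reader -i bw_table --table BW --mode DP -o bw_table.txt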

Plot timestep duration

We provide a simple Python script to generate plots of the timestep duration from the standard output of WarpX (provided that warpx.verbose is set to 1): plot_timestep_duration.py.

If the standard output of a simulation has been redirected to a file named log_file, the script can be used as follows:

python plot_timestep_duration.py log_file

The script generates two pictures: log_file_ts_duration.png, which shows the duration of each timestep in seconds as a function of the timestep number, and log_file_ts_cumulative_duration.png, which shows the total duration of the simulation as a function of the timestep number.

Predicting the Number of Guard Cells for PSATD Simulations

When the computational domain is decomposed into parallel subdomains and the pseudo-spectral analytical time-domain (PSATD) method is used to solve Maxwell’s equations (by setting algo.maxwell_solver = psatd in the input file), the number of guard cells used to exchange fields between neighboring subdomains can be chosen based on the extent of the stencil of the leading term in Maxwell’s equations, in Fourier space. A measure of such a stencil can be obtained by computing the inverse Fourier transform of the given term along a chosen axis and averaging the result over the remaining axes in Fourier space. The idea is to look at how quickly such stencils fall off to machine precision, as a function of their extent in units of grid cells, and consequently identify the number of cells after which the stencils can be truncated, with the aim of balancing numerical accuracy and locality. See (Zoni et al., 2021) for reference.

A user can run the Python script Stencil.py, located in ./Tools/DevUtils, in order to compute such stencils and estimate the number of guard cells needed for a given PSATD simulation with domain decomposition. In particular, the script computes the minimum number of guard cells for a given error threshold, that is, the minimum number of guard cells such that the stencil measure is not larger than the error threshold. The user can modify the input parameters set in the main function in order to reproduce the simulation setup. These parameters include: cell size, time step, spectral order, Lorentz boost, whether the PSATD algorithm is based on the Galilean scheme, and error threshold (this is not an input parameter of a WarpX simulation, but rather an empirical error threshold chosen to balance numerical accuracy and locality, as mentioned above).

Archiving

Archiving simulation inputs, scripts and output data is a common need for computational physicists. Here are some popular tools and workflows to make archiving easy.

HPC Systems: HPSS

A very common tape filesystem is HPSS, e.g., on NERSC or OLCF.

  • What’s in my archive file system? hsi ls

  • Already something in my archive location? hsi ls 2019/cool_campaign/ as usual

  • Let’s create a neat directory structure:

    • new directory on the archive: hsi mkdir 2021

    • create sub-dirs per campaign as usual: hsi mkdir 2021/reproduce_paper

  • Create an archive of a simulation: htar -cvf 2021/reproduce_paper/sim_042.tar /global/cfs/cdirs/m1234/ahuebl/reproduce_paper/sim_042

    • This copies all files over to the tape filesystem and stores them as a single .tar archive

    • The first argument here will be the new archive .tar file on the archive file system, all following arguments (can be multiple, separated by a space) are locations to directories and files on the parallel file system.

    • Don’t be confused, these tools also create an index .tar.idx file along it; just leave that file be and don’t interact with it

  • Change permissions of your archive, so your team can read your files:

    • Check the unix permissions via hsi ls -al 2021/ and hsi ls -al 2021/reproduce_paper/

    • Files must be group (g) readable (r): hsi chmod g+r 2021/reproduce_paper/sim_042.tar

    • Directories must be group (g) readable (r) and group accessible (x): hsi chmod -R g+rx 2021

  • Restore things:

    • mkdir here_we_restore

    • cd here_we_restore

    • htar -xvf 2021/reproduce_paper/sim_042.tar

      • this copies the .tar file back from tape to our parallel filesystem and extracts its content in the current directory

Argument meaning: -c create; -x extract; -v verbose; -f tar filename. That’s it, folks!

Note

Sometimes, for large dirs, htar takes a while. You could then consider running it as part of a (single-node/single-cpu) job script.

Desktops/Laptops: Cloud Drives

Even for small simulation runs, it is worth creating data archives. A good location for such an archive might be the cloud storage provided by one’s institution.

Tools like rclone can help with this, e.g., to quickly sync a large amount of directories to a Google Drive.
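
For example, with an rclone remote already configured (here hypothetically named gdrive), a campaign directory could be mirrored with:

rclone sync 2021/reproduce_paper gdrive:archive/2021/reproduce_paper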

Asynchronous File Copies: Globus

The scientific data service Globus makes it easy to perform large-scale data copies, between HPC centers as well as local computers, through a graphical user interface. Copies can be kicked off asynchronously, often use dedicated internet backbones, and are checked upon completion.

Many HPC centers also add their archives as a storage endpoint, and one can download a client program to add one’s desktop/laptop as an endpoint as well.

Scientific Data for Publications

It is good practice to make computational results accessible, scrutinizable and ideally even reusable.

For data artifacts up to approximately 50 GB, consider using free services like Zenodo and Figshare to store supplementary materials of your publications.

For more information, see the open science movement, open data and open access.

Note

More information, guidance and templates will be posted here in the future.

Training a Surrogate Model from WarpX Data

Suppose we have a WarpX simulation that we wish to replace with a neural network surrogate model, for example a simulation defined by the following input script.

Python Input for Training Simulation
#!/usr/bin/env python3
import math

import numpy as np

from pywarpx import picmi

# Physical constants
c = picmi.constants.c
q_e = picmi.constants.q_e
m_e = picmi.constants.m_e
m_p = picmi.constants.m_p
ep0 = picmi.constants.ep0

# Number of cells
dim = '3'
nx = ny = 128
nz = 35328 #17664 #8832
if dim == 'rz':
    nr = nx//2

# Computational domain
rmin =  0.
rmax =  128e-6
zmin = -180e-6
zmax =  0.

# Number of processes for static load balancing
# Check with your submit script
num_procs = [1, 1, 64*4]
if dim == 'rz':
    num_procs = [1, 64]

# Number of time steps
gamma_boost = 60.
beta_boost = np.sqrt(1.-gamma_boost**-2)

# Create grid
if dim == 'rz':
    grid = picmi.CylindricalGrid(
        number_of_cells=[nr, nz],
        guard_cells=[32, 32],
        n_azimuthal_modes=2,
        lower_bound=[rmin, zmin],
        upper_bound=[rmax, zmax],
        lower_boundary_conditions=['none', 'damped'],
        upper_boundary_conditions=['none', 'damped'],
        lower_boundary_conditions_particles=['absorbing', 'absorbing'],
        upper_boundary_conditions_particles=['absorbing', 'absorbing'],
        moving_window_velocity=[0., c],
        warpx_max_grid_size=256,
        warpx_blocking_factor=64)
else:
    grid = picmi.Cartesian3DGrid(
        number_of_cells=[nx, ny, nz],
        guard_cells=[11, 11, 12],
        lower_bound=[-rmax, -rmax, zmin],
        upper_bound=[rmax, rmax, zmax],
        lower_boundary_conditions=['periodic', 'periodic', 'damped'],
        upper_boundary_conditions=['periodic', 'periodic', 'damped'],
        lower_boundary_conditions_particles=['periodic', 'periodic', 'absorbing'],
        upper_boundary_conditions_particles=['periodic', 'periodic', 'absorbing'],
        moving_window_velocity=[0., 0., c],
        warpx_max_grid_size=256,
        warpx_blocking_factor=32)


# plasma region
plasma_rlim = 100.e-6
N_stage = 15
L_plasma_bulk = 0.28
L_ramp = 1.e-9
L_ramp_up = L_ramp
L_ramp_down = L_ramp
L_stage = L_plasma_bulk + 2*L_ramp

# focusing
# lens external fields
beam_gamma1 = 15095
lens_focal_length = 0.015
lens_width = 0.003

stage_spacing = L_plasma_bulk + 2*lens_focal_length

def get_species_of_accelerator_stage(stage_idx, stage_zmin, stage_zmax,
                        stage_xmin=-plasma_rlim, stage_xmax=plasma_rlim,
                        stage_ymin=-plasma_rlim, stage_ymax=plasma_rlim,
                        Lplus = L_ramp_up, Lp = L_plasma_bulk,
                        Lminus = L_ramp_down):
    # Parabolic density profile
    n0 = 1.7e23
    Rc = 40.e-6
    Lstage = Lplus + Lp + Lminus
    if not np.isclose(stage_zmax-stage_zmin, Lstage):
        print('Warning: zmax disagrees with stage length')
    parabolic_distribution = picmi.AnalyticDistribution(
        density_expression=
            f'n0*(1.+4.*(x**2+y**2)/(kp**2*Rc**4))*(0.5*(1.-cos(pi*(z-{stage_zmin})/Lplus)))*((z-{stage_zmin})<Lplus)' \
            + f'+n0*(1.+4.*(x**2+y**2)/(kp**2*Rc**4))*((z-{stage_zmin})>=Lplus)*((z-{stage_zmin})<(Lplus+Lp))' \
            + f'+n0*(1.+4.*(x**2+y**2)/(kp**2*Rc**4))*(0.5*(1.+cos(pi*((z-{stage_zmin})-Lplus-Lp)/Lminus)))*((z-{stage_zmin})>=(Lplus+Lp))*((z-{stage_zmin})<(Lplus+Lp+Lminus))',
        pi=3.141592653589793,
        n0=n0,
        kp=q_e/c*math.sqrt(n0/(m_e*ep0)),
        Rc=Rc,
        Lplus=Lplus,
        Lp=Lp,
        Lminus=Lminus,
        lower_bound=[stage_xmin, stage_ymin, stage_zmin],
        upper_bound=[stage_xmax, stage_ymax, stage_zmax],
        fill_in=True)

    electrons = picmi.Species(
        particle_type='electron',
        name=f'electrons{stage_idx}',
        initial_distribution=parabolic_distribution)

    ions = picmi.Species(
        particle_type='proton',
        name=f'ions{stage_idx}',
        initial_distribution=parabolic_distribution)

    return electrons, ions

species_list = []
for i_stage in range(1):
    # Add plasma
    zmin_stage = i_stage * stage_spacing
    zmax_stage = zmin_stage + L_stage
    electrons, ions = get_species_of_accelerator_stage(i_stage+1, zmin_stage, zmax_stage)
    species_list.append(electrons)
    species_list.append(ions)

# add beam to species_list
beam_charge = -10.e-15 # in Coulombs
N_beam_particles = int(1e6)
beam_centroid_z = -107.e-6
beam_rms_z = 2.e-6
beam_gammas = [1960 + 13246 * i_stage for i_stage in range(N_stage)]
#beam_gammas = [1957, 15188, 28432, 41678, 54926, 68174, 81423,94672, 107922,121171] # From 3D run
beams = []
for i_stage in range(N_stage):
    beam_gamma = beam_gammas[i_stage]
    sigma_gamma = 0.06 * beam_gamma
    gaussian_distribution = picmi.GaussianBunchDistribution(
        n_physical_particles= abs(beam_charge) / q_e,
        rms_bunch_size=[2.e-6, 2.e-6, beam_rms_z],
        rms_velocity=[8*c, 8*c, sigma_gamma*c],
        centroid_position=[0., 0., beam_centroid_z],
        centroid_velocity=[0., 0., beam_gamma*c],
    )
    beam = picmi.Species(
        particle_type='electron',
        name=f'beam_stage_{i_stage}',
        initial_distribution= gaussian_distribution
    )
    beams.append(beam)

# Laser
antenna_z = -1e-9
profile_t_peak = 1.46764864e-13
def get_laser(antenna_z, profile_t_peak, fill_in=True):
    profile_focal_distance = 0.
    laser = picmi.GaussianLaser(
        wavelength=0.8e-06,
        waist=36e-06,
        duration=7.33841e-14,
        focal_position=[0., 0., profile_focal_distance + antenna_z],
        centroid_position=[0., 0., antenna_z - c*profile_t_peak],
        propagation_direction=[0., 0., 1.],
        polarization_direction=[0., 1., 0.],
        a0=2.36,
        fill_in=fill_in)
    laser_antenna = picmi.LaserAntenna(
        position=[0., 0., antenna_z],
        normal_vector=[0., 0., 1.])
    return (laser, laser_antenna)
lasers = []
for i_stage in range(1):
    fill_in = True
    if i_stage == 0:
        fill_in = False
    lasers.append(
        get_laser(antenna_z + i_stage*stage_spacing,
                  profile_t_peak + i_stage*stage_spacing/c,
                  fill_in)
    )

# Electromagnetic solver

psatd_algo = 'multij'
if psatd_algo == 'galilean':
    galilean_velocity = [0.,0.] if dim=='3' else [0.]
    galilean_velocity += [-c*beta_boost]
    n_pass_z = 1
    do_multiJ = None
    do_multi_J_n_depositions=None
    J_in_time = None
    current_correction = True
    divE_cleaning = False
elif psatd_algo == 'multij':
    n_pass_z = 4
    galilean_velocity = None
    do_multiJ = True
    do_multi_J_n_depositions = 2
    J_in_time = "linear"
    current_correction = False
    divE_cleaning = True
else:
    raise Exception(f'PSATD algorithm \'{psatd_algo}\' is not recognized!\n'\
                    'Valid options are \'multiJ\' or \'galilean\'.')
if dim == 'rz':
    stencil_order = [8, 16]
    smoother = picmi.BinomialSmoother(n_pass=[1,n_pass_z])
    grid_type = 'collocated'
else:
    stencil_order = [8, 8, 16]
    smoother = picmi.BinomialSmoother(n_pass=[1,1,n_pass_z])
    grid_type = 'hybrid'


solver = picmi.ElectromagneticSolver(
    grid=grid,
    method='PSATD',
    cfl=0.9999,
    source_smoother=smoother,
    stencil_order=stencil_order,
    galilean_velocity=galilean_velocity,
    warpx_psatd_update_with_rho=True,
    warpx_current_correction=current_correction,
    divE_cleaning=divE_cleaning,
    warpx_psatd_J_in_time=J_in_time
    )

# Diagnostics
diag_field_list = ['B', 'E', 'J', 'rho']
diag_particle_list = ['weighting','position','momentum']
coarse_btd_end = int((L_plasma_bulk+0.001+stage_spacing*(N_stage-1))*100000)
stage_end_snapshots=[f'{int((L_plasma_bulk+stage_spacing*ii)*100000)}:{int((L_plasma_bulk+stage_spacing*ii)*100000+50)}:5' for ii in range(1)]
btd_particle_diag = picmi.LabFrameParticleDiagnostic(
    name='lab_particle_diags',
    species=beams,
    grid=grid,
    num_snapshots=25*N_stage,
    #warpx_intervals=', '.join([f':{coarse_btd_end}:1000']+stage_end_snapshots),
    warpx_intervals=', '.join(['0:0']+stage_end_snapshots),
    dt_snapshots=0.00001/c,
    data_list=diag_particle_list,
    write_dir='lab_particle_diags',
    warpx_format='openpmd',
    warpx_openpmd_backend='bp')

btd_field_diag = picmi.LabFrameFieldDiagnostic(
    name='lab_field_diags',
    grid=grid,
    num_snapshots=25*N_stage,
    dt_snapshots=stage_spacing/25/c,
    data_list=diag_field_list,
    warpx_lower_bound=[-128.e-6, 0.e-6, -180.e-6],
    warpx_upper_bound=[128.e-6, 0.e-6, 0.],
    write_dir='lab_field_diags',
    warpx_format='openpmd',
    warpx_openpmd_backend='bp')

field_diag = picmi.FieldDiagnostic(
    name='field_diags',
    data_list=diag_field_list,
    grid=grid,
    period=100,
    write_dir='field_diags',
    lower_bound=[-128.e-6, 0.e-6, -180.e-6],
    upper_bound=[128.e-6, 0.e-6, 0.],
    warpx_format='openpmd',
    warpx_openpmd_backend='h5')

particle_diag = picmi.ParticleDiagnostic(
    name='particle_diags',
    species=beams,
    period=100,
    write_dir='particle_diags',
    warpx_format='openpmd',
    warpx_openpmd_backend='h5')

beamrel_red_diag = picmi.ReducedDiagnostic(
    diag_type='BeamRelevant',
    name='beamrel',
    species=beam,
    period=1)

# Set up simulation
sim = picmi.Simulation(
    solver=solver,
    warpx_numprocs=num_procs,
    warpx_compute_max_step_from_btd=True,
    verbose=2,
    particle_shape='cubic',
    gamma_boost=gamma_boost,
    warpx_charge_deposition_algo='standard',
    warpx_current_deposition_algo='direct',
    warpx_field_gathering_algo='momentum-conserving',
    warpx_particle_pusher_algo='vay',
    warpx_amrex_the_arena_is_managed=False,
    warpx_amrex_use_gpu_aware_mpi=True,
    warpx_do_multi_J=do_multiJ,
    warpx_do_multi_J_n_depositions=do_multi_J_n_depositions,
    warpx_grid_type=grid_type,
    # default: 2 for staggered grids, 8 for hybrid grids
    warpx_field_centering_order=[16,16,16],
    # only for hybrid grids, default: 8
    warpx_current_centering_order=[16,16,16]
    )

for species in species_list:
    if dim=='rz':
        n_macroparticle_per_cell=[2,4,2]
    else:
        n_macroparticle_per_cell=[2,2,2]
    sim.add_species(
        species,
        layout=picmi.GriddedLayout(grid=grid,
            n_macroparticle_per_cell=n_macroparticle_per_cell)
    )

for i_stage in range(N_stage):
    sim.add_species_through_plane(
        species=beams[i_stage],
        layout=picmi.PseudoRandomLayout(grid=grid, n_macroparticles=N_beam_particles),
        injection_plane_position=0.,
        injection_plane_normal_vector=[0.,0.,1.])

for i_stage in range(1):
    # Add laser
    (laser, laser_antenna) = lasers[i_stage]
    sim.add_laser(laser, injection_method=laser_antenna)

# Add diagnostics
sim.add_diagnostic(btd_particle_diag)
#sim.add_diagnostic(btd_field_diag)
#sim.add_diagnostic(field_diag)
#sim.add_diagnostic(particle_diag)

# Add reduced diagnostic
sim.add_diagnostic(beamrel_red_diag)

sim.write_input_file(f'inputs_training_{N_stage}_stages')

# Advance simulation until last time step
sim.step()

In this section we walk through a workflow for data processing and model training, using data from this input script as an example. The simulation output is stored in an online Zenodo archive, in the lab_particle_diags directory. In the example scripts provided here, the data is downloaded from the Zenodo archive, properly formatted, and used to train a neural network. This workflow was developed and first presented in Sandberg et al. [1], Sandberg et al. [2]. It assumes you have an up-to-date environment with PyTorch and openPMD.

Data Cleaning

It is important to inspect the data for artifacts, to check that the input/output data make sense. If we plot the final phase space of the particle beam, shown in Fig. 18, we see outlying particles. Looking closer at the z-pz space, we see that some particles were not trapped in the accelerating region of the wake and have much less energy than the rest of the beam.

The final phase space projections of a particle beam through a laser-plasma acceleration element where some beam particles were not accelerated.

To assist our neural network in learning dynamics of interest, we filter out these particles. It is sufficient for our purposes to select particles that are not too far back, setting particle_selection={'z':[0.280025, None]}. After filtering, we can see in Fig. 19 that the beam phase space projections are much cleaner – this is the beam we want to train on.

The final phase space projections of a particle beam through a laser-plasma acceleration element after filtering out outlying particles.

A particle tracker is set up to make sure we consistently filter out these particles from both the initial and final data.

iteration = ts.iterations[survivor_select_index]
pt = ParticleTracker( ts,
                     species=species,
                     iteration=iteration,
                     select=particle_selection)

This data cleaning ensures that the particle data is distributed in a single blob, as is optimal for training neural networks.

Create Normalized Dataset

Having chosen training data we are content with, we now need to format the data, normalize it, and store the normalized data as well as the normalizations. The script below will take the openPMD data we have selected and format, normalize, and store it.

Python dataset creation
#!/usr/bin/env python3
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Ryan Sandberg
# License: BSD-3-Clause-LBNL
#

import os
import zipfile
from urllib import request

import numpy as np
import torch
from openpmd_viewer import OpenPMDTimeSeries, ParticleTracker

c = 2.998e8
###############

def sanitize_dir_strings(*dir_strings):
    """append '/' to a string for concatenation in building up file tree descriptions
    """
    dir_strings = list(dir_strings)
    for ii, dir_string in enumerate(dir_strings):
        if dir_string[-1] != '/':
            dir_strings[ii] = dir_string + '/'

    return dir_strings

def download_and_unzip(url, data_dir):
    request.urlretrieve(url, data_dir)
    with zipfile.ZipFile(data_dir, 'r') as zip_dataset:
        zip_dataset.extractall()

def create_source_target_data(data_dir,
                              species,
                              source_index=0,
                              target_index=-1,
                              survivor_select_index=-1,
                              particle_selection=None
                             ):
    """Create dataset from openPMD files

    Parameters
    ---
    data_dir : string, location of diagnostic data
    source_index : int, which index to take source data from
    target_index : int, which index to take target data from
    particle_selection: dictionary, optional, selection criterion for dataset

    Returns
    ---
    source_data:  Nx6 array of source particle data
    source_means: 6 element array of source particle coordinate means
    source_stds:  6 element array of source particle coordinate standard deviations
    target_data:  Nx6 array of target particle data
    target_means: 6 element array of target particle coordinate means
    target_stds:  6 element array of source particle coordinate standard deviations
    relevant times: 2 element array of source and target times
    """
    data_dir, = sanitize_dir_strings(data_dir)
    data_path = data_dir
    print('loading openPMD data from', data_path)
    ts = OpenPMDTimeSeries(data_path)
    relevant_times = [ts.t[source_index], ts.t[target_index]]

    # Manual: Particle tracking START
    iteration = ts.iterations[survivor_select_index]
    pt = ParticleTracker( ts,
                         species=species,
                         iteration=iteration,
                         select=particle_selection)
    # Manual: Particle tracking END

    #### create normalized source, target data sets ####
    print('creating data sets')

    # Manual: Load openPMD START
    iteration = ts.iterations[source_index]
    source_data = ts.get_particle(species=species,
                                  iteration=iteration,
                                  var_list=['x','y','z','ux','uy','uz'],
                                  select=pt)

    iteration = ts.iterations[target_index]
    target_data = ts.get_particle(species=species,
                                  iteration=iteration,
                                  var_list=['x','y','z','ux','uy','uz'],
                                  select=pt)
    # Manual: Load openPMD END

    # Manual: Normalization START
    target_means = np.zeros(6)
    target_stds = np.zeros(6)
    source_means = np.zeros(6)
    source_stds = np.zeros(6)
    for jj in range(6):
        source_means[jj] = source_data[jj].mean()
        source_stds[jj] = source_data[jj].std()
        source_data[jj] -= source_means[jj]
        source_data[jj] /= source_stds[jj]

    for jj in range(6):
        target_means[jj] = target_data[jj].mean()
        target_stds[jj] = target_data[jj].std()
        target_data[jj] -= target_means[jj]
        target_data[jj] /= target_stds[jj]
    # Manual: Normalization END

    # Manual: Format data START
    source_data = torch.tensor(np.column_stack(source_data))
    target_data = torch.tensor(np.column_stack(target_data))
    # Manual: Format data END

    return source_data, source_means, source_stds, target_data, target_means, target_stds, relevant_times

def save_warpx_surrogate_data(dataset_fullpath_filename,
                              diag_dir,
                              species,
                              training_frac,
                              batch_size,
                              source_index,
                              target_index,
                              survivor_select_index,
                              particle_selection=None
                             ):

    source_target_data = create_source_target_data(
        data_dir=diag_dir,
        species=species,
        source_index=source_index,
        target_index=target_index,
        survivor_select_index=survivor_select_index,
        particle_selection=particle_selection
    )
    source_data, source_means, source_stds, target_data, target_means, target_stds, times = source_target_data

    # Manual: Save dataset START
    full_dataset = torch.utils.data.TensorDataset(source_data.float(), target_data.float())

    n_samples = full_dataset.tensors[0].size(0)
    n_train = int(training_frac*n_samples)
    n_test = n_samples - n_train

    train_data, test_data = torch.utils.data.random_split(full_dataset, [n_train, n_test])

    torch.save({'dataset':full_dataset,
                'train_indices':train_data.indices,
                'test_indices':test_data.indices,
                'source_means':source_means,
                'source_stds':source_stds,
                'target_means':target_means,
                'target_stds':target_stds,
                'times':times,
               },
                dataset_fullpath_filename
              )
    # Manual: Save dataset END

######## end utility functions #############
######## start dataset creation ############

data_url = "https://zenodo.org/records/10810754/files/lab_particle_diags.zip?download=1"
download_and_unzip(data_url, "lab_particle_diags.zip")
data_dir = "lab_particle_diags/lab_particle_diags/"

# create data set

source_index = 0
target_index = 1
survivor_select_index = 1
batch_size=1200
training_frac = 0.7

os.makedirs('datasets', exist_ok=True)

# improve stage 0 dataset
stage_i = 0
select = {'z':[0.280025, None]}
species = f'beam_stage_{stage_i}'
dataset_filename = f'dataset_{species}.pt'
dataset_file = 'datasets/' + dataset_filename
save_warpx_surrogate_data(dataset_fullpath_filename=dataset_file,
                diag_dir=data_dir,
                species=species,
                training_frac=training_frac,
                batch_size=batch_size,
                source_index=source_index,
                target_index=target_index,
                survivor_select_index=survivor_select_index,
                particle_selection=select
               )

for stage_i in range(1,15):
    species = f'beam_stage_{stage_i}'
    dataset_filename = f'dataset_{species}.pt'
    dataset_file = 'datasets/' + dataset_filename
    save_warpx_surrogate_data(dataset_fullpath_filename=dataset_file,
                    diag_dir=data_dir,
                    species=species,
                    training_frac=training_frac,
                    batch_size=batch_size,
                    source_index=source_index,
                    target_index=target_index,
                    survivor_select_index=survivor_select_index
                   )
Load openPMD Data

First, the openPMD data is loaded, using the particle selection chosen above. The neural network will make predictions from the initial phase space coordinates, and the final phase space coordinates are used to measure how well those predictions match the simulation. Hence we load two sets of particle data: the source and target particle arrays.

iteration = ts.iterations[source_index]
source_data = ts.get_particle(species=species,
                              iteration=iteration,
                              var_list=['x','y','z','ux','uy','uz'],
                              select=pt)

iteration = ts.iterations[target_index]
target_data = ts.get_particle(species=species,
                              iteration=iteration,
                              var_list=['x','y','z','ux','uy','uz'],
                              select=pt)
Normalize Data

Neural networks learn better on appropriately normalized data. Here we subtract the mean and divide by the standard deviation in each coordinate direction, yielding data that is centered on the origin with unit variance.

target_means = np.zeros(6)
target_stds = np.zeros(6)
source_means = np.zeros(6)
source_stds = np.zeros(6)
for jj in range(6):
    source_means[jj] = source_data[jj].mean()
    source_stds[jj] = source_data[jj].std()
    source_data[jj] -= source_means[jj]
    source_data[jj] /= source_stds[jj]

for jj in range(6):
    target_means[jj] = target_data[jj].mean()
    target_stds[jj] = target_data[jj].std()
    target_data[jj] -= target_means[jj]
    target_data[jj] /= target_stds[jj]
openPMD to PyTorch Data

With the data normalized, it must be stored in a form PyTorch recognizes. The openPMD data are 6 lists of arrays, one for each of the 6 phase space coordinates \(x, y, z, p_x, p_y,\) and \(p_z\). These data are converted to an \(N\times 6\) NumPy array and then to an \(N\times 6\) PyTorch tensor.

source_data = torch.tensor(np.column_stack(source_data))
target_data = torch.tensor(np.column_stack(target_data))
Save Normalizations and Normalized Data

The data is split into training and testing subsets. We take most of the data (70%) for training, meaning that data is used to update the neural network parameters. The testing data is reserved to determine how well the neural network generalizes, that is, how well it performs on data that was not used to update its parameters. With the data split and properly normalized, the data and the normalizations are saved to file for use in training and inference.

full_dataset = torch.utils.data.TensorDataset(source_data.float(), target_data.float())

n_samples = full_dataset.tensors[0].size(0)
n_train = int(training_frac*n_samples)
n_test = n_samples - n_train

train_data, test_data = torch.utils.data.random_split(full_dataset, [n_train, n_test])

torch.save({'dataset':full_dataset,
            'train_indices':train_data.indices,
            'test_indices':test_data.indices,
            'source_means':source_means,
            'source_stds':source_stds,
            'target_means':target_means,
            'target_stds':target_stds,
            'times':times,
           },
            dataset_fullpath_filename
          )

Neural Network Structure

It was found in Sandberg et al. [2] that a reasonable surrogate model is obtained with shallow feedforward neural networks consisting of about 5 hidden layers and 700-900 nodes per layer. The example shown here uses 3 hidden layers and 20 nodes per layer and is trained for 10 epochs.
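For concreteness, with 3 hidden layers and 20 nodes per layer the network is a plain fully connected stack. The following is a minimal sketch of the equivalent architecture in PyTorch; the example scripts instead build it through the OneActNN helper class shown below.

import torch.nn as nn

# 6 -> 20 -> 20 -> 20 -> 6 fully connected network with ReLU activations,
# i.e. what OneActNN assembles for n_hidden_layers=3, n_hidden_nodes=20, act='ReLU'
surrogate_sketch = nn.Sequential(
    nn.Linear(6, 20), nn.ReLU(),
    nn.Linear(20, 20), nn.ReLU(),
    nn.Linear(20, 20), nn.ReLU(),
    nn.Linear(20, 6),
)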

Some utility functions for creating neural networks are provided in the script below. These are mostly convenience wrappers and utilities for working with PyTorch neural network objects. This script is imported in the training scripts shown later.

Python neural network class definitions
#!/usr/bin/env python3
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Ryan Sandberg
# License: BSD-3-Clause-LBNL
#
from enum import Enum

from torch import nn


class ActivationType(Enum):
    """
    Activation class provides an enumeration type for the supported activation layers
    """
    ReLU = 1
    Tanh = 2
    PReLU = 3
    Sigmoid = 4

def get_enum_type(type_to_test, EnumClass):
    """
    Returns the enumeration type associated to type_to_test in EnumClass

    Parameters
    ----------
    type_to_test: EnumClass, int or str
        object whose Enum class is to be obtained
    EnumClass: Enum class
        Enum class to test
    """
    if type(type_to_test) is EnumClass:
        return type_to_test
    if type(type_to_test) is int:
        return EnumClass(type_to_test)
    if type(type_to_test) is str:
        return getattr(EnumClass, type_to_test)
    else:
        raise Exception("unsupported type entered")



class ConnectedNN(nn.Module):
    """
    ConnectedNN is a class of fully connected neural networks
    """
    def __init__(self, layers):
        super().__init__()
        self.stack = nn.Sequential(*layers)
    def forward(self, x):
        return self.stack(x)

class OneActNN(ConnectedNN):
    """
    OneActNN is a class of fully connected neural networks admitting only one activation function
    """
    def __init__(self,
                 n_in,
                 n_out,
                 n_hidden_nodes,
                 n_hidden_layers,
                 act):

        self.n_in = n_in
        self.n_out = n_out
        self.n_hidden_layers = n_hidden_layers
        self.n_hidden_nodes = n_hidden_nodes

        self.act = get_enum_type(act, ActivationType)

        layers = [nn.Linear(self.n_in, self.n_hidden_nodes)]

        for ii in range(self.n_hidden_layers):
            if self.act is ActivationType.ReLU:
                layers += [nn.ReLU()]
            if self.act is ActivationType.Tanh:
                layers += [nn.Tanh()]
            if self.act is ActivationType.PReLU:
                layers += [nn.PReLU()]
            if self.act is ActivationType.Sigmoid:
                layers += [nn.Sigmoid()]

            if ii < self.n_hidden_layers - 1:
                layers += [nn.Linear(self.n_hidden_nodes,self.n_hidden_nodes)]

        layers += [nn.Linear(self.n_hidden_nodes, self.n_out)]

        super().__init__(layers)

Train and Save Neural Network

The script below trains the neural network on the dataset just created. In subsequent sections we discuss the various parts of the training process.

Python neural network training
#!/usr/bin/env python3
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Ryan Sandberg
# License: BSD-3-Clause-LBNL
#
import os
import time

import neural_network_classes as mynn
import torch
import torch.nn.functional as F
import torch.optim as optim

############# set model parameters #################

stage_i = 0
species = f'beam_stage_{stage_i}'
source_index = 0
target_index = 1
survivor_select_index = 1

data_dim = 6
n_in = data_dim
n_out = data_dim

learning_rate = 0.0001
n_epochs = 10
batch_size = 1200

loss_fun = F.mse_loss

n_hidden_nodes = 20
n_hidden_layers = 3
activation_type = 'ReLU'

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'device={device}')
#################### load dataset ################
dataset_filename = f'dataset_{species}.pt'
dataset_file = 'datasets/' + dataset_filename

print(f"trying to load dataset+test-train split in {dataset_file}")

dataset_with_indices = torch.load(dataset_file)
train_data = torch.utils.data.dataset.Subset(dataset_with_indices['dataset'], dataset_with_indices['train_indices'])
test_data = torch.utils.data.dataset.Subset(dataset_with_indices['dataset'], dataset_with_indices['test_indices'])
source_data = dataset_with_indices['dataset']
source_means = dataset_with_indices['source_means']
source_stds = dataset_with_indices['source_stds']
target_means = dataset_with_indices['target_means']
target_stds = dataset_with_indices['target_stds']
print("able to load data and test/train split")

###### move data to device (GPU) if available ########
source_device = train_data.dataset.tensors[0].to(device) # equivalently, test_data.dataset.tensors[0].to(device)
target_device = train_data.dataset.tensors[1].to(device)
full_dataset_device = torch.utils.data.TensorDataset(source_device.float(), target_device.float())

train_data_device = torch.utils.data.dataset.Subset(full_dataset_device, train_data.indices)
test_data_device = torch.utils.data.dataset.Subset(full_dataset_device, test_data.indices)

train_loader_device = torch.utils.data.DataLoader(train_data_device, batch_size=batch_size, shuffle=True)
test_loader_device = torch.utils.data.DataLoader(test_data_device, batch_size=batch_size, shuffle=True)

test_source_device = source_device[test_data.indices] # restrict evaluation to the held-out test subset
test_target_device = target_device[test_data.indices]

training_set_size = len(train_data_device.indices)
testing_set_size = len(test_data_device.indices)

###### create model ###########

model = mynn.OneActNN(n_in = n_in,
                      n_out = n_out,
                      n_hidden_nodes=n_hidden_nodes,
                      n_hidden_layers = n_hidden_layers,
                      act=activation_type
                     )

training_time = 0
train_loss_list = []
test_loss_list = []

model.to(device=device);

########## train and test functions ####
# Manual: Train function START
def train(model, optimizer, train_loader, loss_fun):
    model.train()
    total_loss = 0.
    for batch_idx, (data, target) in enumerate(train_loader):
        #evaluate network with data
        output = model(data)
        #compute loss
         # sum the differences squared, take mean afterward
        loss = loss_fun(output, target,reduction='sum')
        #backpropagation: step optimizer and reset gradients
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        total_loss += loss.item()
    return total_loss
# Manual: Train function END

def test(model, test_loader, loss_fun):
    model.eval()
    total_loss = 0.
    with torch.no_grad():
        for batch_idx, (data, target) in enumerate(test_loader):
            output = model(data)
            total_loss += loss_fun(output, target, reduction='sum').item()
    return total_loss

# Manual: Test function START
def test_dataset(model, test_source, test_target, loss_fun):
    model.eval()
    with torch.no_grad():
        output = model(test_source)
        return loss_fun(output, test_target, reduction='sum').item()
# Manual: Test function END

######## training loop ########

optimizer = optim.Adam(model.parameters(), lr=learning_rate)

do_print = True

t3 = time.time()
# Manual: Training loop START
for epoch in range(n_epochs):
    if do_print:
        t1 = time.time()
    ave_train_loss = train(model, optimizer, train_loader_device, loss_fun) / data_dim / training_set_size
    ave_test_loss = test_dataset(model, test_source_device, test_target_device, loss_fun) / data_dim / testing_set_size
    train_loss_list.append(ave_train_loss)
    test_loss_list.append(ave_test_loss)

    if do_print:
        t2 = time.time()
        print('Train Epoch: {:04d} \tTrain Loss: {:.6f} \tTest Loss: {:.6f}, this epoch: {:.3f} s'.format(
                epoch + 1, ave_train_loss, ave_test_loss, t2-t1))
# Manual: Training loop END
t4 = time.time()
training_time = t4 - t3 # record the total training time so it is saved with the model below
print(f'total training time: {t4-t3:.3f}s')

######### save model #########

os.makedirs('models', exist_ok=True)

# Manual: Save model START
model.to(device='cpu')
torch.save({
    'n_hidden_layers':n_hidden_layers,
    'n_hidden_nodes':n_hidden_nodes,
    'activation':activation_type,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'train_loss_list': train_loss_list,
    'test_loss_list': test_loss_list,
    'training_time': training_time,
    }, f'models/{species}_model.pt')
# Manual: Save model END
Training Function

In the training function, the model weights are updated. Iterating through batches, the loss function is evaluated on each batch. PyTorch provides automatic differentiation: when loss.backward() is invoked, the gradient of the loss with respect to the model parameters is computed, and the optimizer uses this information to update the weights in the optimizer.step() call. The gradients are then reset with optimizer.zero_grad(), the batch error is added to the running total, and iteration continues with the next batch. Note that this function returns the sum of the errors across the entire training set, which is later divided by the size of the training set in the training loop.

def train(model, optimizer, train_loader, loss_fun):
    model.train()
    total_loss = 0.
    for batch_idx, (data, target) in enumerate(train_loader):
        #evaluate network with data
        output = model(data)
        #compute loss
         # sum the differences squared, take mean afterward
        loss = loss_fun(output, target,reduction='sum')
        #backpropagation: step optimizer and reset gradients
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        total_loss += loss.item()
    return total_loss
Testing Function

The testing function evaluates the neural network on the testing data that has not been used to update the model parameters. This function requires that the testing dataset is small enough to be loaded all at once; the PyTorch DataLoader can load the data in batches if this size assumption is not satisfied (a batched variant is sketched after the code below). The error, measured by the loss function, is returned by the testing function to be aggregated and stored. Note that this function returns the sum of the errors across the data it is given, which is later divided by the size of the testing set in the training loop.

def test_dataset(model, test_source, test_target, loss_fun):
    model.eval()
    with torch.no_grad():
        output = model(test_source)
        return loss_fun(output, test_target, reduction='sum').item()
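If the testing data cannot be held in memory all at once, the same evaluation can be performed in batches with a DataLoader, mirroring the test() helper defined in the full training script above; a minimal sketch:

def test_batched(model, test_loader, loss_fun):
    model.eval()
    total_loss = 0.
    with torch.no_grad():
        for data, target in test_loader:
            output = model(data)
            # accumulate the summed error over all batches
            total_loss += loss_fun(output, target, reduction='sum').item()
    return total_loss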
Training Loop

The full training loop performs n_epochs iterations. At each iteration the training and testing functions are called, the respective errors are divided by the size of the corresponding dataset and recorded, and a status update is printed to the console.

for epoch in range(n_epochs):
    if do_print:
        t1 = time.time()
    ave_train_loss = train(model, optimizer, train_loader_device, loss_fun) / data_dim / training_set_size
    ave_test_loss = test_dataset(model, test_source_device, test_target_device, loss_fun) / data_dim / testing_set_size
    train_loss_list.append(ave_train_loss)
    test_loss_list.append(ave_test_loss)

    if do_print:
        t2 = time.time()
        print('Train Epoch: {:04d} \tTrain Loss: {:.6f} \tTest Loss: {:.6f}, this epoch: {:.3f} s'.format(
                epoch + 1, ave_train_loss, ave_test_loss, t2-t1))
Save Neural Network Parameters

The model weights are saved after training to record the updates to the model parameters. Additionally, we save some model metainformation with the model for convenience, including the model hyperparameters, the training and testing losses, and how long the training took.

model.to(device='cpu')
torch.save({
    'n_hidden_layers':n_hidden_layers,
    'n_hidden_nodes':n_hidden_nodes,
    'activation':activation_type,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'train_loss_list': train_loss_list,
    'test_loss_list': test_loss_list,
    'training_time': training_time,
    }, f'models/{species}_model.pt')

Evaluate

In this section we show two ways to diagnose how well the neural network is learning the data. First we consider the train-test loss curves, shown in Fig. 20. This figure shows the model error on the training data (in blue) and testing data (in green) as a function of the number of epochs seen. The training data is used to update the model parameters, so the training error should be lower than the testing error. A key feature to look for in the train-test loss curve is the inflection point in the test loss trend. The testing data is set aside as a sample of data the neural network has not seen before, so the testing error serves as a metric of model generalizability, indicating how well the model performs on data it has not been trained on. When the test loss starts to flatten or even trend upward, the neural network is no longer improving its ability to generalize to new data.
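One practical way to act on this observation is early stopping: keep training only while the test loss keeps improving. The following is a minimal sketch, not part of the WarpX example scripts, that reuses the train and test_dataset helpers from the training script above; the patience value of 5 epochs is an arbitrary illustrative choice.

patience = 5  # stop after this many epochs without improvement (illustrative value)
best_test_loss = float('inf')
epochs_without_improvement = 0

for epoch in range(n_epochs):
    ave_train_loss = train(model, optimizer, train_loader_device, loss_fun) / data_dim / training_set_size
    ave_test_loss = test_dataset(model, test_source_device, test_target_device, loss_fun) / data_dim / testing_set_size
    if ave_test_loss < best_test_loss:
        best_test_loss = ave_test_loss
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
    if epochs_without_improvement >= patience:
        print(f'test loss stopped improving after epoch {epoch + 1}; stopping early')
        break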

Fig. 20: Training (in blue) and testing (in green) loss curves versus the number of training epochs.

Fig. 21: A comparison of model prediction (yellow-red dots, colored by mean-squared error) with simulation output (black dots).

A visual inspection of the model prediction can be seen in Fig. 21. This plot compares the model prediction on the testing data (dots colored by mean-squared error) with the actual simulation output (black dots). The model obtained with the hyperparameters chosen here trains quickly but is not very accurate. A more accurate model is obtained with 5 hidden layers and 900 nodes per layer, as discussed in Sandberg et al. [2].

These figures can be generated with the following Python script.

Python visualization of progress training neural network
#!/usr/bin/env python3
#
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Ryan Sandberg
# License: BSD-3-Clause-LBNL
#
import neural_network_classes as mynn
import numpy as np
import torch
import torch.nn.functional as F
from matplotlib import pyplot as plt

c = 2.998e8


# open model file
stage_i = 0
species = f'beam_stage_{stage_i}'
model_data = torch.load(f'models/{species}_model.pt',map_location=torch.device('cpu'))
data_dim = 6
n_in = data_dim
n_out = data_dim
n_hidden_layers = model_data['n_hidden_layers']
n_hidden_nodes = model_data['n_hidden_nodes']
activation_type = model_data['activation']
train_loss_list = model_data['train_loss_list']
test_loss_list = model_data['test_loss_list']
training_time = model_data['training_time']
loss_fun = F.mse_loss


n_epochs = len(train_loss_list)
train_counter = np.arange(n_epochs)+1
test_counter = train_counter

do_log_plot = False
fig, ax = plt.subplots()
if do_log_plot:
    ax.semilogy(train_counter, train_loss_list, '.-',color='blue',label='training loss')
    ax.semilogy(test_counter, test_loss_list, color='green',label='testing loss')
else:
    ax.plot(train_counter, train_loss_list, '.-',color='blue',label='training loss')
    ax.plot(test_counter, test_loss_list, color='green',label='testing loss')
ax.set_xlabel('number of epochs seen')
ax.set_ylabel(' loss')
ax.legend()
fig_dir = 'figures/'
ax.set_title(f'final test error = {test_loss_list[-1]:.3e} ')
ax.grid()
plt.tight_layout()
plt.savefig(f'{species}_training_testing_error.png')


######### plot phase space comparison #######
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'device={device}')

model = mynn.OneActNN(n_in = n_in,
                       n_out = n_out,
                       n_hidden_nodes=n_hidden_nodes,
                       n_hidden_layers = n_hidden_layers,
                       act = activation_type
                  )
model.load_state_dict(model_data['model_state_dict'])
model.to(device=device);

###### load model data ###############
dataset_filename = f'dataset_{species}.pt'
dataset_dir = 'datasets/'
model_input_data = torch.load(dataset_dir + dataset_filename)
dataset = model_input_data['dataset']
train_indices = model_input_data['train_indices']
test_indices = model_input_data['test_indices']
source_means = model_input_data['source_means']
source_stds = model_input_data['source_stds']
target_means = model_input_data['target_means']
target_stds = model_input_data['target_stds']
source_time, target_time = model_input_data['times']


source = dataset.tensors[0]
test_source = source[test_indices]
test_source_device = test_source.to(device)
with torch.no_grad():
    evaluation_device = model(test_source_device.float())
eval_cpu = evaluation_device.to('cpu')

target = dataset.tensors[1]
test_target = target[test_indices]

target_si = test_target * target_stds + target_means
eval_cpu_si = eval_cpu * target_stds + target_means
target_mu = np.copy(target_si)
eval_cpu_mu = np.copy(eval_cpu_si)
target_mu[:,2] -= c*target_time
eval_cpu_mu[:,2] -= c*target_time
target_mu[:,:3] *= 1e6
eval_cpu_mu[:,:3] *= 1e6



loss_tensor = torch.sum(loss_fun(eval_cpu,
                                 test_target,
                                 reduction='none'),
                        axis=1)/6
loss_array = loss_tensor.detach().numpy()

tinds = np.nonzero(loss_array > 0.0)[0]
skip = 10

plt.figure()
fig, axT = plt.subplots(3,3)
axes_label = {0:r'x [$\mu$m]', 1:r'y [$\mu$m]', 2:r'z - %.2f cm [$\mu$m]'%(c*target_time),3:r'$p_x$',4:r'$p_y$',5:r'$p_z$'}
xy_inds = [(0,1),(2,0),(2,1)]
def set_axes(ax, indx, indy):
    ax.scatter(target_mu[::skip,indx],target_mu[::skip,indy],s=8,c='k', label='simulation')
    ax.scatter(eval_cpu_mu[::skip,indx],eval_cpu_mu[::skip,indy],marker='*',c=loss_array[::skip],s=0.02, label='surrogate',cmap='YlOrRd')
    ax.set_xlabel(axes_label[indx])
    ax.set_ylabel(axes_label[indy])
    # return


for ii in range(3):
    ax = axT[0,ii]
    indx,indy = xy_inds[ii]
    set_axes(ax,indx,indy)

for ii in range(2):
    indx,indy = xy_inds[ii]
    ax = axT[1,ii]
    set_axes(ax,indx+3,indy+3)

for ii in range(3):
    ax = axT[2,ii]
    indx = ii
    indy = ii+3
    set_axes(ax, indx, indy)


ax = axT[1,2]
indx = 5
indy = 4
ax.scatter(target_mu[::skip,indx],target_mu[::skip,indy],s=8,c='k', label='simulation')
evalplt = ax.scatter(eval_cpu_mu[::skip,indx],eval_cpu_mu[::skip,indy],marker='*',c=loss_array[::skip],s=2, label='surrogate',cmap='YlOrRd')
ax.set_xlabel(axes_label[indx])
ax.set_ylabel(axes_label[indy])

cb = plt.colorbar(evalplt, ax=ax)
cb.set_label('MSE loss')

fig.suptitle(f'stage {stage_i} prediction')

plt.tight_layout()

plt.savefig(f'{species}_model_evaluation.png')
Surrogate Usage in Accelerator Physics

A neural network such as the one we trained here can be incorporated in other BLAST codes. Consider this example using neural network surrogates of WarpX simulations in ImpactX.

[1]

R. Sandberg, R. Lehe, C. E. Mitchell, M. Garten, J. Qiang, J.-L. Vay, and A. Huebl. Hybrid beamline element ML-training for surrogates in the ImpactX beam-dynamics code. In Proc. 14th International Particle Accelerator Conference, number 14 in IPAC'23 - 14th International Particle Accelerator Conference, 2885–2888. Venice, Italy, May 2023. JACoW Publishing, Geneva, Switzerland. URL: https://indico.jacow.org/event/41/contributions/2276, doi:10.18429/JACoW-IPAC2023-WEPA101.

[2]

R. Sandberg, R. Lehe, C. E. Mitchell, M. Garten, A. Myers, J. Qiang, J.-L. Vay, and A. Huebl. Synthesizing Particle-in-Cell Simulations Through Learning and GPU Computing for Hybrid Particle Accelerator Beamlines. 2024. accepted. URL: https://arxiv.org/abs/2402.17248, doi:10.48550/arXiv.2402.17248.

Optimizing with Optimas

optimas is an open-source Python library that enables highly scalable parallel optimization, from a typical laptop to exascale HPC systems. While a WarpX simulation can provide insight into some physics, it remains a single point evaluation in the space of parameters. If you have a simulation ready for use, but would like to (i) scan some input parameters uniformly, e.g., for a tolerance study, (ii) evaluate random points in the space of input parameters within a given span, or (iii) tune some input parameters to optimize an output quantity, e.g., beam emittance or energy spread, optimas provides these capabilities and takes care of task monitoring with fault tolerance on multiple platforms (optimas targets modern HPC platforms like Perlmutter and Frontier).

A more detailed description of optimas is provided in the optimas documentation. In particular, the online optimas documentation provides an example optimization with optimas that runs WarpX simulations.

FAQ

This section lists frequently asked usage questions.

What is “MPI initialized with thread support level …”?

When WarpX starts up, it reports information on the MPI processes used, the CPU threads or GPUs per process, and further capabilities. For instance, a parallel, multi-process, multi-threaded CPU run could output:

MPI initialized with 4 MPI processes
MPI initialized with thread support level 3
OMP initialized with 8 OMP threads
AMReX (22.10-20-g3082028e4287) initialized
...

The 1st line is the number of parallel MPI processes (also called MPI ranks).

The 2nd line reports the level of support for calling MPI functions from threads. We currently only use this for optional, asynchronous IO with AMReX plotfiles. In the past, requesting MPI threading support incurred performance penalties, but we have not observed such penalties on recent systems. Thus, we request it by default, but you can override this with a compile-time option if it ever becomes necessary.

The 3rd line is the number of CPU OpenMP (OMP) threads per MPI process. After that, information on software versions follows.

How do I suppress tiny profiler output if I do not care to see it?

Via AMReX_TINY_PROFILE=OFF (see: build options and then AMReX build options). We change the default in cmake/dependencies/AMReX.cmake.

Note that the tiny profiler adds negligible overhead to the simulation runtime, which is why we enable it by default.

What design principles should I keep in mind when creating an input file?

Leave a cushion between lasers, particles, and the edge of the computational domain. The laser antenna and plasma species zmin can be less than or greater than the geometry.prob_hi, but not exactly equal.

What do I need to know about using the boosted frame?

The input deck can be designed in the lab frame and little modification to the physical set-up is needed – most of the work is done internally. Here are a few practical items to assist in designing boosted frame simulations:

  • Ions must be explicitly included

  • Best practice is to separate counter-propagating objects: things moving to the right should start with \(z \leq 0\) and things that are stationary or moving to the left (i.e., moving to the left in the boosted frame) should start with \(z > 0\)

  • Don’t forget the general design principles listed above

  • The boosted frame simulation begins at boosted time \(t'=0\)

  • Numerics and algorithms need to be adjusted, as there are numerical instabilities that arise in the boosted frame. For example, set particles.use_fdtd_nci_corr=1 for an FDTD simulation or psatd.use_default_v_galilean=1 for a PSATD simulation. Be careful, as this is overly simplistic and these options will not work in all cases. Please see the input parameters documentation and the examples for more information

An in-depth discussion of the boosted frame is provided in the moving window and optimal Lorentz boosted frame section.

What about Back-transformed diagnostics (BTD)?

Minkowski diagram indicating several features of the back-transformed diagnostic (BTD). The diagram explains why the first BTD begins to fill at boosted time \(t'=0\) but this doesn’t necessarily correspond to lab time \(t=0\), how the BTD grid spacing is determined by the boosted time step \(\Delta t'\), hence why the snapshot length doesn’t correspond to the grid spacing and length in the input script, and how the BTD snapshots complete when the effective snapshot length is covered in the boosted frame.

Several BTD quantities differ slightly from the lab frame domain described in the input deck. In the following discussion, we will use a subscript input (e.g. \(\Delta z_{\rm input}\)) to denote properties of the lab frame domain. A short numerical sketch illustrating these relations follows the list below.

  • The first back-transformed diagnostic (BTD) snapshot may not occur at \(t=0\). Rather, it occurs at \(t_0=\frac{z_{\rm max}}{c}\beta(1+\beta)\gamma^2\). This is the first time when the boosted frame can complete the snapshot.

  • The grid spacing of the BTD snapshot is different from the grid spacing indicated in the input script. It is given by \(\Delta z_{\rm grid,snapshot}=\frac{c\Delta t_{\rm boost}}{\gamma\beta}\). For a CFL-limited time step, \(\Delta z_{\rm grid,snapshot}\approx \frac{1+\beta}{\beta} \Delta z_{\rm input}\approx 2 \Delta z_{\rm input}\). Hence in many common use cases at large boost, it is expected that the BTD snapshot has a grid spacing twice what is expressed in the input script.

  • The effective length of the BTD snapshot may be longer than anticipated from the input script because the grid spacing is different. Additionally, the number of grid points in the BTD snapshot is a multiple of <BTD>.buffer_size whereas the number of grid cells specified in the input deck may not be.

  • The code may require longer than anticipated to complete a BTD snapshot. The code starts filling the \(i^{th}\) snapshot around step \(j_{\rm BTD start}={\rm ceil}\left( i\gamma(1-\beta)\frac{\Delta t_{\rm snapshot}}{\Delta t_{\rm boost}}\right)\). The code then saves information for one BTD cell every time step in the boosted frame simulation. The \(i^{th}\) snapshot is completed and saved \(n_{z,{\rm snapshot}}=n_{\rm buffers}\cdot ({\rm buffer\ size})\) time steps after it begins, which is when the effective snapshot length is covered by the simulation.
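The relations above can be checked numerically. The short sketch below uses illustrative values; the boost factor, domain size, and lab-frame grid spacing are arbitrary assumptions, not values from a specific input deck.

import numpy as np

c = 299792458.0      # speed of light [m/s]
gamma_boost = 10.0   # illustrative boost factor
beta_boost = np.sqrt(1.0 - 1.0 / gamma_boost**2)
z_max = 100.0e-6     # illustrative upper edge of the lab-frame domain [m]
dz_input = 0.05e-6   # illustrative lab-frame grid spacing [m]

# CFL-limited boosted-frame time step estimate (1D): dt_boost ~ dz_boost / c with dz_boost = (1 + beta) gamma dz_input
dt_boost = (1.0 + beta_boost) * gamma_boost * dz_input / c

# first time at which a BTD snapshot can be completed
t0 = z_max / c * beta_boost * (1.0 + beta_boost) * gamma_boost**2

# grid spacing of the BTD snapshot, roughly twice the lab-frame input spacing
dz_snapshot = c * dt_boost / (gamma_boost * beta_boost)

print(f"t0 = {t0:.3e} s")
print(f"dz_snapshot = {dz_snapshot:.3e} m (= {dz_snapshot / dz_input:.2f} x dz_input)")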

What kinds of RZ output do you support?

In RZ geometry, the level of detail supported in the output depends on the output format that is configured in the inputs file.

openPMD supports output of the detailed RZ modes and representations are reconstructed on-the-fly in post-processing, e.g., in openPMD-viewer or other tools. For some tools, this support is still in development.
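For example, with openPMD-viewer one can read an RZ field and request an azimuthal slice reconstructed from the stored modes; a minimal sketch, where the diagnostics path is a hypothetical placeholder:

from openpmd_viewer import OpenPMDTimeSeries

ts = OpenPMDTimeSeries('./diags/diag1/')  # hypothetical openPMD output directory
it = ts.iterations[-1]
# reconstruct E_z in the theta = 0 plane from the stored azimuthal modes
Ez, info = ts.get_field(field='E', coord='z', iteration=it, theta=0)
print(Ez.shape, info.axes)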

AMReX plotfiles and other in situ methods output a 2D reconstructed Cartesian slice at \(\theta=0\) by default (and can opt-in to dump raw modes).

Data Analysis

Output formats

WarpX can write diagnostics data either in AMReX plotfile format or in openPMD format.

Plotfiles are AMReX's native data format, while openPMD is implemented in popular community formats such as ADIOS and HDF5.

This section describes some of the tools available to visualize the data.

Asynchronous IO

When using the AMReX plotfile format, users can set the amrex.async_out=1 option to perform the IO in a non-blocking fashion, meaning that the simulation will continue to run while an IO thread controls writing the data to disk. This can significantly reduce the overall time spent in IO. This is primarily intended for large runs on supercomputers (e.g. at OLCF or NERSC); depending on the MPI implementation you are using, you may not see a benefit on your workstation.

When writing plotfiles, each rank will write to a separate file, up to some maximum number (by default, 64). This maximum can be adjusted using the amrex.async_out_nfiles inputs parameter. To use asynchronous IO with more than amrex.async_out_nfiles MPI ranks, WarpX must be configured with -DWarpX_MPI_THREAD_MULTIPLE=ON. Please see the building instructions for details.

In Situ Capabilities

WarpX includes so-called reduced diagnostics. Reduced diagnostics create observables on-the-fly, such as energy histograms or particle beam statistics, and are easily visualized in post-processing.

In addition, WarpX also has in situ visualization capabilities (i.e., visualizing the data directly from the simulation, without dumping data files to disk).

In situ Visualization with SENSEI

SENSEI is a lightweight framework for in situ data analysis. SENSEI’s data model and API provide uniform access to and run-time selection of a diverse set of visualization and analysis back ends including VisIt Libsim, ParaView Catalyst, VTK-m, Ascent, ADIOS, yt, and Python.

SENSEI uses an XML file to select and configure one or more back ends at run time. Run time selection of the back end via XML means one user can access Catalyst, another Libsim, yet another Python with no changes to the code.

System Architecture
https://data.kitware.com/api/v1/item/5c06cd538d777f2179d4aaca/download

SENSEI’s in situ architecture enables use of a diverse set of back ends, which can be selected at run time via an XML configuration file.

The three major architectural components in SENSEI are data adaptors which present simulation data in SENSEI’s data model, analysis adaptors which present the back end data consumers to the simulation, and bridge code from which the simulation manages adaptors and periodically pushes data through the system. SENSEI comes equipped with a number of analysis adaptors enabling use of popular analysis and visualization libraries such as VisIt Libsim, ParaView Catalyst, Python, and ADIOS to name a few. AMReX contains SENSEI data adaptors and bridge code making it easy to use in AMReX based simulation codes.

SENSEI provides a configurable analysis adaptor which uses an XML file to select and configure one or more back ends at run time. Run-time selection of the back end via XML means one user can access Catalyst, another Libsim, and yet another Python, with no changes to the code. This is depicted in Fig. 23. On the left side of the figure, AMReX produces data; the bridge code pushes the data through the configurable analysis adaptor to the back end that was selected at run time.

Compiling with GNU Make

For codes making use of AMReX’s build system add the following variable to the code’s main GNUmakefile.

USE_SENSEI_INSITU = TRUE

When set, AMReX’s make files will query environment variables for the lists of compiler and linker flags, include directories, and link libraries. These lists can be quite elaborate when using more sophisticated back ends, and are best set automatically using the sensei_config command line tool that should be installed with SENSEI. Prior to invoking make, use the following command to set these variables:

source sensei_config

Typically, the sensei_config tool is in the user's PATH after loading the desired SENSEI module. After configuring the build environment with sensei_config, proceed as usual.

make -j4 -f GNUmakefile
ParmParse Configuration

Once an AMReX code has been compiled with SENSEI features enabled, SENSEI will need to be enabled and configured at runtime. This is done using the ParmParse inputs file. The supported parameters are described in the following table.

parameter        description                                                                        default
insitu.int       turns in situ processing on or off and controls how often data is processed       0
insitu.start     controls when in situ processing starts                                           0
insitu.config    points to the SENSEI XML file which selects and configures the desired back end
insitu.pin_mesh  when 1, the lower left corner of the mesh is pinned to 0.,0.,0.                   0

A typical use case is to enable SENSEI by setting insitu.int to be greater than 1, and insitu.config to point SENSEI to an XML file that selects and configures the desired back end.

insitu.int = 2
insitu.config = render_iso_catalyst.xml
Back-end Selection and Configuration

The back end is selected and configured at run time using the SENSEI XML file. The XML sets parameters specific to SENSEI and to the chosen back end. Many of the back ends have sophisticated configuration mechanisms which SENSEI makes use of. For example the following XML configuration was used on NERSC’s Cori with WarpX to render 10 iso surfaces, shown in Fig. 24, using VisIt Libsim.

<sensei>
  <analysis type="libsim" frequency="1" mode="batch"
    session="beam_j_pin.session"
    image-filename="beam_j_pin_%ts" image-width="1200" image-height="900"
    image-format="png" enabled="1"/>
</sensei>

The session attribute names a session file that contains VisIt-specific runtime configuration. The session file is generated using the VisIt GUI on a representative dataset. Usually this dataset is generated in a low-resolution run of the desired simulation.

https://data.kitware.com/api/v1/item/5c06b4b18d777f2179d4784c/download

Rendering of 10 3D iso-surfaces of j using VisIt Libsim. The upper left quadrant has been clipped away to reveal inner structure.

The same run and visualization was repeated using ParaView Catalyst, shown in Fig. 25, by providing the following XML configuration.

<sensei>
  <analysis type="catalyst" pipeline="pythonscript"
    filename="beam_j.py" enabled="1" />
</sensei>

Here the filename attribute is used to pass Catalyst a Catalyst specific configuration that was generated using the ParaView GUI on a representative dataset.

https://data.kitware.com/api/v1/item/5c05b6388d777f2179d207ae/download

Rendering of 10 3D iso-surfaces of j using ParaView Catalyst. The upper left quadrant has been clipped away to reveal inner structure.

The renderings in these runs were configured using a representative dataset which was obtained by running the simulation for a few time steps at a lower spatial resolution. When using VisIt Libsim the following XML configures the VTK writer to write the simulation data in VTK format. At the end of the run a .visit file that VisIt can open will be generated.

<sensei>
  <analysis type="PosthocIO" mode="visit" writer="xml"
     ghost_array_name="avtGhostZones" output_dir="./"
     enabled="1">
  </analysis>
</sensei>

When using ParaView Catalyst the following XML configures the VTK writer to write the simulation data in VTK format. At the end of the run a .pvd file that ParaView can open will be generated.

<sensei>
  <analysis type="PosthocIO" mode="paraview" writer="xml"
     ghost_array_name="vtkGhostType" output_dir="./"
     enabled="1">
  </analysis>
</sensei>
Obtaining SENSEI

SENSEI is hosted on Kitware's GitLab site at https://gitlab.kitware.com/sensei/sensei. It's best to check out the latest release rather than working on the develop branch.

To ease the burden of wrangling back-end installs, SENSEI provides two platforms with all dependencies pre-installed: a VirtualBox VM and a NERSC Cori deployment. New users are encouraged to experiment with one of these.

SENSEI VM

The SENSEI VM comes with all of SENSEI’s dependencies and the major back ends such as VisIt and ParaView installed. The VM is the easiest way to test things out. It also can be used to see how installs were done and the environment configured.

The SENSEI VM can be downloaded here.

The SENSEI VM uses modules to manage the build and run environment. Load the SENSEI modulefile for the back-end you wish to use. The following table describes the available installs and which back-ends are supported in each.

modulefile                      back-end(s)
sensei/2.1.1-catalyst-shared    ParaView Catalyst, ADIOS, Python
sensei/2.1.1-libsim-shared      VisIt Libsim, ADIOS, Python
sensei/2.1.1-vtk-shared         VTK-m, ADIOS, Python

NERSC Cori

SENSEI is deployed at NERSC on Cori. The NERSC deployment includes the major back ends such as ADIOS, ParaView Catalyst, VisIt Libsim, and Python.

The SENSEI installs use modules to manage the build and run environment. Load the SENSEI modulefile for the back-end you wish to use. The following table describes the available installs and which back-ends are supported in each.

modulefile                      back-end(s)
sensei/2.1.0-catalyst-shared    ParaView Catalyst, ADIOS, Python
sensei/2.1.0-libsim-shared      VisIt Libsim, ADIOS, Python
sensei/2.1.0-vtk-shared         VTK-m, ADIOS, Python

To access the SENSEI modulefiles on Cori, first add the SENSEI install to the search path:

module use /usr/common/software/sensei/modulefiles
3D LPA Example

This section shows an example of using SENSEI and three different back ends on a 3D LPA simulation. The instructions are specifically for NERSC Cori, but they also work with the SENSEI VM. The primary difference between working through the examples on Cori or on the VM is that different versions of software are installed.

Rendering with VisIt Libsim

First, log into Cori and clone the git repos.

cd $SCRATCH
mkdir warpx
cd warpx/
git clone https://github.com/ECP-WarpX/WarpX.git WarpX-libsim
git clone https://github.com/AMReX-Codes/amrex
git clone https://github.com/ECP-WarpX/picsar.git
cd WarpX-libsim
vim GNUmakefile

Next, edit the makefile to turn the SENSEI features on.

USE_SENSEI_INSITU=TRUE

Then, load the SENSEI VisIt module, bring SENSEI’s build requirements into the environment, and compile WarpX.

module use /usr/common/software/sensei/modulefiles/
module load sensei/2.1.0-libsim-shared
source sensei_config
make -j8

Download the WarpX input deck, SENSEI XML configuration, and VisIt session files. The inputs file configures WarpX, the XML file configures SENSEI, and the session file configures VisIt. The inputs and XML files are written by hand, while the session file is generated in the VisIt GUI on a representative data set.

wget https://data.kitware.com/api/v1/item/5c05d48e8d777f2179d22f20/download -O inputs.3d
wget https://data.kitware.com/api/v1/item/5c05d4588d777f2179d22f16/download -O beam_j_pin.xml
wget https://data.kitware.com/api/v1/item/5c05d4588d777f2179d22f0e/download -O beam_j_pin.session

To run the demo, submit an interactive job to the batch queue, and launch WarpX.

salloc -C haswell -N 1 -t 00:30:00 -q debug
./Bin/main3d.gnu.TPROF.MPI.OMP.ex inputs.3d
Rendering with ParaView Catalyst

First, log into Cori and clone the git repos.

cd $SCRATCH
mkdir warpx
cd warpx/
git clone https://github.com/ECP-WarpX/WarpX.git WarpX-catalyst
git clone --branch development https://github.com/AMReX-Codes/amrex
git clone https://github.com/ECP-WarpX/picsar.git
cd WarpX-catalyst
vim GNUmakefile

Next, edit the makefile to turn the SENSEI features on.

USE_SENSEI_INSITU=TRUE

Then, load the SENSEI ParaView module, bring SENSEI’s build requirements into the environment, and compile WarpX.

module use /usr/common/software/sensei/modulefiles/
module load sensei/2.1.0-catalyst-shared
source sensei_config
make -j8

Download the WarpX input deck, SENSEI XML configuration, and ParaView session files. The inputs file configures WarpX, the XML file configures SENSEI, and the session file configures ParaView. The inputs and XML files are written by hand, while the session file is generated in the ParaView GUI on a representative data set.

wget https://data.kitware.com/api/v1/item/5c05b3fd8d777f2179d2067d/download -O inputs.3d
wget https://data.kitware.com/api/v1/item/5c05b3fd8d777f2179d20675/download -O beam_j.xml
wget https://data.kitware.com/api/v1/item/5c05b3fc8d777f2179d2066d/download -O beam_j.py

To run the demo, submit an interactive job to the batch queue, and launch WarpX.

salloc -C haswell -N 1 -t 00:30:00 -q debug
./Bin/main3d.gnu.TPROF.MPI.OMP.ex inputs.3d
In situ Calculation with Python

SENSEI’s Python back-end loads a user-provided script file containing callbacks for the Initialize, Execute, and Finalize phases of the run. During the execute phase, the simulation pushes data through SENSEI. SENSEI forwards this data to the user-provided Python function. SENSEI’s MPI communicator is made available to the user’s function via a global variable comm.

Here is a template for the user provided Python code.

# YOUR IMPORTS HERE

# SET DEFAULTS OF GLOBAL VARIABLES THAT INFLUENCE RUNTIME BEHAVIOR HERE

def Initialize():
  """ Initialization code """
  # YOUR CODE HERE
  return

def Execute(dataAdaptor):
  """ Use sensei::DataAdaptor instance passed in
      dataAdaptor to access and process simulation data """
  # YOUR CODE HERE
  return

def Finalize():
  """ Finalization code """
  # YOUR CODE HERE
  return

Initialize and Finalize are optional and will be called if they are provided. Execute is required. SENSEI’s DataAdaptor API is used to obtain data and metadata from the simulation. Data is passed through VTK objects. In WarpX, the vtkOverlappingAMR VTK dataset is used.

The following script shows a simple integration of a scalar quantity over the valid cells of the mesh. The result is saved in a CSV format.

import sys

import numpy as np
from mpi4py import MPI  # provides MPI.SUM used in the reduction below
from vtk import vtkDataObject
from vtk.util.numpy_support import vtk_to_numpy

# default values of control parameters
array = ''
out_file = ''

def Initialize():
  # rank zero writes the result
  if comm.Get_rank() == 0:
    fn = out_file if out_file else 'integrate_%s.csv'%(array)
    f = open(fn, 'w')
    f.write('# time, %s\n'%(array))
    f.close()
  return

def Execute(adaptor):
  # get the mesh and arrays we need
  dobj = adaptor.GetMesh('mesh', False)
  adaptor.AddArray(dobj, 'mesh', vtkDataObject.CELL, array)
  adaptor.AddGhostCellsArray(dobj, 'mesh')
  time = adaptor.GetDataTime()

  # integrate over the local blocks
  varint = 0.
  it = dobj.NewIterator()
  while not it.IsDoneWithTraversal():
    # get the local data block and its props
    blk = it.GetCurrentDataObject()

    # get the array container
    atts = blk.GetCellData()

    # get the data array
    var =  vtk_to_numpy(atts.GetArray(array))

    # get ghost cell mask
    ghost = vtk_to_numpy(atts.GetArray('vtkGhostType'))
    ii = np.where(ghost == 0)[0]

    # integrate over valid cells and accumulate the contribution of this block
    varint += np.sum(var[ii])*np.prod(blk.GetSpacing())

    it.GoToNextItem()

  # reduce integral to rank 0
  varint = comm.reduce(varint, root=0, op=MPI.SUM)

  # rank zero writes the result
  if comm.Get_rank() == 0:
    fn = out_file if out_file else 'integrate_%s.csv'%(array)
    f = open(fn, 'a+')
    f.write('%s, %s\n'%(time, varint))
    f.close()
  return

The following XML configures SENSEI’s Python back-end.

<sensei>
  <analysis type="python" script_file="./integrate.py" enabled="1">
    <initialize_source>
array='rho'
out_file='rho.csv'
     </initialize_source>
  </analysis>
</sensei>

The script_file attribute sets the file path to load the user’s Python code from, and the initialize_source element contains Python code that controls runtime behavior specific to each user provided script.

In situ Visualization with Ascent

Ascent is a system designed to meet the in-situ visualization and analysis needs of simulation code teams running multi-physics calculations on many-core HPC architectures. It provides rendering runtimes that can leverage many-core CPUs and GPUs to render images of simulation meshes.

Compiling with GNU Make

After building and installing Ascent according to the instructions at Building Ascent, you can enable support for it in WarpX by changing the line

USE_ASCENT_INSITU=FALSE

in GNUmakefile to

USE_ASCENT_INSITU=TRUE

Furthermore, you must ensure that either the ASCENT_DIR shell environment variable contains the directory where Ascent is installed or you must specify this location when invoking make, i.e.,

make -j 8 USE_ASCENT_INSITU=TRUE ASCENT_DIR=/path/to/ascent/install
Inputs File Configuration

Once WarpX has been compiled with Ascent support, it will need to be enabled and configured at runtime. This is done using our usual inputs file (read with amrex::ParmParse). The supported parameters are part of the FullDiagnostics with <diag_name>.format parameter set to ascent.

Visualization/Analysis Pipeline Configuration

Ascent uses the file ascent_actions.yaml to configure analysis and visualization pipelines. Ascent looks for the ascent_actions.yaml file in the current working directory.

For example, the following ascent_actions.yaml file extracts an isosurface of the field Ex for 15 levels and saves the resulting images to levels_<nnnn>.png. Ascent Actions provides an overview over all available analysis and visualization actions.

-
  action: "add_pipelines"
  pipelines:
    p1:
      f1:
        type: "contour"
        params:
           field: "Ex"
           levels: 15
-
  action: "add_scenes"
  scenes:
    scene1:
      image_prefix: "levels_%04d"
      plots:
        plot1:
          type: "pseudocolor"
          pipeline: "p1"
          field: "Ex"

Here is another ascent_actions.yaml example that renders isosurfaces and particles:

-
  action: "add_pipelines"
  pipelines:
    p1:
      f1:
        type: "contour"
        params:
           field: "Bx"
           levels: 3
-
  action: "add_scenes"
  scenes:
    scene1:
      plots:
        plot1:
          type: "pseudocolor"
          pipeline: "p1"
          field: "Bx"
        plot2:
          type: "pseudocolor"
          field: "particle_electrons_Bx"
          points:
            radius: 0.0000005
      renders:
        r1:
          camera:
            azimuth: 100
            elevation: 10
          image_prefix: "out_render_3d_%06d"

Finally, here is a more complex ascent_actions.yaml example that creates the same images as the prior example, but adds a trigger that creates a Cinema Database at cycle 300:

-
  action: "add_triggers"
  triggers:
    t1:
      params:
        condition: "cycle() == 300"
        actions_file: "trigger.yaml"
-
  action: "add_pipelines"
  pipelines:
    p1:
      f1:
        type: "contour"
        params:
           field: "jy"
           iso_values: [ 1000000000000.0, -1000000000000.0]
-
  action: "add_scenes"
  scenes:
    scene1:
      plots:
        plot1:
          type: "pseudocolor"
          pipeline: "p1"
          field: "jy"
        plot2:
          type: "pseudocolor"
          field: "particle_electrons_w"
          points:
            radius: 0.0000002
      renders:
        r1:
          camera:
            azimuth: 100
            elevation: 10
          image_prefix: "out_render_jy_part_w_3d_%06d"

When the trigger condition is met, cycle() == 300, the actions in trigger.yaml are also executed:

-
  action: "add_pipelines"
  pipelines:
    p1:
      f1:
        type: "contour"
        params:
           field: "jy"
           iso_values: [ 1000000000000.0, -1000000000000.0]
-
  action: "add_scenes"
  scenes:
    scene1:
      plots:
        plot1:
          type: "pseudocolor"
          pipeline: "p1"
          field: "jy"
        plot2:
          type: "pseudocolor"
          field: "particle_electrons_w"
          points:
            radius: 0.0000001
      renders:
        r1:
          type: "cinema"
          phi: 10
          theta: 10
          db_name: "cinema_out"

You can view the Cinema Database result by opening cinema_databases/cinema_out/index.html.

Replay

With Ascent/Conduit, one can store the intermediate data, before the rendering step is applied, in custom files. These so-called Conduit Blueprint HDF5 files can be “replayed”, i.e., rendered without running the simulation again. VisIt 3.0+ also supports these files.

Replay is a utility that allows the user to replay a simulation from the aforementioned files and render them with Ascent. It enables the user or developer to pick specific time steps and load them for Ascent visualization without running the simulation again.

We will guide you through the replay procedure.

Get Blueprint Files

To use replay, you first need Conduit Blueprint HDF5 files. The following block can be used in an ascent action to extract Conduit Blueprint HDF5 files from a simulation run.

-
  action: "add_extracts"
  extracts:
    e1:
      type: "relay"
      params:
        path: "conduit_blueprint"
        protocol: "blueprint/mesh/hdf5"

The output in the WarpX run directory will look as in the following listing. The .root file is a metadata file and the corresponding directory contains the conduit blueprint data in an internal format that is based on HDF5.

conduit_blueprint.cycle_000000/
conduit_blueprint.cycle_000000.root
conduit_blueprint.cycle_000050/
conduit_blueprint.cycle_000050.root
conduit_blueprint.cycle_000100/
conduit_blueprint.cycle_000100.root

In order to select a few time steps after the fact, a so-called cycles file can be created. A cycles file is a simple text file that lists one root file per line, e.g.:

conduit_blueprint.cycle_000100.root
conduit_blueprint.cycle_000050.root
Run Replay

For Ascent Replay, two command line tools are provided in the utilities/replay directory of the Ascent installation. There are two versions of replay: the MPI-parallel version replay_mpi and a serial version, replay_ser. Use the MPI-parallel replay with data sets created by MPI-parallel builds of WarpX. Here we use replay_mpi as an example.

The options for replay are:

  • --root: specifies Blueprint root file to load

  • --cycles: specifies a text file containing a list of Blueprint root files to load

  • --actions: specifies the name of the actions file to use (default: ascent_actions.yaml)

Instead of starting a simulation that generates data for Ascent, we now execute replay_ser/replay_mpi. Replay will loop over the files listed in cycles in the order in which they appear in the cycles file.

For example, for a small data example that fits on a single computer:

./replay_ser --root=conduit_blueprint.cycle_000400.root --actions=ascent_actions.yaml

This will replay the data of WarpX step 400 (“cycle” 400). A whole set of steps can be replayed with the above-mentioned cycles file:

./replay_ser --cycles=warpx_list.txt --actions=ascent_actions.yaml

For larger examples, e.g. on a cluster with Slurm batch system, a parallel launch could look like this:

# one step
srun -n 8 ./replay_mpi --root=conduit_blueprint.cycle_000400.root --actions=ascent_actions.yaml
# multiple steps
srun -n 8 ./replay_mpi --cycles=warpx_list.txt --actions=ascent_actions.yaml
Example Actions

A visualization of the electric field component \(E_x\) (variable: Ex) with a contour plot and with added particles can be obtained with the following Ascent Action. This action can be used both in replay as well as in situ runs.

-
  action: "add_pipelines"
  pipelines:
    clipped_volume:
      f0:
        type: "contour"
        params:
          field: "Ex"
          levels: 16
      f1:
        type: "clip"
        params:
          topology: topo # name of the amr mesh
          multi_plane:
            point1:
              x: 0.0
              y: 0.0
              z: 0.0
            normal1:
              x: 0.0
              y: -1.0
              z: 0.0
            point2:
              x: 0.0
              y: 0.0
              z: 0.0
            normal2:
              x: -0.7
              y: -0.7
              z: 0.0
    sampled_particles:
      f1:
        type: histsampling
        params:
          field: particle_electrons_uz
          bins: 64
          sample_rate: 0.90
      f2:
        type: "clip"
        params:
          topology: particle_electrons # particle data
          multi_plane:
            point1:
              x: 0.0
              y: 0.0
              z: 0.0
            normal1:
              x: 0.0
              y: -1.0
              z: 0.0
            point2:
              x: 0.0
              y: 0.0
              z: 0.0
            normal2:
              x: -0.7
              y: -0.7
              z: 0.0

# Uncomment this block if you want to create "Conduit Blueprint files" that can
# be used with Ascent "replay" after the simulation run.
# Replay is a workflow to visualize individual steps without running the simulation again.
#-
#  action: "add_extracts"
#  extracts:
#    e1:
#      type: "relay"
#      params:
#        path: "./conduit_blueprint"
#        protocol: "blueprint/mesh/hdf5"

-
  action: "add_scenes"
  scenes:
    scene1:
      plots:
        p0:
          type: "pseudocolor"
          field: "particle_electrons_uz"
          pipeline: "sampled_particles"
        p1:
          type: "pseudocolor"
          field: "Ex"
          pipeline: "clipped_volume"
      renders:
        image1:
          bg_color: [1.0, 1.0, 1.0]
          fg_color: [0.0, 0.0, 0.0]
          image_prefix: "lwfa_Ex_e-uz_%06d"
          camera:
            azimuth: 20
            elevation: 30
            zoom: 2.5

More Ascent Actions examples are available for you to explore.

Workflow

Note

This section is in-progress. TODOs: finalize acceptance testing; update 3D LWFA example

In the preparation of simulations, it is generally useful to first run small, under-resolved versions of the planned simulation layout. Ascent replay is helpful for setting up an in situ visualization pipeline during this process. In the following, a Jupyter-based workflow is shown that can be used to quickly iterate on the design of an ascent_actions.yaml file, repeatedly rendering the same (small) data.

First, run a small simulation, e.g. on a local computer, and create conduit blueprint files (see above). Second, copy the Jupyter Notebook file ascent_replay_warpx.ipynb into the simulation output directory. Third, download and start a Docker container with a prepared Jupyter installation and Ascent Python bindings from the simulation output directory:

docker pull alpinedav/ascent-jupyter:latest
docker run -v$PWD:/home/user/ascent/install-debug/examples/ascent/tutorial/ascent_intro/notebooks/replay -p 8000:8000 -p 8888:8888 -p 9000:9000 -p 10000:10000 -t -i alpinedav/ascent-jupyter:latest

Now, access Jupyter Lab via: http://localhost:8888/lab (password: learn).

Inside the Jupyter Lab is a replay/ directory, which mounts the outer working directory. You can now open ascent_replay_warpx.ipynb and execute all cells. The last two cells are the replay action that can be quickly iterated: modify the replay_actions.yaml cell and execute both cells again.

Note

  • Keep an eye on the terminal: if a replay action is erroneous, the error will show up in the terminal that started the Docker container. (TODO: We might want to catch that inside Python and print it in Jupyter instead.)

  • If you remove a “key” from the replay action, you might see an error in the AscentViewer. Restart and execute all cells in that case.

If you like the 3D rendering of laser wakefield acceleration on the WarpX documentation front page (which is also the avatar of the ECP-WarpX organization), you can find the serial analysis script video_yt.py, as well as a parallel analysis script of the same name, video_yt.py, used to make a similar rendering for a beam-driven wakefield simulation, running in parallel.

Staggering in Data Output

Warning: currently, quantities in the output file for iteration n are not all defined at the same physical time due to the staggering in time in WarpX. The table below provides the physical time at which each quantity in the output file is written, in units of time step, for time step n.

quantity     staggering
E            n
B            n
j            n-1/2
rho          n
position     n
momentum     n-1/2

yt-project

yt is a Python package that can help in analyzing and visualizing WarpX data (among other data formats). It is convenient to use yt within a Jupyter notebook.

Data Support

yt primarily supports WarpX through plotfiles. There is also support for openPMD HDF5 files in yt (w/o mesh refinement).

Installation

From the terminal, install the latest version of yt:

python3 -m pip install cython
python3 -m pip install --upgrade yt

Alternatively, yt can be installed via their installation script, see yt installation web page.

Visualizing the data

Once data (“plotfiles”) has been created by the simulation, open a Jupyter notebook from the terminal:

jupyter notebook

Then use the following commands in the first cell of the notebook to import yt and load the first plot file:

import yt
ds = yt.load('./diags/plotfiles/plt00000/')

The list of field data and particle data stored can be seen with:

ds.field_list

For a quick start-up, the most useful commands for post-processing can be found in our Jupyter notebook Visualization.ipynb.

Field data

Field data can be visualized using yt.SlicePlot (see the docstring of this function here)

For instance, in order to plot the field Ex in a slice orthogonal to y (the axis of index 1):

yt.SlicePlot( ds, 1, 'Ex', origin='native' )

Note

yt.SlicePlot creates a 2D plot with the same aspect ratio as the physical size of the simulation box. Sometimes this can lead to very elongated plots that are difficult to read. You can modify the aspect ratio with the aspect argument; for instance:

yt.SlicePlot( ds, 1, 'Ex', aspect=1./10 )

Alternatively, the data can be obtained as a numpy array.

For instance, in order to obtain the field jz (on level 0) as a numpy array:

ad0 = ds.covering_grid(level=0, left_edge=ds.domain_left_edge, dims=ds.domain_dimensions)
jz_array = ad0['jz'].to_ndarray()
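
As a quick, hedged example of what one can do with such an array, the slice below is rendered with matplotlib; the squeeze and transpose are assumptions that depend on the dimensionality and axis ordering of your dataset (here, a 2D run with axes x and z):

import matplotlib.pyplot as plt

# display the jz array obtained above; for a 2D dataset the squeezed array
# has shape (nx, nz), so the transpose puts x on the horizontal axis
plt.imshow(jz_array.squeeze().T, origin='lower', aspect='auto')
plt.colorbar(label='jz')
plt.savefig('jz_level0.png')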

Particle data

Particle data can be visualized using yt.ParticlePhasePlot (see the docstring here).

For instance, in order to plot the particles’ x and y positions:

yt.ParticlePhasePlot( ds.all_data(), 'particle_position_x', 'particle_position_y', 'particle_weight')

Alternatively, the data can be obtained as a numpy array.

For instance, in order to obtain the array of position x as a numpy array:

ad = ds.all_data()
x = ad['particle_position_x'].to_ndarray()
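
Building on this, a small sketch (assuming a 3D run where particle_position_y exists; in 2D, use particle_position_z for the second axis) that renders a 2D histogram of the particle positions with matplotlib:

import matplotlib.pyplot as plt

ad = ds.all_data()
x = ad['particle_position_x'].to_ndarray()
y = ad['particle_position_y'].to_ndarray()

# 2D histogram of the macro-particle positions
plt.hist2d(x, y, bins=200)
plt.xlabel('x')
plt.ylabel('y')
plt.savefig('particle_positions.png')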

Further information

A lot more information can be obtained from the yt documentation, and the corresponding notebook tutorials here.

Out-of-the-box plotting script

A ready-to-use Python script for plotting simulation results is available at plot_parallel.py. Feel free to use it out-of-the-box or to modify it to suit your needs.

Dependencies

Most of its dependencies are standard Python packages that come with a default Anaconda installation or can be installed with pip or conda: os, sys, argparse, matplotlib, scipy.

Additional dependencies are yt >= 4.0.1 and mpi4py.

Run serial

Executing the script with

python plot_parallel.py

will loop through plotfiles named plt????? (e.g., plt00000, plt00100 etc.) and save one image per plotfile. For a 2D simulation, a 2D colormap of the Ez field is plotted by default, with 1/20 of particles of each species (with different colors). For a 3D simulation, a 2D colormap of the central slices in y is plotted, and particles are handled the same way.

The script reads command-line options (which field and particle species, rendering with yt or matplotlib, etc.). For the full list of options, run

python plot_parallel.py --help

In particular, option --plot_Ey_max_evolution shows you how to plot the evolution of a scalar quantity over time (by default, the max of the Ey field). Feel free to modify it to plot the evolution of other quantities.

Run parallel

To execute the script in parallel, you can run for instance

mpirun -np 4 python plot_parallel.py --parallel

In this case, MPI ranks will share the plotfiles to process as evenly as possible. Note that each plotfile is still processed in serial. When option --plot_Ey_max_evolution is on, the scalar quantity is gathered to rank 0, and rank 0 plots the image.
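
The sketch below is not the implementation of plot_parallel.py, but illustrates the idea of sharing plotfiles between MPI ranks with mpi4py; the plotfile path is an assumption matching the examples above, and each plotfile is still processed in serial by the rank that owns it:

from glob import glob

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# round-robin distribution of the plotfiles over the MPI ranks
plotfiles = sorted(glob('./diags/plotfiles/plt?????'))
my_plotfiles = plotfiles[rank::size]

for plotfile in my_plotfiles:
    # each plotfile is processed in serial by a single rank
    print(f"rank {rank} processes {plotfile}")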

If all dependencies are satisfied, the script can be used on Summit or Cori. For instance, the following batch script illustrates how to submit a post-processing batch job on Cori haswell with some options:

#!/bin/bash

# Copyright 2019 Maxence Thevenet
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL

#SBATCH --job-name=postproc
#SBATCH --time=00:20:00
#SBATCH -C haswell
#SBATCH -N 8
#SBATCH -q regular
#SBATCH -e postproce.txt
#SBATCH -o postproco.txt
#SBATCH --mail-type=end
#SBATCH --account=m2852

export OMP_NUM_THREADS=1

# Requires python3 and yt > 3.5
srun -n 32 -c 16 python plot_parallel.py --path <path/to/plotfiles> --plotlib=yt --parallel

Advanced Visualization of Plotfiles With yt (for developers)

This section contains yt commands for advanced users. The Particle-In-Cell method uses a staggered grid (see particle-in-cell theory), so that the x, y, and z components of the electric and magnetic fields are all defined at different locations in space. Regular output (see the yt-project page, or the notebook at WarpX/Tools/PostProcessing/Visualization.ipynb for an example) returns cell-centered data for convenience, which involves an additional operation. It is sometimes useful to access the raw data directly. Furthermore, the WarpX implementation for mesh refinement contains a number of grids for each level (coarse, fine and auxiliary, see the theory for more details), and it is sometimes useful to access each of them (regular output returns the auxiliary grid only). This page provides information to read raw data of all grids.

Write Raw Data

For a given diagnostic the user has the option to write the raw data by setting <diag_name>.plot_raw_fields = 1. Moreover, the user has the option to write also the values of the fields in the guard cells by setting <diag_name>.plot_raw_fields_guards = 1. Please refer to Input Parameters for more information.

Read Raw Data

Meta-data relevant to this topic (for example, number and locations of grids in the simulation) are accessed with

import yt
# get yt dataset
ds = yt.load( './plotfiles/plt00004' )
# Index of data in the plotfile
ds_index = ds.index
# Print the number of grids in the simulation
ds_index.grids.shape
# Left and right physical boundary of each grid
ds_index.grid_left_edge
ds_index.grid_right_edge
# List available fields
ds.field_list

When <diag_name>.plot_raw_fields = 1, here are some useful commands to access properties of a grid and the Ex field on the fine patch:

# store grid number 2 into my_grid
my_grid = ds.index.grids[2]
# Get left and right edges of my_grid
my_grid.LeftEdge
my_grid.RightEdge
# Get Level of my_grid
my_grid.Level
# left edge of the grid, in number of points
my_grid.start_index

Return the Ex field on the fine patch of grid my_grid:

my_field = my_grid['raw', 'Ex_fp'].squeeze().v

For a 2D plotfile, my_field has shape (nx,nz,2). The last component stands for the two values on the edges of each cell for the electric field, due to field staggering. The NumPy function squeeze removes dimensions of size one. While yt arrays are unit-aware, it is sometimes useful to extract the data into unitless numpy arrays. This is achieved with .v. In the case of Ex_fp, the staggering is along direction x, so that my_field[:,:-1,1] == my_field[:,1:,0].

All combinations of the fields (E or B), the component (x, y or z) and the grid (_fp for fine, _cp for coarse and _aux for auxiliary) can be accessed in this way, i.e., my_grid['raw', 'Ey_aux'] or my_grid['raw', 'Bz_cp'] are valid queries.
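
For instance, a minimal sketch that loops over these combinations and prints the shape of every raw field actually present in the plotfile (combinations that were not written are simply skipped):

# probe all field/patch combinations on my_grid; missing ones raise and are skipped
for comp in ['Ex', 'Ey', 'Ez', 'Bx', 'By', 'Bz']:
    for patch in ['_fp', '_cp', '_aux']:
        name = comp + patch
        try:
            arr = my_grid['raw', name].squeeze().v
            print(name, arr.shape)
        except Exception:
            pass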

Read Raw Data With Guard Cells

When the output includes the data in the guard cells, the user can read such data using the post-processing tool read_raw_data.py, available in Tools/PostProcessing/, as illustrated in the following example:

from read_raw_data import read_data

# Load all data saved in a given path
path = './diags/diag00200/'
data = read_data(path)

# Load Ex_fp on mesh refinement level 0
level = 0
field = 'Ex_fp'
# data[level] is a dictionary, data[level][field] is a numpy array
my_field = data[level][field]

Note that a list of all available raw fields written to output, that is, a list of all valid strings that the variable field in the example above can be assigned to, can be obtained by calling data[level].keys().

In order to plot a 2D slice of the data with methods like matplotlib.axes.Axes.imshow, one might want to pass the correct extent (the bounding box in data coordinates that the image will fill), including the guard cells. One way to set the correct extent is illustrated in the following example (case of a 2D slice in the (x,z) plane):

import yt
import numpy as np

from read_raw_data import read_data

# Load all data saved in a given path
path = './diags/diag00200/'
data = read_data(path)

# Load Ex_fp on mesh refinement level 0
level = 0
field = 'Ex_fp'
# data[level] is a dictionary, data[level][field] is a numpy array
my_field = data[level][field]

# Set the number of cells in the valid domain
# by loading the standard output data with yt
ncells = yt.load(path).domain_dimensions

# Set the number of dimensions automatically (2D or 3D)
dim = 2 if (ncells[2] == 1) else 3

xdir = 0
zdir = 1 if (dim == 2) else 2

# Set the extent (bounding box in data coordinates, including guard cells)
# to be passed to matplotlib.axes.Axes.imshow
left_edge_x  = 0            - (my_field.shape[xdir] - ncells[xdir]) // 2
right_edge_x = ncells[xdir] + (my_field.shape[xdir] - ncells[xdir]) // 2
left_edge_z  = 0            - (my_field.shape[zdir] - ncells[zdir]) // 2
right_edge_z = ncells[zdir] + (my_field.shape[zdir] - ncells[zdir]) // 2
extent = np.array([left_edge_z, right_edge_z, left_edge_x, right_edge_x])
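
With the extent computed above, a minimal usage sketch for the 2D case (assuming the rows of my_field run along x and the columns along z) is:

import matplotlib.pyplot as plt

# guard cells appear at negative indices and beyond ncells
plt.imshow(my_field, origin='lower', extent=extent, aspect='auto')
plt.xlabel('z (cell index)')
plt.ylabel('x (cell index)')
plt.colorbar(label=field)
plt.savefig('Ex_fp_with_guards.png')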

openPMD-viewer

openPMD-viewer is an open-source Python package to access openPMD data.

It allows you to:

  • Quickly browse through the data, with a GUI-type interface in the Jupyter notebook

  • Have access to the data numpy array, for more detailed analysis

Installation

openPMD-viewer can be installed via conda or pip:

conda install -c conda-forge openpmd-viewer openpmd-api
python3 -m pip install openPMD-viewer openPMD-api

Usage

openPMD-viewer can be used either in simple Python scripts or in Jupyter. For interactive plots in Jupyter notebook or Jupyter Lab, add this “cell magic” to the first line of your notebook:

%matplotlib widget

If none of those work, e.g. because ipympl is not properly installed, you can as a last resort always try %matplotlib inline for non-interactive plots.

In both interactive and scripted usage, you can import openPMD-viewer, and load the data with the following commands:

from openpmd_viewer import OpenPMDTimeSeries
ts = OpenPMDTimeSeries('./diags/diag1/')

Note

If you are using the Jupyter notebook, then you can start a pre-filled notebook, which already contains the above lines, by typing in a terminal:

openPMD_notebook

When using the Jupyter notebook, you can quickly browse through the data by using the command:

ts.slider()

You can also access the particle and field data as numpy arrays with the methods ts.get_field and ts.get_particle. See the openPMD-viewer tutorials here for more info.
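
For scripted (non-interactive) use, here is a short hedged sketch; the species name electrons and the availability of the field E with coordinate z are assumptions that depend on your diagnostics configuration:

from openpmd_viewer import OpenPMDTimeSeries

ts = OpenPMDTimeSeries('./diags/diag1/')
iteration = ts.iterations[-1]

# field data: array of E_z plus axes metadata
Ez, info = ts.get_field(field='E', coord='z', iteration=iteration)

# particle data: arrays of longitudinal position and momentum
z, uz = ts.get_particle(var_list=['z', 'uz'], species='electrons',
                        iteration=iteration)
print(Ez.shape, z.shape)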

openPMD-api

openPMD-api is an open-source C++ and Python API for openPMD data.

Please see the openPMD-api manual for a quick introduction:

https://openpmd-api.readthedocs.io

3D Visualization: ParaView

WarpX results can be visualized by ParaView, an open source visualization and analysis software. ParaView can be downloaded and installed from https://www.paraview.org. Use the latest version for best results.

Tutorials

ParaView is a powerful, general parallel rendering program. If this is your first time using ParaView, consider starting with a tutorial.

openPMD

WarpX’ openPMD files can be visualized with ParaView 5.9+. ParaView supports ADIOS1, ADIOS2 and HDF5 files, as it implements (like WarpX) against openPMD-api.

For openPMD output, WarpX automatically creates a .pmd file per diagnostic, which can be opened with ParaView.

Tip

When you first open ParaView, adjust its global Settings (Linux: under menu item Edit). General -> Advanced -> Search for data -> Data Processing Options. Check the box Auto Convert Properties.

This will simplify application of filters, e.g., contouring of components of vector fields, without first adding a calculator that extracts a single component or magnitude.

Warning

WarpX issue 21162: We currently load WarpX field data with a rotation. Please apply rotation of 0 -90 0 to mesh data.

Warning

ParaView issue 21837: In order to visualize particle traces with the Temporal Particles To Pathlines, you need to apply the Merge Blocks filter first.

If you have multiple species, you may have to extract the species you want with Extract Block before applying Merge Blocks.

Plotfiles (AMReX)

ParaView also supports visualizing AMReX plotfiles. Please see the AMReX documentation for more details.

3D Visualization: VisIt

WarpX results can be visualized by VisIt, an open source visualization and analysis software. VisIt can be downloaded and installed from https://wci.llnl.gov/simulation/computer-codes/visit.

openPMD (HDF5)

WarpX’ openPMD files can be visualized with VisIt 3.1.0+. VisIt supports openPMD HDF5 files and requires renaming the files from .h5 to .opmd so that they are automatically detected.
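
A minimal Python sketch (assuming openPMD HDF5 output in ./diags/diag1/) that copies the files under the .opmd extension so that VisIt can auto-detect them:

import shutil
from pathlib import Path

# copy each openPMD HDF5 file to an .opmd twin that VisIt auto-detects
for h5file in Path('./diags/diag1').glob('*.h5'):
    shutil.copy(h5file, h5file.with_suffix('.opmd'))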

Plotfiles (AMReX)

Assuming that you ran a 2D simulation, here are instructions for making a simple plot from a given plotfile:

  • Open the header file: Run VisIt, then select “File” -> “Open file …”, then select the Header file associated with the plotfile of interest (e.g., plt10000/Header).

  • View the data: Select “Add” -> “Pseudocolor” -> “Ez” and select “Draw”. You can select other variables to draw, such as jx, jy, jz, Ex, …

  • View the grid structure: Select “Subset” -> “levels”. Then double click the text “Subset-levels”, enable the “Wireframe” option, select “Apply”, select “Dismiss”, and then select “Draw”.

  • Save the image: Select “File” -> “Set save options”, then customize the image format to your liking, then click “Save”.

Your image should look similar to the one below


In 3D, you must apply the “Operators” -> “Slicing” -> “ThreeSlice” operator. You can left-click and drag over the image to rotate it and generate the view you like.

To make a movie, you must first create a text file named movie.visit with a list of the Header files for the individual frames.

The next step is to run VisIt, select “File” -> “Open file …”, then select movie.visit. Create an image to your liking and press the “play” button on the VCR-like control panel to preview all the frames. To save the movie, choose “File” -> “Save movie …”, and follow the instructions on the screen.

VisualPIC

VisualPIC is an open-source Python GUI for visual data analysis, especially for advanced accelerator simulations. It supports WarpX’ data through openPMD files.

Installation

mamba install -c conda-forge python vtk pyvista pyqt
python3 -m pip install git+https://github.com/AngelFP/VisualPIC.git@dev

Usage

VisualPIC provides a Python data reader API and plotting capabilities. It is designed for small to medium-size data sets that fit in the RAM of a single computer.

Plotting can be performed via command line tools or scripted with Python. The command line tools are:

  • vpic [options] <path/to/diagnostics/>: 2D matplotlib plotter, e.g., for particle phase space

  • vpic3d [options] <path/to/diagnostics/>: 3D VTK renderer

Example: vpic3d -s beam -rho -Ez diags/diag1/ could be used to visualize the witness beam, plasma density, and accelerating field of an LWFA.

Example: vpic3d -Ex diags/diag1/ could be used to visualize the transverse focusing field \(E_x\) in a plasma wake behind a laser pulse (linearly polarized in \(E_y\)), see below:

Example view of a 3D rendering with VisualPIC.

Python-scripted rendering allows more flexible options, such as selecting and cutting views, rendering directly into an image file, looping for animations, etc. As with matplotlib scripts, Python-scripted scenes can also be used to open a GUI and then browse time series interactively. The VisualPIC examples provide showcases for scripting.

Repository

The source code can be found under:

https://github.com/AngelFP/VisualPIC

PICViewer

PICViewer is a visualization GUI implemented on PyQt. The toolkit provides various easy-to-use functions for data analysis of Warp/WarpX simulations.

It works for both plotfiles and openPMD files.

Main features

  • 2D/3D openPMD or WarpX data visualization,

  • Multi-plot panels (up to 6 rows x 5 columns) which can be controlled independently or synchronously

  • Interactive mouse functions (panel selection, image zoom-in, local data selection, etc)

  • Animation from a single or multiple panel(s)

  • Saving your job configuration and loading it later

  • Interface to use VisIt, yt, or mayavi for 3D volume rendering (currently updating)

Required software

Installation

python3 -m pip install picviewer

You need to install yt and PySide separately.

You can install from the source for the latest update,

python3 -m pip install git+https://bitbucket.org/ecp_warpx/picviewer/

To install manually

  • Clone this repository

    git clone https://bitbucket.org/ecp_warpx/picviewer/
    
  • Switch to the cloned directory with cd picviewer and type python setup.py install

To run

  • You can start PICViewer from any directory. Type picviewer in the command line. Select a folder where your data files are located.

  • You can directly open your data. Move to the folder where your data files are located (cd [your data folder]) and type picviewer in the command line.

Note

We currently seek a new maintainer for PICViewer. Please contact us if you are interested.

Reduced diagnostics

WarpX has optional reduced diagnostics that typically return one value (e.g., particle energy) per timestep.

A simple and quick way to read the data using Python is

import numpy
data = numpy.genfromtxt("filename.txt")

where data is a two dimensional array, data[i][j] gives the data in the ith row and the jth column.
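
For instance, a hedged plotting sketch: in most reduced diagnostics text files the first two columns are the step and the physical time (check the header line of your file), followed by the diagnostic values:

import matplotlib.pyplot as plt

time = data[:, 1]    # second column: physical time (assumption, see the file header)
value = data[:, 2]   # third column: first diagnostic value
plt.plot(time, value)
plt.xlabel('time (s)')
plt.savefig('reduced_diag.png')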

A Python function to read the data is available from module read_raw_data in WarpX/Tools/PostProcessing/:

from read_raw_data import read_reduced_diags
filename = 'EF.txt'
metadata, data = read_reduced_diags( filename )
# list available diagnostics
data.keys()
# Print total field energy on level 0
data['total_lev0']
# Print units for the total field energy on level 0
metadata['units']['total_lev0']

In addition, for reduced diagnostic type ParticleHistogram, another Python function is available:

from read_raw_data import read_reduced_diags_histogram
filename = 'velocity_distribution.txt'
metadata_dict, data_dict, bin_value, bin_data = read_reduced_diags_histogram( filename )
# 1-D array of the ith bin value
bin_value[i]
# 2-D array of the jth bin data at the ith time
bin_data[i][j]
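
For example, a short sketch that plots the histogram at the last output time (the choice of time index is arbitrary here):

import matplotlib.pyplot as plt

i = -1  # last output time
plt.plot(bin_value, bin_data[i])
plt.xlabel('bin value')
plt.ylabel('counts')
plt.savefig('histogram_last_time.png')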

Another available reduced diagnostic is ParticleHistogram2D. It computes a 2D histogram of particle data with user-specified axes and value functions. The output data is stored in openPMD files gathered in a hist2D/ folder.

Workflows

This section collects typical user workflows and best practices for data analysis with WarpX.

Port Tunneling

SSH port tunneling (port forwarding) is a secure way to access a computational service of a remote computer. A typical workflow where you might need port tunneling is for Jupyter data analysis, e.g., when analyzing data on your desktop computer but working from your laptop.

Before getting started here, please note that many HPC centers offer a pre-installed Jupyter service, where tunneling is not needed. For example, see the NERSC Jupyter and OLCF Jupyter services.

Introduction

When running a service such as Jupyter from your command line, it will open a local (web) port. The IPv4 loopback address of your local computer is always 127.0.0.1, with the alias localhost.

As a secure default, you cannot connect from outside your local computer to this port. This prevents misconfigurations where one could, in the worst case, connect to your open port without authentication and execute commands with your user privileges.

One way to access your remote Jupyter desktop service from your laptop is to forward the port started remotely via an encrypted SSH connection to a local port on your current laptop. The following section will explain the detailed workflow.

Workflow

  • you connect via SSH to your desktop at work, in a terminal (A) as usual

    • e.g., ssh username@your-computers-hostname.dhcp.lbl.gov

    • start Jupyter locally in headless mode, e.g., jupyter lab --no-browser

    • this will show you a 127.0.0.1 (aka localhost) URL, by default on TCP port 8888

    • you cannot reach that URL with your browser, because you are not sitting at that computer

  • You now start a second terminal (B) locally, which forwards the remote port 8888 to your local laptop

    • this step must be done after Jupyter was started on the desktop

    • ssh -L <laptop-port>:<Ip-as-seen-on-desktop>:<desktop-port> <desktop-ip> -N

    • concretely: ssh -L 8888:localhost:8888 your-computers-hostname.dhcp.lbl.gov -N

      • note: Jupyter on the desktop will use the next free port if 8888 is already in use.

      • note: take another port on your laptop if you have local Jupyter instances still running

  • Now open the browser on your local laptop and open the URL from Jupyter that contains .../127.0.0.1:8888/...

To close the connection down, do this:

  • stop Jupyter in terminal A: Ctrl+C and confirm with y, Enter

  • Ctrl+C the SSH tunnel in terminal B

Example view of a remotely started Jupyter service, an active SSH tunnel, and a local browser connecting to the service.

Theory

Introduction

Plasma laser-driven (top) and charged-particles-driven (bottom) acceleration (rendering from 3-D Particle-In-Cell simulations). A laser beam (red and blue disks in top picture) or a charged particle beam (red dots in bottom picture) propagating (from left to right) through an under-dense plasma (not represented) displaces electrons, creating a plasma wakefield that supports very high electric fields (pale blue and yellow). These electric fields, which can be orders of magnitude larger than with conventional techniques, can be used to accelerate a short charged particle beam (white) to high-energy over a very short distance.

Computer simulations have had a profound impact on the design and understanding of past and present plasma acceleration experiments [1, 2, 3, 4, 5]. Accurate modeling of wake formation, electron self-trapping and acceleration requires fully kinetic methods (usually Particle-In-Cell) using large computational resources due to the wide range of space and time scales involved. Numerical modeling complements and guides the design and analysis of advanced accelerators, and can reduce development costs significantly. Despite the major recent experimental successes [6, 7, 8, 9], the various advanced acceleration concepts need significant progress to fulfill their potential. To this end, large-scale simulations will continue to be a key component toward reaching a detailed understanding of the complex interrelated physics phenomena at play.

For such simulations, the most popular algorithm is the Particle-In-Cell (or PIC) technique, which represents electromagnetic fields on a grid and particles by a sample of macroparticles. However, these simulations are extremely computationally intensive, due to the need to resolve the evolution of a driver (laser or particle beam) and an accelerated beam into a structure that is orders of magnitude longer and wider than the accelerated beam. Various techniques or reduced models have been developed to allow multidimensional simulations at manageable computational costs: quasistatic approximation [10, 11, 12, 13, 14], ponderomotive guiding center (PGC) models [11, 12, 14, 15, 16], simulation in an optimal Lorentz boosted frame [17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29], expanding the fields into a truncated series of azimuthal modes [30, 31, 32, 33, 34], fluid approximation [12, 15, 35] and scaled parameters [4, 36].

[1]

F. S. Tsung, W. Lu, M. Tzoufras, W. B. Mori, C. Joshi, J. M. Vieira, L. O. Silva, and R. A. Fonseca. Simulation Of Monoenergetic Electron Generation Via Laser Wakefield Accelerators For 5-25 TW Lasers. Physics of Plasmas, 13(5):56708, May 2006. doi:10.1063/1.2198535.

[2]

C. G. R. Geddes, D. L. Bruhwiler, J. R. Cary, W. B. Mori, J.-L. Vay, S. F. Martins, T. Katsouleas, E. Cormier-Michel, W. M. Fawley, C. Huang, X. Wang, B. Cowan, V. K. Decyk, E. Esarey, R. A. Fonseca, W. Lu, P. Messmer, P. Mullowney, K. Nakamura, K. Paul, G. R. Plateau, C. B. Schroeder, L. O. Silva, C. Toth, F. S. Tsung, M. Tzoufras, T. Antonsen, J. Vieira, and W. P. Leemans. Computational Studies And Optimization Of Wakefield Accelerators. In Journal of Physics: Conference Series, volume 125, 012002 (11 Pp.). 2008.

[3]

C. G.R. Geddes, E. Cormier-Michel, E. H. Esarey, C. B. Schroeder, J.-L. Vay, W. P. Leemans, D. L. Bruhwiler, J. R. Cary, B. Cowan, M. Durant, P. Hamill, P. Messmer, P. Mullowney, C. Nieter, K. Paul, S. Shasharina, S. Veitzer, G. Weber, O. Rubel, D. Ushizima, W. Bethel, and J. Wu. Laser Plasma Particle Accelerators: Large Fields For Smaller Facility Sources. In Scidac Review 13, number 13, 13–21. 2009. URL: https://www.osti.gov/biblio/971264.

[4]

C. G. R. Geddes, E. Cormier-Michel, E. Esarey, C. B. Schroeder, and W. P. Leemans. Scaled Simulation Design Of High Quality Laser Wakefield Accelerator Stages. In Proc. Particle Accelerator Conference. Vancouver, Canada, 2009.

[5]

C. Huang, W. An, V. K. Decyk, W. Lu, W. B. Mori, F. S. Tsung, M. Tzoufras, S. Morshed, T. Antonsen, B. Feng, T. Katsouleas, R. A. Fonseca, S. F. Martins, J. Vieira, L. O. Silva, E. Esarey, C. G. R. Geddes, W. P. Leemans, E. Cormier-Michel, J.-L. Vay, D. L. Bruhwiler, B. Cowan, J. R. Cary, and K. Paul. Recent Results And Future Challenges For Large Scale Particle-In-Cell Simulations Of Plasma-Based Accelerator Concepts. Journal of Physics: Conference Series, 180(1):012005 (11 Pp.), 2009.

[6]

W. P. Leemans, A. J. Gonsalves, H.-S. Mao, K. Nakamura, C. Benedetti, C. B. Schroeder, Cs. Tóth, J. Daniels, D. E. Mittelberger, S. S. Bulanov, J.-L. Vay, C. G. R. Geddes, and E. Esarey. Multi-GeV Electron Beams from Capillary-Discharge-Guided Subpetawatt Laser Pulses in the Self-Trapping Regime. Phys. Rev. Lett., 113(24):245002, Dec 2014. URL: http://link.aps.org/doi/10.1103/PhysRevLett.113.245002, doi:10.1103/PhysRevLett.113.245002.

[7]

I. Blumenfeld, C. E. Clayton, F.-J. Decker, M. J. Hogan, C. Huang, R. Ischebeck, R. Iverson, C. Joshi, T. Katsouleas, N. Kirby, W. Lu, K. A. Marsh, W. B. Mori, P. Muggli, E. Oz, R. H. Siemann, D. Walz, and M. Zhou. Energy doubling of 42[thinsp]GeV electrons in a metre-scale plasma wakefield accelerator. Nature, 445(7129):741–744, Feb 2007. URL: http://dx.doi.org/10.1038/nature05538.

[8]

S. V. Bulanov, J. J. Wilkens, T. Z. Esirkepov, G. Korn, G. Kraft, S. D. Kraft, M. Molls, and V. S. Khoroshkov. Laser ion acceleration for hadron therapy. Physics-Uspekhi, 57(12):1149, 2014. URL: http://stacks.iop.org/1063-7869/57/i=12/a=1149.

[9]

S. Steinke, J. van Tilborg, C. Benedetti, C. G. R. Geddes, C. B. Schroeder, J. Daniels, K. K. Swanson, A. J. Gonsalves, K. Nakamura, N. H. Matlis, B. H. Shaw, E. Esarey, and W. P. Leemans. Multistage coupling of independent laser-plasma accelerators. Nature, 530(7589):190–193, Feb 2016. URL: http://dx.doi.org/10.1038/nature16525.

[10]

P. Sprangle, E. Esarey, and A. Ting. Nonlinear theory of intense laser-plasma interactions. Physical Review Letters, 64(17):2011–2014, Apr 1990.

[11]

T. M. Antonsen and P. Mora. Self-Focusing And Raman-Scattering Of Laser-Pulses In Tenuous Plasmas. Physical Review Letters, 69(15):2204–2207, Oct 1992. doi:10.1103/Physrevlett.69.2204.

[12]

J. Krall, A. Ting, E. Esarey, and P. Sprangle. Enhanced Acceleration In A Self-Modulated-Laser Wake-Field Accelerator. Physical Review E, 48(3):2157–2161, Sep 1993. doi:10.1103/Physreve.48.2157.

[13]

P. Mora and T. M. Antonsen. Kinetic Modeling Of Intense, Short Laser Pulses Propagating In Tenuous Plasmas. Phys. Plasmas, 4(1):217–229, Jan 1997. doi:10.1063/1.872134.

[14]

C. Huang, V. K. Decyk, C. Ren, M. Zhou, W. Lu, W. B. Mori, J. H. Cooley, T. M. Antonsen, Jr, and T. Katsouleas. Quickpic: A Highly Efficient Particle-In-Cell Code For Modeling Wakefield Acceleration In Plasmas. Journal of Computational Physics, 217(2):658–679, Sep 2006. doi:10.1016/J.Jcp.2006.01.039.

[15]

C. Benedetti, C. B. Schroeder, E. Esarey, C. G. R. Geddes, and W. P. Leemans. Efficient Modeling Of Laser-Plasma Accelerators With Inf&Rno. Aip Conference Proceedings, 1299:250–255, 2010. doi:10.1063/1.3520323.

[16]

B. M. Cowan, D. L. Bruhwiler, E. Cormier-Michel, E. Esarey, C. G. R. Geddes, P. Messmer, and K. M. Paul. Characteristics Of An Envelope Model For Laser-Plasma Accelerator Simulation. Journal of Computational Physics, 230(1):61–86, 2011. doi:10.1016/J.Jcp.2010.09.009.

[17]

J.-L. Vay. Noninvariance Of Space- And Time-Scale Ranges Under A Lorentz Transformation And The Implications For The Study Of Relativistic Interactions. Physical Review Letters, 98(13):130405/1–4, 2007.

[18]

D. L. Bruhwiler, J. R. Cary, B. M. Cowan, K. Paul, C. G. R. Geddes, P. J. Mullowney, P. Messmer, E. Esarey, E. Cormier-Michel, W. Leemans, and J.-L. Vay. New Developments In The Simulation Of Advanced Accelerator Concepts. In Aip Conference Proceedings, volume 1086, 29–37. 2009.

[19]

J.-L. Vay, D. L. Bruhwiler, C. G. R. Geddes, W. M. Fawley, S. F. Martins, J. R. Cary, E. Cormier-Michel, B. Cowan, R. A. Fonseca, M. A. Furman, W. Lu, W. B. Mori, and L. O. Silva. Simulating Relativistic Beam And Plasma Systems Using An Optimal Boosted Frame. Journal of Physics: Conference Series, 180(1):012006 (5 Pp.), 2009.

[20]

J.-L. Vay, W. M. Fawley, C. G. R. Geddes, E. Cormier-Michel, and D. P. Grote. Application of the reduction of scale range in a Lorentz boosted frame to the numerical simulation of particle acceleration devices. In Proc. Particle Accelerator Conference. Vancouver, Canada, 2009.

[21]

S. F. Martins, R. A. Fonseca, L. O. Silva, and W. B. Mori. Boosted Frame PIC Simulations of LWFA: Towards the Energy Frontier. In Proc. Particle Accelerator Conference. Vancouver, Canada, 2009.

[22]

J.‐L. Vay, C. G. R. Geddes, C. Benedetti, D. L. Bruhwiler, E. Cormier‐Michel, B. M. Cowan, J. R. Cary, and D. P. Grote. Modeling Laser Wakefield Accelerators In A Lorentz Boosted Frame. AIP Conference Proceedings, 1299(1):244–249, Nov 2010. URL: https://doi.org/10.1063/1.3520322, doi:10.1063/1.3520322.

[23]

S. F. Martins, R. A. Fonseca, W. Lu, W. B. Mori, and L. O. Silva. Exploring Laser-Wakefield-Accelerator Regimes For Near-Term Lasers Using Particle-In-Cell Simulation In Lorentz-Boosted Frames. Nature Physics, 6(4):311–316, Apr 2010. doi:10.1038/Nphys1538.

[24]

S. F. Martins, R. A. Fonseca, J. Vieira, L. O. Silva, W. Lu, and W. B. Mori. Modeling Laser Wakefield Accelerator Experiments With Ultrafast Particle-In-Cell Simulations In Boosted Frames. Physics of Plasmas, 17(5):56705, May 2010. doi:10.1063/1.3358139.

[25]

S. F. Martins, R. A. Fonseca, L. O. Silva, W. Lu, and W. B. Mori. Numerical Simulations Of Laser Wakefield Accelerators In Optimal Lorentz Frames. Computer Physics Communications, 181(5):869–875, May 2010. doi:10.1016/J.Cpc.2009.12.023.

[26]

J.-L. Vay, C. G. R. Geddes, E. Cormier-Michel, and D. P. Grote. Numerical Methods For Instability Mitigation In The Modeling Of Laser Wakefield Accelerators In A Lorentz-Boosted Frame. Journal of Computational Physics, 230(15):5908–5929, Jul 2011. doi:10.1016/J.Jcp.2011.04.003.

[27]

J.-L. Vay, C. G. R. Geddes, E. Cormier-Michel, and D. P. Grote. Effects of hyperbolic rotation in Minkowski space on the modeling of plasma accelerators in a Lorentz boosted frame. Physics of Plasmas, 18(3):030701, Mar 2011. URL: https://doi.org/10.1063/1.3559483, doi:10.1063/1.3559483.

[28]

J.-L. Vay, C. G. R. Geddes, E. Esarey, C. B. Schroeder, W. P. Leemans, E. Cormier-Michel, and D. P. Grote. Modeling Of 10 GeV-1 TeV Laser-Plasma Accelerators Using Lorentz Boosted Simulations. Physics of Plasmas, Dec 2011. doi:10.1063/1.3663841.

[29]

P. Yu, X. Xu, A. Davidson, A. Tableman, T. Dalichaouch, F. Li, M. D. Meyers, W. An, F. S. Tsung, V. K. Decyk, F. Fiuza, J. Vieira, R. A. Fonseca, W. Lu, L. O. Silva, and W. B. Mori. Enabling Lorentz boosted frame particle-in-cell simulations of laser wakefield acceleration in quasi-3D geometry. Journal of Computational Physics, 2016. doi:10.1016/j.jcp.2016.04.014.

[30]

B. B. Godfrey. The IPROP Three-Dimensional Beam Propagation Code. Defense Technical Information Center, 1985.

[31]

A. F. Lifschitz, X. Davoine, E. Lefebvre, J. Faure, C. Rechatin, and V. Malka. Particle-in-Cell modelling of laser-plasma interaction using Fourier decomposition. Journal of Computational Physics, 228(5):1803–1814, 2009. URL: http://www.sciencedirect.com/science/article/pii/S0021999108005950, doi:http://dx.doi.org/10.1016/j.jcp.2008.11.017.

[32]

A. Davidson, A. Tableman, W. An, F. S. Tsung, W. Lu, J. Vieira, R. A. Fonseca, L. O. Silva, and W. B. Mori. Implementation of a hybrid particle code with a PIC description in r–z and a gridless description in \Phi into OSIRIS. Journal of Computational Physics, 281:1063–1077, 2015. doi:10.1016/j.jcp.2014.10.064.

[33]

R. Lehe, M. Kirchen, I. A. Andriyash, B. B. Godfrey, and J.-L. Vay. A spectral, quasi-cylindrical and dispersion-free Particle-In-Cell algorithm. Computer Physics Communications, 203:66–82, 2016. doi:10.1016/j.cpc.2016.02.007.

[34]

I. A. Andriyash, R. Lehe, and A. Lifschitz. Laser-plasma interactions with a Fourier-Bessel particle-in-cell method. Physics of Plasmas, 23(3):, 2016. doi:10.1063/1.4943281.

[35]

B. A. Shadwick, C. B. Schroeder, and E. Esarey. Nonlinear Laser Energy Depletion In Laser-Plasma Accelerators. Physics of Plasmas, 16(5):56704, May 2009. doi:10.1063/1.3124185.

[36]

E. Cormier-Michel, C. G. R. Geddes, E. Esarey, C. B. Schroeder, D. L. Bruhwiler, K. Paul, B. Cowan, and W. P. Leemans. Scaled Simulations Of A 10 GeV Accelerator. In Aip Conference Proceedings, volume 1086, 297–302. 2009.

Particle-in-Cell Method

The Particle-In-Cell (PIC) method follows the evolution of a collection of charged macro-particles (positively charged in blue on the left plot, negatively charged in red) that evolve self-consistently with their electromagnetic (or electrostatic) fields. The core PIC algorithm involves four operations at each time step: 1) evolve the velocity and position of the particles using the Newton-Lorentz equations, 2) deposit the charge and/or current densities through interpolation from the particles distributions onto the grid, 3) evolve Maxwell’s wave equations (for electromagnetic) or solve Poisson’s equation (for electrostatic) on the grid, 4) interpolate the fields from the grid onto the particles for the next particle push. Additional “add-ons” operations are inserted between these core operations to account for additional physics (e.g. absorption/emission of particles, addition of external forces to account for accelerator focusing or accelerating component) or numerical effects (e.g. smoothing/filtering of the charge/current densities and/or fields on the grid).

In the electromagnetic particle-in-cell method [1, 2], the electromagnetic fields are solved on a grid, usually using Maxwell’s equations

()\[\frac{\mathbf{\partial B}}{\partial t} = -\nabla\times\mathbf{E}\]
()\[\frac{\mathbf{\partial E}}{\partial t} = \nabla\times\mathbf{B}-\mathbf{J}\]
()\[\nabla\cdot\mathbf{E} = \rho\]
()\[\nabla\cdot\mathbf{B} = 0\]

given here in natural units (\(\epsilon_0=\mu_0=c=1\)), where \(t\) is time, \(\mathbf{E}\) and \(\mathbf{B}\) are the electric and magnetic field components, and \(\rho\) and \(\mathbf{J}\) are the charge and current densities. The charged particles are advanced in time using the Newton-Lorentz equations of motion

()\[\frac{d\mathbf{x}}{dt} = \mathbf{v},\]
()\[\frac{d\left(\gamma\mathbf{v}\right)}{dt} = \frac{q}{m}\left(\mathbf{E}+\mathbf{v}\times\mathbf{B}\right),\]

where \(m\), \(q\), \(\mathbf{x}\), \(\mathbf{v}\) and \(\gamma=1/\sqrt{1-v^{2}}\) are respectively the mass, charge, position, velocity and relativistic factor of the particle given in natural units (\(c=1\)). The charge and current densities are interpolated on the grid from the particles’ positions and velocities, while the electric and magnetic field components are interpolated from the grid to the particles’ positions for the velocity update.

Particle push

A centered finite-difference discretization of the Newton-Lorentz equations of motion is given by

()\[\frac{\mathbf{x}^{i+1}-\mathbf{x}^{i}}{\Delta t} = \mathbf{v}^{i+1/2},\]
()\[\frac{\gamma^{i+1/2}\mathbf{v}^{i+1/2}-\gamma^{i-1/2}\mathbf{v}^{i-1/2}}{\Delta t} = \frac{q}{m}\left(\mathbf{E}^{i}+\mathbf{\bar{v}}^{i}\times\mathbf{B}^{i}\right).\]

In order to close the system, \(\bar{\mathbf{v}}^{i}\) must be expressed as a function of the other quantities. The two implementations that have become the most popular are presented below.

Boris relativistic velocity rotation

The solution proposed by Boris [3] is given by

()\[\mathbf{\bar{v}}^{i} = \frac{\gamma^{i+1/2}\mathbf{v}^{i+1/2}+\gamma^{i-1/2}\mathbf{v}^{i-1/2}}{2\bar{\gamma}^{i}}\]

where \(\bar{\gamma}^{i}\) is defined by \(\bar{\gamma}^{i} \equiv (\gamma^{i+1/2}+\gamma^{i-1/2} )/2\).

The system (8, 9) is solved very efficiently following Boris’ method, where the electric field push is decoupled from the magnetic push. Setting \(\mathbf{u}=\gamma\mathbf{v}\), the velocity is updated using the following sequence:

\[\begin{split}\begin{aligned} \mathbf{u^{-}} & = \mathbf{u}^{i-1/2}+\left(q\Delta t/2m\right)\mathbf{E}^{i} \\ \mathbf{u'} & = \mathbf{u}^{-}+\mathbf{u}^{-}\times\mathbf{t} \\ \mathbf{u}^{+} & = \mathbf{u}^{-}+\mathbf{u'}\times2\mathbf{t}/(1+\mathbf{t}^{2}) \\ \mathbf{u}^{i+1/2} & = \mathbf{u}^{+}+\left(q\Delta t/2m\right)\mathbf{E}^{i} \end{aligned}\end{split}\]

where \(\mathbf{t}=\left(q\Delta t/2m\right)\mathbf{B}^{i}/\bar{\gamma}^{i}\) and where \(\bar{\gamma}^{i}\) can be calculated as \(\bar{\gamma}^{i}=\sqrt{1+(\mathbf{u}^-/c)^2}\).

The Boris implementation is second-order accurate, time-reversible and fast. Its implementation is very widespread and used in the vast majority of PIC codes.
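
As an illustration only (not the WarpX implementation), the update sequence above can be written in a few lines of Python for a single macro-particle, in natural units (c = 1):

import numpy as np

def boris_push(u, E, B, q, m, dt):
    """Advance u = gamma*v from step i-1/2 to i+1/2, with E and B at step i."""
    u_minus = u + (q * dt / (2 * m)) * E
    gamma_bar = np.sqrt(1.0 + np.dot(u_minus, u_minus))
    t = (q * dt / (2 * m)) * B / gamma_bar
    u_prime = u_minus + np.cross(u_minus, t)
    u_plus = u_minus + np.cross(u_prime, 2 * t / (1.0 + np.dot(t, t)))
    return u_plus + (q * dt / (2 * m)) * E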

Vay Lorentz-invariant formulation

It was shown in Vay [4] that the Boris formulation is not Lorentz invariant and can lead to significant errors in the treatment of relativistic dynamics. A Lorentz invariant formulation is obtained by considering the following velocity average

()\[\mathbf{\bar{v}}^{i} = \frac{\mathbf{v}^{i+1/2}+\mathbf{v}^{i-1/2}}{2}.\]

This gives a system that is solvable analytically (see Vay [4] for a detailed derivation), giving the following velocity update:

()\[\mathbf{u^{*}} = \mathbf{u}^{i-1/2}+\frac{q\Delta t}{m}\left(\mathbf{E}^{i}+\frac{\mathbf{v}^{i-1/2}}{2}\times\mathbf{B}^{i}\right),\]
()\[\mathbf{u}^{i+1/2} = \frac{\mathbf{u^{*}}+\left(\mathbf{u^{*}}\cdot\mathbf{t}\right)\mathbf{t}+\mathbf{u^{*}}\times\mathbf{t}}{1+\mathbf{t}^{2}},\]

where

\[\begin{split}\begin{align} \mathbf{t} & = \boldsymbol{\tau}/\gamma^{i+1/2}, \\ \boldsymbol{\tau} & = \left(q\Delta t/2m\right)\mathbf{B}^{i}, \\ \gamma^{i+1/2} & = \sqrt{\sigma+\sqrt{\sigma^{2}+\left(\boldsymbol{\tau}^{2}+w^{2}\right)}}, \\ w & = \mathbf{u^{*}}\cdot\boldsymbol{\tau}, \\ \sigma & = \left(\gamma'^{2}-\boldsymbol{\tau}^{2}\right)/2, \\ \gamma' & = \sqrt{1+(\mathbf{u}^{*}/c)^{2}}. \end{align}\end{split}\]

This Lorentz invariant formulation is particularly well suited for the modeling of ultra-relativistic charged particle beams, where the accurate account of the cancellation of the self-generated electric and magnetic fields is essential, as shown in Vay [4].
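
For comparison with the Boris sketch above, the Vay update can be written as follows; again a minimal single-particle sketch in natural units (c = 1), not the WarpX implementation:

import numpy as np

def vay_push(u, E, B, q, m, dt):
    """Advance u = gamma*v from step i-1/2 to i+1/2, with E and B at step i."""
    gamma_old = np.sqrt(1.0 + np.dot(u, u))
    u_star = u + (q * dt / m) * (E + 0.5 * np.cross(u / gamma_old, B))
    tau = (q * dt / (2 * m)) * B
    gamma_prime = np.sqrt(1.0 + np.dot(u_star, u_star))
    sigma = 0.5 * (gamma_prime**2 - np.dot(tau, tau))
    w = np.dot(u_star, tau)
    gamma_new = np.sqrt(sigma + np.sqrt(sigma**2 + np.dot(tau, tau) + w**2))
    t = tau / gamma_new
    return (u_star + np.dot(u_star, t) * t + np.cross(u_star, t)) / (1.0 + np.dot(t, t))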

Field solve

Various methods are available for solving Maxwell’s equations on a grid, based on finite-differences, finite-volume, finite-element, spectral, or other discretization techniques that apply most commonly on single structured or unstructured meshes and less commonly on multiblock multiresolution grid structures. In this chapter, we summarize the widespread second order finite-difference time-domain (FDTD) algorithm, its extension to non-standard finite-differences as well as the pseudo-spectral analytical time-domain (PSATD) and pseudo-spectral time-domain (PSTD) algorithms. Extension to multiresolution (or mesh refinement) PIC is described in, e.g., Vay et al. [5], Vay et al. [6].

(left) Layout of field components on the staggered “Yee” grid. Current densities and electric fields are defined on the edges of the cells and magnetic fields on the faces. (right) Time integration using a second-order finite-difference “leapfrog” integrator.

Finite-Difference Time-Domain (FDTD)

The most popular algorithm for electromagnetic PIC codes is the Finite-Difference Time-Domain (or FDTD) solver

()\[D_{t}\mathbf{B} = -\nabla\times\mathbf{E}\]
()\[D_{t}\mathbf{E} = \nabla\times\mathbf{B}-\mathbf{J}\]
()\[\left[\nabla\cdot\mathbf{E} = \rho\right]\]
()\[\left[\nabla\cdot\mathbf{B} = 0\right].\]

The differential operator is defined as \(\nabla=D_{x}\mathbf{\hat{x}}+D_{y}\mathbf{\hat{y}}+D_{z}\mathbf{\hat{z}}\) and the finite-difference operators in time and space are defined respectively as

\[\begin{split}\begin{align} D_{t}G|_{i,j,k}^{n} & = \frac{(G|_{i,j,k}^{n+1/2}-G|_{i,j,k}^{n-1/2})}{\Delta t}, \\ D_{x}G|_{i,j,k}^{n} & = \frac{G|_{i+1/2,j,k}^{n}-G|_{i-1/2,j,k}^{n}}{\Delta x}, \end{align}\end{split}\]

where \(\Delta t\) and \(\Delta x\) are respectively the time step and the grid cell size along \(x\), \(n\) is the time index and \(i\), \(j\) and \(k\) are the spatial indices along \(x\), \(y\) and \(z\) respectively. The difference operators along \(y\) and \(z\) are obtained by circular permutation. The equations in brackets are given for completeness, as they are often not actually solved, thanks to the usage of a so-called charge conserving algorithm, as explained below. As shown in Fig. 28, the quantities are given on a staggered (or “Yee”) grid [7], where the electric field components are located between nodes and the magnetic field components are located in the center of the cell faces. Knowing the current densities at half-integer steps, the electric field components are updated alternately with the magnetic field components at integer and half-integer steps respectively.
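
To make the staggering and the leapfrog update concrete, here is a minimal 1D vacuum FDTD sketch in natural units (c = 1, no current source); it is an illustration rather than the WarpX solver:

import numpy as np

nx, nsteps = 200, 200
dx = 1.0
dt = dx  # 1D Courant limit with c = 1

Ey = np.zeros(nx)        # E at integer grid nodes
Bz = np.zeros(nx - 1)    # B staggered by half a cell (cell faces)
Ey[nx // 2] = 1.0        # initial field perturbation

for n in range(nsteps):
    Bz -= dt * (Ey[1:] - Ey[:-1]) / dx        # dB/dt = -curl E
    Ey[1:-1] -= dt * (Bz[1:] - Bz[:-1]) / dx  # dE/dt =  curl B  (with J = 0)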

Non-Standard Finite-Difference Time-Domain (NSFDTD)

An implementation of the source-free Maxwell’s wave equations for narrow-band applications based on non-standard finite-differences (NSFD) was introduced in Cole [8], Cole [9], and was adapted for wideband applications in Karkkainen et al. [10]. At the Courant limit for the time step and for a given set of parameters, the stencil proposed in Karkkainen et al. [10] has no numerical dispersion along the principal axes, provided that the cell size is the same along each dimension (i.e. cubic cells in 3D). The “Cole-Karkkainen” (or CK) solver uses the non-standard finite difference formulation (based on extended stencils) of the Maxwell-Ampere equation and can be implemented as follows [11]:

()\[D_{t}\mathbf{B} = -\nabla^{*}\times\mathbf{E}\]
()\[D_{t}\mathbf{E} = \nabla\times\mathbf{B}-\mathbf{J}\]
()\[\left[\nabla\cdot\mathbf{E} = \rho\right]\]
()\[\left[\nabla^{*}\cdot\mathbf{B}= 0\right]\]

Eqs. (19) and (20) are not being solved explicitly but verified via appropriate initial conditions and current deposition procedure. The NSFD differential operator is given by

\[\nabla^{*}=D_{x}^{*}\mathbf{\hat{x}}+D_{y}^{*}\mathbf{\hat{y}}+D_{z}^{*}\mathbf{\hat{z}}\]

where

\[D_{x}^{*}=\left(\alpha+\beta S_{x}^{1}+\xi S_{x}^{2}\right)D_{x}\]

with

\[\begin{split}\begin{align} S_{x}^{1}G|_{i,j,k}^{n} & = G|_{i,j+1,k}^{n}+G|_{i,j-1,k}^{n}+G|_{i,j,k+1}^{n}+G|_{i,j,k-1}^{n}, \\ S_{x}^{2}G|_{i,j,k}^{n} & = G|_{i,j+1,k+1}^{n}+G|_{i,j-1,k+1}^{n}+G|_{i,j+1,k-1}^{n}+G|_{i,j-1,k-1}^{n}. \end{align}\end{split}\]

Here \(G\) is a sample vector component, while \(\alpha\), \(\beta\) and \(\xi\) are constant scalars satisfying \(\alpha+4\beta+4\xi=1\). As with the FDTD algorithm, the quantities with half-integer indices are located between the nodes (electric field components) or in the center of the cell faces (magnetic field components). The operators along \(y\) and \(z\), i.e. \(D_{y}\), \(D_{z}\), \(D_{y}^{*}\), \(D_{z}^{*}\), \(S_{y}^{1}\), \(S_{z}^{1}\), \(S_{y}^{2}\), and \(S_{z}^{2}\), are obtained by circular permutation of the indices.

Assuming cubic cells (\(\Delta x=\Delta y=\Delta z\)), the coefficients given in Karkkainen et al. [10] (\(\alpha=7/12\), \(\beta=1/12\) and \(\xi=1/48\)) allow for the Courant condition to be at \(\Delta t=\Delta x\), which equates to having no numerical dispersion along the principal axes. The algorithm reduces to the FDTD algorithm with \(\alpha=1\) and \(\beta=\xi=0\). An extension to non-cubic cells is provided in 3-D by Cowan et al. [12] and in 2-D by Pukhov [13]. An alternative NSFDTD implementation that enables superluminous waves is also given in Lehe et al. [14].

As mentioned above, a key feature of the algorithms based on NSFDTD is that some implementations [10, 12] enable the time step \(\Delta t=\Delta x\) along one or more axes and no numerical dispersion along those axes. However, as shown in Vay et al. [11], an instability develops at the Nyquist wavelength at (or very near) such a timestep. It is also shown in the same paper that removing the Nyquist component in all the source terms using a bilinear filter (see description of the filter below) suppresses this instability.

Pseudo Spectral Analytical Time Domain (PSATD)

Maxwell’s equations in Fourier space are given by

\[\frac{\partial\mathbf{\tilde{E}}}{\partial t} = i\mathbf{k}\times\mathbf{\tilde{B}}-\mathbf{\tilde{J}}\]
\[\frac{\partial\mathbf{\tilde{B}}}{\partial t} = -i\mathbf{k}\times\mathbf{\tilde{E}}\]
\[{}[i\mathbf{k}\cdot\mathbf{\tilde{E}} = \tilde{\rho}]\]
\[{}[i\mathbf{k}\cdot\mathbf{\tilde{B}} = 0]\]

where \(\tilde{a}\) is the Fourier Transform of the quantity \(a\). As with the real space formulation, provided that the continuity equation \(\partial\tilde{\rho}/\partial t+i\mathbf{k}\cdot\mathbf{\tilde{J}}=0\) is satisfied, then the last two equations will automatically be satisfied at any time if satisfied initially and do not need to be explicitly integrated.

Decomposing the electric field and current between longitudinal and transverse components

\[\begin{split}\begin{aligned} \mathbf{\tilde{E}} & = \mathbf{\tilde{E}}_{L}+\mathbf{\tilde{E}}_{T}=\mathbf{\hat{k}}(\mathbf{\hat{k}}\cdot\mathbf{\tilde{E}})-\mathbf{\hat{k}}\times(\mathbf{\hat{k}}\times\mathbf{\tilde{E}}) \\ \mathbf{\tilde{J}} & = \mathbf{\tilde{J}}_{L}+\mathbf{\tilde{J}}_{T}=\mathbf{\hat{k}}(\mathbf{\hat{k}}\cdot\mathbf{\tilde{J}})-\mathbf{\hat{k}}\times(\mathbf{\hat{k}}\times\mathbf{\tilde{J}}) \end{aligned}\end{split}\]

gives

\[\begin{split}\begin{aligned} \frac{\partial\mathbf{\tilde{E}}_{T}}{\partial t} & = i\mathbf{k}\times\mathbf{\tilde{B}}-\mathbf{\tilde{J}_{T}} \\ \frac{\partial\mathbf{\tilde{E}}_{L}}{\partial t} & = -\mathbf{\tilde{J}_{L}} \\ \frac{\partial\mathbf{\tilde{B}}}{\partial t} & = -i\mathbf{k}\times\mathbf{\tilde{E}} \end{aligned}\end{split}\]

with \(\mathbf{\hat{k}}=\mathbf{k}/k\).

If the sources are assumed to be constant over a time interval \(\Delta t\), the system of equations is solvable analytically and is given by (see Haber et al. [15] for the original formulation and Vay et al. [16] for a more detailed derivation):

()\[\mathbf{\tilde{E}}_{T}^{n+1} = C\mathbf{\tilde{E}}_{T}^{n}+iS\mathbf{\hat{k}}\times\mathbf{\tilde{B}}^{n}-\frac{S}{k}\mathbf{\tilde{J}}_{T}^{n+1/2}\]
()\[\mathbf{\tilde{E}}_{L}^{n+1} = \mathbf{\tilde{E}}_{L}^{n}-\Delta t\mathbf{\tilde{J}}_{L}^{n+1/2}\]
()\[\mathbf{\tilde{B}}^{n+1} = C\mathbf{\tilde{B}}^{n}-iS\mathbf{\hat{k}}\times\mathbf{\tilde{E}}^{n} + i\frac{1-C}{k}\mathbf{\hat{k}}\times\mathbf{\tilde{J}}^{n+1/2}\]

with \(C=\cos\left(k\Delta t\right)\) and \(S=\sin\left(k\Delta t\right)\).

Combining the transverse and longitudinal components gives

()\[\begin{split}\begin{aligned} \mathbf{\tilde{E}}^{n+1} & = C\mathbf{\tilde{E}}^{n}+iS\mathbf{\hat{k}}\times\mathbf{\tilde{B}}^{n}-\frac{S}{k}\mathbf{\tilde{J}}^{n+1/2} \\ & + (1-C)\mathbf{\hat{k}}(\mathbf{\hat{k}}\cdot\mathbf{\tilde{E}}^{n})\nonumber \\ & + \mathbf{\hat{k}}(\mathbf{\hat{k}}\cdot\mathbf{\tilde{J}}^{n+1/2})\left(\frac{S}{k}-\Delta t\right), \end{aligned}\end{split}\]
()\[\begin{split}\begin{aligned} \mathbf{\tilde{B}}^{n+1} & = C\mathbf{\tilde{B}}^{n}-iS\mathbf{\hat{k}}\times\mathbf{\tilde{E}}^{n} \\ & + i\frac{1-C}{k}\mathbf{\hat{k}}\times\mathbf{\tilde{J}}^{n+1/2}. \end{aligned}\end{split}\]

For fields generated by the source terms without the self-consistent dynamics of the charged particles, this algorithm is free of numerical dispersion and is not subject to a Courant condition. Furthermore, this solution is exact for any time step size subject to the assumption that the current source is constant over that time step.
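
A hedged numerical sketch of the combined update above for a single (non-zero) Fourier mode, with E, B and J given as complex 3-vectors in k-space and natural units (c = 1); this is an illustration, not the WarpX implementation:

import numpy as np

def psatd_push_one_mode(E, B, J, k, dt):
    """One PSATD step for a single Fourier mode (k must be non-zero)."""
    knorm = np.linalg.norm(k)
    khat = k / knorm
    C, S = np.cos(knorm * dt), np.sin(knorm * dt)
    E_new = (C * E + 1j * S * np.cross(khat, B) - (S / knorm) * J
             + (1.0 - C) * khat * np.dot(khat, E)
             + khat * np.dot(khat, J) * (S / knorm - dt))
    B_new = (C * B - 1j * S * np.cross(khat, E)
             + 1j * (1.0 - C) / knorm * np.cross(khat, J))
    return E_new, B_new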

As shown in Vay et al. [16], by expanding the coefficients \(S_{h}\) and \(C_{h}\) in Taylor series and keeping the leading terms, the PSATD formulation reduces to the perhaps better known pseudo-spectral time-domain (PSTD) formulation [17, 18]:

\[\begin{split}\begin{aligned} \mathbf{\tilde{E}}^{n+1} & = \mathbf{\tilde{E}}^{n}+i\Delta t\mathbf{k}\times\mathbf{\tilde{B}}^{n+1/2}-\Delta t\mathbf{\tilde{J}}^{n+1/2}, \\ \mathbf{\tilde{B}}^{n+3/2} & = \mathbf{\tilde{B}}^{n+1/2}-i\Delta t\mathbf{k}\times\mathbf{\tilde{E}}^{n+1}. \end{aligned}\end{split}\]

The dispersion relation of the PSTD solver is given by \(\sin(\frac{\omega\Delta t}{2})=\frac{k\Delta t}{2}.\) In contrast to the PSATD solver, the PSTD solver is subject to numerical dispersion for a finite time step and to a Courant condition that is given by \(\Delta t\leq \frac{2}{\pi}\left(\frac{1}{\Delta x^{2}}+\frac{1}{\Delta y^{2}}+\frac{1}{\Delta z^{2}}\right)^{-1/2}\).

The PSATD and PSTD formulations that were just given apply to the field components located at the nodes of the grid. As noted in Ohmura and Okamura [19], they can also be easily recast on a staggered Yee grid by multiplication of the field components by the appropriate phase factors to shift them from the collocated to the staggered locations. The choice between a collocated and a staggered formulation is application-dependent.

Spectral solvers used to be very popular from the 1970s to the early 1990s, before being replaced by finite-difference methods with the advent of parallel supercomputers that favored local methods. However, it was shown recently that standard domain decomposition with Fast Fourier Transforms that are local to each subdomain could be used effectively with PIC spectral methods [16], at the cost of truncation errors in the guard cells that could be neglected. A detailed analysis of the effectiveness of the method with exact evaluation of the magnitude of the effect of the truncation error is given in Vincenti and Vay [20] for stencils of arbitrary order (up to the infinite “spectral” order).

WarpX also includes a kinetic-fluid hybrid model in which the electric field is calculated using Ohm’s law instead of directly evolving Maxwell’s equations. This approach allows reduced physics simulations to be done with significantly lower spatial and temporal resolution than in the standard, fully kinetic, PIC. Details of this model can be found in the section Kinetic-fluid hybrid model.

Current deposition

The current densities are deposited on the computational grid from the particle positions and velocities, employing splines of various orders [21].

\[\begin{split}\begin{aligned} \rho & = \frac{1}{\Delta x \Delta y \Delta z}\sum_nq_nS_n \\ \mathbf{J} & = \frac{1}{\Delta x \Delta y \Delta z}\sum_nq_n\mathbf{v_n}S_n \end{aligned}\end{split}\]
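
As an illustration, a minimal one-dimensional sketch (Python/NumPy) of charge and current deposition with linear (first-order) splines is given below; the function name and the assumption of periodic boundaries are choices of this example, not WarpX's deposition kernel:

    import numpy as np

    def deposit_1d(x, v, q, dx, nx):
        """Deposit charge and current densities from particle positions x,
        velocities v and charges q onto a periodic grid of nx cells of size dx."""
        rho = np.zeros(nx)
        J = np.zeros(nx)
        for xp, vp, qp in zip(x, v, q):
            j = int(np.floor(xp / dx))      # index of the node to the left of the particle
            w = xp / dx - j                 # fractional distance to that node
            # linear-spline weights on the two nearest nodes
            rho[j % nx] += qp * (1 - w) / dx
            rho[(j + 1) % nx] += qp * w / dx
            J[j % nx] += qp * vp * (1 - w) / dx
            J[(j + 1) % nx] += qp * vp * w / dx
        return rho, J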

In most applications, it is essential to prevent the accumulation of errors resulting from the violation of the discretized Gauss’ Law. This is accomplished by providing a method for depositing the current from the particles to the grid that preserves the discretized Gauss’ Law, or by providing a mechanism for “divergence cleaning” [1, 22, 23, 24, 25]. For the former, schemes that allow a deposition of the current that is exact when combined with the Yee solver are given in Villasenor and Buneman [26] for linear splines and in Esirkepov [27] for splines of arbitrary order.

The NSFDTD formulations given above and in Vay et al. [11], Cowan et al. [12], Pukhov [13], Lehe et al. [14] apply to the Maxwell-Faraday equation, while the discretized Maxwell-Ampere equation uses the FDTD formulation. Consequently, the charge conserving algorithms developed for current deposition [26, 27] apply readily to those NSFDTD-based formulations. More details concerning those implementations, including the expressions for the numerical dispersion and Courant condition are given in Vay et al. [11], Cowan et al. [12], Pukhov [13], Lehe et al. [14].

Current correction

In the case of the pseudospectral solvers, the current deposition algorithm generally does not satisfy the discretized continuity equation in Fourier space:

\[\tilde{\rho}^{n+1}=\tilde{\rho}^{n}-i\Delta t\mathbf{k}\cdot\mathbf{\tilde{J}}^{n+1/2}.\]

In this case, a Boris correction [1] can be applied in \(k\) space in the form

\[\mathbf{\tilde{E}}_{c}^{n+1}=\mathbf{\tilde{E}}^{n+1}-\frac{\mathbf{k}\cdot\mathbf{\tilde{E}}^{n+1}+i\tilde{\rho}^{n+1}}{k}\mathbf{\hat{k}},\]

where \(\mathbf{\tilde{E}}_{c}\) is the corrected field. Alternatively, a correction to the current can be applied (with some similarity to the current deposition presented by Morse and Nielson in their potential-based model in Morse and Nielson [28]) using

\[\mathbf{\tilde{J}}_{c}^{n+1/2}=\mathbf{\tilde{J}}^{n+1/2}-\left[\mathbf{k}\cdot\mathbf{\tilde{J}}^{n+1/2}-i\left(\tilde{\rho}^{n+1}-\tilde{\rho}^{n}\right)/\Delta t\right]\mathbf{\hat{k}}/k,\]

where \(\mathbf{\tilde{J}}_{c}\) is the corrected current. In this case, the transverse component of the current is left untouched while the longitudinal component is effectively replaced by the one obtained from integration of the continuity equation, ensuring that the corrected current satisfies the continuity equation. The advantage of correcting the current rather than the electric field is that it is more local and thus more compatible with domain decomposition of the fields for parallel computation [16].
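
A minimal sketch of this current correction in Fourier space (Python/NumPy, using the same array conventions as assumed in the PSATD sketch above) is:

    import numpy as np

    def correct_current(Jk, rho_old, rho_new, kvec, dt):
        """Replace the longitudinal part of Jk by the one implied by the
        continuity equation between rho_old (step n) and rho_new (step n+1)."""
        k2 = np.sum(np.abs(kvec)**2, axis=0)
        k2_safe = np.where(k2 == 0.0, 1.0, k2)   # the k=0 mode is left unchanged
        kdotJ = np.sum(kvec * Jk, axis=0)
        correction = (kdotJ - 1j * (rho_new - rho_old) / dt) * kvec / k2_safe
        return Jk - correction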

Vay deposition

Alternatively, an exact current deposition can be written for the pseudo-spectral solvers, following the geometrical interpretation of existing methods in real space [26, 27, 28].

The Vay deposition scheme is the generalization of the Esirkepov deposition scheme for the spectral case with arbitrary-order stencils [16]. The current density \(\widehat{\boldsymbol{J}}^{\,n+1/2}\) in Fourier space is computed as \(\widehat{\boldsymbol{J}}^{\,n+1/2} = i \, \widehat{\boldsymbol{D}} / \boldsymbol{k}\) when \(\boldsymbol{k} \neq 0\) and set to zero otherwise. The quantity \(\boldsymbol{D}\) is deposited in real space by averaging the currents over all possible grid paths between the initial position \(\boldsymbol{x}^{\,n}\) and the final position \(\boldsymbol{x}^{\,n+1}\) and is defined as

  • 2D Cartesian geometry:

\[\begin{split}\begin{align} D_x & = \sum_i \frac{1}{\Delta x \Delta z} \frac{q_i w_i}{2 \Delta t} \bigg[ \Gamma(x_i^{n+1},z_i^{n+1}) - \Gamma(x_i^{n},z_i^{n+1}) + \Gamma(x_i^{n+1},z_i^{n}) - \Gamma(x_i^{n},z_i^{n}) \bigg] \\[8pt] D_y & = \sum_i \frac{v_i^y}{\Delta x \Delta z} \frac{q_i w_i}{4} \bigg[ \Gamma(x_i^{n+1},z_i^{n+1}) + \Gamma(x_i^{n+1},z_i^{n}) + \Gamma(x_i^{n},z_i^{n+1}) + \Gamma(x_i^{n},z_i^{n}) \bigg] \\[8pt] D_z & = \sum_i \frac{1}{\Delta x \Delta z} \frac{q_i w_i}{2 \Delta t} \bigg[ \Gamma(x_i^{n+1},z_i^{n+1}) - \Gamma(x_i^{n+1},z_i^{n}) + \Gamma(x_i^{n},z_i^{n+1}) - \Gamma(x_i^{n},z_i^{n}) \bigg] \end{align}\end{split}\]
  • 3D Cartesian geometry:

\[\begin{split}\begin{align} \begin{split} D_x & = \sum_i \frac{1}{\Delta x\Delta y\Delta z} \frac{q_i w_i}{6\Delta t} \bigg[ 2 \Gamma(x_i^{n+1},y_i^{n+1},z_i^{n+1}) - 2 \Gamma(x_i^{n},y_i^{n+1},z_i^{n+1}) \\[4pt] & \phantom{=} \: + \Gamma(x_i^{n+1},y_i^{n},z_i^{n+1}) - \Gamma(x_i^{n},y_i^{n},z_i^{n+1}) + \Gamma(x_i^{n+1},y_i^{n+1},z_i^{n}) \\[4pt] & \phantom{=} \: - \Gamma(x_i^{n},y_i^{n+1},z_i^{n}) + 2 \Gamma(x_i^{n+1},y_i^{n},z_i^{n}) - 2 \Gamma(x_i^{n},y_i^{n},z_i^{n}) \bigg] \end{split} \\[8pt] \begin{split} D_y & = \sum_i \frac{1}{\Delta x\Delta y\Delta z} \frac{q_i w_i}{6\Delta t} \bigg[ 2 \Gamma(x_i^{n+1},y_i^{n+1},z_i^{n+1}) - 2 \Gamma(x_i^{n+1},y_i^{n},z_i^{n+1}) \\[4pt] & \phantom{=} \: + \Gamma(x_i^{n+1},y_i^{n+1},z_i^{n}) - \Gamma(x_i^{n+1},y_i^{n},z_i^{n}) + \Gamma(x_i^{n},y_i^{n+1},z_i^{n+1}) \\[4pt] & \phantom{=} \: - \Gamma(x_i^{n},y_i^{n},z_i^{n+1}) + 2 \Gamma(x_i^{n},y_i^{n+1},z_i^{n}) - 2 \Gamma(x_i^{n},y_i^{n},z_i^{n}) \bigg] \end{split} \\[8pt] \begin{split} D_z & = \sum_i \frac{1}{\Delta x\Delta y\Delta z} \frac{q_i w_i}{6\Delta t} \bigg[ 2 \Gamma(x_i^{n+1},y_i^{n+1},z_i^{n+1}) - 2 \Gamma(x_i^{n+1},y_i^{n+1},z_i^{n}) \\[4pt] & \phantom{=} \: + \Gamma(x_i^{n},y_i^{n+1},z_i^{n+1}) - \Gamma(x_i^{n},y_i^{n+1},z_i^{n}) + \Gamma(x_i^{n+1},y_i^{n},z_i^{n+1}) \\[4pt] & \phantom{=} \: - \Gamma(x_i^{n+1},y_i^{n},z_i^{n}) + 2 \Gamma(x_i^{n},y_i^{n},z_i^{n+1}) - 2 \Gamma(x_i^{n},y_i^{n},z_i^{n}) \bigg] \end{split} \end{align}\end{split}\]

Here, \(w_i\) represents the weight of the \(i\)-th macro-particle and \(\Gamma\) represents its shape factor. Note that in 2D Cartesian geometry, \(D_y\) is effectively \(J_y\) and does not require additional operations in Fourier space.

Field gather

In general, the field is gathered from the mesh onto the macroparticles using splines of the same order as for the current deposition \(\mathbf{S}=\left(S_{x},S_{y},S_{z}\right)\). Three variations are considered:

  • “momentum conserving”: fields are interpolated from the grid nodes to the macroparticles using \(\mathbf{S}=\left(S_{nx},S_{ny},S_{nz}\right)\) for all field components (if the fields are known at staggered positions, they are first interpolated to the nodes on an auxiliary grid),

  • “energy conserving (or Galerkin)”: fields are interpolated from the staggered Yee grid to the macroparticles using \(\left(S_{nx-1},S_{ny},S_{nz}\right)\) for \(E_{x}\), \(\left(S_{nx},S_{ny-1},S_{nz}\right)\) for \(E_{y}\), \(\left(S_{nx},S_{ny},S_{nz-1}\right)\) for \(E_{z}\), \(\left(S_{nx},S_{ny-1},S_{nz-1}\right)\) for \(B_{x}\), \(\left(S_{nx-1},S_{ny},S_{nz-1}\right)\) for \(B_{y}\) and \(\left(S_{nx-1},S_{ny-1},S_{nz}\right)\) for \(B_{z}\) (if the fields are known at the nodes, they are first interpolated to the staggered positions on an auxiliary grid),

  • “uniform”: fields are interpolated directly from the Yee grid to the macroparticles using \(\mathbf{S}=\left(S_{nx},S_{ny},S_{nz}\right)\) for all field components (if the fields are known at the nodes, they are first interpolated to the staggered positions on an auxiliary grid).

As shown in Birdsall and Langdon [1], Hockney and Eastwood [2], Lewis [29], the momentum and energy conserving schemes conserve momentum and energy respectively in the limit of infinitesimal time steps, and generally offer better conservation of the respective quantities for a finite time step. The uniform scheme conserves neither momentum nor energy in the sense defined for the others, but is given for completeness, as it has been shown to offer some interesting properties in the modeling of relativistically drifting plasmas [30].

Filtering

It is common practice to apply digital filtering to the charge or current density in Particle-In-Cell simulations as a complement or an alternative to using higher order splines [1]. A commonly used filter in PIC simulations is the three-point filter

\[\phi_{j}^{f}=\alpha\phi_{j}+\left(1-\alpha\right)\left(\phi_{j-1}+\phi_{j+1}\right)/2\]

where \(\phi^{f}\) is the filtered quantity. This filter is called a bilinear filter when \(\alpha=0.5\). Assuming \(\phi=e^{jkx}\) and \(\phi^{f}=g\left(\alpha,k\right)e^{jkx}\), the filter gain \(g\) is given as a function of the filtering coefficient \(\alpha\) and the wavenumber \(k\) by

\[g\left(\alpha,k\right)=\alpha+\left(1-\alpha\right)\cos\left(k\Delta x\right)\approx1-\left(1-\alpha\right)\frac{\left(k\Delta x\right)^{2}}{2}+O\left(k^{4}\right).\]

The total attenuation \(G\) for \(n\) successive applications of filters of coefficients \(\alpha_{1},\ldots,\alpha_{n}\) is given by

\[G=\prod_{i=1}^{n}g\left(\alpha_{i},k\right)\approx1-\left(n-\sum_{i=1}^{n}\alpha_{i}\right)\frac{\left(k\Delta x\right)^{2}}{2}+O\left(k^{4}\right).\]

A sharper cutoff in \(k\) space is provided by using \(\alpha_{n}=n-\sum_{i=1}^{n-1}\alpha_{i}\), so that \(G\approx1+O\left(k^{4}\right)\). Such a step is called a “compensation” step [1]. For the bilinear filter (\(\alpha=1/2\)), the compensation factor is \(\alpha_{c}=2-1/2=3/2\). For a succession of \(n\) applications of the bilinear filter, it is \(\alpha_{c}=n/2+1\).
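
A minimal sketch (Python/NumPy) of the three-point filter, and of \(n\) bilinear passes followed by a compensation pass with \(\alpha_{c}=n/2+1\), could read as follows (periodic boundaries via np.roll are an assumption of this example):

    import numpy as np

    def three_point_filter(phi, alpha):
        return alpha * phi + (1 - alpha) * 0.5 * (np.roll(phi, 1) + np.roll(phi, -1))

    def bilinear_with_compensation(phi, n_passes=1):
        for _ in range(n_passes):
            phi = three_point_filter(phi, alpha=0.5)            # n passes of the bilinear filter
        return three_point_filter(phi, alpha=n_passes / 2 + 1)  # compensation step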

It is sometimes necessary to filter over a relatively wide band of wavelengths, necessitating either a large number of passes of the bilinear filter or the use of filters acting on many points. The former can become very computationally intensive, while the latter is problematic for parallel computations using domain decomposition, as the footprint of the filter may eventually surpass the size of subdomains. A workaround is to use a combination of filters of limited footprint. A solution based on the combination of three-point filters with various strides was proposed in Vay et al. [11] and operates as follows.

The bilinear filter provides complete suppression of the signal at the grid Nyquist wavelength (twice the grid cell size). Suppression of the signal at integer multiples of the Nyquist wavelength can be obtained by using a stride \(s\) in the filter

\[\phi_{j}^{f}=\alpha\phi_{j}+\left(1-\alpha\right)\left(\phi_{j-s}+\phi_{j+s}\right)/2\]

for which the gain is given by

\[g\left(\alpha,k\right)=\alpha+\left(1-\alpha\right)\cos\left(sk\Delta x\right)\approx1-\left(1-\alpha\right)\frac{\left(sk\Delta x\right)^{2}}{2}+O\left(k^{4}\right).\]

For a given stride, the gain is that of the bilinear filter shifted in \(k\) space, with the pole \(g=0\) shifted from the wavelength \(\lambda=2\Delta x\) to \(\lambda=2s\Delta x\), with additional poles given by \(sk\Delta x=\arccos\left(\frac{\alpha}{\alpha-1}\right)\pmod{2\pi}\). The resulting filter is a band-pass filter between the poles, but since the poles are spread at different integer values in \(k\) space, a wide-band low-pass filter can be constructed by combining filters using different strides. As shown in Vay et al. [11], the successive application of 4 passes + compensation of filters with strides 1, 2 and 4 has a nearly equivalent fall-off in gain as 80 passes + compensation of a bilinear filter. Yet, the strided filter solution needs only 15 passes of a three-point filter, compared to 81 passes for an equivalent n-pass bilinear filter, yielding a gain of 5.4 in number of operations in favor of the combination of filters with stride. The width of the filter with stride 4 extends over only 9 points, compared to 81 points for a single-pass equivalent filter, hence giving a gain of 9 in compactness for the strided-filter combination in comparison to the single-pass filter with large stencil, resulting in more favorable scaling with the number of computational cores for parallel calculations.
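
The combined gain of such a filter bank can be evaluated directly from the gain formula above; the short sketch below (Python/NumPy) computes the total gain of 4 passes + compensation for each of the strides 1, 2 and 4 (15 three-point passes in total), purely as an illustration of the formulas:

    import numpy as np

    def gain(alpha, kdx, stride=1):
        return alpha + (1 - alpha) * np.cos(stride * kdx)

    kdx = np.linspace(0.0, np.pi, 256)        # k*dx from 0 to the grid Nyquist wavenumber
    G = np.ones_like(kdx)
    for s in (1, 2, 4):
        # 4 bilinear passes plus one compensation pass (alpha_c = 4/2 + 1 = 3) per stride
        G *= gain(0.5, kdx, s)**4 * gain(3.0, kdx, s)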

[1] (1,2,3,4,5,6)

C. K. Birdsall and A. B. Langdon. Plasma Physics Via Computer Simulation. Adam-Hilger, 1991. ISBN 0 07 005371 5.

[2] (1,2)

R. W. Hockney and J. W. Eastwood. Computer simulation using particles. Routledge, 1988. ISBN 0-85274-392-0.

[3]

J. P. Boris. Relativistic Plasma Simulation-Optimization of a Hybrid Code. In Proc. Fourth Conf. Num. Sim. Plasmas, 3–67. Naval Res. Lab., Wash., D. C., 1970.

[4] (1,2,3)

J.-L. Vay. Simulation Of Beams Or Plasmas Crossing At Relativistic Velocity. Physics of Plasmas, 15(5):56701, May 2008. doi:10.1063/1.2837054.

[5]

J.-L. Vay, D. P. Grote, R. H. Cohen, and A. Friedman. Novel methods in the particle-in-cell accelerator code-framework warp. Computational Science and Discovery, 5(1):014019 (20 pp.), 2012.

[6]

J.-L. Vay, J.-C. Adam, and A. Heron. Asymmetric Pml For The Absorption Of Waves. Application To Mesh Refinement In Electromagnetic Particle-In-Cell Plasma Simulations. Computer Physics Communications, 164(1-3):171–177, Dec 2004. doi:10.1016/J.Cpc.2004.06.026.

[7]

K. S. Yee. Numerical Solution Of Initial Boundary Value Problems Involving Maxwell's Equations In Isotropic Media. IEEE Transactions On Antennas And Propagation, AP-14(3):302–307, 1966.

[8]

J. B. Cole. A High-Accuracy Realization Of The Yee Algorithm Using Non-Standard Finite Differences. IEEE Transactions On Microwave Theory And Techniques, 45(6):991–996, Jun 1997.

[9]

J. B. Cole. High-Accuracy Yee Algorithm Based On Nonstandard Finite Differences: New Developments And Verifications. IEEE Transactions On Antennas And Propagation, 50(9):1185–1191, Sep 2002. doi:10.1109/Tap.2002.801268.

[10] (1,2,3,4)

M. Karkkainen, E. Gjonaj, T. Lau, and T. Weiland. Low-Dispersion Wake Field Calculation Tools. In Proc. Of International Computational Accelerator Physics Conference, 35–40. Chamonix, France, 2006.

[11] (1,2,3,4,5,6)

J.-L. Vay, C. G. R. Geddes, E. Cormier-Michel, and D. P. Grote. Numerical Methods For Instability Mitigation In The Modeling Of Laser Wakefield Accelerators In A Lorentz-Boosted Frame. Journal of Computational Physics, 230(15):5908–5929, Jul 2011. doi:10.1016/J.Jcp.2011.04.003.

[12] (1,2,3,4)

B. M. Cowan, D. L. Bruhwiler, J. R. Cary, E. Cormier-Michel, and C. G. R. Geddes. Generalized algorithm for control of numerical dispersion in explicit time-domain electromagnetic simulations. Physical Review Special Topics-Accelerators And Beams, Apr 2013. doi:10.1103/PhysRevSTAB.16.041303.

[13] (1,2,3)

A. Pukhov. Three-dimensional electromagnetic relativistic particle-in-cell code VLPL (Virtual Laser Plasma Lab). Journal of Plasma Physics, 61(3):425–433, Apr 1999. doi:10.1017/S0022377899007515.

[14] (1,2,3)

R. Lehe, A. Lifschitz, C. Thaury, V. Malka, and X. Davoine. Numerical growth of emittance in simulations of laser-wakefield acceleration. Physical Review Special Topics-Accelerators And Beams, Feb 2013. doi:10.1103/PhysRevSTAB.16.021301.

[15]

I. Haber, R. Lee, H. H. Klein, and J. P. Boris. Advances In Electromagnetic Simulation Techniques. In Proc. Sixth Conf. Num. Sim. Plasmas, 46–48. Berkeley, Ca, 1973.

[16] (1,2,3,4,5)

J.-L. Vay, I. Haber, and B. B. Godfrey. A domain decomposition method for pseudo-spectral electromagnetic simulations of plasmas. Journal of Computational Physics, 243:260–268, Jun 2013. doi:10.1016/j.jcp.2013.03.010.

[17]

J. M. Dawson. Particle Simulation Of Plasmas. Reviews Of Modern Physics, 55(2):403–447, 1983. doi:10.1103/RevModPhys.55.403.

[18]

Q. H. Liu. The PSTD Algorithm: A Time-Domain Method Requiring Only Two Cells Per Wavelength. Microwave And Optical Technology Letters, 15(3):158–165, Jun 1997. doi:10.1002/(Sici)1098-2760(19970620)15:3<158::Aid-Mop11>3.3.Co;2-T.

[19]

Y. Ohmura and Y. Okamura. Staggered Grid Pseudo-Spectral Time-Domain Method For Light Scattering Analysis. Piers Online, 6(7):632–635, 2010.

[20]

H. Vincenti and J.-L. Vay. Detailed analysis of the effects of stencil spatial variations with arbitrary high-order finite-difference Maxwell solver. Computer Physics Communications, 200:147–167, Mar 2016. doi:10.1016/j.cpc.2015.11.009.

[21]

H. Abe, N. Sakairi, R. Itatani, and H. Okuda. High-Order Spline Interpolations In The Particle Simulation. Journal of Computational Physics, 63(2):247–267, Apr 1986.

[22]

A. B. Langdon. On Enforcing Gauss Law In Electromagnetic Particle-In-Cell Codes. Computer Physics Communications, 70(3):447–450, Jul 1992.

[23]

B. Marder. A Method For Incorporating Gauss Law Into Electromagnetic Pic Codes. Journal of Computational Physics, 68(1):48–55, Jan 1987.

[24]

J.-L. Vay and C. Deutsch. Charge Compensated Ion Beam Propagation In A Reactor Sized Chamber. Physics of Plasmas, 5(4):1190–1197, Apr 1998.

[25]

C. D. Munz, P. Omnes, R. Schneider, E. Sonnendrucker, and U. Voss. Divergence Correction Techniques For Maxwell Solvers Based On A Hyperbolic Model. Journal of Computational Physics, 161(2):484–511, Jul 2000. doi:10.1006/Jcph.2000.6507.

[26] (1,2,3)

J. Villasenor and O. Buneman. Rigorous Charge Conservation For Local Electromagnetic-Field Solvers. Computer Physics Communications, 69(2-3):306–316, 1992.

[27] (1,2,3)

T. Z. Esirkepov. Exact Charge Conservation Scheme For Particle-In-Cell Simulation With An Arbitrary Form-Factor. Computer Physics Communications, 135(2):144–153, Apr 2001.

[28] (1,2)

R. L. Morse and C. W. Nielson. Numerical Simulation Of Weibel Instability In One And 2 Dimensions. Phys. Fluids, 14(4):830, 1971. doi:10.1063/1.1693518.

[29]

H. R. Lewis. Variational algorithms for numerical simulation of collisionless plasma with point particles including electromagnetic interactions. Journal of Computational Physics, 10(3):400–419, 1972. doi:10.1016/0021-9991(72)90044-7.

[30]

B. B. Godfrey and J.-L. Vay. Numerical stability of relativistic beam multidimensional PIC simulations employing the Esirkepov algorithm. Journal of Computational Physics, 248:33–46, 2013. doi:10.1016/j.jcp.2013.04.006.

Mesh refinement


Sketches of the implementation of mesh refinement in WarpX with the electrostatic (left) and electromagnetic (right) solvers. In both cases, the charge/current from particles are deposited at the finest levels first, then interpolated recursively to coarser levels. In the electrostatic case, the potential is calculated first at the coarsest level \(L_0\), the solution interpolated to the boundaries of the refined patch \(r\) at the next level \(L_{1}\) and the potential calculated at \(L_1\). The procedure is repeated iteratively up to the highest level. In the electromagnetic case, the fields are computed independently on each grid and patch without interpolation at boundaries. Patches are terminated by absorbing layers (PML) to prevent the reflection of electromagnetic waves. Additional coarse patch \(c\) and fine grid \(a\) are needed so that the full solution is obtained by substitution on \(a\) as \(F_{n+1}(a)=F_{n+1}(r)+I[F_n( s )-F_{n+1}( c )]\) where \(F\) is the field, and \(I\) is a coarse-to-fine interpolation operator. In both cases, the field solution at a given level \(L_n\) is unaffected by the solution at higher levels \(L_{n+1}\) and up, allowing for mitigation of some spurious effects (see text) by providing a transition zone via extension of the patches by a few cells beyond the desired refined area (red & orange rectangles) in which the field is interpolated onto particles from the coarser parent level only.

The mesh refinement methods that have been implemented in WarpX were developed according to the following principles: i) avoidance, or minimization, of spurious effects from mesh refinement; ii) user controllability of the spurious effects’ relative magnitude; iii) simplicity of implementation. The two main generic issues that were identified are: a) spurious self-force on macroparticles close to the mesh refinement interface [1, 2]; b) reflection (and possible amplification) of short wavelength electromagnetic waves at the mesh refinement interface [3]. The two effects are due to the loss of translation invariance introduced by the asymmetry of the grid on each side of the mesh refinement interface.

In addition, for some implementations where the field that is computed at a given level is affected by the solution at finer levels, there are cases where the procedure violates the integral of Gauss’ Law around the refined patch, leading to long range errors [1, 2]. As will be shown below, in the procedure that has been developed in WarpX, the field at a given refinement level is not affected by the solution at finer levels, and is thus not affected by this type of error.

Electrostatic

A cornerstone of the Particle-In-Cell method is that given a particle lying in a hypothetical infinite grid, if the grid is regular and symmetrical, and if the order of field gathering matches the order of charge (or current) deposition, then there is no self-force of the particle acting on itself: a) anywhere if using the so-called “momentum conserving” gathering scheme; b) on average within one cell if using the “energy conserving” gathering scheme [4]. A breaking of the regularity and/or symmetry in the grid, whether it is from the use of irregular meshes or mesh refinement, and whether one uses finite difference, finite volume or finite elements, results in a net spurious self-force (which does not average to zero over one cell) for a macroparticle close to the point of irregularity (mesh refinement interface for the current purpose) [1, 2].

A sketch of the implementation of mesh refinement in WarpX is given in Fig. 29. Given the solution of the electric potential at a refinement level \(L_n\), it is interpolated onto the boundaries of the grid patch(es) at the next refined level \(L_{n+1}\). The electric potential is then computed at level \(L_{n+1}\) by solving the Poisson equation. This procedure necessitates the knowledge of the charge density at every level of refinement. For efficiency, the macroparticles deposit their charge on the highest-level patch that contains them, and the charge density of each patch is added recursively to lower levels, down to the lowest.


Position history of one charged particle attracted by its image induced by a nearby metallic (Dirichlet) boundary. The particle is initialized at rest. Without a refinement patch (reference case), the particle is accelerated by its image, is reflected specularly at the wall, then decelerates until it reaches its initial position at rest. If the particle is initialized inside a refinement patch, the particle is initially accelerated toward the wall but is spuriously reflected before it reaches the boundary of the patch, whether using the method implemented in WarpX or the MC method. Providing a surrounding transition region 2 or 4 cells wide, in which the potential is interpolated from the parent coarse solution, significantly reduces the effect of the spurious self-force.

The presence of the self-force is illustrated on a simple test case that was introduced in Vay et al. [1] and also used in Colella and Norgaard [2]: a single macroparticle is initialized at rest within a single refinement patch four cells away from the patch refinement boundary. The patch at level \(L_1\) has \(32\times32\) cells and is centered relative to the lowest \(64\times64\) grid at level \(L_0\) (“main grid”), while the macroparticle is centered in one direction but not in the other. The boundaries of the main grid are perfectly conducting, so that the macroparticle is attracted to the closest wall by its image. Specular reflection is applied when the particle reaches the boundary so that the motion is cyclic. The test was performed with WarpX using either linear or quadratic interpolation when gathering the main grid solution onto the refined patch boundary. It was also performed using another method from P. McCorquodale et al. (labeled “MC” below) based on the algorithm given in Mccorquodale et al. [5], which employs a more elaborate procedure involving two-way interpolations between the main grid and the refined patch. A reference case was also run using a single \(128\times128\) grid with no refined patch, in which it is observed that the particle propagates toward the closest boundary at an accelerated pace, is reflected specularly at the boundary, then slows down until it reaches its initial position at zero velocity. The particle position histories are shown for the various cases in Fig. 30. In all the cases using the refinement patch, the particle was spuriously reflected near the patch boundary and was effectively trapped in the patch. We notice that linear interpolation performs better than quadratic, and that the simple method implemented in WarpX performs better than the other proposed method for this test (see discussion below).


(left) Maps of the magnitude of the spurious self-force \(\epsilon\) in arbitrary units within one quarter of the refined patch, defined as \(\epsilon=\sqrt{(E_x-E_x^{ref})^2+(E_y-E_y^{ref})^2}\), where \(E_x\) and \(E_y\) are the electric field components within the patch experienced by one particle at a given location and \(E_x^{ref}\) and \(E_y^{ref}\) are the electric field from a reference solution. The map is given for the WarpX and the MC mesh refinement algorithms and for linear and quadratic interpolation at the patch refinement boundary. (right) Lineouts of the maximum (taken over neighboring cells) of the spurious self-force. Close to the interface boundary (x=0), the spurious self-force decreases at a rate close to one order of magnitude per cell (red line), then at about one order of magnitude per six cells (green line).

The magnitude of the spurious self-force as a function of the macroparticle position was mapped and is shown in Fig. 31 for the WarpX and MC algorithms using linear or quadratic interpolations between grid levels. It is observed that the magnitude of the spurious self-force decreases rapidly with the distance between the particle and the refined patch boundary, at a rate approaching one order of magnitude per cell for the four cells closest to the boundary and about one order of magnitude per six cells beyond. The method implemented in WarpX offers a weaker spurious force on average and especially at the cells that are the closest to the coarse-fine interface where it is the largest and thus matters most. We notice that the magnitude of the spurious self-force depends strongly on the distance to the edge of the patch and to the nodes of the underlying coarse grid, but weakly on the order of deposition and size of the patch.

A method was devised and implemented in WarpX for reducing the magnitude of spurious self-forces near the coarse-fine boundaries as follows. Noting that the coarse grid solution is unaffected by the presence of the patch and is thus free of self-force, extra “transition” cells are added around the “effective” refined area. Within the effective area, the particles gather the potential in the fine grid. In the extra transition cells surrounding the refinement patch, the force is gathered directly from the coarse grid (an option, which has not yet been implemented, would be to interpolate between the coarse and fine grid field solutions within the transition zone so as to provide continuity of the force experienced by the particles at the interface). The number of cells allocated in the transition zones is controllable by the user in WarpX, giving the opportunity to check whether the spurious self-force is affecting the calculation by repeating it using different thicknesses of the transition zones. The control of the spurious force using the transition zone is illustrated in Fig. 30, where the calculation with WarpX using linear interpolation at the patch interface was repeated using transition regions of either two or four cells (measured in refined-patch cell units). Using two extra cells allowed the particle to be free of spurious trapping within the refined area and to follow a trajectory that is close to the reference one, and using four extra cells improved the result further, to the point where the resulting trajectory becomes indistinguishable from the reference one. We note that an alternative method was devised for reducing the magnitude of self-force near the coarse-fine boundaries for the MC method, by using a special deposition procedure near the interface [2].

Electromagnetic

The method that is used for electrostatic mesh refinement is not directly applicable to electromagnetic calculations. As was shown in section 3.4 of Vay [3], refinement schemes relying solely on interpolation between coarse and fine patches lead to the reflection with amplification of the short wavelength modes that fall below the cutoff of the Nyquist frequency of the coarse grid. Unless these modes are damped heavily or prevented from occurring at their source, they may affect particle motion and their effect can escalate if trapped within a patch, via multiple successive reflections with amplification.

To circumvent this issue, an additional coarse patch (with the same resolution as the parent grid) is added, as shown in Fig. 29 and described in Vay et al. [6]. Both the fine and the coarse grid patches are terminated by Perfectly Matched Layers, reducing wave reflection by orders of magnitude, controllable by the user [7, 8]. The source current resulting from the motion of charged macroparticles within the refined region is accumulated on the fine patch and is then interpolated onto the coarse patch and added onto the parent grid. The process is repeated recursively from the finest level down to the coarsest. The Maxwell equations are then solved for one time interval on the entire set of grids, by default for one time step using the time step of the finest grid. The field on the coarse and fine patches only contain the contributions from the particles that have evolved within the refined area but not from the current sources outside the area. The total contribution of the field from sources within and outside the refined area is obtained by adding the field from the refined grid \(F(r)\), and adding an interpolation \(I\) of the difference between the relevant subset \(s\) of the field in the parent grid \(F(s)\) and the field of the coarse grid \(F( c )\), on an auxiliary grid \(a\), i.e. \(F(a)=F(r)+I[F(s)-F( c )]\). The field on the parent grid subset \(F(s)\) contains contributions from sources from both within and outside of the refined area. Thus, in effect, there is substitution of the coarse field resulting from sources within the patch area by its fine resolution counterpart. The operation is carried out recursively starting at the coarsest level up to the finest. An option has been implemented in which various grid levels are pushed with different time steps, given as a fixed fraction of the individual grid Courant conditions (assuming same cell aspect ratio for all grids and refinement by integer factors). In this case, the fields from the coarse levels, which are advanced less often, are interpolated in time.
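
A one-dimensional sketch (Python/NumPy) of the substitution \(F(a)=F(r)+I[F(s)-F( c )]\) is given below; the linear interpolation, the assumed alignment of the fine and coarse patches, and the function and argument names are illustrative assumptions only:

    import numpy as np

    def substitute(F_fine_patch, F_coarse_patch, F_parent_subset, ref_ratio=2):
        """Return F(a) = F(r) + I[F(s) - F(c)] on the auxiliary fine grid."""
        diff = F_parent_subset - F_coarse_patch            # coarse-grid field minus coarse-patch field
        x_coarse = np.arange(F_coarse_patch.size)          # coarse positions, in coarse-cell units
        x_fine = np.arange(F_fine_patch.size) / ref_ratio  # fine positions, in coarse-cell units
        return F_fine_patch + np.interp(x_fine, x_coarse, diff)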

The substitution method has two potential drawbacks, due to the inexact cancellation between the coarse and fine patches of: (i) the remnants of ghost fixed charges created by the particles entering and leaving the patches (this effect is due to the use of the electromagnetic solver and is different from the spurious self-force that was described for the electrostatic case); (ii) the electromagnetic waves traveling on each patch at slightly different velocities, due to numerical dispersion, if using a Maxwell solver with a low-order stencil. The first issue results in an effective spurious multipole field whose magnitude decreases very rapidly with the distance to the patch boundary, similarly to the spurious self-force in the electrostatic case. Hence, adding a few extra transition cells surrounding the patches mitigates this effect very effectively. The tunability of WarpX’s electromagnetic finite-difference and pseudo-spectral solvers provides the means to optimize the numerical dispersion so as to minimize the second effect for a given application, which has been demonstrated on the laser-plasma interaction test case presented in Vay et al. [6]. Both effects and their mitigation are described in more detail in Vay et al. [6].

Caustics are supported anywhere on the grid with an accuracy that is set by the local resolution, and will be adequately resolved if the grid resolution supports the necessary modes from their sources to the points of wavefront crossing. The mesh refinement method that is implemented in WarpX has the potential to provide higher efficiency than the standard use of fixed gridding, by offering a path toward adaptive gridding following wavefronts.

[1] (1,2,3,4)

J.-L. Vay, P. Colella, P. Mccorquodale, B. Van Straalen, A. Friedman, and D. P. Grote. Mesh Refinement For Particle-In-Cell Plasma Simulations: Applications To And Benefits For Heavy Ion Fusion. Laser And Particle Beams, 20(4):569–575, Dec 2002. doi:10.1017/S0263034602204139.

[2] (1,2,3,4,5)

P. Colella and P. C. Norgaard. Controlling Self-Force Errors At Refinement Boundaries For Amr-Pic. Journal of Computational Physics, 229(4):947–957, Feb 2010. doi:10.1016/J.Jcp.2009.07.004.

[3] (1,2)

J.-L. Vay. An Extended Fdtd Scheme For The Wave Equation: Application To Multiscale Electromagnetic Simulation. Journal of Computational Physics, 167(1):72–98, Feb 2001.

[4]

C. K. Birdsall and A. B. Langdon. Plasma Physics Via Computer Simulation. Adam-Hilger, 1991. ISBN 0 07 005371 5.

[5]

P. Mccorquodale, P. Colella, D. P. Grote, and J.-L. Vay. A Node-Centered Local Refinement Algorithm For Poisson's Equation In Complex Geometries. Journal of Computational Physics, 201(1):34–60, Nov 2004. doi:10.1016/J.Jcp.2004.04.022.

[6] (1,2,3)

J.-L. Vay, J.-C. Adam, and A. Heron. Asymmetric Pml For The Absorption Of Waves. Application To Mesh Refinement In Electromagnetic Particle-In-Cell Plasma Simulations. Computer Physics Communications, 164(1-3):171–177, Dec 2004. doi:10.1016/J.Cpc.2004.06.026.

[7]

J. P. Berenger. Three-Dimensional Perfectly Matched Layer For The Absorption Of Electromagnetic Waves. Journal of Computational Physics, 127(2):363–379, Sep 1996.

[8]

J.-L. Vay. Asymmetric Perfectly Matched Layer For The Absorption Of Waves. Journal of Computational Physics, 183(2):367–399, Dec 2002. doi:10.1006/Jcph.2002.7175.

Boundary conditions

Perfectly Matched Layer: open boundary condition for electromagnetic waves

For the transverse electric (TE) case, Berenger’s original Perfectly Matched Layer (PML) paper [1] writes

()\[\varepsilon _{0}\frac{\partial E_{x}}{\partial t}+\sigma _{y}E_{x} = \frac{\partial H_{z}}{\partial y}\]
()\[\varepsilon _{0}\frac{\partial E_{y}}{\partial t}+\sigma _{x}E_{y} = -\frac{\partial H_{z}}{\partial x}\]
()\[\mu _{0}\frac{\partial H_{zx}}{\partial t}+\sigma ^{*}_{x}H_{zx} = -\frac{\partial E_{y}}{\partial x}\]
()\[\mu _{0}\frac{\partial H_{zy}}{\partial t}+\sigma ^{*}_{y}H_{zy} = \frac{\partial E_{x}}{\partial y}\]
()\[H_{z} = H_{zx}+H_{zy}\]

This can be generalized to

()\[\varepsilon _{0}\frac{\partial E_{x}}{\partial t}+\sigma _{y}E_{x} = \frac{c_{y}}{c}\frac{\partial H_{z}}{\partial y}+\overline{\sigma }_{y}H_{z}\]
()\[\varepsilon _{0}\frac{\partial E_{y}}{\partial t}+\sigma _{x}E_{y} = -\frac{c_{x}}{c}\frac{\partial H_{z}}{\partial x}+\overline{\sigma }_{x}H_{z}\]
()\[\mu _{0}\frac{\partial H_{zx}}{\partial t}+\sigma ^{*}_{x}H_{zx} = -\frac{c^{*}_{x}}{c}\frac{\partial E_{y}}{\partial x}+\overline{\sigma }_{x}^{*}E_{y}\]
()\[\mu _{0}\frac{\partial H_{zy}}{\partial t}+\sigma ^{*}_{y}H_{zy} = \frac{c^{*}_{y}}{c}\frac{\partial E_{x}}{\partial y}+\overline{\sigma }_{y}^{*}E_{x}\]
()\[H_{z} = H_{zx}+H_{zy}\]

For \(c_{x}=c_{y}=c^{*}_{x}=c^{*}_{y}=c\) and \(\overline{\sigma }_{x}=\overline{\sigma }_{y}=\overline{\sigma }_{x}^{*}=\overline{\sigma }_{y}^{*}=0\), this system reduces to the Berenger PML medium, while adding the additional constraint \(\sigma _{x}=\sigma _{y}=\sigma _{x}^{*}=\sigma _{y}^{*}=0\) leads to the system of Maxwell equations in vacuum.

Propagation of a Plane Wave in an APML Medium

We consider a plane wave of magnitude (\(E_{0},H_{zx0},H_{zy0}\)) and angular frequency \(\omega\) propagating in the APML medium with an angle \(\varphi\) relative to the x axis

()\[E_{x} = -E_{0}\sin \varphi \: e^{i\omega \left( t-\alpha x-\beta y\right) }\]
()\[E_{y} = E_{0}\cos \varphi \: e^{i\omega \left( t-\alpha x-\beta y\right) }\]
()\[H_{zx} = H_{zx0} \: e^{i\omega \left( t-\alpha x-\beta y\right) }\]
()\[H_{zy} = H_{zy0} \: e^{i\omega \left( t-\alpha x-\beta y\right) }\]

where \(\alpha\) and \(\beta\) are two complex constants to be determined.

Introducing Eqs. (36), (37), (38) and (39) into Eqs. (31), (32), (33) and (34) gives

()\[\varepsilon _{0}E_{0}\sin \varphi -i\frac{\sigma _{y}}{\omega }E_{0}\sin \varphi = \beta \frac{c_{y}}{c}\left( H_{zx0}+H_{zy0}\right) +i\frac{\overline{\sigma }_{y}}{\omega }\left( H_{zx0}+H_{zy0}\right)\]
()\[\varepsilon _{0}E_{0}\cos \varphi -i\frac{\sigma _{x}}{\omega }E_{0}\cos \varphi = \alpha \frac{c_{x}}{c}\left( H_{zx0}+H_{zy0}\right) -i\frac{\overline{\sigma }_{x}}{\omega }\left( H_{zx0}+H_{zy0}\right)\]
()\[\mu _{0}H_{zx0}-i\frac{\sigma ^{*}_{x}}{\omega }H_{zx0} = \alpha \frac{c^{*}_{x}}{c}E_{0}\cos \varphi -i\frac{\overline{\sigma }^{*}_{x}}{\omega }E_{0}\cos \varphi\]
()\[\mu _{0}H_{zy0}-i\frac{\sigma ^{*}_{y}}{\omega }H_{zy0} = \beta \frac{c^{*}_{y}}{c}E_{0}\sin \varphi +i\frac{\overline{\sigma }^{*}_{y}}{\omega }E_{0}\sin \varphi\]

Defining \(Z=E_{0}/\left( H_{zx0}+H_{zy0}\right)\) and using Eqs. (40) and (41), we get

()\[\beta = \left[ Z\left( \varepsilon _{0}-i\frac{\sigma _{y}}{\omega }\right) \sin \varphi -i\frac{\overline{\sigma }_{y}}{\omega }\right] \frac{c}{c_{y}}\]
()\[\alpha = \left[ Z\left( \varepsilon _{0}-i\frac{\sigma _{x}}{\omega }\right) \cos \varphi +i\frac{\overline{\sigma }_{x}}{\omega }\right] \frac{c}{c_{x}}\]

Adding \(H_{zx0}\) and \(H_{zy0}\) from Eqs. (42) and (43) and substituting the expressions for \(\alpha\) and \(\beta\) from Eqs. (44) and (45) yields

\[\begin{split}\begin{aligned} \frac{1}{Z} & = \frac{Z\left( \varepsilon _{0}-i\frac{\sigma _{x}}{\omega }\right) \cos \varphi \frac{c^{*}_{x}}{c_{x}}+i\frac{\overline{\sigma }_{x}}{\omega }\frac{c^{*}_{x}}{c_{x}}-i\frac{\overline{\sigma }^{*}_{x}}{\omega }}{\mu _{0}-i\frac{\sigma ^{*}_{x}}{\omega }}\cos \varphi \nonumber \\ & + \frac{Z\left( \varepsilon _{0}-i\frac{\sigma _{y}}{\omega }\right) \sin \varphi \frac{c^{*}_{y}}{c_{y}}-i\frac{\overline{\sigma }_{y}}{\omega }\frac{c^{*}_{y}}{c_{y}}+i\frac{\overline{\sigma }^{*}_{y}}{\omega }}{\mu _{0}-i\frac{\sigma ^{*}_{y}}{\omega }}\sin \varphi \end{aligned}\end{split}\]

If \(c_{x}=c^{*}_{x}\), \(c_{y}=c^{*}_{y}\), \(\overline{\sigma }_{x}=\overline{\sigma }^{*}_{x}\), \(\overline{\sigma }_{y}=\overline{\sigma }^{*}_{y}\), \(\frac{\sigma _{x}}{\varepsilon _{0}}=\frac{\sigma ^{*}_{x}}{\mu _{0}}\) and \(\frac{\sigma _{y}}{\varepsilon _{0}}=\frac{\sigma ^{*}_{y}}{\mu _{0}}\) then

()\[Z = \pm \sqrt{\frac{\mu _{0}}{\varepsilon _{0}}}\]

which is the impedance of vacuum. Hence, like the PML, given some restrictions on the parameters, the APML does not generate any reflection at any angle and any frequency. As for the PML, this property is not retained after discretization, as shown subsequently.

Calling \(\psi\) any component of the field and \(\psi _{0}\) its magnitude, we get from Eqs. (36), (44), (45) and (46) that

()\[\psi =\psi _{0} \: e^{i\omega \left( t\mp x\cos \varphi /c_{x}\mp y\sin \varphi /c_{y}\right) }e^{-\left( \pm \frac{\sigma _{x}\cos \varphi }{\varepsilon _{0}c_{x}}+\overline{\sigma }_{x}\frac{c}{c_{x}}\right) x} e^{-\left( \pm \frac{\sigma _{y}\sin \varphi }{\varepsilon _{0}c_{y}}+\overline{\sigma }_{y}\frac{c}{c_{y}}\right) y}.\]

We assume that we have an APML layer of thickness \(\delta\) (measured along \(x\)) and that \(\sigma _{y}=\overline{\sigma }_{y}=0\) and \(c_{y}=c.\) Using (47), we determine that the coefficient of reflection given by this layer is

\[\begin{split}\begin{aligned} R_{\mathrm{APML}}\left( \theta \right) & = e^{-\left( \sigma _{x}\cos \varphi /\varepsilon _{0}c_{x}+\overline{\sigma }_{x}c/c_{x}\right) \delta }e^{-\left( \sigma _{x}\cos \varphi /\varepsilon _{0}c_{x}-\overline{\sigma }_{x}c/c_{x}\right) \delta },\nonumber \\ & = e^{-2\left( \sigma _{x}\cos \varphi /\varepsilon _{0}c_{x}\right) \delta }, \end{aligned}\end{split}\]

which happens to be the same as the PML theoretical coefficient of reflection if we assume \(c_{x}=c\). Hence, it follows that for the purpose of wave absorption, the term \(\overline{\sigma }_{x}\) seems to be of no interest. However, although this conclusion is true in the continuous limit, it does not hold for the discretized counterpart.
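
The theoretical reflection coefficient above is straightforward to evaluate; a small helper (Python, SI constants from scipy.constants, assuming \(c_{x}=c\)) could read:

    import numpy as np
    from scipy.constants import c, epsilon_0

    def r_apml(sigma_x, delta, phi):
        """Theoretical APML reflection coefficient for a layer of thickness
        delta, conductivity sigma_x and incidence angle phi (with c_x = c)."""
        return np.exp(-2.0 * sigma_x * np.cos(phi) * delta / (epsilon_0 * c))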

Discretization

In the following we set \(\varepsilon_0 = \mu_0 = 1\). We discretize Eqs. (26), (27), (28), and (29) to obtain

\[\frac{E_x|^{n+1}_{j+1/2,k,l}-E_x|^{n}_{j+1/2,k,l}}{\Delta t} + \sigma_y \frac{E_x|^{n+1}_{j+1/2,k,l}+E_x|^{n}_{j+1/2,k,l}}{2} = \frac{H_z|^{n+1/2}_{j+1/2,k+1/2,l}-H_z|^{n+1/2}_{j+1/2,k-1/2,l}}{\Delta y}\]
\[\frac{E_y|^{n+1}_{j,k+1/2,l}-E_y|^{n}_{j,k+1/2,l}}{\Delta t} + \sigma_x \frac{E_y|^{n+1}_{j,k+1/2,l}+E_y|^{n}_{j,k+1/2,l}}{2} = - \frac{H_z|^{n+1/2}_{j+1/2,k+1/2,l}-H_z|^{n+1/2}_{j-1/2,k+1/2,l}}{\Delta x}\]
\[\frac{H_{zx}|^{n+3/2}_{j+1/2,k+1/2,l}-H_{zx}|^{n+1/2}_{j+1/2,k+1/2,l}}{\Delta t} + \sigma^*_x \frac{H_{zx}|^{n+3/2}_{j+1/2,k+1/2,l}+H_{zx}|^{n+1/2}_{j+1/2,k+1/2,l}}{2} = - \frac{E_y|^{n+1}_{j+1,k+1/2,l}-E_y|^{n+1}_{j,k+1/2,l}}{\Delta x}\]
\[\frac{H_{zy}|^{n+3/2}_{j+1/2,k+1/2,l}-H_{zy}|^{n+1/2}_{j+1/2,k+1/2,l}}{\Delta t} + \sigma^*_y \frac{H_{zy}|^{n+3/2}_{j+1/2,k+1/2,l}+H_{zy}|^{n+1/2}_{j+1/2,k+1/2,l}}{2} = \frac{E_x|^{n+1}_{j+1/2,k+1,l}-E_x|^{n+1}_{j+1/2,k,l}}{\Delta y}\]

and this can be solved to obtain the following leapfrog integration equations

\[\begin{split}\begin{aligned} E_x|^{n+1}_{j+1/2,k,l} & = \left(\frac{1-\sigma_y \Delta t/2}{1+\sigma_y \Delta t/2}\right) E_x|^{n}_{j+1/2,k,l} + \frac{\Delta t/\Delta y}{1+\sigma_y \Delta t/2} \left(H_z|^{n+1/2}_{j+1/2,k+1/2,l}-H_z|^{n+1/2}_{j+1/2,k-1/2,l}\right) \\ E_y|^{n+1}_{j,k+1/2,l} & = \left(\frac{1-\sigma_x \Delta t/2}{1+\sigma_x \Delta t/2}\right) E_y|^{n}_{j,k+1/2,l} - \frac{\Delta t/\Delta x}{1+\sigma_x \Delta t/2} \left(H_z|^{n+1/2}_{j+1/2,k+1/2,l}-H_z|^{n+1/2}_{j-1/2,k+1/2,l}\right) \\ H_{zx}|^{n+3/2}_{j+1/2,k+1/2,l} & = \left(\frac{1-\sigma^*_x \Delta t/2}{1+\sigma^*_x \Delta t/2}\right) H_{zx}|^{n+1/2}_{j+1/2,k+1/2,l} - \frac{\Delta t/\Delta x}{1+\sigma^*_x \Delta t/2} \left(E_y|^{n+1}_{j+1,k+1/2,l}-E_y|^{n+1}_{j,k+1/2,l}\right) \\ H_{zy}|^{n+3/2}_{j+1/2,k+1/2,l} & = \left(\frac{1-\sigma^*_y \Delta t/2}{1+\sigma^*_y \Delta t/2}\right) H_{zy}|^{n+1/2}_{j+1/2,k+1/2,l} + \frac{\Delta t/\Delta y}{1+\sigma^*_y \Delta t/2} \left(E_x|^{n+1}_{j+1/2,k+1,l}-E_x|^{n+1}_{j+1/2,k,l}\right) \end{aligned}\end{split}\]

If we account for higher order \(\Delta t\) terms, a better approximation is given by

\[\begin{split}\begin{aligned} E_x|^{n+1}_{j+1/2,k,l} & = e^{-\sigma_y\Delta t} E_x|^{n}_{j+1/2,k,l} + \frac{1-e^{-\sigma_y\Delta t}}{\sigma_y \Delta y} \left(H_z|^{n+1/2}_{j+1/2,k+1/2,l}-H_z|^{n+1/2}_{j+1/2,k-1/2,l}\right) \\ E_y|^{n+1}_{j,k+1/2,l} & = e^{-\sigma_x\Delta t} E_y|^{n}_{j,k+1/2,l} - \frac{1-e^{-\sigma_x\Delta t}}{\sigma_x \Delta x} \left(H_z|^{n+1/2}_{j+1/2,k+1/2,l}-H_z|^{n+1/2}_{j-1/2,k+1/2,l}\right) \\ H_{zx}|^{n+3/2}_{j+1/2,k+1/2,l} & = e^{-\sigma^*_x\Delta t} H_{zx}|^{n+1/2}_{j+1/2,k+1/2,l} - \frac{1-e^{-\sigma^*_x\Delta t}}{\sigma^*_x \Delta x} \left(E_y|^{n+1}_{j+1,k+1/2,l}-E_y|^{n+1}_{j,k+1/2,l}\right) \\ H_{zy}|^{n+3/2}_{j+1/2,k+1/2,l} & = e^{-\sigma^*_y\Delta t} H_{zy}|^{n+1/2}_{j+1/2,k+1/2,l} + \frac{1-e^{-\sigma^*_y\Delta t}}{\sigma^*_y \Delta y} \left(E_x|^{n+1}_{j+1/2,k+1,l}-E_x|^{n+1}_{j+1/2,k,l}\right) \end{aligned}\end{split}\]

More generally, this becomes

\[\begin{split}\begin{aligned} E_x|^{n+1}_{j+1/2,k,l} & = e^{-\sigma_y\Delta t} E_x|^{n}_{j+1/2,k,l} + \frac{1-e^{-\sigma_y\Delta t}}{\sigma_y \Delta y}\frac{c_y}{c} \left(H_z|^{n+1/2}_{j+1/2,k+1/2,l}-H_z|^{n+1/2}_{j+1/2,k-1/2,l}\right) \\ E_y|^{n+1}_{j,k+1/2,l} & = e^{-\sigma_x\Delta t} E_y|^{n}_{j,k+1/2,l} - \frac{1-e^{-\sigma_x\Delta t}}{\sigma_x \Delta x}\frac{c_x}{c} \left(H_z|^{n+1/2}_{j+1/2,k+1/2,l}-H_z|^{n+1/2}_{j-1/2,k+1/2,l}\right) \\ H_{zx}|^{n+3/2}_{j+1/2,k+1/2,l} & = e^{-\sigma^*_x\Delta t} H_{zx}|^{n+1/2}_{j+1/2,k+1/2,l} - \frac{1-e^{-\sigma^*_x\Delta t}}{\sigma^*_x \Delta x}\frac{c^*_x}{c} \left(E_y|^{n+1}_{j+1,k+1/2,l}-E_y|^{n+1}_{j,k+1/2,l}\right) \\ H_{zy}|^{n+3/2}_{j+1/2,k+1/2,l} & = e^{-\sigma^*_y\Delta t} H_{zy}|^{n+1/2}_{j+1/2,k+1/2,l} + \frac{1-e^{-\sigma^*_y\Delta t}}{\sigma^*_y \Delta y}\frac{c^*_y}{c} \left(E_x|^{n+1}_{j+1/2,k+1,l}-E_x|^{n+1}_{j+1/2,k,l}\right) \end{aligned}\end{split}\]

If we set

\[\begin{split}\begin{aligned} c_x & = c \: e^{-\sigma_x\Delta t} \frac{\sigma_x \Delta t}{1-e^{-\sigma_x\Delta t}} \\ c_y & = c \: e^{-\sigma_y\Delta t} \frac{\sigma_y \Delta t}{1-e^{-\sigma_y\Delta t}} \\ c^*_x & = c \: e^{-\sigma^*_x\Delta t} \frac{\sigma^*_x \Delta t}{1-e^{-\sigma^*_x\Delta t}} \\ c^*_y & = c \: e^{-\sigma^*_y\Delta t} \frac{\sigma^*_y \Delta t}{1-e^{-\sigma^*_y\Delta t}}\end{aligned}\end{split}\]

then this becomes

\[\begin{split}\begin{aligned} E_x|^{n+1}_{j+1/2,k,l} & = e^{-\sigma_y\Delta t} \left[ E_x|^{n}_{j+1/2,k,l} + \frac{\Delta t}{\Delta y} \left(H_z|^{n+1/2}_{j+1/2,k+1/2,l}-H_z|^{n+1/2}_{j+1/2,k-1/2,l}\right) \right] \\ E_y|^{n+1}_{j,k+1/2,l} & = e^{-\sigma_x\Delta t} \left[ E_y|^{n}_{j,k+1/2,l} - \frac{\Delta t}{\Delta x} \left(H_z|^{n+1/2}_{j+1/2,k+1/2,l}-H_z|^{n+1/2}_{j-1/2,k+1/2,l}\right) \right] \\ H_{zx}|^{n+3/2}_{j+1/2,k+1/2,l} & = e^{-\sigma^*_x\Delta t} \left[ H_{zx}|^{n+1/2}_{j+1/2,k+1/2,l} - \frac{\Delta t}{\Delta x} \left(E_y|^{n+1}_{j+1,k+1/2,l}-E_y|^{n+1}_{j,k+1/2,l}\right) \right] \\ H_{zy}|^{n+3/2}_{j+1/2,k+1/2,l} & = e^{-\sigma^*_y\Delta t} \left[ H_{zy}|^{n+1/2}_{j+1/2,k+1/2,l} + \frac{\Delta t}{\Delta y} \left(E_x|^{n+1}_{j+1/2,k+1,l}-E_x|^{n+1}_{j+1/2,k,l}\right) \right] \end{aligned}\end{split}\]

When the generalized conductivities are zero, the update equations are

\[\begin{split}\begin{aligned} E_x|^{n+1}_{j+1/2,k,l} & = E_x|^{n}_{j+1/2,k,l} + \frac{\Delta t}{\Delta y} \left(H_z|^{n+1/2}_{j+1/2,k+1/2,l}-H_z|^{n+1/2}_{j+1/2,k-1/2,l}\right) \\ E_y|^{n+1}_{j,k+1/2,l} & = E_y|^{n}_{j,k+1/2,l} - \frac{\Delta t}{\Delta x} \left(H_z|^{n+1/2}_{j+1/2,k+1/2,l}-H_z|^{n+1/2}_{j-1/2,k+1/2,l}\right) \\ H_{zx}|^{n+3/2}_{j+1/2,k+1/2,l} & = H_{zx}|^{n+1/2}_{j+1/2,k+1/2,l} - \frac{\Delta t}{\Delta x} \left(E_y|^{n+1}_{j+1,k+1/2,l}-E_y|^{n+1}_{j,k+1/2,l}\right) \\ H_{zy}|^{n+3/2}_{j+1/2,k+1/2,l} & = H_{zy}|^{n+1/2}_{j+1/2,k+1/2,l} + \frac{\Delta t}{\Delta y} \left(E_x|^{n+1}_{j+1/2,k+1,l}-E_x|^{n+1}_{j+1/2,k,l}\right) \end{aligned}\end{split}\]

as expected.
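
For illustration, one damped leapfrog update of the four split TE field components, in the form given just above, can be sketched as follows (Python/NumPy, with \(\varepsilon_0=\mu_0=1\); the 2D array layout and the periodic differencing via np.roll are simplifying assumptions of this example, not WarpX's PML implementation):

    import numpy as np

    def apml_update(Ex, Ey, Hzx, Hzy, sx, sy, sxs, sys, dt, dx, dy):
        """One update of the split TE fields; sx, sy, sxs, sys are the
        conductivities sigma_x, sigma_y, sigma*_x, sigma*_y (scalars or arrays)."""
        Hz = Hzx + Hzy
        Ex = np.exp(-sy * dt) * (Ex + dt / dy * (Hz - np.roll(Hz, 1, axis=1)))
        Ey = np.exp(-sx * dt) * (Ey - dt / dx * (Hz - np.roll(Hz, 1, axis=0)))
        Hzx = np.exp(-sxs * dt) * (Hzx - dt / dx * (np.roll(Ey, -1, axis=0) - Ey))
        Hzy = np.exp(-sys * dt) * (Hzy + dt / dy * (np.roll(Ex, -1, axis=1) - Ex))
        return Ex, Ey, Hzx, Hzy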

Perfect Electrical Conductor

This boundary can be used to model a dielectric or metallic surface. For the electromagnetic solve, at the PEC boundary, the tangential electric field and the normal magnetic field are set to 0. In the guard-cell region, the tangential electric field is set equal and opposite to the respective field component in the mirror location across the PEC boundary, and the normal electric field is set equal to the field component in the mirror location in the domain across the PEC boundary. Similarly, the tangential (and normal) magnetic field components are set equal (and opposite) to the respective magnetic field components in the mirror locations across the PEC boundary.

The PEC boundary condition also impacts the deposition of charge and current density. On the boundary, the charge density and parallel current density are set to zero. If a reflecting boundary condition is used for the particles, density overlapping with the PEC is reflected back into the domain (for both charge and current density). If absorbing boundaries are used, an image charge (equal weight but opposite charge) is considered in the mirror location across the boundary, and the density from that charge is also deposited in the simulation domain. Fig. 32 shows the effect of this. The left boundary is absorbing while the right boundary is reflecting.


PEC boundary current deposition along the x-axis. The left boundary is absorbing while the right boundary is reflecting.

[1]

J. P. Berenger. A Perfectly Matched Layer For The Absorption Of Electromagnetic-Waves. Journal of Computational Physics, 114(2):185–200, Oct 1994.

Moving window and optimal Lorentz boosted frame

First-principles simulations of plasma accelerators are extremely computationally intensive, due to the need to resolve the evolution of a driver (laser or particle beam) and an accelerated particle beam into a plasma structure that is orders of magnitude longer and wider than the accelerated beam. As is customary in the modeling of particle beam dynamics in standard particle accelerators, a moving window is commonly used to follow the driver, the wake and the accelerated beam. This results in huge savings, by avoiding the meshing of the entire plasma, which is orders of magnitude longer than the other length scales of interest.


A first principle simulation of a short driver beam (laser or charged particles) propagating through a plasma that is orders of magnitude longer necessitates a very large number of time steps. Recasting the simulation in a frame of reference that is moving close to the speed of light in the direction of the driver beam leads to simulating a driver beam that appears longer propagating through a plasma that appears shorter than in the laboratory. Thus, this relativistic transformation of space and time reduces the disparity of scales, and thereby the number of time steps to complete the simulation, by orders of magnitude.

Even using a moving window, however, a full PIC simulation of a plasma accelerator can be extraordinarily demanding computationally, as many time steps are needed to resolve the crossing of the short driver beam with the plasma column. As it turns out, choosing an optimal frame of reference that travels close to the speed of light in the direction of the laser or particle beam (as opposed to the usual choice of the laboratory frame) enables speedups by orders of magnitude [1, 2]. This is a result of the properties of Lorentz contraction and dilation of space and time. In the frame of the laboratory, a very short driver (laser or particle) beam propagates through a much longer plasma column, necessitating millions to tens of millions of time steps for parameters in the range of the BELLA or FACET-II experiments. As sketched in Fig. 33, in a frame moving with the driver beam in the plasma at velocity \(v=\beta c\) (where \(c\) is the speed of light in vacuum), the beam length is now elongated by \(\approx(1+\beta)\gamma\) while the plasma contracts by \(\gamma\) (where \(\gamma=1/\sqrt{1-\beta^2}\) is the relativistic factor associated with the frame velocity). The number of time steps that is needed to simulate a “longer” beam through a “shorter” plasma is now reduced by up to \(\approx(1+\beta) \gamma^2\) (a detailed derivation of the speedup is given below).

The modeling of a plasma acceleration stage in a boosted frame involves the fully electromagnetic modeling of a plasma propagating at near the speed of light, for which Numerical Cerenkov [3, 4] is a potential issue, as explained in more detail below. In addition, for a frame of reference moving in the direction of the accelerated beam (or equivalently the wake of the laser), waves emitted by the plasma in the forward direction expand while the ones emitted in the backward direction contract, following the properties of the Lorentz transformation. If one had to resolve both forward and backward propagating waves emitted from the plasma, there would be no gain in selecting a frame different from the laboratory frame. However, the physics of interest for a laser wakefield is the laser driving the wake, the wake itself, and the accelerated beam. Backscatter is weak in the short-pulse regime and does not interact as strongly with the beam as do the forward propagating waves, which stay in phase with it for a long period. It is thus often assumed that the backward propagating waves can be neglected in the modeling of plasma accelerator stages. The accuracy of this assumption has been demonstrated by comparison between explicit codes which include both forward and backward waves and envelope or quasistatic codes which neglect backward waves [5, 6, 7].

Theoretical speedup dependency with the frame boost

The derivation given here reproduces the one in Vay et al. [2], where the obtainable speedup is derived as an extension of the formula derived earlier [1], additionally taking into account the group velocity of the laser as it traverses the plasma.

Assuming that the simulation box is a fixed number of plasma periods long, which implies the use (which is standard) of a moving window following the wake and accelerated beam, the speedup is given by the ratio of the time taken by the laser pulse and the plasma to cross each other, divided by the shortest time scale of interest, that is the laser period. To first order, the wake velocity \(v_w\) is set by the 1D group velocity of the laser driver, which in the linear (low intensity) limit, is given by [8]:

\[v_w/c=\beta_w=\left(1-\frac{\omega_p^2}{\omega^2}\right)^{1/2}\]

where \(\omega_p=\sqrt{(n_e e^2)/(\epsilon_0 m_e)}\) is the plasma frequency, \(\omega=2\pi c/\lambda\) is the laser frequency, \(n_e\) is the plasma density, \(\lambda\) is the laser wavelength in vacuum, \(\epsilon_0\) is the permittivity of vacuum, \(c\) is the speed of light in vacuum, and \(e\) and \(m_e\) are respectively the charge and mass of the electron.

In practice, the runs are typically stopped when the last electron beam macro-particle exits the plasma, and a measure of the total time of the simulation is then given by

\[T=\frac{L+\eta \lambda_p}{v_w-v_p}\]

where \(\lambda_p\approx 2\pi c/\omega_p\) is the wake wavelength, \(L\) is the plasma length, \(v_w\) and \(v_p=\beta_p c\) are respectively the velocity of the wake and of the plasma relative to the frame of reference, and \(\eta\) is an adjustable parameter for taking into account the fraction of the wake which has exited the plasma at the end of the simulation. For a beam injected into the \(n^{th}\) bucket, \(\eta\) would be set to \(n-1/2\). If positrons were considered, they would be injected half a wake period ahead of the electron injection position for a given period, and one would have \(\eta=n-1\). The numerical cost \(R_t\) scales as the ratio of the total time to the shortest timescale of interest, which is the inverse of the laser frequency, and is thus given by

\[R_t=\frac{T c}{\lambda}=\frac{\left(L+\eta \lambda_p\right)}{\left(\beta_w-\beta_p\right) \lambda}\]

In the laboratory, \(v_p=0\) and the expression simplifies to

\[R_{lab}=\frac{T c}{\lambda}=\frac{\left(L+\eta \lambda_p\right)}{\beta_w \lambda}\]

In a frame moving at \(\beta c\), the quantities become

\[\begin{split}\begin{aligned} \lambda_p^* & = \lambda_p/\left[\gamma \left(1-\beta_w \beta\right)\right] \\ L^* & = L/\gamma \\ \lambda^* & = \gamma\left(1+\beta\right) \lambda \\ \beta_w^* & = \left(\beta_w-\beta\right)/\left(1-\beta_w\beta\right) \\ v_p^* & = -\beta c \\ T^* & = \frac{L^*+\eta \lambda_p^*}{v_w^*-v_p^*} \\ R_t^* & = \frac{T^* c}{\lambda^*} = \frac{\left(L^*+\eta \lambda_p^*\right)}{\left(\beta_w^*+\beta\right) \lambda^*} \end{aligned}\end{split}\]

where \(\gamma=1/\sqrt{1-\beta^2}\).

The expected speedup from performing the simulation in a boosted frame is given by the ratio of \(R_{lab}\) and \(R_t^*\)

(48)\[S=\frac{R_{lab}}{R_t^*}=\frac{\left(1+\beta\right)\left(L+\eta \lambda_p\right)}{\left(1-\beta\beta_w\right)L+\eta \lambda_p}\]

We note that, assuming \(\beta_w\approx1\) (a valid approximation for most practical cases of interest) and \(\gamma<<\gamma_w\), this expression is consistent with the expression derived earlier [1] for the laser-plasma acceleration case, which states that \(R_t^*=\alpha R_t/\left(1+\beta\right)\) with \(\alpha=\left(1-\beta+l/L\right)/\left(1+l/L\right)\), where \(l\) is the laser length which is generally proportional to \(\eta \lambda_p\), and \(S=R_t/R_t^*\). However, higher values of \(\gamma\) are of interest for maximum speedup, as shown below.

For intense lasers (\(a\sim 1\)) typically used for acceleration, the energy gain is limited by dephasing [9], which occurs over a scale length \(L_d \sim \lambda_p^3/2\lambda^2\). Acceleration is compromised beyond \(L_d\) and in practice, the plasma length is proportional to the dephasing length, i.e. \(L= \xi L_d\). In most cases, \(\gamma_w^2>>1\), which allows the approximations \(\beta_w\approx1-\lambda^2/2\lambda_p^2\), and \(L=\xi \lambda_p^3/2\lambda^2\approx \xi \gamma_w^2 \lambda_p/2>>\eta \lambda_p\), so that Eq.(48) becomes

(49)\[S=\left(1+\beta\right)^2\gamma^2\frac{\xi\gamma_w^2}{\xi\gamma_w^2+\left(1+\beta\right)\gamma^2\left(\xi\beta/2+2\eta\right)}\]

For low values of \(\gamma\), i.e. when \(\gamma<<\gamma_w\), Eq.(49) reduces to

(50)\[S_{\gamma<<\gamma_w}=\left(1+\beta\right)^2\gamma^2\]

Conversely, if \(\gamma\rightarrow\infty\), Eq.(49) becomes

(51)\[S_{\gamma\rightarrow\infty}=\frac{4}{1+4\eta/\xi}\gamma_w^2\]

Finally, in the frame of the wake, i.e. when \(\gamma=\gamma_w\), assuming that \(\beta_w\approx1\), Eq.(49) gives

(52)\[S_{\gamma=\gamma_w}\approx\frac{2}{1+2\eta/\xi}\gamma_w^2\]

Since \(\eta\) and \(\xi\) are of order unity, and the practical regimes of most interest satisfy \(\gamma_w^2>>1\), the speedup that is obtained by using the frame of the wake will be near the maximum obtainable value given by Eq.(51).

Note that without the use of a moving window, the relativistic effects that are at play in the time domain would also be at play in the spatial domain [1], and the \(\gamma^2\) scaling would transform to \(\gamma^4\). Hence, it is important to use a moving window even in simulations in a Lorentz boosted frame. For very high values of the boosted-frame \(\gamma\), the optimal velocity of the moving window may vanish (i.e. no moving window) or even reverse.
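
As a concrete illustration, the scalings above can be evaluated numerically. The short Python sketch below is illustrative only (it is not part of WarpX; the function name and example parameters are chosen here for the example) and evaluates the speedup of Eq. (48) from the plasma density, the laser wavelength, the plasma length and the boost \(\gamma\), using the definitions given earlier in this section.

import math

# Physical constants (SI)
c = 299792458.0
eps0 = 8.8541878128e-12
q_e = 1.602176634e-19
m_e = 9.1093837015e-31

def boosted_frame_speedup(n_e, lam, L, gamma_boost, eta=0.5):
    # Expected speedup S of Eq. (48):
    #   n_e: plasma density [m^-3], lam: laser wavelength [m],
    #   L: plasma length [m], eta: fraction of the wake kept in the box.
    omega_p = math.sqrt(n_e * q_e**2 / (eps0 * m_e))   # plasma frequency
    omega = 2.0 * math.pi * c / lam                    # laser frequency
    beta_w = math.sqrt(1.0 - omega_p**2 / omega**2)    # wake (laser group) velocity
    lam_p = 2.0 * math.pi * c / omega_p                # wake wavelength
    beta = math.sqrt(1.0 - 1.0 / gamma_boost**2)       # velocity of the boosted frame
    return (1.0 + beta) * (L + eta * lam_p) / ((1.0 - beta * beta_w) * L + eta * lam_p)

# Example: 1e24 m^-3 plasma, 0.8 um laser, 10 cm long stage, boost gamma = 10
print(boosted_frame_speedup(1.0e24, 0.8e-6, 0.1, 10.0))

For these example parameters the result is a few hundred, consistent with the \(\left(1+\beta\right)^2\gamma^2\) estimate of Eq. (50).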

Numerical Stability and alternate formulation in a Galilean frame

The numerical Cherenkov instability (NCI) [10] is the most serious numerical instability affecting multidimensional PIC simulations of relativistic particle beams and streaming plasmas [11, 12, 13, 14, 15, 16]. It arises from coupling between possibly numerically distorted electromagnetic modes and spurious beam modes, the latter due to the mismatch between the Lagrangian treatment of particles and the Eulerian treatment of fields [17].

In recent papers the electromagnetic dispersion relations for the numerical Cherenkov instability were derived and solved for both FDTD [15, 18] and PSATD [19, 20] algorithms.

Several solutions have been proposed to mitigate the NCI [19, 20, 21, 22, 23, 24]. Although these solutions efficiently reduce the numerical instability, they typically introduce either strong smoothing of the currents and fields, or arbitrary numerical corrections, which are tuned specifically against the NCI and go beyond the natural discretization of the underlying physical equation. Therefore, it is sometimes unclear to what extent these added corrections could impact the physics at stake for a given resolution.

For instance, NCI-specific corrections include periodically smoothing the electromagnetic field components [11], using a special time step [12, 13] or applying a wide-band smoothing of the current components [12, 13, 25]. Another set of mitigation methods involves scaling the deposited currents by a carefully-designed wavenumber-dependent factor [18, 20] or slightly modifying the ratio of electric and magnetic fields (\(E/B\)) before gathering their value onto the macroparticles [19, 22]. Yet another set of NCI-specific corrections [23, 24] consists in combining a small timestep \(\Delta t\), a sharp low-pass spatial filter, and a spectral or high-order scheme that is tuned so as to create a small, artificial “bump” in the dispersion relation [23]. While most mitigation methods have only been applied to Cartesian geometry, this last set of methods [23, 24] has the remarkable property that it can be applied [24] to both Cartesian geometry and quasi-cylindrical geometry (i.e. cylindrical geometry with azimuthal Fourier decomposition [26, 27, 28]). However, the use of a small timestep proportionally slows down the progress of the simulation, and the artificial “bump” is again an arbitrary correction that departs from the underlying physics.

A new scheme was recently proposed in Kirchen et al. [29] and Lehe et al. [30], which completely eliminates the NCI for a plasma drifting at a uniform relativistic velocity – with no arbitrary correction – by simply integrating the PIC equations in Galilean coordinates (also known as comoving coordinates). More precisely, in the new method, the Maxwell equations in Galilean coordinates are integrated analytically, using only natural hypotheses, within the PSATD framework (Pseudo-Spectral-Analytical-Time-Domain [4, 31]).

The idea of the proposed scheme is to perform a Galilean change of coordinates, and to carry out the simulation in the new coordinates:

(53)\[\boldsymbol{x}' = \boldsymbol{x} - \boldsymbol{v}_{gal}t\]

where \(\boldsymbol{x} = x\,\boldsymbol{u}_x + y\,\boldsymbol{u}_y + z\,\boldsymbol{u}_z\) and \(\boldsymbol{x}' = x'\,\boldsymbol{u}_x + y'\,\boldsymbol{u}_y + z'\,\boldsymbol{u}_z\) are the position vectors in the standard and Galilean coordinates respectively.

When choosing \(\boldsymbol{v}_{gal}= \boldsymbol{v}_0\), where \(\boldsymbol{v}_0\) is the speed of the bulk of the relativistic plasma, the plasma does not move with respect to the grid in the Galilean coordinates \(\boldsymbol{x}'\) – or, equivalently, in the standard coordinates \(\boldsymbol{x}\), the grid moves along with the plasma. The heuristic intuition behind this scheme is that these coordinates should prevent the discrepancy between the Lagrangian and Eulerian point of view, which gives rise to the NCI [17].

An important remark is that the Galilean change of coordinates in Eq. (53) is a simple translation. Thus, when used in the context of Lorentz-boosted simulations, it does of course preserve the relativistic dilatation of space and time which gives rise to the characteristic computational speedup of the boosted-frame technique.

Another important remark is that the Galilean scheme is not equivalent to a moving window (and in fact the Galilean scheme can be independently combined with a moving window). Whereas in a moving window gridpoints are added and removed so as to effectively translate the boundaries, in the Galilean scheme the gridpoints themselves are translated and the physical equations are modified accordingly. Most importantly, the assumed time evolution of the current \(\boldsymbol{J}\) within one timestep is different in a standard PSATD scheme with moving window and in a Galilean PSATD scheme [30].

In the Galilean coordinates \(\boldsymbol{x}'\), the equations of particle motion and the Maxwell equations take the form

(54)\[\frac{d\boldsymbol{x}'}{dt} = \frac{\boldsymbol{p}}{\gamma m} - \boldsymbol{v}_{gal}\]
(55)\[\frac{d\boldsymbol{p}}{dt} = q \left( \boldsymbol{E} + \frac{\boldsymbol{p}}{\gamma m} \times \boldsymbol{B} \right)\]
(56)\[\left( \frac{\partial \;}{\partial t} - \boldsymbol{v}_{gal}\cdot\boldsymbol{\nabla'}\right)\boldsymbol{B} = -\boldsymbol{\nabla'}\times\boldsymbol{E}\]
(57)\[\frac{1}{c^2}\left( \frac{\partial \;}{\partial t} - \boldsymbol{v}_{gal}\cdot\boldsymbol{\nabla'}\right)\boldsymbol{E} = \boldsymbol{\nabla'}\times\boldsymbol{B} - \mu_0\boldsymbol{J}\]

where \(\boldsymbol{\nabla'}\) denotes a spatial derivative with respect to the Galilean coordinates \(\boldsymbol{x}'\).

Integrating these equations from \(t=n\Delta t\) to \(t=(n+1)\Delta t\) results in the following update equations (see Lehe et al. [30] for the details of the derivation):

(58)\[\begin{split}\begin{aligned} \mathbf{\tilde{B}}^{n+1} & = \theta^2 C \mathbf{\tilde{B}}^n -\frac{\theta^2 S}{ck}i\boldsymbol{k}\times \mathbf{\tilde{E}}^n \nonumber \\ & + \;\frac{\theta \chi_1}{\epsilon_0c^2k^2}\;i\boldsymbol{k} \times \mathbf{\tilde{J}}^{n+1/2} \end{aligned}\end{split}\]
(59)\[\begin{split}\begin{aligned} \mathbf{\tilde{E}}^{n+1} & = \theta^2 C \mathbf{\tilde{E}}^n +\frac{\theta^2 S}{k} \,c i\boldsymbol{k}\times \mathbf{\tilde{B}}^n \nonumber \\ & + \frac{i\nu \theta \chi_1 - \theta^2S}{\epsilon_0 ck} \; \mathbf{\tilde{J}}^{n+1/2}\nonumber \\ & - \frac{1}{\epsilon_0k^2}\left(\; \chi_2\;\hat{\mathcal{\rho}}^{n+1} - \theta^2\chi_3\;\hat{\mathcal{\rho}}^{n} \;\right) i\boldsymbol{k} \end{aligned}\end{split}\]

where we used the short-hand notations \(\mathbf{\tilde{E}}^n \equiv \mathbf{\tilde{E}}(\boldsymbol{k}, n\Delta t)\), \(\mathbf{\tilde{B}}^n \equiv \mathbf{\tilde{B}}(\boldsymbol{k}, n\Delta t)\) as well as:

(60)\[C = \cos(ck\Delta t), \quad S = \sin(ck\Delta t), \quad k = |\boldsymbol{k}|,\]
(61)\[\nu = \frac{\boldsymbol{k}\cdot\boldsymbol{v}_{gal}}{ck}, \quad \theta = e^{i\boldsymbol{k}\cdot\boldsymbol{v}_{gal}\Delta t/2},\]
(62)\[\chi_1 = \frac{1}{1 -\nu^2} \left( \theta^* - C \theta + i \nu \theta S \right),\]
(63)\[\chi_2 = \frac{\chi_1 - \theta(1-C)}{\theta^*-\theta}\]
(64)\[\chi_3 = \frac{\chi_1-\theta^*(1-C)}{\theta^*-\theta}\]

Note that, in the limit \(\boldsymbol{v}_{gal}=\boldsymbol{0}\), Eqs. (58) and (59) reduce to the standard PSATD equations [4], as expected. As shown in Kirchen et al. [29], Lehe et al. [30], the elimination of the NCI with the new Galilean integration is verified empirically via PIC simulations of uniform drifting plasmas and laser-driven plasma acceleration stages, and confirmed by a theoretical analysis of the instability.
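
For concreteness, the coefficients of Eqs. (60)–(64) can be evaluated directly for a given Fourier mode. The Python sketch below is illustrative only (it is not the WarpX implementation; the function name and arguments are chosen for this example): it computes \(C\), \(S\), \(\nu\), \(\theta\) and \(\chi_{1,2,3}\) for a wavevector \(\boldsymbol{k}\) and a Galilean velocity \(\boldsymbol{v}_{gal}\), treating the \(\boldsymbol{k}\cdot\boldsymbol{v}_{gal}=0\) case (where \(\chi_2\) and \(\chi_3\) require a limit) separately.

import numpy as np

c = 299792458.0

def galilean_psatd_coefficients(k_vec, v_gal, dt):
    # Coefficients of Eqs. (60)-(64) for a single Fourier mode k_vec.
    k = np.linalg.norm(k_vec)
    C = np.cos(c * k * dt)
    S = np.sin(c * k * dt)
    kdotv = np.dot(k_vec, v_gal)
    nu = kdotv / (c * k)
    theta = np.exp(0.5j * kdotv * dt)
    if kdotv == 0.0:
        # v_gal = 0 (or k perpendicular to v_gal): standard PSATD limit
        chi1, chi2, chi3 = 1.0 - C, None, None   # chi2, chi3 need a separate limit here
    else:
        chi1 = (np.conj(theta) - C * theta + 1j * nu * theta * S) / (1.0 - nu**2)
        chi2 = (chi1 - theta * (1.0 - C)) / (np.conj(theta) - theta)
        chi3 = (chi1 - np.conj(theta) * (1.0 - C)) / (np.conj(theta) - theta)
    return C, S, nu, theta, chi1, chi2, chi3

# Standard PSATD limit: v_gal = 0
print(galilean_psatd_coefficients(np.array([0.0, 0.0, 2.0e6]), np.zeros(3), 1.0e-16))

In the \(\boldsymbol{v}_{gal}=\boldsymbol{0}\) limit, \(\theta=1\) and \(\chi_1=1-C\), recovering the standard PSATD coefficients as stated above.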

[1] (1,2,3,4)

J.-L. Vay. Noninvariance Of Space- And Time-Scale Ranges Under A Lorentz Transformation And The Implications For The Study Of Relativistic Interactions. Physical Review Letters, 98(13):130405/1–4, 2007.

[2] (1,2)

J.-L. Vay, C. G. R. Geddes, E. Esarey, C. B. Schroeder, W. P. Leemans, E. Cormier-Michel, and D. P. Grote. Modeling Of 10 GeV-1 TeV Laser-Plasma Accelerators Using Lorentz Boosted Simulations. Physics of Plasmas, Dec 2011. doi:10.1063/1.3663841.

[3]

J. P. Boris and R. Lee. Nonphysical Self Forces In Some Electromagnetic Plasma-Simulation Algorithms. Journal of Computational Physics, 12(1):131–136, 1973.

[4] (1,2,3)

I. Haber, R. Lee, H. H. Klein, and J. P. Boris. Advances In Electromagnetic Simulation Techniques. In Proc. Sixth Conf. Num. Sim. Plasmas, 46–48. Berkeley, Ca, 1973.

[5]

C. G. R. Geddes, D. L. Bruhwiler, J. R. Cary, W. B. Mori, J.-L. Vay, S. F. Martins, T. Katsouleas, E. Cormier-Michel, W. M. Fawley, C. Huang, X. Wang, B. Cowan, V. K. Decyk, E. Esarey, R. A. Fonseca, W. Lu, P. Messmer, P. Mullowney, K. Nakamura, K. Paul, G. R. Plateau, C. B. Schroeder, L. O. Silva, C. Toth, F. S. Tsung, M. Tzoufras, T. Antonsen, J. Vieira, and W. P. Leemans. Computational Studies And Optimization Of Wakefield Accelerators. In Journal of Physics: Conference Series, volume 125, 012002 (11 Pp.). 2008.

[6]

C. G. R. Geddes, E. Cormier-Michel, E. Esarey, C. B. Schroeder, and W. P. Leemans. Scaled Simulation Design Of High Quality Laser Wakefield Accelerator Stages. In Proc. Particle Accelerator Conference. Vancouver, Canada, 2009.

[7]

B. Cowan, D. Bruhwiler, E. Cormier-Michel, E. Esarey, C. G. R. Geddes, P. Messmer, and K. Paul. Laser Wakefield Simulation Using A Speed-Of-Light Frame Envelope Model. In Aip Conference Proceedings, volume 1086, 309–314. 2009.

[8]

E. Esarey, C. B. Schroeder, and W. P. Leemans. Physics Of Laser-Driven Plasma-Based Electron Accelerators. Rev. Mod. Phys., 81(3):1229–1285, 2009. doi:10.1103/Revmodphys.81.1229.

[9]

C. B. Schroeder, C. Benedetti, E. Esarey, and W. P. Leemans. Nonlinear Pulse Propagation And Phase Velocity Of Laser-Driven Plasma Waves. Physical Review Letters, 106(13):135002, Mar 2011. doi:10.1103/Physrevlett.106.135002.

[10]

B. B. Godfrey. Numerical Cherenkov Instabilities In Electromagnetic Particle Codes. Journal of Computational Physics, 15(4):504–521, 1974.

[11] (1,2)

S. F. Martins, R. A. Fonseca, L. O. Silva, W. Lu, and W. B. Mori. Numerical Simulations Of Laser Wakefield Accelerators In Optimal Lorentz Frames. Computer Physics Communications, 181(5):869–875, May 2010. doi:10.1016/J.Cpc.2009.12.023.

[12] (1,2,3)

J.-L. Vay, C. G. R. Geddes, C. Benedetti, D. L. Bruhwiler, E. Cormier-Michel, B. M. Cowan, J. R. Cary, and D. P. Grote. Modeling Laser Wakefield Accelerators In A Lorentz Boosted Frame. AIP Conference Proceedings, 1299(1):244–249, Nov 2010. doi:10.1063/1.3520322.

[13] (1,2,3)

J.-L. Vay, C. G. R. Geddes, E. Cormier-Michel, and D. P. Grote. Numerical Methods For Instability Mitigation In The Modeling Of Laser Wakefield Accelerators In A Lorentz-Boosted Frame. Journal of Computational Physics, 230(15):5908–5929, Jul 2011. doi:10.1016/J.Jcp.2011.04.003.

[14]

L. Sironi and A. Spitkovsky. No Title. 2011.

[15] (1,2)

B. B. Godfrey and J.-L. Vay. Numerical stability of relativistic beam multidimensional PIC simulations employing the Esirkepov algorithm. Journal of Computational Physics, 248(0):33–46, 2013. doi:10.1016/j.jcp.2013.04.006.

[16]

X. Xu, P. Yu, S. F. Martins, F. S. Tsung, V. K. Decyk, J. Vieira, R. A. Fonseca, W. Lu, L. O. Silva, and W. B. Mori. Numerical instability due to relativistic plasma drift in EM-PIC simulations. Computer Physics Communications, 184(11):2503–2514, 2013. doi:10.1016/j.cpc.2013.07.003.

[17] (1,2)

B. B. Godfrey. Canonical Momenta And Numerical Instabilities In Particle Codes. Journal of Computational Physics, 19(1):58–76, 1975.

[18] (1,2)

B. B. Godfrey and J.-L. Vay. Suppressing the numerical Cherenkov instability in FDTD PIC codes. Journal of Computational Physics, 267:1–6, 2014.

[19] (1,2,3)

B. B. Godfrey, J.-L. Vay, and I. Haber. Numerical stability analysis of the pseudo-spectral analytical time-domain PIC algorithm. Journal of Computational Physics, 258:689–704, 2014.

[20] (1,2,3)

B. B. Godfrey, J.-L. Vay, and I. Haber. Numerical stability improvements for the pseudospectral EM PIC algorithm. IEEE Transactions on Plasma Science, 42(5):1339–1344, 2014.

[21]

B. B. Godfrey, J.-L. Vay, and I. Haber. Numerical stability analysis of the pseudo-spectral analytical time-domain PIC algorithm. Journal of Computational Physics, 258(0):689–704, 2014. doi:10.1016/j.jcp.2013.10.053.

[22] (1,2)

B. B. Godfrey and J.-L. Vay. Improved numerical Cherenkov instability suppression in the generalized PSTD PIC algorithm. Computer Physics Communications, 196:221–225, 2015.

[23] (1,2,3,4)

P. Yu, X. Xu, V. K. Decyk, F. Fiuza, J. Vieira, F. S. Tsung, R. A. Fonseca, W. Lu, L. O. Silva, and W. B. Mori. Elimination of the numerical Cerenkov instability for spectral EM-PIC codes. Computer Physics Communications, 192:32–47, Jul 2015. doi:10.1016/j.cpc.2015.02.018.

[24] (1,2,3,4)

P. Yu, X. Xu, A. Tableman, V. K. Decyk, F. S. Tsung, F. Fiuza, A. Davidson, J. Vieira, R. A. Fonseca, W. Lu, L. O. Silva, and W. B. Mori. Mitigation of numerical Cerenkov radiation and instability using a hybrid finite difference-FFT Maxwell solver and a local charge conserving current deposit. Computer Physics Communications, 197:144–152, Dec 2015. doi:10.1016/j.cpc.2015.08.026.

[25]

J.-L. Vay, C. G. R. Geddes, E. Cormier-Michel, and D. P. Grote. Effects of hyperbolic rotation in Minkowski space on the modeling of plasma accelerators in a Lorentz boosted frame. Physics of Plasmas, 18(3):030701, Mar 2011. doi:10.1063/1.3559483.

[26]

A. F. Lifschitz, X. Davoine, E. Lefebvre, J. Faure, C. Rechatin, and V. Malka. Particle-in-Cell modelling of laser-plasma interaction using Fourier decomposition. Journal of Computational Physics, 228(5):1803–1814, 2009. doi:10.1016/j.jcp.2008.11.017.

[27]

A. Davidson, A. Tableman, W. An, F. S. Tsung, W. Lu, J. Vieira, R. A. Fonseca, L. O. Silva, and W. B. Mori. Implementation of a hybrid particle code with a PIC description in r–z and a gridless description in ϕ into OSIRIS. Journal of Computational Physics, 281:1063–1077, 2015. doi:10.1016/j.jcp.2014.10.064.

[28]

R. Lehe, M. Kirchen, I. A. Andriyash, B. B. Godfrey, and J.-L. Vay. A spectral, quasi-cylindrical and dispersion-free Particle-In-Cell algorithm. Computer Physics Communications, 203:66–82, 2016. doi:10.1016/j.cpc.2016.02.007.

[29] (1,2)

M. Kirchen, R. Lehe, B. B. Godfrey, I. Dornmair, S. Jalas, K. Peters, J.-L. Vay, and A. R. Maier. Stable discrete representation of relativistically drifting plasmas. Physics of Plasmas, 23(10):100704, Oct 2016. doi:10.1063/1.4964770.

[30] (1,2,3,4)

R. Lehe, M. Kirchen, B. B. Godfrey, A. R. Maier, and J.-L. Vay. Elimination of numerical Cherenkov instability in flowing-plasma particle-in-cell simulations by using Galilean coordinates. Phys. Rev. E, 94:053305, Nov 2016. URL: https://link.aps.org/doi/10.1103/PhysRevE.94.053305, doi:10.1103/PhysRevE.94.053305.

[31]

J.-L. Vay, I. Haber, and B. B. Godfrey. A domain decomposition method for pseudo-spectral electromagnetic simulations of plasmas. Journal of Computational Physics, 243:260–268, Jun 2013. doi:10.1016/j.jcp.2013.03.010.

Inputs and Outputs

Initialization of the plasma columns and drivers (laser or particle beam) is performed via the specification of multidimensional functions that describe the initial state with, if needed, a time dependence, or from reconstruction of distributions based on experimental data. Care is needed when initializing quantities in parallel to avoid double counting and to ensure smoothness of the distributions at the interface of computational domains. When the sum of the initial distributions of charged particles is not charge neutral, initial fields are generally computed using a static approximation with Poisson solves, accompanied by proper relativistic scalings [1, 2].

Outputs include dumps of particle and field quantities at regular intervals, histories of particle distribution moments, spectra, etc., and plots of the various quantities. In parallel simulations, the diagnostic subroutines need to handle additional complexity from the domain decomposition, as well as the large amount of data that may necessitate data reduction in some form before saving to disk.

Simulations in a Lorentz boosted frame require additional considerations, as described below.

Inputs and outputs in a boosted frame simulation

(top) Snapshot of a particle beam showing “frozen” (grey spheres) and “active” (colored spheres) macroparticles traversing the injection plane (red rectangle). (bottom) Snapshot of the beam macroparticles (colored spheres) passing through the background of electrons (dark brown streamlines) and the diagnostic stations (red rectangles). The electrons, the injection plane and the diagnostic stations are fixed in the laboratory frame, and are thus counter-propagating to the beam in a boosted frame.

The input and output data are often known from, or compared to, experimental data. Thus, calculating in a frame other than the laboratory entails transformations of the data between the calculation frame and the laboratory frame. This section describes the procedures that have been implemented in the Particle-In-Cell framework Warp [3] to handle the input and output of data between the frame of calculation and the laboratory frame [4]. Simultaneity of events between two frames is valid only for a plane that is perpendicular to the relative motion of the frame. As a result, the input/output processes involve the input of data (particles or fields) through a plane, as well as output through a series of planes, all of which are perpendicular to the direction of the relative velocity between the frame of calculation and the other frame of choice.

Input in a boosted frame simulation

Particles -

Particles are launched through a plane using a technique that is generic and applies to Lorentz boosted frame simulations in general, including plasma acceleration, and is illustrated using the case of a positively charged particle beam propagating through a background of cold electrons in an assumed continuous transverse focusing system, leading to a well-known growing transverse “electron cloud” instability [5]. In the laboratory frame, the electron background is initially at rest and a moving window is used to follow the beam progression. Traditionally, the beam macroparticles are initialized all at once in the window, while background electron macroparticles are created continuously in front of the beam on a plane that is perpendicular to the beam velocity. In a frame moving at some fraction of the beam velocity in the laboratory frame, the beam initial conditions at a given time in the calculation frame are generally unknown and one must initialize the beam differently. However, one can take advantage of the fact that the beam initial conditions are often known for a given plane in the laboratory, either directly, or via simple calculation or projection from the conditions at a given time in the laboratory frame. Given the position and velocity \(\{x,y,z,v_x,v_y,v_z\}\) for each beam macroparticle at time \(t=0\) for a beam moving at the average velocity \(v_b=\beta_b c\) (where \(c\) is the speed of light) in the laboratory, and using the standard synchronization (\(z=z'=0\) at \(t=t'=0\)) between the laboratory and the calculation frames, the procedure for transforming the beam quantities for injection in a boosted frame moving at velocity \(\beta c\) in the laboratory is as follows (the superscript \('\) relates to quantities known in the boosted frame, while the superscript \(^*\) relates to quantities that are known at a given longitudinal position \(z^*\) but at different times of arrival):

  1. project positions at \(z^*=0\) assuming ballistic propagation

    \[\begin{split}\begin{aligned} t^* &= \left(z-\bar{z}\right)/v_z \label{Eq:t*}\\ x^* &= x-v_x t^* \label{Eq:x*}\\ y^* &= y-v_y t^* \label{Eq:y*}\\ z^* &= 0 \label{Eq:z*}\end{aligned}\end{split}\]

    the velocity components being left unchanged,

  2. apply Lorentz transformation from laboratory frame to boosted frame

    \[\begin{split}\begin{aligned} t'^* &= -\gamma t^* \label{Eq:tp*}\\ x'^* &= x^* \label{Eq:xp*}\\ y'^* &= y^* \label{Eq:yp*}\\ z'^* &= \gamma\beta c t^* \label{Eq:zp*}\\ v'^*_x&=\frac{v_x^*}{\gamma\left(1-\beta \beta_b\right)} \label{Eq:vxp*}\\ v'^*_y&=\frac{v_y^*}{\gamma\left(1-\beta \beta_b\right)} \label{Eq:vyp*}\\ v'^*_z&=\frac{v_z^*-\beta c}{1-\beta \beta_b} \label{Eq:vzp*}\end{aligned}\end{split}\]

    where \(\gamma=1/\sqrt{1-\beta^2}\). With the knowledge of the time at which each beam macroparticle crosses the plane under consideration, one can inject each beam macroparticle in the simulation at the appropriate location and time.

  3. synchronize macroparticles in boosted frame, obtaining their positions at a fixed \(t'=0\) (before any particle is injected)

    \[\begin{aligned} z' &= z'^*-\bar{v}'^*_z t'^* \label{Eq:zp}\end{aligned}\]

    This additional step is needed for setting the electrostatic or electromagnetic fields at the plane of injection. In a Particle-In-Cell code, the three-dimensional fields are calculated by solving the Maxwell equations (or a static approximation like Poisson, Darwin or other [1]) on a grid on which the source term is obtained from the macroparticle distribution. This requires generation of a three-dimensional representation of the beam distribution of macroparticles at a given time before they cross the injection plane at \(z'^*\). This is accomplished by expanding the beam distribution longitudinally such that all macroparticles (so far known at different times of arrival at the injection plane) are synchronized to the same time in the boosted frame. To keep the beam shape constant, the particles are “frozen” until they cross that plane: the three velocity components and the two position components perpendicular to the boosted frame velocity are kept constant, while the remaining position component is advanced at the average beam velocity. As particles cross the plane of injection, they become regular “active” particles with full 6-D dynamics.

A snapshot of a beam that has passed partly through the injection plane is shown in Fig. 34 (top). As the frozen beam macroparticles pass through the injection plane (which moves opposite to the beam in the boosted frame), they are converted to “active” macroparticles. The charge or current density is accumulated from the active and the frozen particles, thus ensuring that the fields at the plane of injection are consistent.
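
The three steps above can be summarized by the following illustrative Python sketch (not the actual Warp/WarpX implementation; the function name and arguments are chosen for this example, and the use of the average beam velocity \(\beta_b\) follows the notation of this section):

import numpy as np

c = 299792458.0

def lab_to_boosted_injection(x, y, z, vx, vy, vz, z_bar, beta_b, beta):
    # Transform one lab-frame beam macroparticle, known at t=0, into the
    # boosted frame (moving at beta*c along z) for injection through z*=0.
    # z_bar is the average beam position and beta_b*c the average beam velocity.
    gamma = 1.0 / np.sqrt(1.0 - beta**2)

    # Step 1: ballistic projection onto the injection plane z* = 0
    t_star = (z - z_bar) / vz
    x_star = x - vx * t_star
    y_star = y - vy * t_star

    # Step 2: Lorentz transformation to the boosted frame
    t_prime = -gamma * t_star
    z_prime_star = gamma * beta * c * t_star
    vx_prime = vx / (gamma * (1.0 - beta * beta_b))
    vy_prime = vy / (gamma * (1.0 - beta * beta_b))
    vz_prime = (vz - beta * c) / (1.0 - beta * beta_b)

    # Step 3: synchronize to t'=0 using the average beam velocity in the boosted frame
    vz_bar_prime = (beta_b - beta) * c / (1.0 - beta * beta_b)
    z_prime = z_prime_star - vz_bar_prime * t_prime

    return x_star, y_star, z_prime, t_prime, vx_prime, vy_prime, vz_prime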

Laser -

Similarly to the particle beam, the laser is injected through a plane perpendicular to the axis of propagation of the laser (by default \(z\)). The electric field \(E_\perp\) that is to be emitted is given by the formula

\[E_\perp\left(x,y,t\right)=E_0 f\left(x,y,t\right) \sin\left[\omega t+\phi\left(x,y,\omega\right)\right]\]

where \(E_0\) is the amplitude of the laser electric field, \(f\left(x,y,t\right)\) is the laser envelope, \(\omega\) is the laser frequency, \(\phi\left(x,y,\omega\right)\) is a phase function to account for focusing, defocusing or injection at an angle, and \(t\) is time. By default, the laser envelope is a three-dimensional Gaussian of the form

\[f\left(x,y,t\right)=e^{-\left(x^2/2 \sigma_x^2+y^2/2 \sigma_y^2+c^2t^2/2 \sigma_z^2\right)}\]

where \(\sigma_x\), \(\sigma_y\) and \(\sigma_z\) are the dimensions of the laser pulse; or it can be defined arbitrarily by the user at runtime. If \(\phi\left(x,y,\omega\right)=0\), the laser is injected at a waist and parallel to the axis \(z\).

If, for convenience, the injection plane is moving at constant velocity \(\beta_s c\), the formula is modified to take the Doppler effect on frequency and amplitude into account and becomes

\[\begin{aligned} E_\perp\left(x,y,t\right)&=\left(1-\beta_s\right)E_0 f\left(x,y,t\right)\sin\left[\left(1-\beta_s\right)\omega t+\phi\left(x,y,\omega\right)\right] \end{aligned}\]

The injection of a laser of frequency \(\omega\) is considered for a simulation using a boosted frame moving at \(\beta c\) with respect to the laboratory. Assuming that the laser is injected at a plane that is fixed in the laboratory, and thus moving at \(\beta_s=-\beta\) in the boosted frame, the injection in the boosted frame is given by

\[\begin{split}\begin{aligned} E_\perp\left(x',y',t'\right)&=\left(1-\beta_s\right)E'_0 f\left(x',y',t'\right)\sin\left[\left(1-\beta_s\right)\omega' t'+\phi\left(x',y',\omega'\right)\right] \\ &=\left(E_0/\gamma\right) f\left(x',y',t'\right)\sin\left[\omega t'/\gamma+\phi\left(x',y',\omega'\right)\right] \end{aligned}\end{split}\]

since \(E'_0/E_0=\omega'/\omega=1/\left(1+\beta\right)\gamma\).
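
As an illustration (not the WarpX antenna implementation; the function name and arguments are chosen for this example), the following Python sketch evaluates the Doppler-corrected field emitted in the boosted frame by an injection plane that is fixed in the laboratory (\(\beta_s=-\beta\)):

import numpy as np

def boosted_frame_laser_field(E0, omega, beta, t_prime, envelope=1.0, phi=0.0):
    # Field emitted by the injection antenna in a frame boosted at beta*c,
    # for an antenna fixed in the laboratory (beta_s = -beta).
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    E0_prime = E0 / ((1.0 + beta) * gamma)        # E'_0 = E_0 / [(1+beta) gamma]
    omega_prime = omega / ((1.0 + beta) * gamma)  # omega' = omega / [(1+beta) gamma]
    beta_s = -beta                                # antenna velocity in the boosted frame
    return (1.0 - beta_s) * E0_prime * envelope * np.sin(
        (1.0 - beta_s) * omega_prime * t_prime + phi)

With \(\beta_s=-\beta\), the prefactor reduces to \(E_0/\gamma\) and the phase argument to \(\omega t'/\gamma\), as in the last expression above.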

The electric field is then converted into currents that get injected via a 2D array of macroparticles, with one positive and one negative macroparticle for each array cell in the plane of injection, whose weights and motion are governed by \(E_\perp\left(x',y',t'\right)\). Injecting using this dual array of macroparticles offers the advantage of automatically including the longitudinal component that arises from emitting into a boosted frame, and of automatically verifying the discrete Gauss’ law thanks to the use of a charge-conserving (e.g. Esirkepov) current deposition scheme [6].

Output in a boosted frame simulation

Some quantities, e.g. charge or dimensions perpendicular to the boost velocity, are Lorentz invariant. Those quantities are thus readily available from standard diagnostics in the boosted frame calculations. Quantities that do not fall in this category are recorded at a number of regularly spaced “stations”, immobile in the laboratory frame, at a succession of time intervals to record data history, or averaged over time. A visual example is given in Fig. 34 (bottom). Since the space-time locations of the diagnostic grids in the laboratory frame generally do not coincide with the space-time positions of the macroparticles and grid nodes used for the calculation in a boosted frame, some interpolation is performed at runtime during the data collection process. As a complement or an alternative, selected particle or field quantities can be dumped at regular intervals and reconstructed in the laboratory frame during a post-processing phase. The choice of method depends on the requirements of the diagnostics and the particular implementation.

[1] (1,2)

J.-L. Vay. Simulation Of Beams Or Plasmas Crossing At Relativistic Velocity. Physics of Plasmas, 15(5):56701, May 2008. doi:10.1063/1.2837054.

[2]

B. M. Cowan, D. L. Bruhwiler, J. R. Cary, E. Cormier-Michel, and C. G. R. Geddes. Generalized algorithm for control of numerical dispersion in explicit time-domain electromagnetic simulations. Physical Review Special Topics-Accelerators And Beams, Apr 2013. doi:10.1103/PhysRevSTAB.16.041303.

[3]

D. P. Grote, A. Friedman, J.-L. Vay, and I. Haber. The Warp Code: Modeling High Intensity Ion Beams. In Aip Conference Proceedings, number 749, 55–58. 2005.

[4]

J.-L. Vay, C. G. R. Geddes, E. Esarey, C. B. Schroeder, W. P. Leemans, E. Cormier-Michel, and D. P. Grote. Modeling Of 10 GeV-1 TeV Laser-Plasma Accelerators Using Lorentz Boosted Simulations. Physics of Plasmas, Dec 2011. doi:10.1063/1.3663841.

[5]

J.-L. Vay. Noninvariance Of Space- And Time-Scale Ranges Under A Lorentz Transformation And The Implications For The Study Of Relativistic Interactions. Physical Review Letters, 98(13):130405/1–4, 2007.

[6]

T. Z. Esirkepov. Exact Charge Conservation Scheme For Particle-In-Cell Simulation With An Arbitrary Form-Factor. Computer Physics Communications, 135(2):144–153, Apr 2001.

Collisions

WarpX includes several different models to capture collisional processes including collisions between kinetic particles (Coulomb collisions, DSMC, nuclear fusion) as well as collisions between kinetic particles and a fixed (i.e. non-evolving) background species (MCC, background stopping).

Background Monte Carlo Collisions (MCC)

Several types of collisions between simulation particles and a neutral background gas are supported including elastic scattering, back scattering, charge exchange, excitation collisions and impact ionization.

The so-called null collision strategy is used in order to minimize the computational burden of the MCC module. This strategy is standard in PIC-MCC and a detailed description can be found elsewhere, for example in Birdsall [1]. In short the maximum collision probability is found over a sensible range of energies and is used to pre-select the appropriate number of macroparticles for collision consideration. Only these pre-selected particles are then individually considered for a collision based on their energy and the cross-sections of all the different collisional processes included.
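
The following Python sketch outlines the null-collision selection logic described above. It is purely illustrative (the function and argument names are placeholders, not the WarpX API), it assumes NumPy arrays of particle energies and speeds together with a vectorized callable returning the total cross section, and, for brevity, the maximum collision rate is taken over the current particle population rather than over a pre-tabulated energy range.

import numpy as np

def null_collision_select(energies, speeds, total_cross_section, n_background, dt,
                          rng=np.random.default_rng()):
    # Null-collision method: pre-select candidates using the maximum collision
    # rate, then accept each candidate with probability nu(E)/nu_max.
    nu = n_background * total_cross_section(energies) * speeds   # collision rates
    nu_max = nu.max()
    p_max = 1.0 - np.exp(-nu_max * dt)              # maximum collision probability
    n_candidates = int(p_max * energies.size)       # particles to consider
    candidates = rng.choice(energies.size, size=n_candidates, replace=False)
    accepted = candidates[rng.random(n_candidates) < nu[candidates] / nu_max]
    return accepted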

The MCC implementation assumes that the background neutral particles are thermal, and are moving at non-relativistic velocities in the lab frame. For each simulation particle considered for a collision, a velocity vector for a neutral particle is randomly chosen given the user specified neutral temperature. The particle velocity is then boosted to the stationary frame of the neutral through a Galilean transformation. The energy of the collision is calculated using the particle utility function, ParticleUtils::getCollisionEnergy(), as

\[\begin{split}\begin{aligned} E_{coll} &= \sqrt{(\gamma mc^2 + Mc^2)^2 - (mu)^2} - (mc^2 + Mc^2) \\ &= \frac{2Mmu^2}{M + m + \sqrt{M^2+m^2+2\gamma mM}}\frac{1}{\gamma + 1} \end{aligned}\end{split}\]

where \(u\) is the speed of the particle as tracked in WarpX (i.e. \(u = \gamma v\) with \(v\) the particle speed), while \(m\) and \(M\) are the rest masses of the simulation and background species, respectively. The Lorentz factor is defined in the usual way, \(\gamma \equiv \sqrt{1 + u^2/c^2}\). Note that if \(\gamma\to1\) the above expression reduces to the classical equation \(E_{coll} = \frac{1}{2}\frac{Mm}{M+m} u^2\). The collision cross-sections for all scattering processes are evaluated at the energy as calculated above.
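
A standalone evaluation of the collision-energy formula above could look as follows. This Python sketch is illustrative only (the WarpX helper ParticleUtils::getCollisionEnergy() is C++; names and example values here are chosen for the example), with \(u=\gamma v\) the momentum per unit mass tracked by the code:

import numpy as np

c = 299792458.0

def collision_energy(u, m, M):
    # Second form of the collision-energy expression above.
    gamma = np.sqrt(1.0 + (u / c)**2)
    return 2.0 * M * m * u**2 / (M + m + np.sqrt(M**2 + m**2 + 2.0 * gamma * m * M)) / (gamma + 1.0)

# Classical check: for u << c this tends to 0.5 * (m*M/(m+M)) * u**2
m_e = 9.1093837015e-31        # electron mass [kg]
M_He = 6.6464731e-27          # helium atom mass [kg], for illustration
u = 1.0e5                     # 100 km/s, non-relativistic
print(collision_energy(u, m_e, M_He), 0.5 * m_e * M_He / (m_e + M_He) * u**2)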

Once a particle is selected for a specific collision process, that process determines how the particle is scattered as outlined below.

Direct Simulation Monte Carlo (DSMC)

The algorithm by which binary collisions are treated is outlined below. The description assumes collisions between different species.

  1. Particles from both species are sorted by grid-cells.

  2. The order of the particles in each cell is shuffled.

  3. Within each cell, particles are paired to form collision partners. Particles of the species with fewer members in a given cell are split in half so that each particle has exactly one partner of the other species.

  4. Each collision pair is considered for a collision using the same logic as in the MCC description above.

  5. Particles that are chosen for collision are scattered according to the selected collision process.

Scattering processes

Charge exchange

This process can occur when an ion and neutral (of the same species) collide and results in the exchange of an electron. The ion velocity is simply replaced with the neutral velocity and vice-versa.

Elastic scattering

The elastic option uses isotropic scattering, i.e., with a differential cross section that is independent of angle. This scattering process, as well as the ones below that relate to it, is performed in the center-of-momentum (COM) frame. Designating the COM velocity of the particle as \(\vec{u}_c\) and its lab-frame velocity as \(\vec{u}_l\), the transformation from the lab frame to the COM frame is done with a general Lorentz boost (see function ParticleUtils::doLorentzTransform()):

\[\begin{split}\begin{bmatrix} \gamma_c c \\ u_{cx} \\ u_{cy} \\ u_{cz} \end{bmatrix} = \begin{bmatrix} \gamma & -\gamma\beta_x & -\gamma\beta_y & -\gamma\beta_z \\ -\gamma\beta_x & 1+(\gamma-1)\frac{\beta_x^2}{\beta^2} & (\gamma-1)\frac{\beta_x\beta_y}{\beta^2} & (\gamma-1)\frac{\beta_x\beta_z}{\beta^2} \\ -\gamma\beta_y & (\gamma-1)\frac{\beta_x\beta_y}{\beta^2} & 1 +(\gamma-1)\frac{\beta_y^2}{\beta^2} & (\gamma-1)\frac{\beta_y\beta_z}{\beta^2} \\ -\gamma\beta_z & (\gamma-1)\frac{\beta_x\beta_z}{\beta^2} & (\gamma-1)\frac{\beta_y\beta_z}{\beta^2} & 1+(\gamma-1)\frac{\beta_z^2}{\beta^2} \\ \end{bmatrix} \begin{bmatrix} \gamma_l c \\ u_{lx} \\ u_{ly} \\ u_{lz} \end{bmatrix}\end{split}\]

where \(\gamma\) is the Lorentz factor of the relative speed between the lab frame and the COM frame, \(\beta_i = v^{COM}_i/c\) is the i’th component of the relative velocity between the lab frame and the COM frame with

\[\vec{v}^{COM} = \frac{m \vec{u_c}}{\gamma_u m + M}\]

The particle velocity in the COM frame is then isotropically scattered using the function ParticleUtils::RandomizeVelocity(). After the direction of the velocity vector has been appropriately changed, it is transformed back to the lab frame with the reversed Lorentz transform as was done above followed by the reverse Galilean transformation using the starting neutral velocity.
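
The following Python sketch illustrates the boost matrix above and an isotropic redirection of the resulting velocity; it mirrors the roles of ParticleUtils::doLorentzTransform() and ParticleUtils::RandomizeVelocity() but is not the WarpX code, and it assumes a non-zero boost velocity:

import numpy as np

c = 299792458.0

def lorentz_boost(u_lab, beta_vec):
    # Apply the boost matrix above to the four-velocity (gamma_l*c, u_lab),
    # where u_lab = gamma_l * v_lab; returns the boosted spatial part.
    beta2 = np.dot(beta_vec, beta_vec)              # assumed non-zero
    gamma = 1.0 / np.sqrt(1.0 - beta2)
    gamma_l = np.sqrt(1.0 + np.dot(u_lab, u_lab) / c**2)
    four_u = np.concatenate(([gamma_l * c], u_lab))
    boost = np.empty((4, 4))
    boost[0, 0] = gamma
    boost[0, 1:] = -gamma * beta_vec
    boost[1:, 0] = -gamma * beta_vec
    boost[1:, 1:] = np.eye(3) + (gamma - 1.0) * np.outer(beta_vec, beta_vec) / beta2
    return (boost @ four_u)[1:]

def isotropic_scatter(u, rng=np.random.default_rng()):
    # Keep the magnitude of u, pick a random direction on the unit sphere.
    mu = 2.0 * rng.random() - 1.0                   # cos(theta), uniform in [-1, 1]
    phi = 2.0 * np.pi * rng.random()
    s = np.sqrt(1.0 - mu**2)
    return np.linalg.norm(u) * np.array([s * np.cos(phi), s * np.sin(phi), mu])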

Back scattering

The process is the same as for elastic scattering above except that the scattering angle is fixed at \(\pi\), meaning the particle velocity in the COM frame is updated to \(-\vec{u}_c\).

Excitation

The process is also the same as for elastic scattering except the excitation energy cost is subtracted from the particle energy. This is done by reducing the velocity before a scattering angle is chosen.

Benchmarks

See the MCC example for a benchmark of the MCC implementation against literature results.

Particle cooling due to elastic collisions

It is straightforward to determine the energy a projectile loses during an elastic collision with another body, as a function of scattering angle, through energy and momentum conservation. See for example Lim [2] for a derivation. The result is that given a projectile with mass \(m\), a target with mass \(M\), a scattering angle \(\theta\), and collision energy \(E\), the post-collision energy of the projectile is given by

\[\begin{split}\begin{aligned} E_{final} = E - &[(E + mc^2)\sin^2\theta + Mc^2 - \cos(\theta)\sqrt{M^2c^4 - m^2c^4\sin^2\theta}] \\ &\times \frac{E(E+2mc^2)}{(E+mc^2+Mc^2)^2 - E(E+2mc^2)\cos^2\theta} \end{aligned}\end{split}\]
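
The post-collision energy above is straightforward to evaluate directly; the Python sketch below does so (illustrative only; the function name is chosen for this example). For \(E \ll mc^2, Mc^2\) it reduces to the classical elastic-scattering energy loss.

import numpy as np

c = 299792458.0

def post_collision_energy(E, m, M, theta):
    # Projectile energy after an elastic collision, per the formula above.
    mc2, Mc2 = m * c**2, M * c**2
    bracket = ((E + mc2) * np.sin(theta)**2 + Mc2
               - np.cos(theta) * np.sqrt(Mc2**2 - mc2**2 * np.sin(theta)**2))
    factor = E * (E + 2.0 * mc2) / ((E + mc2 + Mc2)**2 - E * (E + 2.0 * mc2) * np.cos(theta)**2)
    return E - bracket * factor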

The impact of incorporating relativistic effects in the MCC routine can be seen in the plots below, where high-energy collisions are considered with both a classical and a relativistic implementation of MCC. It is observed that the classical version of MCC reproduces the classical limit of the above equation but, especially for ions, this result differs substantially from the fully relativistic result.

Classical v relativistic MCC
[1]

C. K. Birdsall. Particle-in-cell charged-particle simulations, plus Monte Carlo collisions with neutral atoms, PIC-MCC. IEEE Transactions on Plasma Science, 19(2):65–85, 1991. doi:10.1109/27.106800.

[2]

C.-H. Lim. The interaction of energetic charged particles with gas and boundaries in the particle simulation of plasmas. 2007. URL: https://search.library.berkeley.edu/permalink/01UCS_BER/s4lks2/cdi_proquest_miscellaneous_35689087.

Kinetic-fluid Hybrid Model

Many problems in plasma physics fall in a class where both electron kinetics and electromagnetic waves do not play a critical role in the solution. Examples of such situations include the study of collisionless magnetic reconnection and instabilities driven by ion temperature anisotropy, to mention only two. For these kinds of problems the computational cost of resolving the electron dynamics can be avoided by modeling the electrons as a neutralizing fluid rather than kinetic particles. By further using Ohm’s law to compute the electric field rather than evolving it with the Maxwell-Faraday equation, light waves can be stepped over. The simulation resolution can then be set by the ion time and length scales (commonly the ion cyclotron period \(1/\Omega_i\) and ion skin depth \(l_i\), respectively), which can reduce the total simulation time drastically compared to a simulation that has to resolve the electron Debye length and CFL-condition based on the speed of light.

Many authors have described variations of the kinetic ion & fluid electron model, generally referred to as particle-fluid hybrid or just hybrid-PIC models. The implementation in WarpX is described in detail in Groenewald et al. [1]. This description follows mostly from that reference.

Model

The basic justification for the hybrid model is that the system to which it is applied is dominated by ion kinetics, with ions moving much slower than electrons and photons. In this scenario two critical approximations can be made: neutrality (\(n_e=n_i\)) can be assumed, and the Maxwell-Ampere equation can be simplified by neglecting the displacement current term [2], giving,

\[\mu_0\vec{J} = \vec{\nabla}\times\vec{B},\]

where \(\vec{J} = \sum_{s\neq e}\vec{J}_s + \vec{J}_e + \vec{J}_{ext}\) is the total electrical current, i.e. the sum of electron and ion currents as well as any external current (not captured through plasma particles). Since ions are treated in the regular PIC manner, the ion current, \(\sum_{s\neq e}\vec{J}_s\), is known during a simulation. Therefore, given the magnetic field, the electron current can be calculated.

The electron momentum transport equation (obtained from multiplying the Vlasov equation by mass and integrating over velocity), also called the generalized Ohm’s law, is given by:

\[en_e\vec{E} = \frac{m}{e}\frac{\partial \vec{J}_e}{\partial t} + \frac{m}{e^2}\left( \vec{U}_e\cdot\nabla \right) \vec{J}_e - \nabla\cdot {\overleftrightarrow P}_e - \vec{J}_e\times\vec{B}+\vec{R}_e\]

where \(\vec{U}_e = \vec{J}_e/(en_e)\) is the electron fluid velocity, \({\overleftrightarrow P}_e\) is the electron pressure tensor and \(\vec{R}_e\) is the drag force due to collisions between electrons and ions. Applying the above momentum equation to the Maxwell-Faraday equation (\(\frac{\partial\vec{B}}{\partial t} = -\nabla\times\vec{E}\)) and substituting in \(\vec{J}\) calculated from the Maxwell-Ampere equation, gives,

\[\frac{\partial\vec{J}_e}{\partial t} = -\frac{1}{\mu_0}\nabla\times\left(\nabla\times\vec{E}\right) - \frac{\partial\vec{J}_{ext}}{\partial t} - \sum_{s\neq e}\frac{\partial\vec{J}_s}{\partial t}.\]

Plugging this back into the generalized Ohm’s law gives:

\[\begin{split}\left(en_e +\frac{m}{e\mu_0}\nabla\times\nabla\times\right)\vec{E} =& - \frac{m}{e}\left( \frac{\partial\vec{J}_{ext}}{\partial t} + \sum_{s\neq e}\frac{\partial\vec{J}_s}{\partial t} \right) \\ &+ \frac{m}{e^2}\left( \vec{U}_e\cdot\nabla \right) \vec{J}_e - \nabla\cdot {\overleftrightarrow P}_e - \vec{J}_e\times\vec{B}+\vec{R}_e.\end{split}\]

If we now further assume electrons are inertialess (i.e. \(m=0\)), the above equation simplifies to,

\[en_e\vec{E} = -\vec{J}_e\times\vec{B}-\nabla\cdot{\overleftrightarrow P}_e+\vec{R}_e.\]

Making the further simplifying assumptions that the electron pressure is isotropic and that the electron drag term can be written using a simple resistivity (\(\eta\)) and hyper-resistivity (\(\eta_h\)) i.e. \(\vec{R}_e = en_e(\eta-\eta_h \nabla^2)\vec{J}\), brings us to the implemented form of Ohm’s law:

\[\vec{E} = -\frac{1}{en_e}\left( \vec{J}_e\times\vec{B} + \nabla P_e \right)+\eta\vec{J}-\eta_h \nabla^2\vec{J}.\]

Lastly, if an electron temperature is given from which the electron pressure can be calculated, the model is fully constrained and can be evolved given initial conditions.
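
As an illustration of how the implemented Ohm's law ties together the quantities introduced above, the Python sketch below evaluates it on a 1D grid (variation along \(z\) only) using simple finite differences. It is a minimal sketch under those assumptions, not the WarpX discretization, and the external current is omitted.

import numpy as np

mu0 = 4.0e-7 * np.pi          # vacuum permeability
q_e = 1.602176634e-19

def ohm_law_E(n_e, J_i, B, P_e, dz, eta=0.0, eta_h=0.0):
    # 1D sketch: arrays are sampled along z with spacing dz; J_i and B have
    # shape (nz, 3). Only z-derivatives are retained, so (curl B)_z = 0.
    Jx = -np.gradient(B[:, 1], dz) / mu0          # (curl B)_x = -dBy/dz
    Jy = np.gradient(B[:, 0], dz) / mu0           # (curl B)_y =  dBx/dz
    J = np.stack([Jx, Jy, np.zeros_like(Jx)], axis=1)
    J_e = J - J_i                                  # electron current
    grad_Pe = np.stack([np.zeros_like(P_e), np.zeros_like(P_e),
                        np.gradient(P_e, dz)], axis=1)
    lap_J = np.stack([np.gradient(np.gradient(J[:, i], dz), dz) for i in range(3)],
                     axis=1)
    # E = -(J_e x B + grad P_e)/(e n_e) + eta*J - eta_h*laplacian(J)
    return (-(np.cross(J_e, B) + grad_Pe) / (q_e * n_e[:, None])
            + eta * J - eta_h * lap_J)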

Implementation details

Note

Various verification tests of the hybrid model implementation can be found in the examples section.

The kinetic-fluid hybrid extension mostly uses the same routines as the standard electromagnetic PIC algorithm with the only exception that the E-field is calculated from the above equation rather than it being updated from the full Maxwell-Ampere equation. The function WarpX::HybridPICEvolveFields() handles the logic to update the E&M fields when the “hybridPIC” model is used. This function is executed after particle pushing and deposition (charge and current density) has been completed. Therefore, based on the usual time-staggering in the PIC algorithm, when HybridPICEvolveFields() is called at timestep \(t=t_n\), the quantities \(\rho^n\), \(\rho^{n+1}\), \(\vec{J}_i^{n-1/2}\) and \(\vec{J}_i^{n+1/2}\) are all known.

Field update

The field update is done in three steps as described below.

First half step

Firstly the E-field at \(t=t_n\) is calculated for which the current density needs to be interpolated to the correct time, using \(\vec{J}_i^n = 1/2(\vec{J}_i^{n-1/2}+ \vec{J}_i^{n+1/2})\). The electron pressure is simply calculated using \(\rho^n\) and the B-field is also already known at the correct time since it was calculated for \(t=t_n\) at the end of the last step. Once \(\vec{E}^n\) is calculated, it is used to push \(\vec{B}^n\) forward in time (using the Maxwell-Faraday equation) to \(\vec{B}^{n+1/2}\).

Second half step

Next, the E-field is recalculated to get \(\vec{E}^{n+1/2}\). This is done using the known fields \(\vec{B}^{n+1/2}\), \(\vec{J}_i^{n+1/2}\) and interpolated charge density \(\rho^{n+1/2}=1/2(\rho^n+\rho^{n+1})\) (which is also used to calculate the electron pressure). Similarly as before, the B-field is then pushed forward to get \(\vec{B}^{n+1}\) using the newly calculated \(\vec{E}^{n+1/2}\) field.

Extrapolation step

Obtaining the E-field at timestep \(t=t_{n+1}\) is a well documented issue for the hybrid model. Currently the approach in WarpX is to simply extrapolate \(\vec{J}_i\) forward in time, using

\[\vec{J}_i^{n+1} = \frac{3}{2}\vec{J}_i^{n+1/2} - \frac{1}{2}\vec{J}_i^{n-1/2}.\]

With this extrapolation all fields required to calculate \(\vec{E}^{n+1}\) are known and the simulation can proceed.
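
The time interpolation and extrapolation of the deposited quantities used across the three steps above can be written compactly; the snippet below is an illustrative Python summary (the variable names are placeholders, not WarpX symbols):

def hybrid_time_levels(Ji_nm12, Ji_np12, rho_n, rho_np1):
    # Quantities known when the field update starts at t = t_n.
    Ji_n     = 0.5 * (Ji_nm12 + Ji_np12)       # first half step: J_i at t_n
    rho_np12 = 0.5 * (rho_n + rho_np1)         # second half step: rho at t_{n+1/2}
    Ji_np1   = 1.5 * Ji_np12 - 0.5 * Ji_nm12   # extrapolation step: J_i at t_{n+1}
    return Ji_n, rho_np12, Ji_np1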

Sub-stepping

It is also well known that hybrid PIC routines require the B-field to be updated with a smaller timestep than needed for the particles. A 4th order Runge-Kutta scheme is used to update the B-field. The RK scheme is repeated a number of times during each half-step outlined above. The number of sub-steps used can be specified by the user through a runtime simulation parameter (see input parameters section).

Electron pressure

The electron pressure is assumed to be a scalar quantity and calculated using the given input parameters, \(T_{e0}\), \(n_0\) and \(\gamma\) using

\[P_e = n_0T_{e0}\left( \frac{n_e}{n_0} \right)^\gamma.\]

The isothermal limit is given by \(\gamma = 1\) while \(\gamma = 5/3\) (default) produces the adiabatic limit.
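
As a one-line illustration (assuming \(T_{e0}\) is given in energy units, so that \(nT\) is a pressure; the function name is chosen for this example):

def electron_pressure(n_e, n0, T_e0, gamma=5.0 / 3.0):
    # P_e = n0 * T_e0 * (n_e / n0)**gamma; gamma = 1 gives the isothermal limit.
    return n0 * T_e0 * (n_e / n0)**gamma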

Electron current

WarpX’s displacement current diagnostic can be used to output the electron current in the kinetic-fluid hybrid model since in the absence of kinetic electrons, and under the assumption of zero displacement current, that diagnostic simply calculates the hybrid model’s electron current.

[1]

R. E. Groenewald, A. Veksler, F. Ceccherini, A. Necas, B. S. Nicks, D. C. Barnes, T. Tajima, and S. A. Dettrick. Accelerated kinetic model for global macro stability studies of high-beta fusion reactors. Physics of Plasmas, 30(12):122508, Dec 2023. doi:10.1063/5.0178288.

[2]

C. W. Nielson and H. R. Lewis. Particle-Code Models in the Nonradiative Limit. In J. Killeen, editor, Controlled Fusion, volume 16 of Methods in Computational Physics: Advances in Research and Applications, pages 367–388. Elsevier, 1976. doi:10.1016/B978-0-12-460816-0.50015-4.

Cold Relativistic Fluid Model

An alternative to representing the plasma as macroparticles is the cold relativistic fluid model. The cold relativistic fluid model is typically faster to compute than particles and is useful to replace particles when kinetic effects are negligible. This can be done for certain parts of the plasma, such as the background plasma, while still representing particle beams as a group of macroparticles. The two models then couple through Maxwell’s equations.

In the cold limit (zero internal temperature and pressure) of a relativistic plasma, the Maxwell-Fluid equations govern the plasma evolution. The fluid equations for each species \(s\) are given by,

\[\begin{split}\frac{\partial N_s}{\partial t} + \nabla \cdot (N_s\mathbf{V}_s) &= 0 \\ \frac{\partial (N\mathbf{U})_s}{\partial t} + \nabla \cdot ((N\mathbf{U})_s\mathbf{V}_s) &= \frac{q_sN_s}{m_s}(\mathbf{E}_s + \mathbf{V}_s \times \mathbf{B}_s).\end{split}\]

The fields are updated via Maxwell’s equations,

\[\begin{split}\nabla \cdot \mathbf{E} &= \frac{\rho}{\varepsilon_0} \\ \nabla \cdot \mathbf{B} &= 0 \\ \nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \\ \nabla \times \mathbf{B} &= \mu_0 \mathbf{J} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}.\end{split}\]

The fluids are coupled to the fields through,

\[\begin{split}\rho &= \rho_{ptcl}+\sum_s q_sN_s \\ \mathbf{J} &= \mathbf{J}_{ptcl}+\sum_s q_sN_s\mathbf{V}_s \\ \mathbf{V}_s &= \frac{ \mathbf{U}_s }{ \sqrt{ 1 + \mathbf{U}_s^2/c^2} } \\ (N\mathbf{U})_s &= N_s\mathbf{U}_s\end{split}\]

where the particle quantities are calculated by the PIC algorithm.
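
The coupling relations above are simple algebraic operations on the fluid arrays; a minimal Python sketch (illustrative only; it assumes arrays of shape (npoints,) for \(N\) and (npoints, 3) for \(N\mathbf{U}\)) is:

import numpy as np

c = 299792458.0

def fluid_velocity(NU, N):
    # V_s = U_s / sqrt(1 + U_s^2/c^2), with U_s = (N U)_s / N_s.
    U = NU / N[:, None]
    return U / np.sqrt(1.0 + np.sum(U**2, axis=1) / c**2)[:, None]

def fluid_charge_current(q_s, N, NU):
    # Contribution of one fluid species to the charge and current densities.
    V = fluid_velocity(NU, N)
    return q_s * N, q_s * N[:, None] * V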

Implementation details

Fluid Loop embedded within the overall PIC loop.

The fluid timeloop is embedded inside the standard PIC timeloop and consists of the following steps:

  1. Higuera and Cary push of the momentum

  2. Non-inertial (momentum source) terms (only in cylindrical geometry)

  3. Boundary conditions and MPI communications

  4. MUSCL scheme for advection terms

  5. Current and charge deposition

Fig. 35 gives a visual representation of these steps, and we describe each of these in more detail.

Step 0: Preparation

Before the fluid loop begins, it is assumed that the program is in the state where the fields \(\mathbf{E}\) and \(\mathbf{B}\) are available at the integer timestep. The fluids themselves are represented by arrays of fluid quantities (density and momentum density, \(\mathbf{Q} \equiv \{ N, NU_x, NU_y, NU_z \}\)) known on a nodal grid and at the half-integer timestep.

Step 1: Higuera and Cary Push

The time staggering of the fields is used by the momentum source term, which is solved with a Higuera and Cary push [1]. We do not adopt spatial grid staggering: all discretized fluid quantities exist on the nodal grid. External fields can be included at this step.

Step 2: Non-inertial Terms

In RZ, the divergence of the flux terms has additional non-zero elements outside of the derivatives. These terms are Strang split and are time integrated via equation 2.18 from Shu and Osher [2], which is the SSP-RK3 integrator.

Step 3: Boundary Conditions and Communications

At this point, the code applies boundary conditions (assuming Neumann boundary conditions for the fluid quantities) and exchanges guard cells between MPI ranks in preparation of derivative terms in the next step.

Step 4: Advective Push

For the advective term, a MUSCL scheme with low-diffusion minmod slope limiting is used. We further simplify the conservative equations in terms of primitive variables, \(\{ N, U_x, U_y, U_z \}\), which we found to be more stable than conservative variables for the MUSCL reconstruction. Details of the scheme can be found in Van Leer [3].
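
For illustration, a textbook minmod-limited MUSCL reconstruction of one primitive variable on a 1D grid is sketched below in Python. The actual WarpX scheme uses a simplified slope averaging, as noted later in this section, so this is not the production algorithm.

import numpy as np

def minmod(a, b):
    # Return the smaller-magnitude slope, or zero at extrema (sign change).
    return np.where(a * b > 0.0, np.where(np.abs(a) < np.abs(b), a, b), 0.0)

def muscl_reconstruct(q, dx):
    # Limited linear reconstruction of q at the faces of interior cells.
    dq_left = (q[1:-1] - q[:-2]) / dx       # backward differences
    dq_right = (q[2:] - q[1:-1]) / dx       # forward differences
    slope = minmod(dq_left, dq_right)
    q_left_face = q[1:-1] - 0.5 * dx * slope
    q_right_face = q[1:-1] + 0.5 * dx * slope
    return q_left_face, q_right_face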

Step 5: Current and Charge Deposition

Once this series of steps is complete and the fluids have been evolved by an entire timestep, the current and charge are deposited onto the grid and added to the total current and charge densities.

Note

The algorithm is safe with zero fluid density.

It also implements a positivity limiter on the density to prevent negative density regions from forming.

There is currently no ability to perform azimuthal mode decomposition in RZ.

Mesh refinement is not supported for the fluids.

The implemented MUSCL scheme has a simplified slope averaging, see the extended writeup for details.

More details on the precise implementation are available here, WarpX_Cold_Rel_Fluids.pdf.

Warning

If using the fluid model with the Kinetic-Fluid Hybrid model or the electrostatic solver, there is a known issue that the fluids deposit at a half-timestep offset in the charge-density.

[1]

A. V. Higuera and J. R. Cary. Structure-preserving second-order integration of relativistic charged particle trajectories in electromagnetic fields. Physics of Plasmas, 24(5):052104, 2017. doi:10.1063/1.4979989.

[2]

C.-W. Shu and S. Osher. Efficient implementation of essentially non-oscillatory shock-capturing schemes. Journal of Computational Physics, 77(2):439–471, 1988. doi:10.1016/0021-9991(88)90177-5.

[3]

B. Van Leer. On The Relation Between The Upwind-Differencing Schemes Of Godunov, Engquist—Osher and Roe, pages 33–52. Springer Berlin Heidelberg, 1997. URL: https://doi.org/10.1007/978-3-642-60543-7_3, doi:10.1007/978-3-642-60543-7_3.

Development

Contribute to WarpX

We welcome new contributors! Here is how to participate in WarpX development.

Git workflow

The WarpX project uses git for version control. If you are new to git, you can follow this tutorial.

Configure your GitHub Account & Development Machine

First, let’s setup your Git environment and GitHub account.

  1. Go to https://github.com/settings/profile and add your real name and affiliation

  2. Go to https://github.com/settings/emails and add & verify the professional e-mails you want to be associated with.

  3. Configure git on the machine you develop on to use the same spelling of your name and email:

    • git config --global user.name "FIRSTNAME LASTNAME"

    • git config --global user.email EMAIL@EXAMPLE.com

  4. Go to https://github.com/settings/keys and add the SSH public key of the machine you develop on. (Check out the GitHub guide to generating SSH keys or troubleshoot common SSH problems.)

Make your own fork

First, fork the WarpX “mainline” repo on GitHub by pressing the Fork button on the top right of the page. A fork is a copy of WarpX on GitHub, which is under your full control.

Then, we create local copies, for development:

# Clone the mainline WarpX source code to your local computer.
# You cannot write to this repository, but you can read from it.
git clone git@github.com:ECP-WarpX/WarpX.git
cd WarpX

# rename what we just cloned: call it "mainline"
git remote rename origin mainline

# Add your own fork. You can get this address on your fork's Github page.
# Here is where you will publish new developments, so that they can be
# reviewed and integrated into "mainline" later on.
# "myGithubUsername" needs to be replaced with your user name on GitHub.
git remote add myGithubUsername git@github.com:myGithubUsername/WarpX.git

Now you are free to play with your fork (for additional information, you can visit the Github fork help page).

Note

The steps above only need to be done once, the first time you set up your fork.

Let’s Develop

You are all set! Now, the basic WarpX development workflow is:

  1. Implement your changes and push them on a new branch branch_name on your fork.

  2. Create a Pull Request from branch branch_name on your fork to branch development on the main WarpX repo.

Create a branch branch_name (the branch name should reflect the piece of code you want to add, like fix-spectral-solver) with

# start from an up-to-date development branch
git checkout development
git pull mainline development

# create a fresh branch
git checkout -b branch_name

and do the coding you want.

It is probably a good time to look at the AMReX documentation and at the Doxygen reference pages.

Once you are done developing, add the files you created and/or modified to the git staging area with

git add <file_I_created> <and_file_I_modified>

Build your changes

If you changed C++ files, then now is a good time to test those changes by compiling WarpX locally. Follow the developer instructions in our manual to set up a local development environment, then compile and run WarpX.

Commit & push your changes

Periodically commit your changes with

git commit

The commit message is super important in order to follow the developments during code-review and identify bugs. A typical format is:

This is a short, 40-character title

After a newline, you can write arbitrary paragraphs. You
usually limit the lines to 70 characters, but if you don't, then
nothing bad will happen.

The most important part is really that you find a descriptive title
and add an empty newline after it.

For the moment, commits are on your local repo only. You can push them to your fork with

git push -u myGithubUsername branch_name

If you want to synchronize your branch with the development branch (this is useful when the development branch is being modified while you are working on branch_name), you can use

git pull mainline development

and fix any conflict that may occur.

Submit a Pull Request

A Pull Request (PR) is the way to efficiently visualize the changes you made and to propose your new feature/improvement/fix to the WarpX project. Right after you push changes, a banner should appear on the Github page of your fork, with your <branch_name>.

  • Click on the compare & pull request button to prepare your PR.

  • It is time to communicate your changes: write a title and a description for your PR. People who review your PR are happy to know

    • what feature/fix you propose, and why

    • how you made it (added new/edited files, created a new class that inherits from…)

    • how you tested it and what was the output you got

    • and anything else relevant to your PR (attach images and scripts, link papers, etc.)

  • Press Create pull request. Now you can navigate through your PR, which highlights the changes you made.

Please DO NOT write large pull requests, as they are very difficult and time-consuming to review. As much as possible, split them into small, targeted PRs. For example, if you find typos in the documentation, open a pull request that only fixes typos. If you want to fix a bug, make a small pull request that only fixes that bug.

If you want to implement a feature and are not too sure how to split it, just open an issue about your plans and ping other WarpX developers on it to chime in. Generally, write helper functionality first, test it and then write implementation code. Submit tests, documentation changes and implementation of a feature together for pull request review.

Even before your work is ready to merge, it can be convenient to create a PR (so you can use Github tools to visualize your changes). In this case, please put the [WIP] tag (for Work-In-Progress) at the beginning of the PR title. You can also use the GitHub project tab in your fork to organize the work into separate tasks/PRs and share it with the WarpX community to get feedback.

Include a test with your PR

A new feature is great, a working new feature is even better! Please test your code and add your test to the automated test suite. It’s the way to protect your work from adventurous developers. Instructions are given in the testing section of our developer’s documentation.

Include documentation about your PR

Now, let users know about your new feature by describing its usage in the WarpX documentation. Our documentation uses Sphinx, and it is located in Docs/source/. For instance, if you introduce a new runtime parameter in the input file, you can add it to Docs/source/running_cpp/parameters.rst. If Sphinx is installed on your computer, you should be able to generate the html documentation with

make html

in Docs/. Then open Docs/build/html/index.html with your favorite web browser and look for your changes.

Once your code is ready with documentation and automated test, congratulations! You can create the PR (or remove the [WIP] tag if you already created it). Reviewers will interact with you if they have comments/questions.

Style and conventions

  • For indentation, WarpX uses four spaces (no tabs)

  • Some text editors automatically modify the files you open. We recommend enabling the options that remove trailing spaces and replace tabs with 4 spaces.

  • The number of characters per line should be <100

  • Exception: in documentation files (.rst/.md) use one sentence per line independent of its number of characters, which will allow easier edits.

  • Space before and after assignment operator (=)

  • To define a function, use a space between the name of the function and the parentheses, e.g., myfunction (). When calling a function, no space should be used, i.e., just use myfunction(). The reason this is beneficial is that when we do a git grep to search for myfunction (), we can clearly see the locations where myfunction () is defined and where myfunction() is called. Also, using git grep "myfunction ()" searches only the files tracked in the git repo, which is more efficient than the grep "myfunction ()" command that searches through all the files in a directory, including plotfiles for example.

  • To define a class, use class on the same line as the name of the class, e.g., class MyClass. The reason this is beneficial is that when we do a git grep to search for class MyClass, we can clearly see the locations where class MyClass is defined and where MyClass is used.

  • When defining a function or class, make sure the starting { token appears on a new line.

  • Use curly braces for single statement blocks. For example:

    for (int n = 0; n < 10; ++n) {
        amrex::Print() << "Like this!";
    }
    
    for (int n = 0; n < 10; ++n) { amrex::Print() << "Or like this!"; }
    

    but not

    for (int n = 0; n < 10; ++n)
        amrex::Print() << "Not like this.";
    
    for (int n = 0; n < 10; ++n) amrex::Print() << "Nor like this.";
    
  • It is recommended that style changes are not included in a PR that adds new code. This avoids introducing errors into the PR just to make a style change.

  • WarpX uses CamelCase convention for file names and class names, rather than snake_case.

  • The names of all member variables should be prefixed with m_. This is particularly useful to avoid capturing member variables by value in a lambda function, which would cause the whole object to be copied to the GPU when running on a GPU-accelerated architecture. This convention should be used for all new pieces of code, and it should be applied progressively to old code (see the sketch after this list).

  • #include directives in C++ have a distinct order to avoid bugs, see the WarpX repo structure for details

  • For all new code, we should avoid relying on using namespace amrex; and all amrex types should be prefixed with amrex::. Inside limited scopes, AMReX type literals can be included with using namespace amrex::literals;. Ideally, old code should be modified accordingly.
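
For instance, the member-variable convention interacts with GPU lambdas as follows (a schematic sketch, not actual WarpX code; the class and function names are hypothetical): a member such as m_dt is first copied into a local variable, so that the device lambda captures the local copy by value rather than relying on the host-side this pointer.

#include <AMReX_Array4.H>
#include <AMReX_Box.H>
#include <AMReX_GpuLaunch.H>
#include <AMReX_GpuQualifiers.H>
#include <AMReX_REAL.H>

// Schematic example (not actual WarpX code): MyPusher and PushPositions are
// hypothetical names. The member m_dt is copied into a local variable before
// the device lambda, so the lambda captures the local by value.
struct MyPusher
{
    amrex::Real m_dt = 1.e-15;

    void PushPositions (amrex::Box const& bx, amrex::Array4<amrex::Real> const& x) const
    {
        const amrex::Real dt = m_dt; // local copy: safe to capture by value
        amrex::ParallelFor(bx,
            [=] AMREX_GPU_DEVICE (int i, int j, int k)
            {
                x(i,j,k) += dt; // uses the local copy, not m_dt
            });
    }
};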

Implementation Details

AMReX basics (excessively basic)

WarpX is built on the Adaptive Mesh Refinement (AMR) library AMReX. This section provides a brief description of the main AMReX classes and concepts relevant for WarpX, and can serve as a reminder. Please read the AMReX basics doc page, from which this section is largely inspired.

  • amrex::Box: Dimension-dependent lower and upper indices defining a rectangular volume in 3D (or surface in 2D) in the index space. Box is a lightweight meta-data class, with useful member functions.

  • amrex::BoxArray: Collection of Box on a single AMR level. The information of which MPI rank owns which Box in a BoxArray is in DistributionMapping.

  • amrex::FArrayBox: Fortran-ordered array of floating-point amrex::Real elements defined on a Box. A FArrayBox can represent scalar data or vector data, with ncomp components.

  • amrex::MultiFab: Collection of FAB (= FArrayBox) on a single AMR level, distributed over MPI ranks. The concept of ghost cells is defined at the MultiFab level.

  • amrex::ParticleContainer: A collection of particles, typically for particles of a physical species. Particles in a ParticleContainer are organized per Box. Particles in a Box are organized per tile (this feature is off when running on GPU). Particles within a tile are stored in several structures, each being contiguous in memory: (i) a Struct-of-Array (SoA) for amrex::ParticleReal data such as positions, weight, momentum, etc., (ii) a Struct-of-Array (SoA) for int data, such as ionization levels, and (iii) a Struct-of-Array (SoA) for a uint64_t unique identifier index per particle (containing a 40bit id and 24bit cpu sub-identifier as assigned at particle creation time). This id is also used to check if a particle is active/valid or marked for removal.

The simulation domain is decomposed into several Box, and each MPI rank owns (and performs operations on) the fields and particles defined on a few of these Box, but has the metadata of all of them. For convenience, AMReX provides iterators to easily iterate over all FArrayBox (or even tile-by-tile, optionally) in a MultiFab owned by the MPI rank (MFIter), or over all particles in a ParticleContainer on a per-box basis (ParIter, or its derived class WarpXParIter). These are respectively done in loops like:

// mf is a pointer to a MultiFab
for ( amrex::MFIter mfi(*mf, false); mfi.isValid(); ++mfi ) { ... }

and

// this loop is written inside a ParticleContainer member function (*this is a ParticleContainer)
for (WarpXParIter pti(*this, lev); pti.isValid(); ++pti) { ... }

When looping over FArrayBox in a MultiFab, the iterator provides functions to retrieve the metadata of the Box on which the FAB is defined (MFIter::box(), MFIter::tilebox() or variations) or the particles defined on this Box (ParIter::GetParticles()).

WarpX Structure

Repo Organization

All the WarpX source code is located in Source/. All sub-directories have a pretty straightforward name. The PIC loop is part of the WarpX class, in function WarpX::Evolve implemented in Source/WarpXEvolve.cpp. The core of the PIC loop (i.e., without diagnostics etc.) is in WarpX::OneStep_nosub (when subcycling is OFF) or WarpX::OneStep_sub1 (when subcycling is ON, with method 1). Here is a visual representation of the repository structure.

Code organization

The main WarpX class is WarpX, implemented in Source/WarpX.cpp.

Build System

WarpX uses the CMake build system generator. Each sub-folder contains a file CMakeLists.txt with the names of the source files (.cpp) that are added to the build. Do not list header files (.H) here.

For experienced developers, we also support AMReX's GNUmake build script collection. The file Make.package in each sub-folder has the same purpose as the CMakeLists.txt file; please add new .cpp files to both.

C++ Includes

All WarpX header files need to be specified relative to the Source/ directory.

  • e.g. #include "Utils/WarpXConst.H"

  • files in the same directory as the including header-file can be included with #include "FileName.H"

By default, in a MyName.cpp source file, we do not include headers already included in MyName.H. Apart from this exception, if a function or a class is used in a source file, the header file containing its declaration must be included, unless the inclusion of a facade header is more appropriate. This is sometimes the case for AMReX headers. For instance, AMReX_GpuLaunch.H is a facade header for AMReX_GpuLaunchFunctsC.H and AMReX_GpuLaunchFunctsG.H, which contain respectively the CPU and the GPU implementations of some methods, and which should not be included directly. Whenever possible, forward declaration headers are included instead of the actual headers, in order to save compilation time (see the dedicated section below). In WarpX, forward declaration headers have the suffix *_fwd.H, while in AMReX they have the suffix *Fwd.H. The include order (see PR #874 and PR #1947) and proper quotation marks are as follows (an illustrative example is given after the two lists):

In a MyName.cpp file:

  1. #include "MyName.H" (its header) then

  2. (further) WarpX header files #include "..." then

  3. WarpX forward declaration header files #include "..._fwd.H"

  4. AMReX header files #include <...> then

  5. AMReX forward declaration header files #include <...Fwd.H> then

  6. PICSAR header files #include <...> then

  7. other third party includes #include <...> then

  8. standard library includes, e.g. #include <vector>

In a MyName.H file:

  1. #include "MyName_fwd.H" (the corresponding forward declaration header, if it exists) then

  2. WarpX header files #include "..." then

  3. WarpX forward declaration header files #include "..._fwd.H"

  4. AMReX header files #include <...> then

  5. AMReX forward declaration header files #include <...Fwd.H> then

  6. PICSAR header files #include <...> then

  7. other third party includes #include <...> then

  8. standard library includes, e.g. #include <vector>

Each of these groups of header files should ideally be sorted alphabetically, and a blank line should be placed between the groups.
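
As a hedged illustration of this ordering (the file names, apart from Utils/WarpXConst.H and the standard AMReX and C++ headers, are hypothetical placeholders), the top of a MyName.cpp could look like:

#include "MyName.H"               // 1. the corresponding header

#include "Utils/WarpXConst.H"     // 2. WarpX header files

#include "MyHelperClass_fwd.H"    // 3. WarpX forward declaration headers (placeholder name)

#include <AMReX_MultiFab.H>       // 4. AMReX header files

#include <AMReX_BaseFwd.H>        // 5. AMReX forward declaration headers

// 6.-7. PICSAR and other third-party includes would go here

#include <vector>                 // 8. standard library includes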

For details why this is needed, please see PR #874, PR #1947, the LLVM guidelines, and include-what-you-use.

Forward Declaration Headers

Forward declarations can be used when a header file needs only to know that a given class exists, without any further detail (e.g., when only a pointer to an instance of that class is used). Forward declaration headers are a convenient way to organize forward declarations. If a forward declaration is needed for a given class MyClass, declared in MyClass.H, the forward declaration should appear in a header file named MyClass_fwd.H, placed in the same folder containing MyClass.H. As for regular header files, forward declaration headers must have include guards. Below we provide a simple example:

MyClass_fwd.H:

#ifndef MY_CLASS_FWD_H
#define MY_CLASS_FWD_H

class MyClass;

#endif // MY_CLASS_FWD_H

MyClass.H:

#ifndef MY_CLASS_H
#define MY_CLASS_H

#include "MyClass_fwd.H"
#include "someHeader.H"

class MyClass {
    void stuff ();
};

#endif // MY_CLASS_H

MyClass.cpp:

#include "MyClass.H"

void MyClass::stuff () { /* stuff */ }

Usage: in SomeType.H

#ifndef SOMETYPE_H
#define SOMETYPE_H

#include "MyClass_fwd.H" // all info we need here
#include <memory>

struct SomeType {
    std::unique_ptr<MyClass> p_my_class;
};

#endif // SOMETYPE_H

Usage: in somewhere.cpp

#include "SomeType.H"
#include "MyClass.H"  // because we call "stuff()" we really need
                      // to know the full declaration of MyClass

void somewhere ()
{
    SomeType s;
    s.p_my_class = std::make_unique<MyClass>();
    s.p_my_class->stuff();
}

All files that only need to know the type SomeType from SomeType.H but do not access the implementation details of MyClass will benefit from improved compilation times.

Dimensionality

This section describes the handling of dimensionality in WarpX.

Build Options

Dimensions   CMake Option
3D3V         WarpX_DIMS=3 (default)
2D3V         WarpX_DIMS=2
1D3V         WarpX_DIMS=1
RZ           WarpX_DIMS=RZ

Note that one can build multiple WarpX dimensions at once via -DWarpX_DIMS="1;2;RZ;3".

See building from source for further details.

Defines

Depending on the build variant of WarpX, the following preprocessor macros will be set:

Macro            3D3V       2D3V       1D3V       RZ
AMREX_SPACEDIM   3          2          1          2
WARPX_DIM_3D     defined    undefined  undefined  undefined
WARPX_DIM_1D_Z   undefined  undefined  defined    undefined
WARPX_DIM_XZ     undefined  defined    undefined  undefined
WARPX_DIM_RZ     undefined  undefined  undefined  defined
WARPX_ZINDEX     2          1          0          1
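
As a schematic illustration (not actual WarpX source code), dimension-dependent code typically branches on these macros, for example:

#include <AMReX_REAL.H>

// Schematic example: return the product of the cell sizes along the
// simulation box dimensions, depending on the build dimension.
amrex::Real CellSizeProduct (amrex::Real dx, amrex::Real dy, amrex::Real dz)
{
#if defined(WARPX_DIM_3D)
    return dx * dy * dz;        // 3 box dimensions: x, y, z
#elif defined(WARPX_DIM_XZ) || defined(WARPX_DIM_RZ)
    (void) dy;                  // 2 box dimensions: (x, z) or (r, z)
    return dx * dz;
#else  // WARPX_DIM_1D_Z
    (void) dx; (void) dy;       // 1 box dimension: z
    return dz;
#endif
}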

At the same time, the following conventions will apply:

Convention                         3D3V      2D3V    1D3V   RZ
Fields
  AMReX Box dimensions             3         2       1      2
  WarpX axis labels                x, y, z   x, z    z      x, z
Particles
  AMReX .pos()                     0, 1, 2   0, 1    0      0, 1
  WarpX position names             x, y, z   x, z    z      r, z
  extra SoA attribute              -         -       -      theta

Please see the following sections for particle SoA details.

Conventions

In 2D3V, we assume that the position of a particle in y is equal to 0. In 1D3V, we assume that the position of a particle in x and y is equal to 0.

Fields

Note

Add info on staggering and domain decomposition. Synchronize with section initialization.

The main fields are the electric field Efield, the magnetic field Bfield, the current density current and the charge density rho. When a divergence-cleaner is used, we add another field F (containing \(\vec \nabla \cdot \vec E - \rho\)).

Due to the AMR strategy used in WarpX (see section Theory: AMR for a complete description), each field on a given refinement level lev (except for the coarsest level 0) is defined on:

  • the fine patch (suffix _fp, the actual resolution on lev).

  • the coarse patch (suffix _cp, same physical domain with the resolution of MR level lev-1).

  • the auxiliary grid (suffix _aux, same resolution as _fp), from which the fields are gathered from the grids to the particle positions. For this reason, only E and B are defined on this _aux grid (not the current density or charge density).

  • In some conditions, i.e., when buffers are used for the field gather (for numerical reasons), a copy of E and B on the auxiliary grid _aux of the level below lev-1 is stored in fields with suffix _cax (for coarse aux).

As an example, the structures for the electric field are Efield_fp, Efield_cp, Efield_aux (and optionally Efield_cax).

Declaration

All the fields described above are public members of class WarpX, defined in WarpX.H. They are defined as an amrex::Vector (over MR levels) of std::array (for the 3 spatial components \(E_x\), \(E_y\), \(E_z\)) of std::unique_ptr of amrex::MultiFab, i.e.:

amrex::Vector<std::array< std::unique_ptr<amrex::MultiFab>, 3 > > Efield_fp;

Hence, Efield_fp[lev][0] points to the amrex::MultiFab holding \(E_x\) on MR level lev. The other fields are organized in the same way.
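
For instance (a minimal sketch inside a WarpX member function):

// Minimal sketch: access the MultiFab holding Ex on MR level lev.
amrex::MultiFab& Ex = *Efield_fp[lev][0];
const int ncomp = Ex.nComp();               // number of components
const amrex::IntVect ngv = Ex.nGrowVect();  // guard cells in each direction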

Allocation and initialization

The MultiFab constructor (for, e.g., Ex on level lev) is called in WarpX::AllocLevelMFs.

By default, the MultiFab are set to 0 at initialization. They can be assigned a different value in WarpX::InitLevelData.

Field solver

The field solver is performed in WarpX::EvolveE for the electric field and WarpX::EvolveB for the magnetic field, called from WarpX::OneStep_nosub in WarpX::Evolve. This section describes the FDTD field push. It is implemented in Source/FieldSolver/FiniteDifferenceSolver/.

As with all cell-wise operations, the field push is done as follows (in the actual implementation, this is split into multiple functions to avoid code duplication):

// Loop over MR levels
for (int lev = 0; lev <= finest_level; ++lev) {
    // Get pointer to MultiFab Ex on level lev
    MultiFab* Ex = Efield_fp[lev][0].get();
    // Loop over boxes (or tiles if not on GPU)
    for ( MFIter mfi(*Ex, TilingIfNotGPU()); mfi.isValid(); ++mfi ) {
        // Apply field solver on the FAB
    }
}

The innermost step // Apply field solver on the FAB could be done with 3 nested for loops for the 3 dimensions (in 3D). However, for portability reasons (see section Developers: Portability), this is done in two steps: (i) extract AMReX data structures into plain-old-data simple structures, and (ii) call a general ParallelFor function (translated into nested loops on CPU or a kernel launch on GPU, for instance):

// Get the Box corresponding to the current MFIter
const Box& tex  = mfi.tilebox(Ex_nodal_flag);
// Extract the FArrayBox into a simple structure, for portability
Array4<Real> const& Exfab = Ex->array(mfi);
// Loop over cells and perform the stencil operation
amrex::ParallelFor(tex,
    [=] AMREX_GPU_DEVICE (int i, int j, int k)
    {
        Exfab(i, j, k) += c2 * dt * (
            - T_Algo::DownwardDz(By, coefs_z, n_coefs_z, i, j, k)
            + T_Algo::DownwardDy(Bz, coefs_y, n_coefs_y, i, j, k)
            - PhysConst::mu0 * jx(i, j, k) );
    }
);

where T_Algo::DownwardDz and T_Algo::DownwardDy represent the discretized derivative for a given algorithm (represented by the template parameter T_Algo). The available discretization algorithms can be found in Source/FieldSolver/FiniteDifferenceSolver/FiniteDifferenceAlgorithms.

Guard cells exchanges

Communications are mostly handled in Source/Parallelization/.

For E and B guard cell exchanges, the main functions are variants of amrex::FillBoundary(amrex::MultiFab, ...) (or amrex::MultiFab::FillBoundary(...)), which fill the guard cells of every amrex::FArrayBox in an amrex::MultiFab with valid cells of the neighboring amrex::FArrayBox of the same amrex::MultiFab. There are a number of such wrappers, e.g., FillBoundaryE, FillBoundaryB, etc. Under the hood, amrex::FillBoundary calls amrex::ParallelCopy, which is also sometimes called directly in WarpX.

For the current density, the valid cells of neighboring MultiFabs are accumulated (added) rather than just copied. This is done using amrex::MultiFab::SumBoundary, and mostly located in Source/Parallelization/WarpXSumGuardCells.H.
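
Schematically, inside a WarpX member function, these exchanges look like the hedged sketch below (member names such as current_fp are used for illustration; the actual FillBoundaryE/FillBoundaryB functions add further logic for PML, MR levels, and the number of guard cells to exchange):

// Hedged sketch of guard-cell exchanges (simplified)
const amrex::Periodicity& period = Geom(lev).periodicity();

// E and B: copy valid cells of neighboring FArrayBoxes into guard cells
Efield_fp[lev][0]->FillBoundary(period);

// Current density: accumulate (add) the guard-cell contributions instead
current_fp[lev][0]->SumBoundary(period);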

Interpolations for MR

This is mostly implemented in Source/Parallelization, see the following functions (you may complain to the authors if the documentation is empty)

void WarpX::SyncCurrent(const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_fp, const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_cp, const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_buffer)

Apply filter and sum guard cells across MR levels. If current centering is used, center the current from a nodal grid to a staggered grid. For each MR level beyond level 0, interpolate the fine-patch current onto the coarse-patch current at the same level. Then, for each MR level, including level 0, apply filter and sum guard cells across levels.

Parameters:
  • J_fp[inout] reference to fine-patch current MultiFab (all MR levels)

  • J_cp[inout] reference to coarse-patch current MultiFab (all MR levels)

  • J_buffer[inout] reference to buffer current MultiFab (all MR levels)

void WarpX::RestrictCurrentFromFineToCoarsePatch(const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_fp, const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_cp, int lev)

Fills the values of the current on the coarse patch by averaging the values of the current of the fine patch (on the same level).

void WarpX::AddCurrentFromFineLevelandSumBoundary(const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_fp, const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_cp, const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_buffer, int lev)

Filter

General functions for filtering can be found in Source/Filter/, where the main Filter class is defined (see below). All filters (so far there are two of them) in WarpX derive from this class.

class Filter

Subclassed by BilinearFilter, NCIGodfreyFilter

Bilinear filter

The multi-pass bilinear filter (applied on the current density) is implemented in Source/Filter/, and class WarpX holds an instance of this class in member variable WarpX::bilinear_filter. For performance reasons (to avoid creating too many guard cells), this filter is directly applied in communication routines, see WarpX::AddCurrentFromFineLevelandSumBoundary above and

void WarpX::ApplyFilterJ(const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &current, int lev, int idim)
void WarpX::SumBoundaryJ(const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &current, int lev, int idim, const amrex::Periodicity &period)
Godfrey’s anti-NCI filter for FDTD simulations

This filter is applied on the electric and magnetic field (on the auxiliary grid) to suppress the Numerical Cherenkov Instability when running FDTD. It is implemented in Source/Filter/, and there are two different stencils, one for Ex, Ey and Bz and the other for Ez, Bx and By.

class NCIGodfreyFilter : public Filter

Class for Godfrey’s filter to suppress Numerical Cherenkov Instability.

It derives from the base class Filter. The filter stencil is initialized in the method ComputeStencils. Computing the stencil requires reading parameters from a table, where each line stands for a value of c*dt/dz. The filter is applied using the base class' method ApplyStencil.

The class WarpX holds two corresponding instances of this class in the member variables WarpX::nci_godfrey_filter_exeybz and WarpX::nci_godfrey_filter_bxbyez. It is a 9-point stencil (in the z direction only), for which the coefficients are computed using tabulated values (depending on dz/dx) in Source/Utils/NCIGodfreyTables.H, see variable table_nci_godfrey_galerkin_Ex_Ey_Bz. The filter is applied in PhysicalParticleContainer::Evolve, right after the field gather and before the particle push, see

void PhysicalParticleContainer::applyNCIFilter(int lev, const amrex::Box &box, amrex::Elixir &exeli, amrex::Elixir &eyeli, amrex::Elixir &ezeli, amrex::Elixir &bxeli, amrex::Elixir &byeli, amrex::Elixir &bzeli, amrex::FArrayBox &filtered_Ex, amrex::FArrayBox &filtered_Ey, amrex::FArrayBox &filtered_Ez, amrex::FArrayBox &filtered_Bx, amrex::FArrayBox &filtered_By, amrex::FArrayBox &filtered_Bz, const amrex::FArrayBox &Ex, const amrex::FArrayBox &Ey, const amrex::FArrayBox &Ez, const amrex::FArrayBox &Bx, const amrex::FArrayBox &By, const amrex::FArrayBox &Bz, amrex::FArrayBox const *&ex_ptr, amrex::FArrayBox const *&ey_ptr, amrex::FArrayBox const *&ez_ptr, amrex::FArrayBox const *&bx_ptr, amrex::FArrayBox const *&by_ptr, amrex::FArrayBox const *&bz_ptr)

Apply NCI Godfrey filter to all components of E and B before gather.

The NCI Godfrey filter is applied to Ex; the result is stored in filtered_Ex and the pointer exfab is modified (before this function is called, it points to Ex; after this function is called, it points to filtered_Ex).

Parameters:
  • lev – MR level

  • box – box onto which the filter is applied

  • exeli – safeguard Elixir object (to avoid deallocating too early, between ParIter iterations, on GPU) for field Ex

  • eyeli – safeguard Elixir object (to avoid deallocating too early, between ParIter iterations, on GPU) for field Ey

  • ezeli – safeguard Elixir object (to avoid deallocating too early, between ParIter iterations, on GPU) for field Ez

  • bxeli – safeguard Elixir object (to avoid deallocating too early, between ParIter iterations, on GPU) for field Bx

  • byeli – safeguard Elixir object (to avoid deallocating too early, between ParIter iterations, on GPU) for field By

  • bzeli – safeguard Elixir object (to avoid deallocating too early, between ParIter iterations, on GPU) for field Bz

  • filtered_Ex – Array containing filtered value

  • filtered_Ey – Array containing filtered value

  • filtered_Ez – Array containing filtered value

  • filtered_Bx – Array containing filtered value

  • filtered_By – Array containing filtered value

  • filtered_Bz – Array containing filtered value

  • Ex – Field array before filtering (not modified)

  • Ey – Field array before filtering (not modified)

  • Ez – Field array before filtering (not modified)

  • Bx – Field array before filtering (not modified)

  • By – Field array before filtering (not modified)

  • Bz – Field array before filtering (not modified)

  • ex_ptr – pointer to the Ex field (modified)

  • ey_ptr – pointer to the Ey field (modified)

  • ez_ptr – pointer to the Ez field (modified)

  • bx_ptr – pointer to the Bx field (modified)

  • by_ptr – pointer to the By field (modified)

  • bz_ptr – pointer to the Bz field (modified)

Particles

Particle containers

Particle structures and functions are defined in Source/Particles/. WarpX uses the Particle class from AMReX for single particles. An ensemble of particles (e.g., a plasma species, or laser particles) is stored as a WarpXParticleContainer (see description below) in a per-box (and even per-tile on CPU) basis.

class WarpXParticleContainer : public NamedComponentParticleContainer<amrex::DefaultAllocator>

WarpXParticleContainer is the base polymorphic class from which all concrete particle container classes (that store a collection of particles) derive. Derived classes can be used for plasma particles, photon particles, or non-physical particles (e.g., for the laser antenna). It derives from amrex::ParticleContainerPureSoA<PIdx::nattribs>, where the template arguments stand for the number of int and amrex::Real SoA data in amrex::SoAParticle.

  • SoA amrex::Real: positions x, y, z, momentum ux, uy, uz, … see PIdx for details; more can be added at runtime

  • SoA int: 0 attributes by default, but can be added at runtime

  • SoA uint64_t: idcpu, a global 64-bit index, composed of a 40-bit local id and a 24-bit cpu id (both set at particle creation).

WarpXParticleContainer contains the main functions for initialization, interaction with the grid (field gather and current deposition) and particle push.

Note: many functions are either pure virtual (meaning they MUST be defined in derived classes, e.g., Evolve) or implemented directly in this base class (e.g., CurrentDeposition).

Subclassed by LaserParticleContainer, PhysicalParticleContainer

Physical species are stored in PhysicalParticleContainer, that derives from WarpXParticleContainer. In particular, the main function to advance all particles in a physical species is PhysicalParticleContainer::Evolve (see below).

virtual void PhysicalParticleContainer::Evolve(int lev, const amrex::MultiFab &Ex, const amrex::MultiFab &Ey, const amrex::MultiFab &Ez, const amrex::MultiFab &Bx, const amrex::MultiFab &By, const amrex::MultiFab &Bz, amrex::MultiFab &jx, amrex::MultiFab &jy, amrex::MultiFab &jz, amrex::MultiFab *cjx, amrex::MultiFab *cjy, amrex::MultiFab *cjz, amrex::MultiFab *rho, amrex::MultiFab *crho, const amrex::MultiFab *cEx, const amrex::MultiFab *cEy, const amrex::MultiFab *cEz, const amrex::MultiFab *cBx, const amrex::MultiFab *cBy, const amrex::MultiFab *cBz, amrex::Real t, amrex::Real dt, DtType a_dt_type = DtType::Full, bool skip_deposition = false, PushType push_type = PushType::Explicit) override

Finally, all particle species (physical plasma species PhysicalParticleContainer, photon species PhotonParticleContainer or non-physical species LaserParticleContainer) are stored in MultiParticleContainer. The class WarpX holds one instance of MultiParticleContainer as a member variable, called WarpX::mypc (where mypc stands for “my particle containers”):

class MultiParticleContainer

The class MultiParticleContainer holds multiple instances of the polymorphic class WarpXParticleContainer, stored in its member variable “allcontainers”. The class WarpX typically has a single (pointer to an) instance of MultiParticleContainer.

MultiParticleContainer typically has two types of functions:

  • Functions that loop over all instances of WarpXParticleContainer in allcontainers and call the corresponding function (for instance, MultiParticleContainer::Evolve loops over all particle containers and calls the corresponding WarpXParticleContainer::Evolve function); see the sketch after this list.

  • Functions that specifically handle multiple species (for instance ReadParameters or mapSpeciesProduct).
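
As a hedged sketch of the first pattern (the function names DoSomethingAll and DoSomething are hypothetical placeholders; the real examples are the Evolve functions mentioned above):

// Hypothetical sketch of the "loop over all containers" pattern.
void MultiParticleContainer::DoSomethingAll (int lev)
{
    // allcontainers holds std::unique_ptr<WarpXParticleContainer> instances
    for (auto& pc : allcontainers) {
        pc->DoSomething(lev); // dispatch to each species container
    }
}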

Loop over particles

A typical loop over particles reads:

// pc is a std::unique_ptr<WarpXParticleContainer>
// Loop over MR levels
for (int lev = 0; lev <= finest_level; ++lev) {
    // Loop over particles, box by box
    for (WarpXParIter pti(*this, lev); pti.isValid(); ++pti) {
        // Do something on particles
        // [MY INNER LOOP]
    }
}

The innermost step [MY INNER LOOP] typically calls amrex::ParallelFor to perform operations on all particles in a portable way. The innermost loop in the code snippet above could look like:

// Get Struct-Of-Array particle data, also called attribs
// (x, y, z, ux, uy, uz, w)
auto& attribs = pti.GetAttribs();
auto& x = attribs[PIdx::x];
// [...]
// Number of particles in this box
const long np = pti.numParticles();
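
Continuing the sketch above in a hedged way (the shift applied here is purely illustrative), the raw data pointers can then be passed to amrex::ParallelFor to loop over the particles of the box on CPU or GPU:

// Hedged continuation of the snippet above; the operation is illustrative only.
amrex::ParticleReal* const xp = x.dataPtr();
const amrex::ParticleReal small_shift = 1.e-12; // hypothetical value

amrex::ParallelFor(np,
    [=] AMREX_GPU_DEVICE (long ip)
    {
        xp[ip] += small_shift; // operate on particle ip in a portable way
    });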

Main functions

virtual void PhysicalParticleContainer::PushPX(WarpXParIter &pti, amrex::FArrayBox const *exfab, amrex::FArrayBox const *eyfab, amrex::FArrayBox const *ezfab, amrex::FArrayBox const *bxfab, amrex::FArrayBox const *byfab, amrex::FArrayBox const *bzfab, amrex::IntVect ngEB, int, long offset, long np_to_push, int lev, int gather_lev, amrex::Real dt, ScaleFields scaleFields, DtType a_dt_type = DtType::Full)
void WarpXParticleContainer::DepositCurrent(amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J, amrex::Real dt, amrex::Real relative_time)

Deposit current density.

Parameters:
  • J[inout] vector of current densities (one three-dimensional array of pointers to MultiFabs per mesh refinement level)

  • dt[in] Time step for particle level

  • relative_time[in] Time at which to deposit J, relative to the time of the current positions of the particles. When different than 0, the particle position will be temporarily modified to match the time of the deposition.

Note

The current deposition is used both by PhysicalParticleContainer and LaserParticleContainer, so it is in the parent class WarpXParticleContainer.

Buffers

To reduce numerical artifacts at the boundary of a mesh-refinement patch, WarpX has an option to use buffers: When particles evolve on the fine level, they gather from the coarse level (e.g., Efield_cax, a copy of the aux data from the level below) if they are located on the fine level but fewer than WarpX::n_field_gather_buffer cells away from the coarse-patch boundary. Similarly, when particles evolve on the fine level, they deposit on the coarse level (e.g., Efield_cp) if they are located on the fine level but fewer than WarpX::n_current_deposition_buffer cells away from the coarse-patch boundary.

WarpX::gather_buffer_masks and WarpX::current_buffer_masks contain masks indicating whether a cell is in the interior of the fine-resolution patch or in the buffers. Particles are then partitioned depending on this mask in

void PhysicalParticleContainer::PartitionParticlesInBuffers(long &nfine_current, long &nfine_gather, long np, WarpXParIter &pti, int lev, amrex::iMultiFab const *current_masks, amrex::iMultiFab const *gather_masks)

Note

Buffers are complex!

Particle attributes

WarpX adds the following particle attributes by default to WarpX particles. These attributes are stored in Struct-of-Array (SoA) locations of the AMReX particle containers: one SoA for amrex::ParticleReal attributes, one SoA for int attributes and one SoA for a uint64_t global particle index per particle. The data structures for those are either pre-described at compile-time (CT) or runtime (RT).

In each entry below, the parentheses give the type (int/real), the storage location (SoA), and whether the attribute is pre-described at compile time (CT) or added at runtime (RT).

  • position_x/y/z (real, SoA, CT): Particle position.

  • weight (real, SoA, CT): Particle weight.

  • momentum_x/y/z (real, SoA, CT): Particle momentum.

  • id (amrex::Long, SoA, CT): CPU-local particle index where the particle was created. Stored in the first 40 bits of idcpu.

  • cpu (int, SoA, CT): CPU index where the particle was created. Stored in the last 24 bits of idcpu.

  • stepScraped (int, SoA, RT): PIC iteration of the last step before the particle hits the boundary. Added when there is particle-boundary interaction; saved in the boundary buffers.

  • deltaTimeScraped (real, SoA, RT): Difference between the time of stepScraped and the exact time when the particle hits the boundary. Added when there is particle-boundary interaction; saved in the boundary buffers.

  • n_x/y/z (real, SoA, RT): Components of the normal to the boundary at the position where the particle hits it. Added when there is particle-boundary interaction; saved in the boundary buffers.

  • ionizationLevel (int, SoA, RT): Ion ionization level. Added when ionization physics is used.

  • opticalDepthQSR (real, SoA, RT): QED: optical depth of the Quantum Synchrotron process. Added when PICSAR QED physics is used.

  • opticalDepthBW (real, SoA, RT): QED: optical depth of the Breit-Wheeler process. Added when PICSAR QED physics is used.

WarpX allows extra runtime attributes to be added to particle containers (through AddRealComp("attrname") or AddIntComp("attrname")). The attribute name can then be used to access the values of that attribute. For example, using a particle iterator pti to loop over the particles, the call pti.GetAttribs(particle_comps["attrname"]).dataPtr() returns a pointer to the values of the "attrname" attribute.
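
A hedged sketch based on the calls mentioned above ("new_attr" is a hypothetical attribute name):

// Hedged sketch; "new_attr" is a hypothetical attribute name.
// (1) When setting up the species (pc is a WarpXParticleContainer):
pc.AddRealComp("new_attr");

// (2) Later, inside a WarpXParIter loop (iterator pti), get a pointer to it:
amrex::ParticleReal* const new_attr_ptr =
    pti.GetAttribs(particle_comps["new_attr"]).dataPtr();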

User-defined integer or real attributes are initialized when particles are generated in AddPlasma(). The attribute is initialized with a required user-defined parser function. Please see the input options addIntegerAttributes and addRealAttributes for the user-facing documentation.

Commonly used runtime attributes are described below; they are all part of the SoA particle storage:

  • prev_x/y/z (real): The coordinates of the particles at the previous timestep. Default value: user-defined.

  • orig_x/y/z (real): The coordinates of the particles when they were created. Default value: user-defined.

A Python example that adds runtime options can be found in Examples/Tests/particle_data_python

Note

Only use _ to separate components of vectors!

Accelerator lattice

The files in this directory handle the accelerator lattice, i.e., a set of applied-field elements of various types and configurations. The lattice is laid out along the z-axis.

The AcceleratorLattice has the instances of the accelerator element types and handles the input of the data.

The LatticeElementFinder manages the application of the fields to the particles. It maintains index lookup tables that allow rapidly determining which elements the particles are in.

The classes for each element type are in the subdirectory LatticeElements.

Host and device classes

The LatticeElementFinder and each of the element types have two classes, one that lives on the host and one that can be trivially copied to the device. This dual structure is needed because of the complex data structures describing both the accelerator elements and the index lookup tables. The host level classes manage the data structures, reading in and setting up the data. The host classes copy the data to the device and maintain the pointers to that data on the device. The device level classes grab pointers to the appropriate data (on the device) needed when fetching the data for the particles.

External fields

The lattice fields are applied to the particles from the GetExternalEBField class. If a lattice is defined, the GetExternalEBField class gets the lattice element finder device level instance associated with the grid being operated on. The fields are applied from that instance, which calls the “get_field” method for each lattice element type that is defined for each particle.

Adding new element types

A number of places need to be touched when adding a new element type. The best method is to look for every place where the "quad" element is referenced and duplicate the code for the new element type. Changes should only be needed within the AcceleratorLattice directory.

Initialization

Note

Section almost empty!!

General simulation initialization

Regular simulation
Running in a boosted frame

Field initialization

Particle initialization

Diagnostics

Regular Diagnostics (plotfiles)

Note

Section empty!

Back-Transformed Diagnostics

Note

Section empty!

Moving Window

Note

Section empty!

QED

Quantum synchrotron

Note

Section empty!

Breit-Wheeler

Note

Section empty!

Schwinger process

If the code is compiled with QED and the user activates the Schwinger process in the input file, electron-positron pairs can be created in vacuum in the function MultiParticleContainer::doQEDSchwinger:

void MultiParticleContainer::doQEDSchwinger()

If the Schwinger process is activated, this function is called at every timestep in Evolve and is used to create Schwinger electron-positron pairs. Within this function, we loop over all cells to calculate the number of physical pairs created. If this number is greater than 0, we create a single particle per species in that cell, with a weight corresponding to the number of physical particles.

MultiParticleContainer::doQEDSchwinger in turn calls the function filterCreateTransformFromFAB:

template<int N, typename DstPC, typename DstTile, typename FAB, typename Index, typename CreateFunc1, typename CreateFunc2, typename TransFunc, amrex::EnableIf_t<std::is_integral<Index>::value, int> foo = 0> Index filterCreateTransformFromFAB(DstPC &pc1, DstPC &pc2, DstTile &dst1, DstTile &dst2, const amrex::Box box, const FAB *src_FAB, const Index *mask, const Index dst1_index, const Index dst2_index, CreateFunc1 &&create1, CreateFunc2 &&create2, TransFunc &&transform, const amrex::Geometry &geom_lev_zero) noexcept

template<int N, typename DstPC, typename DstTile, typename FABs, typename Index, typename FilterFunc, typename CreateFunc1, typename CreateFunc2, typename TransFunc> Index filterCreateTransformFromFAB(DstPC &pc1, DstPC &pc2, DstTile &dst1, DstTile &dst2, const amrex::Box box, const FABs &src_FABs, const Index dst1_index, const Index dst2_index, FilterFunc &&filter, CreateFunc1 &&create1, CreateFunc2 &&create2, TransFunc &&transform, const amrex::Geometry &geom_lev_zero) noexcept

filterCreateTransformFromFAB proceeds in three steps. In the filter phase, we loop on every cell and calculate the number of physical pairs created within the time step dt as a function of the electromagnetic field at the given cell position. This probabilistic calculation is done via a wrapper that calls the PICSAR library. In the create phase, the particles are created at the desired positions, currently at the cell nodes. In the transform phase, we assign a weight to the particles depending on the number of physical pairs created. At most one macroparticle is created per cell per timestep per species, with a weight corresponding to the total number of physical pairs created.

So far the Schwinger module requires using warpx.grid_type = collocated or algo.field_gathering = momentum-conserving (so that the auxiliary fields are calculated on the nodes) and is not compatible with either mesh refinement, RZ coordinates or single precision.

Portability

Note

Section empty!

Warning logger

The ⚠️ warning logger ⚠️ allows grouping the warning messages raised during the simulation, in order to display them together in a list (e.g., right after step 1 and at the end of the simulation).

General description

If no warning messages are raised, the warning list should look as follows:

**** WARNINGS ******************************************************************
* GLOBAL warning list  after  [ FIRST STEP ]
*
* No recorded warnings.
********************************************************************************

On the contrary, if warning messages are raised, the list should look as follows:

**** WARNINGS ******************************************************************
* GLOBAL warning list  after  [ FIRST STEP ]
*
* --> [!! ] [Species] [raised once]
*     Both 'electrons.charge' and electrons.species_type' are specified.
*     electrons.charge' will take precedence.
*     @ Raised by: ALL
*
* --> [!! ] [Species] [raised once]
*     Both 'electrons.mass' and electrons.species_type' are specified.
*     electrons.mass' will take precedence.
*     @ Raised by: ALL
*
********************************************************************************

Here, GLOBAL indicates that warning messages are gathered across all the MPI ranks (specifically after the FIRST STEP).

Each entry of the warning list follows this format:

* --> [PRIORITY] [TOPIC] [raised COUNTER]
*     MULTILINE MESSAGE
*     MULTILINE MESSAGE
*     @ Raised by: WHICH_RANKS

where:

  • [PRIORITY] can be [!  ] (low priority), [!! ] (medium priority) or [!!!] (high priority). It indicates the importance of the warning.

  • [TOPIC] indicates which part of the code is concerned by the warning (e.g., particles, laser, parallelization…)

  • MULTILINE MESSAGE is an arbitrary text message. It can span multiple-lines. Text is wrapped automatically.

  • COUNTER indicates the number of times the warning was raised across all the MPI ranks. This means that if we run WarpX with 2048 MPI ranks and each rank raises the same warning once, the displayed message will be [raised 2048 times]. Possible values are once, twice, XX times

  • WHICH_RANKS can be either ALL or a sequence of rank IDs. It is the list of the MPI ranks which have raised the warning message.

Entries are sorted first by priority (high priority first), then by topic (alphabetically) and finally by text message (alphabetically).

How to record a warning for later display

In the code, instead of using amrex::Warning to immediately print a warning message, the following method should be called:

ablastr::warn_manager::WMRecordWarning(
   "QED",
   "Using default value (2*me*c^2) for photon energy creation threshold",
   ablastr::warn_manager::WarnPriority::low);

In this example, QED is the topic, Using [...] is the warning message, and ablastr::warn_manager::WarnPriority::low is the priority. RecordWarning is not a collective call and should also be thread-safe (it can be called in OpenMP loops). In case the user wants to also print the warning messages immediately, the runtime parameter warpx.always_warn_immediately can be set to 1. The warning manager is a singleton class defined in Source/ablastr/warn_manager/WarnManager.H.

How to print the warning list

The warning list can be printed as follows:

amrex::Print() << ablastr::warn_manager::GetWMInstance().PrintGlobalWarnings("THE END");

where the string is a temporal marker that appears in the warning list. At the moment this is done right after step one and at the end of the simulation. Calling this method triggers several collective calls that allow merging all the warnings recorded by all the MPI ranks.

Implementation details

How warning messages are recorded

Warning messages are stored by each rank as a map associating each message with a counter. A message is defined by its priority, its topic and its text. Given two messages, if any of these components differ between the two, the messages are considered as different.

How the global warning list is generated

In order to generate the global warning list we follow the strategy outlined below.

  1. Each MPI rank has a map<Msg, counter>, associating each message with a counter that counts how many times the warning has been raised on that rank.

  2. When PrintGlobalWarnings is called, the MPI ranks send to the I/O rank the number of different warnings that they have observed. The I/O rank finds the rank with the most warnings and broadcasts 📢 this information back to all the others. This rank, referred to in the following as the gather rank, will lead 👑 the generation of the global warning list.

  3. The gather rank serializes its warning messages [📝,📝,📝,📝,📝…] into a byte array 📦 and broadcasts 📢 this array to all the other ranks.

  4. The other ranks unpack this byte array 📦, obtaining a list of messages [📝,📝,📝,📝,📝…]

  5. For each message seen by the gather rank , each rank prepares a vector containing the number of times it has seen that message (i.e., the counter in map<Msg, counter> if Msg is in the map): [1️⃣,0️⃣,1️⃣,4️⃣,0️⃣…]

  6. In addition, each rank prepares a vector containing the messages seen only by that rank, associated with the corresponding counter: [(📝,1️⃣), (📝,4️⃣),…]

  7. Each rank appends the second list to the first one and packs them into a byte array: [1️⃣,0️⃣,1️⃣,4️⃣,0️⃣…] [(📝,1️⃣), (📝,4️⃣),…] –> 📦

  8. Each rank sends 📨 this byte array to the gather rank, which puts them together in a large byte vector [📦,📦,📦,📦,📦…]

  9. The gather rank parses the byte array, adding the counters of the other ranks to its counters, adding new messages to the message list, and keeping track of which rank has generated which warning 📜

  10. If the gather rank is also the I/O rank, then we are done 🎉, since the rank has a list of messages, global counters and ranks lists [(📝,4️⃣,📜 ), (📝,1️⃣,📜 ),… ]

  11. If the gather rank is not the I/O rank, then it packs the list into a byte array and sends 📨 it to the I/O rank, which unpacks it: gather rank [(📝,4️⃣,📜 ), (📝,1️⃣,📜 ),… ] –> 📦 –> 📨 –> 📦 –> [(📝,4️⃣,📜 ), (📝,1️⃣,📜 ),… ] I/O rank

This procedure is described in more detail in these slides.

How to test the warning logger

In order to test the warning logger there is the possibility to inject “artificial” warnings with the inputfile. For instance, the following inputfile

#################################
####### GENERAL PARAMETERS ######
#################################
max_step = 10
amr.n_cell =  128 128
amr.max_grid_size = 64
amr.blocking_factor = 32
amr.max_level = 0
geometry.dims = 2
geometry.prob_lo     = -20.e-6   -20.e-6    # physical domain
geometry.prob_hi     =  20.e-6    20.e-6

#################################
####### Boundary condition ######
#################################
boundary.field_lo = periodic periodic
boundary.field_hi = periodic periodic

#################################
############ NUMERICS ###########
#################################
warpx.serialize_initial_conditions = 1
warpx.verbose = 1
warpx.cfl = 1.0
warpx.use_filter = 0

# Order of particle shape factors
algo.particle_shape = 1

#################################
######## DEBUG WARNINGS #########
#################################

warpx.test_warnings = w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 w11 w12 w13 w14 w15 w16 w17 w18 w19 w20 w21 w22

w1.topic    = "Priority Sort Test"
w1.msg      = "Test that priority is correctly sorted"
w1.priority = "low"
w1.all_involved = 1

w2.topic    = "Priority Sort Test"
w2.msg        = "Test that priority is correctly sorted"
w2.priority = "medium"
w2.all_involved = 1

w3.topic    = "Priority Sort Test"
w3.msg      = "Test that priority is correctly sorted"
w3.priority = "high"
w3.all_involved = 1

w4.topic    = "ZZA Topic sort Test"
w4.msg      = "Test that topic is correctly sorted"
w4.priority = "medium"
w4.all_involved = 1

w5.topic    = "ZZB Topic sort Test"
w5.msg      = "Test that topic is correctly sorted"
w5.priority = "medium"
w5.all_involved = 1

w6.topic    = "ZZC Topic sort Test"
w6.msg      = "Test that topic is correctly sorted"
w6.priority = "medium"
w6.all_involved = 1

w7.topic    = "Msg sort Test"
w7.msg      = "AAA Test that msg is correctly sorted"
w7.priority = "medium"
w7.all_involved = 1

w8.topic    = "Msg sort Test"
w8.msg      = "BBB Test that msg is correctly sorted"
w8.priority = "medium"
w8.all_involved = 1

w9.topic    = "Long line"
w9.msg      = "Test very long line: a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a"
w9.priority = "medium"
w9.all_involved = 1

w10.topic    = "Repeated warnings"
w10.msg      = "Test repeated warnings"
w10.priority = "high"
w10.all_involved = 1

w11.topic    = "Repeated warnings"
w11.msg      = "Test repeated warnings"
w11.priority = "high"
w11.all_involved = 1

w12.topic    = "Repeated warnings"
w12.msg      = "Test repeated warnings"
w12.priority = "high"
w12.all_involved = 1

w13.topic    = "Not all involved (0)"
w13.msg      = "Test warnings raised by a fraction of ranks"
w13.priority = "high"
w13.all_involved = 0
w13.who_involved = 0

w14.topic    = "Not all involved (0)"
w14.msg      = "Test warnings raised by a fraction of ranks"
w14.priority = "high"
w14.all_involved = 0
w14.who_involved = 0

w15.topic    = "Not all involved (1)"
w15.msg      = "Test warnings raised by a fraction of ranks"
w15.priority = "high"
w15.all_involved = 0
w15.who_involved = 1

w16.topic    = "Not all involved (1,2)"
w16.msg      = "Test warnings raised by a fraction of ranks"
w16.priority = "high"
w16.all_involved = 0
w16.who_involved = 1 2

w17.topic    = "Different counters"
w17.msg      = "Test that different counters are correctly summed"
w17.priority = "low"
w17.all_involved = 1

w18.topic    = "Different counters"
w18.msg      = "Test that different counters are correctly summed"
w18.priority = "low"
w18.all_involved = 1

w19.topic    = "Different counters"
w19.msg      = "Test that different counters are correctly summed"
w19.priority = "low"
w19.all_involved = 0
w19.who_involved = 0

w20.topic    = "Different counters B"
w20.msg      = "Test that different counters are correctly summed"
w20.priority = "low"
w20.all_involved = 1

w21.topic    = "Different counters B"
w21.msg      = "Test that different counters are correctly summed"
w21.priority = "low"
w21.all_involved = 1

w22.topic    = "Different counters B"
w22.msg      = "Test that different counters are correctly summed"
w22.priority = "low"
w22.all_involved = 0
w22.who_involved = 1

should generate the following warning list (if run on 4 MPI ranks):

**** WARNINGS ******************************************************************
* GLOBAL warning list  after  [ THE END ]
*
* --> [!!!] [Not all involved (0)] [raised twice]
*     Test warnings raised by a fraction of ranks
*     @ Raised by: 0
*
* --> [!!!] [Not all involved (1)] [raised once]
*     Test warnings raised by a fraction of ranks
*     @ Raised by: 1
*
* --> [!!!] [Not all involved (1,2)] [raised twice]
*     Test warnings raised by a fraction of ranks
*     @ Raised by: 1 2
*
* --> [!!!] [Priority Sort Test] [raised 4 times]
*     Test that priority is correctly sorted
*     @ Raised by: ALL
*
* --> [!!!] [Repeated warnings] [raised 12 times]
*     Test repeated warnings
*     @ Raised by: ALL
*
* --> [!! ] [Long line] [raised 4 times]
*     Test very long line: a a a a a a a a a a a a a a a a a a a a a a a a a a a
*     a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
*     @ Raised by: ALL
*
* --> [!! ] [Msg sort Test] [raised 4 times]
*     AAA Test that msg is correctly sorted
*     @ Raised by: ALL
*
* --> [!! ] [Msg sort Test] [raised 4 times]
*     BBB Test that msg is correctly sorted
*     @ Raised by: ALL
*
* --> [!! ] [Priority Sort Test] [raised 4 times]
*     Test that priority is correctly sorted
*     @ Raised by: ALL
*
* --> [!! ] [ZZA Topic sort Test] [raised 4 times]
*     Test that topic is correctly sorted
*     @ Raised by: ALL
*
* --> [!! ] [ZZB Topic sort Test] [raised 4 times]
*     Test that topic is correctly sorted
*     @ Raised by: ALL
*
* --> [!! ] [ZZC Topic sort Test] [raised 4 times]
*     Test that topic is correctly sorted
*     @ Raised by: ALL
*
* --> [!  ] [Different counters] [raised 9 times]
*     Test that different counters are correctly summed
*     @ Raised by: ALL
*
* --> [!  ] [Different counters B] [raised 9 times]
*     Test that different counters are correctly summed
*     @ Raised by: ALL
*
* --> [!  ] [Priority Sort Test] [raised 4 times]
*     Test that priority is correctly sorted
*     @ Raised by: ALL
*
********************************************************************************

Processing PICMI Input Options

The input parameters in a WarpX PICMI file are processed in two layers. The first layer is the Python level API, which mirrors the C++ application input structure; the second is the translation from the PICMI input to the equivalent app (AMReX) input file parameters.

The two layers are described below.

Input parameters

In a C++ input file, each of the parameters has a prefix, for example geometry in geometry.prob_lo. For each of these prefixes, an instance of a Python class is created and the parameters saved as attributes. This construction is used since the lines in the input file look very much like a Python assignment statement, assigning attributes of class instances, for example geometry.dims = 3.

Many of the prefix instances are predefined, for instance geometry is created in the file Python/pywarpx/Geometry.py. In that case, geometry is an instance of the class Bucket (specified in Python/pywarpx/Bucket.py), the general class for prefixes. It is called Bucket since its main purpose is a place to hold attributes. Most of the instances are instances of the Bucket class. There are exceptions, such as constants and diagnostics where extra processing is needed.

There can also be instances created as needed. For example, for the particle species, an instance is created for each species listed in particles.species_names. This gives a place to hold the parameters for the species, e.g., electrons.mass.

The instances are then used to generate the input parameters. Each instance can generate a list of strings, one for each attribute. This happens in the Bucket.attrlist method. The strings will be the lines as in an input file, for example "electrons.mass = m_e". The lists for each instance are gathered into one long list in the warpx instance (of the class WarpX defined in Python/pywarpx/WarpX.py). This instance has access to all of the predefined instances as well as lists of the generated instances.
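
To make the mechanism concrete, here is a minimal, self-contained sketch of how such a prefix instance could turn its attributes into input-file strings. The class below is a simplified stand-in for illustration only, not the actual pywarpx Bucket implementation.

class PrefixBucket:
    """Simplified stand-in for a prefix instance (illustration only)."""
    def __init__(self, prefix):
        object.__setattr__(self, 'prefix', prefix)
        object.__setattr__(self, 'attrs', {})

    def __setattr__(self, name, value):
        # attribute assignments mimic input-file lines, e.g. geometry.dims = 3
        self.attrs[name] = value

    def attrlist(self):
        # one "prefix.name = value" string per attribute, as in an input file
        return [f"{self.prefix}.{name} = {value}" for name, value in self.attrs.items()]

geometry = PrefixBucket('geometry')
geometry.dims = 3
print(geometry.attrlist())  # ['geometry.dims = 3']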

In both of the ways that WarpX can be run with Python, that list of input parameter strings will be generated. This is done in the routine WarpX.create_argv_list in Python/pywarpx/WarpX.py. If WarpX is run directly in Python, that list is sent to the amrex_init routine as the argv. This is as if all of the input parameters had been specified on the command line. If Python is only used as a preprocessor to generate the input file, the list contains the strings that are written out to create the input file.

There are two input parameters that do not have prefixes, max_step and stop_time. These are handled via keyword arguments in the WarpX.create_argv_list method.

Conversion from PICMI

In the PICMI implementation, defined in Python/pywarpx/picmi.py, for each PICMI class, a class is defined that inherits from the PICMI class and processes the input. Each of these WarpX classes has two methods, init and initialize_inputs. The init method is called during the creation of the class instances in the user’s PICMI input file. This is part of the standard: each of the PICMI classes calls the method handle_init from its constructor (__init__). The main purpose is to process application-specific keyword arguments (those that start with warpx_, for example), which are passed into the init methods. In the WarpX implementation, each of the WarpX-specific arguments is saved in init as an attribute of the implementation class instance.

It is in the second method, initialize_inputs, where the PICMI input parameters are translated into WarpX input parameters. This method is called later during the initialization. The prefix instances described above are all accessible in the implementation classes (via the pywarpx module). For each PICMI input quantity, the appropriate WarpX input parameters are set in the prefix classes. As needed, for example in the Species class, the dynamic prefix instances are created and the attributes set.
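
As a rough illustration of this two-step pattern, consider the following hypothetical, heavily simplified sketch; the class names, the keyword argument, and the way the prefix instance is obtained are stand-ins, and the real classes in Python/pywarpx/picmi.py differ in detail.

class PICMI_Species:                       # stand-in for a picmistandard base class
    def __init__(self, name=None, mass=None, **kw):
        self.name = name
        self.mass = mass
        self.handle_init(kw)               # the standard forwards extra kwargs to init()

    def handle_init(self, kw):
        self.init(kw)

class Species(PICMI_Species):              # stand-in for the WarpX implementation class
    def init(self, kw):
        # save WarpX-specific (warpx_*) keyword arguments as attributes
        self.do_not_push = kw.pop('warpx_do_not_push', None)

    def initialize_inputs(self, species_bucket):
        # called later during initialization: translate PICMI quantities into
        # WarpX input parameters by filling the attributes of a prefix instance
        species_bucket.mass = self.mass
        species_bucket.do_not_push = self.do_not_push

electrons = Species(name='electrons', mass='m_e', warpx_do_not_push=1)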

Simulation class

The Simulation class ties it all together. In a PICMI input file, all information is passed into the Simulation class instance, either through its constructor or through its add_ methods. Its initialize_inputs routine initializes the input parameters it handles and also calls the initialize_inputs methods of all of the PICMI class instances that have been passed in, such as the field solver, the particle species, and the diagnostics. As with the other PICMI classes, the init routine is called by the constructor, and initialize_inputs is called during initialization. The initialization happens when either the write_input_file method or the step method is called. After initialize_inputs has finished, the attributes of the prefix instances have been filled in, and the process described above takes place: the prefix instances are looped over to generate the list of input parameter strings, which is either written out to a file or passed in as argv. The two parameters that do not have a prefix, max_step and stop_time, are passed along as keyword arguments (handled in WarpX.create_argv_list, as described above).
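
Putting it together, a PICMI input file typically looks like the following sketch (class and argument names follow the PICMI standard, but the exact set of required arguments may vary between versions):

from pywarpx import picmi

grid = picmi.Cartesian3DGrid(
    number_of_cells=[32, 32, 32],
    lower_bound=[-1., -1., -1.], upper_bound=[1., 1., 1.],
    lower_boundary_conditions=['periodic']*3,
    upper_boundary_conditions=['periodic']*3)

solver = picmi.ElectromagneticSolver(grid=grid, cfl=0.99)

sim = picmi.Simulation(solver=solver, max_steps=100)

# Python as a preprocessor: triggers initialize_inputs() and writes the
# generated input-parameter strings to a file ...
sim.write_input_file(file_name='inputs_from_picmi')

# ... or run WarpX directly, passing the same strings as argv:
# sim.step()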

Python runtime interface

The Python interface provides low-level and high-level access to much of the data in WarpX. With the low-level access, a user has direct access to the underlying memory contained in the MultiFabs and in the particle arrays. The high-level access provides a more user-friendly interface.

High level interface

There are two Python modules that provide convenient access to the fields and the particles.

Fields

The fields module provides wrappers around most of the MultiFabs that are defined in the WarpX class. For a list of all of the available wrappers, see the file Python/pywarpx/fields.py. For each MultiFab, there is a function that will return a wrapper around the data. For instance, the function ExWrapper returns a wrapper around the x component of the MultiFab vector Efield_aux.

from pywarpx import fields
Ex = fields.ExWrapper()

By default, this wraps the MultiFab for level 0. A level argument can be passed to wrap other refinement levels. By default, the wrapper only includes the valid cells. To include the ghost cells, set the argument include_ghosts=True.
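
For example, following the arguments described above, one can wrap a different refinement level and include the guard cells:

from pywarpx import fields

# wrap Ex on mesh-refinement level 1, including the ghost cells
Ex = fields.ExWrapper(level=1, include_ghosts=True)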

The wrapper provides access to the data via global indexing. Using standard array indexing (with exceptions) with square brackets, the data can be accessed using indices that are relative to the full domain (across the MultiFab and across processors). With multiple processors, the result is broadcast to all processors. This example will return the Bz field at all points along x at the specified y and z indices.

from pywarpx import fields
Bz = fields.BzWrapper()
Bz_along_x = Bz[:,5,6]

The same global indexing can be done to set values. This example will set the values over a range in y and z at the specified x. The data will be scattered appropriately to the underlying FABs.

from pywarpx import fields
Jy = fields.JyFPWrapper()
Jy[5,6:20,8:30] = 7.

The code does error checking to ensure that the specified indices are within the bounds of the global domain. Note that negative indices are handled differently than with numpy arrays, because of the possibility of having ghost cells. With ghost cells included, the lower ghost cells are accessed using negative indices (since 0 is the index of the lower bound of the valid cells). Without ghost cells, a negative index always raises an out-of-bounds error.

Under the covers, the wrapper object has a list of numpy arrays that have pointers to the underlying data, one array for each FAB. When data is being fetched, it loops over that list to gather the data. The result is then gathered among all processors. Note that the result is not writeable, in the sense that changing it won’t change the underlying data since it is a copy. When the data is set, using the global indexing, a similar process is done where the processors loop over their FABs and set the data at the appropriate indices.
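
The following standalone sketch illustrates the get path described above along a single direction, assuming each local array covers the global cells [lo, hi] of one FAB. This is illustrative only; the actual wrapper additionally combines the result across MPI ranks and handles multiple dimensions and ghost cells.

import numpy as np

def gather_global_slice(fab_arrays, lovects, hivects, start, stop):
    """Assemble the global cells [start, stop) from per-FAB arrays."""
    out = np.zeros(stop - start)
    for arr, lo, hi in zip(fab_arrays, lovects, hivects):
        s = max(start, lo)
        e = min(stop, hi + 1)
        if s < e:
            # copy the overlapping part of this FAB into the global result
            out[s - start:e - start] = arr[s - lo:e - lo]
    return out

# two FABs covering cells 0-4 and 5-9 of a 10-cell domain
fabs = [np.arange(0, 5, dtype=float), np.arange(5, 10, dtype=float)]
print(gather_global_slice(fabs, [0, 5], [4, 9], 2, 8))  # [2. 3. 4. 5. 6. 7.]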

The wrappers are always up to date since whenever an access is done (either a get or a set), the list of numpy arrays for the FABs is regenerated. In this case, efficiency is sacrificed for consistency.

If needed, the list of numpy arrays associated with the FABs can be obtained using the wrapper method _getfields. Additionally, the methods _getlovects and _gethivects return the lower and upper index bounds of each of the arrays.
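
For instance, to inspect the local pieces directly (a small sketch using the helper methods named above; their exact return layout may change since they are internal):

from pywarpx import fields

Ex = fields.ExWrapper()

fab_arrays = Ex._getfields()   # list of numpy arrays, one per local FAB
lovects    = Ex._getlovects()  # lower index bounds of each array
hivects    = Ex._gethivects()  # upper index bounds of each array

print(f"{len(fab_arrays)} local FAB(s) wrapped on this rank")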

Particles

This is still in development.

Tip

A tutorial-style overview of the code structure can also be found in a developer presentation from 03/2020. It contains information about the code structure, a step-by-step description of what happens in a simulation (initialization and iterations) as well as slides on topics relevant to WarpX development.

Information in the following pages is generally more up-to-date, but the slides above might still be useful.

C++ Objects & Functions

We generate the documentation of C++ objects and functions from our C++ source code by adding Doxygen strings.

This documentation dynamically links to objects described in our dependencies.

GNUmake Build System (Legacy)

CMake is our primary build system. In this section, we describe our legacy build scripts; do not use them unless you have used them before.

WarpX is built on AMReX, which also provides support for a Linux-centric set of build scripts implemented in GNUmake. Since we sometimes need to move fast and test highly experimental compilers and Unix derivatives on core components of WarpX, this set of build scripts is used by some of our experienced developers.

Warning

In the long term, these scripts do not scale to the full feature set of WarpX and its dependencies. Please see the CMake-based developer section instead.

This page describes the most basic build with GNUmake files and points to instructions for more advanced builds.

Downloading the source code

Clone the source codes of WarpX, and its dependencies AMReX and PICSAR into one single directory (e.g. warpx_directory):

mkdir warpx_directory
cd warpx_directory
git clone https://github.com/ECP-WarpX/WarpX.git
git clone https://github.com/ECP-WarpX/picsar.git
git clone https://github.com/ECP-WarpX/warpx-data.git
git clone https://github.com/AMReX-Codes/amrex.git

Note

The warpx-data repository is currently only needed for MCC cross-sections.

Basic compilation

WarpX requires a C/C++ compiler (e.g., GNU, LLVM or Intel) and an MPI implementation (e.g., OpenMPI or MPICH). Start a GNUmake build by cd-ing into the directory WarpX and typing

make -j 4

This will generate an executable file in the Bin directory.

Compile-time vs. run-time options

WarpX has multiple compile-time and run-time options. The compilation options are set in the file GNUmakefile. The default options correspond to an optimized code for 3D geometry. The main compile-time options are:

  • DIM=3 or 2: Geometry of the simulation (note that running an executable compiled for 3D with a 2D input file will crash).

  • DEBUG=FALSE or TRUE: Compiling in DEBUG mode can help tremendously during code development.

  • USE_PSATD=FALSE or TRUE: Compile the Pseudo-Spectral Analytical Time Domain Maxwell solver. Requires an FFT library.

  • USE_RZ=FALSE or TRUE: Compile for 2D axisymmetric geometry.

  • COMP=gcc or intel: Compiler.

  • USE_MPI=TRUE or FALSE: Whether to compile with MPI support.

  • USE_OMP=TRUE or FALSE: Whether to compile with OpenMP support.

  • USE_GPU=TRUE or FALSE: Whether to compile for Nvidia GPUs (requires CUDA).

  • USE_OPENPMD=TRUE or FALSE: Whether to support openPMD for I/O (requires openPMD-api).

  • MPI_THREAD_MULTIPLE=TRUE or FALSE: Whether to initialize MPI with thread multiple support. Required to use asynchronous IO with more than amrex.async_out_nfiles (by default, 64) MPI tasks. Please see data formats for more information.

  • PRECISION=FLOAT USE_SINGLE_PRECISION_PARTICLES=TRUE: Switch from default double precision to single precision (experimental).

For a description of these different options, see the corresponding page in the AMReX documentation.

Alternatively, instead of modifying the file GNUmakefile, you can directly pass the options on the command line; for instance:

make -j 4 USE_OMP=FALSE

In order to clean a previously compiled version (typically useful for troubleshooting, if you encounter unexpected compilation errors):

make realclean

before re-attempting compilation.

Advanced GNUmake instructions

Building WarpX with support for openPMD output

WarpX can dump data in the openPMD format. This feature currently requires a parallel version of HDF5 to be installed; therefore, we recommend using Spack to facilitate the installation.

More specifically, we recommend that you try installing the openPMD-api library 0.15.1 or newer using spack (first section below). If this fails, a back-up solution is to install parallel HDF5 with spack, and then install the openPMD-api library from source.

In order to install spack, you can simply do:

git clone https://github.com/spack/spack.git
export SPACK_ROOT=$PWD/spack
. $SPACK_ROOT/share/spack/setup-env.sh

You may want to auto-activate spack when you open a new terminal by adding this to your $HOME/.bashrc file:

echo -e "# activate spack package manager\n. ${SPACK_ROOT}/share/spack/setup-env.sh" >> $HOME/.bashrc

WarpX Development Environment with Spack

Create and activate a Spack environment with all software needed to build WarpX

spack env create warpx-dev    # you do this once
spack env activate warpx-dev
spack add gmake
spack add mpi
spack add openpmd-api
spack add pkg-config
spack install

This will download and compile all dependencies.

Whenever you need this development environment in the future, just repeat the quick spack env activate warpx-dev step. For example, we can now compile WarpX by cd-ing into the WarpX folder and typing:

spack env activate warpx-dev
make -j 4 USE_OPENPMD=TRUE

You will also need to load the same spack environment when running WarpX, for instance:

spack env activate warpx-dev
mpirun -np 4 ./warpx.exe inputs

You can check which Spack environments exist and if one is still active with

spack env list  # already created environments
spack env st    # is an environment active?

Installing openPMD-api from source

You can also build openPMD-api from source, e.g. to build against the module environment of a supercomputer cluster.

First, load the appropriate modules of the cluster to support the openPMD-api dependencies. You can find the required and optional dependencies here.

You usually just need a C++ compiler, CMake, and one or more file backend libraries, such as HDF5 and/or ADIOS2.

If optional dependencies are installed in non-system paths, one needs to hint their installation location with an environment variable during the build phase:

# optional: only if you manually installed HDF5 and/or ADIOS2 in custom directories
export HDF5_ROOT=$HOME/path_to_installed_software/hdf5-1.12.0/
export ADIOS2_ROOT=$HOME/path_to_installed_software/adios2-2.7.1/

Then, in the $HOME/warpx_directory/, download and build openPMD-api:

git clone https://github.com/openPMD/openPMD-api.git
mkdir openPMD-api-build
cd openPMD-api-build
cmake ../openPMD-api -DopenPMD_USE_PYTHON=OFF -DCMAKE_INSTALL_PREFIX=$HOME/warpx_directory/openPMD-install/ -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_RPATH='$ORIGIN'
cmake --build . --target install

Finally, compile WarpX:

cd ../WarpX
# Note that on some systems, /lib might need to be replaced with /lib64.
export PKG_CONFIG_PATH=$HOME/warpx_directory/openPMD-install/lib/pkgconfig:$PKG_CONFIG_PATH
export CMAKE_PREFIX_PATH=$HOME/warpx_directory/openPMD-install:$CMAKE_PREFIX_PATH

make -j 4 USE_OPENPMD=TRUE

Note

If you compile with CMake, all you need to add is the -DWarpX_OPENPMD=ON option (on by default), and we will download and build openPMD-api on-the-fly.

When running WarpX, we will recall where you installed openPMD-api via RPATHs, so you just need to load the same module environment as used for building (same MPI, HDF5, ADIOS2, for instance).

# module load ...  (compiler, MPI, HDF5, ADIOS2, ...)

mpirun -np 4 ./warpx.exe inputs

Building the spectral solver

By default, the code is compiled with a finite-difference (FDTD) Maxwell solver. In order to run the code with a spectral solver, you need to:

  • Install (or load) an MPI-enabled version of FFTW. For instance, for Debian, this can be done with

    apt-get install libfftw3-dev libfftw3-mpi-dev
    
  • Set the environment variable FFTW_HOME to the path for FFTW. For instance, for Debian, this is done with

    export FFTW_HOME=/usr/
    
  • Set USE_PSATD=TRUE when compiling:

    make -j 4 USE_PSATD=TRUE
    

See Building WarpX to use RZ geometry for using the spectral solver with USE_RZ. Additional steps are needed. PSATD is compatible with single precision, but please note that, on CPU, FFTW needs to be compiled with option --enable-float.

Building WarpX to use RZ geometry

WarpX can be built to run with RZ geometry. Both an FDTD solver (the default) and a PSATD solver are available. Both solvers allow multiple azimuthal modes.

To select RZ geometry, set the flag USE_RZ = TRUE when compiling:

make -j 4 USE_RZ=TRUE

Note that this sets DIM=2, which is required with USE_RZ=TRUE. The executable produced will have “RZ” as a suffix.

RZ geometry with spectral solver

Additional steps are needed to build the spectral solver. Some of the steps are the same as for the Cartesian spectral solver, namely setting up the FFTW package and setting USE_PSATD=TRUE.

  • Install (or load) an MPI-enabled version of FFTW. For instance, for Debian, this can be done with

    apt-get install libfftw3-dev libfftw3-mpi-dev
    
  • Set the environment variable FFTW_HOME to the path for FFTW. For instance, for Debian, this is done with

    export FFTW_HOME=/usr/
    
  • Download and build the blaspp and lapackpp packages. These can be obtained from GitHub.

    git clone https://github.com/icl-utk-edu/blaspp.git
    git clone https://github.com/icl-utk-edu/lapackpp.git
    

    The two packages can be built in multiple ways. A recommended method is to follow the cmake instructions provided in the INSTALL.md that comes with the packages. They can also be installed using spack.

  • Set the environment variables BLASPP_HOME and LAPACKPP_HOME to the locations where the libraries of the packages were installed. For example, using bash:

    export BLASPP_HOME=/location/of/installation/blaspp
    export LAPACKPP_HOME=/location/of/installation/lapackpp
    
  • In some cases, the BLAS and LAPACK libraries need to be specified. If needed, this can be done by setting the BLAS_LIB and LAPACK_LIB environment variables appropriately. For example, using bash:

    export BLAS_LIB=-lblas
    
  • Set USE_PSATD=TRUE when compiling:

    make -j 4 USE_RZ=TRUE USE_PSATD=TRUE
    

Building WarpX with GPU support (Linux only)

Warning

In order to build WarpX on a specific GPU cluster (e.g. Summit), look for the corresponding specific instructions, instead of those on this page.

In order to build WarpX with GPU support, make sure that you have cuda and mpich installed on your system. (Compiling with openmpi currently fails.) Then compile WarpX with the option USE_GPU=TRUE, e.g.

make -j 4 USE_GPU=TRUE

Installing WarpX as a Python package

A full Python installation of WarpX can be done, which includes a build of all of the C++ code, or a pure Python version can be made which only installs the Python scripts. WarpX requires Python version 3.8 or newer.

For a full Python installation of WarpX

WarpX’ Python bindings depend on numpy, periodictable, picmistandard, and mpi4py.

Type

make -j 4 USE_PYTHON_MAIN=TRUE

or edit the GNUmakefile and set USE_PYTHON_MAIN=TRUE, and type

make -j 4

Additional compile time options can be specified as needed. This will compile the code, and install the Python bindings and the Python scripts as a package (named pywarpx) in your standard Python installation (i.e. in your site-packages directory).

If you do not have write permission to the default Python installation (e.g. typical on computer clusters), there are two options. The recommended option is to use a virtual environment, which provides the most flexibility and robustness.

Alternatively, add the --user install option to have WarpX installed elsewhere.

make -j 4 PYINSTALLOPTIONS=--user

With --user, the default location will be in your home directory, ~/.local, or the location defined by the environment variable PYTHONUSERBASE.

In HPC environments, it is often recommended to install codes in scratch or work space which typically have faster disk access.

The different dimensioned versions of WarpX, 3D, 2D, and RZ, can coexist in the Python installation. The appropriate one will be imported depending on the input file. Note, however, that other options will overwrite one another; for example, compiling with DEBUG=TRUE will replace the version compiled with DEBUG=FALSE.

For a pure Python installation

This avoids the compilation of the C++ code and is recommended when only using the Python input files as preprocessors. This installation depends on numpy, periodictable, and picmistandard.

Go into the Python subdirectory and run

python setup.py install

This installs the Python scripts as a package (named pywarpx) in your standard Python installation (i.e. in your site-packages directory). If you do not have write permission to the default Python installation (e.g. typical on computer clusters), there are two options. The recommended option is to use a virtual environment, which provides the most flexibility and robustness.

Alternatively, add the --user install option to have WarpX installed elsewhere.

python setup.py install --user

With --user, the default location will be in your home directory, ~/.local, or the location defined by the environment variable PYTHONUSERBASE.

Building WarpX with Spack

As mentioned in the install section, WarpX can be installed using Spack. From the Spack web page: “Spack is a package management tool designed to support multiple versions and configurations of software on a wide variety of platforms and environments.”

Note

Quick-start hint for macOS users: Before getting started with Spack, please check what you manually installed in /usr/local. If you find entries in bin/, lib/ et al. that look like you manually installed MPI, HDF5 or other software at some point, then remove those files first.

If you find software such as MPI in the same directories shown as symbolic links, then it is likely you installed software with brew before. Run brew unlink … on such packages first to avoid software incompatibilities.

Spack is available from GitHub. Spack only needs to be cloned and can be used right away; there are no installation steps. You can add binary caches for faster builds:

spack mirror add rolling https://binaries.spack.io/develop
spack buildcache keys --install --trust

Do not miss out on the official Spack tutorial if you are new to Spack.

The spack command, spack/bin/spack, can be used directly or spack/bin can be added to your PATH environment variable.

WarpX is built with the single command

spack install warpx

This will build the 3-D version of WarpX using the development branch. At the very end of the output of the build sequence, Spack tells you where the WarpX executable has been placed. Alternatively, spack load warpx can be called, which will put the executable in your PATH environment variable.

WarpX can be built in several variants, see

spack info warpx
spack info py-warpx

for all available options.

For example

spack install warpx dims=2 build_type=Debug

will build the 2-D version and also turn on debugging.

See spack help --spec for all syntax details. Also, please consult the basic usage section of the Spack package manager for an extended introduction to Spack.

The Python version of WarpX is available through the py-warpx package.

Workflows

Profiling the Code

Profiling allows us to find the bottlenecks of the code as it is currently implemented. Bottlenecks are the parts of the code that may delay the simulation, making it more computationally expensive. Once found, we can update the related code sections and improve their efficiency. Profiling tools can also be used to check how load balanced the simulation is, i.e. if the work is well distributed across all MPI ranks used. Load balancing can be activated in WarpX by setting input parameters, see the parallelization input parameter section.

AMReX’s Tiny Profiler

By default, WarpX uses the AMReX baseline tool, the TINYPROFILER, to evaluate the time information for different parts of the code (functions) across the different MPI ranks. The results (timers) are stored in four tables in the standard output (stdout), located below the simulation step information and above the warnings regarding unused input file parameters (if there were any).

The timers are displayed in tables for which the columns correspond to:

  • name of the function

  • number of times it is called in total

  • minimum of time spent exclusively/inclusively in it, between all ranks

  • average of time, between all ranks

  • maximum time, between all ranks

  • maximum percentage of time spent, across all ranks

If the simulation is well load balanced, the minimum, average and maximum times should be identical.

The top two tables refer to the complete simulation information. The bottom two are related to the Evolve() section of the code (where each time step is computed).

Each set of two tables shows the exclusive (top) and inclusive (bottom) timers, depending on whether the time spent in nested sections of the code is included.

Note

When creating performance-related issues on the WarpX GitHub repo, please include Tiny Profiler tables (besides the usual issue description, input file and submission script), or (even better) the whole standard output.

For more detailed information please visit the AMReX profiling documentation. There is a script located here that parses the Tiny Profiler output and generates a JSON file that can be used with Hatchet in order to analyze performance.

AMReX’s Full Profiler

The Tiny Profiler provides a summary across all MPI ranks. However, when analyzing load-balancing, it can be useful to have more detailed information about the behavior of each individual MPI rank. The workflow for doing so is the following:

  • Compile WarpX with full profiler support:

    cmake -S . -B build -DAMReX_BASE_PROFILE=YES -DAMReX_TRACE_PROFILE=YES  -DAMReX_COMM_PROFILE=YES -DAMReX_TINY_PROFILE=OFF
    cmake --build build -j 4
    

    Warning

    Please note that the AMReX build options for AMReX_TINY_PROFILE (our default: ON) and full profiling traces via AMReX_BASE_PROFILE are mutually exclusive. Further tracing options are sub-options of AMReX_BASE_PROFILE.

    To turn on the tiny profiler again, remove the build directory or turn off AMReX_BASE_PROFILE again:

    cmake -S . -B build -DAMReX_BASE_PROFILE=OFF -DAMReX_TINY_PROFILE=ON
    
  • Run the simulation to be profiled. Note that the WarpX executable will create a new folder bl_prof, which contains the profiling data.

    Note

    When using the full profiler, it is usually useful to profile only a few PIC iterations (e.g. 10-20 PIC iterations), in order to improve readability. If the interesting PIC iterations occur only late in a simulation, you can run the first part of the simulation without profiling, then create a checkpoint, and then restart the simulation for 10-20 steps with the full profiler on.

Note

The next steps can be done on a local computer (even if the simulation itself ran on an HPC cluster). In this case, simply copy the folder bl_prof to your local computer.

  • In order to visualize the profiling data, install amrvis using spack:

    spack install amrvis dims=2 +profiling
    
  • Then create a timeline database from the bl_prof data and open it:

    <amrvis-executable> -timelinepf bl_prof/
    <amrvis-executable> pltTimeline/
    

    In the above, <amrvis-executable> should be replaced by the actual name of your amrvis executable, which can be found by typing amrvis in a terminal and then using Tab completion.

  • This will pop up a window with the timeline. Here are a few guidelines to navigate it:
    • Use the horizontal scroller to find the area where the 10-20 PIC steps occur.

    • In order to zoom in on an area, you can drag and drop with the mouse, and then hit Ctrl-S on the keyboard.

    • You can directly click on the timeline to see which actual MPI call is being performed. (Note that the colorbar can be misleading.)

Nvidia Nsight-Systems

Vendor homepage and product manual.

Nsight-Systems provides system level profiling data, including CPU and GPU interactions. It runs quickly, and provides a convenient visualization of profiling results including NVTX timers.

Perlmutter Example

Example of how to create traces on a multi-GPU system that uses the Slurm scheduler (e.g., NERSC’s Perlmutter system). You can either run this on an interactive node or use the Slurm batch script header documented here.

# GPU-aware MPI
export MPICH_GPU_SUPPORT_ENABLED=1
# 1 OpenMP thread
export OMP_NUM_THREADS=1

export TMPDIR="$PWD/tmp"
rm -rf ${TMPDIR} profiling*
mkdir -p ${TMPDIR}

# record
srun --ntasks=4 --gpus=4 --cpu-bind=cores \
    nsys profile -f true               \
      -o profiling_%q{SLURM_TASK_PID}     \
      -t mpi,cuda,nvtx,osrt,openmp        \
      --mpi-impl=mpich                    \
    ./warpx.3d.MPI.CUDA.DP.QED            \
      inputs_3d                           \
        warpx.numprocs=1 1 4 amr.n_cell=512 512 2048 max_step=10

Note

If everything went well, you will obtain as many output files named profiling_<number>.nsys-rep as there are active MPI ranks. Each MPI rank’s performance trace can be analyzed with the Nsight Systems graphical user interface (GUI). In WarpX, every MPI rank is associated with one GPU, and each rank creates one trace file.

Warning

The last line of the sbatch file has to match the data of your input files.

Summit Example

Example of how to create traces on a multi-GPU system that uses the jsrun scheduler (e.g., OLCF’s Summit system):

# nsys: remove old traces
rm -rf profiling* tmp-traces
# nsys: a location where we can write temporary nsys files to
export TMPDIR=$PWD/tmp-traces
mkdir -p $TMPDIR
# WarpX: one OpenMP thread per MPI rank
export OMP_NUM_THREADS=1

# record
jsrun -n 4 -a 1 -g 1 -c 7 --bind=packed:$OMP_NUM_THREADS \
    nsys profile -f true \
      -o profiling_%p \
      -t mpi,cuda,nvtx,osrt,openmp   \
      --mpi-impl=openmpi             \
    ./warpx.3d.MPI.CUDA.DP.QED inputs_3d \
      warpx.numprocs=1 1 4 amr.n_cell=512 512 2048 max_step=10

Warning

Sep 10th, 2021 (OLCFHELP-3580): The Nsight-Systems (nsys) version installed on Summit does not record details of GPU kernels. This has been reported to Nvidia and OLCF.

Details

In these examples, the individual lines for recording a trace profile are:

  • srun: execute multi-GPU runs with srun (Slurm’s mpiexec wrapper), here for four GPUs

  • -f true overwrite previously written trace profiles

  • -o: record one profile file per MPI rank (per GPU); if you run mpiexec/mpirun with OpenMPI directly, replace SLURM_TASK_PID with OMPI_COMM_WORLD_RANK

  • -t: select a couple of APIs to trace

  • --mpi-impl: optional, hints the MPI flavor

  • ./warpx...: select the WarpX executable and a good inputs file

  • warpx.numprocs=...: make the run short, reasonably small, and run only a few steps

Now open the created trace files (per rank) in the Nsight-Systems GUI. This can be done on a system other than the one that recorded the traces. For example, if you record on a cluster and open the analysis GUI on your laptop, make sure that the Nsight-Systems versions on the remote and local systems match.

Nvidia Nsight-Compute

Vendor homepage and product manual.

Nsight-Compute captures fine-grained information at the kernel level concerning resource utilization. By default, it collects a lot of data and runs slowly (it can take a few minutes per step), but it provides detailed information about occupancy and memory bandwidth for a kernel.

Example

Example of how to create traces on a single-GPU system. A jobscript for Perlmutter is shown, but the SBATCH headers are not strictly necessary as the command only profiles a single process. This can also be run on an interactive node, or without a workload management system.

#!/bin/bash -l
#SBATCH -t 00:30:00
#SBATCH -N 1
#SBATCH -J ncuProfiling
#SBATCH -A <your account>
#SBATCH -q regular
#SBATCH -C gpu
#SBATCH --ntasks-per-node=1
#SBATCH --gpus-per-task=1
#SBATCH --gpu-bind=map_gpu:0
#SBATCH --mail-user=<email>
#SBATCH --mail-type=ALL

# record
dcgmi profile --pause
ncu -f -o out \
--target-processes all \
--set detailed \
--nvtx --nvtx-include="WarpXParticleContainer::DepositCurrent::CurrentDeposition/" \
./warpx input max_step=1 \
&> warpxOut.txt

Note

To collect full statistics, Nsight-Compute reruns kernels, temporarily saving device memory in host memory. This makes it slower than Nsight-Systems, so the provided script profiles only a single step of a single process. This is generally enough to extract relevant information.

Details

In the example above, the individual lines for recording a trace profile are:

  • dcgmi profile --pause: other profiling tools cannot be collecting data at the same time; see this Q&A.

  • -f overwrite previously written trace profiles.

  • -o: output file for profiling.

  • --target-processes all: required for multiprocess code.

  • --set detailed: controls what profiling data is collected. If only interested in a few things, this can improve profiling speed. detailed gets pretty much everything.

  • --nvtx: collects NVTX data. See note.

  • --nvtx-include: tells the profiler to only profile the given sections. You can also use -k to profile only a given kernel.

  • ./warpx...: select the WarpX executable and a good inputs file.

Now open the created trace file in the Nsight-Compute GUI. As with Nsight-Systems, this can be done on a system other than the one that recorded the traces. For example, if you record on a cluster and open the analysis GUI on your laptop, make sure that the Nsight-Compute versions on the remote and local systems match.

Note

nvtx-include syntax is very particular. The trailing / in the example is significant. For full information, see Nvidia’s documentation on NVTX filtering.

Testing the code

When adding a new feature, you want to make sure that (i) you did not break the existing code and (ii) your contribution gives correct results. While existing capabilities are tested regularly remotely (when commits are pushed to an open PR on CI, and every night on local clusters), it can also be useful to run tests on your custom input file. This section details how to use both automated and custom tests.

Continuous Integration in WarpX

Configuration

Our regression tests use the suite published and documented at AMReX-Codes/regression_testing.

Most of the configuration of our regression tests happens in Regression/WarpX-tests.ini. We slightly modify this file in Regression/prepare_file_ci.py.

For example, if you would like to change the compilation to build on Nvidia GPUs, modify this block to add -DWarpX_COMPUTE=CUDA:

[source]
dir = /home/regtester/AMReX_RegTesting/warpx
branch = development
cmakeSetupOpts = -DAMReX_ASSERTIONS=ON -DAMReX_TESTING=ON -DWarpX_COMPUTE=CUDA

We also support changing compilation options via the usual build environment variables. For instance, compiling with clang++ -Werror would be:

export CXX=$(which clang++)
export CXXFLAGS="-Werror"

Run Pre-Commit Tests Locally

When proposing code changes to WarpX, we perform a couple of automated stylistic and correctness checks on the code change. You can run those locally before you push to save some time; install them once like this:

python -m pip install -U pre-commit
pre-commit install

See pre-commit.com and our .pre-commit-config.yaml file in the repository for more details.

Run the test suite locally

Once your new feature is ready, there are ways to check that you did not break anything. WarpX has automated tests running every time a commit is added to an open pull request. The list of automated tests is defined in ./Regression/WarpX-tests.ini.

For easier debugging, it can be convenient to run the tests on your local machine by executing the script ./run_test.sh from WarpX’s root folder, as illustrated in the examples below:

# Example:
# run all tests defined in ./Regression/WarpX-tests.ini
./run_test.sh

# Example:
# run only the test named 'pml_x_yee'
./run_test.sh pml_x_yee

# Example:
# run only the tests named 'pml_x_yee', 'pml_x_ckc' and 'pml_x_psatd'
./run_test.sh pml_x_yee pml_x_ckc pml_x_psatd

Note that the script ./run_test.sh runs the tests with the exact same compile-time options and runtime options used to run the tests remotely.

Moreover, the script ./run_test.sh compiles all the executables that are necessary in order to run the chosen tests. The default number of threads allotted for compiling is set with numMakeJobs = 8 in ./Regression/WarpX-tests.ini. However, when running the tests on a local machine, it is usually possible and convenient to allot more threads for compiling, in order to speed up the builds. This can be accomplished by setting the environment variable WARPX_CI_NUM_MAKE_JOBS, with the preferred number of threads that fits your local machine, e.g. export WARPX_CI_NUM_MAKE_JOBS=16 (or less if your machine is smaller). On public CI, we overwrite the value to WARPX_CI_NUM_MAKE_JOBS=2, in order to avoid overloading the available remote resources. Note that this will not change the number of threads used to run each test, but only the number of threads used to compile each executable necessary to run the tests.

Once the execution of ./run_test.sh is completed, you can find all the relevant files associated with each test in one single directory. For example, if you run the single test pml_x_yee, as shown above, on 04/30/2021, you can find all relevant files in the directory ./test_dir/rt-WarpX/WarpX-tests/2021-04-30/pml_x_yee/. The content of this directory will look like the following (possibly including backtraces if the test crashed at runtime):

$ ls ./test_dir/rt-WarpX/WarpX-tests/2021-04-30/pml_x_yee/
analysis_pml_yee.py     # Python analysis script
inputs_2d               # input file
main2d.gnu.TEST.TPROF.MTMPI.OMP.QED.ex  # executable
pml_x_yee.analysis.out  # Python analysis output
pml_x_yee.err.out       # error output
pml_x_yee.make.out      # build output
pml_x_yee_plt00000/     # data output (initialization)
pml_x_yee_plt00300/     # data output (last time step)
pml_x_yee.run.out       # test output

Add a test to the suite

There are three steps to follow to add a new automated test (illustrated here for PML boundary conditions):

  • An input file for your test, in a folder under Examples/Tests/…. For the PML test, the input file is at Examples/Tests/pml/inputs_2d. You can also re-use an existing input file (even better!) and pass specific parameters at runtime (see below).

  • A Python script that reads simulation output and tests correctness versus theory or calibrated results. For the PML test, see Examples/Tests/pml/analysis_pml_yee.py. It typically ends with a Python statement such as assert( error<0.01 ).

  • If you need a new Python package dependency for testing, add it in Regression/requirements.txt.

  • Add an entry to Regression/WarpX-tests.ini, so that a WarpX simulation runs your test in the continuous integration process, and the Python script is executed to assess the correctness. For the PML test, the entry is

[pml_x_yee]
buildDir = .
inputFile = Examples/Tests/pml/inputs_2d
runtime_params = warpx.do_dynamic_scheduling=0 algo.maxwell_solver=yee
dim = 2
addToCompileString =
cmakeSetupOpts = -DWarpX_DIMS=2
restartTest = 0
useMPI = 1
numprocs = 2
useOMP = 1
numthreads = 1
compileTest = 0
doVis = 0
analysisRoutine = Examples/Tests/pml/analysis_pml_yee.py

If you re-use an existing input file, you can add arguments to runtime_params, like runtime_params = amr.max_level=1 amr.n_cell=32 512 max_step=100 plasma_e.zmin=-200.e-6.

Note

If you added analysisRoutine = Examples/analysis_default_regression.py, then run the new test case locally and add the checksum file for the expected output.

Note

We run those tests on our continuous integration services, which at the moment only have 2 virtual CPU cores. Thus, make sure that the product of numprocs and numthreads for a test is <=2.

Useful tool for plotfile comparison: fcompare

AMReX provides fcompare, an executable that takes two plotfiles as input and returns the absolute and relative difference for each field between these two plotfiles. For some changes in the code, it is very convenient to run the same input file with an old and your current version of the code, and run fcompare on the plotfiles at the same iteration. To use it:

# Compile the executable
cd <path to AMReX>/Tools/Plotfile/ # This may change
make -j 8
# Run the executable to compare old and new versions
<path to AMReX>/Tools/Plotfile/fcompare.gnu.ex old/plt00200 new/plt00200

which should return something like

          variable name             absolute error            relative error
                                       (||A - B||)         (||A - B||/||A||)
----------------------------------------------------------------------------
level = 0
jx                                 1.044455105e+11               1.021651316
jy                                  4.08631977e+16               7.734299273
jz                                 1.877301764e+14               1.073458933
Ex                                 4.196315448e+10               1.253551615
Ey                                 3.330698083e+12               6.436470137
Ez                                 2.598167798e+10              0.6804387128
Bx                                     273.8687473               2.340209782
By                                     152.3911863                1.10952567
Bz                                     37.43212767                 2.1977289
part_per_cell                                   15                    0.9375
Ex_fp                              4.196315448e+10               1.253551615
Ey_fp                              3.330698083e+12               6.436470137
Ez_fp                              2.598167798e+10              0.6804387128
Bx_fp                                  273.8687473               2.340209782
By_fp                                  152.3911863                1.10952567
Bz_fp                                  37.43212767                 2.1977289

Documentation

Doxygen documentation

WarpX uses Doxygen documentation. Whenever you create a new class, please document it where it is declared (typically in the header file):

/** \brief A brief title
 *
 * few-line description explaining the purpose of MyClass.
 *
 * If you are kind enough, also quickly explain how things in MyClass work.
 * (typically a few more lines)
 */
class MyClass
{ ... }

Doxygen reads this docstring, so please be accurate with the syntax! See Doxygen manual for more information. Similarly, please document functions when you declare them (typically in a header file) like:

/** \brief A brief title
 *
 * few-line description explaining the purpose of my_function.
 *
 * \param[in,out] my_int a pointer to an integer variable on which
 *                       my_function will operate.
 * \return what is the meaning and value range of the returned value
 */
int MyClass::my_function (int* my_int);

An online version of this documentation is linked here.

Breathe documentation

Your Doxygen documentation is not only useful for people looking into the code, it is also part of the WarpX online documentation based on Sphinx! This is done using the Python module Breathe, which allows you to write Doxygen documentation directly in the source and have it included in your Sphinx documentation by calling Breathe functions. For instance, the following line will get the Doxygen documentation for WarpXParticleContainer in Source/Particles/WarpXParticleContainer.H and include it in the HTML pages generated by Sphinx:

.. doxygenclass:: WarpXParticleContainer

Building the documentation

To build the documentation on your local computer, you will need to install Doxygen as well as the Python module breathe. First, make sure you are in the root directory of WarpX’s source and install the Python requirements:

python3 -m pip install -r Docs/requirements.txt

You will also need Doxygen (macOS: brew install doxygen; Ubuntu: sudo apt install doxygen).

Then, to compile the documentation, use

cd Docs/

make html
# This will first compile the Doxygen documentation (execute doxygen)
# and then build html pages from rst files using sphinx and breathe.

Open the created build/html/index.html file with your favorite browser. Rebuild and refresh as needed.

Checksum regression tests

WarpX has checksum regression tests: as part of CI testing, when running a given test, the checksum module computes one aggregated number per field (Ex_checksum = np.sum(np.abs(Ex))) and compares it to a reference (benchmark). This should be sensitive enough to make the test fail if your PR causes a significant difference, print meaningful error messages, and give you a chance to fix a bug or reset the benchmark if needed.
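
Conceptually, the comparison boils down to something like the following sketch (the actual implementation in Regression/Checksum/ also handles reading the output files, particle quantities and error reporting):

import numpy as np

def compute_checksums(field_data):
    """field_data: dict mapping a field name (e.g. 'Ex') to its numpy array."""
    return {name: np.sum(np.abs(arr)) for name, arr in field_data.items()}

def compare_to_benchmark(checksums, benchmark, rtol=1e-9, atol=1e-40):
    # same default tolerances as checksumAPI.evaluate_checksum
    for name, value in checksums.items():
        assert np.isclose(value, benchmark[name], rtol=rtol, atol=atol), \
            f"Checksum mismatch for {name}: {value} vs {benchmark[name]}"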

The checksum module is located in Regression/Checksum/, and the benchmarks are stored as human-readable JSON files in Regression/Checksum/benchmarks_json/, with one file per benchmark (for instance, test Langmuir_2d has a corresponding benchmark Regression/Checksum/benchmarks_json/Langmuir_2d.json).

For more details on the implementation, the Python files in Regression/Checksum/ should be well documented.

From a user point of view, you should only need to use checksumAPI.py. It contains Python functions that can be imported and used from an analysis Python script. It can also be executed directly as a Python script. Here are recipes for the main tasks related to checksum regression tests in WarpX CI.

Include a checksum regression test in an analysis Python script

This relies on the function evaluate_checksum:

checksumAPI.evaluate_checksum(test_name, output_file, output_format='plotfile', rtol=1e-09, atol=1e-40, do_fields=True, do_particles=True)

Compare output file checksum with benchmark. Read checksum from output file, read benchmark corresponding to test_name, and assert their equality.

Parameters:
  • test_name (string) – Name of test, as found between [] in .ini file.

  • output_file (string) – Output file from which the checksum is computed.

  • output_format (string) – Format of the output file (plotfile, openpmd).

  • rtol (float, default=1.e-9) – Relative tolerance for the comparison.

  • atol (float, default=1.e-40) – Absolute tolerance for the comparison.

  • do_fields (bool, default=True) – Whether to compare fields in the checksum.

  • do_particles (bool, default=True) – Whether to compare particles in the checksum.

For an example, see

#!/usr/bin/env python3

import os
import re
import sys

sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI

# this will be the name of the plot file
fn = sys.argv[1]

# Get name of the test
test_name = os.path.split(os.getcwd())[1]

# Run checksum regression test
if re.search( 'single_precision', fn ):
    checksumAPI.evaluate_checksum(test_name, fn, rtol=2.e-6)
else:
    checksumAPI.evaluate_checksum(test_name, fn)

This can also be included in an existing analysis script. Note that the plotfile must be named <test name>_plt?????, as generated by the CI framework.

Evaluate a checksum regression test from a bash terminal

You can execute checksumAPI.py as a Python script for that, and pass the plotfile that you want to evaluate, as well as the test name (so the script knows which benchmark to compare it to).

./checksumAPI.py --evaluate --output-file <path/to/plotfile> --output-format <'openpmd' or 'plotfile'> --test-name <test name>

See additional options

  • --skip-fields if you don’t want the fields to be compared (in that case, the benchmark must not have fields)

  • --skip-particles same thing for particles

  • --rtol relative tolerance for the comparison

  • --atol absolute tolerance for the comparison (a sum of both is used by numpy.isclose())

Create/Reset a benchmark with new values that you know are correct

Create/Reset a benchmark from a plotfile generated locally

This is using checksumAPI.py as a Python script.

./checksumAPI.py --reset-benchmark --output-file <path/to/plotfile> --output-format <'openpmd' or 'plotfile'> --test-name <test name>

See additional options

  • --skip-fields if you don’t want the benchmark to have fields

  • --skip-particles same thing for particles

Since this will automatically change the JSON file stored on the repo, make a separate commit just for this file, and if possible commit it under the Tools name:

git add <test name>.json
git commit -m "reset benchmark for <test name> because ..." --author="Tools <warpx@lbl.gov>"

Reset a benchmark from the Azure pipeline output on Github

Alternatively, the benchmarks can be reset using the output of the Azure continuous integration (CI) tests on Github. The output can be accessed by following the steps below:

  • On the Github page of the Pull Request, find (one of) the pipeline(s) failing due to benchmarks that need to be updated and click on “Details”.

    Screen capture showing how to access Azure pipeline output on Github.
  • Click on “View more details on Azure pipelines”.

    Screen capture showing how to access Azure pipeline output on Github.
  • Click on “Build & test”.

    Screen capture showing how to access Azure pipeline output on Github.

From this output, there are two options to reset the benchmarks:

  1. For each of the tests failing due to benchmark changes, the output contains the content of the new benchmark file, as shown below. This content can be copied and pasted into the corresponding benchmark file. For instance, if the failing test is LaserAcceleration_BTD, this content can be pasted into the file Regression/Checksum/benchmarks_json/LaserAcceleration_BTD.json.

    Screen capture showing how to read new benchmark file from Azure pipeline output.
  2. If there are many tests failing in a single Azure pipeline, it might become more convenient to update the benchmarks automatically. WarpX provides a script for this, located in Tools/DevUtils/update_benchmarks_from_azure_output.py. This script can be used by following the steps below:

    • From the Azure output, click on “View raw log”.

      Screen capture showing how to download raw Azure pipeline output.
    • This should lead to a page that looks like the image below. Save it as a text file on your local computer.

      Screen capture showing how to download raw Azure pipeline output.
    • On your local computer, go to the WarpX folder and cd to the Tools/DevUtils folder.

    • Run the command python update_benchmarks_from_azure_output.py /path/to/azure_output.txt. The benchmarks included in that Azure output should now be updated.

    • Repeat this for every Azure pipeline (e.g. cartesian2d, cartesian3d, qed) that contains benchmarks that need to be updated.

Fast, Local Compilation

For simplicity, WarpX compilation with CMake by default downloads, configures and compiles compatible versions of central dependencies such as AMReX, PICSAR, openPMD-api, pyAMReX, and pybind11 on-the-fly, which is called a superbuild.

In some scenarios, e.g., when compiling without internet, with slow internet access, or when working on WarpX and its dependencies, modifications to the superbuild strategy might be preferable. In the below workflows, you as the developer need to make sure to use compatible versions of the dependencies you provide.

Compiling From Local Sources

This workflow is best for developers that make changes to WarpX, AMReX, PICSAR, openPMD-api and/or pyAMReX at the same time. For instance, use this if you add a feature in AMReX and want to try it in WarpX before it is proposed as a pull request for inclusion in AMReX.

Instead of downloading the source code of the above dependencies, one can also use an already cloned source copy. For instance, clone these dependencies to $HOME/src:

cd $HOME/src

git clone https://github.com/ECP-WarpX/WarpX.git warpx
git clone https://github.com/AMReX-Codes/amrex.git
git clone https://github.com/openPMD/openPMD-api.git
git clone https://github.com/ECP-WarpX/picsar.git
git clone https://github.com/AMReX-Codes/pyamrex.git
git clone https://github.com/pybind/pybind11.git

Now modify the dependencies as needed in their source locations, update sources if you cloned them earlier, etc. When building WarpX, the following CMake flags will use the respective local sources:

cd src/warpx

rm -rf build

cmake -S . -B build  \
  -DWarpX_PYTHON=ON  \
  -DWarpX_amrex_src=$HOME/src/amrex          \
  -DWarpX_openpmd_src=$HOME/src/openPMD-api  \
  -DWarpX_picsar_src=$HOME/src/picsar        \
  -DWarpX_pyamrex_src=$HOME/src/pyamrex      \
  -DWarpX_pybind11_src=$HOME/src/pybind11

cmake --build build -j 8
cmake --build build -j 8 --target pip_install

Compiling With Pre-Compiled Dependencies

This workflow is the fastest way to compile WarpX when you just want to change code in WarpX and already have the above central dependencies available in the right configurations (e.g., with or without MPI or GPU support) from a module system or package manager.

Instead of downloading the source code of the above central dependencies, or using a local copy of their source, we can compile and install those dependencies once. By setting the CMAKE_PREFIX_PATH environment variable to the respective dependency install location prefixes, we can instruct CMake to find their install locations and configurations.

WarpX supports this with the following CMake flags:

cd src/warpx

rm -rf build

cmake -S . -B build  \
  -DWarpX_PYTHON=ON  \
  -DWarpX_amrex_internal=OFF    \
  -DWarpX_openpmd_internal=OFF  \
  -DWarpX_picsar_internal=OFF   \
  -DWarpX_pyamrex_internal=OFF  \
  -DWarpX_pybind11_internal=OFF

cmake --build build -j 8
cmake --build build -j 8 --target pip_install

As background, this is also how WarpX is built in package managers such as Spack and Conda-Forge.

Faster Python Builds

The Python bindings of WarpX and AMReX (pyAMReX) use pybind11. Since pybind11 relies heavily on C++ metaprogramming, speeding up the generated binding code requires that we perform a link-time optimization (LTO) step, also known as interprocedural optimization (IPO).

For fast local development cycles, one can skip LTO/IPO with the following flags:

cd src/warpx

cmake -S . -B build       \
  -DWarpX_PYTHON=ON       \
  -DWarpX_PYTHON_IPO=OFF  \
  -DpyAMReX_IPO=OFF

cmake --build build -j 8 --target pip_install

Note

We might transition to nanobind in the future, which does not rely on LTO/IPO for optimal binaries. You can contribute to this pyAMReX pull request to help explore this library (and to check if it works for the HPC/GPU compilers that we need to support).

For robustness, our pip_install target performs a regular wheel build and then installs it with pip. This step checks every time that the WarpX dependencies are properly installed, to avoid broken installations. When developing without internet access, or in rapid development cycles after the first pip_install has succeeded, this pip check can be skipped by using the pip_install_nodeps target instead:

cmake --build build -j 8 --target pip_install_nodeps

CCache

WarpX builds will automatically search for CCache to speed up subsequent compilations in development cycles. Make sure a recent CCache version is installed to make use of this feature.

For power developers that switch a lot between fundamentally different WarpX configurations (e.g., 1D to 3D, GPU and CPU builds, many branches with different bases, developing AMReX and WarpX at the same time), also consider increasing the CCache cache size and changing the cache directory if needed, e.g., due to storage quota constraints or to choose a fast(er) filesystem for the cache files.

The clang-tidy linter

Clang-tidy CI test

WarpX’s CI tests include several checks performed with the clang-tidy linter (currently version 15 of this tool). The complete list of checks enforced in CI tests can be found in the .clang-tidy configuration file.

clang-tidy configuration file
Checks: '
    -*,
    bugprone-*,
        -bugprone-easily-swappable-parameters,
        -bugprone-implicit-widening-of-multiplication-result,
        -bugprone-misplaced-widening-cast,
        -bugprone-unchecked-optional-access,
    cert-*,
        -cert-err58-cpp,
    clang-analyzer-*,
        -clang-analyzer-optin.performance.Padding,
        -clang-analyzer-optin.mpi.MPI-Checker,
        -clang-analyzer-osx.*,
        -clang-analyzer-optin.osx.*,
    clang-diagnostic-*,
    cppcoreguidelines-*,
        -cppcoreguidelines-avoid-c-arrays,
        -cppcoreguidelines-avoid-magic-numbers,
        -cppcoreguidelines-avoid-non-const-global-variables,
        -cppcoreguidelines-init-variables,
        -cppcoreguidelines-macro-usage,
        -cppcoreguidelines-narrowing-conversions,
        -cppcoreguidelines-non-private-member-variables-in-classes,
        -cppcoreguidelines-owning-memory,
        -cppcoreguidelines-pro-*,
    google-build-explicit-make-pair,
    google-build-namespaces,
    google-global-names-in-headers,
    misc-*,
        -misc-no-recursion,
        -misc-non-private-member-variables-in-classes,
    modernize-*,
        -modernize-avoid-c-arrays,
        -modernize-return-braced-init-list,
        -modernize-use-trailing-return-type,
    mpi-*,
    performance-*,
    portability-*,
    readability-*,
        -readability-convert-member-functions-to-static,
        -readability-else-after-return,
        -readability-function-cognitive-complexity,
        -readability-identifier-length,
        -readability-implicit-bool-conversion,
        -readability-isolate-declaration,
        -readability-magic-numbers,
        -readability-named-parameter,
        -readability-uppercase-literal-suffix
    '

CheckOptions:
- key:          bugprone-narrowing-conversions.WarnOnIntegerToFloatingPointNarrowingConversion
  value:        "false"
- key:          misc-definitions-in-headers.HeaderFileExtensions
  value:        "H,"
- key:          modernize-pass-by-value.ValuesOnly
  value:        "true"


HeaderFilterRegex: 'Source[a-z_A-Z0-9\/]+\.H$'

Run clang-tidy linter locally

We provide a script to run clang-tidy locally. The script can be run as follows, provided that all the requirements to compile WarpX are met (see the building from source instructions). The script generates a simple wrapper to ensure that clang-tidy is only applied to WarpX source files and then compiles WarpX in 1D, 2D, 3D, and RZ using this wrapper. By default, WarpX is compiled in single precision with the PSATD solver, the QED module, the QED table generator, and embedded boundaries enabled, in order to find more potential issues with the clang-tidy tool.

A few optional environment variables can be set to tune the behavior of the script:

  • WARPX_TOOLS_LINTER_PARALLEL: sets the number of cores to be used for the compilation

  • CLANG, CLANGXX, and CLANGTIDY: set the versions of the compiler and of the linter

Note: clang v15 is currently used in CI tests. It is therefore recommended to use this version. Otherwise, a newer version may find issues not currently covered by CI tests (checks are opt-in), while older versions may not find all the issues.

export WARPX_TOOLS_LINTER_PARALLEL=12
export CLANG=clang-15
export CLANGXX=clang++-15
export CLANGTIDY=clang-tidy-15
./Tools/Linter/runClangTidy.sh
Script Details
#!/usr/bin/env bash
#
# Copyright 2024 Luca Fedeli
#
# This file is part of WarpX.
#

# This script is a developer's tool to perform the
# checks done by the clang-tidy CI test locally.
#
# Note: this script is only tested on Linux

echo "============================================="
echo
echo "This script is a developer's tool to perform the"
echo "checks done by the clang-tidy CI test locally"
echo "_____________________________________________"

# Check source dir
REPO_DIR=$(cd $(dirname ${BASH_SOURCE})/../../ && pwd)
echo
echo "Your current source directory is: ${REPO_DIR}"
echo "_____________________________________________"

# Set number of jobs to use for compilation
PARALLEL="${WARPX_TOOLS_LINTER_PARALLEL:-4}"
echo
echo "${PARALLEL} jobs will be used for compilation."
echo "This can be overridden by setting the environment"
echo "variable WARPX_TOOLS_LINTER_PARALLEL, e.g.: "
echo
echo "$ export WARPX_TOOLS_LINTER_PARALLEL=8"
echo "$ ./Tools/Linter/runClangTidy.sh"
echo "_____________________________________________"

# Check clang version
export CC="${CLANG:-"clang"}"
export CXX="${CLANGXX:-"clang++"}"
export CTIDY="${CLANGTIDY:-"clang-tidy"}"
echo
echo "The following versions of the clang compiler and"
echo "of the clang-tidy linter will be used:"
echo
echo "clang version:"
which ${CC}
${CC} --version
echo
echo "clang++ version:"
which ${CXX}
${CXX} --version
echo
echo "clang-tidy version:"
which ${CTIDY}
${CTIDY} --version
echo
echo "This can be overridden by setting the environment"
echo "variables CLANG, CLANGXX, and CLANGTIDY e.g.: "
echo "$ export CLANG=clang-15"
echo "$ export CLANGXX=clang++-15"
echo "$ export CTIDCLANGTIDYY=clang-tidy-15"
echo "$ ./Tools/Linter/runClangTidy.sh"
echo
echo "******************************************************"
echo "* Warning: clang v15 is currently used in CI tests.  *"
echo "* It is therefore recommended to use this version.   *"
echo "* Otherwise, a newer version may find issues not     *"
echo "* currently covered by CI tests while older versions *"
echo "* may not find all the issues.                       *"
echo "******************************************************"
echo "_____________________________________________"

# Prepare clang-tidy wrapper
echo
echo "Prepare clang-tidy wrapper"
echo "The following wrapper ensures that only source files"
echo "in WarpX/Source/* are actually processed by clang-tidy"
echo
cat > ${REPO_DIR}/clang_tidy_wrapper << EOF
#!/bin/bash
REGEX="[a-z_A-Z0-9\/]*WarpX\/Source[a-z_A-Z0-9\/]+.cpp"
if [[ \$4 =~ \$REGEX ]];then
  ${CTIDY} \$@
fi
EOF
chmod +x ${REPO_DIR}/clang_tidy_wrapper
echo "clang_tidy_wrapper: "
cat ${REPO_DIR}/clang_tidy_wrapper
echo "_____________________________________________"

# Compile WarpX using clang-tidy
echo
echo "*******************************************"
echo "* Compile WarpX using clang-tidy          *"
echo "* Please ensure that all the dependencies *"
echo "* required to compile WarpX are met       *"
echo "*******************************************"
echo

rm -rf ${REPO_DIR}/build_clang_tidy

cmake -S ${REPO_DIR} -B ${REPO_DIR}/build_clang_tidy \
  -DCMAKE_CXX_CLANG_TIDY="${REPO_DIR}/clang_tidy_wrapper;--system-headers=0;--config-file=${REPO_DIR}/.clang-tidy" \
  -DCMAKE_VERBOSE_MAKEFILE=ON  \
  -DWarpX_DIMS="1;2;3;RZ"      \
  -DWarpX_MPI=ON               \
  -DWarpX_COMPUTE=OMP          \
  -DWarpX_PSATD=ON             \
  -DWarpX_QED=ON               \
  -DWarpX_QED_TABLE_GEN=ON     \
  -DWarpX_OPENPMD=ON           \
  -DWarpX_PRECISION=SINGLE

cmake --build ${REPO_DIR}/build_clang_tidy -j ${PARALLEL} 2> ${REPO_DIR}/build_clang_tidy/clang-tidy.log

cat ${REPO_DIR}/build_clang_tidy/clang-tidy.log
echo
echo "============================================="

FAQ

This section lists frequently asked developer questions.

What is 0.0_rt?

It’s a C++ floating-point literal for zero of type amrex::Real.

We use literals to define constants with a specific type, in this case the zero value. There is also 0.0_prt, which is a literal zero of type amrex::ParticleReal. In standard C++, the built-in equivalents are 0.0 (a double literal), 0.0f (a float literal) and 0.0L (a long double literal). We do not use those directly, so that we can configure the floating-point precision at compile time and use different precisions for fields (amrex::Real) and particles (amrex::ParticleReal).

You can also write things like 42.0_prt if you need a value other than zero.

We use these C++ user-defined literals ([1], [2], [3]) because we want to avoid double operations (e.g., 3. / 4.), implicit casts, or, even worse, integer operations (e.g., 3 / 4) sneaking into the code base and making results wrong or slower.
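
As a small illustration (a sketch; it assumes AMReX's AMReX_REAL.H header, which provides these literal suffixes in the amrex::literals namespace):

#include <AMReX_REAL.H>

using namespace amrex::literals;   // makes the _rt and _prt suffixes visible

// field quantities use amrex::Real, particle quantities use amrex::ParticleReal
amrex::Real AddHalf (amrex::Real x)
{
    // 0.5 in the compile-time configured field precision;
    // a plain 1 / 2 would be an integer division (== 0) and 1. / 2. always a double
    return x + 1.0_rt / 2.0_rt;
}

amrex::ParticleReal InitialWeight ()
{
    return 42.0_prt;   // a non-zero literal in particle precision
}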

Do you worry about using size_t vs. uint vs. int for indexing things?

std::size_t is the unsigned integer type that C++ uses for all container sizes.

It is close to, but not necessarily the same as, unsigned int; this depends on the platform. For “hot” inner loops, you want to use int instead of an unsigned integer type. Why? Because signed integer overflow is (intentionally) undefined behavior in C++, the compiler does not need to handle wrap-around, which makes it easier to vectorize the loop: there is no need to check for an overflow every time execution reaches the control/condition section of the loop.

C++20 adds std::ssize (signed size), but we currently require C++17 for builds. Thus, sometimes you need to static_cast<int>(...).
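
As a small, self-contained illustration (plain C++, independent of WarpX):

#include <cstddef>
#include <vector>

double Sum (std::vector<double> const& v)
{
    // v.size() returns std::size_t; for the hot loop we cast it to a signed int:
    // signed overflow is undefined behavior, so the compiler may assume that the
    // index never wraps around, which simplifies vectorization
    auto const n = static_cast<int>(v.size());

    double s = 0.0;
    for (int i = 0; i < n; ++i) {
        s += v[i];
    }
    return s;
}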

What does std::make_unique do?

std::make_unique is a C++ factory function that creates a std::unique_ptr<T>.

Follow-up: Why use this over a plain my_ptr = new <class>?

Because so-called smart pointers, such as std::unique_ptr<T>, delete the object they manage automatically when they go out of scope. That means: no memory leaks, because you cannot forget to delete the object.
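
A minimal sketch (the class name is made up for illustration):

#include <memory>

struct MyData   // hypothetical example class
{
    int m_step = 0;
};

void Example ()
{
    // allocates one MyData object on the heap and wraps it in a std::unique_ptr
    auto data = std::make_unique<MyData>();
    data->m_step = 10;

    // raw-pointer alternative, which is easy to leak:
    //   MyData* raw = new MyData{};  /* ... */  delete raw;  // must not be forgotten
}   // <- data goes out of scope here and the object is deleted automatically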

Why name header files .H instead of .h?

This is just a convention that we follow throughout the code base, which slightly simplifies what we need to parse in our various build systems. We inherited it from AMReX. Generally speaking, C++ file endings can be arbitrary; we just keep them consistent to avoid confusion in the code base.

To be explicit and avoid confusion (with C/ObjC), we might change them all to .hpp and .cpp/.cxx at some point, but for now .H and .cpp are what we use (as in AMReX).

What are #include "..._fwd.H" and #include <...Fwd.H> files?

These are C++ forward declarations. In C++, an #include statement copies the referenced header file literally into place, which can significantly increase the time to compile a .cpp file into an object file, especially when transitive header files include each other.

In order to reduce compile time, we define forward declarations in WarpX and AMReX for commonly used, large classes. The C++ standard library uses the same concept, e.g., in <iosfwd>.
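
A sketch of the pattern in a single header (class and file names are made up for illustration):

// forward declaration: tells the compiler that MyLargeClass exists,
// without pulling in its full (and potentially expensive) header
class MyLargeClass;

// this is enough to declare pointers, references, and function signatures:
void PrintSummary (MyLargeClass const& obj);

struct SimulationView
{
    MyLargeClass* m_ptr = nullptr;   // ok: no member access in this header
};

// only the .cpp file that actually accesses members of MyLargeClass
// needs to #include the complete class definition, e.g. "MyLargeClass.H"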

What does const int /*i_buffer*/ mean in an argument list?

This is often seen in a derived class that overrides an interface method. It means we do not name the parameter because we do not use it in this override. We still add the name as a comment /* ... */ so that we know what is being ignored when looking at the definition of the overriding method.
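
For example (a sketch with made-up class names):

struct Diagnostic   // hypothetical interface
{
    virtual ~Diagnostic () = default;
    virtual void Flush (int i_buffer) = 0;
};

struct FullDiagnostic : Diagnostic
{
    // this override always flushes everything, so the buffer index is not needed;
    // the parameter stays unnamed, but its name is kept as a comment
    void Flush (const int /*i_buffer*/) override {}
};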

What is Pinned Memory?

We need pinned aka “page locked” host memory when we:

  • do asynchronous copies between the host and device

  • want to write to CPU memory from a GPU kernel

A typical use case is the initialization of our (filtered/processed) output routines. AMReX provides pinned memory via the amrex::PinnedArenaAllocator, which is the last argument passed to constructors of ParticleContainer and MultiFab.

Read more on this here: How to Optimize Data Transfers in CUDA C/C++ (note that pinned memory is a host memory feature and works with all GPU vendors we support)

Bonus: under the hood, asynchronous MPI communications also pin and unpin memory. One of the benefits of GPU-aware MPI implementations is, besides the possibility to use direct device-device transfers, that MPI and the GPU API are aware of each other's pinning ambitions and do not create data races by unpinning the same memory.

Maintenance

Dependencies & Releases

Update WarpX’s Core Dependencies

WarpX has direct dependencies on AMReX, PICSAR, and pyAMReX, which we periodically update.

The following scripts automate this workflow, in case one needs a newer commit of AMReX or PICSAR between releases:

./Tools/Release/updateAMReX.py
./Tools/Release/updatepyAMReX.py
./Tools/Release/updatePICSAR.py

Create a new WarpX release

WarpX has one release per month. The version number is set at the beginning of the month and follows the format YY.MM.

In order to create a GitHub release, you need to:

  1. Create a new branch from development and update the version number in all source files. We usually wait for the AMReX release to be tagged first, then we also point to its tag.

    There is a script for updating core dependencies of WarpX and the WarpX version:

    ./Tools/Release/updateAMReX.py
    ./Tools/Release/updatepyAMReX.py
    ./Tools/Release/updatePICSAR.py
    
    ./Tools/Release/newVersion.sh
    

For a WarpX release, ideally a git tag of AMReX & PICSAR should be used instead of an unnamed commit.

    Then open a PR, wait for tests to pass and then merge.

  2. Local Commit (Optional): at the moment, @ax3l is managing releases and signs tags (naming: YY.MM) locally with his GPG key before uploading them to GitHub.

    Publish: On the GitHub Release page, create a new release via Draft a new release. Either select the locally created tag or create one online (naming: YY.MM) on the merged commit of the PR from step 1.

In the release description, please specify the compatible versions of dependencies (see previous releases), and provide info on the content of the release. In order to get a list of PRs merged since the last release, you may run

    git log <last-release-tag>.. --format='- %s'
    
  3. Optional/future: create a release-<version> branch, write a changelog, and backport bug-fixes for a few days.

Automated performance tests

WarpX has automated performance test scripts, which run weak scalings for various tests on a weekly basis. The results are stored in the perf_logs repo and plots of the performance history can be found on this page.

These performance tests run automatically, so they need to perform git operations, etc. For this reason, they use a separate clone of the source repos, so that they do not conflict with one’s usual operations. This clone typically lives in a sub-directory of $HOME, with the variable $AUTOMATED_PERF_TESTS pointing to it. Similarly, a directory is needed to run the simulations and store the results. By default, it is $SCRATCH/performance_warpx.

The suite runs a weak scaling (1, 2, 8, 64, 256, 512 nodes) for 6 different tests, Tools/PerformanceTests/automated_test_{1,2,3,4,5,6}_*, gathered into one batch job per number of nodes to avoid submitting too many jobs.

Setup on Summit @ OLCF

Here is an example setup for Summit:

# I put the next three lines in $HOME/my_bashrc.sh
export proj=aph114  # project for job submission
export AUTOMATED_PERF_TESTS=$HOME/AUTOMATED_PERF_TESTS/
export SCRATCH=/gpfs/alpine/scratch/$(whoami)/$proj/

mkdir $HOME/AUTOMATED_PERF_TESTS
cd $AUTOMATED_PERF_TESTS
git clone https://github.com/ECP-WarpX/WarpX.git warpx
git clone https://github.com/ECP-WarpX/picsar.git
git clone https://github.com/AMReX-Codes/amrex.git
git clone https://github.com/ECP-WarpX/perf_logs.git

Then, in $AUTOMATED_PERF_TESTS, create a file run_automated_performance_tests_512.sh with the following content:

#!/bin/bash -l
#BSUB -P APH114
#BSUB -W 00:15
#BSUB -nnodes 1
#BSUB -J PERFTEST
#BSUB -e err_automated_tests.txt
#BSUB -o out_automated_tests.txt

module load nano
module load cmake/3.20.2
module load gcc/9.3.0
module load cuda/11.0.3
module load blaspp/2021.04.01
module load lapackpp/2021.04.00
module load boost/1.76.0
module load adios2/2.7.1
module load hdf5/1.12.2

module unload darshan-runtime

export AMREX_CUDA_ARCH=7.0
export CC=$(which gcc)
export CXX=$(which g++)
export FC=$(which gfortran)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=$(which g++)

# Make sure all dependencies are installed and loaded
cd $HOME
module load python/3.8.10
module load freetype/2.10.4     # matplotlib
module load openblas/0.3.5-omp
export BLAS=$OLCF_OPENBLAS_ROOT/lib/libopenblas.so
export LAPACK=$OLCF_OPENBLAS_ROOT/lib/libopenblas.so
python3 -m pip install --user --upgrade pip
python3 -m pip install --user virtualenv
python3 -m venv $HOME/sw/venvs/warpx-perftest
source $HOME/sw/venvs/warpx-perftest/bin/activate
# While setting up the performance tests for the first time,
# execute the lines above this comment and then the commented
# lines below this comment once, before submission.
# The commented lines take too long for the job script.
#python3 -m pip install --upgrade pip
#python3 -m pip install --upgrade build packaging setuptools wheel
#python3 -m pip install --upgrade cython
#python3 -m pip install --upgrade numpy
#python3 -m pip install --upgrade markupsafe
#python3 -m pip install --upgrade pandas
#python3 -m pip install --upgrade matplotlib==3.2.2  # does not try to build freetype itself
#python3 -m pip install --upgrade bokeh
#python3 -m pip install --upgrade gitpython
#python3 -m pip install --upgrade tables

# Run the performance test suite
cd $AUTOMATED_PERF_TESTS/warpx/Tools/PerformanceTests/
python run_automated.py --n_node_list='1,2,8,64,256,512' --automated

# submit next week's job
cd $AUTOMATED_PERF_TESTS/
next_date=`date -d "+7 days" '+%Y:%m:%d:%H:%M'`
bsub -b $next_date ./run_automated_performance_tests_512.sh

Then, running

bsub run_automated_performance_tests_512.sh

will submit this job once, and all the following ones. It will:

  • Create the directory $SCRATCH/performance_warpx if it doesn’t exist.

  • Create 1 sub-directory per week per number of nodes (1,2,8,64,256,512).

  • Submit one job per number of nodes. It will run 6 different tests, each twice (to detect fluctuations).

  • Submit an analysis job that will read the results ONLY AFTER all runs are finished. This uses the dependency feature of the batch system.

  • This job reads the Tiny Profiler output for each run and stores the results in a pandas DataFrame, saved in HDF5 format.

  • Execute write_csv.py from the perf_logs repo to append the new results to a CSV and an HDF5 file.

  • Commit the results (but DO NOT PUSH YET)

Then, the user periodically has to

cd $AUTOMATED_PERF_TESTS/perf_logs
git pull # to get updates from someone else, or from another supercomputer
git push

This will update the database but not the online plots. For this, you need to periodically run something like

cd $AUTOMATED_PERF_TESTS/perf_logs
git pull
python generate_index_html.py
git add -u
git commit -m "upload new html page"
git push

Setup on Cori @ NERSC

Still to be written!

Epilogue

Glossary

In daily communication, we tend to abbreviate a lot of terms. It is important to us to make it easy to interact with the WarpX community and thus, this list shall help to clarify often used terms.

Abbreviations

  • 2FA: Two-factor-authentication

  • ABLASTR: Accelerated BLAST Recipes, the library inside WarpX to share functionality with other BLAST codes

  • ALCF: Argonne Leadership Computing Facility, a supercomputing center located near Chicago, IL (USA)

  • ALS: Advanced Light Source, a U.S. Department of Energy scientific user facility at Lawrence Berkeley National Laboratory

  • BLAST: Beam, Plasma & Accelerator Simulation Toolkit

  • AMR: adaptive mesh-refinement

  • BC: boundary condition (of a simulation)

  • BCK: Benkler-Chavannes-Kuster method, a stabilization technique for small cells in the electromagnetic solver

  • BTD: backtransformed diagnostics, a method to collect data for analysis from a boosted frame simulation

  • CEX: charge-exchange collisions

  • CFL: the Courant-Friedrichs-Lewy condition, a numerical parameter for the numerical convergence of PDE solvers

  • CI: continuous integration, automated tests that we perform before a proposed code-change is accepted; see PR

  • CPU: central processing unit, we usually mean a socket or generally the host-side of a computer (compared to the accelerator, e.g. GPU)

  • DOE: The United States Department of Energy, the largest sponsor of national laboratory research in the United States of America

  • DSMC: Direct Simulation Monte Carlo, a method to capture collisions between kinetic particles

  • ECP: Exascale Computing Project, a U.S. DOE funding source that supports WarpX development

  • ECT: Enlarged Cell Technique, an electromagnetic solver with accurate resolution of perfectly conducting embedded boundaries

  • EB: embedded boundary, boundary conditions inside the simulation box, e.g. following material surfaces

  • EM: electromagnetic, e.g. EM PIC

  • ES: electrostatic, e.g. ES PIC

  • FDTD: Finite-difference time-domain or Yee’s method, a class of grid-based finite-difference field solvers

  • FRC: Field Reversed Configuration, an approach of magnetic confinement fusion

  • GPU: originally graphics processing unit, now used for fast general purpose computing (GPGPU); also called (hardware) accelerator

  • IO: input/output, usually files and/or data

  • IPO: interprocedural optimization, a collection of compiler optimization techniques that analyze the whole code to avoid duplicate calculations and optimize performance

  • ISI: Induced Spectral Incoherence (a laser pulse manipulation technique)

  • LDRD: Laboratory Directed Research and Development, a funding program in U.S. DOE laboratories that kick-started ABLASTR development

  • LPA: laser-plasma acceleration, historically used for laser-electron acceleration

  • LPI: laser-plasma interaction (often for laser-solid physics) or laser-plasma instability (often in fusion physics), depending on context

  • LTO: link-time optimization, program optimizations for file-by-file compilation that optimize object files before linking them together to an executable

  • LWFA: laser-wakefield acceleration (of electrons/leptons)

  • MCC: Monte-Carlo collisions wherein a kinetic species collides with a fluid species, for example used in glow discharge simulations

  • MR: mesh-refinement

  • MS: magnetostatic, e.g. MS PIC

  • MVA: magnetic-vortex acceleration (of protons/ions)

  • NERSC: National Energy Research Scientific Computing Center, a supercomputing center located in Berkeley, CA (USA)

  • NSF: the National Science Foundation, a large public agency in the United States of America, supporting research and education

  • OLCF: Oak Ridge Leadership Computing Facility, a supercomputing center located in Oak Ridge, TN (USA)

  • OTP: One-Time-Password; see 2FA

  • PDE: partial differential equation, an equation which imposes relations between the various partial derivatives of a multivariable function

  • PIC: particle-in-cell, the method implemented in WarpX

  • PICMI: Particle-In-Cell Modeling Interface, a standard proposing naming and structure conventions for particle-in-cell simulation input

  • PICSAR: Particle-In-Cell Scalable Application Resource, a high-performance parallelization library intended to help scientists port their Particle-In-Cell (PIC) codes to the next generation of exascale computers

  • PR: github pull request, a proposed change to the WarpX code base

  • PSATD: pseudo-spectral analytical time-domain method, a spectral field solver with better numerical properties than FDTD solvers

  • PWFA: plasma-wakefield acceleration

  • QED: quantum electrodynamics

  • RPA: radiation-pressure acceleration (of protons/ions), e.g. hole-boring (HB) or light-sail (LS) acceleration

  • RPP: Random Phase Plate (a laser pulse manipulation technique)

  • RZ: for the coordinate system r-z in cylindrical geometry; we use “RZ” when we refer to quasi-cylindrical geometry, decomposed in azimuthal modes (see details here)

  • SENSEI: Scalable in situ analysis and visualization, a lightweight framework for in situ data analysis offering access to multiple visualization and analysis backends

  • SEE: secondary electron emission

  • SSD: Smoothing by Spectral Dispersion (a laser pulse manipulation technique)

  • TNSA: target normal sheath acceleration (of protons/ions)

Terms

  • accelerator: depending on context, either a particle accelerator in physics or a hardware accelerator (e.g. GPU) in computing

  • AMReX: C++ library for block-structured adaptive mesh-refinement, a primary dependency of WarpX

  • Ascent: many-core capable flyweight in situ visualization and analysis infrastructure, a visualization backend usable with WarpX data

  • boosted frame: a Lorentz-boosted frame of reference for a simulation

  • evolve: this is a generic term to advance a quantity (same nomenclature in AMReX).

    For instance, WarpX::EvolveE(dt) advances the electric field for duration dt, PhysicalParticleContainer::Evolve(...) does field gather + particle push + current deposition for all particles in PhysicalParticleContainer, and WarpX::Evolve is the central WarpX function that performs 1 PIC iteration.

  • Frontier: an Exascale supercomputer at OLCF

  • hybrid-PIC: a plasma simulation scheme that combines fluid and kinetic approaches, with (usually) the electrons treated as a fluid and the ions as kinetic particles (see Kinetic-fluid Hybrid Model)

  • laser: most of the time, we mean a laser pulse

  • openPMD: Open Standard for Particle-Mesh Data Files, a community meta-data project for scientific data

  • Ohm’s law solver: the logic that solves for the electric field when using the hybrid-PIC algorithm

  • Perlmutter: a pre-exascale supercomputer at NERSC, named after the Berkeley Lab Nobel laureate Saul Perlmutter

  • plotfiles: the internal binary format for data files in AMReX

  • Python: a popular scripted programming language

  • scraping: a term often used to refer to the process of removing particles from the simulation when they have crossed into an embedded boundary or passed through an absorbing domain boundary

WarpX Governance

WarpX is led in an open governance model, described in this file.

Steering Committee

Current Roster

  • Jean-Luc Vay (chair)

  • Remi Lehe

  • Axel Huebl

See: GitHub team

Role

Members of the steering committee (SC) can change organizational settings, do administrative operations such as rename/move/archive repositories, change branch protection rules, etc. SC members can call votes for decisions (technical or governance).

The SC can veto decisions of the technical committee (TC) by voting in the SC. The TC can override a veto with a 2/3 majority vote in the TC. Decisions are documented in the weekly developer meeting notes and/or on the GitHub repository.

The SC can change the governance structure, but only in a unanimous vote.

Decision Process

Decisions of the SC usually happen in the weekly developer meetings, via e-mail, or in the public chat.

Decisions are made in a non-confidential manner, by a majority of the cast votes of SC members. Votes can be cast asynchronously, e.g., over the course of 1-2 weeks. In tie situations, the chair of the SC acts as the tie breaker.

Appointment Process

Appointed by current SC members in a unanimous vote. As an SC member, regularly attending and contributing to the weekly developer meetings is expected.

SC members can resign or be removed by majority vote, e.g., due to inactivity, bad acting or other reasons.

Technical Committee

Current Roster

  • Luca Fedeli

  • Roelof Groenewald

  • David Grote

  • Axel Huebl

  • Revathi Jambunathan

  • Remi Lehe

  • Andrew Myers

  • Maxence Thévenet

  • Jean-Luc Vay

  • Weiqun Zhang

  • Edoardo Zoni

See: GitHub team

Role

The technical committee (TC) is the core governance body, where under normal operations most ideas are discussed and decisions are made. Individual TC members can approve and merge code changes. Usually, they seek approval from another maintainer for their own changes, too. TC members lead, and weigh in on, technical discussions and, if needed, can call for a vote among TC members on a technical decision. TC members merge/close PRs and issues, and moderate (including block/mute) bad actors. The TC can propose governance changes to the SC.

Decision Process

Discussion in the TC usually happens in the weekly developer meetings.

If someone calls for a vote to make a decision, it is decided by a majority of the cast votes; at least 50% of the committee must participate for the vote to be valid. In the absence of a quorum, the SC will decide according to its voting rules.

Votes are cast in a non-confidential manner. Decisions are documented in the weekly developer meeting notes and/or on the GitHub repository.

TC members can individually appoint new contributors, unless a vote is called on an individual.

Appointment Process

TC members are the maintainers of WarpX. As a TC member, regularly attending and contributing to the weekly developer meetings is expected.

One is appointed to the TC by the steering committee, in a unanimous vote, or by majority vote of the TC. The SC can veto appointments. Steering committee members can also be TC members.

TC members can resign or be removed by majority vote by either TC or SC, e.g., due to inactivity, bad acting or other reasons.

Contributors

Current Roster

See: GitHub team

Role

Contributors are valuable, vetted developers of WarpX. Contributions can take many forms, and not all of them need to be code contributions. Examples include code pull requests, support in issues & user discussions, writing and updating documentation, writing tutorials, visualizations, R&D on algorithms, testing and benchmarking, etc. Contributors can participate in developer meetings and weigh in on discussions. Contributors can “triage” (add labels to) pull requests, issues, and GitHub discussion pages. Contributors can comment on and review PRs (but not merge them).

Decision Process

Contributors can individually decide on classification (triage) of pull requests, issues, and GitHub discussion pages.

Appointment Process

Appointed after contributing to WarpX (see above) by any member of the TC.

The role can be lost by resigning or by decision of an individual TC or SC member, e.g., due to inactivity, bad acting, or other reasons.

Former Members

“Former members” are the giants on whose shoulders we stand. But, for the purpose of WarpX governance, they are not tracked as a governance role in WarpX. Instead, former (e.g., inactive) contributors are acknowledged separately in GitHub contributor tracking, the WarpX documentation, references, citable Zenodo archives of releases, etc. as appropriate.

Former members of SC, TC and Contributors are not kept in the roster, since committee role rosters shall reflect currently active members and the responsible governance body.

Funding and Acknowledgements

WarpX is supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering, and early testbed platforms, in support of the nation’s exascale computing imperative.

WarpX is supported by the CAMPA collaboration, a project of the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research and Office of High Energy Physics, Scientific Discovery through Advanced Computing (SciDAC) program.

ABLASTR seed development is supported by the Laboratory Directed Research and Development Program of Lawrence Berkeley National Laboratory under U.S. Department of Energy Contract No. DE-AC02-05CH11231.

CEA-LIDYL actively contributes to the co-development of WarpX. As part of this initiative, WarpX also receives funding from the French National Research Agency (ANR - Plasm-On-Chip), the Horizon H2020 program and CEA.

We acknowledge all the contributors and users of the WarpX community who help improve the code quality with valuable code improvements and important feedback.