WarpX
WarpX is an advanced electromagnetic & electrostatic Particle-In-Cell code.
It supports many features including:
Perfectly-Matched Layers (PML)
Boosted-frame simulations
Mesh refinement
For details on the algorithms that WarpX implements, see the theory section.
WarpX is a highly-parallel and highly-optimized code, which can run on GPUs and multi-core CPUs, and includes load balancing capabilities. WarpX scales to the world’s largest supercomputers and was awarded the 2022 ACM Gordon Bell Prize. It is also a multi-platform code and runs on Linux, macOS and Windows.
Contact us
If you are just starting to use WarpX, or if you have a user question, please pop into our discussions page and get in touch with the community.
The WarpX GitHub repo is the main communication platform. Have a look at the action icons on the top right of the web page: feel free to watch the repo if you want to receive updates, or to star the repo to support the project. For bug reports or to request new features, you can also open a new issue.
We also have a discussion page on which you can find already answered questions, add new questions, get help with installation procedures, discuss ideas or share comments.
Code of Conduct
Our Pledge
In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
Our Standards
Examples of behavior that contributes to creating a positive environment include:
Using welcoming and inclusive language
Being respectful of differing viewpoints and experiences
Gracefully accepting constructive criticism
Focusing on what is best for the community
Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
The use of sexualized language or imagery and unwelcome sexual attention or advances
Trolling, insulting/derogatory comments, and personal or political attacks
Public or private harassment
Publishing others’ private information, such as a physical or electronic address, without explicit permission
Other conduct which could reasonably be considered inappropriate in a professional setting
Our Responsibilities
Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
Scope
This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.
Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at warpx-coc@lbl.gov. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project’s leadership.
Attribution
This Code of Conduct is adapted from the Contributor Covenant, version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
For answers to common questions about this code of conduct, see https://www.contributor-covenant.org/faq
Acknowledge WarpX
Please acknowledge the role that WarpX played in your research.
In presentations
For your presentations, you can find WarpX slides here. Several flavors are available:
full slide
half-slide (portrait or landscape format)
small inset.
Feel free to use the one that fits into your presentation and adequately acknowledges the part that WarpX played in your research.
In publications
Please add the following sentence to your publications; it helps contributors keep in touch with the community and promote the project.
Plain text:
This research used the open-source particle-in-cell code WarpX https://github.com/ECP-WarpX/WarpX, primarily funded by the US DOE Exascale Computing Project. Primary WarpX contributors are with LBNL, LLNL, CEA-LIDYL, SLAC, DESY, CERN, and TAE Technologies. We acknowledge all WarpX contributors.
LaTeX:
\usepackage{hyperref}
This research used the open-source particle-in-cell code WarpX \url{https://github.com/ECP-WarpX/WarpX}, primarily funded by the US DOE Exascale Computing Project.
Primary WarpX contributors are with LBNL, LLNL, CEA-LIDYL, SLAC, DESY, CERN, and TAE Technologies.
We acknowledge all WarpX contributors.
Latest WarpX reference
If your project leads to a scientific publication, please consider citing the paper below.
Fedeli L, Huebl A, Boillod-Cerneux F, Clark T, Gott K, Hillairet C, Jaure S, Leblanc A, Lehe R, Myers A, Piechurski C, Sato M, Zaim N, Zhang W, Vay J-L, Vincenti H. Pushing the Frontier in the Design of Laser-Based Electron Accelerators with Groundbreaking Mesh-Refined Particle-In-Cell Simulations on Exascale-Class Supercomputers. SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (SC). ISSN:2167-4337, pp. 25-36, Dallas, TX, US, 2022. DOI:10.1109/SC41404.2022.00008 (preprint here)
Prior WarpX references
If your project uses a specific algorithm or component, please consider citing the respective publications in addition.
Sandberg R T, Lehe R, Mitchell C E, Garten M, Myers A, Qiang J, Vay J-L and Huebl A. Synthesizing Particle-in-Cell Simulations Through Learning and GPU Computing for Hybrid Particle Accelerator Beamlines. Proc. of Platform for Advanced Scientific Computing (PASC’24), submitted, 2024. preprint: arXiv:2402.17248
Sandberg R T, Lehe R, Mitchell C E, Garten M, Qiang J, Vay J-L and Huebl A. Hybrid Beamline Element ML-Training for Surrogates in the ImpactX Beam-Dynamics Code. 14th International Particle Accelerator Conference (IPAC’23), WEPA101, 2023. DOI:10.18429/JACoW-IPAC2023-WEPA101
Huebl A, Lehe R, Zoni E, Shapoval O, Sandberg R T, Garten M, Formenti A, Jambunathan R, Kumar P, Gott K, Myers A, Zhang W, Almgren A, Mitchell C E, Qiang J, Sinn A, Diederichs S, Thevenet M, Grote D, Fedeli L, Clark T, Zaim N, Vincenti H, Vay JL. From Compact Plasma Particle Sources to Advanced Accelerators with Modeling at Exascale. Proceedings of the 20th Advanced Accelerator Concepts Workshop (AAC’22), in print, 2023. arXiv:2303.12873
Huebl A, Lehe R, Mitchell C E, Qiang J, Ryne R D, Sandberg R T, Vay JL. Next Generation Computational Tools for the Modeling and Design of Particle Accelerators at Exascale. Proceedings of the 2022 North American Particle Accelerator Conference (NAPAC’22), TUYE2, pp. 302-306, 2022. arXiv:2208.02382, DOI:10.18429/JACoW-NAPAC2022-TUYE2
Fedeli L, Zaim N, Sainte-Marie A, Thevenet M, Huebl A, Myers A, Vay JL, Vincenti H. PICSAR-QED: a Monte Carlo module to simulate Strong-Field Quantum Electrodynamics in Particle-In-Cell codes for exascale architectures. New Journal of Physics 24 025009, 2022. DOI:10.1088/1367-2630/ac4ef1
Lehe R, Blelly A, Giacomel L, Jambunathan R, Vay JL. Absorption of charged particles in perfectly matched layers by optimal damping of the deposited current. Physical Review E 106 045306, 2022. DOI:10.1103/PhysRevE.106.045306
Zoni E, Lehe R, Shapoval O, Belkin D, Zaim N, Fedeli L, Vincenti H, Vay JL. A hybrid nodal-staggered pseudo-spectral electromagnetic particle-in-cell method with finite-order centering. Computer Physics Communications 279, 2022. DOI:10.1016/j.cpc.2022.108457
Myers A, Almgren A, Amorim LD, Bell J, Fedeli L, Ge L, Gott K, Grote DP, Hogan M, Huebl A, Jambunathan R, Lehe R, Ng C, Rowan M, Shapoval O, Thevenet M, Vay JL, Vincenti H, Yang E, Zaim N, Zhang W, Zhao Y, Zoni E. Porting WarpX to GPU-accelerated platforms. Parallel Computing. 2021 Sep, 108:102833. DOI:10.1016/j.parco.2021.102833
Shapoval O, Lehe R, Thevenet M, Zoni E, Zhao Y, Vay JL. Overcoming timestep limitations in boosted-frame Particle-In-Cell simulations of plasma-based acceleration. Phys. Rev. E Nov 2021, 104:055311. arXiv:2104.13995, DOI:10.1103/PhysRevE.104.055311
Vay JL, Huebl A, Almgren A, Amorim LD, Bell J, Fedeli L, Ge L, Gott K, Grote DP, Hogan M, Jambunathan R, Lehe R, Myers A, Ng C, Rowan M, Shapoval O, Thevenet M, Vincenti H, Yang E, Zaim N, Zhang W, Zhao Y, Zoni E. Modeling of a chain of three plasma accelerator stages with the WarpX electromagnetic PIC code on GPUs. Physics of Plasmas. 2021 Feb 9, 28(2):023105. DOI:10.1063/5.0028512
Rowan ME, Gott KN, Deslippe J, Huebl A, Thevenet M, Lehe R, Vay JL. In-situ assessment of device-side compute work for dynamic load balancing in a GPU-accelerated PIC code. PASC ‘21: Proceedings of the Platform for Advanced Scientific Computing Conference. 2021 July, 10, pages 1-11. DOI:10.1145/3468267.3470614
Vay JL, Almgren A, Bell J, Ge L, Grote DP, Hogan M, Kononenko O, Lehe R, Myers A, Ng C, Park J, Ryne R, Shapoval O, Thevenet M, Zhang W. Warp-X: A new exascale computing platform for beam–plasma simulations. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment. 2018 Nov, 909(12) Pages 476-479. DOI: 10.1016/j.nima.2018.01.035
Kirchen M, Lehe R, Jalas S, Shapoval O, Vay JL, Maier AR. Scalable spectral solver in Galilean coordinates for eliminating the numerical Cherenkov instability in particle-in-cell simulations of streaming plasmas. Physical Review E. 2020 July, 102(1-1):013202. DOI: 10.1103/PhysRevE.102.013202
Shapoval O, Vay JL, Vincenti H. Two-step perfectly matched layer for arbitrary-order pseudo-spectral analytical time-domain methods. Computer Physics Communications. 2019 Feb, 235, pages 102-110. DOI: 10.1016/j.cpc.2018.09.015
Lehe R, Kirchen M, Godfrey BB, Maier AR, Vay JL. Elimination of numerical Cherenkov instability in flowing-plasma particle-in-cell simulations by using galilean coordinates. Physical Review E. 2016 Nov, 94:053305. DOI: 10.1103/PhysRevE.94.053305
Science Highlights
WarpX can be used in many domains of laser-plasma science, plasma physics, accelerator physics and beyond. Below, we collect a series of scientific publications that used WarpX. Please acknowledge WarpX in your works, so that we can find them.
Is your publication missing? Contact us or edit this page via a pull request.
Plasma-Based Acceleration
Scientific works in laser-plasma and beam-plasma acceleration.
Peng, H. and Huang, T. W. and Jiang, K. and Li, R. and Wu, C. N. and Yu, M. Y. and Riconda, C. and Weber, S. and Zhou, C. T. and Ruan, S. C. Coherent Subcycle Optical Shock from a Superluminal Plasma Wake. Phys. Rev. Lett. 131, 145003, 2023 DOI:10.1103/PhysRevLett.131.145003
Mewes SM, Boyle GJ, Ferran Pousa A, Shalloo RJ, Osterhoff J, Arran C, Corner L, Walczak R, Hooker SM, Thévenet M. Demonstration of tunability of HOFI waveguides via start-to-end simulations. Phys. Rev. Research 5, 033112, 2023 DOI:10.1103/PhysRevResearch.5.033112
Sandberg R T, Lehe R, Mitchell C E, Garten M, Myers A, Qiang J, Vay J-L and Huebl A. Synthesizing Particle-in-Cell Simulations Through Learning and GPU Computing for Hybrid Particle Accelerator Beamlines. Proc. of Platform for Advanced Scientific Computing (PASC’24), submitted, 2024. preprint
Sandberg R T, Lehe R, Mitchell C E, Garten M, Qiang J, Vay J-L and Huebl A. Hybrid Beamline Element ML-Training for Surrogates in the ImpactX Beam-Dynamics Code. 14th International Particle Accelerator Conference (IPAC’23), WEPA101, 2023. DOI:10.18429/JACoW-IPAC2023-WEPA101
Wang J, Zeng M, Li D, Wang X, Gao J. High quality beam produced by tightly focused laser driven wakefield accelerators. Phys. Rev. Accel. Beams, 26, 091303, 2023. DOI:10.1103/PhysRevAccelBeams.26.091303
Fedeli L, Huebl A, Boillod-Cerneux F, Clark T, Gott K, Hillairet C, Jaure S, Leblanc A, Lehe R, Myers A, Piechurski C, Sato M, Zaim N, Zhang W, Vay J-L, Vincenti H. Pushing the Frontier in the Design of Laser-Based Electron Accelerators with Groundbreaking Mesh-Refined Particle-In-Cell Simulations on Exascale-Class Supercomputers. SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (SC). ISSN:2167-4337, pp. 25-36, Dallas, TX, US, 2022. DOI:10.1109/SC41404.2022.00008 (preprint here)
Zhao Y, Lehe R, Myers A, Thevenet M, Huebl A, Schroeder CB, Vay J-L. Plasma electron contribution to beam emittance growth from Coulomb collisions in plasma-based accelerators. Physics of Plasmas 29, 103109, 2022. DOI:10.1063/5.0102919
Wang J, Zeng M, Li D, Wang X, Lu W, Gao J. Injection induced by coaxial laser interference in laser wakefield accelerators. Matter and Radiation at Extremes 7, 054001, 2022. DOI:10.1063/5.0101098
Miao B, Shrock JE, Feder L, Hollinger RC, Morrison J, Nedbailo R, Picksley A, Song H, Wang S, Rocca JJ, Milchberg HM. Multi-GeV electron bunches from an all-optical laser wakefield accelerator. Physical Review X 12, 031038, 2022. DOI:10.1103/PhysRevX.12.031038
Mirani F, Calzolari D, Formenti A, Passoni M. Superintense laser-driven photon activation analysis. Nature Communications Physics volume 4.185, 2021. DOI:10.1038/s42005-021-00685-2
Zhao Y, Lehe R, Myers A, Thevenet M, Huebl A, Schroeder CB, Vay J-L. Modeling of emittance growth due to Coulomb collisions in plasma-based accelerators. Physics of Plasmas 27, 113105, 2020. DOI:10.1063/5.0023776
Laser-Plasma Interaction
Scientific works in laser-ion acceleration and laser-matter interaction.
Knight B, Gautam C, Stoner C, Egner B, Smith J, Orban C, Manfredi J, Frische K, Dexter M, Chowdhury E, Patnaik A (2023). Detailed Characterization of a kHz-rate Laser-Driven Fusion at a Thin Liquid Sheet with a Neutron Detection Suite. High Power Laser Science and Engineering, 1-13, 2023. DOI:10.1017/hpl.2023.84
Fedeli L, Huebl A, Boillod-Cerneux F, Clark T, Gott K, Hillairet C, Jaure S, Leblanc A, Lehe R, Myers A, Piechurski C, Sato M, Zaim N, Zhang W, Vay J-L, Vincenti H. Pushing the Frontier in the Design of Laser-Based Electron Accelerators with Groundbreaking Mesh-Refined Particle-In-Cell Simulations on Exascale-Class Supercomputers. SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (SC). ISSN:2167-4337, pp. 25-36, Dallas, TX, US, 2022. DOI:10.1109/SC41404.2022.00008 (preprint here)
Hakimi S, Obst-Huebl L, Huebl A, Nakamura K, Bulanov SS, Steinke S, Leemans WP, Kober Z, Ostermayr TM, Schenkel T, Gonsalves AJ, Vay J-L, Tilborg Jv, Toth C, Schroeder CB, Esarey E, Geddes CGR. Laser-solid interaction studies enabled by the new capabilities of the iP2 BELLA PW beamline. Physics of Plasmas 29, 083102, 2022. DOI:10.1063/5.0089331
Levy D, Andriyash IA, Haessler S, Kaur J, Ouille M, Flacco A, Kroupp E, Malka V, Lopez-Martens R. Low-divergence MeV-class proton beams from kHz-driven laser-solid interactions. Phys. Rev. Accel. Beams 25, 093402, 2022. DOI:10.1103/PhysRevAccelBeams.25.093402
Particle Accelerator & Beam Physics
Scientific works in particle and beam modeling.
Sandberg R T, Lehe R, Mitchell C E, Garten M, Myers A, Qiang J, Vay J-L and Huebl A. Synthesizing Particle-in-Cell Simulations Through Learning and GPU Computing for Hybrid Particle Accelerator Beamlines. Proc. of Platform for Advanced Scientific Computing (PASC’24), submitted, 2024. preprint
Sandberg R T, Lehe R, Mitchell C E, Garten M, Qiang J, Vay J-L, Huebl A. Hybrid Beamline Element ML-Training for Surrogates in the ImpactX Beam-Dynamics Code. 14th International Particle Accelerator Conference (IPAC’23), WEPA101, in print, 2023. preprint, DOI:10.18429/JACoW-IPAC2023-WEPA101
Tan W H, Piot P, Myers A, Zhang W, Rheaume T, Jambunathan R, Huebl A, Lehe R, Vay J-L. Simulation studies of drive-beam instability in a dielectric wakefield accelerator. 13th International Particle Accelerator Conference (IPAC’22), MOPOMS012, 2022. DOI:10.18429/JACoW-IPAC2022-MOPOMS012
High Energy Astrophysical Plasma Physics
Scientific works in astrophysical plasma modeling.
Klion H, Jambunathan R, Rowan ME, Yang E, Willcox D, Vay J-L, Lehe R, Myers A, Huebl A, Zhang W. Particle-in-Cell simulations of relativistic magnetic reconnection with advanced Maxwell solver algorithms. arXiv pre-print, 2023. DOI:10.48550/arXiv.2304.10566
Microelectronics
ARTEMIS (Adaptive mesh Refinement Time-domain ElectrodynaMIcs Solver) is based on WarpX and couples the Maxwell’s equations implementation in WarpX with classical equations that describe quantum material behavior (such as, LLG equation for micromagnetics and London equation for superconducting materials) for quantifying the performance of next-generation microelectronics.
Sawant S S, Yao Z, Jambunathan R, Nonaka A. Characterization of Transmission Lines in Microelectronic Circuits Using the ARTEMIS Solver. IEEE Journal on Multiscale and Multiphysics Computational Techniques, vol. 8, pp. 31-39, 2023. DOI:10.1109/JMMCT.2022.3228281
Kumar P, Nonaka A, Jambunathan R, Pahwa G and Salahuddin S, Yao Z. FerroX: A GPU-accelerated, 3D Phase-Field Simulation Framework for Modeling Ferroelectric Devices. arXiv preprint, 2022. arXiv:2210.15668
Yao Z, Jambunathan R, Zeng Y, Nonaka A. A Massively Parallel Time-Domain Coupled Electrodynamics–Micromagnetics Solver. The International Journal of High Performance Computing Applications, 36(2):167-181, 2022. DOI:10.1177/10943420211057906
High-Performance Computing and Numerics
Scientific works in High-Performance Computing, applied mathematics and numerics.
Please see this section.
Nuclear Fusion - Magnetically Confined Plasmas
Nicks B. S., Putvinski S. and Tajima T. Stabilization of the Alfvén-ion cyclotron instability through short plasmas: Fully kinetic simulations in a high-beta regime. Physics of Plasmas 30, 102108, 2023. DOI:10.1063/5.0163889
Groenewald R. E., Veksler A., Ceccherini F., Necas A., Nicks B. S., Barnes D. C., Tajima T. and Dettrick S. A. Accelerated kinetic model for global macro stability studies of high-beta fusion reactors. Physics of Plasmas 30, 122508, 2023. DOI:10.1063/5.0178288
Installation
Users
Our community is here to help. Please report installation problems in case you should get stuck.
Choose one of the installation methods below to get started:
HPC Systems
If you want to use WarpX on a specific high-performance computing (HPC) system, jump directly to our HPC system-specific documentation.
Using the Conda Package
A package for WarpX is available via the Conda package manager.
Tip
We recommend configuring your conda to use the faster libmamba dependency solver.
conda update -y -n base conda
conda install -y -n base conda-libmamba-solver
conda config --set solver libmamba
We recommend deactivating conda's automatic activation of its base environment. This avoids interference with the system and other package managers.
conda config --set auto_activate_base false
conda create -n warpx -c conda-forge warpx
conda activate warpx
Note
The warpx conda package does not yet provide GPU support.
Using the Spack Package
Packages for WarpX are available via the Spack package manager.
The package warpx installs executables and the package py-warpx includes Python bindings, i.e. PICMI.
# optional: activate Spack binary caches
spack mirror add rolling https://binaries.spack.io/develop
spack buildcache keys --install --trust
# see `spack info py-warpx` for build options.
# optional arguments: -mpi ^warpx dims=2 compute=cuda
spack install py-warpx
spack load py-warpx
See spack info warpx or spack info py-warpx and the official Spack tutorial for more information.
Using the PyPI Package
Given that you have the WarpX dependencies installed, you can use pip to install WarpX with PICMI from source:
python3 -m pip install -U pip
python3 -m pip install -U build packaging setuptools wheel
python3 -m pip install -U cmake
python3 -m pip wheel -v git+https://github.com/ECP-WarpX/WarpX.git
python3 -m pip install *whl
In the future, we will publish pre-compiled binary packages on PyPI for faster installs. (Consider using conda in the meantime.)
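As a quick, hedged check that the bindings landed in your environment, you can try importing the installed pywarpx module (the module name matches the pywarpx package used throughout this documentation):
# quick sanity check of the Python-side install
python3 -c "import pywarpx; print(pywarpx.__file__)"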
Using the Brew Package
Note
Coming soon.
From Source with CMake
After installing the WarpX dependencies, you can also install WarpX from source with CMake:
# get the source code
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
cd $HOME/src/warpx
# configure
cmake -S . -B build
# optional: change configuration
ccmake build
# compile
# on Windows: --config RelWithDebInfo
cmake --build build -j 4
# executables for WarpX are now in build/bin/
We document the details in the developer installation.
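As a quick way to try the new build, you can run the executable on an example inputs file. The exact binary name encodes the compile-time options (a warpx symbolic link in build/bin/ points to the last built executable), and the inputs file name below is only a placeholder for any 3D example shipped in the repository:
# run on a single process (inputs_3d is a placeholder file name)
./build/bin/warpx inputs_3d
# or, if built with MPI (the default), e.g. on 4 ranks
mpirun -np 4 ./build/bin/warpx inputs_3d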
Tips for macOS Users
Tip
Before getting started with package managers, please check what you manually installed in /usr/local.
If you find entries in bin/, lib/ et al. that look like you manually installed MPI, HDF5 or other software in the past, then remove those files first.
If you find software such as MPI in the same directories that are shown as symbolic links, then it is likely you brew installed software before.
If you are trying another package manager than brew, run brew unlink … on such packages first to avoid software incompatibilities.
See also: A. Huebl, Working With Multiple Package Managers, Collegeville Workshop (CW20), 2020
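For example, if a previously brew-installed MPI or HDF5 conflicts with the packages you are about to install, unlinking could look like this (the package names are only examples; unlink whatever is actually linked on your machine):
# example only: unlink potentially conflicting Homebrew packages
brew unlink open-mpi
brew unlink hdf5-mpi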
Developers
CMake is our primary build system. If you are new to CMake, this short tutorial from the HEP Software Foundation is the perfect place to get started. If you just want to use CMake to build the project, jump into sections 1. Introduction, 2. Building with CMake and 9. Finding Packages.
Dependencies
Before you start, you will need a copy of the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
cd $HOME/src/warpx
WarpX depends on popular third party software.
On your development machine, follow the instructions here.
If you are on an HPC machine, follow the instructions here.
Dependencies
WarpX depends on the following popular third party software. Please see installation instructions below.
a mature C++17 compiler, e.g., GCC 8.4+, Clang 7, NVCC 11.0, MSVC 19.15 or newer
AMReX: we automatically download and compile a copy of AMReX
PICSAR: we automatically download and compile a copy of PICSAR
and for Python bindings:
pyAMReX: we automatically download and compile a copy of pyAMReX
pybind11: we automatically download and compile a copy of pybind11
Optional dependencies include:
MPI 3.0+: for multi-node and/or multi-GPU execution
for on-node accelerated compute one of either:
OpenMP 3.1+: for threaded CPU execution or
CUDA Toolkit 11.7+: for Nvidia GPU support (see matching host-compilers) or
ROCm 5.2+ (5.5+ recommended): for AMD GPU support
FFTW3: for spectral solver (PSATD) support when running on CPU or SYCL; also needs the pkg-config tool on Unix
BLAS++ and LAPACK++: for spectral solver (PSATD) support in RZ geometry
Boost 1.66.0+: for QED lookup tables generation support
openPMD-api 0.15.1+: we automatically download and compile a copy of openPMD-api for openPMD I/O support
see optional I/O backends, i.e., ADIOS2 and/or HDF5
Ascent 0.8.0+: for in situ 3D visualization
SENSEI 4.0.0+: for in situ analysis and visualization
CCache: to speed up rebuilds (for CUDA support, version 3.7.9+ is needed and 4.2+ is recommended)
Ninja: for faster parallel compiles
Python packages: see our requirements.txt file for compatible versions
If you are on a high-performance computing (HPC) system, then please see our separate HPC documentation.
For all other systems, we recommend using a package dependency manager: pick one of the installation methods below to install all dependencies for WarpX development in a consistent manner.
Conda (Linux/macOS/Windows)
Conda/Mamba are cross-compatible, user-level package managers.
Tip
We recommend configuring your conda to use the faster libmamba dependency solver.
conda update -y -n base conda
conda install -y -n base conda-libmamba-solver
conda config --set solver libmamba
We recommend deactivating conda's automatic activation of its base environment. This avoids interference with the system and other package managers.
conda config --set auto_activate_base false
conda create -n warpx-cpu-mpich-dev -c conda-forge blaspp boost ccache cmake compilers git lapackpp "openpmd-api=*=mpi_mpich*" openpmd-viewer python make numpy pandas scipy yt "fftw=*=mpi_mpich*" pkg-config matplotlib mamba mpich mpi4py ninja pip virtualenv
conda activate warpx-cpu-mpich-dev
# compile WarpX with -DWarpX_MPI=ON
# for pip, use: export WARPX_MPI=ON
conda create -n warpx-cpu-dev -c conda-forge blaspp boost ccache cmake compilers git lapackpp openpmd-api openpmd-viewer python make numpy pandas scipy yt fftw pkg-config matplotlib mamba ninja pip virtualenv
conda activate warpx-cpu-dev
# compile WarpX with -DWarpX_MPI=OFF
# for pip, use: export WARPX_MPI=OFF
For OpenMP support, you will further need, on Linux:
conda install -c conda-forge libgomp
or, on macOS:
conda install -c conda-forge llvm-openmp
For Nvidia CUDA GPU support, you will need to have a recent CUDA driver installed, or you can lower the CUDA version of the Nvidia cuda package from conda-forge to match your driver, and then add these packages:
conda install -c nvidia -c conda-forge cuda cupy
More info for CUDA-enabled ML packages.
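If your driver only supports an older CUDA toolkit, you can pin the toolkit via conda-forge's cuda-version metapackage; the 12.0 pin below is only an example:
# example: pin the CUDA toolkit version to match an older driver
conda install -c nvidia -c conda-forge "cuda-version=12.0" cuda cupy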
Spack (Linux/macOS)
Spack is a user-level package manager. It is primarily written for Linux, with slightly less support for macOS, and future support for Windows.
First, download a WarpX Spack desktop development environment of your choice. For most desktop development, pick the OpenMP environment for CPUs unless you have a supported GPU.
Debian/Ubuntu Linux:
OpenMP (CPUs): system=ubuntu; compute=openmp
CUDA (Nvidia GPUs): system=ubuntu; compute=cuda
ROCm (AMD GPUs): system=ubuntu; compute=rocm
SYCL (Intel GPUs): todo
macOS: first, prepare with brew install gpg2; brew install gcc
OpenMP (CPUs): system=macos; compute=openmp
If you already installed Spack, we recommend activating its binary caches for faster builds:
spack mirror add rolling https://binaries.spack.io/develop
spack buildcache keys --install --trust
Now install the WarpX dependencies in a new WarpX development environment:
# download environment file
curl -sLO https://raw.githubusercontent.com/ECP-WarpX/WarpX/development/Tools/machines/desktop/spack-${system}-${compute}.yaml
# create new development environment
spack env create warpx-${compute}-dev spack-${system}-${compute}.yaml
spack env activate warpx-${compute}-dev
# installation
spack install
python3 -m pip install jupyter matplotlib numpy openpmd-api openpmd-viewer pandas scipy virtualenv yt
In new terminal sessions, re-activate the environment with spack env activate warpx-openmp-dev again; replace openmp with the equivalent you chose.
Compile WarpX with -DWarpX_MPI=ON.
For pip, use export WARPX_MPI=ON.
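With the Spack environment active, configuring and building WarpX then follows the generic CMake workflow from the Compile section below, for example with MPI enabled as noted above:
cd $HOME/src/warpx
cmake -S . -B build -DWarpX_MPI=ON
cmake --build build -j 4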
Brew (macOS/Linux)
Homebrew (Brew) is a user-level package manager primarily for Apple macOS, but also supports Linux.
brew update
brew tap openpmd/openpmd
brew install adios2 # for openPMD
brew install ccache
brew install cmake
brew install fftw # for PSATD
brew install git
brew install hdf5-mpi # for openPMD
brew install libomp
brew unlink gcc
brew link --force libomp
brew install pkg-config # for fftw
brew install open-mpi
brew install openblas # for PSATD in RZ
brew install openpmd-api # for openPMD
If you also want to compile with PSATD in RZ, you need to manually install BLAS++ and LAPACK++:
sudo mkdir -p /usr/local/bin/
sudo curl -L -o /usr/local/bin/cmake-easyinstall https://raw.githubusercontent.com/ax3l/cmake-easyinstall/main/cmake-easyinstall
sudo chmod a+x /usr/local/bin/cmake-easyinstall
cmake-easyinstall --prefix=/usr/local git+https://github.com/icl-utk-edu/blaspp.git \
-Duse_openmp=OFF -Dbuild_tests=OFF -DCMAKE_VERBOSE_MAKEFILE=ON
cmake-easyinstall --prefix=/usr/local git+https://github.com/icl-utk-edu/lapackpp.git \
-Duse_cmake_find_lapack=ON -Dbuild_tests=OFF -DCMAKE_VERBOSE_MAKEFILE=ON
Compile WarpX with -DWarpX_MPI=ON.
For pip, use export WARPX_MPI=ON.
APT (Debian/Ubuntu Linux)
The Advanced Package Tool (APT) is a system-level package manager on Debian-based Linux distributions, including Ubuntu.
sudo apt update
sudo apt install build-essential ccache cmake g++ git libfftw3-mpi-dev libfftw3-dev libhdf5-openmpi-dev libopenmpi-dev pkg-config python3 python3-matplotlib python3-mpi4py python3-numpy python3-pandas python3-pip python3-scipy python3-venv
# optional:
# for CUDA, either install
# https://developer.nvidia.com/cuda-downloads (preferred)
# or, if your Debian/Ubuntu is new enough, use the packages
# sudo apt install nvidia-cuda-dev libcub-dev
# compile WarpX with -DWarpX_MPI=ON
# for pip, use: export WARPX_MPI=ON
sudo apt update
sudo apt install build-essential ccache cmake g++ git libfftw3-dev libhdf5-dev pkg-config python3 python3-matplotlib python3-numpy python3-pandas python3-pip python3-scipy python3-venv
# optional:
# for CUDA, either install
# https://developer.nvidia.com/cuda-downloads (preferred)
# or, if your Debian/Ubuntu is new enough, use the packages
# sudo apt install nvidia-cuda-dev libcub-dev
# compile WarpX with -DWarpX_MPI=OFF
# for pip, use: export WARPX_MPI=OFF
Compile
From the base of the WarpX source directory, execute:
# find dependencies & configure
# see additional options below, e.g.
# -DWarpX_PYTHON=ON
# -DCMAKE_INSTALL_PREFIX=$HOME/sw/warpx
cmake -S . -B build
# compile, here we use four threads
cmake --build build -j 4
That’s it!
A 3D WarpX binary is now in build/bin/ and can be run with a 3D example inputs file.
Most people execute the binary directly or copy it out.
If you want to install the executables in a programmatic way, run this:
# for default install paths, you will need administrator rights, e.g. with sudo:
cmake --build build --target install
You can inspect and modify build options after running cmake -S . -B build, either with ccmake build or by adding arguments with -D<OPTION>=<VALUE> to the first CMake call.
For example, this builds WarpX in all geometries, enables Python bindings and Nvidia GPU (CUDA) support:
cmake -S . -B build -DWarpX_DIMS="1;2;RZ;3" -DWarpX_COMPUTE=CUDA
Build Options
CMake Option | Default & Values | Description
---|---|---
CMAKE_BUILD_TYPE | RelWithDebInfo/Release/Debug | Type of build, symbols & optimizations
CMAKE_INSTALL_PREFIX | system-dependent path | Install path prefix
CMAKE_VERBOSE_MAKEFILE | ON/OFF | Print all compiler commands to the terminal during build
PYINSTALLOPTIONS | | Additional options for pip install, e.g., --user
WarpX_APP | ON/OFF | Build the WarpX executable application
WarpX_ASCENT | ON/OFF | Ascent in situ visualization
WarpX_COMPUTE | NOACC/OMP/CUDA/SYCL/HIP | On-node, accelerated computing backend
WarpX_DIMS | 3/2/1/RZ | Simulation dimensionality. Use a semicolon-separated list for multiple values, e.g. "1;2;RZ;3"
WarpX_EB | ON/OFF | Embedded boundary support (not supported in RZ yet)
WarpX_IPO | ON/OFF | Compile WarpX with interprocedural optimization (aka LTO)
WarpX_LIB | ON/OFF | Build WarpX as a library, e.g., for PICMI Python
WarpX_MPI | ON/OFF | Multi-node support (message-passing)
WarpX_MPI_THREAD_MULTIPLE | ON/OFF | MPI thread-multiple support, i.e. for asynchronous IO
WarpX_OPENPMD | ON/OFF | openPMD I/O (HDF5, ADIOS)
WarpX_PRECISION | SINGLE/DOUBLE | Floating point precision (single/double)
WarpX_PARTICLE_PRECISION | SINGLE/DOUBLE | Particle floating point precision (single/double), defaults to WarpX_PRECISION value if not set
WarpX_PSATD | ON/OFF | Spectral solver
WarpX_PYTHON | ON/OFF | Python bindings
WarpX_QED | ON/OFF | QED support (requires PICSAR)
WarpX_QED_TABLE_GEN | ON/OFF | QED table generation support (requires PICSAR and Boost)
WarpX_QED_TOOLS | ON/OFF | Build external tool to generate QED lookup tables (requires PICSAR and Boost)
WarpX_QED_TABLES_GEN_OMP | AUTO/ON/OFF | Enables OpenMP support for QED lookup tables generation
WarpX_SENSEI | ON/OFF | SENSEI in situ visualization
WarpX can be configured in further detail with options from AMReX, which are documented in the AMReX manual.
Developers might be interested in additional options that control dependencies of WarpX. By default, the most important dependencies of WarpX are automatically downloaded for convenience:
CMake Option | Default & Values | Description
---|---|---
BUILD_SHARED_LIBS | ON/OFF | Build shared libraries for dependencies
WarpX_CCACHE | ON/OFF | Search and use CCache to speed up rebuilds.
AMReX_CUDA_PTX_VERBOSE | ON/OFF | Print CUDA code generation statistics from ptxas
WarpX_amrex_src | None | Path to AMReX source directory (preferred if set)
WarpX_amrex_repo | https://github.com/AMReX-Codes/amrex.git | Repository URI to pull and build AMReX from
WarpX_amrex_branch | we set and maintain a compatible commit | Repository branch for WarpX_amrex_repo
WarpX_amrex_internal | ON/OFF | Needs a pre-installed AMReX library if set to OFF
WarpX_openpmd_src | None | Path to openPMD-api source directory (preferred if set)
WarpX_openpmd_repo | https://github.com/openPMD/openPMD-api.git | Repository URI to pull and build openPMD-api from
WarpX_openpmd_branch | | Repository branch for WarpX_openpmd_repo
WarpX_openpmd_internal | ON/OFF | Needs a pre-installed openPMD-api library if set to OFF
WarpX_picsar_src | None | Path to PICSAR source directory (preferred if set)
WarpX_picsar_repo | https://github.com/ECP-WarpX/picsar.git | Repository URI to pull and build PICSAR from
WarpX_picsar_branch | we set and maintain a compatible commit | Repository branch for WarpX_picsar_repo
WarpX_picsar_internal | ON/OFF | Needs a pre-installed PICSAR library if set to OFF
WarpX_pyamrex_src | None | Path to pyAMReX source directory (preferred if set)
WarpX_pyamrex_repo | https://github.com/AMReX-Codes/pyamrex.git | Repository URI to pull and build pyAMReX from
WarpX_pyamrex_branch | we set and maintain a compatible commit | Repository branch for WarpX_pyamrex_repo
WarpX_pyamrex_internal | ON/OFF | Needs a pre-installed pyAMReX library if set to OFF
WarpX_PYTHON_IPO | ON/OFF | Build Python w/ interprocedural/link optimization (IPO/LTO)
WarpX_pybind11_src | None | Path to pybind11 source directory (preferred if set)
WarpX_pybind11_repo | https://github.com/pybind/pybind11.git | Repository URI to pull and build pybind11 from
WarpX_pybind11_branch | we set and maintain a compatible commit | Repository branch for WarpX_pybind11_repo
WarpX_pybind11_internal | ON/OFF | Needs a pre-installed pybind11 library if set to OFF
For example, one can also build against a local AMReX copy. Assuming AMReX' source is located in $HOME/src/amrex, add the cmake argument -DWarpX_amrex_src=$HOME/src/amrex. Relative paths are also supported, e.g. -DWarpX_amrex_src=../amrex.
Or build against an AMReX feature branch of a colleague. Assuming your colleague pushed AMReX to https://github.com/WeiqunZhang/amrex/ in a branch new-feature, then pass to cmake the arguments -DWarpX_amrex_repo=https://github.com/WeiqunZhang/amrex.git -DWarpX_amrex_branch=new-feature.
More details on this workflow are described here.
You can speed up the install further if you pre-install these dependencies, e.g. with a package manager.
Set -DWarpX_<dependency-name>_internal=OFF and add the installation prefix of the dependency to the environment variable CMAKE_PREFIX_PATH.
Please see the introduction to CMake if this sounds new to you.
More details on this workflow are described here.
If you re-compile often, consider installing the Ninja build system.
Pass -G Ninja to the CMake configuration call to speed up parallel compiles.
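For example, a Ninja-based configure and build of the same source tree could look like this:
# configure with the Ninja generator, then build as usual
cmake -S . -B build -G Ninja
cmake --build build -j 4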
Configure your compiler
If you don’t want to use your default compiler, you can set the following environment variables. For example, using a Clang/LLVM:
export CC=$(which clang)
export CXX=$(which clang++)
If you also want to select a CUDA compiler:
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=$(which clang++)
We also support adding additional compiler flags via environment variables such as CXXFLAGS/LDFLAGS:
# example: treat all compiler warnings as errors
export CXXFLAGS="-Werror"
Note
Please clean your build directory with rm -rf build/ after changing the compiler. Now call cmake -S . -B build (+ further options) again to re-initialize the build configuration.
Run
An executable WarpX binary with the current compile-time options encoded in its file name will be created in build/bin/. Note that you need separate binaries to run 1D, 2D, 3D, and RZ geometry inputs scripts. Additionally, a symbolic link named warpx can be found in that directory, which points to the last built WarpX executable.
More details on running simulations are in the section Run WarpX. Alternatively, read on and also build our PICMI Python interface.
PICMI Python Bindings
Note
Preparation: make sure you work with up-to-date Python tooling.
python3 -m pip install -U pip
python3 -m pip install -U build packaging setuptools wheel
python3 -m pip install -U cmake
python3 -m pip install -r requirements.txt
For PICMI Python bindings, configure WarpX to produce a library and call our pip_install CMake target:
# find dependencies & configure for all WarpX dimensionalities
cmake -S . -B build_py -DWarpX_DIMS="1;2;RZ;3" -DWarpX_PYTHON=ON
# build and then call "python3 -m pip install ..."
cmake --build build_py --target pip_install -j 4
That’s it! You can now run a first 3D PICMI script from our examples.
Developers could now change the WarpX source code and then call the build line again to refresh the Python installation.
Tip
If you do not develop with a user-level package manager, e.g., because you rely on an HPC system's environment modules, then consider setting up a virtual environment via Python venv. Otherwise, without a virtual environment, you likely need to add the CMake option -DPYINSTALLOPTIONS="--user".
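Creating and activating such a virtual environment could look like this (the path is only an example):
# create a virtual environment in an example location
python3 -m venv $HOME/sw/venvs/warpx
# activate it in every new shell before building or running
source $HOME/sw/venvs/warpx/bin/activate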
Python Bindings (Package Management)
This section is relevant for Python package management, mainly for maintainers or people who prefer to interact only with pip. One can build and install pywarpx from the root of the WarpX source tree:
python3 -m pip wheel -v .
python3 -m pip install pywarpx*whl
This will call the CMake logic above implicitly.
Using this workflow has the advantage that it can build and package up multiple libraries with varying WarpX_DIMS into one pywarpx package.
Environment variables can be used to control the build step:
Environment Variable | Default & Values | Description
---|---|---
WARPX_COMPUTE | NOACC/OMP/CUDA/SYCL/HIP | On-node, accelerated computing backend
WARPX_DIMS | "1;2;RZ;3" | Simulation dimensionalities (semicolon-separated list)
WARPX_EB | ON/OFF | Embedded boundary support (not supported in RZ yet)
WARPX_MPI | ON/OFF | Multi-node support (message-passing)
WARPX_OPENPMD | ON/OFF | openPMD I/O (HDF5, ADIOS)
WARPX_PRECISION | SINGLE/DOUBLE | Floating point precision (single/double)
WARPX_PARTICLE_PRECISION | SINGLE/DOUBLE | Particle floating point precision (single/double), defaults to WARPX_PRECISION value if not set
WARPX_PSATD | ON/OFF | Spectral solver
WARPX_QED | ON/OFF | PICSAR QED (requires PICSAR)
WARPX_QED_TABLE_GEN | ON/OFF | QED table generation (requires PICSAR and Boost)
BUILD_PARALLEL | 2 | Number of threads to use for parallel builds
BUILD_SHARED_LIBS | ON/OFF | Build shared libraries for dependencies
HDF5_USE_STATIC_LIBRARIES | ON/OFF | Prefer static libraries for HDF5 dependency (openPMD)
ADIOS_USE_STATIC_LIBS | ON/OFF | Prefer static libraries for ADIOS1 dependency (openPMD)
WARPX_AMREX_SRC | None | Absolute path to AMReX source directory (preferred if set)
WARPX_AMREX_REPO | None (uses cmake default) | Repository URI to pull and build AMReX from
WARPX_AMREX_BRANCH | None (uses cmake default) | Repository branch for WARPX_AMREX_REPO
WARPX_AMREX_INTERNAL | ON/OFF | Needs a pre-installed AMReX library if set to OFF
WARPX_OPENPMD_SRC | None | Absolute path to openPMD-api source directory (preferred if set)
WARPX_OPENPMD_INTERNAL | ON/OFF | Needs a pre-installed openPMD-api library if set to OFF
WARPX_PICSAR_SRC | None | Absolute path to PICSAR source directory (preferred if set)
WARPX_PICSAR_INTERNAL | ON/OFF | Needs a pre-installed PICSAR library if set to OFF
WARPX_PYAMREX_SRC | None | Absolute path to pyAMReX source directory (preferred if set)
WARPX_PYAMREX_INTERNAL | ON/OFF | Needs a pre-installed pyAMReX library if set to OFF
WARPX_PYTHON_IPO | ON/OFF | Build Python w/ interprocedural/link optimization (IPO/LTO)
WARPX_PYBIND11_SRC | None | Absolute path to pybind11 source directory (preferred if set)
WARPX_PYBIND11_INTERNAL | ON/OFF | Needs a pre-installed pybind11 library if set to OFF
 | First found | Set to …
PYWARPX_LIB_DIR | None | If set, search for pre-built WarpX C++ libraries (see below)
Note that we currently change the WARPX_MPI default intentionally to OFF, to simplify a first install from source.
Some hints and workflows follow.
Developers that want to test a change of the source code but did not change the pywarpx version number can force a reinstall via:
python3 -m pip install --force-reinstall --no-deps -v .
Some Developers like to code directly against a local copy of AMReX, changing both code-bases at a time:
WARPX_AMREX_SRC=$PWD/../amrex python3 -m pip install --force-reinstall --no-deps -v .
Additional environment control as common for CMake (see above) can be set as well, e.g. CC, CXX, and CMAKE_PREFIX_PATH hints.
So another sophisticated example might be: use Clang as the compiler, build with local source copies of PICSAR and AMReX, support the PSATD solver, MPI and openPMD, hint a parallel HDF5 installation in $HOME/sw/hdf5-parallel-1.10.4, and only build 2D and 3D geometry:
CC=$(which clang) CXX=$(which clang++) WARPX_AMREX_SRC=$PWD/../amrex WARPX_PICSAR_SRC=$PWD/../picsar WARPX_PSATD=ON WARPX_MPI=ON WARPX_DIMS="2;3" CMAKE_PREFIX_PATH=$HOME/sw/hdf5-parallel-1.10.4:$CMAKE_PREFIX_PATH python3 -m pip install --force-reinstall --no-deps -v .
Here we wrote this all in one line, but one can also set all environment variables in a development environment and keep the pip call nice and short as in the beginning.
Note that you need to use absolute paths for external source trees, because pip builds in a temporary directory, e.g. export WARPX_AMREX_SRC=$HOME/src/amrex.
All of this can also be run from CMake. This is the workflow most developers will prefer as it allows rapid re-compiles:
# build WarpX executables and libraries
cmake -S . -B build_py -DWarpX_DIMS="1;2;RZ;3" -DWarpX_PYTHON=ON
# build & install Python only
cmake --build build_py -j 4 --target pip_install
There is also a --target pip_install_nodeps option that skips pip-based dependency checks.
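For instance, once the Python dependencies are already satisfied in your environment, a rebuild can call that target directly:
# rebuild and reinstall the Python module without re-checking pip dependencies
cmake --build build_py -j 4 --target pip_install_nodeps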
WarpX release managers might also want to generate a self-contained source package that can be distributed to exotic architectures:
python setup.py sdist --dist-dir .
python3 -m pip wheel -v pywarpx-*.tar.gz
python3 -m pip install *whl
The above steps can also be executed in one go to build from source on a machine:
python3 setup.py sdist --dist-dir .
python3 -m pip install -v pywarpx-*.tar.gz
Last but not least, you can uninstall pywarpx as usual with:
python3 -m pip uninstall pywarpx
HPC
On selected high-performance computing (HPC) systems, WarpX has documented or even pre-built installation routines. Follow the guide here instead of the generic installation routines for optimal stability and best performance.
warpx.profile
Use a warpx.profile file to set up your software environment without colliding with other software. Ideally, store that file directly in your $HOME/ and source it after connecting to the machine:
source $HOME/warpx.profile
We list example warpx.profile files below, which can be used to set up WarpX on various HPC systems.
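As a rough sketch only, such a file typically exports your project id and loads the machine's software modules; the module names below are placeholders, so please use the machine-specific example files referenced in the sections that follow:
# $HOME/warpx.profile -- illustrative sketch, not tailored to any specific machine
export proj=""                 # your project/allocation id
module load cmake              # placeholder module names, machine-dependent
module load gcc openmpi hdf5   # placeholder module names, machine-dependent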
HPC Machines
This section documents quick-start guides for a selection of supercomputers that WarpX users are active on.
Adastra (CINES)
The Adastra cluster is located at CINES (France). Each node contains 4 AMD MI250X GPUs, each with 2 Graphics Compute Dies (GCDs) for a total of 8 GCDs per node. You can think of the 8 GCDs as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).
Introduction
If you are new to this system, please see the following resources:
Batch system: Slurm
- $SHAREDSCRATCHDIR: meant for short-term data storage, shared with all members of a project, purged every 30 days (17.6 TB default quota)
- $SCRATCHDIR: meant for short-term data storage, single user, purged every 30 days
- $SHAREDWORKDIR: meant for mid-term data storage, shared with all members of a project, never purged (4.76 TB default quota)
- $WORKDIR: meant for mid-term data storage, single user, never purged
- $STORE: meant for long term storage, single user, never purged, backed up
- $SHAREDHOMEDIR: meant for scripts and tools, shared with all members of a project, never purged, backed up
- $HOME: meant for scripts and tools, single user, never purged, backed up
Preparation
The following instructions will install WarpX in the $SHAREDHOMEDIR directory, which is shared among all the members of a given project. Due to the inode quota enforced for this machine, a shared installation of WarpX is advised.
Use the following commands to download the WarpX source code:
# If you have multiple projects, activate the project that you want to use with:
#
# myproject -a YOUR_PROJECT_NAME
#
git clone https://github.com/ECP-WarpX/WarpX.git $SHAREDHOMEDIR/src/warpx
We use system software modules, add environment hints and further dependencies via the file $SHAREDHOMEDIR/adastra_warpx.profile. Create it now:
cp $SHAREDHOMEDIR/src/warpx/Tools/machines/adastra-cines/adastra_warpx.profile.example $SHAREDHOMEDIR/adastra_warpx.profile
Edit the 2nd line of this script, which sets the export proj="" variable, using a text editor such as nano, emacs, or vim (all available by default on Adastra login nodes), and uncomment the 3rd line (which sets $proj as the active project).
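For example, after editing, the 2nd line of $SHAREDHOMEDIR/adastra_warpx.profile could look like this (the project name is only a placeholder):
# placeholder project id, replace with your own
export proj="your_project_name"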
Important
Now, and as the first step on future logins to Adastra, activate these environment settings:
source $SHAREDHOMEDIR/adastra_warpx.profile
Finally, since Adastra does not yet provide software modules for some of our dependencies, install them once:
bash $SHAREDHOMEDIR/src/warpx/Tools/machines/adastra-cines/install_dependencies.sh
source $SHAREDHOMEDIR/sw/adastra/gpu/venvs/warpx-adastra/bin/activate
Compilation
Use the following cmake commands to compile the application executable:
cd $SHAREDHOMEDIR/src/warpx
rm -rf build_adastra
cmake -S . -B build_adastra -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_adastra -j 16
The WarpX application executables are now in $SHAREDHOMEDIR/src/warpx/build_adastra/bin/.
Additionally, the following commands will install WarpX as a Python module:
rm -rf build_adastra_py
cmake -S . -B build_adastra_py -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_adastra_py -j 16 --target pip_install
Now, you can submit Adastra compute jobs for WarpX Python (PICMI) scripts (example scripts). Or, you can use the WarpX executables to submit Adastra jobs (example inputs). For executables, you can reference their location in your job script.
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd $SHAREDHOMEDIR/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed, log out and back into the system and activate the now updated environment profile as usual. As a last step, clean the build directory with rm -rf $SHAREDHOMEDIR/src/warpx/build_adastra and rebuild WarpX.
Running
MI250X GPUs (2x64 GB)
In non-interactive runs:
$HOME/src/warpx/Tools/machines/adastra-cines/submit.sh
#!/bin/bash
#SBATCH --account=<account_to_charge>
#SBATCH --job-name=warpx
#SBATCH --constraint=MI250
#SBATCH --nodes=2
#SBATCH --exclusive
#SBATCH --output=%x-%j.out
#SBATCH --time=00:10:00
module purge
# A CrayPE environment version
module load cpe/23.12
# An architecture
module load craype-accel-amd-gfx90a craype-x86-trento
# A compiler to target the architecture
module load PrgEnv-cray
# Some architecture related libraries and tools
module load CCE-GPU-3.0.0
module load amd-mixed/5.2.3
date
module list
export MPICH_GPU_SUPPORT_ENABLED=1
# note
# this environment setting is currently needed to work-around a
# known issue with Libfabric
#export FI_MR_CACHE_MAX_COUNT=0 # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks # alternative cache monitor
# note
# On machines with similar architectures (Frontier, OLCF) these settings
# seem to prevent the following issue:
# OLCFDEV-1597: OFI Poll Failed UNDELIVERABLE Errors
# https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#olcfdev-1597-ofi-poll-failed-undeliverable-errors
export MPICH_SMP_SINGLE_COPY_MODE=NONE
export FI_CXI_RX_MATCH_MODE=software
# note
# this environment setting is needed to avoid that rocFFT writes a cache in
# the home directory, which does not scale.
export ROCFFT_RTC_CACHE_PATH=/dev/null
export OMP_NUM_THREADS=1
export WARPX_NMPI_PER_NODE=8
export TOTAL_NMPI=$(( ${SLURM_JOB_NUM_NODES} * ${WARPX_NMPI_PER_NODE} ))
srun -N${SLURM_JOB_NUM_NODES} -n${TOTAL_NMPI} --ntasks-per-node=${WARPX_NMPI_PER_NODE} \
--cpus-per-task=8 --threads-per-core=1 --gpu-bind=closest \
./warpx inputs > output.txt
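To use this script, copy it next to your inputs file, adjust the account (and, if needed, the executable and inputs names), and submit it with Slurm, for example:
# submit the edited job script
sbatch submit.sh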
Post-Processing
Note
TODO: Document any Jupyter or data services.
Known System Issues
Warning
May 16th, 2022: There is a caching bug in Libfabric that causes WarpX simulations to occasionally hang on more than 1 node.
As a work-around, please export the following environment variable in your job scripts until the issue is fixed:
#export FI_MR_CACHE_MAX_COUNT=0 # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks # alternative cache monitor
Warning
Sep 2nd, 2022: rocFFT in ROCm 5.1-5.3 tries to write to a cache in the home area by default. This does not scale, disable it via:
export ROCFFT_RTC_CACHE_PATH=/dev/null
Warning
January, 2023: We discovered a regression in AMD ROCm, leading to 2x slower current deposition (and other slowdowns) in ROCm 5.3 and 5.4. Reported to AMD and fixed for the next release of ROCm.
Stay with the ROCm 5.2 module to avoid it.
Crusher (OLCF)
The Crusher cluster is located at OLCF.
On Crusher, each compute node provides four AMD MI250X GPUs, each with two Graphics Compute Dies (GCDs) for a total of 8 GCDs per node. You can think of the 8 GCDs as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).
Introduction
If you are new to this system, please see the following resources:
Batch system: Slurm
- $HOME: per-user directory, use only for inputs, source and scripts; backed up; mounted as read-only on compute nodes, that means you cannot run in it (50 GB quota)
- $PROJWORK/$proj/: shared with all members of a project, purged every 90 days, Lustre (recommended)
- $MEMBERWORK/$proj/: single user, purged every 90 days, Lustre (usually smaller quota, 50 TB default quota)
- $WORLDWORK/$proj/: shared with all users, purged every 90 days, Lustre (50 TB default quota)
Note: the Orion Lustre filesystem on Frontier and Crusher, and the older Alpine GPFS filesystem on Summit, are not mounted on each other's machines. Use Globus to transfer data between them if needed.
Preparation
Use the following commands to download the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
We use system software modules, add environment hints and further dependencies via the file $HOME/crusher_warpx.profile. Create it now:
cp $HOME/src/warpx/Tools/machines/crusher-olcf/crusher_warpx.profile.example $HOME/crusher_warpx.profile
Edit the 2nd line of this script, which sets the export proj="" variable. For example, if you are a member of the project aph114, then run vi $HOME/crusher_warpx.profile. Enter the edit mode by typing i and edit line 2 to read:
export proj="aph114"
Exit the vi editor with Esc and then type :wq (write & quit).
Important
Now, and as the first step on future logins to Crusher, activate these environment settings:
source $HOME/crusher_warpx.profile
Finally, since Crusher does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/crusher-olcf/install_dependencies.sh
source $HOME/sw/crusher/gpu/venvs/warpx-crusher/bin/activate
Compilation
Use the following cmake commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build_crusher
cmake -S . -B build_crusher -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_crusher -j 16
The WarpX application executables are now in $HOME/src/warpx/build_crusher/bin/.
Additionally, the following commands will install WarpX as a Python module:
rm -rf build_crusher_py
cmake -S . -B build_crusher_py -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_crusher_py -j 16 --target pip_install
Now, you can submit Crusher compute jobs for WarpX Python (PICMI) scripts (example scripts).
Or, you can use the WarpX executables to submit Crusher jobs (example inputs).
For executables, you can reference their location in your job script or copy them to a location in $PROJWORK/$proj/.
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd $HOME/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed, log out and back into the system and activate the now updated environment profile as usual. As a last step, clean the build directory with rm -rf $HOME/src/warpx/build_crusher and rebuild WarpX.
Running
MI250X GPUs (2x64 GB)
After requesting an interactive node with the getNode alias above, run a simulation like this, here using 8 MPI ranks and a single node:
runNode ./warpx inputs
Or in non-interactive runs:
$HOME/src/warpx/Tools/machines/crusher-olcf/submit.sh
#!/usr/bin/env bash
#SBATCH -A <project id>
# note: WarpX ECP members use aph114
#SBATCH -J warpx
#SBATCH -o %x-%j.out
#SBATCH -t 00:10:00
#SBATCH -p batch
#SBATCH --ntasks-per-node=8
# Since 2022-12-29 Crusher is using a low-noise mode layout,
# making only 7 instead of 8 cores available per process
# https://docs.olcf.ornl.gov/systems/crusher_quick_start_guide.html#id6
#SBATCH --cpus-per-task=7
#SBATCH --gpus-per-task=1
#SBATCH --gpu-bind=closest
#SBATCH -N 1
# From the documentation:
# Each Crusher compute node consists of [1x] 64-core AMD EPYC 7A53
# "Optimized 3rd Gen EPYC" CPU (with 2 hardware threads per physical core) with
# access to 512 GB of DDR4 memory.
# Each node also contains [4x] AMD MI250X, each with 2 Graphics Compute Dies
# (GCDs) for a total of 8 GCDs per node. The programmer can think of the 8 GCDs
# as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).
# note (5-16-22, OLCFHELP-6888)
# this environment setting is currently needed on Crusher to work-around a
# known issue with Libfabric
#export FI_MR_CACHE_MAX_COUNT=0 # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks # alternative cache monitor
# Seen since August 2023 on Frontier, adapting the same for Crusher
# OLCFDEV-1597: OFI Poll Failed UNDELIVERABLE Errors
# https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#olcfdev-1597-ofi-poll-failed-undeliverable-errors
export MPICH_SMP_SINGLE_COPY_MODE=NONE
export FI_CXI_RX_MATCH_MODE=software
# note (9-2-22, OLCFDEV-1079)
# this environment setting is needed to avoid that rocFFT writes a cache in
# the home directory, which does not scale.
export ROCFFT_RTC_CACHE_PATH=/dev/null
export OMP_NUM_THREADS=1
export WARPX_NMPI_PER_NODE=8
export TOTAL_NMPI=$(( ${SLURM_JOB_NUM_NODES} * ${WARPX_NMPI_PER_NODE} ))
srun -N${SLURM_JOB_NUM_NODES} -n${TOTAL_NMPI} --ntasks-per-node=${WARPX_NMPI_PER_NODE} \
./warpx.3d inputs > output_${SLURM_JOBID}.txt
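To use this script, copy it next to your inputs file, fill in the project id (and, if needed, the executable and inputs names), and submit it with Slurm, for example:
# submit the edited job script
sbatch submit.sh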
Post-Processing
For post-processing, most users use Python via OLCF's Jupyter service (Docs).
Please follow the same guidance as for OLCF Summit post-processing.
Known System Issues
Note
Please see the Frontier Known System Issues due to the similarity of the two systems.
Frontier (OLCF)
The Frontier cluster is located at OLCF.
On Frontier, each compute node provides four AMD MI250X GPUs, each with two Graphics Compute Dies (GCDs) for a total of 8 GCDs per node. You can think of the 8 GCDs as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).
Introduction
If you are new to this system, please see the following resources:
Batch system: Slurm
- $HOME: per-user directory, use only for inputs, source and scripts; backed up; mounted as read-only on compute nodes, that means you cannot run in it (50 GB quota)
- $PROJWORK/$proj/: shared with all members of a project, purged every 90 days, Lustre (recommended)
- $MEMBERWORK/$proj/: single user, purged every 90 days, Lustre (usually smaller quota, 50 TB default quota)
- $WORLDWORK/$proj/: shared with all users, purged every 90 days, Lustre (50 TB default quota)
Note: the Orion Lustre filesystem on Frontier and the older Alpine GPFS filesystem on Summit are not mounted on each other's machines. Use Globus to transfer data between them if needed.
Preparation
Use the following commands to download the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
We use system software modules, add environment hints and further dependencies via the file $HOME/frontier_warpx.profile. Create it now:
cp $HOME/src/warpx/Tools/machines/frontier-olcf/frontier_warpx.profile.example $HOME/frontier_warpx.profile
Edit the 2nd line of this script, which sets the export proj="" variable.
For example, if you are a member of the project aph114, then run vi $HOME/frontier_warpx.profile.
Enter the edit mode by typing i and edit line 2 to read:
export proj="aph114"
Exit the vi editor with Esc and then type :wq (write & quit).
Important
Now, and as the first step on future logins to Frontier, activate these environment settings:
source $HOME/frontier_warpx.profile
Finally, since Frontier does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/frontier-olcf/install_dependencies.sh
source $HOME/sw/frontier/gpu/venvs/warpx-frontier/bin/activate
Compilation
Use the following cmake commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build_frontier
cmake -S . -B build_frontier -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_frontier -j 16
The WarpX application executables are now in $HOME/src/warpx/build_frontier/bin/
.
Additionally, the following commands will install WarpX as a Python module:
rm -rf build_frontier_py
cmake -S . -B build_frontier_py -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_frontier_py -j 16 --target pip_install
Now, you can submit Frontier compute jobs for WarpX Python (PICMI) scripts (example scripts).
Or, you can use the WarpX executables to submit Frontier jobs (example inputs).
For executables, you can reference their location in your job script or copy them to a location in $PROJWORK/$proj/.
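For example, a minimal sketch of copying the 3D executable to the project work directory (the warpx_runs directory name is just an illustration):
mkdir -p $PROJWORK/$proj/warpx_runs
cp $HOME/src/warpx/build_frontier/bin/warpx.3d $PROJWORK/$proj/warpx_runs/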
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd $HOME/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed, log out and back into the system and activate the now updated environment profile as usual.
As a last step, clean the build directory with rm -rf $HOME/src/warpx/build_frontier and rebuild WarpX.
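Put together, a typical update then looks like this (a sketch that simply repeats the commands from the sections above):
source $HOME/frontier_warpx.profile
cd $HOME/src/warpx
rm -rf build_frontier
cmake -S . -B build_frontier -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_frontier -j 16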
Running
MI250X GPUs (2x64 GB)
After requesting an interactive node with the getNode
alias above, run a simulation like this, here using 8 MPI ranks and a single node:
runNode ./warpx inputs
Or in non-interactive runs:
$HOME/src/warpx/Tools/machines/frontier-olcf/submit.sh:
#!/usr/bin/env bash
#SBATCH -A <project id>
#SBATCH -J warpx
#SBATCH -o %x-%j.out
#SBATCH -t 00:10:00
#SBATCH -p batch
#SBATCH --ntasks-per-node=8
# Due to Frontier's Low-Noise Mode Layout only 7 instead of 8 cores are available per process
# https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#low-noise-mode-layout
#SBATCH --cpus-per-task=7
#SBATCH --gpus-per-task=1
#SBATCH --gpu-bind=closest
#SBATCH -N 20
# load cray libs and ROCm libs
#export LD_LIBRARY_PATH=${CRAY_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH}
# From the documentation:
# Each Frontier compute node consists of [1x] 64-core AMD EPYC 7A53
# "Optimized 3rd Gen EPYC" CPU (with 2 hardware threads per physical core) with
# access to 512 GB of DDR4 memory.
# Each node also contains [4x] AMD MI250X, each with 2 Graphics Compute Dies
# (GCDs) for a total of 8 GCDs per node. The programmer can think of the 8 GCDs
# as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).
# note (5-16-22 and 7-12-22)
# this environment setting is currently needed on Frontier to work-around a
# known issue with Libfabric (both in the May and June PE)
#export FI_MR_CACHE_MAX_COUNT=0 # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks # alternative cache monitor
# Seen since August 2023
# OLCFDEV-1597: OFI Poll Failed UNDELIVERABLE Errors
# https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#olcfdev-1597-ofi-poll-failed-undeliverable-errors
export MPICH_SMP_SINGLE_COPY_MODE=NONE
export FI_CXI_RX_MATCH_MODE=software
# note (9-2-22, OLCFDEV-1079)
# this environment setting is needed to avoid that rocFFT writes a cache in
# the home directory, which does not scale.
export ROCFFT_RTC_CACHE_PATH=/dev/null
export OMP_NUM_THREADS=1
export WARPX_NMPI_PER_NODE=8
export TOTAL_NMPI=$(( ${SLURM_JOB_NUM_NODES} * ${WARPX_NMPI_PER_NODE} ))
srun -N${SLURM_JOB_NUM_NODES} -n${TOTAL_NMPI} --ntasks-per-node=${WARPX_NMPI_PER_NODE} \
./warpx inputs > output.txt
Post-Processing
For post-processing, most users use Python via OLCF's Jupyter service (Docs).
Please follow the same guidance as for OLCF Summit post-processing.
Known System Issues
Warning
May 16th, 2022 (OLCFHELP-6888): There is a caching bug in Libfabric that causes WarpX simulations to occasionally hang on Frontier on more than 1 node.
As a work-around, please export the following environment variable in your job scripts until the issue is fixed:
#export FI_MR_CACHE_MAX_COUNT=0 # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks # alternative cache monitor
Warning
Sep 2nd, 2022 (OLCFDEV-1079): rocFFT in ROCm 5.1-5.3 tries to write to a cache in the home area by default. This does not scale, disable it via:
export ROCFFT_RTC_CACHE_PATH=/dev/null
Warning
January, 2023 (OLCFDEV-1284, AMD Ticket: ORNLA-130): We discovered a regression in AMD ROCm, leading to 2x slower current deposition (and other slowdowns) in ROCm 5.3 and 5.4.
June, 2023: Although a fix was planned for ROCm 5.5, we still see the same issue in this release and continue to exchange with AMD and HPE on the issue.
Stay with the ROCm 5.2 module to avoid a 2x slowdown.
Warning
August, 2023 (OLCFDEV-1597, OLCFHELP-12850, OLCFHELP-14253):
With runs above 500 nodes, we observed issues in MPI_Waitall calls of the kind OFI Poll Failed UNDELIVERABLE.
According to the system known issues entry OLCFDEV-1597, we work around this by setting this environment variable in job scripts:
export MPICH_SMP_SINGLE_COPY_MODE=NONE
export FI_CXI_RX_MATCH_MODE=software
Warning
Checkpoints and I/O at scale seem to be slow with the default Lustre filesystem configuration.
Please test checkpointing and I/O with short #SBATCH -q debug runs before running the full simulation.
Execute lfs getstripe -d <dir> to show the default progressive file layout.
Consider using lfs setstripe to change the striping for new files before you submit the run.
mkdir /lustre/orion/proj-shared/<your-project>/<path/to/new/sim/dir>
cd <new/sim/dir/above>
# create your diagnostics directory first
mkdir diags
# change striping for new files before you submit the simulation
# this is an example, striping 10 MB blocks onto 32 nodes
lfs setstripe -S 10M -c 32 diags
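You can then verify the new striping of the diagnostics directory with the command mentioned above:
lfs getstripe -d diags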
Additionally, other AMReX users reported good performance for plotfile checkpoint/restart when using
amr.plot_nfiles = -1
amr.checkpoint_nfiles = -1
amrex.async_out_nfiles = 4096 # set to number of GPUs used
Fugaku (Riken)
The Fugaku cluster is located at the Riken Center for Computational Science (Japan).
Introduction
If you are new to this system, please see the following resources:
Preparation
Use the following commands to download the WarpX source code and switch to the correct branch:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
Compiling WarpX on Fugaku is more practical on a compute node. Use the following commands to acquire a compute node for two hours:
pjsub --interact -L "elapse=02:00:00" -L "node=1" --sparam "wait-time=300" --mpi "max-proc-per-node=48" --all-mount-gfscache
We use system software modules, add environment hints and further dependencies via the file $HOME/fugaku_warpx.profile
.
Create it now, modify it if needed, and source it (it will take a few minutes):
cp $HOME/src/warpx/Tools/machines/fugaku-riken/fugaku_warpx.profile.example $HOME/fugaku_warpx.profile
source $HOME/fugaku_warpx.profile
Finally, since Fugaku does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/fugaku-riken/install_dependencies.sh
Compilation
Use the following cmake commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build
export CC=$(which mpifcc)
export CXX=$(which mpiFCC)
export CFLAGS="-Nclang"
export CXXFLAGS="-Nclang"
cmake -S . -B build -DWarpX_COMPUTE=OMP \
-DWarpX_DIMS="1;2;3" \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_FLAGS_RELEASE="-Ofast" \
-DAMReX_DIFFERENT_COMPILER=ON \
-DWarpX_MPI_THREAD_MULTIPLE=OFF
cmake --build build -j 48
That’s it!
A 3D WarpX executable is now in build/bin/
and can be run with a 3D example inputs file.
Running
A64FX CPUs
In non-interactive runs, you can use pjsub submit.sh where submit.sh can be adapted from:
Tools/machines/fugaku-riken/submit.sh:
#!/bin/bash
#PJM -L "node=48"
#PJM -L "rscgrp=small"
#PJM -L "elapse=0:30:00"
#PJM -s
#PJM -L "freq=2200,eco_state=2"
#PJM --mpi "max-proc-per-node=12"
#PJM -x PJM_LLIO_GFSCACHE=/vol0004:/vol0003
#PJM --llio localtmp-size=10Gi
#PJM --llio sharedtmp-size=10Gi
export NODES=48
export MPI_RANKS=$((NODES * 12))
export OMP_NUM_THREADS=4
export EXE="./warpx"
export INPUT="i.3d"
export XOS_MMM_L_PAGING_POLICY=demand:demand:demand
# Add HDF5 library path to LD_LIBRARY_PATH
# This is done manually to avoid calling spack during the run,
# since this would take a significant amount of time.
export LD_LIBRARY_PATH=/vol0004/apps/oss/spack-v0.19/opt/spack/linux-rhel8-a64fx/fj-4.8.1/hdf5-1.12.2-im6lxevf76cu6cbzspi4itgz3l4gncjj/lib:$LD_LIBRARY_PATH
# Broadcast WarpX executable to all the nodes
llio_transfer ${EXE}
mpiexec -stdout-proc ./output.%j/%/1000r/stdout -stderr-proc ./output.%j/%/1000r/stderr -n ${MPI_RANKS} ${EXE} ${INPUT}
llio_transfer --purge ${EXE}
Note: the Boost Eco Mode that is set in this example increases the default frequency of the A64FX from 2 GHz to 2.2 GHz, while at the same time switching off one of the two floating-point arithmetic pipelines. Some preliminary tests with WarpX show that this mode achieves performance similar to that of the normal mode, while reducing the energy consumption by approximately 20%.
HPC3 (UCI)
The HPC3 supercomputer is located at University of California, Irvine.
Introduction
If you are new to this system, please see the following resources:
$HOME: per-user directory, use only for inputs, source and scripts; backed up (40GB)
/pub/$USER: per-user production directory; fast and larger storage for parallel jobs (1TB default quota)
/dfsX/<lab-path>: lab group quota (based on PI’s purchase allocation). The storage owner (PI) can specify what users have read/write capability on the specific filesystem.
Preparation
Use the following commands to download the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
On HPC3, we recommend running on the fast GPU nodes with V100 GPUs.
We use system software modules, add environment hints and further dependencies via the file $HOME/hpc3_gpu_warpx.profile
.
Create it now:
cp $HOME/src/warpx/Tools/machines/hpc3-uci/hpc3_gpu_warpx.profile.example $HOME/hpc3_gpu_warpx.profile
Edit the 2nd line of this script, which sets the export proj="" variable.
For example, if you are a member of the project plasma, then run vi $HOME/hpc3_gpu_warpx.profile.
Enter the edit mode by typing i and edit line 2 to read:
export proj="plasma"
Exit the vi editor with Esc and then type :wq (write & quit).
Important
Now, and as the first step on future logins to HPC3, activate these environment settings:
source $HOME/hpc3_gpu_warpx.profile
Finally, since HPC3 does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/hpc3-uci/install_gpu_dependencies.sh
source $HOME/sw/hpc3/gpu/venvs/warpx-gpu/bin/activate
Compilation
Use the following cmake commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build
cmake -S . -B build -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build -j 8
The WarpX application executables are now in $HOME/src/warpx/build/bin/
.
Additionally, the following commands will install WarpX as a Python module:
rm -rf build_py
cmake -S . -B build_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_py -j 8 --target pip_install
Now, you can submit HPC3 compute jobs for WarpX Python (PICMI) scripts (example scripts).
Or, you can use the WarpX executables to submit HPC3 jobs (example inputs).
For executables, you can reference their location in your job script or copy them to a location in /pub/$USER/.
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd $HOME/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed, log out and back into the system and activate the now updated environment profile as usual.
As a last step, clean the build directory with rm -rf $HOME/src/warpx/build and rebuild WarpX.
Running
The batch script below can be used to run a WarpX simulation on multiple nodes (change -N
accordingly) on the supercomputer HPC3 at UCI.
This partition has up to 32 nodes with four V100 GPUs (16 GB each) per node.
Replace descriptions between chevrons <>
by relevant values, for instance <proj>
could be plasma
.
Note that we run one MPI rank per GPU.
$HOME/src/warpx/Tools/machines/hpc3-uci/hpc3_gpu.sbatch:
#!/bin/bash -l
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Axel Huebl, Victor Flores
# License: BSD-3-Clause-LBNL
#SBATCH --time=00:30:00
#SBATCH --nodes=1
#SBATCH -J WarpX
#S BATCH -A <proj>
# V100 GPU options: gpu, free-gpu, debug-gpu
#SBATCH -p free-gpu
# use all four GPUs per node
#SBATCH --ntasks-per-node=4
#SBATCH --gres=gpu:V100:4
#SBATCH --cpus-per-task=10
#S BATCH --mail-type=begin,end
#S BATCH --mail-user=<your-email>@uci.edu
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j
# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx.rz
INPUTS=inputs_rz
# OpenMP threads
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
# run
mpirun -np ${SLURM_NTASKS} bash -c "
export CUDA_VISIBLE_DEVICES=\${SLURM_LOCALID};
${EXE} ${INPUTS}" \
> output.txt
To run a simulation, copy the lines above to a file hpc3_gpu.sbatch
and run
sbatch hpc3_gpu.sbatch
to submit the job.
Post-Processing
UCI provides a pre-configured Jupyter service that can be used for data-analysis.
We recommend installing at least the following pip packages for running Python3 Jupyter notebooks for WarpX data analysis:
h5py ipympl ipywidgets matplotlib numpy openpmd-viewer openpmd-api pandas scipy yt
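For instance, a minimal sketch of installing them into your user site (the --user flag is an assumption about your Jupyter setup; adjust it if you work in a virtual environment):
python3 -m pip install --user h5py ipympl ipywidgets matplotlib numpy openpmd-viewer openpmd-api pandas scipy yt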
Juwels (JSC)
Note
For the moment, WarpX doesn't run on Juwels with MPI_THREAD_MULTIPLE.
Please compile with the CMake option -DWarpX_MPI_THREAD_MULTIPLE=OFF (as in the compile command below).
The Juwels supercomputer is located at JSC.
Introduction
If you are new to this system, please see the following resources:
See this page for a quick introduction. (Full user guide).
Batch system: Slurm
$SCRATCH: Scratch filesystem for temporary data (90 day purge)
$FASTDATA/: Storage location for large data (backed up)
Note that the $HOME directory is not designed for simulation runs and producing output there will impact performance.
Installation
Use the following commands to download the WarpX source code and switch to the correct branch:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
We use the following modules and environments on the system.
Tools/machines/juwels-jsc/juwels_warpx.profile.example:
# please set your project account
#export proj=<yourProject>
# required dependencies
module load ccache
module load CMake
module load GCC
module load CUDA/11.3
module load OpenMPI
module load FFTW
module load HDF5
module load Python
# JUWELS' job scheduler may not map ranks to GPUs,
# so we give a hint to AMReX about the node layout.
# This is usually done in Make.<supercomputing center> files in AMReX
# but there is no such file for JSC yet.
export GPUS_PER_SOCKET=2
export GPUS_PER_NODE=4
# optimize CUDA compilation for V100 (7.0) or for A100 (8.0)
export AMREX_CUDA_ARCH=8.0
Note that, for now, WarpX must rely on OpenMPI instead of MVAPICH2, the MPI implementation recommended on this platform.
We recommend storing the above lines in a file, such as $HOME/juwels_warpx.profile, and loading it into your shell after a login:
source $HOME/juwels_warpx.profile
Then, cd
into the directory $HOME/src/warpx
and use the following commands to compile:
cd $HOME/src/warpx
rm -rf build
cmake -S . -B build -DWarpX_DIMS="1;2;3" -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_MPI_THREAD_MULTIPLE=OFF
cmake --build build -j 16
The other general compile-time options apply as usual.
That’s it!
A 3D WarpX executable is now in build/bin/
and can be run with a 3D example inputs file.
Most people execute the binary directly or copy it out to a location in $SCRATCH
.
Note
Currently, if you want to use HDF5 output with openPMD, you need to add
export OMPI_MCA_io=romio321
in your job scripts, before running the srun
command.
Running
Queue: gpus (4 x Nvidia V100 GPUs)
The Juwels GPUs are V100 (16GB) and A100 (40GB).
An example submission script reads
Tools/machines/juwels-jsc/juwels.sbatch:
#!/bin/bash -l
#SBATCH -A $proj
#SBATCH --partition=booster
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --ntasks-per-node=4
#SBATCH --gres=gpu:4
#SBATCH --time=00:05:00
#SBATCH --job-name=warpx
#SBATCH --output=warpx-%j-%N.txt
#SBATCH --error=warpx-%j-%N.err
export OMP_NUM_THREADS=1
export OMPI_MCA_io=romio321 # for HDF5 support in openPMD
# you can comment this out if you sourced the warpx.profile
# files before running sbatch:
module load GCC
module load OpenMPI
module load CUDA/11.3
module load HDF5
module load Python
srun -n 8 --cpu_bind=sockets $HOME/src/warpx/build/bin/warpx.3d.MPI.CUDA.DP.OPMD.QED inputs
Queue: batch (2 x Intel Xeon Platinum 8168 CPUs, 24 Cores + 24 Hyperthreads/CPU)
todo
See the data analysis section for more information on how to visualize the simulation results.
Karolina (IT4I)
The Karolina cluster is located at IT4I, Technical University of Ostrava.
Introduction
If you are new to this system, please see the following resources:
Batch system: SLURM
Jupyter service: not provided/documented (yet)
$HOME: per-user directory, use only for inputs, source and scripts; backed up (25GB default quota)
/scratch/: production directory; very fast for parallel jobs (10TB default)
/mnt/proj<N>/<proj>: per-project work directory, used for long term data storage (20TB default)
Installation
We show how to install from scratch all the dependencies using Spack.
For size reasons it is not advisable to install WarpX in the $HOME directory; it should be installed in the “work directory” instead. For this purpose we set an environment variable $WORK with the path to the “work directory”.
On Karolina, you can run either on GPU nodes with fast A100 GPUs (recommended) or CPU nodes.
Profile file
One can use the pre-prepared karolina_warpx.profile script below, which you can copy to ${HOME}/karolina_warpx.profile, edit as required and then source.
To have the environment activated on every login, add the following line to ${HOME}/.bashrc
:
source $HOME/karolina_warpx.profile
To install the spack
environment and Python packages:
bash $WORK/src/warpx/Tools/machines/karolina-it4i/install_dependencies.sh
Compilation
Use the following cmake commands to compile the application executable:
cd $WORK/src/warpx
rm -rf build_gpu
cmake -S . -B build_gpu -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_gpu -j 48
The WarpX application executables are now in $WORK/src/warpx/build_gpu/bin/
.
Additionally, the following commands will install WarpX as a Python module:
cd $WORK/src/warpx
rm -rf build_gpu_py
cmake -S . -B build_gpu_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_gpu_py -j 48 --target pip_install
Now, you can submit Karolina compute jobs for WarpX Python (PICMI) scripts (example scripts).
Or, you can use the WarpX executables to submit Karolina jobs (example inputs).
For executables, you can reference their location in your job script or copy them to a location in /scratch/
.
Running
The batch script below can be used to run a WarpX simulation on multiple GPU nodes (change #SBATCH --nodes=
accordingly) on the supercomputer Karolina at IT4I.
This partition has up to 72 nodes.
Every node has 8x A100 (40GB) GPUs and 2x AMD EPYC 7763, 64-core, 2.45 GHz processors.
Replace descriptions between chevrons <>
by relevant values, for instance <proj>
could be DD-23-83
.
Note that we run one MPI rank per GPU.
$WORK/src/warpx/Tools/machines/karolina-it4i/karolina_gpu.sbatch:
#!/bin/bash -l
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Axel Huebl, Andrei Berceanu
# License: BSD-3-Clause-LBNL
#SBATCH --account=<proj>
#SBATCH --partition=qgpu
#SBATCH --time=00:10:00
#SBATCH --job-name=WarpX
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=16
#SBATCH --gpus-per-node=8
#SBATCH --gpu-bind=single:1
#SBATCH --mail-type=ALL
# change me!
#SBATCH --mail-user=someone@example.com
#SBATCH --chdir=/scratch/project/<proj>/it4i-<user>/runs/warpx
#SBATCH -o stdout_%j
#SBATCH -e stderr_%j
# OpenMP threads per MPI rank
export OMP_NUM_THREADS=16
export SRUN_CPUS_PER_TASK=16
# set user rights to u=rwx;g=r-x;o=---
umask 0027
# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx.rz
INPUTS=./inputs_rz
# run
srun -K1 ${EXE} ${INPUTS}
To run a simulation, copy the lines above to a file karolina_gpu.sbatch
and run
sbatch karolina_gpu.sbatch
to submit the job.
Post-Processing
Note
This section was not yet written. Usually, we document here how to use a Jupyter service.
Lassen (LLNL)
The Lassen V100 GPU cluster is located at LLNL.
Introduction
If you are new to this system, please see the following resources:
LLNL user account (login required)
Batch system: LSF
Jupyter service (documentation, login required)
/p/gpfs1/$(whoami): personal directory on the parallel filesystem
Note that the $HOME directory and the /usr/workspace/$(whoami) space are NFS mounted and not suitable for production quality data generation.
Login
Lassen is currently transitioning to RHEL8. During this transition, first SSH into lassen and then to the updated RHEL8/TOSS4 nodes:
ssh lassen.llnl.gov
ssh eatoss4
Approximately October/November 2023, the new software environment on these nodes will become the default.
Alternatively, stay on the current (TOSS3) login nodes:
ssh lassen.llnl.gov
Approximately October/November 2023, this partition will become TOSS4 (RHEL8) as well.
Preparation
Use the following commands to download the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git /usr/workspace/${USER}/lassen/src/warpx
On the RHEL8/TOSS4 nodes, we use system software modules, add environment hints and further dependencies via the file $HOME/lassen_v100_warpx.profile.
Create it now:
cp /usr/workspace/${USER}/lassen/src/warpx/Tools/machines/lassen-llnl/lassen_v100_warpx.profile.example $HOME/lassen_v100_warpx.profile
Edit the 2nd line of this script, which sets the export proj="" variable.
For example, if you are a member of the project nsldt, then run vi $HOME/lassen_v100_warpx.profile.
Enter the edit mode by typing i and edit line 2 to read:
export proj="nsldt"
Exit the vi editor with Esc and then type :wq (write & quit).
Important
Now, and as the first step on future logins to lassen, activate these environment settings:
source $HOME/lassen_v100_warpx.profile
On the TOSS3 nodes, we use system software modules, add environment hints and further dependencies via the file $HOME/lassen_v100_warpx_toss3.profile.
Create it now:
cp /usr/workspace/${USER}/lassen/src/warpx/Tools/machines/lassen-llnl/lassen_v100_warpx_toss3.profile.example $HOME/lassen_v100_warpx_toss3.profile
Edit the 2nd line of this script, which sets the export proj="" variable.
For example, if you are a member of the project nsldt, then run vi $HOME/lassen_v100_warpx_toss3.profile.
Enter the edit mode by typing i and edit line 2 to read:
export proj="nsldt"
Exit the vi editor with Esc and then type :wq (write & quit).
Important
Now, and as the first step on future logins to lassen, activate these environment settings:
source $HOME/lassen_v100_warpx_toss3.profile
Finally, since lassen does not yet provide software modules for some of our dependencies, install them once. On the RHEL8/TOSS4 nodes:
bash /usr/workspace/${USER}/lassen/src/warpx/Tools/machines/lassen-llnl/install_v100_dependencies.sh
source /usr/workspace/${USER}/lassen/gpu/venvs/warpx-lassen/bin/activate
On the TOSS3 nodes:
bash /usr/workspace/${USER}/lassen/src/warpx/Tools/machines/lassen-llnl/install_v100_dependencies_toss3.sh
source /usr/workspace/${USER}/lassen-toss3/gpu/venvs/warpx-lassen-toss3/bin/activate
Compilation
Use the following cmake commands to compile the application executable:
cd /usr/workspace/${USER}/lassen/src/warpx
rm -rf build_lassen
cmake -S . -B build_lassen -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_lassen -j 8
The WarpX application executables are now in /usr/workspace/${USER}/lassen/src/warpx/build_lassen/bin/
.
Additionally, the following commands will install WarpX as a Python module:
rm -rf build_lassen_py
cmake -S . -B build_lassen_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_lassen_py -j 8 --target pip_install
Now, you can submit lassen compute jobs for WarpX Python (PICMI) scripts (example scripts).
Or, you can use the WarpX executables to submit lassen jobs (example inputs).
For executables, you can reference their location in your job script or copy them to a location in $PROJWORK/$proj/
.
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd /usr/workspace/${USER}/lassen/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed, log out and back into the system and activate the now updated environment profile as usual.
As a last step, clean the build directory with rm -rf /usr/workspace/${USER}/lassen/src/warpx/build_lassen and rebuild WarpX.
Running
V100 GPUs (16GB)
The batch script below can be used to run a WarpX simulation on 2 nodes on the supercomputer Lassen at LLNL.
Replace descriptions between chevrons <>
by relevant values, for instance <input file>
could be plasma_mirror_inputs
.
Note that the only option so far is to run with one MPI rank per GPU.
Tools/machines/lassen-llnl/lassen_v100.bsub:
#!/bin/bash
# Copyright 2020-2023 Axel Huebl
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# Refs.:
# https://jsrunvisualizer.olcf.ornl.gov/?s4f0o11n6c7g1r11d1b1l0=
# https://hpc.llnl.gov/training/tutorials/using-lcs-sierra-system#quick16
#BSUB -G <allocation ID>
#BSUB -W 00:10
#BSUB -nnodes 2
#BSUB -alloc_flags smt4
#BSUB -J WarpX
#BSUB -o WarpXo.%J
#BSUB -e WarpXe.%J
# Work-around OpenMPI bug with chunked HDF5
# https://github.com/open-mpi/ompi/issues/7795
export OMPI_MCA_io=ompio
# Work-around for broken IBM "libcollectives" MPI_Allgatherv
# https://github.com/ECP-WarpX/WarpX/pull/2874
export OMPI_MCA_coll_ibm_skip_allgatherv=true
# ROMIO has a hint for GPFS named IBM_largeblock_io which optimizes I/O with operations on large blocks
export IBM_largeblock_io=true
# MPI-I/O: ROMIO hints for parallel HDF5 performance
export ROMIO_HINTS=./romio-hints
# number of hosts: unique node names minus batch node
NUM_HOSTS=$(( $(echo $LSB_HOSTS | tr ' ' '\n' | uniq | wc -l) - 1 ))
cat > romio-hints << EOL
romio_cb_write enable
romio_ds_write enable
cb_buffer_size 16777216
cb_nodes ${NUM_HOSTS}
EOL
# OpenMPI file locks are slow and not needed
# https://github.com/open-mpi/ompi/issues/10053
export OMPI_MCA_sharedfp=^lockedfile,individual
# HDF5: disable slow locks (promise not to open half-written files)
export HDF5_USE_FILE_LOCKING=FALSE
# OpenMP: 1 thread per MPI rank
export OMP_NUM_THREADS=1
# store out task host mapping: helps identify broken nodes at scale
jsrun -r 4 -a1 -g 1 -c 7 -e prepended hostname > task_host_mapping.txt
# run WarpX
jsrun -r 4 -a 1 -g 1 -c 7 -l GPU-CPU -d packed -b rs -e prepended -M "-gpu" <path/to/executable> <input file> > output.txt
To run a simulation, copy the lines above to a file lassen_v100.bsub
and run
bsub lassen_v100.bsub
to submit the job.
For a 3D simulation with a few (1-4) particles per cell using the FDTD Maxwell solver on V100 GPUs for a well load-balanced problem (in our case a laser wakefield acceleration simulation in a boosted frame in the quasi-linear regime), the following set of parameters provided good performance:
amr.max_grid_size=256 and amr.blocking_factor=128
One MPI rank per GPU (e.g., 4 MPI ranks for the 4 GPUs on each Lassen node)
Two `128x128x128` grids per GPU, or one `128x128x256` grid per GPU.
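As a minimal sketch, the corresponding lines in a WarpX inputs file would read (the amr.n_cell value is a hypothetical example domain for a single 4-GPU node; the two performance parameters are the ones quoted above):
amr.n_cell = 256 256 256
amr.max_grid_size = 256
amr.blocking_factor = 128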
Known System Issues
Warning
Feb 17th, 2022 (INC0278922):
The implementation of AllGatherv
in IBM’s MPI optimization library “libcollectives” is broken and leads to HDF5 crashes for multi-node runs.
Our batch script templates above apply this work-around before the call to jsrun
, which avoids the broken routines from IBM and trades them for an OpenMPI implementation of collectives:
export OMPI_MCA_coll_ibm_skip_allgatherv=true
As part of the same CORAL acquisition program, Lassen is very similar to the design of Summit (OLCF). Thus, when encountering new issues it is worth checking also the known Summit issues and work-arounds.
Lawrencium (LBNL)
The Lawrencium cluster is located at LBNL.
Introduction
If you are new to this system, please see the following resources:
Batch system: Slurm
/global/scratch/users/$USER/: production directory
/global/home/groups/$GROUP/: group production directory
/global/home/users/$USER: home directory (10 GB)
Installation
Use the following commands to download the WarpX source code and switch to the correct branch:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
We use the following modules and environments on the system ($HOME/lawrencium_warpx.profile
).
Tools/machines/lawrencium-lbnl/lawrencium_warpx.profile.example:
# please set your project account
#export proj="<yourProject>" # change me, e.g., ac_blast
# required dependencies
module load cmake/3.24.1
module load cuda/11.4
module load gcc/7.4.0
module load openmpi/4.0.1-gcc
# optional: for QED support with detailed tables
module load boost/1.70.0-gcc
# optional: for openPMD and PSATD+RZ support
module load hdf5/1.10.5-gcc-p
module load lapack/3.8.0-gcc
# CPU only:
#module load fftw/3.3.8-gcc
export CMAKE_PREFIX_PATH=$HOME/sw/v100/c-blosc-1.21.1:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=$HOME/sw/v100/adios2-2.8.3:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=$HOME/sw/v100/blaspp-master:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=$HOME/sw/v100/lapackpp-master:$CMAKE_PREFIX_PATH
export PATH=$HOME/sw/v100/adios2-2.8.3/bin:$PATH
# optional: CCache
#module load ccache # missing
# optional: for Python bindings or libEnsemble
module load python/3.8.8
if [ -d "$HOME/sw/v100/venvs/warpx" ]
then
source $HOME/sw/v100/venvs/warpx/bin/activate
fi
# an alias to request an interactive batch node for one hour
# for parallel execution, start on the batch node: srun <command>
alias getNode="salloc -N 1 -t 1:00:00 --qos=es_debug --partition=es1 --constraint=es1_v100 --gres=gpu:1 --cpus-per-task=4 -A $proj"
# an alias to run a command on a batch node for up to 30min
# usage: runNode <command>
alias runNode="srun -N 1 -t 1:00:00 --qos=es_debug --partition=es1 --constraint=es1_v100 --gres=gpu:1 --cpus-per-task=4 -A $proj"
# optimize CUDA compilation for 1080 Ti (deprecated)
#export AMREX_CUDA_ARCH=6.1
# optimize CUDA compilation for V100
export AMREX_CUDA_ARCH=7.0
# optimize CUDA compilation for 2080 Ti
#export AMREX_CUDA_ARCH=7.5
# compiler environment hints
export CXX=$(which g++)
export CC=$(which gcc)
export FC=$(which gfortran)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=${CXX}
We recommend storing the above lines in a file, such as $HOME/lawrencium_warpx.profile, and loading it into your shell after a login:
source $HOME/lawrencium_warpx.profile
And since Lawrencium does not yet provide a module for them, install ADIOS2, BLAS++ and LAPACK++:
# c-blosc (I/O compression)
git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git src/c-blosc
rm -rf src/c-blosc-v100-build
cmake -S src/c-blosc -B src/c-blosc-v100-build -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=$HOME/sw/v100/c-blosc-1.21.1
cmake --build src/c-blosc-v100-build --target install --parallel 12
# ADIOS2
git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git src/adios2
rm -rf src/adios2-v100-build
cmake -S src/adios2 -B src/adios2-v100-build -DADIOS2_USE_Blosc=ON -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=$HOME/sw/v100/adios2-2.8.3
cmake --build src/adios2-v100-build --target install -j 12
# BLAS++ (for PSATD+RZ)
git clone https://github.com/icl-utk-edu/blaspp.git src/blaspp
rm -rf src/blaspp-v100-build
cmake -S src/blaspp -B src/blaspp-v100-build -Duse_openmp=OFF -Dgpu_backend=cuda -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=$HOME/sw/v100/blaspp-master
cmake --build src/blaspp-v100-build --target install --parallel 12
# LAPACK++ (for PSATD+RZ)
git clone https://github.com/icl-utk-edu/lapackpp.git src/lapackpp
rm -rf src/lapackpp-v100-build
cmake -S src/lapackpp -B src/lapackpp-v100-build -DCMAKE_CXX_STANDARD=17 -Dgpu_backend=cuda -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=$HOME/sw/v100/lapackpp-master -Duse_cmake_find_lapack=ON -DBLAS_LIBRARIES=${LAPACK_DIR}/lib/libblas.a -DLAPACK_LIBRARIES=${LAPACK_DIR}/lib/liblapack.a
cmake --build src/lapackpp-v100-build --target install --parallel 12
Optionally, download and install Python packages for PICMI or dynamic ensemble optimizations (libEnsemble):
python3 -m pip install --user --upgrade pip
python3 -m pip install --user virtualenv
python3 -m pip cache purge
rm -rf $HOME/sw/v100/venvs/warpx
python3 -m venv $HOME/sw/v100/venvs/warpx
source $HOME/sw/v100/venvs/warpx/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
python3 -m pip install --upgrade mpi4py --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# optional: for libEnsemble
python3 -m pip install -r $HOME/src/warpx/Tools/LibEnsemble/requirements.txt
Then, cd
into the directory $HOME/src/warpx
and use the following commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build
cmake -S . -B build -DWarpX_DIMS="1;2;RZ;3" -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON
cmake --build build -j 12
The general cmake compile-time options apply as usual.
That’s it!
A 3D WarpX executable is now in build/bin/
and can be run with a 3D example inputs file.
Most people execute the binary directly or copy it out to a location in /global/scratch/users/$USER/
.
For a full PICMI install, follow the instructions for Python (PICMI) bindings:
# PICMI build
cd $HOME/src/warpx
# install or update dependencies
python3 -m pip install -r requirements.txt
# compile parallel PICMI interfaces in 3D, 2D, 1D and RZ
WARPX_MPI=ON WARPX_COMPUTE=CUDA WARPX_PSATD=ON BUILD_PARALLEL=12 python3 -m pip install --force-reinstall --no-deps -v .
Or, if you are developing, do a quick PICMI install of a single geometry (see: WarpX_DIMS) using:
# find dependencies & configure
cmake -S . -B build -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS=RZ
# build and then call "python3 -m pip install ..."
cmake --build build --target pip_install -j 12
Running
V100 GPUs (16 GB)
12 nodes, each with two NVIDIA V100 GPUs.
Tools/machines/lawrencium-lbnl/lawrencium_v100.sbatch:
#!/bin/bash -l
# Copyright 2023 The WarpX Community
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL
#SBATCH -t 00:10:00
#SBATCH -N 2
#SBATCH --job-name=WarpX
#SBATCH --account=<proj>
#SBATCH --qos=es_normal
# 2xV100 nodes
#SBATCH --partition=es1
#SBATCH --constraint=es1_v100
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j
#S BATCH --mail-type=all
#S BATCH --mail-user=yourmail@lbl.gov
# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=inputs_3d
srun ${EXE} ${INPUTS} \
> output_${SLURM_JOB_ID}.txt
To run a simulation, copy the lines above to a file lawrencium_v100.sbatch and run
sbatch lawrencium_v100.sbatch
2080 Ti GPUs (10 GB)
18 nodes, each with four NVIDIA 2080 Ti GPUs. These are most interesting if you run in single precision.
Use --constraint=es1_2080ti --cpus-per-task=2 in the above template to run on those nodes, as sketched below.
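That is, in lawrencium_v100.sbatch you would change these two lines (a sketch; the partition stays es1 and everything else is unchanged):
#SBATCH --constraint=es1_2080ti
#SBATCH --cpus-per-task=2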
Leonardo (CINECA)
The Leonardo cluster is hosted at CINECA.
On Leonardo, each one of the 3456 compute nodes features a custom Atos Bull Sequana XH21355 “Da Vinci” blade, composed of:
1 x CPU Intel Ice Lake Xeon 8358 32 cores 2.60 GHz
512 (8 x 64) GB RAM DDR4 3200 MHz
4 x NVidia custom Ampere A100 GPU 64GB HBM2
2 x NVidia HDR 2×100 GB/s cards
Introduction
If you are new to this system, please see the following resources:
Storage organization:
$HOME: permanent, backed up, user specific (50 GB quota)
$CINECA_SCRATCH: temporary, user specific, no backup, a large disk for the storage of run time data and files, automatic cleaning procedure of data older than 40 days
$PUBLIC: permanent, no backup (50 GB quota)
$WORK: permanent, project specific, no backup
Preparation
Use the following commands to download the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
We use system software modules, add environment hints and further dependencies via the file $HOME/leonardo_gpu_warpx.profile
.
Create it now:
cp $HOME/src/warpx/Tools/machines/leonardo-cineca/leonardo_gpu_warpx.profile.example $HOME/leonardo_gpu_warpx.profile
Important
Now, and as the first step on future logins to Leonardo, activate these environment settings:
source $HOME/leonardo_gpu_warpx.profile
Finally, since Leonardo does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/leonardo-cineca/install_gpu_dependencies.sh
source $HOME/sw/venvs/warpx/bin/activate
Compilation
Use the following cmake commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build_gpu
cmake -S . -B build_gpu -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_gpu -j 16
The WarpX application executables are now in $HOME/src/warpx/build_gpu/bin/
.
Additionally, the following commands will install WarpX as a Python module:
cd $HOME/src/warpx
rm -rf build_gpu_py
cmake -S . -B build_gpu_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_PYTHON=ON -DWarpX_APP=OFF -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_gpu_py -j 16 --target pip_install
Now, you can submit Leonardo compute jobs for WarpX Python (PICMI) scripts (example scripts).
Or, you can use the WarpX executables to submit Leonardo jobs (example inputs).
For executables, you can reference their location in your job script or copy them to a location in $CINECA_SCRATCH
.
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd $HOME/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed, log out and back into the system and activate the now updated environment profile as usual.
As a last step, clean the build directories with rm -rf $HOME/src/warpx/build_gpu* and rebuild WarpX.
Running
The batch script below can be used to run a WarpX simulation on multiple nodes on Leonardo.
Replace descriptions between chevrons <>
by relevant values.
Note that we run one MPI rank per GPU.
$HOME/src/warpx/Tools/machines/leonardo-cineca/job.sh:
#!/usr/bin/bash
#SBATCH --time=02:00:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --ntasks-per-socket=4
#SBATCH --cpus-per-task=8
#SBATCH --gpus-per-node=4
#SBATCH --gpus-per-task=1
#SBATCH --mem=494000
#SBATCH --partition=boost_usr_prod
#SBATCH --job-name=<job name>
#SBATCH --gres=gpu:4
#SBATCH --err=job.err
#SBATCH --out=job.out
#SBATCH --account=<project id>
#SBATCH --mail-type=ALL
#SBATCH --mail-user=<mail>
cd /leonardo_scratch/large/userexternal/<username>/<directory>
srun /leonardo/home/userexternal/<username>/src/warpx/build_gpu/bin/warpx.2d <input file> > output.txt
To run a simulation, copy the lines above to a file job.sh
and run
sbatch job.sh
to submit the job.
Post-Processing
For post-processing, activate the environment settings:
source $HOME/leonardo_gpu_warpx.profile
and run python scripts.
LUMI (CSC)
The LUMI cluster is located at CSC (Finland). Each node contains 4 AMD MI250X GPUs, each with 2 Graphics Compute Dies (GCDs) for a total of 8 GCDs per node. You can think of the 8 GCDs as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).
Introduction
If you are new to this system, please see the following resources:
Batch system: Slurm
$HOME: single user, intended to store user configuration files and personal data (20GB default quota)
/project/$proj: shared with all members of a project, purged at the end of a project (50 GB default quota)
/scratch/$proj: temporary storage, main storage to be used for disk I/O needs when running simulations on LUMI, purged every 90 days (50TB default quota)
Preparation
Use the following commands to download the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
We use system software modules, add environment hints and further dependencies via the file $HOME/lumi_warpx.profile
.
Create it now:
cp $HOME/src/warpx/Tools/machines/lumi-csc/lumi_warpx.profile.example $HOME/lumi_warpx.profile
Edit the 2nd line of this script, which sets the export proj="project_..." variable, using a text editor such as nano, emacs, or vim (all available by default on LUMI login nodes).
You can find out your project name by running lumi-ldap-userinfo on LUMI.
For example, if you are a member of the project project_465000559, then run nano $HOME/lumi_warpx.profile and edit line 2 to read:
export proj="project_465000559"
Exit the nano editor with Ctrl + O (save) and then Ctrl + X (exit).
Important
Now, and as the first step on future logins to LUMI, activate these environment settings:
source $HOME/lumi_warpx.profile
Finally, since LUMI does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/lumi-csc/install_dependencies.sh
source $HOME/sw/lumi/gpu/venvs/warpx-lumi/bin/activate
Compilation
Use the following cmake commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build_lumi
cmake -S . -B build_lumi -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_QED_TABLES_GEN_OMP=OFF -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_lumi -j 16
The WarpX application executables are now in $HOME/src/warpx/build_lumi/bin/
.
Additionally, the following commands will install WarpX as a Python module:
rm -rf build_lumi_py
cmake -S . -B build_lumi_py -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_QED_TABLES_GEN_OMP=OFF -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_lumi_py -j 16 --target pip_install
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd $HOME/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed, log out and back into the system and activate the now updated environment profile as usual.
As a last step, clean the build directory with rm -rf $HOME/src/warpx/build_lumi and rebuild WarpX.
Running
MI250X GPUs (2x64 GB)
The GPU partition on the supercomputer LUMI at CSC has up to 2978 nodes, each with 8 Graphics Compute Dies (GCDs). WarpX runs one MPI rank per Graphics Compute Die.
For interactive runs, simply use the aliases getNode or runNode ... (see the example below).
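For example, analogous to the Frontier example above, you could launch an executable and inputs file from the current directory with:
runNode ./warpx inputs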
The batch script below can be used to run a WarpX simulation on multiple nodes (change --nodes accordingly).
Replace descriptions between chevrons <> by relevant values, for instance <project id> or the concrete inputs file.
Copy the executable into the run directory, or adjust the executable and inputs file paths in the srun line of the script accordingly.
Tools/machines/lumi-csc/lumi.sbatch:
#!/bin/bash -l
#SBATCH -A <project id>
#SBATCH --job-name=warpx
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
#SBATCH --partition=standard-g
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --gpus-per-node=8
#SBATCH --time=00:10:00
date
# note (12-12-22)
# this environment setting is currently needed on LUMI to work-around a
# known issue with Libfabric
#export FI_MR_CACHE_MAX_COUNT=0 # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks # alternative cache monitor
# Seen since August 2023 on OLCF (not yet seen on LUMI?)
# OLCFDEV-1597: OFI Poll Failed UNDELIVERABLE Errors
# https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#olcfdev-1597-ofi-poll-failed-undeliverable-errors
#export MPICH_SMP_SINGLE_COPY_MODE=NONE
#export FI_CXI_RX_MATCH_MODE=software
# note (9-2-22, OLCFDEV-1079)
# this environment setting is needed to avoid that rocFFT writes a cache in
# the home directory, which does not scale.
export ROCFFT_RTC_CACHE_PATH=/dev/null
# Seen since August 2023
# OLCFDEV-1597: OFI Poll Failed UNDELIVERABLE Errors
# https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#olcfdev-1597-ofi-poll-failed-undeliverable-errors
export MPICH_SMP_SINGLE_COPY_MODE=NONE
export FI_CXI_RX_MATCH_MODE=software
# LUMI documentation suggests using the following wrapper script
# to set the ROCR_VISIBLE_DEVICES to the value of SLURM_LOCALID
# see https://docs.lumi-supercomputer.eu/runjobs/scheduled-jobs/lumig-job/
cat << EOF > select_gpu
#!/bin/bash
export ROCR_VISIBLE_DEVICES=\$SLURM_LOCALID
exec \$*
EOF
chmod +x ./select_gpu
sleep 1
# LUMI documentation suggests using the following CPU bind
# in order to have 6 threads per GPU (blosc compression in adios2 uses threads)
# see https://docs.lumi-supercomputer.eu/runjobs/scheduled-jobs/lumig-job/
#
# WARNING: the following CPU_BIND options don't work on the dev-g partition.
# If you want to run your simulation on dev-g, please comment them
# out and replace them with CPU_BIND="map_cpu:49,57,17,25,1,9,33,41"
#
CPU_BIND="mask_cpu:7e000000000000,7e00000000000000"
CPU_BIND="${CPU_BIND},7e0000,7e000000"
CPU_BIND="${CPU_BIND},7e,7e00"
CPU_BIND="${CPU_BIND},7e00000000,7e0000000000"
export OMP_NUM_THREADS=6
export MPICH_GPU_SUPPORT_ENABLED=1
srun --cpu-bind=${CPU_BIND} ./select_gpu ./warpx inputs | tee outputs.txt
rm -rf ./select_gpu
To run a simulation, copy the lines above to a file lumi.sbatch
and run
sbatch lumi.sbatch
to submit the job.
Post-Processing
Note
TODO: Document any Jupyter or data services.
Known System Issues
Warning
December 12th, 2022: There is a caching bug in libFabric that causes WarpX simulations to occasionally hang on LUMI on more than 1 node.
As a work-around, please export the following environment variable in your job scripts until the issue is fixed:
#export FI_MR_CACHE_MAX_COUNT=0 # libfabric disable caching
# or, less invasive:
export FI_MR_CACHE_MONITOR=memhooks # alternative cache monitor
Warning
January, 2023: We discovered a regression in AMD ROCm, leading to 2x slower current deposition (and other slowdowns) in ROCm 5.3 and 5.4.
June, 2023: Although a fix was planned for ROCm 5.5, we still see the same issue in this release and continue to exchange with AMD and HPE on the issue.
Stay with the ROCm 5.2 module to avoid a 2x slowdown.
Warning
May 2023: rocFFT in ROCm 5.1-5.3 tries to write to a cache in the home area by default. This does not scale, disable it via:
export ROCFFT_RTC_CACHE_PATH=/dev/null
LXPLUS (CERN)
The LXPLUS cluster is located at CERN.
Introduction
If you are new to this system, please see the following resources:
Batch system: HTCondor
Filesystem locations:
User folder: /afs/cern.ch/user/<a>/<account> (10GByte)
Work folder: /afs/cern.ch/work/<a>/<account> (100GByte)
EOS storage: /eos/home-<a>/<account> (1TByte)
Through LXPLUS we have access to CPU and GPU nodes (the latter equipped with NVIDIA V100 and T4 GPUs).
Installation
Only very little software is pre-installed on LXPLUS so we show how to install from scratch all the dependencies using Spack.
For size reasons it is not advisable to install WarpX in the $HOME directory; instead, it should be installed in the “work directory”. For this purpose we set an environment variable with the path to the “work directory”:
export WORK=/afs/cern.ch/work/${USER:0:1}/$USER/
We clone WarpX in $WORK
:
cd $WORK
git clone https://github.com/ECP-WarpX/WarpX.git warpx
Installation profile file
The easiest way to install the dependencies is to use the pre-prepared warpx.profile
as follows:
cp $WORK/warpx/Tools/machines/lxplus-cern/lxplus_warpx.profile.example $WORK/lxplus_warpx.profile
source $WORK/lxplus_warpx.profile
When doing this one can directly skip to the Building WarpX section.
To have the environment activated at every login it is then possible to add the following lines to the .bashrc
export WORK=/afs/cern.ch/work/${USER:0:1}/$USER/
source $WORK/lxplus_warpx.profile
GCC
The pre-installed GNU compiler is outdated so we need a more recent compiler. Here we use the gcc 11.2.0 from the LCG project, but other options are possible.
We activate it by doing
source /cvmfs/sft.cern.ch/lcg/releases/gcc/11.2.0-ad950/x86_64-centos7/setup.sh
In order to avoid using different compilers this line could be added directly into the $HOME/.bashrc
file.
Spack
We download and activate Spack in $WORK
:
cd $WORK
git clone -c feature.manyFiles=true https://github.com/spack/spack.git
source spack/share/spack/setup-env.sh
Now we add our gcc 11.2.0 compiler to spack:
spack compiler find /cvmfs/sft.cern.ch/lcg/releases/gcc/11.2.0-ad950/x86_64-centos7/bin
Installing the Dependencies
To install the dependencies we create a virtual environment, which we call warpx-lxplus
:
spack env create warpx-lxplus $WORK/warpx/Tools/machines/lxplus-cern/spack.yaml
spack env activate warpx-lxplus
spack install
If the GPU support or the Python bindings are not needed, it is possible to skip their installation by setting the environment variables export SPACK_STACK_USE_CUDA=0 or export SPACK_STACK_USE_PYTHON=0, respectively, before running the previous commands.
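For example, to build the stack without CUDA support, a sketch of the workflow described above would be:
export SPACK_STACK_USE_CUDA=0
spack env create warpx-lxplus $WORK/warpx/Tools/machines/lxplus-cern/spack.yaml
spack env activate warpx-lxplus
spack install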
After the installation is done once, all we need to do in future sessions is just activate
the environment again:
spack env activate warpx-lxplus
The environment warpx-lxplus (or -cuda or -cuda-py) must be reactivated every time that we log in, so it could be a good idea to add the following lines to the .bashrc:
source $WORK/spack/share/spack/setup-env.sh
spack env activate -d warpx-lxplus
cd $HOME
Building WarpX
We prepare and load the Spack software environment as above. Then we build WarpX:
cmake -S . -B build -DWarpX_DIMS="1;2;RZ;3"
cmake --build build -j 6
Or if we need to compile with CUDA:
cmake -S . -B build -DWarpX_COMPUTE=CUDA -DWarpX_DIMS="1;2;RZ;3"
cmake --build build -j 6
That’s it!
A 3D WarpX executable is now in build/bin/
and can be run with a 3D example inputs file.
Most people execute the binary directly or copy it out to a location in $WORK
.
Python Bindings
Here we assume that a Python interpreter has been set up as explained previously.
Now, ensure Python tooling is up-to-date:
python3 -m pip install -U pip
python3 -m pip install -U build packaging setuptools wheel
python3 -m pip install -U cmake
Then we compile WarpX as in the previous section (with or without CUDA) adding -DWarpX_PYTHON=ON
and then we install it into our Python:
cmake -S . -B build -DWarpX_COMPUTE=CUDA -DWarpX_DIMS="1;2;RZ;3" -DWarpX_APP=OFF -DWarpX_PYTHON=ON
cmake --build build --target pip_install -j 6
This builds and installs WarpX as a Python module for the geometries listed in WarpX_DIMS above.
Alternatively, if you like to build WarpX for all geometries at once, use:
BUILD_PARALLEL=6 python3 -m pip wheel .
python3 -m pip install pywarpx-*whl
Ookami (Stony Brook)
The Ookami cluster is located at Stony Brook University.
Introduction
If you are new to this system, please see the following resources:
Batch system: Slurm (see available queues)
/lustre/home/<netid> (30GByte, backed up)
/lustre/scratch/<netid> (14 day purge)
/lustre/projects/<your_group>* (1TByte default, up to 8TB possible, shared within our group/project, backed up; prefer this location)
We use Ookami as a development cluster for A64FX.
The cluster also provides a few extra nodes, e.g. two Thunder X2 (ARM) nodes.
Installation
Use the following commands to download the WarpX source code and switch to the correct branch:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
We use the following modules and environments on the system ($HOME/warpx_gcc10.profile
).
Tools/machines/ookami-sbu/ookami_warpx.profile.example:
# please set your project account (not relevant yet)
#export proj=<yourProject>
# required dependencies
module load cmake/3.19.0
module load gcc/10.3.0
module load openmpi/gcc10/4.1.0
# optional: faster builds (not available yet)
#module load ccache
#module load ninja
# optional: for PSATD support (not available yet)
#module load fftw
# optional: for QED lookup table generation support (not available yet)
#module load boost
# optional: for openPMD support
#module load adios2 # not available yet
#module load hdf5 # only serial
# compiler environment hints
export CC=$(which gcc)
export CXX=$(which g++)
export FC=$(which gfortran)
export CXXFLAGS="-mcpu=a64fx"
We recommend storing the above lines in a file, such as $HOME/warpx_gcc10.profile, and loading it into your shell after a login:
source $HOME/warpx_gcc10.profile
Then, cd
into the directory $HOME/src/warpx
and use the following commands to compile:
cd $HOME/src/warpx
rm -rf build
cmake -S . -B build -DWarpX_COMPUTE=OMP -DWarpX_DIMS="1;2;3"
cmake --build build -j 10
# or (currently better performance)
cmake -S . -B build -DWarpX_COMPUTE=NOACC -DWarpX_DIMS="1;2;3"
cmake --build build -j 10
The general cmake compile-time options apply as usual.
That’s it!
A 3D WarpX executable is now in build/bin/
and can be run with a 3D example inputs file.
Most people execute the binary directly or copy it out to a location in /lustre/scratch/<netid>
.
Running
For running on 48 cores of a single node:
srun -p short -N 1 -n 48 --pty bash
OMP_NUM_THREADS=1 mpiexec -n 48 --map-by ppr:12:numa:pe=1 --report-bindings ./warpx inputs
# alternatively, using 4 MPI ranks with each 12 threads on a single node:
OMP_NUM_THREADS=12 mpiexec -n 4 --map-by ppr:4:numa:pe=12 --report-bindings ./warpx inputs
The Ookami HPE Apollo 80 system has 174 A64FX compute nodes each with 32GB of high-bandwidth memory.
Additional Compilers
This section is just a note for developers. We compiled with the Fujitsu Compiler (Clang) with the following build string:
cmake -S . -B build \
-DCMAKE_C_COMPILER=$(which mpifcc) \
-DCMAKE_C_COMPILER_ID="Clang" \
-DCMAKE_C_COMPILER_VERSION=12.0 \
-DCMAKE_C_STANDARD_COMPUTED_DEFAULT="11" \
-DCMAKE_CXX_COMPILER=$(which mpiFCC) \
-DCMAKE_CXX_COMPILER_ID="Clang" \
-DCMAKE_CXX_COMPILER_VERSION=12.0 \
-DCMAKE_CXX_STANDARD_COMPUTED_DEFAULT="14" \
-DCMAKE_CXX_FLAGS="-Nclang" \
-DAMReX_DIFFERENT_COMPILER=ON \
-DAMReX_MPI_THREAD_MULTIPLE=FALSE \
-DWarpX_COMPUTE=OMP
cmake --build build -j 10
Note that the best performance for A64FX is currently achieved with the GCC or ARM compilers.
Perlmutter (NERSC)
The Perlmutter cluster is located at NERSC.
Introduction
If you are new to this system, please see the following resources:
Batch system: Slurm
$HOME: per-user directory, use only for inputs, source and scripts; backed up (40GB)
${CFS}/m3239/: community file system for users in the project m3239 (or equivalent); moderate performance (20TB default)
$PSCRATCH: per-user production directory; very fast for parallel jobs; purged every 8 weeks (20TB default)
Preparation
Use the following commands to download the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
On Perlmutter, you can run either on GPU nodes with fast A100 GPUs (recommended) or CPU nodes.
We use system software modules, add environment hints and further dependencies via the file $HOME/perlmutter_gpu_warpx.profile
.
Create it now:
cp $HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_gpu_warpx.profile.example $HOME/perlmutter_gpu_warpx.profile
Edit the 2nd line of this script, which sets the export proj=""
variable.
Perlmutter GPU projects must end in ..._g
.
For example, if you are member of the project m3239
, then run nano $HOME/perlmutter_gpu_warpx.profile
and edit line 2 to read:
export proj="m3239_g"
Exit the nano
editor with Ctrl
+ O
(save) and then Ctrl
+ X
(exit).
Important
Now, and as the first step on future logins to Perlmutter, activate these environment settings:
source $HOME/perlmutter_gpu_warpx.profile
Finally, since Perlmutter does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/perlmutter-nersc/install_gpu_dependencies.sh
source ${CFS}/${proj%_g}/${USER}/sw/perlmutter/gpu/venvs/warpx-gpu/bin/activate
We use system software modules, add environment hints and further dependencies via the file $HOME/perlmutter_cpu_warpx.profile
.
Create it now:
cp $HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_cpu_warpx.profile.example $HOME/perlmutter_cpu_warpx.profile
Edit the 2nd line of this script, which sets the export proj=""
variable.
For example, if you are member of the project m3239
, then run nano $HOME/perlmutter_cpu_warpx.profile
and edit line 2 to read:
export proj="m3239"
Exit the nano
editor with Ctrl
+ O
(save) and then Ctrl
+ X
(exit).
Important
Now, and as the first step on future logins to Perlmutter, activate these environment settings:
source $HOME/perlmutter_cpu_warpx.profile
Finally, since Perlmutter does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/perlmutter-nersc/install_cpu_dependencies.sh
source ${CFS}/${proj}/${USER}/sw/perlmutter/cpu/venvs/warpx-cpu/bin/activate
Compilation
Use the following cmake commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build_pm_gpu
cmake -S . -B build_pm_gpu -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_gpu -j 16
The WarpX application executables are now in $HOME/src/warpx/build_pm_gpu/bin/
.
Additionally, the following commands will install WarpX as a Python module:
cd $HOME/src/warpx
rm -rf build_pm_gpu_py
cmake -S . -B build_pm_gpu_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_gpu_py -j 16 --target pip_install
cd $HOME/src/warpx
rm -rf build_pm_cpu
cmake -S . -B build_pm_cpu -DWarpX_COMPUTE=OMP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_cpu -j 16
The WarpX application executables are now in $HOME/src/warpx/build_pm_cpu/bin/
.
Additionally, the following commands will install WarpX as a Python module:
rm -rf build_pm_cpu_py
cmake -S . -B build_pm_cpu_py -DWarpX_COMPUTE=OMP -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_cpu_py -j 16 --target pip_install
Now, you can submit Perlmutter compute jobs for WarpX Python (PICMI) scripts (example scripts).
Or, you can use the WarpX executables to submit Perlmutter jobs (example inputs).
For executables, you can reference their location in your job script or copy them to a location in $PSCRATCH
.
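For illustration, a typical workflow is to stage a run directory in $PSCRATCH and submit from there (a minimal sketch; the directory name and <warpx_executable> are placeholders, adjust them to your build):
mkdir -p $PSCRATCH/<run_directory>
cd $PSCRATCH/<run_directory>
cp $HOME/src/warpx/build_pm_gpu/bin/<warpx_executable> ./warpx
cp $HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_gpu.sbatch .
# add your inputs file here, then submit:
sbatch perlmutter_gpu.sbatch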
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd $HOME/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed:
- update the perlmutter_gpu_warpx.profile or perlmutter_cpu_warpx.profile files,
- log out and back into the system, and activate the now updated environment profile as usual.
As a last step, clean the build directory (rm -rf $HOME/src/warpx/build_pm_*) and rebuild WarpX.
Running
The batch script below can be used to run a WarpX simulation on multiple nodes (change -N
accordingly) on the supercomputer Perlmutter at NERSC.
This partition has up to 1536 nodes.
Replace descriptions between chevrons <>
by relevant values, for instance <input file>
could be plasma_mirror_inputs
.
Note that we run one MPI rank per GPU.
$HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_gpu.sbatch
#!/bin/bash -l
# Copyright 2021-2023 Axel Huebl, Kevin Gott
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#SBATCH -t 00:10:00
#SBATCH -N 2
#SBATCH -J WarpX
# note: <proj> must end on _g
#SBATCH -A <proj>
#SBATCH -q regular
# A100 40GB (most nodes)
#SBATCH -C gpu
# A100 80GB (256 nodes)
#S BATCH -C gpu&hbm80g
#SBATCH --exclusive
# ideally single:1, but NERSC cgroups issue
#SBATCH --gpu-bind=none
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-node=4
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j
# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=inputs
# pin to closest NIC to GPU
export MPICH_OFI_NIC_POLICY=GPU
# threads for OpenMP and threaded compressors per MPI rank
# note: 16 avoids hyperthreading (32 virtual cores, 16 physical)
export SRUN_CPUS_PER_TASK=16
export OMP_NUM_THREADS=${SRUN_CPUS_PER_TASK}
# GPU-aware MPI optimizations
GPU_AWARE_MPI="amrex.use_gpu_aware_mpi=1"
# CUDA visible devices are ordered inverse to local task IDs
# Reference: nvidia-smi topo -m
srun --cpu-bind=cores bash -c "
export CUDA_VISIBLE_DEVICES=\$((3-SLURM_LOCALID));
${EXE} ${INPUTS} ${GPU_AWARE_MPI}" \
> output.txt
To run a simulation, copy the lines above to a file perlmutter_gpu.sbatch
and run
sbatch perlmutter_gpu.sbatch
to submit the job.
Perlmutter has 256 nodes that provide 80 GB HBM per A100 GPU.
In the A100 (40GB) batch script, replace -C gpu
with -C gpu&hbm80g
to use these large-memory GPUs.
The Perlmutter CPU partition has up to 3072 nodes, each with 2x AMD EPYC 7763 CPUs.
$HOME/src/warpx/Tools/machines/perlmutter-nersc/perlmutter_cpu.sbatch
#!/bin/bash -l
# Copyright 2021-2023 WarpX
#
# This file is part of WarpX.
#
# Authors: Axel Huebl
# License: BSD-3-Clause-LBNL
#SBATCH -t 00:10:00
#SBATCH -N 2
#SBATCH -J WarpX
#SBATCH -A <proj>
#SBATCH -q regular
#SBATCH -C cpu
#SBATCH --ntasks-per-node=16
#SBATCH --exclusive
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j
# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=inputs_small
# each CPU node on Perlmutter (NERSC) has 64 hardware cores with
# 2x Hyperthreading/SMP
# https://en.wikichip.org/wiki/amd/epyc/7763
# https://www.amd.com/en/products/cpu/amd-epyc-7763
# Each CPU is made up of 8 chiplets, each sharing 32MB L3 cache.
# This will be our MPI rank assignment (2x8 is 16 ranks/node).
# threads for OpenMP and threaded compressors per MPI rank
export SRUN_CPUS_PER_TASK=16 # 8 cores per chiplet, 2x SMP
export OMP_PLACES=threads
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=${SRUN_CPUS_PER_TASK}
srun --cpu-bind=cores \
${EXE} ${INPUTS} \
> output.txt
Post-Processing
For post-processing, most users use Python via NERSC’s Jupyter service (documentation).
As a one-time preparatory setup, log into Perlmutter via SSH and do not source the WarpX profile script above. Create your own Conda environment and Jupyter kernel for post-processing:
module load python
conda config --set auto_activate_base false
# create conda environment
rm -rf $HOME/.conda/envs/warpx-pm-postproc
conda create --yes -n warpx-pm-postproc -c conda-forge mamba conda-libmamba-solver
conda activate warpx-pm-postproc
conda config --set solver libmamba
mamba install --yes -c conda-forge python ipykernel ipympl matplotlib numpy pandas yt openpmd-viewer openpmd-api h5py fast-histogram dask dask-jobqueue pyarrow
# create Jupyter kernel
rm -rf $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/
python -m ipykernel install --user --name warpx-pm-postproc --display-name WarpX-PM-PostProcessing
echo -e '#!/bin/bash\nmodule load python\nsource activate warpx-pm-postproc\nexec "$@"' > $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel-helper.sh
chmod a+rx $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel-helper.sh
KERNEL_STR=$(jq '.argv |= ["{resource_dir}/kernel-helper.sh"] + .' $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel.json | jq '.argv[1] = "python"')
echo ${KERNEL_STR} | jq > $HOME/.local/share/jupyter/kernels/warpx-pm-postproc/kernel.json
exit
When opening a Jupyter notebook on https://jupyter.nersc.gov, just select WarpX-PM-PostProcessing
from the list of available kernels on the top right of the notebook.
Additional software can be installed later on, e.g., in a Jupyter cell using !mamba install -y -c conda-forge ...
.
Software that is not available via conda can be installed via !python -m pip install ...
.
Polaris (ALCF)
The Polaris cluster is located at ALCF.
Introduction
If you are new to this system, please see the following resources:
Batch system: PBS
Preparation
Use the following commands to download the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
On Polaris, you can run either on GPU nodes with fast A100 GPUs (recommended) or CPU nodes.
We use system software modules, add environment hints and further dependencies via the file $HOME/polaris_gpu_warpx.profile
.
Create it now:
cp $HOME/src/warpx/Tools/machines/polaris-alcf/polaris_gpu_warpx.profile.example $HOME/polaris_gpu_warpx.profile
Edit the 2nd line of this script, which sets the export proj=""
variable.
For example, if you are member of the project proj_name
, then run nano $HOME/polaris_gpu_warpx.profile
and edit line 2 to read:
export proj="proj_name"
Exit the nano
editor with Ctrl
+ O
(save) and then Ctrl
+ X
(exit).
Important
Now, and as the first step on future logins to Polaris, activate these environment settings:
source $HOME/polaris_gpu_warpx.profile
Finally, since Polaris does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/polaris-alcf/install_gpu_dependencies.sh
source ${CFS}/${proj%_g}/${USER}/sw/polaris/gpu/venvs/warpx/bin/activate
Under construction
Compilation
Use the following cmake commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build_pm_gpu
cmake -S . -B build_pm_gpu -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_gpu -j 16
The WarpX application executables are now in $HOME/src/warpx/build_pm_gpu/bin/
.
Additionally, the following commands will install WarpX as a Python module:
cd $HOME/src/warpx
rm -rf build_pm_gpu_py
cmake -S . -B build_pm_gpu_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_pm_gpu_py -j 16 --target pip_install
Under construction
Now, you can submit Polaris compute jobs for WarpX Python (PICMI) scripts (example scripts).
Or, you can use the WarpX executables to submit Polaris jobs (example inputs).
For executables, you can reference their location in your job script or copy them to a location in $PSCRATCH
.
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd $HOME/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed:
- update the polaris_gpu_warpx.profile or polaris_cpu_warpx.profile files,
- log out and back into the system, and activate the now updated environment profile as usual.
As a last step, clean the build directory (rm -rf $HOME/src/warpx/build_pm_*) and rebuild WarpX.
Running
The batch script below can be used to run a WarpX simulation on multiple nodes (change <NODES>
accordingly) on the supercomputer Polaris at ALCF.
Replace descriptions between chevrons <>
by relevant values, for instance <input file>
could be plasma_mirror_inputs
.
Note that we run one MPI rank per GPU.
$HOME/src/warpx/Tools/machines/polaris-alcf/polaris_gpu.pbs
#!/bin/bash -l
#PBS -A <proj>
#PBS -l select=<NODES>:system=polaris
#PBS -l place=scatter
#PBS -l walltime=0:10:00
#PBS -l filesystems=home:eagle
#PBS -q debug
#PBS -N test_warpx
# Set required environment variables
# support gpu-aware-mpi
# export MPICH_GPU_SUPPORT_ENABLED=1
# Change to working directory
echo Working directory is $PBS_O_WORKDIR
cd ${PBS_O_WORKDIR}
echo Jobid: $PBS_JOBID
echo Running on host `hostname`
echo Running on nodes `cat $PBS_NODEFILE`
# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=input1d
# MPI and OpenMP settings
NNODES=`wc -l < $PBS_NODEFILE`
NRANKS_PER_NODE=4
NDEPTH=1
NTHREADS=1
NTOTRANKS=$(( NNODES * NRANKS_PER_NODE ))
echo "NUM_OF_NODES= ${NNODES} TOTAL_NUM_RANKS= ${NTOTRANKS} RANKS_PER_NODE= ${NRANKS_PER_NODE} THREADS_PER_RANK= ${NTHREADS}"
mpiexec -np ${NTOTRANKS} ${EXE} ${INPUTS} > output.txt
To run a simulation, copy the lines above to a file polaris_gpu.pbs
and run
qsub polaris_gpu.pbs
to submit the job.
Under construction
Quartz (LLNL)
The Quartz Intel CPU cluster is located at LLNL.
Introduction
If you are new to this system, please see the following resources:
LLNL user account (login required)
Batch system: Slurm
Jupyter service (documentation, login required)
- /p/lustre1/$(whoami) and /p/lustre2/$(whoami): personal directories on the parallel filesystem
Note that the $HOME directory and the /usr/workspace/$(whoami) space are NFS mounted and not suitable for production-quality data generation.
Preparation
Use the following commands to download the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
We use system software modules, add environment hints and further dependencies via the file $HOME/quartz_warpx.profile
.
Create it now:
cp $HOME/src/warpx/Tools/machines/quartz-llnl/quartz_warpx.profile.example $HOME/quartz_warpx.profile
Edit the 2nd line of this script, which sets the export proj=""
variable.
For example, if you are member of the project tps
, then run vi $HOME/quartz_warpx.profile
.
Enter the edit mode by typing i
and edit line 2 to read:
export proj="tps"
Exit the vi
editor with Esc
and then type :wq
(write & quit).
Important
Now, and as the first step on future logins to Quartz, activate these environment settings:
source $HOME/quartz_warpx.profile
Finally, since Quartz does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/quartz-llnl/install_dependencies.sh
source /usr/workspace/${USER}/quartz/venvs/warpx-quartz/bin/activate
Compilation
Use the following cmake commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build_quartz
cmake -S . -B build_quartz -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_quartz -j 6
The WarpX application executables are now in $HOME/src/warpx/build_quartz/bin/
.
Additionally, the following commands will install WarpX as a Python module:
rm -rf build_quartz_py
cmake -S . -B build_quartz_py -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_quartz_py -j 6 --target pip_install
Now, you can submit Quartz compute jobs for WarpX Python (PICMI) scripts (example scripts).
Or, you can use the WarpX executables to submit Quartz jobs (example inputs).
For executables, you can reference their location in your job script or copy them to a location in $PROJWORK/$proj/
.
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd $HOME/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed, log out and back into the system and activate the now updated environment profile as usual.
As a last step, clean the build directory (rm -rf $HOME/src/warpx/build_quartz) and rebuild WarpX.
Running
Intel Xeon E5-2695 v4 CPUs
The batch script below can be used to run a WarpX simulation on 2 nodes on the supercomputer Quartz at LLNL.
Replace descriptions between chevrons <>
by relevant values, for instance <input file>
could be plasma_mirror_inputs
.
Tools/machines/quartz-llnl/quartz.sbatch
#!/bin/bash -l
# Just increase this number if you need more nodes.
#SBATCH -N 2
#SBATCH -t 24:00:00
#SBATCH -A <allocation ID>
#SBATCH -J WarpX
#SBATCH -q pbatch
#SBATCH --qos=normal
#SBATCH --license=lustre1,lustre2
#SBATCH --export=ALL
#SBATCH -e error.txt
#SBATCH -o output.txt
# one MPI rank per half-socket (see below)
#SBATCH --tasks-per-node=2
# request all logical (virtual) cores per half-socket
#SBATCH --cpus-per-task=18
# each Quartz node has 1 socket of Intel Xeon E5-2695 v4
# each Xeon CPU is divided into 2 bus rings that each have direct L3 access
export WARPX_NMPI_PER_NODE=2
# each MPI rank per half-socket has 9 physical cores
# or 18 logical (virtual) cores
# over-subscribing each physical core with 2x
# hyperthreading led to a slight (3.5%) speedup on Cori's Intel Xeon E5-2698 v3,
# so we do the same here
# the settings below make sure threads are close to the
# controlling MPI rank (process) per half socket and
# distribute equally over close-by physical cores and,
# for N>9, also equally over close-by logical cores
export OMP_PROC_BIND=spread
export OMP_PLACES=threads
export OMP_NUM_THREADS=18
EXE="<path/to/executable>" # e.g. ./warpx
srun --cpu_bind=cores -n $(( ${SLURM_JOB_NUM_NODES} * ${WARPX_NMPI_PER_NODE} )) ${EXE} <input file>
To run a simulation, copy the lines above to a file quartz.sbatch
and run
sbatch quartz.sbatch
to submit the job.
Spock (OLCF)
The Spock cluster is located at OLCF.
Introduction
If you are new to this system, please see the following resources:
Batch system: Slurm
- $PROJWORK/$proj/: shared with all members of a project (recommended)
- $MEMBERWORK/$proj/: single user (usually smaller quota)
- $WORLDWORK/$proj/: shared with all users
Note that the $HOME directory is mounted as read-only on compute nodes. That means you cannot run in your $HOME.
Installation
Use the following commands to download the WarpX source code and switch to the correct branch:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
We use the following modules and environments on the system ($HOME/spock_warpx.profile
).
Tools/machines/spock-olcf/spock_warpx.profile.example
# please set your project account
#export proj=<yourProject>
# required dependencies
module load cmake/3.20.2
module load craype-accel-amd-gfx908
module load rocm/4.3.0
# optional: faster builds
module load ccache
module load ninja
# optional: just an additional text editor
module load nano
# optional: an alias to request an interactive node for one hour
alias getNode="salloc -A $proj -J warpx -t 01:00:00 -p ecp -N 1"
# fix system defaults: do not escape $ with a \ on tab completion
shopt -s direxpand
# optimize ROCm/HIP compilation for MI100
export AMREX_AMD_ARCH=gfx908
# compiler environment hints
export CC=$ROCM_PATH/llvm/bin/clang
export CXX=$(which hipcc)
export LDFLAGS="-L${CRAYLIBS_X86_64} $(CC --cray-print-opts=libs) -lmpi"
# GPU aware MPI: ${PE_MPICH_GTL_DIR_gfx908} -lmpi_gtl_hsa
We recommend storing the above lines in a file, such as $HOME/spock_warpx.profile
, and load it into your shell after a login:
source $HOME/spock_warpx.profile
Then, cd
into the directory $HOME/src/warpx
and use the following commands to compile:
cd $HOME/src/warpx
rm -rf build
cmake -S . -B build -DWarpX_DIMS="1;2;3" -DWarpX_COMPUTE=HIP -DWarpX_PSATD=ON -DAMReX_AMD_ARCH=gfx908 -DMPI_CXX_COMPILER=$(which CC) -DMPI_C_COMPILER=$(which cc) -DMPI_COMPILER_FLAGS="--cray-print-opts=all"
cmake --build build -j 10
The general cmake compile-time options apply as usual.
That’s it!
A 3D WarpX executable is now in build/bin/
and can be run with a 3D example inputs file.
Most people execute the binary directly or copy it out to a location in $PROJWORK/$proj/
.
Running
MI100 GPUs (32 GB)
After requesting an interactive node with the getNode
alias above, run a simulation like this, here using 4 MPI ranks:
srun -n 4 -c 2 --ntasks-per-node=4 ./warpx inputs
Or in non-interactive runs started with sbatch
:
Tools/machines/spock-olcf/spock_mi100.sbatch
#!/bin/bash
#SBATCH -A <project id>
#SBATCH -J warpx
#SBATCH -o %x-%j.out
#SBATCH -t 00:10:00
#SBATCH -p ecp
#SBATCH -N 1
export OMP_NUM_THREADS=1
srun -n 4 -c 2 --ntasks-per-node=4 ./warpx inputs > output.txt
We can currently use up to 4
nodes with 4
GPUs each (maximum: -N 4 -n 16
).
Post-Processing
For post-processing, most users use Python via OLCF's Jupyter service (Docs).
Please follow the same guidance as for OLCF Summit post-processing.
Summit (OLCF)
The Summit cluster is located at OLCF.
On Summit, each compute node provides six V100 GPUs (16GB) and two Power9 CPUs.
Introduction
If you are new to this system, please see the following resources:
Batch system: LSF
- $HOME: per-user directory, use only for inputs, source and scripts; backed up; mounted as read-only on compute nodes, that means you cannot run in it (50 GB quota)
- $PROJWORK/$proj/: shared with all members of a project, purged every 90 days, GPFS (recommended)
- $MEMBERWORK/$proj/: single user, purged every 90 days, GPFS (usually smaller quota)
- $WORLDWORK/$proj/: shared with all users, purged every 90 days, GPFS
- /ccs/proj/$proj/: another, non-GPFS, file system for software and smaller data.
Note: the Alpine GPFS filesystem on Summit and the new Orion Lustre filesystem on Frontier are not mounted on each other's machines. Use Globus to transfer data between them if needed.
Preparation
Use the following commands to download the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
We use system software modules, add environment hints and further dependencies via the file $HOME/summit_warpx.profile
.
Create it now:
cp $HOME/src/warpx/Tools/machines/summit-olcf/summit_warpx.profile.example $HOME/summit_warpx.profile
Edit the 2nd line of this script, which sets the export proj=""
variable.
For example, if you are member of the project aph114
, then run vi $HOME/summit_warpx.profile
.
Enter the edit mode by typing i
and edit line 2 to read:
export proj="aph114"
Exit the vi
editor with Esc
and then type :wq
(write & quit).
Important
Now, and as the first step on future logins to Summit, activate these environment settings:
source $HOME/summit_warpx.profile
Finally, since Summit does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/summit-olcf/install_gpu_dependencies.sh
source /ccs/proj/$proj/${USER}/sw/summit/gpu/venvs/warpx-summit/bin/activate
Compilation
Use the following cmake commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build_summit
cmake -S . -B build_summit -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_summit -j 8
The WarpX application executables are now in $HOME/src/warpx/build_summit/bin/
.
Additionally, the following commands will install WarpX as a Python module:
rm -rf build_summit_py
cmake -S . -B build_summit_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_summit_py -j 8 --target pip_install
Now, you can submit Summit compute jobs for WarpX Python (PICMI) scripts (example scripts).
Or, you can use the WarpX executables to submit Summit jobs (example inputs).
For executables, you can reference their location in your job script or copy them to a location in $PROJWORK/$proj/
.
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd $HOME/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed, log out and back into the system and activate the now updated environment profile as usual.
As a last step, clean the build directory (rm -rf $HOME/src/warpx/build_summit) and rebuild WarpX.
Running
V100 GPUs (16GB)
The batch script below can be used to run a WarpX simulation on 2 nodes on
the supercomputer Summit at OLCF. Replace descriptions between chevrons <>
by relevant values, for instance <input file>
could be
plasma_mirror_inputs
.
Note that WarpX runs with one MPI rank per GPU and there are 6 GPUs per node:
Tools/machines/summit-olcf/summit_v100.bsub
#!/bin/bash
# Copyright 2019-2020 Maxence Thevenet, Axel Huebl
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# Refs.:
# https://jsrunvisualizer.olcf.ornl.gov/?s4f0o11n6c7g1r11d1b1l0=
# https://docs.olcf.ornl.gov/systems/summit_user_guide.html#cuda-aware-mpi
#BSUB -P <allocation ID>
#BSUB -W 00:10
#BSUB -nnodes 2
#BSUB -alloc_flags smt4
#BSUB -J WarpX
#BSUB -o WarpXo.%J
#BSUB -e WarpXe.%J
# make output group-readable by default
umask 0027
# fix problems with collectives since RHEL8 update: OLCFHELP-3545
# disable all the IBM optimized barriers and drop back to HCOLL or OMPI's barrier implementations
export OMPI_MCA_coll_ibm_skip_barrier=true
# libfabric 1.6+: limit the visible devices
# Needed for ADIOS2 SST staging/streaming workflows since RHEL8 update
# https://github.com/ornladios/ADIOS2/issues/2887
#export FABRIC_IFACE=mlx5_0 # ADIOS SST: select interface (1 NIC on Summit)
#export FI_OFI_RXM_USE_SRX=1 # libfabric: use shared receive context from MSG provider
# ROMIO has a hint for GPFS named IBM_largeblock_io which optimizes I/O with operations on large blocks
export IBM_largeblock_io=true
# MPI-I/O: ROMIO hints for parallel HDF5 performance
export OMPI_MCA_io=romio321
export ROMIO_HINTS=./romio-hints
# number of hosts: unique node names minus batch node
NUM_HOSTS=$(( $(echo $LSB_HOSTS | tr ' ' '\n' | uniq | wc -l) - 1 ))
cat > romio-hints << EOL
romio_cb_write enable
romio_ds_write enable
cb_buffer_size 16777216
cb_nodes ${NUM_HOSTS}
EOL
# OpenMP: 1 thread per MPI rank
export OMP_NUM_THREADS=1
# run WarpX
jsrun -r 6 -a 1 -g 1 -c 7 -l GPU-CPU -d packed -b rs --smpiargs="-gpu" <path/to/executable> <input file> > output.txt
To run a simulation, copy the lines above to a file summit_v100.bsub
and
run
bsub summit_v100.bsub
to submit the job.
For a 3D simulation with a few (1-4) particles per cell using FDTD Maxwell solver on Summit for a well load-balanced problem (in our case laser wakefield acceleration simulation in a boosted frame in the quasi-linear regime), the following set of parameters provided good performance:
- amr.max_grid_size=256 and amr.blocking_factor=128
- One MPI rank per GPU (e.g., 6 MPI ranks for the 6 GPUs on each Summit node)
- Two `128x128x128` grids per GPU, or one `128x128x256` grid per GPU.
A batch script with more options regarding profiling on Summit can be found at
Summit batch script
Power9 CPUs
Similar to above, the batch script below can be used to run a WarpX simulation on 1 node on the supercomputer Summit at OLCF, on Power9 CPUs (i.e., the GPUs are ignored).
Tools/machines/summit-olcf/summit_power9.bsub
#!/bin/bash
# Copyright 2019-2020 Maxence Thevenet, Axel Huebl, Michael Rowan
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# Refs.:
# https://jsrunvisualizer.olcf.ornl.gov/?s1f0o121n2c21g0r11d1b1l0=
#BSUB -P <allocation ID>
#BSUB -W 00:10
#BSUB -nnodes 1
#BSUB -alloc_flags "smt1"
#BSUB -J WarpX
#BSUB -o WarpXo.%J
#BSUB -e WarpXe.%J
# make output group-readable by default
umask 0027
# fix problems with collectives since RHEL8 update: OLCFHELP-3545
# disable all the IBM optimized barriers and drop back to HCOLL or OMPI's barrier implementations
export OMPI_MCA_coll_ibm_skip_barrier=true
# libfabric 1.6+: limit the visible devices
# Needed for ADIOS2 SST staging/streaming workflows since RHEL8 update
# https://github.com/ornladios/ADIOS2/issues/2887
#export FABRIC_IFACE=mlx5_0 # ADIOS SST: select interface (1 NIC on Summit)
#export FI_OFI_RXM_USE_SRX=1 # libfabric: use shared receive context from MSG provider
# ROMIO has a hint for GPFS named IBM_largeblock_io which optimizes I/O with operations on large blocks
export IBM_largeblock_io=true
# MPI-I/O: ROMIO hints for parallel HDF5 performance
export OMPI_MCA_io=romio321
export ROMIO_HINTS=./romio-hints
# number of hosts: unique node names minus batch node
NUM_HOSTS=$(( $(echo $LSB_HOSTS | tr ' ' '\n' | uniq | wc -l) - 1 ))
cat > romio-hints << EOL
romio_cb_write enable
romio_ds_write enable
cb_buffer_size 16777216
cb_nodes ${NUM_HOSTS}
EOL
# OpenMP: 21 threads per MPI rank
export OMP_NUM_THREADS=21
# run WarpX
jsrun -n 2 -a 1 -c 21 -r 2 -l CPU-CPU -d packed -b rs <path/to/executable> <input file> > output.txt
For a 3D simulation with a few (1-4) particles per cell using FDTD Maxwell solver on Summit for a well load-balanced problem, the following set of parameters provided good performance:
- amr.max_grid_size=64 and amr.blocking_factor=64
- Two MPI ranks per node (i.e. 2 resource sets per node; equivalently, 1 resource set per socket)
- 21 physical CPU cores per MPI rank
- 21 OpenMP threads per MPI rank (i.e. 1 OpenMP thread per physical core)
- SMT 1 (Simultaneous Multithreading level 1)
- Sixteen `64x64x64` grids per MPI rank (with default tiling in WarpX, this results in ~49 tiles per OpenMP thread)
I/O Performance Tuning
GPFS Large Block I/O
Setting IBM_largeblock_io
to true
disables data shipping, saving overhead when writing/reading large contiguous I/O chunks.
export IBM_largeblock_io=true
ROMIO MPI-IO Hints
You might notice some parallel HDF5 performance improvements on Summit by setting the appropriate ROMIO hints for MPI-IO operations.
export OMPI_MCA_io=romio321
export ROMIO_HINTS=./romio-hints
You can generate the romio-hints
by issuing the following command. Remember to change the number of cb_nodes
to match the number of compute nodes you are using (example here: 64
).
cat > romio-hints << EOL
romio_cb_write enable
romio_ds_write enable
cb_buffer_size 16777216
cb_nodes 64
EOL
The romio-hints
file contains pairs of key-value hints to enable and tune collective
buffering of MPI-IO operations. As Summit's Alpine file system uses a 16MB block size,
you should set the collective buffer size to 16MB and tune the number of aggregators
(cb_nodes
) to the number of compute nodes you are using, i.e., one aggregator per node.
Further details are available at Summit’s documentation page.
Known System Issues
Warning
Sep 16th, 2021 (OLCFHELP-3685): The Jupyter service cannot open HDF5 files without hanging, due to a filesystem mounting problem.
Please apply this work-around in a Jupyter cell before opening any HDF5 files for read:
import os
os.environ['HDF5_USE_FILE_LOCKING'] = "FALSE"
Warning
Aug 27th, 2021 (OLCFHELP-3442):
Created simulation files and directories are no longer accessible by your team members, even if you create them on $PROJWORK
.
Setting the proper “user mask” (umask
) does not yet work to fix this.
Please run the following commands after a simulation to fix this.
You can also append this to the end of your job scripts after the jsrun
line:
# cd your-simulation-directory
find . -type d -exec chmod g+rwx {} \;
find . -type f -exec chmod g+rw {} \;
Warning
Sep 3rd, 2021 (OLCFHELP-3545): The implementation of barriers in IBM’s MPI fork is broken and leads to crashes at scale. This is seen with runs using 200 nodes and above.
Our batch script templates above apply this work-around before the call to jsrun
, which avoids the broken routines from IBM and trades them for an OpenMPI implementation of collectives:
export OMPI_MCA_coll_ibm_skip_barrier=true
Warning
Sep 3rd, 2021 (OLCFHELP-3319):
If you are an active developer and compile middleware libraries (e.g., ADIOS2) yourself that use MPI and/or infiniband, be aware of libfabric
: IBM forks the open source version of this library and ships a patched version.
Avoid conflicts with mainline versions of this library in MPI that lead to crashes at runtime by loading alongside the system MPI module:
module load libfabric/1.12.1-sysrdma
For instance, if you compile large software stacks with Spack, make sure to register libfabric
with that exact version as an external module.
If you load the documented ADIOS2 module above, this problem does not affect you, since the correct libfabric
version is chosen for this one.
Warning
Related to the above issue, the fabric selection in ADIOS2 was designed for libfabric 1.6. With newer versions of libfabric, a workaround is needed to guide the selection of a functional fabric for RDMA support. Details are discussed in ADIOS2 issue #2887.
The following environment variables can be set as work-arounds, when working with ADIOS2 SST:
export FABRIC_IFACE=mlx5_0 # ADIOS SST: select interface (1 NIC on Summit)
export FI_OFI_RXM_USE_SRX=1 # libfabric: use shared receive context from MSG provider
Warning
Oct 12th, 2021 (OLCFHELP-4242): There is currently a problem with the pre-installed Jupyter extensions, which can lead to connection splits at long running analysis sessions.
Work-around this issue by running in a single Jupyter cell, before starting analysis:
!jupyter serverextension enable --py --sys-prefix dask_labextension
Post-Processing
For post-processing, most users use Python via OLCF's Jupyter service (Docs).
We usually just install our software on-the-fly on Summit. When starting up a post-processing session, run this in your first cells:
Note
The following software packages are installed only into a temporary directory.
# work-around for OLCFHELP-4242
!jupyter serverextension enable --py --sys-prefix dask_labextension
# next Jupyter cell: the software you want
!mamba install --quiet -c conda-forge -y openpmd-api openpmd-viewer ipympl ipywidgets fast-histogram yt
# restart notebook
Taurus (ZIH)
The Taurus cluster is located at ZIH (TU Dresden).
The cluster has multiple partitions; this section describes how to use the AMD Rome CPUs + NVIDIA A100 partition.
Introduction
If you are new to this system, please see the following resources:
Batch system: Slurm
Jupyter service: Missing?
- $PSCRATCH: per-user production directory, purged every 30 days (<TBD>TB)
- /global/cscratch1/sd/m3239: shared production directory for users in the project m3239, purged every 30 days (50TB)
- /global/cfs/cdirs/m3239/: community file system for users in the project m3239 (100TB)
Installation
Use the following commands to download the WarpX source code and switch to the correct branch:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
We use the following modules and environments on the system ($HOME/taurus_warpx.profile
).
Tools/machines/taurus-zih/taurus_warpx.profile.example
# please set your project account
#export proj="<yourProject>" # change me
# required dependencies
module load modenv/hiera
module load foss/2021b
module load CUDA/11.8.0
module load CMake/3.22.1
# optional: for QED support with detailed tables
#module load Boost # TODO
# optional: for openPMD and PSATD+RZ support
module load HDF5/1.13.1
# optional: for Python bindings or libEnsemble
#module load python # TODO
#
#if [ -d "$HOME/sw/taurus/venvs/warpx" ]
#then
# source $HOME/sw/taurus/venvs/warpx/bin/activate
#fi
# an alias to request an interactive batch node for two hours
# for parallel execution, start on the batch node: srun <command>
alias getNode="salloc --time=2:00:00 -N1 -n1 --cpus-per-task=6 --mem-per-cpu=2048 --gres=gpu:1 --gpu-bind=single:1 -p alpha-interactive --pty bash"
# an alias to run a command on a batch node for up to two hours
# usage: runNode <command>
alias runNode="srun --time=2:00:00 -N1 -n1 --cpus-per-task=6 --mem-per-cpu=2048 --gres=gpu:1 --gpu-bind=single:1 -p alpha-interactive --pty bash"
# optimize CUDA compilation for A100
export AMREX_CUDA_ARCH=8.0
# compiler environment hints
#export CC=$(which gcc)
#export CXX=$(which g++)
#export FC=$(which gfortran)
#export CUDACXX=$(which nvcc)
#export CUDAHOSTCXX=${CXX}
We recommend storing the above lines in a file, such as $HOME/taurus_warpx.profile
, and load it into your shell after a login:
source $HOME/taurus_warpx.profile
Then, cd
into the directory $HOME/src/warpx
and use the following commands to compile:
cd $HOME/src/warpx
rm -rf build
cmake -S . -B build -DWarpX_DIMS="1;2;3" -DWarpX_COMPUTE=CUDA
cmake --build build -j 16
The general cmake compile-time options apply as usual.
Running
A100 GPUs (40 GB)
The alpha partition has 34 nodes, each with 8 x NVIDIA A100-SXM4 Tensor Core GPUs and 2 x AMD EPYC 7352 CPUs (24 cores each) @ 2.3 GHz (multithreading disabled).
The batch script below can be used to run a WarpX simulation on multiple nodes (change -N
accordingly).
Replace descriptions between chevrons <>
by relevant values, for instance <input file>
could be plasma_mirror_inputs
.
Note that we run one MPI rank per GPU.
Tools/machines/taurus-zih/taurus.sbatch
#!/bin/bash -l
# Copyright 2023 Axel Huebl, Thomas Miethlinger
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#SBATCH -t 00:10:00
#SBATCH -N 1
#SBATCH -J WarpX
#SBATCH -p alpha
#SBATCH --exclusive
#SBATCH --cpus-per-task=6
#SBATCH --mem-per-cpu=2048
#SBATCH --gres=gpu:1
#SBATCH --gpu-bind=single:1
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j
# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=inputs_small
# run
srun ${EXE} ${INPUTS} \
> output.txt
To run a simulation, copy the lines above to a file taurus.sbatch
and run
sbatch taurus.sbatch
to submit the job.
Great Lakes (UMich)
The Great Lakes cluster is located at the University of Michigan. The cluster has various partitions, including GPU nodes and CPU nodes.
Introduction
If you are new to this system, please see the following resources:
Batch system: Slurm
- $HOME: per-user directory, use only for inputs, source and scripts; backed up (80GB)
- /scratch: per-project production directory; very fast for parallel jobs; purged every 60 days (10TB default)
Preparation
Use the following commands to download the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
On Great Lakes, you can run either on GPU nodes with fast V100 GPUs (recommended), the even faster A100 GPUs (only a few available) or CPU nodes.
We use system software modules, add environment hints and further dependencies via the file $HOME/greatlakes_v100_warpx.profile
.
Create it now:
cp $HOME/src/warpx/Tools/machines/greatlakes-umich/greatlakes_v100_warpx.profile.example $HOME/greatlakes_v100_warpx.profile
Edit the 2nd line of this script, which sets the export proj=""
variable.
For example, if you are member of the project iloveplasma
, then run nano $HOME/greatlakes_v100_warpx.profile
and edit line 2 to read:
export proj="iloveplasma"
Exit the nano
editor with Ctrl
+ O
(save) and then Ctrl
+ X
(exit).
Important
Now, and as the first step on future logins to Great Lakes, activate these environment settings:
source $HOME/greatlakes_v100_warpx.profile
Finally, since Great Lakes does not yet provide software modules for some of our dependencies, install them once:
bash $HOME/src/warpx/Tools/machines/greatlakes-umich/install_v100_dependencies.sh
source ${HOME}/sw/greatlakes/v100/venvs/warpx-v100/bin/activate
Note
This section is TODO.
Note
This section is TODO.
Compilation
Use the following cmake commands to compile the application executable:
cd $HOME/src/warpx
rm -rf build_v100
cmake -S . -B build_v100 -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_v100 -j 8
The WarpX application executables are now in $HOME/src/warpx/build_v100/bin/
.
Additionally, the following commands will install WarpX as a Python module:
cd $HOME/src/warpx
rm -rf build_v100_py
cmake -S . -B build_v100_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
cmake --build build_v100_py -j 8 --target pip_install
Note
This section is TODO.
Note
This section is TODO.
Now, you can submit Great Lakes compute jobs for WarpX Python (PICMI) scripts (example scripts).
Or, you can use the WarpX executables to submit Great Lakes jobs (example inputs).
For executables, you can reference their location in your job script or copy them to a location in /scratch
.
Update WarpX & Dependencies
If you already installed WarpX in the past and want to update it, start by getting the latest source code:
cd $HOME/src/warpx
# read the output of this command - does it look ok?
git status
# get the latest WarpX source code
git fetch
git pull
# read the output of these commands - do they look ok?
git status
git log # press q to exit
And, if needed, log out and back into the system and activate the now updated environment profile as usual.
As a last step, clean the build directory (rm -rf $HOME/src/warpx/build_*) and rebuild WarpX.
Running
The batch script below can be used to run a WarpX simulation on multiple nodes (change -N
accordingly) on the supercomputer Great Lakes at University of Michigan.
This partition has 20 nodes, each with two V100 GPUs.
Replace descriptions between chevrons <>
by relevant values, for instance <input file>
could be plasma_mirror_inputs
.
Note that we run one MPI rank per GPU.
$HOME/src/warpx/Tools/machines/greatlakes-umich/greatlakes_v100.sbatch
#!/bin/bash -l
# Copyright 2024 The WarpX Community
#
# Author: Axel Huebl
# License: BSD-3-Clause-LBNL
#SBATCH -t 00:10:00
#SBATCH -N 1
#SBATCH -J WarpX
#SBATCH -A <proj>
#SBATCH --partition=gpu
#SBATCH --exclusive
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=20
#SBATCH --gpus-per-task=v100:1
#SBATCH --gpu-bind=single:1
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j
# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=inputs
# threads for OpenMP and threaded compressors per MPI rank
# each node has 2x 2.4 GHz Intel Xeon Gold 6148 CPUs
# note: the system seems to only expose cores (20 per socket),
# not hyperthreads (40 per socket)
export SRUN_CPUS_PER_TASK=20
export OMP_NUM_THREADS=${SRUN_CPUS_PER_TASK}
# GPU-aware MPI optimizations
GPU_AWARE_MPI="amrex.use_gpu_aware_mpi=1"
# run WarpX
srun --cpu-bind=cores \
${EXE} ${INPUTS} ${GPU_AWARE_MPI} \
> output.txt
To run a simulation, copy the lines above to a file greatlakes_v100.sbatch
and run
sbatch greatlakes_v100.sbatch
to submit the job.
This partition has 2 nodes, each with four A100 GPUs that provide 80 GB HBM per A100 GPU. To the user, each node will appear as if it has 8 A100 GPUs with 40 GB memory each.
Note
This section is TODO.
The Great Lakes CPU partition has up to 455 nodes, each with 2x Intel Xeon Gold 6154 CPUs and 180 GB RAM.
Note
This section is TODO.
Post-Processing
For post-processing, many users prefer to use the online Jupyter service (documentation) that is directly connected to the cluster’s fast filesystem.
Note
This section is a stub and contributions are welcome. We can document further details, e.g., which recommended post-processing Python software to install or how to customize Jupyter kernels here.
Tip
Your HPC system is not in the list? Open an issue and together we can document it!
Batch Systems
HPC systems use a scheduling (“batch”) system for time sharing of computing resources. The batch system is used to request, queue, schedule and execute compute jobs asynchronously. The machine sections above include example job submission scripts that you can use as templates.
In this section, we provide a quick reference guide (cheat sheet) for interacting with the various batch systems that you might encounter on different systems.
Slurm
Slurm is a modern and very popular batch system. Slurm is used at NERSC, OLCF Frontier, among others.
Job Submission
sbatch your_job_script.sbatch
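For orientation, a minimal Slurm job script looks like the sketch below; account, partition, and resource flags vary by machine, so see the machine-specific templates above for production settings.
#!/bin/bash -l
#SBATCH -t 00:10:00
#SBATCH -N 1
#SBATCH -J WarpX
#SBATCH -A <proj>
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j
# executable & inputs file (or python interpreter & PICMI script) here
srun ./warpx inputs > output.txt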
Job Control
interactive job:
- salloc --time=1:00:00 --nodes=1 --ntasks-per-node=4 --cpus-per-task=8
  then run commands on the allocated node, e.g. srun "hostname"
  GPU allocation on most machines requires additional flags, e.g. --gpus-per-task=1 or --gres=...
details for my jobs:
- scontrol -d show job 12345 : all details for job with <job id> 12345
- squeue -u $(whoami) -l : all jobs under my user name
details for queues:
- squeue -p queueName -l : list full queue
- squeue -p queueName --start : show start times for pending jobs
- squeue -p queueName -l -t R : only show running jobs in queue
- sinfo -p queueName : show online/offline nodes in queue
- sview (alternative on Taurus: module load llview and llview)
- scontrol show partition queueName
communicate with job:
- scancel <job id> : abort job
- scancel -s <signal number> <job id> : send signal or signal name to job
- scontrol update timelimit=4:00:00 jobid=12345 : change the walltime of a job
- scontrol update jobid=12345 dependency=afterany:54321 : only start job 12345 after job with id 54321 has finished
- scontrol hold <job id> : prevent the job from starting
- scontrol release <job id> : release the job to be eligible for run (after it was set on hold)
References
LSF
LSF (for Load Sharing Facility) is an IBM batch system. It is used at OLCF Summit, LLNL Lassen, and other IBM systems.
Job Submission
bsub your_job_script.bsub
Job Control
interactive job:
- bsub -P $proj -W 2:00 -nnodes 1 -Is /bin/bash
details for my jobs:
- bjobs 12345 : all details for job with <job id> 12345
- bjobs [-l] : all jobs under my user name
- jobstat -u $(whoami) : job eligibility
- bjdepinfo 12345 : job dependencies on other jobs
details for queues:
- bqueues : list queues
communicate with job:
- bkill <job id> : abort job
- bpeek [-f] <job id> : peek into stdout/stderr of a job
- bkill -s <signal number> <job id> : send signal or signal name to job
- bchkpnt and brestart : checkpoint and restart job (untested/unimplemented)
- bmod -W 1:30 12345 : change the walltime of a job (currently not allowed)
- bstop <job id> : prevent the job from starting
- bresume <job id> : release the job to be eligible for run (after it was set on hold)
References
PBS
PBS (for Portable Batch System) is a popular HPC batch system. The OpenPBS project is related to PBS, PBS Pro and TORQUE.
Job Submission
qsub your_job_script.qsub
Job Control
interactive job:
- qsub -I
details for my jobs:
- qstat -f 12345 : all details for job with <job id> 12345
- qstat -u $(whoami) : all jobs under my user name
details for queues:
- qstat -a queueName : show all jobs in a queue
- pbs_free -l : compact view on free and busy nodes
- pbsnodes : list all nodes and their detailed state (free, busy/job-exclusive, offline)
communicate with job:
- qdel <job id> : abort job
- qsig -s <signal number> <job id> : send signal or signal name to job
- qalter -lwalltime=12:00:00 <job id> : change the walltime of a job
- qalter -Wdepend=afterany:54321 12345 : only start job 12345 after job with id 54321 has finished
- qhold <job id> : prevent the job from starting
- qrls <job id> : release the job to be eligible for run (after it was set on hold)
References
PJM
PJM (probably for Parallel Job Manager?) is a Fujitsu batch system. It is used at RIKEN Fugaku and on other Fujitsu systems.
Note
This section is a stub and improvements to complete the (TODO)
sections are welcome.
Job Submission
pjsub your_job_script.pjsub
Job Control
interactive job:
- pjsub --interact
details for my jobs:
- pjstat : status of all jobs
- (TODO) all details for job with <job id> 12345
- (TODO) all jobs under my user name
details for queues:
- (TODO) show all jobs in a queue
- (TODO) compact view on free and busy nodes
- (TODO) list all nodes and their detailed state (free, busy/job-exclusive, offline)
communicate with job:
- pjdel <job id> : abort job
- (TODO) send signal or signal name to job
- (TODO) change the walltime of a job
- (TODO) only start job 12345 after job with id 54321 has finished
- pjhold <job id> : prevent the job from starting
- pjrls <job id> : release the job to be eligible for run (after it was set on hold)
References
Usage
Run WarpX
In order to run a new simulation:
1. create a new directory, where the simulation will be run
2. make sure the WarpX executable is either copied into this directory or in your PATH environment variable
3. add an inputs file and, on HPC systems, a submission script to the directory
4. run
1. Run Directory
On Linux/macOS, this is as easy as:
mkdir -p <run_directory>
where <run_directory> should be replaced by the actual path to the run directory.
2. Executable
If you installed WarpX with a package manager, a warpx-prefixed executable will be available to you as a regular system command.
Depending on the chosen build options, the name is suffixed with more details.
Try it like this:
warpx<TAB>
Hitting the <TAB>
key will suggest available WarpX executables as found in your PATH
environment variable.
Note
WarpX needs separate binaries to run in dimensionality of 1D, 2D, 3D, and RZ. We encode the supported dimensionality in the binary file name.
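For illustration, the completion could list entries such as the following (a sketch only; the actual names depend on your install and chosen build options):
warpx.1d  warpx.2d  warpx.3d  warpx.rz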
If you compiled the code yourself, the WarpX executable is stored in the source folder under build/bin
.
We also create a symbolic link that is just called warpx
that points to the last executable you built, which can be copied, too.
Copy the executable to this directory:
cp build/bin/<warpx_executable> <run_directory>/
where <warpx_executable>
should be replaced by the actual name of the executable (see above) and <run_directory>
by the actual path to the run directory.
3. Inputs
Add an input file in the directory (see examples and parameters). This file contains the numerical and physical parameters that define the situation to be simulated.
On HPC systems, also copy and adjust a submission script that allocates computing nodes for you. Please reach out to us if you need help setting up a template that runs with ideal performance.
4. Run
Run the executable, e.g. with MPI:
cd <run_directory>
# run with an inputs file:
mpirun -np <n_ranks> ./warpx <input_file>
or
# run with a PICMI input script:
mpirun -np <n_ranks> python <python_script>
Here, <n_ranks>
is the number of MPI ranks used, and <input_file>
is the name of the input file (<python_script>
is the name of the PICMI script).
Note that the actual executable might have a longer name, depending on build options.
We used the copied executable in the current directory (./
); if you installed with a package manager, skip the ./
because WarpX is in your PATH
.
On an HPC system, you would instead submit the job script at this point, e.g. sbatch <submission_script>
(SLURM on Cori/NERSC) or bsub <submission_script>
(LSF on Summit/OLCF).
Tip
In the next sections, we will explain parameters of the <input_file>
.
You can overwrite all parameters inside this file also from the command line, e.g.:
mpirun -np 4 ./warpx <input_file> max_step=10 warpx.numprocs=1 2 2
5. Outputs
By default, WarpX will write a status update to the terminal (stdout
).
On HPC systems, we usually store a copy of this in a file called outputs.txt
.
We also store by default an exact copy of all explicitly and implicitly used inputs parameters in a file called warpx_used_inputs
(this file name can be changed).
This is important for reproducibility, since as we wrote in the previous paragraph, the options in the input file can be extended and overwritten from the command line.
Further configured diagnostics are explained in the next sections.
By default, they are written to a subdirectory in diags/
and can use various output formats.
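For example, after (or during) a run you can inspect these outputs from the run directory; a quick sketch using the default file names described above:
tail -f outputs.txt        # copy of the terminal status updates, if you redirected stdout there
less warpx_used_inputs     # exact record of all explicitly and implicitly used input parameters
ls diags/                  # one subdirectory per configured diagnostic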
Examples
This section allows you to download input files that correspond to different physical situations.
We provide two kinds of inputs:
- PICMI Python input files, with parameters described here.
- AMReX inputs files, with parameters described here.
For a complete list of all example input files, also have a look at our Examples/ directory. It contains folders and subfolders with self-describing names that you can try. All these input files are automatically tested, so they should always be up-to-date.
Plasma-Based Acceleration
Laser-Wakefield Acceleration of Electrons
This example shows how to model a laser-wakefield accelerator (LWFA) [2, 3].
Laser-wakefield acceleration is best performed in 3D or quasi-cylindrical (RZ) geometry, in order to correctly capture some of the key physics (laser diffraction, beamloading, shape of the accelerating bubble in the blowout regime, etc.). For physical situations that have close-to-cylindrical symmetry, simulations in RZ geometry capture the relevant physics at a fraction of the computational cost of a 3D simulation. On the other hand, for physical situation with strong asymmetries (e.g., non-round laser driver, strong hosing of the accelerated beam, etc.), only 3D simulations are suitable.
For LWFA scenarios with long propagation lengths, use the boosted frame method. An example can be seen in the PWFA example.
Run
For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
This example can be run either as a:
- Python script: python3 PICMI_inputs_3d.py, or
- WarpX executable using an input file: warpx.3d inputs_3d max_step=400
Examples/Physics_applications/laser_acceleration/PICMI_inputs_3d.py
#!/usr/bin/env python3
from pywarpx import picmi
# Physical constants
c = picmi.constants.c
q_e = picmi.constants.q_e
# Number of time steps
max_steps = 100
# Number of cells
nx = 32
ny = 32
nz = 256
# Physical domain
xmin = -30e-06
xmax = 30e-06
ymin = -30e-06
ymax = 30e-06
zmin = -56e-06
zmax = 12e-06
# Domain decomposition
max_grid_size = 64
blocking_factor = 32
# Create grid
grid = picmi.Cartesian3DGrid(
number_of_cells = [nx, ny, nz],
lower_bound = [xmin, ymin, zmin],
upper_bound = [xmax, ymax, zmax],
lower_boundary_conditions = ['periodic', 'periodic', 'dirichlet'],
upper_boundary_conditions = ['periodic', 'periodic', 'dirichlet'],
lower_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
upper_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
moving_window_velocity = [0., 0., c],
warpx_max_grid_size = max_grid_size,
warpx_blocking_factor = blocking_factor)
# Particles: plasma electrons
plasma_density = 2e23
plasma_xmin = -20e-06
plasma_ymin = -20e-06
plasma_zmin = 0
plasma_xmax = 20e-06
plasma_ymax = 20e-06
plasma_zmax = None
uniform_distribution = picmi.UniformDistribution(
density = plasma_density,
lower_bound = [plasma_xmin, plasma_ymin, plasma_zmin],
upper_bound = [plasma_xmax, plasma_ymax, plasma_zmax],
fill_in = True)
electrons = picmi.Species(
particle_type = 'electron',
name = 'electrons',
initial_distribution = uniform_distribution)
# Particles: beam electrons
q_tot = 1e-12
x_m = 0.
y_m = 0.
z_m = -28e-06
x_rms = 0.5e-06
y_rms = 0.5e-06
z_rms = 0.5e-06
ux_m = 0.
uy_m = 0.
uz_m = 500.
ux_th = 2.
uy_th = 2.
uz_th = 50.
gaussian_bunch_distribution = picmi.GaussianBunchDistribution(
n_physical_particles = q_tot / q_e,
rms_bunch_size = [x_rms, y_rms, z_rms],
rms_velocity = [c*ux_th, c*uy_th, c*uz_th],
centroid_position = [x_m, y_m, z_m],
centroid_velocity = [c*ux_m, c*uy_m, c*uz_m])
beam = picmi.Species(
particle_type = 'electron',
name = 'beam',
initial_distribution = gaussian_bunch_distribution)
# Laser
e_max = 16e12
position_z = 9e-06
profile_t_peak = 30.e-15
profile_focal_distance = 100e-06
laser = picmi.GaussianLaser(
wavelength = 0.8e-06,
waist = 5e-06,
duration = 15e-15,
focal_position = [0, 0, profile_focal_distance + position_z],
centroid_position = [0, 0, position_z - c*profile_t_peak],
propagation_direction = [0, 0, 1],
polarization_direction = [0, 1, 0],
E0 = e_max,
fill_in = False)
laser_antenna = picmi.LaserAntenna(
position = [0., 0., position_z],
normal_vector = [0, 0, 1])
# Electromagnetic solver
solver = picmi.ElectromagneticSolver(
grid = grid,
method = 'Yee',
cfl = 1.,
divE_cleaning = 0)
# Diagnostics
diag_field_list = ['B', 'E', 'J', 'rho']
particle_diag = picmi.ParticleDiagnostic(
name = 'diag1',
period = 100,
write_dir = '.',
warpx_file_prefix = 'Python_LaserAcceleration_plt')
field_diag = picmi.FieldDiagnostic(
name = 'diag1',
grid = grid,
period = 100,
data_list = diag_field_list,
write_dir = '.',
warpx_file_prefix = 'Python_LaserAcceleration_plt')
# Set up simulation
sim = picmi.Simulation(
solver = solver,
max_steps = max_steps,
verbose = 1,
particle_shape = 'cubic',
warpx_use_filter = 1,
warpx_serialize_initial_conditions = 1,
warpx_do_dynamic_scheduling = 0)
# Add plasma electrons
sim.add_species(
electrons,
layout = picmi.GriddedLayout(grid = grid, n_macroparticle_per_cell = [1, 1, 1]))
# Add beam electrons
sim.add_species(
beam,
layout = picmi.PseudoRandomLayout(grid = grid, n_macroparticles = 100))
# Add laser
sim.add_laser(
laser,
injection_method = laser_antenna)
# Add diagnostics
sim.add_diagnostic(particle_diag)
sim.add_diagnostic(field_diag)
# Write input file that can be used to run with the compiled version
sim.write_input_file(file_name = 'inputs_3d_picmi')
# Initialize inputs and WarpX instance
sim.initialize_inputs()
sim.initialize_warpx()
# Advance simulation until last time step
sim.step(max_steps)
Examples/Physics_applications/laser_acceleration/inputs_3d
#################################
####### GENERAL PARAMETERS ######
#################################
max_step = 100 # for production, run for longer time, e.g. max_step = 1000
amr.n_cell = 32 32 256 # for production, run with finer mesh, e.g. amr.n_cell = 64 64 512
amr.max_grid_size = 64 # maximum size of each AMReX box, used to decompose the domain
amr.blocking_factor = 32 # minimum size of each AMReX box, used to decompose the domain
geometry.dims = 3
geometry.prob_lo = -30.e-6 -30.e-6 -56.e-6 # physical domain
geometry.prob_hi = 30.e-6 30.e-6 12.e-6
amr.max_level = 0 # Maximum level in hierarchy (1 might be unstable, >1 is not supported)
# warpx.fine_tag_lo = -5.e-6 -5.e-6 -50.e-6
# warpx.fine_tag_hi = 5.e-6 5.e-6 -30.e-6
#################################
####### Boundary condition ######
#################################
boundary.field_lo = periodic periodic pec
boundary.field_hi = periodic periodic pec
#################################
############ NUMERICS ###########
#################################
warpx.verbose = 1
warpx.do_dive_cleaning = 0
warpx.use_filter = 1
warpx.cfl = 1. # if 1., the time step is set to its CFL limit
warpx.do_moving_window = 1
warpx.moving_window_dir = z
warpx.moving_window_v = 1.0 # units of speed of light
warpx.do_dynamic_scheduling = 0 # for production, set this to 1 (default)
warpx.serialize_initial_conditions = 1 # for production, set this to 0 (default)
# Order of particle shape factors
algo.particle_shape = 3
#################################
############ PLASMA #############
#################################
particles.species_names = electrons
electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 1 1 1
electrons.xmin = -20.e-6
electrons.xmax = 20.e-6
electrons.ymin = -20.e-6
electrons.ymax = 20.e-6
electrons.zmin = 0
electrons.profile = constant
electrons.density = 2.e23 # number of electrons per m^3
electrons.momentum_distribution_type = "at_rest"
electrons.do_continuous_injection = 1
electrons.addIntegerAttributes = regionofinterest
electrons.attribute.regionofinterest(x,y,z,ux,uy,uz,t) = "(z>12.0e-6) * (z<13.0e-6)"
electrons.addRealAttributes = initialenergy
electrons.attribute.initialenergy(x,y,z,ux,uy,uz,t) = " ux*ux + uy*uy + uz*uz"
#################################
############ LASER #############
#################################
lasers.names = laser1
laser1.profile = Gaussian
laser1.position = 0. 0. 9.e-6 # This point is on the laser plane
laser1.direction = 0. 0. 1. # The plane normal direction
laser1.polarization = 0. 1. 0. # The main polarization vector
laser1.e_max = 16.e12 # Maximum amplitude of the laser field (in V/m)
laser1.profile_waist = 5.e-6 # The waist of the laser (in m)
laser1.profile_duration = 15.e-15 # The duration of the laser (in s)
laser1.profile_t_peak = 30.e-15 # Time at which the laser reaches its peak (in s)
laser1.profile_focal_distance = 100.e-6 # Focal distance from the antenna (in m)
laser1.wavelength = 0.8e-6 # The wavelength of the laser (in m)
# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = 100
diag1.diag_type = Full
diag1.fields_to_plot = Ex Ey Ez Bx By Bz jx jy jz rho
diag1.format = openpmd
# Reduced Diagnostics
warpx.reduced_diags_names = FP
FP.type = FieldProbe
FP.intervals = 10
FP.integrate = 0
FP.probe_geometry = Line
FP.x_probe = 0
FP.y_probe = 0
FP.z_probe = -56e-6
FP.x1_probe = 0
FP.y1_probe = 0
FP.z1_probe = 12e-6
FP.resolution = 300
FP.do_moving_window_FP = 1
This example can be run either as:
Python script:
python3 PICMI_inputs_rz.py
or WarpX executable using an input file:
warpx.rz inputs_rz max_step=400
Examples/Physics_applications/laser_acceleration/PICMI_inputs_rz.py
#!/usr/bin/env python3
from pywarpx import picmi
# Physical constants
c = picmi.constants.c
q_e = picmi.constants.q_e
# Number of time steps
max_steps = 10
# Number of cells
nr = 64
nz = 512
# Physical domain
rmin = 0
rmax = 30e-06
zmin = -56e-06
zmax = 12e-06
# Domain decomposition
max_grid_size = 64
blocking_factor = 32
# Create grid
grid = picmi.CylindricalGrid(
number_of_cells = [nr, nz],
n_azimuthal_modes = 2,
lower_bound = [rmin, zmin],
upper_bound = [rmax, zmax],
lower_boundary_conditions = ['none', 'dirichlet'],
upper_boundary_conditions = ['dirichlet', 'dirichlet'],
lower_boundary_conditions_particles = ['absorbing', 'absorbing'],
upper_boundary_conditions_particles = ['absorbing', 'absorbing'],
moving_window_velocity = [0., c],
warpx_max_grid_size = max_grid_size,
warpx_blocking_factor = blocking_factor)
# Particles: plasma electrons
plasma_density = 2e23
plasma_xmin = -20e-06
plasma_ymin = None
plasma_zmin = 10e-06
plasma_xmax = 20e-06
plasma_ymax = None
plasma_zmax = None
uniform_distribution = picmi.UniformDistribution(
density = plasma_density,
lower_bound = [plasma_xmin, plasma_ymin, plasma_zmin],
upper_bound = [plasma_xmax, plasma_ymax, plasma_zmax],
fill_in = True)
electrons = picmi.Species(
particle_type = 'electron',
name = 'electrons',
initial_distribution = uniform_distribution)
# Particles: beam electrons
q_tot = 1e-12
x_m = 0.
y_m = 0.
z_m = -28e-06
x_rms = 0.5e-06
y_rms = 0.5e-06
z_rms = 0.5e-06
ux_m = 0.
uy_m = 0.
uz_m = 500.
ux_th = 2.
uy_th = 2.
uz_th = 50.
gaussian_bunch_distribution = picmi.GaussianBunchDistribution(
n_physical_particles = q_tot / q_e,
rms_bunch_size = [x_rms, y_rms, z_rms],
rms_velocity = [c*ux_th, c*uy_th, c*uz_th],
centroid_position = [x_m, y_m, z_m],
centroid_velocity = [c*ux_m, c*uy_m, c*uz_m])
beam = picmi.Species(
particle_type = 'electron',
name = 'beam',
initial_distribution = gaussian_bunch_distribution)
# Laser
e_max = 16e12
position_z = 9e-06
profile_t_peak = 30.e-15
profile_focal_distance = 100e-06
laser = picmi.GaussianLaser(
wavelength = 0.8e-06,
waist = 5e-06,
duration = 15e-15,
focal_position = [0, 0, profile_focal_distance + position_z],
centroid_position = [0, 0, position_z - c*profile_t_peak],
propagation_direction = [0, 0, 1],
polarization_direction = [0, 1, 0],
E0 = e_max,
fill_in = False)
laser_antenna = picmi.LaserAntenna(
position = [0., 0., position_z],
normal_vector = [0, 0, 1])
# Electromagnetic solver
solver = picmi.ElectromagneticSolver(
grid = grid,
method = 'Yee',
cfl = 1.,
divE_cleaning = 0)
# Diagnostics
diag_field_list = ['B', 'E', 'J', 'rho']
field_diag = picmi.FieldDiagnostic(
name = 'diag1',
grid = grid,
period = 10,
data_list = diag_field_list,
warpx_dump_rz_modes = 1,
write_dir = '.',
warpx_file_prefix = 'Python_LaserAccelerationRZ_plt')
diag_particle_list = ['weighting', 'momentum']
particle_diag = picmi.ParticleDiagnostic(
name = 'diag1',
period = 10,
species = [electrons, beam],
data_list = diag_particle_list,
write_dir = '.',
warpx_file_prefix = 'Python_LaserAccelerationRZ_plt')
# Set up simulation
sim = picmi.Simulation(
solver = solver,
max_steps = max_steps,
verbose = 1,
particle_shape = 'cubic',
warpx_use_filter = 0)
# Add plasma electrons
sim.add_species(
electrons,
layout = picmi.GriddedLayout(grid = grid, n_macroparticle_per_cell = [1, 4, 1]))
# Add beam electrons
sim.add_species(
beam,
layout = picmi.PseudoRandomLayout(grid = grid, n_macroparticles = 100))
# Add laser
sim.add_laser(
laser,
injection_method = laser_antenna)
# Add diagnostics
sim.add_diagnostic(field_diag)
sim.add_diagnostic(particle_diag)
# Write input file that can be used to run with the compiled version
sim.write_input_file(file_name = 'inputs_rz_picmi')
# Initialize inputs and WarpX instance
sim.initialize_inputs()
sim.initialize_warpx()
# Advance simulation until last time step
sim.step(max_steps)
Examples/Physics_applications/laser_acceleration/inputs_rz
#################################
####### GENERAL PARAMETERS ######
#################################
max_step = 10
amr.n_cell = 64 512
amr.max_grid_size = 64 # maximum size of each AMReX box, used to decompose the domain
amr.blocking_factor = 32 # minimum size of each AMReX box, used to decompose the domain
geometry.dims = RZ
geometry.prob_lo = 0. -56.e-6 # physical domain
geometry.prob_hi = 30.e-6 12.e-6
amr.max_level = 0 # Maximum level in hierarchy (1 might be unstable, >1 is not supported)
warpx.n_rz_azimuthal_modes = 2
boundary.field_lo = none pec
boundary.field_hi = pec pec
#################################
############ NUMERICS ###########
#################################
warpx.verbose = 1
warpx.do_dive_cleaning = 0
warpx.use_filter = 1
warpx.filter_npass_each_dir = 0 1
warpx.cfl = 1. # if 1., the time step is set to its CFL limit
warpx.do_moving_window = 1
warpx.moving_window_dir = z
warpx.moving_window_v = 1.0 # units of speed of light
# Order of particle shape factors
algo.particle_shape = 3
#################################
############ PLASMA #############
#################################
particles.species_names = electrons beam
electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 1 4 1
electrons.xmin = -20.e-6
electrons.xmax = 20.e-6
electrons.zmin = 10.e-6
electrons.profile = constant
electrons.density = 2.e23 # number of electrons per m^3
electrons.momentum_distribution_type = "at_rest"
electrons.do_continuous_injection = 1
electrons.addRealAttributes = orig_x orig_z
electrons.attribute.orig_x(x,y,z,ux,uy,uz,t) = "x"
electrons.attribute.orig_z(x,y,z,ux,uy,uz,t) = "z"
beam.charge = -q_e
beam.mass = m_e
beam.injection_style = "gaussian_beam"
beam.x_rms = .5e-6
beam.y_rms = .5e-6
beam.z_rms = .5e-6
beam.x_m = 0.
beam.y_m = 0.
beam.z_m = -28.e-6
beam.npart = 100
beam.q_tot = -1.e-12
beam.momentum_distribution_type = "gaussian"
beam.ux_m = 0.0
beam.uy_m = 0.0
beam.uz_m = 500.
beam.ux_th = 2.
beam.uy_th = 2.
beam.uz_th = 50.
#################################
############ LASER ##############
#################################
lasers.names = laser1
laser1.profile = Gaussian
laser1.position = 0. 0. 9.e-6 # This point is on the laser plane
laser1.direction = 0. 0. 1. # The plane normal direction
laser1.polarization = 0. 1. 0. # The main polarization vector
laser1.e_max = 16.e12 # Maximum amplitude of the laser field (in V/m)
laser1.profile_waist = 5.e-6 # The waist of the laser (in m)
laser1.profile_duration = 15.e-15 # The duration of the laser (in s)
laser1.profile_t_peak = 30.e-15 # Time at which the laser reaches its peak (in s)
laser1.profile_focal_distance = 100.e-6 # Focal distance from the antenna (in m)
laser1.wavelength = 0.8e-6 # The wavelength of the laser (in m)
# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = 10
diag1.diag_type = Full
diag1.fields_to_plot = Er Et Ez Br Bt Bz jr jt jz rho
diag1.electrons.variables = w ux uy uz orig_x orig_z
diag1.beam.variables = w ux uy uz
Analyze
Note
This section is TODO.
Visualize
You can run the following script to visualize the beam evolution over time:
Script plot_3d.py
Examples/Physics_applications/laser_acceleration/plot_3d.py diags/diag1000400/
#!/usr/bin/env python3
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Axel Huebl
# License: BSD-3-Clause-LBNL
#
# This script plots the wakefield of an LWFA simulation.
import sys
import matplotlib.pyplot as plt
import yt
yt.funcs.mylog.setLevel(50)
def plot_lwfa():
# this will be the name of the plot file
fn = sys.argv[1]
# Read the file
ds = yt.load(fn)
# plot the laser field and absolute density
fields = ["Ey", "rho"]
normal = "y"
sl = yt.SlicePlot(ds, normal=normal, fields=fields)
for field in fields:
sl.set_log(field, False)
sl.set_figure_size((4, 8))
fig = sl.export_to_mpl_figure(nrows_ncols=(2, 1))
fig.tight_layout()
plt.show()
if __name__ == "__main__":
plot_lwfa()

(top) Electric field of the laser pulse and (bottom) absolute density.
Beam-Driven Wakefield Acceleration of Electrons
This example shows how to model a beam-driven plasma-wakefield accelerator (PWFA) [2, 3].
PWFA is best performed in 3D or quasi-cylindrical (RZ) geometry, in order to correctly capture some of the key physics (structure of the space-charge fields, beam loading, shape of the accelerating bubble in the blowout regime, etc.). For physical situations that have close-to-cylindrical symmetry, simulations in RZ geometry capture the relevant physics at a fraction of the computational cost of a 3D simulation. On the other hand, for physical situations with strong asymmetries (e.g., non-round driver, strong hosing of the accelerated beam, etc.), only 3D simulations are suitable.
Additionally, to speed up computation, this example uses the boosted frame method to effectively model long acceleration lengths.
Alternatively, another common approximation for PWFAs is quasi-static modeling, e.g., if effects such as self-injection can be ignored. In the Beam, Plasma & Accelerator Simulation Toolkit (BLAST), HiPACE++ provides such methods.
Note
TODO: The Python (PICMI) input file should use the boosted frame method, like the inputs_3d_boost file.
Run
This example can be run either as:
Python script:
python3 PICMI_inputs_plasma_acceleration.py
or WarpX executable using an input file:
warpx.3d inputs_3d_boost
For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
Note
TODO: This input file should use the boosted frame method, like the inputs_3d_boost file.
Examples/Physics_applications/plasma_acceleration/PICMI_inputs_plasma_acceleration.py
#!/usr/bin/env python3
from pywarpx import picmi
#from warp import picmi
constants = picmi.constants
nx = 64
ny = 64
nz = 64
xmin = -200.e-6
xmax = +200.e-6
ymin = -200.e-6
ymax = +200.e-6
zmin = -200.e-6
zmax = +200.e-6
moving_window_velocity = [0., 0., constants.c]
number_per_cell_each_dim = [2, 2, 1]
max_steps = 10
grid = picmi.Cartesian3DGrid(number_of_cells = [nx, ny, nz],
lower_bound = [xmin, ymin, zmin],
upper_bound = [xmax, ymax, zmax],
lower_boundary_conditions = ['periodic', 'periodic', 'open'],
upper_boundary_conditions = ['periodic', 'periodic', 'open'],
lower_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
upper_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
moving_window_velocity = moving_window_velocity,
warpx_max_grid_size=32)
solver = picmi.ElectromagneticSolver(grid=grid, cfl=1)
beam_distribution = picmi.UniformDistribution(density = 1.e23,
lower_bound = [-20.e-6, -20.e-6, -150.e-6],
upper_bound = [+20.e-6, +20.e-6, -100.e-6],
directed_velocity = [0., 0., 1.e9])
plasma_distribution = picmi.UniformDistribution(density = 1.e22,
lower_bound = [-200.e-6, -200.e-6, 0.],
upper_bound = [+200.e-6, +200.e-6, None],
fill_in = True)
beam = picmi.Species(particle_type='electron', name='beam', initial_distribution=beam_distribution)
plasma = picmi.Species(particle_type='electron', name='plasma', initial_distribution=plasma_distribution)
sim = picmi.Simulation(solver = solver,
max_steps = max_steps,
verbose = 1,
warpx_current_deposition_algo = 'esirkepov',
warpx_use_filter = 0)
sim.add_species(beam, layout=picmi.GriddedLayout(grid=grid, n_macroparticle_per_cell=number_per_cell_each_dim))
sim.add_species(plasma, layout=picmi.GriddedLayout(grid=grid, n_macroparticle_per_cell=number_per_cell_each_dim))
field_diag = picmi.FieldDiagnostic(name = 'diag1',
grid = grid,
period = max_steps,
data_list = ['Ex', 'Ey', 'Ez', 'Jx', 'Jy', 'Jz', 'part_per_cell'],
write_dir = '.',
warpx_file_prefix = 'Python_PlasmaAcceleration_plt')
part_diag = picmi.ParticleDiagnostic(name = 'diag1',
period = max_steps,
species = [beam, plasma],
data_list = ['ux', 'uy', 'uz', 'weighting'])
sim.add_diagnostic(field_diag)
sim.add_diagnostic(part_diag)
# write_inputs will create an inputs file that can be used to run
# with the compiled version.
#sim.write_input_file(file_name = 'inputs_from_PICMI')
# Alternatively, sim.step will run WarpX, controlling it from Python
sim.step()
Examples/Physics_applications/plasma_acceleration/inputs_3d_boost
#################################
####### GENERAL PARAMETERS ######
#################################
stop_time = 3.93151387287e-11
amr.n_cell = 32 32 256
amr.max_grid_size = 64
amr.blocking_factor = 32
amr.max_level = 0
geometry.dims = 3
geometry.prob_lo = -0.00015 -0.00015 -0.00012
geometry.prob_hi = 0.00015 0.00015 1.e-06
#################################
####### Boundary condition ######
#################################
boundary.field_lo = periodic periodic pml
boundary.field_hi = periodic periodic pml
#################################
############ NUMERICS ###########
#################################
algo.maxwell_solver = ckc
warpx.verbose = 1
warpx.do_dive_cleaning = 0
warpx.use_filter = 1
warpx.cfl = .99
warpx.do_moving_window = 1
warpx.moving_window_dir = z
warpx.moving_window_v = 1. # in units of the speed of light
my_constants.lramp = 8.e-3
my_constants.dens = 1e+23
# Order of particle shape factors
algo.particle_shape = 3
#################################
######### BOOSTED FRAME #########
#################################
warpx.gamma_boost = 10.0
warpx.boost_direction = z
#################################
############ PLASMA #############
#################################
particles.species_names = driver plasma_e plasma_p beam driverback
particles.use_fdtd_nci_corr = 1
particles.rigid_injected_species = driver beam
driver.charge = -q_e
driver.mass = 1.e10
driver.injection_style = "gaussian_beam"
driver.x_rms = 2.e-6
driver.y_rms = 2.e-6
driver.z_rms = 4.e-6
driver.x_m = 0.
driver.y_m = 0.
driver.z_m = -20.e-6
driver.npart = 1000
driver.q_tot = -1.e-9
driver.momentum_distribution_type = "gaussian"
driver.ux_m = 0.0
driver.uy_m = 0.0
driver.uz_m = 200000.
driver.ux_th = 2.
driver.uy_th = 2.
driver.uz_th = 20000.
driver.zinject_plane = 0.
driver.rigid_advance = true
driverback.charge = q_e
driverback.mass = 1.e10
driverback.injection_style = "gaussian_beam"
driverback.x_rms = 2.e-6
driverback.y_rms = 2.e-6
driverback.z_rms = 4.e-6
driverback.x_m = 0.
driverback.y_m = 0.
driverback.z_m = -20.e-6
driverback.npart = 1000
driverback.q_tot = 1.e-9
driverback.momentum_distribution_type = "gaussian"
driverback.ux_m = 0.0
driverback.uy_m = 0.0
driverback.uz_m = 200000.
driverback.ux_th = 2.
driverback.uy_th = 2.
driverback.uz_th = 20000.
driverback.do_backward_propagation = true
plasma_e.charge = -q_e
plasma_e.mass = m_e
plasma_e.injection_style = "NUniformPerCell"
plasma_e.zmin = -100.e-6 # 0.e-6
plasma_e.zmax = 0.2
plasma_e.xmin = -70.e-6
plasma_e.xmax = 70.e-6
plasma_e.ymin = -70.e-6
plasma_e.ymax = 70.e-6
# plasma_e.profile = constant
# plasma_e.density = 1.e23
plasma_e.profile = parse_density_function
plasma_e.density_function(x,y,z) = "(z<lramp)*0.5*(1-cos(pi*z/lramp))*dens+(z>lramp)*dens"
plasma_e.num_particles_per_cell_each_dim = 1 1 1
plasma_e.momentum_distribution_type = "at_rest"
plasma_e.do_continuous_injection = 1
plasma_p.charge = q_e
plasma_p.mass = m_p
plasma_p.injection_style = "NUniformPerCell"
plasma_p.zmin = -100.e-6 # 0.e-6
plasma_p.zmax = 0.2
# plasma_p.profile = "constant"
# plasma_p.density = 1.e23
plasma_p.profile = parse_density_function
plasma_p.density_function(x,y,z) = "(z<lramp)*0.5*(1-cos(pi*z/lramp))*dens+(z>lramp)*dens"
plasma_p.xmin = -70.e-6
plasma_p.xmax = 70.e-6
plasma_p.ymin = -70.e-6
plasma_p.ymax = 70.e-6
plasma_p.num_particles_per_cell_each_dim = 1 1 1
plasma_p.momentum_distribution_type = "at_rest"
plasma_p.do_continuous_injection = 1
beam.charge = -q_e
beam.mass = m_e
beam.injection_style = "gaussian_beam"
beam.x_rms = .5e-6
beam.y_rms = .5e-6
beam.z_rms = 1.e-6
beam.x_m = 0.
beam.y_m = 0.
beam.z_m = -100.e-6
beam.npart = 1000
beam.q_tot = -5.e-10
beam.momentum_distribution_type = "gaussian"
beam.ux_m = 0.0
beam.uy_m = 0.0
beam.uz_m = 2000.
beam.ux_th = 2.
beam.uy_th = 2.
beam.uz_th = 200.
beam.zinject_plane = .8e-3
beam.rigid_advance = true
# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = 10000
diag1.diag_type = Full
Analyze
Note
This section is TODO.
Visualize
Note
This section is TODO.
In-Depth: PWFA
As described in the Introduction, one of the key applications of the WarpX exascale computing platform is in modelling future, compact and economic plasma-based accelerators.
In this section we describe the simulation setup of a realistic electron beam driven plasma wakefield accelerator (PWFA) configuration.
For illustration purposes the setup can be explored with WarpX using the example input file PWFA.
The simulation setup consists of 4 particle species: drive beam (driver), witness beam (beam), plasma electrons (plasma_e), and plasma ions (plasma_p). The species physical parameters are summarized in the following table.
Species | Parameters
---|---
driver | \(\gamma\) = 48923; N = 2x10^8; \(\sigma_z\) = 4.0 um; \(\sigma_x\) = 2.0 um
beam | \(\gamma\) = 48923; N = 6x10^5; \(\sigma_z\) = 1.0 mm; \(\sigma_x\) = 0.5 um
plasma_e | n = 1x10^23 m^-3; w = 70 um; lr = 8 mm; L = 200 mm
plasma_p | n = 1x10^23 m^-3; w = 70 um; lr = 8 mm; L = 200 mm
Where \(\gamma\) is the beam relativistic Lorentz factor, N is the number of particles, and \(\sigma_x\), \(\sigma_y\), \(\sigma_z\) are the beam widths (root-mean-squares of particle positions) in the transverse (x,y) and longitudinal directions.
The plasma, of total length L, has a density profile that consists of an up-ramp of length lr, rising from 0 to the peak value n; the density is uniform within a transverse width w and stays at the peak value after the up-ramp.
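For orientation, the density profile used in the example can be evaluated directly: the following is a minimal Python sketch (not part of the example) that mirrors the plasma_e.density_function parser expression from the inputs_3d_boost file, using the my_constants values lramp and dens defined there:
import numpy as np

# values of my_constants.lramp and my_constants.dens in inputs_3d_boost
lramp = 8.e-3   # [m] length of the density up-ramp
dens  = 1.e23   # [m^-3] peak plasma density

def plasma_density(z):
    """Follows "(z<lramp)*0.5*(1-cos(pi*z/lramp))*dens+(z>lramp)*dens";
    the flat-top branch is taken for z >= lramp here."""
    z = np.asarray(z, dtype=float)
    return np.where(z < lramp, 0.5 * (1. - np.cos(np.pi * z / lramp)) * dens, dens)

# density at the start, middle and end of the ramp, and inside the flat top
for z in (0., 0.5 * lramp, lramp, 0.1):
    print(f"n({z:.1e} m) = {float(plasma_density(z)):.3e} m^-3")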
With this configuration, the driver excites a nonlinear plasma wake and drives a bubble depleted of plasma electrons, in which the beam is accelerated, as can be seen in Fig. [fig:PWFA].
[fig:PWFA] Plot of the driver (blue), beam (red) and plasma_e (green) electron macroparticle distributions at time step 1000 of the example simulation. They are overlaid on the 2D plot of the longitudinal electric field, showing the accelerating/decelerating (red/blue) regions of the plasma bubble.
Listed below are the key arguments and best practices relevant to choosing the PWFA simulation parameters used in the example.
2D Geometry
Simulations in 2D Cartesian geometry (with longitudinal direction z and transverse direction x) can give valuable physical and numerical insight into the simulation requirements and evolution, while being much less time consuming than the full 3D Cartesian or cylindrical geometries.
Finite Difference Time Domain
For standard plasma wakefield configurations, it is possible to model the physics correctly using the Particle-In-Cell (PIC) Finite Difference Time Domain (FDTD) algorithms. If the simulation contains localised extremely high intensity fields, however, numerical instabilities might arise, such as the numerical Cherenkov instability (Moving window and optimal Lorentz boosted frame). In that case, it is recommended to use the Pseudo Spectral Analytical Time Domain (PSATD) or the Pseudo-Spectral Time-Domain (PSTD) algorithms. In the example we are describing, it is sufficient to use FDTD.
Cole-Karkkainen solver with Cowan coefficients
WarpX implements two FDTD Maxwell field solvers for the field push: the Yee solver and the Cole-Karkkainen solver with Cowan coefficients (CKC). The latter includes a modification that makes the numerical dispersion of light in vacuum exact, which is why we choose CKC for this example.
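In the PICMI interface, switching to CKC amounts to selecting method='CKC' on the electromagnetic solver. A minimal sketch follows; the grid below is only a compact stand-in with the dimensions of the inputs_3d_boost example, not a complete simulation setup:
from pywarpx import picmi

# compact stand-in for the grid of the boosted-frame PWFA example
grid = picmi.Cartesian3DGrid(
    number_of_cells = [32, 32, 256],
    lower_bound = [-0.00015, -0.00015, -0.00012],
    upper_bound = [0.00015, 0.00015, 1.e-06],
    lower_boundary_conditions = ['periodic', 'periodic', 'open'],
    upper_boundary_conditions = ['periodic', 'periodic', 'open'],
    lower_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
    upper_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'])

# Cole-Karkkainen solver with Cowan coefficients (algo.maxwell_solver = ckc)
solver = picmi.ElectromagneticSolver(
    grid = grid,
    method = 'CKC',
    cfl = 0.99)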
Lorentz boosted frame
WarpX simulations can be done in the laboratory or Lorentz-boosted frame. In the laboratory frame, there is typically no need to model the plasma ion species, since it is mainly stationary during the short time scales associated with the motion of the plasma electrons. In the boosted frame, that argument is no longer valid, as the ions have relativistic velocities. Nevertheless, the boosted frame results in a substantial reduction of the computational cost of the simulation.
Note
Even if the simulation uses the boosted frame, most of its input file parameters are defined with respect to the laboratory frame.
We recommend that you design your numerical setup so that the width of the box is not significantly narrower than the distance from 0 to its right edge (done, for example, by setting the right edge equal to 0).
Moving window
To avoid having to simulate the whole 0.2 mm of plasma with the high resolution that is required to model the beam and plasma interaction correctly, we use the moving window. In this way we define a simulation box (grid) with a fixed size that travels at the speed-of-light (\(c\)), i.e. follows the beam.
Note
When using moving window the option of continuous injection needs to be active for all particles initialized outside of the simulation box.
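In PICMI, the moving window is requested on the grid and continuous injection on the particle distribution. The following is a minimal sketch of just those two pieces; it assumes (as in the laser-acceleration example above) that the fill_in flag corresponds to <species>.do_continuous_injection = 1 in the inputs file:
from pywarpx import picmi

# The moving window itself is set on the grid, e.g. on the Cartesian3DGrid of
# the laser-acceleration example above:
#     moving_window_velocity = [0., 0., picmi.constants.c]
#
# Particles initialized outside of the initial box must use continuous
# injection so that they are created as the window reaches them; in PICMI this
# is the fill_in flag of the distribution (corresponding to
# <species>.do_continuous_injection = 1 in the inputs file):
plasma = picmi.UniformDistribution(
    density = 1.e23,
    lower_bound = [None, None, 0.],
    upper_bound = [None, None, None],
    fill_in = True)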
Resolution
Longitudinal and transverse resolutions (i.e. number and dimensions of the PIC grid cells) should be chosen to accurately describe the physical processes taking place in the simulation. Convergence scans, where resolution in both directions is gradually increased, should be used to determine the optimal configuration. Multiple cells per beam length and width are recommended (our illustrative example resolution is coarse).
Note
To avoid spurious effects in the boosted frame, we consider the constraint that the transverse cell size should be larger than the longitudinal one. Translating this condition to the transverse (\(d_{x}\)) and longitudinal (\(d_{z}\)) cell dimensions in the laboratory frame leads to: \(d_{x} > d_{z}\,(1+\beta_{b})\,\gamma_{b}\), where \(\beta_{b}\) is the boosted frame velocity in units of \(c\).
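As an illustrative check, this bound can be evaluated in a few lines of Python for the boost factor and grid of the inputs_3d_boost file; the numbers are approximate and for orientation only:
import math

gamma_b = 10.0                             # warpx.gamma_boost in the example
beta_b = math.sqrt(1. - 1. / gamma_b**2)   # boost velocity in units of c

# laboratory-frame longitudinal cell size of the inputs_3d_boost grid
dz_lab = (1.e-6 - (-1.2e-4)) / 256         # ~4.7e-7 m

# lower bound on the transverse cell size: d_x > d_z (1 + beta_b) gamma_b
dx_min = dz_lab * (1. + beta_b) * gamma_b
print(f"d_z = {dz_lab:.3e} m  ->  require d_x > {dx_min:.3e} m")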
Time step
The time step (\(dt\)) is used to iterate over the main PIC loop and is computed by WarpX differently depending on the FDTD Maxwell field solver used:
For Yee, it is equal to the CFL parameter chosen in the input file (Parameters: Inputs File) times the Courant–Friedrichs–Lewy (CFL) limit, which follows the analytical expression in Particle-in-Cell Method
For CKC, it is equal to CFL times the minimum of the boosted-frame cell dimensions
where CFL is chosen below unity and sets an optimal trade-off between making the simulation faster and avoiding NCI and other spurious effects; a schematic calculation of both limits is sketched below.
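The following schematic Python calculation translates the two rules above into numbers for a set of illustrative cell sizes; the exact expressions WarpX evaluates internally are documented in Particle-in-Cell Method:
import math

c = 299792458.0    # speed of light [m/s]
cfl = 0.99         # warpx.cfl in the inputs_3d_boost file

# illustrative cell sizes (to be taken from the actual, boosted-frame grid)
dx, dy, dz = 9.4e-6, 9.4e-6, 4.7e-7   # [m]

# Yee: dt = CFL / ( c * sqrt(1/dx^2 + 1/dy^2 + 1/dz^2) )
dt_yee = cfl / (c * math.sqrt(1. / dx**2 + 1. / dy**2 + 1. / dz**2))

# CKC: dt = CFL * min(dx, dy, dz) / c
dt_ckc = cfl * min(dx, dy, dz) / c

print(f"dt(Yee) ~ {dt_yee:.3e} s, dt(CKC) ~ {dt_ckc:.3e} s")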
Duration of the simulation
To determine the total number of time steps of the simulation, we could either set the zmax_plasma_to_compute_max_step parameter to the end of the plasma (\(z_{\textrm{end}}\)), or compute it using:
boosted frame edge of the simulation box, \(\textrm{corner} = l_{e}/ ((1-\beta_{b}) \gamma_{b})\)
time of interaction in the boosted frame, \(T = \frac{z_{\textrm{end}}/\gamma_{b}-\textrm{corner}}{c (1+\beta_{b})}\)
total number of iterations, \(i_{\textrm{max}} = T/dt\)
where \(l_{e}\) is the position of the left edge of the simulation box (with respect to the propagation direction); a short numerical sketch of these three steps follows below.
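A minimal Python sketch that strings these three steps together; the values of \(l_{e}\), \(z_{\textrm{end}}\) and \(dt\) below are illustrative placeholders and should be taken from the actual setup:
import math

c = 299792458.0    # speed of light [m/s]

gamma_b = 10.0                             # warpx.gamma_boost
beta_b = math.sqrt(1. - 1. / gamma_b**2)

l_e   = -1.2e-4    # [m] left edge of the simulation box (illustrative)
z_end = 0.2        # [m] end of the plasma in the laboratory frame (illustrative)
dt    = 1.5e-15    # [s] time step (illustrative)

# boosted-frame edge of the simulation box
corner = l_e / ((1. - beta_b) * gamma_b)
# time of interaction in the boosted frame
T = (z_end / gamma_b - corner) / (c * (1. + beta_b))
# total number of iterations
i_max = int(T / dt)

print(f"corner = {corner:.3e} m, T = {T:.3e} s, i_max ~ {i_max}")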
Plotfiles and snapshots
WarpX allows the data to be stored in different formats, such as plotfiles (following the yt guidelines), hdf5 and openPMD (following its standard). In the example, we are dumping plotfiles with boosted frame information on the simulation particles and fields. We are also requesting back transformed diagnostics that transform that information back to the laboratory frame. The diagnostics results are analysed and stored in snapshots at each time step and so it is best to make sure that the run does not end before filling the final snapshot.
Maximum grid size and blocking factor
These parameters are carefully chosen to improve the code parallelization, load balancing and performance (Parameters: Inputs File) for each numerical configuration. They define the smallest and largest number of cells that can be contained in each simulation box and are described in detail in the AMReX documentation.
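As a concrete illustration of what the two parameters mean, the 256 cells of the longitudinal direction in the 3D example decompose as sketched below (one valid decomposition; AMReX may choose others):
# illustrative decomposition of the 256-cell z direction of the 3D example
n_cell_z = 256
max_grid_size = 64    # largest allowed box extent (amr.max_grid_size)
blocking_factor = 32  # box extents must be multiples of this (amr.blocking_factor)

# one valid decomposition: four boxes of 64 cells each
boxes = [max_grid_size] * (n_cell_z // max_grid_size)
assert sum(boxes) == n_cell_z
assert all(b % blocking_factor == 0 and b <= max_grid_size for b in boxes)
print(boxes)  # [64, 64, 64, 64]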
Laser-Plasma Interaction
Laser-Ion Acceleration with a Planar Target
This example shows how to model laser-ion acceleration with planar targets of solid density [4, 5, 6]. The acceleration mechanism in this scenario depends on target parameters.
Although laser-ion acceleration requires full 3D modeling for an adequate description of the acceleration dynamics, especially of the acceleration field lengths and decay times, this example is modeled in 2D. 2D modeling can often give a qualitative overview of the dynamics, but it is mainly used to save computational cost, since the plasma frequency (and Debye length) of the target determines the resolution needed in laser-solid interaction modeling.
Note
The resolution of this 2D case is extremely low by default. This includes spatial and temporal resolution, but also the number of macro-particles per cell representing the target density for proper phase space sampling. You will need a computing cluster for adequate resolution of the target density; see the comments in the input file.
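To get a feeling for why the resolution requirement is so demanding, the critical density and collisionless skin depth of the n0 = 30 n_c hydrogen target can be estimated with a few lines of Python (a rough guide only; the production resolution quoted in the input file, dx <= 3.33 nm, is chosen considerably finer than the skin depth):
import math
import scipy.constants as sc

lambda_L = 0.8e-6                                   # laser wavelength [m]
omega_L = 2. * math.pi * sc.c / lambda_L            # laser angular frequency [rad/s]

# critical density for 800 nm light (~1.74e27 m^-3, as in the input files)
n_c = sc.m_e * sc.epsilon_0 * omega_L**2 / sc.e**2
n_e = 30. * n_c                                     # target electron density, n0 = 30 n_c

# electron plasma frequency and collisionless skin depth of the target
omega_p = math.sqrt(n_e * sc.e**2 / (sc.epsilon_0 * sc.m_e))
skin_depth = sc.c / omega_p

print(f"n_c = {n_c:.3e} m^-3, n_e = {n_e:.3e} m^-3, skin depth = {skin_depth*1e9:.1f} nm")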
Run
This example can be run either as:
Python script:
mpiexec -n 2 python3 PICMI_inputs_2d.py
or WarpX executable using an input file:
mpiexec -n 2 warpx.2d inputs_2d
Tip
For MPI-parallel runs on computing clusters, change the prefix to mpiexec -n <no. of MPI ranks> ... or srun -n <no. of MPI ranks> ..., depending on the system and the number of MPI ranks you want to allocate.
The input option warpx_numprocs / warpx.numprocs needs to be adjusted for parallel domain decomposition, to match the number of MPI ranks used.
In order to use dynamic load balancing, use the more general method of setting blocks.
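In PICMI terms this choice maps onto keyword arguments of picmi.Simulation. The following incomplete sketch mirrors the options used in the PICMI file below (solver, species and diagnostics omitted); it assumes that a fixed decomposition is set via warpx_numprocs, or that dynamic load balancing is enabled when warpx_numprocs is left out:
from pywarpx import picmi

# Option A: fixed decomposition into 1 x 2 grids, matching `mpiexec -n 2`
sim_fixed = picmi.Simulation(
    verbose = 1,
    particle_shape = 'cubic',
    warpx_numprocs = [1, 2])

# Option B: leave warpx_numprocs out and let WarpX balance the load
# dynamically over many smaller blocks
sim_balanced = picmi.Simulation(
    verbose = 1,
    particle_shape = 'cubic',
    warpx_load_balance_intervals = 100,
    warpx_load_balance_costs_update = 'heuristic')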
Examples/Physics_applications/laser_ion/PICMI_inputs_2d.py
#!/usr/bin/env python3
from pywarpx import picmi
# Physical constants
c = picmi.constants.c
q_e = picmi.constants.q_e
# We only run 100 steps for tests
# Disable `max_step` below to run until the physical `stop_time`.
max_step = 100
# time-scale with highly kinetic dynamics
stop_time = 0.2e-12
# proper resolution for 30 n_c (dx<=3.33nm) incl. acc. length
# (>=6x V100)
# --> choose larger `max_grid_size` and `blocking_factor` for 1 to 8 grids per GPU accordingly
#nx = 7488
#nz = 14720
# Number of cells
nx = 384
nz = 512
# Domain decomposition (deactivate `warpx_numprocs` in `picmi.Simulation` for this to take effect)
max_grid_size = 64
blocking_factor = 32
# Physical domain
xmin = -7.5e-06
xmax = 7.5e-06
zmin = -5.0e-06
zmax = 25.0e-06
# Create grid
grid = picmi.Cartesian2DGrid(
number_of_cells=[nx, nz],
lower_bound=[xmin, zmin],
upper_bound=[xmax, zmax],
lower_boundary_conditions=['open', 'open'],
upper_boundary_conditions=['open', 'open'],
lower_boundary_conditions_particles=['absorbing', 'absorbing'],
upper_boundary_conditions_particles=['absorbing', 'absorbing'],
warpx_max_grid_size=max_grid_size,
warpx_blocking_factor=blocking_factor)
# Particles: plasma parameters
# critical plasma density
nc = 1.742e27 # [m^-3] 1.11485e21 * 1.e6 / 0.8**2
# number density: "fully ionized" electron density as reference
# [material 1] cryogenic H2
n0 = 30.0 # [n_c]
# [material 2] liquid crystal
# n0 = 192
# [material 3] PMMA
# n0 = 230
# [material 4] Copper (ion density: 8.49e28/m^3; times ionization level)
# n0 = 1400
plasma_density = n0 * nc
preplasma_L = 0.05e-6 # [m] scale length (>0)
preplasma_Lcut = 2.0e-6 # [m] hard cutoff from surface
plasma_r0 = 2.5e-6 # [m] radius or half-thickness
plasma_eps_z = 0.05e-6 # [m] small offset in z to make zmin, zmax interval larger than 2*(r0 + Lcut)
plasma_creation_limit_z = plasma_r0 + preplasma_Lcut + plasma_eps_z # [m] upper limit in z for particle creation
plasma_xmin = None
plasma_ymin = None
plasma_zmin = -plasma_creation_limit_z
plasma_xmax = None
plasma_ymax = None
plasma_zmax = plasma_creation_limit_z
density_expression_str = f'{plasma_density}*((abs(z)<={plasma_r0}) + (abs(z)<{plasma_r0}+{preplasma_Lcut}) * (abs(z)>{plasma_r0}) * exp(-(abs(z)-{plasma_r0})/{preplasma_L}))'
slab_with_ramp_dist_hydrogen = picmi.AnalyticDistribution(
density_expression=density_expression_str,
lower_bound=[plasma_xmin, plasma_ymin, plasma_zmin],
upper_bound=[plasma_xmax, plasma_ymax, plasma_zmax]
)
# thermal velocity spread for electrons in gamma*beta
ux_th = .01
uz_th = .01
slab_with_ramp_dist_electrons = picmi.AnalyticDistribution(
density_expression=density_expression_str,
lower_bound=[plasma_xmin, plasma_ymin, plasma_zmin],
upper_bound=[plasma_xmax, plasma_ymax, plasma_zmax],
# if `momentum_expressions` and `momentum_spread_expressions` are unset,
# a Gaussian momentum distribution is assumed given that `rms_velocity` has any non-zero elements
rms_velocity=[c*ux_th, 0., c*uz_th] # thermal velocity spread in m/s
)
# TODO: add additional attributes orig_x and orig_z
electrons = picmi.Species(
particle_type='electron',
name='electrons',
initial_distribution=slab_with_ramp_dist_electrons,
)
# TODO: add additional attributes orig_x and orig_z
hydrogen = picmi.Species(
particle_type='proton',
name='hydrogen',
initial_distribution=slab_with_ramp_dist_hydrogen
)
# Laser
# e_max = a0 * 3.211e12 / lambda_0[mu]
# a0 = 16, lambda_0 = 0.8mu -> e_max = 64.22 TV/m
e_max = 64.22e12
position_z = -4.0e-06
profile_t_peak = 50.e-15
profile_focal_distance = 4.0e-06
laser = picmi.GaussianLaser(
wavelength=0.8e-06,
waist=4.e-06,
duration=30.e-15,
focal_position=[0, 0, profile_focal_distance + position_z],
centroid_position=[0, 0, position_z - c * profile_t_peak],
propagation_direction=[0, 0, 1],
polarization_direction=[1, 0, 0],
E0=e_max,
fill_in=False)
laser_antenna = picmi.LaserAntenna(
position=[0., 0., position_z],
normal_vector=[0, 0, 1])
# Electromagnetic solver
solver = picmi.ElectromagneticSolver(
grid=grid,
method='Yee',
cfl=0.999,
divE_cleaning=0,
#warpx_pml_ncell=10
)
# Diagnostics
particle_diag = picmi.ParticleDiagnostic(
name='Python_LaserIonAcc2d_plt',
period=100,
write_dir='./diags',
warpx_format='openpmd',
warpx_openpmd_backend='h5',
# demonstration of a spatial and momentum filter
warpx_plot_filter_function='(uz>=0) * (x<1.0e-6) * (x>-1.0e-6)'
)
# reduce resolution of output fields
coarsening_ratio = [4, 4]
ncell_field = []
for (ncell_comp, cr) in zip([nx,nz], coarsening_ratio):
ncell_field.append(int(ncell_comp/cr))
field_diag = picmi.FieldDiagnostic(
name='Python_LaserIonAcc2d_plt',
grid=grid,
period=100,
number_of_cells=ncell_field,
data_list=['B', 'E', 'J', 'rho', 'rho_electrons', 'rho_hydrogen'],
write_dir='./diags',
warpx_format='openpmd',
warpx_openpmd_backend='h5'
)
particle_fw_diag = picmi.ParticleDiagnostic(
name='openPMDfw',
period=100,
write_dir='./diags',
warpx_format='openpmd',
warpx_openpmd_backend='h5',
warpx_plot_filter_function='(uz>=0) * (x<1.0e-6) * (x>-1.0e-6)'
)
particle_bw_diag = picmi.ParticleDiagnostic(
name='openPMDbw',
period=100,
write_dir='./diags',
warpx_format='openpmd',
warpx_openpmd_backend='h5',
warpx_plot_filter_function='(uz<0)'
)
# histograms with 2.0 degree acceptance angle in fw direction
# 2 deg * pi / 180 : 0.03490658503 rad
# half-angle +/- : 0.017453292515 rad
histuH_rdiag = picmi.ReducedDiagnostic(
diag_type='ParticleHistogram',
name='histuH',
period=100,
species=hydrogen,
bin_number=1000,
bin_min=0.0,
bin_max=0.474, # 100 MeV protons
histogram_function='u2=ux*ux+uy*uy+uz*uz; if(u2>0, sqrt(u2), 0.0)',
filter_function='u2=ux*ux+uy*uy+uz*uz; if(u2>0, abs(acos(uz / sqrt(u2))) <= 0.017453, 0)')
histue_rdiag = picmi.ReducedDiagnostic(
diag_type='ParticleHistogram',
name='histue',
period=100,
species=electrons,
bin_number=1000,
bin_min=0.0,
bin_max=197.0, # 100 MeV electrons
histogram_function='u2=ux*ux+uy*uy+uz*uz; if(u2>0, sqrt(u2), 0.0)',
filter_function='u2=ux*ux+uy*uy+uz*uz; if(u2>0, abs(acos(uz / sqrt(u2))) <= 0.017453, 0)')
# just a test entry to make sure that the histogram filter is purely optional:
# this one just records uz of all hydrogen ions, independent of their pointing
histuzAll_rdiag = picmi.ReducedDiagnostic(
diag_type='ParticleHistogram',
name='histuzAll',
period=100,
species=hydrogen,
bin_number=1000,
bin_min=-0.474,
bin_max=0.474,
histogram_function='uz')
field_probe_z_rdiag = picmi.ReducedDiagnostic(
diag_type='FieldProbe',
name='FieldProbe_Z',
period=100,
integrate=0,
probe_geometry='Line',
x_probe=0.0,
z_probe=-5.0e-6,
x1_probe=0.0,
z1_probe=25.0e-6,
resolution=3712)
field_probe_scat_point_rdiag = picmi.ReducedDiagnostic(
diag_type='FieldProbe',
name='FieldProbe_ScatPoint',
period=1,
integrate=0,
probe_geometry='Point',
x_probe=0.0,
z_probe=15.0e-6)
field_probe_scat_line_rdiag = picmi.ReducedDiagnostic(
diag_type='FieldProbe',
name='FieldProbe_ScatLine',
period=100,
integrate=1,
probe_geometry='Line',
x_probe=-2.5e-6,
z_probe=15.0e-6,
x1_probe=2.5e-6,
z1_probe=15e-6,
resolution=201)
load_balance_costs_rdiag = picmi.ReducedDiagnostic(
diag_type='LoadBalanceCosts',
name='LBC',
period=100)
# Set up simulation
sim = picmi.Simulation(
solver=solver,
max_time=stop_time, # need to remove `max_step` to run this far
verbose=1,
particle_shape='cubic',
warpx_numprocs=[1, 2], # deactivate `numprocs` for dynamic load balancing
warpx_use_filter=1,
warpx_load_balance_intervals=100,
warpx_load_balance_costs_update='heuristic'
)
# Add plasma electrons
sim.add_species(
electrons,
layout=picmi.GriddedLayout(grid=grid, n_macroparticle_per_cell=[2,2])
# for more realistic simulations, try to avoid that macro-particles represent more than 1 n_c
#layout=picmi.GriddedLayout(grid=grid, n_macroparticle_per_cell=[4,8])
)
# Add hydrogen ions
sim.add_species(
hydrogen,
layout=picmi.GriddedLayout(grid=grid, n_macroparticle_per_cell=[2,2])
# for more realistic simulations, try to avoid that macro-particles represent more than 1 n_c
#layout=picmi.GriddedLayout(grid=grid, n_macroparticle_per_cell=[4,8])
)
# Add laser
sim.add_laser(
laser,
injection_method=laser_antenna)
# Add full diagnostics
sim.add_diagnostic(particle_diag)
sim.add_diagnostic(field_diag)
sim.add_diagnostic(particle_fw_diag)
sim.add_diagnostic(particle_bw_diag)
# Add reduced diagnostics
sim.add_diagnostic(histuH_rdiag)
sim.add_diagnostic(histue_rdiag)
sim.add_diagnostic(histuzAll_rdiag)
sim.add_diagnostic(field_probe_z_rdiag)
sim.add_diagnostic(field_probe_scat_point_rdiag)
sim.add_diagnostic(field_probe_scat_line_rdiag)
sim.add_diagnostic(load_balance_costs_rdiag)
# TODO: make ParticleHistogram2D available
# Write input file that can be used to run with the compiled version
sim.write_input_file(file_name='inputs_2d_picmi')
# Initialize inputs and WarpX instance
sim.initialize_inputs()
sim.initialize_warpx()
# Advance simulation until last time step
sim.step(max_step)
Examples/Physics_applications/laser_ion/inputs_2d
#################################
# Domain, Resolution & Numerics
#
# We only run 100 steps for tests
# Disable `max_step` below to run until the physical `stop_time`.
max_step = 100
# time-scale with highly kinetic dynamics
stop_time = 0.2e-12 # [s]
# time-scale for converged ion energy
# notes: - effective acc. time depends on laser pulse
# - ions will start to leave the box
#stop_time = 1.0e-12 # [s]
# quick tests at ultra-low res. (for CI, and local computer)
amr.n_cell = 384 512
# proper resolution for 10 n_c excl. acc. length
# (>=1x V100)
#amr.n_cell = 2688 3712
# proper resolution for 30 n_c (dx<=3.33nm) incl. acc. length
# (>=6x V100)
#amr.n_cell = 7488 14720
# simulation box, no MR
# note: increase z (space & cells) for converging ion energy
amr.max_level = 0
geometry.dims = 2
geometry.prob_lo = -7.5e-6 -5.e-6
geometry.prob_hi = 7.5e-6 25.e-6
# Boundary condition
boundary.field_lo = pml pml
boundary.field_hi = pml pml
# Order of particle shape factors
algo.particle_shape = 3
# improved plasma stability for 2D with very low initial target temperature
# when using Esirkepov current deposition with energy-conserving field gather
interpolation.galerkin_scheme = 0
# numerical tuning
warpx.cfl = 0.999
warpx.use_filter = 1 # bilinear current/charge filter
#################################
# Performance Tuning
#
# simple tuning:
# the numprocs product must be equal to the number of MPI ranks and splits
# the domain on the coarsest level equally into grids;
# slicing in the 2nd dimension is preferred for ideal performance
warpx.numprocs = 1 2 # 2 MPI ranks
#warpx.numprocs = 1 4 # 4 MPI ranks
# detail tuning instead of warpx.numprocs:
# It is important to have enough cells in a block & grid, otherwise
# performance will suffer.
# Use larger values for GPUs, try to fill a GPU well with memory and place
# few large grids on each device (you can go as low as 1 large grid / device
# if you do not need load balancing).
# Slicing in the 2nd dimension is preferred for ideal performance
#amr.blocking_factor = 64
#amr.max_grid_size_x = 2688
#amr.max_grid_size_y = 128 # this is confusingly named and means z in 2D
# load balancing
# The grid & block parameters above are needed for load balancing:
# an average of ~10 grids per MPI rank (and device) are a good granularity
# to allow efficient load-balancing as the simulation evolves
algo.load_balance_intervals = 100
algo.load_balance_costs_update = Heuristic
# particle bin-sorting on GPU (ideal defaults not investigated in 2D)
# Try larger values than the defaults below and report back! :)
#warpx.sort_intervals = 4 # default on CPU: -1 (off); on GPU: 4
#warpx.sort_bin_size = 1 1 1
#################################
# Target Profile
#
# definitions for target extent and pre-plasma
my_constants.L = 0.05e-6 # [m] scale length (>0)
my_constants.Lcut = 2.0e-6 # [m] hard cutoff from surface
my_constants.r0 = 2.5e-6 # [m] radius or half-thickness
my_constants.eps_z = 0.05e-6 # [m] small offset in z to make zmin, zmax interval larger than 2*(r0 + Lcut)
my_constants.zmax = r0 + Lcut + eps_z # [m] upper limit in z for particle creation
particles.species_names = electrons hydrogen
# particle species
hydrogen.species_type = hydrogen
hydrogen.injection_style = NUniformPerCell
hydrogen.num_particles_per_cell_each_dim = 2 2
# for more realistic simulations, try to avoid that macro-particles represent more than 1 n_c
#hydrogen.num_particles_per_cell_each_dim = 4 8
hydrogen.momentum_distribution_type = at_rest
# minimum and maximum z position between which particles are initialized
# --> should be set for dense targets to limit memory consumption during initialization
hydrogen.zmin = -zmax
hydrogen.zmax = zmax
hydrogen.profile = parse_density_function
hydrogen.addRealAttributes = orig_x orig_z
hydrogen.attribute.orig_x(x,y,z,ux,uy,uz,t) = "x"
hydrogen.attribute.orig_z(x,y,z,ux,uy,uz,t) = "z"
electrons.species_type = electron
electrons.injection_style = NUniformPerCell
electrons.num_particles_per_cell_each_dim = 2 2
# for more realistic simulations, try to avoid that macro-particles represent more than 1 n_c
#electrons.num_particles_per_cell_each_dim = 4 8
electrons.momentum_distribution_type = "gaussian"
electrons.ux_th = .01
electrons.uz_th = .01
# minimum and maximum z position between which particles are initialized
# --> should be set for dense targets to limit memory consumption during initialization
electrons.zmin = -zmax
electrons.zmax = zmax
# ionization physics (field ionization/ADK)
# [i1] none (fully pre-ionized):
electrons.profile = parse_density_function
# [i2] field ionization (ADK):
#hydrogen.do_field_ionization = 1
#hydrogen.physical_element = H
#hydrogen.ionization_initial_level = 0
#hydrogen.ionization_product_species = electrons
#electrons.profile = constant
#electrons.density = 0.0
# collisional physics (binary MC model after Nanbu/Perez)
#collisions.collision_names = c_eH c_ee c_HH
#c_eH.species = electrons hydrogen
#c_ee.species = electrons electrons
#c_HH.species = hydrogen hydrogen
#c_eH.CoulombLog = 15.9
#c_ee.CoulombLog = 15.9
#c_HH.CoulombLog = 15.9
# number density: "fully ionized" electron density as reference
# [material 1] cryogenic H2
my_constants.nc = 1.742e27 # [m^-3] 1.11485e21 * 1.e6 / 0.8**2
my_constants.n0 = 30.0 # [n_c]
# [material 2] liquid crystal
#my_constants.n0 = 192
# [material 3] PMMA
#my_constants.n0 = 230
# [material 4] Copper (ion density: 8.49e28/m^3; times ionization level)
#my_constants.n0 = 1400
# density profiles (target extent, pre-plasma and cutoffs defined above particle species list)
# [target 1] flat foil (thickness = 2*r0)
electrons.density_function(x,y,z) = "nc*n0*(
if(abs(z)<=r0, 1.0, if(abs(z)<r0+Lcut, exp((-abs(z)+r0)/L), 0.0)) )"
hydrogen.density_function(x,y,z) = "nc*n0*(
if(abs(z)<=r0, 1.0, if(abs(z)<r0+Lcut, exp((-abs(z)+r0)/L), 0.0)) )"
# [target 2] cylinder
#electrons.density_function(x,y,z) = "nc*n0*(
# ((x*x+z*z)<=(r0*r0)) +
# (sqrt(x*x+z*z)>r0)*(sqrt(x*x+z*z)<r0+Lcut)*exp( (-sqrt(x*x+z*z)+r0)/L ) )"
#hydrogen.density_function(x,y,z) = "nc*n0*(
# ((x*x+z*z)<=(r0*r0)) +
# (sqrt(x*x+z*z)>r0)*(sqrt(x*x+z*z)<r0+Lcut)*exp( (-sqrt(x*x+z*z)+r0)/L ) )"
# [target 3] sphere
#electrons.density_function(x,y,z) = "nc*n0*(
# ((x*x+y*y+z*z)<=(r0*r0)) +
# (sqrt(x*x+y*y+z*z)>r0)*(sqrt(x*x+y*y+z*z)<r0+Lcut)*exp( (-sqrt(x*x+y*y+z*z)+r0)/L ) )"
#hydrogen.density_function(x,y,z) = "nc*n0*(
# ((x*x+y*y+z*z)<=(r0*r0)) +
# (sqrt(x*x+y*y+z*z)>r0)*(sqrt(x*x+y*y+z*z)<r0+Lcut)*exp( (-sqrt(x*x+y*y+z*z)+r0)/L ) )"
#################################
# Laser Pulse Profile
#
lasers.names = laser1
laser1.position = 0. 0. -4.0e-6 # point on the laser plane (antenna)
laser1.direction = 0. 0. 1. # the plane's (antenna's) normal direction
laser1.polarization = 1. 0. 0. # the main polarization vector
laser1.a0 = 16.0 # normalized maximum amplitude of the laser field (dimensionless)
laser1.wavelength = 0.8e-6 # central wavelength of the laser pulse [m]
laser1.profile = Gaussian
laser1.profile_waist = 4.e-6 # beam waist (E(w_0)=E_0/e) [m]
laser1.profile_duration = 30.e-15 # pulse length (E(tau)=E_0/e; tau=tau_E=FWHM_I/1.17741) [s]
laser1.profile_t_peak = 50.e-15 # time until peak intensity reached at the laser plane [s]
laser1.profile_focal_distance = 4.0e-6 # focal distance from the antenna [m]
# e_max = a0 * 3.211e12 / lambda_0[mu]
# a0 = 16, lambda_0 = 0.8mu -> e_max = 64.22 TV/m
#################################
# Diagnostics
#
diagnostics.diags_names = diag1 openPMDfw openPMDbw
diag1.intervals = 100
diag1.diag_type = Full
diag1.fields_to_plot = Ex Ey Ez Bx By Bz jx jy jz rho rho_electrons rho_hydrogen
# reduce resolution of output fields
diag1.coarsening_ratio = 4 4
# demonstration of a spatial and momentum filter
diag1.electrons.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz>=0) * (x<1.0e-6) * (x>-1.0e-6)
diag1.hydrogen.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz>=0) * (x<1.0e-6) * (x>-1.0e-6)
diag1.format = openpmd
diag1.openpmd_backend = h5
openPMDfw.intervals = 100
openPMDfw.diag_type = Full
openPMDfw.fields_to_plot = Ex Ey Ez Bx By Bz jx jy jz rho rho_electrons rho_hydrogen
# reduce resolution of output fields
openPMDfw.coarsening_ratio = 4 4
openPMDfw.format = openpmd
openPMDfw.openpmd_backend = h5
# demonstration of a spatial and momentum filter
openPMDfw.electrons.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz>=0) * (x<1.0e-6) * (x>-1.0e-6)
openPMDfw.hydrogen.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz>=0) * (x<1.0e-6) * (x>-1.0e-6)
openPMDbw.intervals = 100
openPMDbw.diag_type = Full
openPMDbw.fields_to_plot = rho_hydrogen
# reduce resolution of output fields
openPMDbw.coarsening_ratio = 4 4
openPMDbw.format = openpmd
openPMDbw.openpmd_backend = h5
# demonstration of a momentum filter
openPMDbw.electrons.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz<0)
openPMDbw.hydrogen.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz<0)
#################################
# Reduced Diagnostics
#
# histograms with 2.0 degree acceptance angle in fw direction
# 2 deg * pi / 180 : 0.03490658503 rad
# half-angle +/- : 0.017453292515 rad
warpx.reduced_diags_names = histuH histue histuzAll FieldProbe_Z FieldProbe_ScatPoint FieldProbe_ScatLine LBC PhaseSpaceIons PhaseSpaceElectrons
histuH.type = ParticleHistogram
histuH.intervals = 100
histuH.species = hydrogen
histuH.bin_number = 1000
histuH.bin_min = 0.0
histuH.bin_max = 0.474 # 100 MeV protons
histuH.histogram_function(t,x,y,z,ux,uy,uz) = "u2=ux*ux+uy*uy+uz*uz; if(u2>0, sqrt(u2), 0.0)"
histuH.filter_function(t,x,y,z,ux,uy,uz) = "u2=ux*ux+uy*uy+uz*uz; if(u2>0, abs(acos(uz / sqrt(u2))) <= 0.017453, 0)"
histue.type = ParticleHistogram
histue.intervals = 100
histue.species = electrons
histue.bin_number = 1000
histue.bin_min = 0.0
histue.bin_max = 197 # 100 MeV electrons
histue.histogram_function(t,x,y,z,ux,uy,uz) = "u2=ux*ux+uy*uy+uz*uz; if(u2>0, sqrt(u2), 0.0)"
histue.filter_function(t,x,y,z,ux,uy,uz) = "u2=ux*ux+uy*uy+uz*uz; if(u2>0, abs(acos(uz / sqrt(u2))) <= 0.017453, 0)"
# just a test entry to make sure that the histogram filter is purely optional:
# this one just records uz of all hydrogen ions, independent of their pointing
histuzAll.type = ParticleHistogram
histuzAll.intervals = 100
histuzAll.species = hydrogen
histuzAll.bin_number = 1000
histuzAll.bin_min = -0.474
histuzAll.bin_max = 0.474
histuzAll.histogram_function(t,x,y,z,ux,uy,uz) = "uz"
FieldProbe_Z.type = FieldProbe
FieldProbe_Z.intervals = 100
FieldProbe_Z.integrate = 0
FieldProbe_Z.probe_geometry = Line
FieldProbe_Z.x_probe = 0.0
FieldProbe_Z.z_probe = -5.0e-6
FieldProbe_Z.x1_probe = 0.0
FieldProbe_Z.z1_probe = 25.0e-6
FieldProbe_Z.resolution = 3712
FieldProbe_ScatPoint.type = FieldProbe
FieldProbe_ScatPoint.intervals = 1
FieldProbe_ScatPoint.integrate = 0
FieldProbe_ScatPoint.probe_geometry = Point
FieldProbe_ScatPoint.x_probe = 0.0
FieldProbe_ScatPoint.z_probe = 15e-6
FieldProbe_ScatLine.type = FieldProbe
FieldProbe_ScatLine.intervals = 100
FieldProbe_ScatLine.integrate = 1
FieldProbe_ScatLine.probe_geometry = Line
FieldProbe_ScatLine.x_probe = -2.5e-6
FieldProbe_ScatLine.z_probe = 15e-6
FieldProbe_ScatLine.x1_probe = 2.5e-6
FieldProbe_ScatLine.z1_probe = 15e-6
FieldProbe_ScatLine.resolution = 201
# check computational load per box
LBC.type = LoadBalanceCosts
LBC.intervals = 100
PhaseSpaceIons.type = ParticleHistogram2D
PhaseSpaceIons.intervals = 100
PhaseSpaceIons.species = hydrogen
PhaseSpaceIons.bin_number_abs = 1000
PhaseSpaceIons.bin_number_ord = 1000
PhaseSpaceIons.bin_min_abs = -5.e-6
PhaseSpaceIons.bin_max_abs = 25.e-6
PhaseSpaceIons.bin_min_ord = -0.474
PhaseSpaceIons.bin_max_ord = 0.474
PhaseSpaceIons.histogram_function_abs(t,x,y,z,ux,uy,uz,w) = "z"
PhaseSpaceIons.histogram_function_ord(t,x,y,z,ux,uy,uz,w) = "uz"
PhaseSpaceIons.value_function(t,x,y,z,ux,uy,uz,w) = "w"
# PhaseSpaceIons.filter_function(t,x,y,z,ux,uy,uz,w) = "u2=ux*ux+uy*uy+uz*uz; if(u2>0, abs(acos(uz / sqrt(u2))) <= 0.017453, 0)"
PhaseSpaceElectrons.type = ParticleHistogram2D
PhaseSpaceElectrons.intervals = 100
PhaseSpaceElectrons.species = electrons
PhaseSpaceElectrons.bin_number_abs = 1000
PhaseSpaceElectrons.bin_number_ord = 1000
PhaseSpaceElectrons.bin_min_abs = -5.e-6
PhaseSpaceElectrons.bin_max_abs = 25.e-6
PhaseSpaceElectrons.bin_min_ord = -197
PhaseSpaceElectrons.bin_max_ord = 197
PhaseSpaceElectrons.histogram_function_abs(t,x,y,z,ux,uy,uz,w) = "z"
PhaseSpaceElectrons.histogram_function_ord(t,x,y,z,ux,uy,uz,w) = "uz"
PhaseSpaceElectrons.value_function(t,x,y,z,ux,uy,uz,w) = "w"
PhaseSpaceElectrons.filter_function(t,x,y,z,ux,uy,uz,w) = "sqrt(x*x+y*y) < 1e-6"
#################################
# Physical Background
#
# This example is modeled after a target similar to the hydrogen jet here:
# [1] https://doi.org/10.1038/s41598-017-10589-3
# [2] https://arxiv.org/abs/1903.06428
#
authors = "Axel Huebl <axelhuebl@lbl.gov>"
Analyze

Longitudinal phase space of forward-moving electrons in a 2 degree opening angle.

Longitudinal phase space of forward-moving protons in a 2 degree opening angle.
Time-resolved electron phase space analysis as in Fig. 3 gives information about, e.g., how laser energy is locally converted into electron kinetic energy. Later in time, ion phase spaces like Fig. 4 can reveal where accelerated ion populations originate.
Script analysis_histogram_2D.py
Examples/Physics_applications/laser_ion/analysis_histogram_2D.py
#!/usr/bin/env python3
# This script displays a 2D histogram.
import argparse
import matplotlib.colors as colors
import matplotlib.pyplot as plt
import numpy as np
from openpmd_viewer import OpenPMDTimeSeries
parser = argparse.ArgumentParser(description='Process a 2D histogram name and an integer.')
parser.add_argument("hist2D", help="Folder name of the reduced diagnostic.")
parser.add_argument("iter", help="Iteration number of the simulation that is plotted. Enter a number from the list of iterations or 'All' if you want all plots.")
args = parser.parse_args()
path = 'diags/reducedfiles/' + args.hist2D
ts = OpenPMDTimeSeries(path)
it = ts.iterations
data, info = ts.get_field(field="data", iteration=0, plot=True)
print('The available iterations of the simulation are:', it)
print('The axes of the histogram are (0: ordinate ; 1: abscissa):', info.axes)
print('The data shape is:', data.shape)
# Add the simulation time to the title once this information
# is available in the "info" FieldMetaInformation object.
if args.iter == 'All' :
for it_idx, i in enumerate(it):
plt.figure()
data, info = ts.get_field(field="data", iteration=i, plot=False)
abscissa_name = info.axes[1] # This might be 'z' or something else
abscissa_values = getattr(info, abscissa_name, None)
ordinate_name = info.axes[0] # This might be 'z' or something else
ordinate_values = getattr(info, ordinate_name, None)
plt.pcolormesh(abscissa_values/1e-6, ordinate_values, data, norm=colors.LogNorm(), rasterized=True)
plt.title(args.hist2D + f" Time: {ts.t[it_idx]:.2e} s (Iteration: {i:d})")
plt.xlabel(info.axes[1]+r' ($\mu$m)')
plt.ylabel(info.axes[0]+r' ($m_\mathrm{species} c$)')
plt.colorbar()
plt.tight_layout()
plt.savefig('Histogram_2D_' + args.hist2D + '_iteration_' + str(i) + '.png')
else :
i = int(args.iter)
it_idx = np.where(i == it)[0][0]
plt.figure()
data, info = ts.get_field(field="data", iteration=i, plot=False)
abscissa_name = info.axes[1] # This might be 'z' or something else
abscissa_values = getattr(info, abscissa_name, None)
ordinate_name = info.axes[0] # This might be 'z' or something else
ordinate_values = getattr(info, ordinate_name, None)
plt.pcolormesh(abscissa_values/1e-6, ordinate_values, data, norm=colors.LogNorm(), rasterized=True)
plt.title(args.hist2D + f" Time: {ts.t[it_idx]:.2e} s (Iteration: {i:d})")
plt.xlabel(info.axes[1]+r' ($\mu$m)')
plt.ylabel(info.axes[0]+r' ($m_\mathrm{species} c$)')
plt.colorbar()
plt.tight_layout()
plt.savefig('Histogram_2D_' + args.hist2D + '_iteration_' + str(i) + '.png')
Visualize
Note
The following images for densities and electromagnetic fields were created with a run on 64 NVidia A100 GPUs featuring a total number of cells of nx = 8192 and nz = 16384, as well as 64 particles per cell per species.

Particle density output illustrates the evolution of the target in time and space.
Logarithmic scales can help to identify where the target becomes transparent for the laser pulse (bottom panel in fig-tnsa-densities).

Electromagnetic field visualization for \(E_x\) (top), \(B_y\) (middle), and \(E_z\) (bottom).
Electromagnetic field output shows where the laser field is strongest at a given point in time, and where accelerating fields build up (Fig. 5).
Script plot_2d.py
Examples/Physics_applications/laser_ion/plot_2d.py
#!/usr/bin/env python3
# Copyright 2023 The WarpX Community
#
# This file is part of WarpX.
#
# Authors: Marco Garten
# License: BSD-3-Clause-LBNL
#
# This script plots the densities and fields of a 2D laser-ion acceleration simulation.
import argparse
import os
import re
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy.constants as sc
from matplotlib.colors import TwoSlopeNorm
from openpmd_viewer import OpenPMDTimeSeries
plt.rcParams.update({'font.size':16})
def create_analysis_dir(directory):
if not os.path.exists(directory):
os.makedirs(directory)
def visualize_density_iteration(ts, iteration, out_dir):
"""
Visualize densities and fields of a single iteration.
:param ts: OpenPMDTimeSeries
:param iteration: Output iteration (simulation timestep)
:param out_dir: Directory for PNG output
:return:
"""
# Physics parameters
lambda_L = 800e-9 # Laser wavelength in meters
omega_L = 2 * np.pi * sc.c / lambda_L # Laser angular frequency in rad/s
n_c = sc.m_e * sc.epsilon_0 * omega_L**2 / sc.elementary_charge**2 # Critical plasma density in meters^(-3)
micron = 1e-6
# Simulation parameters
n_e0 = 30
n_max = 2 * n_e0
nr = 1 # Number to decrease resolution
# Data fetching
it = iteration
ii = np.where(ts.iterations == it)[0][0]
time = ts.t[ii]
rho_e, rho_e_info = ts.get_field(field="rho_electrons", iteration=it)
rho_d, rho_d_info = ts.get_field(field="rho_hydrogen", iteration=it)
# Rescale to critical density
rho_e = rho_e / (sc.elementary_charge * n_c)
rho_d = rho_d / (sc.elementary_charge * n_c)
# Axes setup
fig, axs = plt.subplots(3, 1, figsize=(5, 8))
xax, zax = rho_e_info.x, rho_e_info.z
# Plotting
# Electron density
im0 = axs[0].pcolormesh(zax[::nr]/micron, xax[::nr]/micron, -rho_e.T[::nr, ::nr],
vmin=0, vmax=n_max, cmap="Reds", rasterized=True)
plt.colorbar(im0, ax=axs[0], label=r"$n_\mathrm{\,e}\ (n_\mathrm{c})$")
# Hydrogen density
im1 = axs[1].pcolormesh(zax[::nr]/micron, xax[::nr]/micron, rho_d.T[::nr, ::nr],
vmin=0, vmax=n_max, cmap="Blues", rasterized=True)
plt.colorbar(im1, ax=axs[1], label=r"$n_\mathrm{\,H}\ (n_\mathrm{c})$")
# Masked electron density
divnorm = TwoSlopeNorm(vmin=-7., vcenter=0., vmax=2)
masked_data = np.ma.masked_where(rho_e.T == 0, rho_e.T)
my_cmap = plt.cm.PiYG_r.copy()
my_cmap.set_bad(color='black')
im2 = axs[2].pcolormesh(zax[::nr]/micron, xax[::nr]/micron, np.log(-masked_data[::nr, ::nr]),
norm=divnorm, cmap=my_cmap, rasterized=True)
plt.colorbar(im2, ax=axs[2], ticks=[-6, -3, 0, 1, 2], extend='both',
label=r"$\log n_\mathrm{\,e}\ (n_\mathrm{c})$")
# Axis labels and title
for ax in axs:
ax.set_aspect(1.0)
ax.set_ylabel(r"$x$ ($\mu$m)")
for ax in axs[:-1]:
ax.set_xticklabels([])
axs[2].set_xlabel(r"$z$ ($\mu$m)")
fig.suptitle(f"Iteration: {it}, Time: {time/1e-15:.1f} fs")
plt.tight_layout()
plt.savefig(f"{out_dir}/densities_{it:06d}.png")
def visualize_field_iteration(ts, iteration, out_dir):
# Additional parameters
nr = 1 # Number to decrease resolution
micron = 1e-6
# Data fetching
it = iteration
ii = np.where(ts.iterations == it)[0][0]
time = ts.t[ii]
Ex, Ex_info = ts.get_field(field="E", coord="x", iteration=it)
Exmax = np.max(np.abs([np.min(Ex),np.max(Ex)]))
By, By_info = ts.get_field(field="B", coord="y", iteration=it)
Bymax = np.max(np.abs([np.min(By),np.max(By)]))
Ez, Ez_info = ts.get_field(field="E", coord="z", iteration=it)
Ezmax = np.max(np.abs([np.min(Ez),np.max(Ez)]))
# Axes setup
fig,axs = plt.subplots(3, 1, figsize=(5, 8))
xax, zax = Ex_info.x, Ex_info.z
# Plotting
im0 = axs[0].pcolormesh(
zax[::nr]/micron,xax[::nr]/micron,Ex.T[::nr,::nr],
vmin=-Exmax, vmax=Exmax,
cmap="RdBu", rasterized=True)
plt.colorbar(im0,ax=axs[0], label=r"$E_x$ (V/m)")
im1 = axs[1].pcolormesh(
zax[::nr]/micron,xax[::nr]/micron,By.T[::nr,::nr],
vmin=-Bymax, vmax=Bymax,
cmap="RdBu", rasterized=True)
plt.colorbar(im1,ax=axs[1], label=r"$B_y$ (T)")
im2 = axs[2].pcolormesh(
zax[::nr]/micron,xax[::nr]/micron,Ez.T[::nr,::nr],
vmin=-Ezmax, vmax=Ezmax,
cmap="RdBu", rasterized=True)
plt.colorbar(im2,ax=axs[2],label=r"$E_z$ (V/m)")
# Axis labels and title
for ax in axs:
ax.set_aspect(1.0)
ax.set_ylabel(r"$x$ ($\mu$m)")
for ax in axs[:-1]:
ax.set_xticklabels([])
axs[2].set_xlabel(r"$z$ ($\mu$m)")
fig.suptitle(f"Iteration: {it}, Time: {time/1e-15:.1f} fs")
plt.tight_layout()
plt.savefig(f"{out_dir}/fields_{it:06d}.png")
def visualize_particle_histogram_iteration(diag_name="histuH", species="hydrogen", iteration=1000, out_dir="./analysis"):
it = iteration
if species == "hydrogen":
# proton rest energy in eV
mc2 = sc.m_p/sc.electron_volt * sc.c**2
elif species == "electron":
mc2 = sc.m_e/sc.electron_volt * sc.c**2
else:
raise NotImplementedError("The only implemented presets for this analysis script are `electron` or `hydrogen`.")
fs = 1.e-15
MeV = 1.e6
df = pd.read_csv(f"./diags/reducedfiles/{diag_name}.txt",delimiter=r'\s+')
# the columns look like this:
# #[0]step() [1]time(s) [2]bin1=0.000220() [3]bin2=0.000660() [4]bin3=0.001100()
# matches words, strings surrounded by " ' ", dots, minus signs and e for scientific notation in numbers
nested_list = [re.findall(r"[\w'\.]+",col) for col in df.columns]
index = pd.MultiIndex.from_tuples(nested_list, names=('column#', 'name', 'bin value'))
df.columns = (index)
steps = df.values[:, 0].astype(int)
ii = np.where(steps == it)[0][0]
time = df.values[:, 1]
data = df.values[:, 2:]
edge_vals = np.array([float(row[2]) for row in df.columns[2:]])
edges_MeV = (np.sqrt(edge_vals**2 + 1)-1) * mc2 / MeV
time_fs = time / fs
fig,ax = plt.subplots(1,1)
ax.plot(edges_MeV, data[ii, :])
ax.set_yscale("log")
ax.set_ylabel(r"d$N$/d$\mathcal{E}$ (arb. u.)")
ax.set_xlabel(r"$\mathcal{E}$ (MeV)")
fig.suptitle(f"{species} - Iteration: {it}, Time: {time_fs[ii]:.1f} fs")
plt.tight_layout()
plt.savefig(f"./{out_dir}/{diag_name}_{it:06d}.png")
if __name__ == "__main__":
# Argument parsing
parser = argparse.ArgumentParser(description='Visualize Laser-Ion Accelerator Densities and Fields')
parser.add_argument('-d', '--diag_dir', type=str, default='./diags/diag1', help='Directory containing density and field diagnostics')
parser.add_argument('-i', '--iteration', type=int, default=None, help='Specific iteration to visualize')
parser.add_argument('-hn', '--histogram_name', type=str, default='histuH', help='Name of histogram diagnostic to visualize')
parser.add_argument('-hs', '--histogram_species', type=str, default='hydrogen', help='Particle species in the visualized histogram diagnostic')
args = parser.parse_args()
# Create analysis directory
analysis_dir = 'analysis'
create_analysis_dir(analysis_dir)
# Loading the time series
ts = OpenPMDTimeSeries(args.diag_dir)
if args.iteration is not None:
visualize_density_iteration(ts, args.iteration, analysis_dir)
visualize_field_iteration(ts, args.iteration, analysis_dir)
visualize_particle_histogram_iteration(args.histogram_name, args.histogram_species, args.iteration, analysis_dir)
else:
for it in ts.iterations:
visualize_density_iteration(ts, it, analysis_dir)
visualize_field_iteration(ts, it, analysis_dir)
visualize_particle_histogram_iteration(args.histogram_name, args.histogram_species, it, analysis_dir)
Plasma-Mirror
This example shows how to model a plasma mirror, using a planar target of solid density [7, 8].
Although laser-solid interaction generally requires full 3D modeling for an adequate description of the dynamics at play, this example is set up in 2D. 2D modeling provides a qualitative overview of the dynamics, but above all it saves computational cost, since the plasma frequency (and Debye length) of the surface plasma sets the resolution needed in laser-solid interaction modeling.
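As a rough guide to these resolution requirements, the plasma frequency, collisionless skin depth, and Debye length can be estimated from the electron density of the target. Below is a minimal sketch; the density and temperature values are illustrative assumptions only (the density is of the order of the 2*nc flat top used in the input file below, for an 800 nm laser):
# Minimal sketch: characteristic plasma scales of a solid-density target.
# The density and temperature below are illustrative assumptions.
import numpy as np
import scipy.constants as sc
n_e = 3.5e27   # electron density (m^-3), ~2x the critical density for 800 nm light (assumption)
T_e = 100.0    # electron temperature (eV), assumption
w_pe = np.sqrt(n_e * sc.e**2 / (sc.epsilon_0 * sc.m_e))       # plasma frequency (rad/s)
skin_depth = sc.c / w_pe                                       # collisionless skin depth (m)
debye = np.sqrt(sc.epsilon_0 * T_e * sc.e / (n_e * sc.e**2))   # Debye length (m)
print(f"w_pe = {w_pe:.2e} rad/s, skin depth = {skin_depth:.2e} m, Debye length = {debye:.2e} m")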
Note
TODO: The Python (PICMI) input file needs to be created.
Run
This example can be run either as:
Python script: (TODO) or
WarpX executable using an input file: warpx.2d inputs_2d
For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
Note
TODO: This input file should be created following the inputs_2d file.
Examples/Physics_applications/plasma_mirror/inputs_2d
#################################
####### GENERAL PARAMETERS ######
#################################
max_step = 1000
amr.n_cell = 1024 512
amr.max_grid_size = 128
amr.blocking_factor = 32
amr.max_level = 0
geometry.dims = 2
geometry.prob_lo = -100.e-6 0. # physical domain
geometry.prob_hi = 100.e-6 100.e-6
warpx.verbose = 1
warpx.serialize_initial_conditions = 1
#################################
####### Boundary condition ######
#################################
boundary.field_lo = pml pml
boundary.field_hi = pml pml
#################################
############ NUMERICS ###########
#################################
my_constants.zc = 20.e-6
my_constants.zp = 20.05545177444479562e-6
my_constants.lgrad = .08e-6
my_constants.nc = 1.74e27
my_constants.zp2 = 24.e-6
my_constants.zc2 = 24.05545177444479562e-6
warpx.cfl = 1.0
warpx.use_filter = 1
algo.load_balance_intervals = 66
# Order of particle shape factors
algo.particle_shape = 3
#################################
############ PLASMA #############
#################################
particles.species_names = electrons ions
electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = NUniformPerCell
electrons.num_particles_per_cell_each_dim = 2 2
electrons.momentum_distribution_type = "gaussian"
electrons.ux_th = .01
electrons.uz_th = .01
electrons.zmin = "zc-lgrad*log(400)"
electrons.zmax = 25.47931e-6
electrons.profile = parse_density_function
electrons.density_function(x,y,z) = "if(z<zp, nc*exp((z-zc)/lgrad), if(z<=zp2, 2.*nc, nc*exp(-(z-zc2)/lgrad)))"
ions.charge = q_e
ions.mass = m_p
ions.injection_style = NUniformPerCell
ions.num_particles_per_cell_each_dim = 2 2
ions.momentum_distribution_type = "at_rest"
ions.zmin = 19.520e-6
ions.zmax = 25.47931e-6
ions.profile = parse_density_function
ions.density_function(x,y,z) = "if(z<zp, nc*exp((z-zc)/lgrad), if(z<=zp2, 2.*nc, nc*exp(-(z-zc2)/lgrad)))"
#################################
############# LASER #############
#################################
lasers.names = laser1
laser1.position = 0. 0. 5.e-6 # This point is on the laser plane
laser1.direction = 0. 0. 1. # The plane normal direction
laser1.polarization = 1. 0. 0. # The main polarization vector
laser1.e_max = 4.e12 # Maximum amplitude of the laser field (in V/m)
laser1.wavelength = 0.8e-6 # The wavelength of the laser (in meters)
laser1.profile = Gaussian
laser1.profile_waist = 5.e-6 # The waist of the laser (in meters)
laser1.profile_duration = 15.e-15 # The duration of the laser (in seconds)
laser1.profile_t_peak = 25.e-15 # The time at which the laser reaches its peak (in seconds)
laser1.profile_focal_distance = 15.e-6 # Focal distance from the antenna (in meters)
# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = 10
diag1.diag_type = Full
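To inspect the target profile prescribed by electrons.density_function above (an exponential preplasma ramp, a flat top at 2*nc, and an exponential rear ramp), the same piecewise expression can be evaluated directly in Python. This is a minimal sketch, not part of the example; the constants are copied from the input file above:
# Minimal sketch (not part of the example): evaluate the piecewise density
# profile used by electrons.density_function to inspect the preplasma gradient.
import numpy as np
import matplotlib.pyplot as plt
zc, zp = 20.e-6, 20.05545177444479562e-6
zp2, zc2 = 24.e-6, 24.05545177444479562e-6
lgrad, nc = .08e-6, 1.74e27
def density(z):
    return np.where(z < zp, nc * np.exp((z - zc) / lgrad),
           np.where(z <= zp2, 2. * nc, nc * np.exp(-(z - zc2) / lgrad)))
z = np.linspace(19.5e-6, 25.5e-6, 2000)
plt.semilogy(z / 1e-6, density(z) / nc)
plt.xlabel(r'$z$ ($\mu$m)')
plt.ylabel(r'$n_e / n_c$')
plt.tight_layout()
plt.savefig('plasma_mirror_density_profile.png')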
Analyze
Note
This section is TODO.
Visualize
Note
This section is TODO.
Particle Accelerator & Beam Physics
Gaussian Beam
This example initializes a Gaussian beam distribution.
Run
This example can be run either as:
Python script: python3 PICMI_inputs_gaussian_beam.py or
WarpX executable using an input file: (TODO)
For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
Examples/Tests/gaussian_beam/PICMI_inputs_gaussian_beam.py
#!/usr/bin/env python3
#from warp import picmi
import argparse
from pywarpx import picmi
parser = argparse.ArgumentParser(description="Gaussian beam PICMI example")
parser.add_argument('--diagformat', type=str,
help='Format of the full diagnostics (plotfile, openpmd, ascent, sensei, ...)',
default='plotfile')
parser.add_argument('--fields_to_plot', type=str,
help='List of fields to write to diagnostics',
default=['E', 'B', 'J', 'part_per_cell'],
nargs = '*')
args = parser.parse_args()
constants = picmi.constants
nx = 32
ny = 32
nz = 32
xmin = -2.
xmax = +2.
ymin = -2.
ymax = +2.
zmin = -2.
zmax = +2.
number_sim_particles = 32768
total_charge = 8.010883097437485e-07
beam_rms_size = 0.25
electron_beam_divergence = -0.04*constants.c
em_order = 3
grid = picmi.Cartesian3DGrid(number_of_cells = [nx, ny, nz],
lower_bound = [xmin, ymin, zmin],
upper_bound = [xmax, ymax, zmax],
lower_boundary_conditions = ['periodic', 'periodic', 'open'],
upper_boundary_conditions = ['periodic', 'periodic', 'open'],
lower_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
upper_boundary_conditions_particles = ['periodic', 'periodic', 'absorbing'],
warpx_max_grid_size=16)
solver = picmi.ElectromagneticSolver(grid = grid,
cfl = 1.,
stencil_order=[em_order,em_order,em_order])
electron_beam = picmi.GaussianBunchDistribution(n_physical_particles = total_charge/constants.q_e,
rms_bunch_size = [beam_rms_size, beam_rms_size, beam_rms_size],
velocity_divergence = [electron_beam_divergence, electron_beam_divergence, electron_beam_divergence])
proton_beam = picmi.GaussianBunchDistribution(n_physical_particles = total_charge/constants.q_e,
rms_bunch_size = [beam_rms_size, beam_rms_size, beam_rms_size])
electrons = picmi.Species(particle_type='electron', name='electrons', initial_distribution=electron_beam)
protons = picmi.Species(particle_type='proton', name='protons', initial_distribution=proton_beam)
field_diag1 = picmi.FieldDiagnostic(name = 'diag1',
grid = grid,
period = 10,
data_list = args.fields_to_plot,
warpx_format = args.diagformat,
write_dir = '.',
warpx_file_prefix = 'Python_gaussian_beam_plt')
part_diag1 = picmi.ParticleDiagnostic(name = 'diag1',
period = 10,
species = [electrons, protons],
data_list = ['weighting', 'momentum'],
warpx_format = args.diagformat)
sim = picmi.Simulation(solver = solver,
max_steps = 10,
verbose = 1,
warpx_current_deposition_algo = 'direct',
warpx_use_filter = 0)
sim.add_species(electrons, layout=picmi.PseudoRandomLayout(n_macroparticles=number_sim_particles))
sim.add_species(protons, layout=picmi.PseudoRandomLayout(n_macroparticles=number_sim_particles))
sim.add_diagnostic(field_diag1)
sim.add_diagnostic(part_diag1)
# write_inputs will create an inputs file that can be used to run
# with the compiled version.
#sim.write_input_file(file_name = 'inputs_from_PICMI')
# Alternatively, sim.step will run WarpX, controlling it from Python
sim.step()
Note
TODO: This input file should be created following the PICMI_inputs_gaussian_beam.py file.
Analyze
Note
This section is TODO.
Visualize
Note
This section is TODO.
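A minimal openpmd_viewer sketch for inspecting the particle output is given below. It assumes the example was run with --diagformat openpmd, and the path './diags/diag1' is an assumption that may need to be adjusted to wherever the openPMD files were actually written:
# Minimal sketch (assumptions: openPMD output format and output path).
from openpmd_viewer import OpenPMDTimeSeries
ts = OpenPMDTimeSeries('./diags/diag1')
# Read the electron momenta and weights at the last output iteration
ux, uy, uz, w = ts.get_particle(['ux', 'uy', 'uz', 'w'], species='electrons',
                                iteration=ts.iterations[-1])
print(f"RMS normalized momenta: ux={ux.std():.3e}, uy={uy.std():.3e}, uz={uz.std():.3e}")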
Beam-beam collision
This example shows how to simulate the collision between two ultra-relativistic particle beams. This is representative of what happens at the interaction point of a linear collider. We consider a right-propagating electron bunch colliding against a left-propagating positron bunch.
We turn on the Quantum Synchrotron QED module for photon emission (also known as beamstrahlung in the collider community) and the Breit-Wheeler QED module for the generation of electron-positron pairs (also known as coherent pair generation in the collider community).
To solve for the electromagnetic field we use the nodal version of the electrostatic relativistic solver.
This solver computes the average velocity of each species and solves the corresponding relativistic Poisson equation (see the WarpX documentation for warpx.do_electrostatic = relativistic for more detail). This solver accurately reproduces the subtle cancellations that occur for some components of the \(\mathbf{E} + \mathbf{v} \times \mathbf{B}\) terms, which are crucial in simulations of relativistic particles.
This example is based on the work of Yakimenko et al. [9].
Run
The PICMI input file is not available for this example yet.
For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
Examples/Physics_applications/beam-beam_collision/inputs
#################################
########## MY CONSTANTS #########
#################################
my_constants.mc2 = m_e*clight*clight
my_constants.nano = 1.0e-9
my_constants.GeV = q_e*1.e9
# BEAMS
my_constants.beam_energy = 125.*GeV
my_constants.beam_uz = beam_energy/(mc2)
my_constants.beam_charge = 0.14*nano
my_constants.sigmax = 10*nano
my_constants.sigmay = 10*nano
my_constants.sigmaz = 10*nano
my_constants.beam_uth = 0.1/100.*beam_uz
my_constants.n0 = beam_charge / (q_e * sigmax * sigmay * sigmaz * (2.*pi)**(3./2.))
my_constants.omegab = sqrt(n0 * q_e**2 / (epsilon0*m_e))
my_constants.mux = 0.0
my_constants.muy = 0.0
my_constants.muz = -0.5*Lz+3.2*sigmaz
# BOX
my_constants.Lx = 100.0*clight/omegab
my_constants.Ly = 100.0*clight/omegab
my_constants.Lz = 180.0*clight/omegab
# for a full scale simulation use: nx, ny, nz = 512, 512, 1024
my_constants.nx = 64
my_constants.ny = 64
my_constants.nz = 128
# TIME
my_constants.T = 0.7*Lz/clight
my_constants.dt = sigmaz/clight/10.
# DIAGS
my_constants.every_red = 1.
warpx.used_inputs_file = warpx_used_inputs.txt
#################################
####### GENERAL PARAMETERS ######
#################################
stop_time = T
amr.n_cell = nx ny nz
amr.max_grid_size = 128
amr.blocking_factor = 2
amr.max_level = 0
geometry.dims = 3
geometry.prob_lo = -0.5*Lx -0.5*Ly -0.5*Lz
geometry.prob_hi = 0.5*Lx 0.5*Ly 0.5*Lz
#################################
######## BOUNDARY CONDITION #####
#################################
boundary.field_lo = PEC PEC PEC
boundary.field_hi = PEC PEC PEC
boundary.particle_lo = Absorbing Absorbing Absorbing
boundary.particle_hi = Absorbing Absorbing Absorbing
#################################
############ NUMERICS ###########
#################################
warpx.do_electrostatic = relativistic
warpx.const_dt = dt
warpx.grid_type = collocated
algo.particle_shape = 3
algo.load_balance_intervals=100
algo.particle_pusher = vay
#################################
########### PARTICLES ###########
#################################
particles.species_names = beam1 beam2 pho1 pho2 ele1 pos1 ele2 pos2
particles.photon_species = pho1 pho2
beam1.species_type = electron
beam1.injection_style = NUniformPerCell
beam1.num_particles_per_cell_each_dim = 1 1 1
beam1.profile = parse_density_function
beam1.density_function(x,y,z) = "n0 * exp(-(x-mux)**2/(2*sigmax**2)) * exp(-(y-muy)**2/(2*sigmay**2)) * exp(-(z-muz)**2/(2*sigmaz**2))"
beam1.density_min = n0 / 1e3
beam1.momentum_distribution_type = gaussian
beam1.uz_m = beam_uz
beam1.uy_m = 0.0
beam1.ux_m = 0.0
beam1.ux_th = beam_uth
beam1.uy_th = beam_uth
beam1.uz_th = beam_uth
beam1.initialize_self_fields = 1
beam1.self_fields_required_precision = 5e-10
beam1.self_fields_max_iters = 10000
beam1.do_qed_quantum_sync = 1
beam1.qed_quantum_sync_phot_product_species = pho1
beam1.do_classical_radiation_reaction = 0
beam2.species_type = positron
beam2.injection_style = NUniformPerCell
beam2.num_particles_per_cell_each_dim = 1 1 1
beam2.profile = parse_density_function
beam2.density_function(x,y,z) = "n0 * exp(-(x-mux)**2/(2*sigmax**2)) * exp(-(y-muy)**2/(2*sigmay**2)) * exp(-(z+muz)**2/(2*sigmaz**2))"
beam2.density_min = n0 / 1e3
beam2.momentum_distribution_type = gaussian
beam2.uz_m = -beam_uz
beam2.uy_m = 0.0
beam2.ux_m = 0.0
beam2.ux_th = beam_uth
beam2.uy_th = beam_uth
beam2.uz_th = beam_uth
beam2.initialize_self_fields = 1
beam2.self_fields_required_precision = 5e-10
beam2.self_fields_max_iters = 10000
beam2.do_qed_quantum_sync = 1
beam2.qed_quantum_sync_phot_product_species = pho2
beam2.do_classical_radiation_reaction = 0
pho1.species_type = photon
pho1.injection_style = none
pho1.do_qed_breit_wheeler = 1
pho1.qed_breit_wheeler_ele_product_species = ele1
pho1.qed_breit_wheeler_pos_product_species = pos1
pho2.species_type = photon
pho2.injection_style = none
pho2.do_qed_breit_wheeler = 1
pho2.qed_breit_wheeler_ele_product_species = ele2
pho2.qed_breit_wheeler_pos_product_species = pos2
ele1.species_type = electron
ele1.injection_style = none
ele1.self_fields_required_precision = 1e-11
ele1.self_fields_max_iters = 10000
ele1.do_qed_quantum_sync = 1
ele1.qed_quantum_sync_phot_product_species = pho1
ele1.do_classical_radiation_reaction = 0
pos1.species_type = positron
pos1.injection_style = none
pos1.self_fields_required_precision = 1e-11
pos1.self_fields_max_iters = 10000
pos1.do_qed_quantum_sync = 1
pos1.qed_quantum_sync_phot_product_species = pho1
pos1.do_classical_radiation_reaction = 0
ele2.species_type = electron
ele2.injection_style = none
ele2.self_fields_required_precision = 1e-11
ele2.self_fields_max_iters = 10000
ele2.do_qed_quantum_sync = 1
ele2.qed_quantum_sync_phot_product_species = pho2
ele2.do_classical_radiation_reaction = 0
pos2.species_type = positron
pos2.injection_style = none
pos2.self_fields_required_precision = 1e-11
pos2.self_fields_max_iters = 10000
pos2.do_qed_quantum_sync = 1
pos2.qed_quantum_sync_phot_product_species = pho2
pos2.do_classical_radiation_reaction = 0
#################################
############# QED ###############
#################################
qed_qs.photon_creation_energy_threshold = 0.
qed_qs.lookup_table_mode = builtin
qed_qs.chi_min = 1.e-3
qed_bw.lookup_table_mode = builtin
qed_bw.chi_min = 1.e-2
# for accurate results use the generated tables with
# the following parameters
# note: must compile with -DWarpX_QED_TABLE_GEN=ON
#qed_qs.lookup_table_mode = generate
#qed_bw.lookup_table_mode = generate
#qed_qs.tab_dndt_chi_min=1e-3
#qed_qs.tab_dndt_chi_max=2e3
#qed_qs.tab_dndt_how_many=512
#qed_qs.tab_em_chi_min=1e-3
#qed_qs.tab_em_chi_max=2e3
#qed_qs.tab_em_chi_how_many=512
#qed_qs.tab_em_frac_how_many=512
#qed_qs.tab_em_frac_min=1e-12
#qed_qs.save_table_in=my_qs_table.txt
#qed_bw.tab_dndt_chi_min=1e-2
#qed_bw.tab_dndt_chi_max=2e3
#qed_bw.tab_dndt_how_many=512
#qed_bw.tab_pair_chi_min=1e-2
#qed_bw.tab_pair_chi_max=2e3
#qed_bw.tab_pair_chi_how_many=512
#qed_bw.tab_pair_frac_how_many=512
#qed_bw.save_table_in=my_bw_table.txt
warpx.do_qed_schwinger = 0.
#################################
######### DIAGNOSTICS ###########
#################################
# FULL
diagnostics.diags_names = diag1
diag1.intervals = 0
diag1.diag_type = Full
diag1.write_species = 1
diag1.fields_to_plot = Ex Ey Ez Bx By Bz rho_beam1 rho_beam2 rho_ele1 rho_pos1 rho_ele2 rho_pos2
diag1.format = openpmd
diag1.dump_last_timestep = 1
diag1.species = pho1 pho2 ele1 pos1 ele2 pos2 beam1 beam2
# REDUCED
warpx.reduced_diags_names = ParticleNumber ColliderRelevant_beam1_beam2
ColliderRelevant_beam1_beam2.type = ColliderRelevant
ColliderRelevant_beam1_beam2.intervals = every_red
ColliderRelevant_beam1_beam2.species = beam1 beam2
ParticleNumber.type = ParticleNumber
ParticleNumber.intervals = every_red
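The reduced diagnostics requested above are written as whitespace-separated text tables, from which the quantities shown below (for example, photons emitted per beam particle) can be extracted. A minimal sketch for loading them follows; the output location ./diags/reducedfiles/ is the usual default and is assumed here, and the column layout should be read from the printed headers:
# Minimal sketch (not part of the example): load the reduced diagnostics.
import pandas as pd
for name in ("ParticleNumber", "ColliderRelevant_beam1_beam2"):
    df = pd.read_csv(f"./diags/reducedfiles/{name}.txt", delimiter=r'\s+')
    print(name, "columns:")
    print(list(df.columns))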
Visualize
The figure below shows the number of photons emitted per beam particle (left) and the number of secondary pairs generated per beam particle (right).
We compare different results:
* (red) simplified WarpX simulation, as in the example stored in the directory /Examples/Physics_applications/beam-beam_collision;
* (blue) large-scale WarpX simulation (high resolution and ad hoc generated lookup tables);
* (black) literature results from Yakimenko et al. [9].
The small-scale simulation has been performed with a resolution of nx = 64, ny = 64, nz = 128 grid cells, while the large-scale one has a much higher resolution of nx = 512, ny = 512, nz = 1024. Moreover, the large-scale simulation uses dedicated QED lookup tables instead of the builtin tables. To generate the tables within WarpX, the code must be compiled with the flag -DWarpX_QED_TABLE_GEN=ON. For the large-scale simulation we have used the following options:
qed_qs.lookup_table_mode = generate
qed_bw.lookup_table_mode = generate
qed_qs.tab_dndt_chi_min=1e-3
qed_qs.tab_dndt_chi_max=2e3
qed_qs.tab_dndt_how_many=512
qed_qs.tab_em_chi_min=1e-3
qed_qs.tab_em_chi_max=2e3
qed_qs.tab_em_chi_how_many=512
qed_qs.tab_em_frac_how_many=512
qed_qs.tab_em_frac_min=1e-12
qed_qs.save_table_in=my_qs_table.txt
qed_bw.tab_dndt_chi_min=1e-2
qed_bw.tab_dndt_chi_max=2e3
qed_bw.tab_dndt_how_many=512
qed_bw.tab_pair_chi_min=1e-2
qed_bw.tab_pair_chi_max=2e3
qed_bw.tab_pair_chi_how_many=512
qed_bw.tab_pair_frac_how_many=512
qed_bw.save_table_in=my_bw_table.txt

Beam-beam collision benchmark against Yakimenko et al. [9].
High Energy Astrophysical Plasma Physics
Ohm Solver: Magnetic Reconnection
Hybrid-PIC codes are often used to simulate magnetic reconnection in space plasmas. An example of magnetic reconnection from a force-free sheet is provided, based on the simulation described in Le et al. [10].
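The initial condition used in the script below is a force-free current sheet: the reconnecting field component reverses across the sheet while the guide field adjusts so that the total field magnitude stays uniform,
\[ B_x(z) = B_0 \tanh\!\left(\frac{z}{l_i}\right), \qquad B_y(z) = \sqrt{B_g^2 + B_0^2 - B_x^2(z)}, \]
with \(l_i\) the ion skin depth. A small perturbation of amplitude \(0.01\,B_0\) is added to \(B_x\) and \(B_z\) to seed the reconnection.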
Run
The following Python script configures and launches the simulation.
Script PICMI_inputs.py
Examples/Tests/ohm_solver_magnetic_reconnection/PICMI_inputs.py
#!/usr/bin/env python3
#
# --- Test script for the kinetic-fluid hybrid model in WarpX wherein ions are
# --- treated as kinetic particles and electrons as an isothermal, inertialess
# --- background fluid. The script demonstrates the use of this model to
# --- simulate magnetic reconnection in a force-free sheet. The setup is based
# --- on the problem described in Le et al. (2016)
# --- https://aip.scitation.org/doi/10.1063/1.4943893.
import argparse
import shutil
import sys
from pathlib import Path
import dill
import numpy as np
from mpi4py import MPI as mpi
from pywarpx import callbacks, fields, libwarpx, picmi
constants = picmi.constants
comm = mpi.COMM_WORLD
simulation = picmi.Simulation(
warpx_serialize_initial_conditions=True,
verbose=0
)
class ForceFreeSheetReconnection(object):
# B0 is chosen with all other quantities scaled by it
B0 = 0.1 # Initial magnetic field strength (T)
# Physical parameters
m_ion = 400.0 # Ion mass (electron masses)
beta_e = 0.1
Bg = 0.3 # times B0 - guiding field
dB = 0.01 # times B0 - initial perturbation to seed reconnection
T_ratio = 5.0 # T_i / T_e
# Domain parameters
LX = 40 # ion skin depths
LZ = 20 # ion skin depths
LT = 50 # ion cyclotron periods
DT = 1e-3 # ion cyclotron periods
# Resolution parameters
NX = 512
NZ = 512
# Starting number of particles per cell
NPPC = 400
# Plasma resistivity - used to dampen the mode excitation
eta = 6e-3 # normalized resistivity
# Number of substeps used to update B
substeps = 20
def __init__(self, test, verbose):
self.test = test
self.verbose = verbose or self.test
# calculate various plasma parameters based on the simulation input
self.get_plasma_quantities()
self.Lx = self.LX * self.l_i
self.Lz = self.LZ * self.l_i
self.dt = self.DT * self.t_ci
# run very low resolution as a CI test
if self.test:
self.total_steps = 20
self.diag_steps = self.total_steps // 5
self.NX = 128
self.NZ = 128
else:
self.total_steps = int(self.LT / self.DT)
self.diag_steps = self.total_steps // 200
# Initial magnetic field
self.Bg *= self.B0
self.dB *= self.B0
self.Bx = (
f"{self.B0}*tanh(z*{1.0/self.l_i})"
f"+{-self.dB*self.Lx/(2.0*self.Lz)}*cos({2.0*np.pi/self.Lx}*x)"
f"*sin({np.pi/self.Lz}*z)"
)
self.By = (
f"sqrt({self.Bg**2 + self.B0**2}-"
f"({self.B0}*tanh(z*{1.0/self.l_i}))**2)"
)
self.Bz = f"{self.dB}*sin({2.0*np.pi/self.Lx}*x)*cos({np.pi/self.Lz}*z)"
self.J0 = self.B0 / constants.mu0 / self.l_i
# dump all the current attributes to a dill pickle file
if comm.rank == 0:
with open(f'sim_parameters.dpkl', 'wb') as f:
dill.dump(self, f)
# print out plasma parameters
if comm.rank == 0:
print(
f"Initializing simulation with input parameters:\n"
f"\tTi = {self.Ti*1e-3:.1f} keV\n"
f"\tn0 = {self.n_plasma:.1e} m^-3\n"
f"\tB0 = {self.B0:.2f} T\n"
f"\tM/m = {self.m_ion:.0f}\n"
)
print(
f"Plasma parameters:\n"
f"\tl_i = {self.l_i:.1e} m\n"
f"\tt_ci = {self.t_ci:.1e} s\n"
f"\tv_ti = {self.vi_th:.1e} m/s\n"
f"\tvA = {self.vA:.1e} m/s\n"
)
print(
f"Numerical parameters:\n"
f"\tdz = {self.Lz/self.NZ:.1e} m\n"
f"\tdt = {self.dt:.1e} s\n"
f"\tdiag steps = {self.diag_steps:d}\n"
f"\ttotal steps = {self.total_steps:d}\n"
)
self.setup_run()
def get_plasma_quantities(self):
"""Calculate various plasma parameters based on the simulation input."""
# Ion mass (kg)
self.M = self.m_ion * constants.m_e
# Cyclotron angular frequency (rad/s) and period (s)
self.w_ce = constants.q_e * abs(self.B0) / constants.m_e
self.w_ci = constants.q_e * abs(self.B0) / self.M
self.t_ci = 2.0 * np.pi / self.w_ci
# Electron plasma frequency: w_pe / omega_ce = 2 is given
self.w_pe = 2.0 * self.w_ce
# calculate plasma density based on electron plasma frequency
self.n_plasma = (
self.w_pe**2 * constants.m_e * constants.ep0 / constants.q_e**2
)
# Ion plasma frequency (Hz)
self.w_pi = np.sqrt(
constants.q_e**2 * self.n_plasma / (self.M * constants.ep0)
)
# Ion skin depth (m)
self.l_i = constants.c / self.w_pi
# Alfven speed (m/s): vA = B / sqrt(mu0 * n * (M + m)) = c * omega_ci / w_pi
self.vA = abs(self.B0) / np.sqrt(
constants.mu0 * self.n_plasma * (constants.m_e + self.M)
)
# calculate Te based on beta
self.Te = (
self.beta_e * self.B0**2 / (2.0 * constants.mu0 * self.n_plasma)
/ constants.q_e
)
self.Ti = self.Te * self.T_ratio
# calculate thermal speeds
self.ve_th = np.sqrt(self.Te * constants.q_e / constants.m_e)
self.vi_th = np.sqrt(self.Ti * constants.q_e / self.M)
# Ion Larmor radius (m)
self.rho_i = self.vi_th / self.w_ci
# Reference resistivity (Malakit et al.)
self.eta0 = self.l_i * self.vA / (constants.ep0 * constants.c**2)
def setup_run(self):
"""Setup simulation components."""
#######################################################################
# Set geometry and boundary conditions #
#######################################################################
# Create grid
self.grid = picmi.Cartesian2DGrid(
number_of_cells=[self.NX, self.NZ],
lower_bound=[-self.Lx/2.0, -self.Lz/2.0],
upper_bound=[self.Lx/2.0, self.Lz/2.0],
lower_boundary_conditions=['periodic', 'dirichlet'],
upper_boundary_conditions=['periodic', 'dirichlet'],
lower_boundary_conditions_particles=['periodic', 'reflecting'],
upper_boundary_conditions_particles=['periodic', 'reflecting'],
warpx_max_grid_size=self.NZ
)
simulation.time_step_size = self.dt
simulation.max_steps = self.total_steps
simulation.current_deposition_algo = 'direct'
simulation.particle_shape = 1
simulation.use_filter = False
simulation.verbose = self.verbose
#######################################################################
# Field solver and external field #
#######################################################################
self.solver = picmi.HybridPICSolver(
grid=self.grid, gamma=1.0,
Te=self.Te, n0=self.n_plasma, n_floor=0.1*self.n_plasma,
plasma_resistivity=self.eta*self.eta0,
substeps=self.substeps
)
simulation.solver = self.solver
B_ext = picmi.AnalyticInitialField(
Bx_expression=self.Bx,
By_expression=self.By,
Bz_expression=self.Bz
)
simulation.add_applied_field(B_ext)
#######################################################################
# Particle types setup #
#######################################################################
self.ions = picmi.Species(
name='ions', charge='q_e', mass=self.M,
initial_distribution=picmi.UniformDistribution(
density=self.n_plasma,
rms_velocity=[self.vi_th]*3,
)
)
simulation.add_species(
self.ions,
layout=picmi.PseudoRandomLayout(
grid=self.grid,
n_macroparticles_per_cell=self.NPPC
)
)
#######################################################################
# Add diagnostics #
#######################################################################
callbacks.installafterEsolve(self.check_fields)
if self.test:
particle_diag = picmi.ParticleDiagnostic(
name='diag1',
period=self.total_steps,
write_dir='.',
species=[self.ions],
data_list=['ux', 'uy', 'uz', 'x', 'z', 'weighting'],
warpx_file_prefix='Python_ohms_law_solver_magnetic_reconnection_2d_plt',
# warpx_format='openpmd',
# warpx_openpmd_backend='h5',
)
simulation.add_diagnostic(particle_diag)
field_diag = picmi.FieldDiagnostic(
name='diag1',
grid=self.grid,
period=self.total_steps,
data_list=['Bx', 'By', 'Bz', 'Ex', 'Ey', 'Ez'],
write_dir='.',
warpx_file_prefix='Python_ohms_law_solver_magnetic_reconnection_2d_plt',
# warpx_format='openpmd',
# warpx_openpmd_backend='h5',
)
simulation.add_diagnostic(field_diag)
# reduced diagnostics for reconnection rate calculation
# create a 2 l_i box around the X-point on which to measure
# magnetic flux changes
plane = picmi.ReducedDiagnostic(
diag_type="FieldProbe",
name='plane',
period=self.diag_steps,
path='diags/',
extension='dat',
probe_geometry='Plane',
resolution=60,
x_probe=0.0, z_probe=0.0, detector_radius=self.l_i,
target_up_x=0, target_up_z=1.0
)
simulation.add_diagnostic(plane)
#######################################################################
# Initialize #
#######################################################################
if comm.rank == 0:
if Path.exists(Path("diags")):
shutil.rmtree("diags")
Path("diags/fields").mkdir(parents=True, exist_ok=True)
# Initialize inputs and WarpX instance
simulation.initialize_inputs()
simulation.initialize_warpx()
def check_fields(self):
step = simulation.extension.warpx.getistep(lev=0) - 1
if not (step == 1 or step%self.diag_steps == 0):
return
rho = fields.RhoFPWrapper(include_ghosts=False)[:,:]
Jiy = fields.JyFPWrapper(include_ghosts=False)[...] / self.J0
Jy = fields.JyFPAmpereWrapper(include_ghosts=False)[...] / self.J0
Bx = fields.BxFPWrapper(include_ghosts=False)[...] / self.B0
By = fields.ByFPWrapper(include_ghosts=False)[...] / self.B0
Bz = fields.BzFPWrapper(include_ghosts=False)[...] / self.B0
if libwarpx.amr.ParallelDescriptor.MyProc() != 0:
return
# save the fields to file
with open(f"diags/fields/fields_{step:06d}.npz", 'wb') as f:
np.savez(f, rho=rho, Jiy=Jiy, Jy=Jy, Bx=Bx, By=By, Bz=Bz)
##########################
# parse input parameters
##########################
parser = argparse.ArgumentParser()
parser.add_argument(
'-t', '--test', help='toggle whether this script is run as a short CI test',
action='store_true',
)
parser.add_argument(
'-v', '--verbose', help='Verbose output', action='store_true',
)
args, left = parser.parse_known_args()
sys.argv = sys.argv[:1]+left
run = ForceFreeSheetReconnection(test=args.test, verbose=args.verbose)
simulation.step()
Running the full simulation should take about 4 hours if executed on 1 V100 GPU.
For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
python3 PICMI_inputs.py
Analyze
The following script extracts the reconnection rate as a function of time and animates the evolution of the magnetic field (as shown below).
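In the script below, the normalized reconnection rate is estimated as the out-of-plane electric field \(\langle E_y \rangle\), averaged over the FieldProbe plane placed around the X-point, divided by the product of the Alfvén speed and the reference magnetic field strength,
\[ R(t) = \frac{\langle E_y \rangle}{v_\mathrm{A} B_0} . \]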
Script analysis.py
Examples/Tests/ohm_solver_magnetic_reconnection/analysis.py
#!/usr/bin/env python3
#
# --- Analysis script for the hybrid-PIC example of magnetic reconnection.
import glob
import dill
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import colors
plt.rcParams.update({'font.size': 20})
# load simulation parameters
with open(f'sim_parameters.dpkl', 'rb') as f:
sim = dill.load(f)
x_idx = 2
z_idx = 4
Ey_idx = 6
Bx_idx = 8
plane_data = np.loadtxt(f'diags/plane.dat', skiprows=1)
steps = np.unique(plane_data[:,0])
num_steps = len(steps)
num_cells = plane_data.shape[0]//num_steps
plane_data = plane_data.reshape((num_steps, num_cells, plane_data.shape[1]))
times = plane_data[:, 0, 1]
dt = np.mean(np.diff(times))
plt.plot(
times / sim.t_ci,
np.mean(plane_data[:,:,Ey_idx], axis=1) / (sim.vA * sim.B0),
'o-'
)
plt.grid()
plt.xlabel(r'$t/\tau_{c,i}$')
plt.ylabel('$<E_y>/v_AB_0$')
plt.title("Reconnection rate")
plt.tight_layout()
plt.savefig("diags/reconnection_rate.png")
if not sim.test:
from matplotlib.animation import FFMpegWriter, FuncAnimation
from scipy import interpolate
# Animate the magnetic reconnection
fig, axes = plt.subplots(3, 1, sharex=True, figsize=(7, 9))
for ax in axes.flatten():
ax.set_aspect('equal')
ax.set_ylabel('$z/l_i$')
axes[2].set_xlabel('$x/l_i$')
datafiles = sorted(glob.glob("diags/fields/*.npz"))
num_steps = len(datafiles)
data0 = np.load(datafiles[0])
sX = axes[0].imshow(
data0['Jy'].T, origin='lower',
norm=colors.TwoSlopeNorm(vmin=-0.6, vcenter=0., vmax=1.6),
extent=[0, sim.LX, -sim.LZ/2, sim.LZ/2],
cmap=plt.cm.RdYlBu_r
)
# axes[0].set_ylim(-5, 5)
cb = plt.colorbar(sX, ax=axes[0], label='$J_y/J_0$')
cb.ax.set_yscale('linear')
cb.ax.set_yticks([-0.5, 0.0, 0.75, 1.5])
sY = axes[1].imshow(
data0['By'].T, origin='lower', extent=[0, sim.LX, -sim.LZ/2, sim.LZ/2],
cmap=plt.cm.plasma
)
# axes[1].set_ylim(-5, 5)
cb = plt.colorbar(sY, ax=axes[1], label='$B_y/B_0$')
cb.ax.set_yscale('linear')
sZ = axes[2].imshow(
data0['Bz'].T, origin='lower', extent=[0, sim.LX, -sim.LZ/2, sim.LZ/2],
# norm=colors.TwoSlopeNorm(vmin=-0.02, vcenter=0., vmax=0.02),
cmap=plt.cm.RdBu
)
cb = plt.colorbar(sZ, ax=axes[2], label='$B_z/B_0$')
cb.ax.set_yscale('linear')
# plot field lines
x_grid = np.linspace(0, sim.LX, data0['Bx'][:-1].shape[0])
z_grid = np.linspace(-sim.LZ/2.0, sim.LZ/2.0, data0['Bx'].shape[1])
n_lines = 10
start_x = np.zeros(n_lines)
start_x[:n_lines//2] = sim.LX
start_z = np.linspace(-sim.LZ/2.0*0.9, sim.LZ/2.0*0.9, n_lines)
step_size = 1.0 / 100.0
def get_field_lines(Bx, Bz):
field_line_coords = []
Bx_interp = interpolate.interp2d(x_grid, z_grid, Bx[:-1].T)
Bz_interp = interpolate.interp2d(x_grid, z_grid, Bz[:,:-1].T)
for kk, z in enumerate(start_z):
path_x = [start_x[kk]]
path_z = [z]
ii = 0
while ii < 10000:
ii+=1
Bx = Bx_interp(path_x[-1], path_z[-1])[0]
Bz = Bz_interp(path_x[-1], path_z[-1])[0]
# print(path_x[-1], path_z[-1], Bx, Bz)
# normalize and scale
B_mag = np.sqrt(Bx**2 + Bz**2)
if B_mag == 0:
break
dx = Bx / B_mag * step_size
dz = Bz / B_mag * step_size
x_new = path_x[-1] + dx
z_new = path_z[-1] + dz
if np.isnan(x_new) or x_new <= 0 or x_new > sim.LX or abs(z_new) > sim.LZ/2:
break
path_x.append(x_new)
path_z.append(z_new)
field_line_coords.append([path_x, path_z])
return field_line_coords
field_lines = []
for path in get_field_lines(data0['Bx'], data0['Bz']):
path_x = path[0]
path_z = path[1]
l, = axes[2].plot(path_x, path_z, '--', color='k')
# draws arrows on the field lines
# if path_x[10] > path_x[0]:
axes[2].arrow(
path_x[50], path_z[50],
path_x[250]-path_x[50], path_z[250]-path_z[50],
shape='full', length_includes_head=True, lw=0, head_width=1.0,
color='g'
)
field_lines.append(l)
def animate(i):
data = np.load(datafiles[i])
sX.set_array(data['Jy'].T)
sY.set_array(data['By'].T)
sZ.set_array(data['Bz'].T)
sZ.set_clim(-np.max(abs(data['Bz'])), np.max(abs(data['Bz'])))
for ii, path in enumerate(get_field_lines(data['Bx'], data['Bz'])):
path_x = path[0]
path_z = path[1]
field_lines[ii].set_data(path_x, path_z)
anim = FuncAnimation(
fig, animate, frames=num_steps-1, repeat=True
)
writervideo = FFMpegWriter(fps=14)
anim.save('diags/mag_reconnection.mp4', writer=writervideo)
if sim.test:
import os
import sys
sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI
# this will be the name of the plot file
fn = sys.argv[1]
test_name = os.path.split(os.getcwd())[1]
checksumAPI.evaluate_checksum(test_name, fn)

Magnetic reconnection from a force-free sheet.
Microelectronics
ARTEMIS (Adaptive mesh Refinement Time-domain ElectrodynaMIcs Solver) is based on WarpX and couples the Maxwell's equations implementation in WarpX with classical equations that describe quantum material behavior (such as the LLG equation for micromagnetics and the London equation for superconducting materials) in order to quantify the performance of next-generation microelectronics.
Nuclear Fusion
Note
TODO
Fundamental Plasma Physics
Langmuir Waves
These are examples of plasma oscillations (Langmuir waves) in a uniform plasma in 1D, 2D, 3D, and RZ.
In each case, a uniform plasma is set up with a sinusoidal perturbation in the electron momentum along each axis. The plasma is followed for a short period of time, long enough for E fields to develop. The resulting fields can be compared to the analytic solutions.
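As a quick sanity check, the plasma frequency (and thus the oscillation period that the diagnostics should resolve) can be computed from the density used in the example. This is a minimal sketch using the electron density from the 3D PICMI script below:
# Minimal sketch: electron plasma frequency and period for n_e = 1e25 m^-3,
# the density used in the 3D PICMI example below.
import numpy as np
import scipy.constants as sc
n_e = 1.e25                                              # electron density (m^-3)
w_pe = np.sqrt(n_e * sc.e**2 / (sc.epsilon_0 * sc.m_e))  # plasma frequency (rad/s)
print(f"w_pe = {w_pe:.3e} rad/s, period = {2*np.pi/w_pe:.3e} s")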
Run
For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
This example can be run as a Python script: python3 PICMI_inputs_3d.py.
Examples/Tests/langmuir/PICMI_inputs_3d.py
#!/usr/bin/env python3
#
# --- Simple example of Langmuir oscillations in a uniform plasma
from pywarpx import picmi
constants = picmi.constants
##########################
# physics parameters
##########################
plasma_density = 1.e25
plasma_xmin = 0.
plasma_x_velocity = 0.1*constants.c
##########################
# numerics parameters
##########################
# --- Number of time steps
max_steps = 40
diagnostic_interval = 10
# --- Grid
nx = 64
ny = 64
nz = 64
xmin = -20.e-6
ymin = -20.e-6
zmin = -20.e-6
xmax = +20.e-6
ymax = +20.e-6
zmax = +20.e-6
number_per_cell_each_dim = [2,2,2]
##########################
# physics components
##########################
uniform_plasma = picmi.UniformDistribution(density = 1.e25,
upper_bound = [0., None, None],
directed_velocity = [0.1*constants.c, 0., 0.])
electrons = picmi.Species(particle_type='electron', name='electrons', initial_distribution=uniform_plasma)
##########################
# numerics components
##########################
grid = picmi.Cartesian3DGrid(number_of_cells = [nx, ny, nz],
lower_bound = [xmin, ymin, zmin],
upper_bound = [xmax, ymax, zmax],
lower_boundary_conditions = ['periodic', 'periodic', 'periodic'],
upper_boundary_conditions = ['periodic', 'periodic', 'periodic'],
moving_window_velocity = [0., 0., 0.],
warpx_max_grid_size = 32)
solver = picmi.ElectromagneticSolver(grid=grid, cfl=1.)
##########################
# diagnostics
##########################
field_diag1 = picmi.FieldDiagnostic(name = 'diag1',
grid = grid,
period = diagnostic_interval,
data_list = ['Ex', 'Jx'],
write_dir = '.',
warpx_file_prefix = 'Python_Langmuir_plt')
part_diag1 = picmi.ParticleDiagnostic(name = 'diag1',
period = diagnostic_interval,
species = [electrons],
data_list = ['weighting', 'ux'])
##########################
# simulation setup
##########################
sim = picmi.Simulation(solver = solver,
max_steps = max_steps,
verbose = 1,
warpx_current_deposition_algo = 'direct')
sim.add_species(electrons,
layout = picmi.GriddedLayout(n_macroparticle_per_cell=number_per_cell_each_dim, grid=grid))
sim.add_diagnostic(field_diag1)
sim.add_diagnostic(part_diag1)
##########################
# simulation run
##########################
# write_inputs will create an inputs file that can be used to run
# with the compiled version.
#sim.write_input_file(file_name = 'inputs_from_PICMI')
# Alternatively, sim.step will run WarpX, controlling it from Python
sim.step()
This example can be run as WarpX executable using an input file: warpx.3d inputs_3d
Examples/Tests/langmuir/inputs_3d
# Parameters for the plasma wave
my_constants.max_step = 40
my_constants.lx = 40.e-6 # length of sides
my_constants.dx = 6.25e-07 # grid cell size
my_constants.nx = lx/dx # number of cells in each dimension
my_constants.epsilon = 0.01
my_constants.n0 = 2.e24 # electron and positron densities, #/m^3
my_constants.wp = sqrt(2.*n0*q_e**2/(epsilon0*m_e)) # plasma frequency
my_constants.kp = wp/clight # plasma wavenumber
my_constants.k = 2.*2.*pi/lx # perturbation wavenumber
# Note: kp is calculated in SI for a density of 4e24 (i.e. 2e24 electrons + 2e24 positrons)
# k is calculated so as to have 2 periods within the 40e-6 wide box.
# Maximum number of time steps
max_step = max_step
# number of grid points
amr.n_cell = nx nx nx
# Maximum allowable size of each subdomain in the problem domain;
# this is used to decompose the domain for parallel calculations.
amr.max_grid_size = nx nx nx
# Maximum level in hierarchy (for now must be 0, i.e., one level in total)
amr.max_level = 0
# Geometry
geometry.dims = 3
geometry.prob_lo = -lx/2. -lx/2. -lx/2. # physical domain
geometry.prob_hi = lx/2. lx/2. lx/2.
# Boundary condition
boundary.field_lo = periodic periodic periodic
boundary.field_hi = periodic periodic periodic
warpx.serialize_initial_conditions = 1
# Verbosity
warpx.verbose = 1
# Algorithms
algo.current_deposition = esirkepov
algo.field_gathering = energy-conserving
warpx.use_filter = 0
# Order of particle shape factors
algo.particle_shape = 1
# CFL
warpx.cfl = 1.0
# Particles
particles.species_names = electrons positrons
electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 1 1 1
electrons.xmin = -20.e-6
electrons.xmax = 20.e-6
electrons.ymin = -20.e-6
electrons.ymax = 20.e-6
electrons.zmin = -20.e-6
electrons.zmax = 20.e-6
electrons.profile = constant
electrons.density = n0 # number of electrons per m^3
electrons.momentum_distribution_type = parse_momentum_function
electrons.momentum_function_ux(x,y,z) = "epsilon * k/kp * sin(k*x) * cos(k*y) * cos(k*z)"
electrons.momentum_function_uy(x,y,z) = "epsilon * k/kp * cos(k*x) * sin(k*y) * cos(k*z)"
electrons.momentum_function_uz(x,y,z) = "epsilon * k/kp * cos(k*x) * cos(k*y) * sin(k*z)"
positrons.charge = q_e
positrons.mass = m_e
positrons.injection_style = "NUniformPerCell"
positrons.num_particles_per_cell_each_dim = 1 1 1
positrons.xmin = -20.e-6
positrons.xmax = 20.e-6
positrons.ymin = -20.e-6
positrons.ymax = 20.e-6
positrons.zmin = -20.e-6
positrons.zmax = 20.e-6
positrons.profile = constant
positrons.density = n0 # number of positrons per m^3
positrons.momentum_distribution_type = parse_momentum_function
positrons.momentum_function_ux(x,y,z) = "-epsilon * k/kp * sin(k*x) * cos(k*y) * cos(k*z)"
positrons.momentum_function_uy(x,y,z) = "-epsilon * k/kp * cos(k*x) * sin(k*y) * cos(k*z)"
positrons.momentum_function_uz(x,y,z) = "-epsilon * k/kp * cos(k*x) * cos(k*y) * sin(k*z)"
# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = max_step
diag1.diag_type = Full
diag1.fields_to_plot = Ex Ey Ez Bx By Bz jx jy jz part_per_cell rho
diag1.electrons.variables = w ux
diag1.positrons.variables = uz
This example can be run as a Python script: python3 PICMI_inputs_2d.py.
Examples/Tests/langmuir/PICMI_inputs_2d.py
#!/usr/bin/env python3
#
# --- Simple example of Langmuir oscillations in a uniform plasma
# --- in two dimensions
from pywarpx import picmi
constants = picmi.constants
##########################
# physics parameters
##########################
plasma_density = 1.e25
plasma_xmin = 0.
plasma_x_velocity = 0.1*constants.c
##########################
# numerics parameters
##########################
# --- Number of time steps
max_steps = 40
diagnostic_intervals = "::10"
# --- Grid
nx = 64
ny = 64
xmin = -20.e-6
ymin = -20.e-6
xmax = +20.e-6
ymax = +20.e-6
number_per_cell_each_dim = [2,2]
##########################
# physics components
##########################
uniform_plasma = picmi.UniformDistribution(density = 1.e25,
upper_bound = [0., None, None],
directed_velocity = [0.1*constants.c, 0., 0.])
electrons = picmi.Species(particle_type='electron', name='electrons', initial_distribution=uniform_plasma)
##########################
# numerics components
##########################
grid = picmi.Cartesian2DGrid(number_of_cells = [nx, ny],
lower_bound = [xmin, ymin],
upper_bound = [xmax, ymax],
lower_boundary_conditions = ['periodic', 'periodic'],
upper_boundary_conditions = ['periodic', 'periodic'],
moving_window_velocity = [0., 0., 0.],
warpx_max_grid_size = 32)
solver = picmi.ElectromagneticSolver(grid=grid, cfl=1.)
##########################
# diagnostics
##########################
field_diag1 = picmi.FieldDiagnostic(name = 'diag1',
grid = grid,
period = diagnostic_intervals,
data_list = ['Ex', 'Jx'],
write_dir = '.',
warpx_file_prefix = 'Python_Langmuir_2d_plt')
part_diag1 = picmi.ParticleDiagnostic(name = 'diag1',
period = diagnostic_intervals,
species = [electrons],
data_list = ['weighting', 'ux'])
##########################
# simulation setup
##########################
sim = picmi.Simulation(solver = solver,
max_steps = max_steps,
verbose = 1,
warpx_current_deposition_algo = 'direct',
warpx_use_filter = 0)
sim.add_species(electrons,
layout = picmi.GriddedLayout(n_macroparticle_per_cell=number_per_cell_each_dim, grid=grid))
sim.add_diagnostic(field_diag1)
sim.add_diagnostic(part_diag1)
##########################
# simulation run
##########################
# write_inputs will create an inputs file that can be used to run
# with the compiled version.
sim.write_input_file(file_name = 'inputs2d_from_PICMI')
# Alternatively, sim.step will run WarpX, controlling it from Python
sim.step()
This example can be run as WarpX executable using an input file: warpx.2d inputs_2d
Examples/Tests/langmuir/inputs_2d
# Maximum number of time steps
max_step = 80
# number of grid points
amr.n_cell = 128 128
# Maximum allowable size of each subdomain in the problem domain;
# this is used to decompose the domain for parallel calculations.
amr.max_grid_size = 64
# Maximum level in hierarchy (for now must be 0, i.e., one level in total)
amr.max_level = 0
# Geometry
geometry.dims = 2
geometry.prob_lo = -20.e-6 -20.e-6 # physical domain
geometry.prob_hi = 20.e-6 20.e-6
# Boundary condition
boundary.field_lo = periodic periodic
boundary.field_hi = periodic periodic
warpx.serialize_initial_conditions = 1
# Verbosity
warpx.verbose = 1
# Algorithms
algo.field_gathering = energy-conserving
warpx.use_filter = 0
# Order of particle shape factors
algo.particle_shape = 1
# CFL
warpx.cfl = 1.0
# Parameters for the plasma wave
my_constants.epsilon = 0.01
my_constants.n0 = 2.e24 # electron and positron densities, #/m^3
my_constants.wp = sqrt(2.*n0*q_e**2/(epsilon0*m_e)) # plasma frequency
my_constants.kp = wp/clight # plasma wavenumber
my_constants.k = 2.*pi/20.e-6 # perturbation wavenumber
# Note: kp is calculated in SI for a density of 4e24 (i.e. 2e24 electrons + 2e24 positrons)
# k is calculated so as to have 2 periods within the 40e-6 wide box.
# Particles
particles.species_names = electrons positrons
electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 2 2
electrons.xmin = -20.e-6
electrons.xmax = 20.e-6
electrons.ymin = -20.e-6
electrons.ymax = 20.e-6
electrons.zmin = -20.e-6
electrons.zmax = 20.e-6
electrons.profile = constant
electrons.density = n0 # number of electrons per m^3
electrons.momentum_distribution_type = parse_momentum_function
electrons.momentum_function_ux(x,y,z) = "epsilon * k/kp * sin(k*x) * cos(k*y) * cos(k*z)"
electrons.momentum_function_uy(x,y,z) = "epsilon * k/kp * cos(k*x) * sin(k*y) * cos(k*z)"
electrons.momentum_function_uz(x,y,z) = "epsilon * k/kp * cos(k*x) * cos(k*y) * sin(k*z)"
positrons.charge = q_e
positrons.mass = m_e
positrons.injection_style = "NUniformPerCell"
positrons.num_particles_per_cell_each_dim = 2 2
positrons.xmin = -20.e-6
positrons.xmax = 20.e-6
positrons.ymin = -20.e-6
positrons.ymax = 20.e-6
positrons.zmin = -20.e-6
positrons.zmax = 20.e-6
positrons.profile = constant
positrons.density = n0 # number of positrons per m^3
positrons.momentum_distribution_type = parse_momentum_function
positrons.momentum_function_ux(x,y,z) = "-epsilon * k/kp * sin(k*x) * cos(k*y) * cos(k*z)"
positrons.momentum_function_uy(x,y,z) = "-epsilon * k/kp * cos(k*x) * sin(k*y) * cos(k*z)"
positrons.momentum_function_uz(x,y,z) = "-epsilon * k/kp * cos(k*x) * cos(k*y) * sin(k*z)"
# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = 40
diag1.diag_type = Full
This example can be run as a Python script: python3 PICMI_inputs_rz.py.
Examples/Tests/langmuir/PICMI_inputs_rz.py
#!/usr/bin/env python3
#
# This is a script that analyses the multimode simulation results.
# This simulates a RZ multimode periodic plasma wave.
# The electric field from the simulation is compared to the analytic value
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
from pywarpx import fields, picmi
constants = picmi.constants
##########################
# physics parameters
##########################
density = 2.e24
epsilon0 = 0.001*constants.c
epsilon1 = 0.001*constants.c
epsilon2 = 0.001*constants.c
w0 = 5.e-6
n_osc_z = 3
# Plasma frequency
wp = np.sqrt((density*constants.q_e**2)/(constants.m_e*constants.ep0))
kp = wp/constants.c
##########################
# numerics parameters
##########################
nr = 64
nz = 200
rmin = 0.e0
zmin = 0.e0
rmax = +20.e-6
zmax = +40.e-6
# Wave vector of the wave
k0 = 2.*np.pi*n_osc_z/(zmax - zmin)
diagnostic_intervals = 40
##########################
# physics components
##########################
uniform_plasma = picmi.UniformDistribution(density = density,
upper_bound = [+18e-6, +18e-6, None],
directed_velocity = [0., 0., 0.])
momentum_expressions = ["""+ epsilon0/kp*2*x/w0**2*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
- epsilon1/kp*2/w0*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
+ epsilon1/kp*4*x**2/w0**3*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
- epsilon2/kp*8*x/w0**2*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
+ epsilon2/kp*8*x*(x**2-y**2)/w0**4*exp(-(x**2+y**2)/w0**2)*sin(k0*z)""",
"""+ epsilon0/kp*2*y/w0**2*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
+ epsilon1/kp*4*x*y/w0**3*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
+ epsilon2/kp*8*y/w0**2*exp(-(x**2+y**2)/w0**2)*sin(k0*z)
+ epsilon2/kp*8*y*(x**2-y**2)/w0**4*exp(-(x**2+y**2)/w0**2)*sin(k0*z)""",
"""- epsilon0/kp*k0*exp(-(x**2+y**2)/w0**2)*cos(k0*z)
- epsilon1/kp*k0*2*x/w0*exp(-(x**2+y**2)/w0**2)*cos(k0*z)
- epsilon2/kp*k0*4*(x**2-y**2)/w0**2*exp(-(x**2+y**2)/w0**2)*cos(k0*z)"""]
analytic_plasma = picmi.AnalyticDistribution(density_expression = density,
upper_bound = [+18e-6, +18e-6, None],
epsilon0 = epsilon0,
epsilon1 = epsilon1,
epsilon2 = epsilon2,
kp = kp,
k0 = k0,
w0 = w0,
momentum_expressions = momentum_expressions)
electrons = picmi.Species(particle_type='electron', name='electrons', initial_distribution=analytic_plasma)
protons = picmi.Species(particle_type='proton', name='protons', initial_distribution=uniform_plasma)
##########################
# numerics components
##########################
grid = picmi.CylindricalGrid(number_of_cells = [nr, nz],
n_azimuthal_modes = 3,
lower_bound = [rmin, zmin],
upper_bound = [rmax, zmax],
lower_boundary_conditions = ['none', 'periodic'],
upper_boundary_conditions = ['none', 'periodic'],
lower_boundary_conditions_particles = ['absorbing', 'periodic'],
upper_boundary_conditions_particles = ['absorbing', 'periodic'],
moving_window_velocity = [0.,0.],
warpx_max_grid_size=64)
solver = picmi.ElectromagneticSolver(grid=grid, cfl=1.)
##########################
# diagnostics
##########################
field_diag1 = picmi.FieldDiagnostic(name = 'diag1',
grid = grid,
period = diagnostic_intervals,
data_list = ['Er', 'Ez', 'Bt', 'Jr', 'Jz', 'part_per_cell'],
write_dir = '.',
warpx_file_prefix = 'Python_Langmuir_rz_multimode_plt')
part_diag1 = picmi.ParticleDiagnostic(name = 'diag1',
period = diagnostic_intervals,
species = [electrons],
data_list = ['weighting', 'momentum'])
##########################
# simulation setup
##########################
sim = picmi.Simulation(solver = solver,
max_steps = 40,
verbose = 1,
warpx_current_deposition_algo = 'esirkepov',
warpx_field_gathering_algo = 'energy-conserving',
warpx_particle_pusher_algo = 'boris',
warpx_use_filter = 0)
sim.add_species(electrons, layout=picmi.GriddedLayout(n_macroparticle_per_cell=[2,16,2], grid=grid))
sim.add_species(protons, layout=picmi.GriddedLayout(n_macroparticle_per_cell=[2,16,2], grid=grid))
sim.add_diagnostic(field_diag1)
sim.add_diagnostic(part_diag1)
##########################
# simulation run
##########################
# write_inputs will create an inputs file that can be used to run
# with the compiled version.
#sim.write_input_file(file_name='inputsrz_from_PICMI')
# Alternatively, sim.step will run WarpX, controlling it from Python
sim.step()
# Below is WarpX specific code to check the results.
def calcEr( z, r, k0, w0, wp, t, epsilons) :
"""
Return the radial electric field as an array
of the same length as z and r, in the half-plane theta=0
"""
Er_array = (
epsilons[0] * constants.m_e*constants.c/constants.q_e * 2*r/w0**2 *
np.exp( -r**2/w0**2 ) * np.sin( k0*z ) * np.sin( wp*t )
- epsilons[1] * constants.m_e*constants.c/constants.q_e * 2/w0 *
np.exp( -r**2/w0**2 ) * np.sin( k0*z ) * np.sin( wp*t )
+ epsilons[1] * constants.m_e*constants.c/constants.q_e * 4*r**2/w0**3 *
np.exp( -r**2/w0**2 ) * np.sin( k0*z ) * np.sin( wp*t )
- epsilons[2] * constants.m_e*constants.c/constants.q_e * 8*r/w0**2 *
np.exp( -r**2/w0**2 ) * np.sin( k0*z ) * np.sin( wp*t )
+ epsilons[2] * constants.m_e*constants.c/constants.q_e * 8*r**3/w0**4 *
np.exp( -r**2/w0**2 ) * np.sin( k0*z ) * np.sin( wp*t ))
return( Er_array )
def calcEz( z, r, k0, w0, wp, t, epsilons) :
"""
Return the longitudinal electric field as an array
of the same length as z and r, in the half-plane theta=0
"""
Ez_array = (
- epsilons[0] * constants.m_e*constants.c/constants.q_e * k0 *
np.exp( -r**2/w0**2 ) * np.cos( k0*z ) * np.sin( wp*t )
- epsilons[1] * constants.m_e*constants.c/constants.q_e * k0 * 2*r/w0 *
np.exp( -r**2/w0**2 ) * np.cos( k0*z ) * np.sin( wp*t )
- epsilons[2] * constants.m_e*constants.c/constants.q_e * k0 * 4*r**2/w0**2 *
np.exp( -r**2/w0**2 ) * np.cos( k0*z ) * np.sin( wp*t ))
return( Ez_array )
# Current time of the simulation
t0 = sim.extension.warpx.gett_new(0)
# Get the raw field data. Note that these are the real and imaginary
# parts of the fields for each azimuthal mode.
Ex_sim_wrap = fields.ExWrapper()
Ez_sim_wrap = fields.EzWrapper()
Ex_sim_modes = Ex_sim_wrap[...]
Ez_sim_modes = Ez_sim_wrap[...]
rr_Er = Ex_sim_wrap.mesh('r')
zz_Er = Ex_sim_wrap.mesh('z')
rr_Ez = Ez_sim_wrap.mesh('r')
zz_Ez = Ez_sim_wrap.mesh('z')
rr_Er = rr_Er[:,np.newaxis]*np.ones(zz_Er.shape[0])[np.newaxis,:]
zz_Er = zz_Er[np.newaxis,:]*np.ones(rr_Er.shape[0])[:,np.newaxis]
rr_Ez = rr_Ez[:,np.newaxis]*np.ones(zz_Ez.shape[0])[np.newaxis,:]
zz_Ez = zz_Ez[np.newaxis,:]*np.ones(rr_Ez.shape[0])[:,np.newaxis]
# Sum the real components to get the field along x-axis (theta = 0)
Er_sim = Ex_sim_modes[:,:,0] + np.sum(Ex_sim_modes[:,:,1::2], axis=2)
Ez_sim = Ez_sim_modes[:,:,0] + np.sum(Ez_sim_modes[:,:,1::2], axis=2)
# The analytical solutions
Er_th = calcEr(zz_Er, rr_Er, k0, w0, wp, t0, [epsilon0, epsilon1, epsilon2])
Ez_th = calcEz(zz_Ez, rr_Ez, k0, w0, wp, t0, [epsilon0, epsilon1, epsilon2])
max_error_Er = abs(Er_sim - Er_th).max()/abs(Er_th).max()
max_error_Ez = abs(Ez_sim - Ez_th).max()/abs(Ez_th).max()
print("Max error Er %e"%max_error_Er)
print("Max error Ez %e"%max_error_Ez)
# Plot the last field from the loop (Er at iteration 40)
fig, ax = plt.subplots(3)
im = ax[0].imshow( Er_sim, aspect='auto', origin='lower' )
fig.colorbar(im, ax=ax[0], orientation='vertical')
ax[0].set_title('Er, last iteration (simulation)')
im = ax[1].imshow( Er_th, aspect='auto', origin='lower' )
fig.colorbar(im, ax=ax[1], orientation='vertical')
ax[1].set_title('Er, last iteration (theory)')
im = ax[2].imshow( (Er_sim - Er_th)/abs(Er_th).max(), aspect='auto', origin='lower' )
fig.colorbar(im, ax=ax[2], orientation='vertical')
ax[2].set_title('Er, last iteration (difference)')
plt.savefig('langmuir_multi_rz_multimode_analysis_Er.png')
fig, ax = plt.subplots(3)
im = ax[0].imshow( Ez_sim, aspect='auto', origin='lower' )
fig.colorbar(im, ax=ax[0], orientation='vertical')
ax[0].set_title('Ez, last iteration (simulation)')
im = ax[1].imshow( Ez_th, aspect='auto', origin='lower' )
fig.colorbar(im, ax=ax[1], orientation='vertical')
ax[1].set_title('Ez, last iteration (theory)')
im = ax[2].imshow( (Ez_sim - Ez_th)/abs(Ez_th).max(), aspect='auto', origin='lower' )
fig.colorbar(im, ax=ax[2], orientation='vertical')
ax[2].set_title('Ez, last iteration (difference)')
plt.savefig('langmuir_multi_rz_multimode_analysis_Ez.png')
assert max(max_error_Er, max_error_Ez) < 0.02
This example can be run as a WarpX executable using an input file: warpx.rz inputs_rz
Examples/Tests/langmuir/inputs_rz
# Parameters for the plasma wave
my_constants.max_step = 80
my_constants.epsilon = 0.01
my_constants.n0 = 2.e24 # electron density, #/m^3
my_constants.wp = sqrt(n0*q_e**2/(epsilon0*m_e)) # plasma frequency
my_constants.kp = wp/clight # plasma wavenumber
my_constants.k0 = 2.*pi/20.e-6    # longitudinal perturbation wavenumber
my_constants.w0 = 5.e-6 # transverse perturbation length
# Note: kp is calculated in SI for a density of 2e24
# k0 is calculated so as to have 2 periods within the 40e-6 wide box.
# Maximum number of time steps
max_step = max_step
# number of grid points
amr.n_cell = 64 128
# Maximum allowable size of each subdomain in the problem domain;
# this is used to decompose the domain for parallel calculations.
amr.max_grid_size = 64
# Maximum level in hierarchy (for now must be 0, i.e., one level in total)
amr.max_level = 0
# Geometry
geometry.dims = RZ
geometry.prob_lo = 0.e-6 -20.e-6 # physical domain
geometry.prob_hi = 20.e-6 20.e-6
boundary.field_lo = none periodic
boundary.field_hi = none periodic
warpx.serialize_initial_conditions = 1
# Verbosity
warpx.verbose = 1
# Algorithms
algo.field_gathering = energy-conserving
algo.current_deposition = esirkepov
warpx.use_filter = 0
# Order of particle shape factors
algo.particle_shape = 1
# CFL
warpx.cfl = 1.0
# Having this turned on makes for a more sensitive test
warpx.do_dive_cleaning = 1
# Particles
particles.species_names = electrons ions
electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 2 2 2
electrons.xmin = 0.e-6
electrons.xmax = 18.e-6
electrons.zmin = -20.e-6
electrons.zmax = +20.e-6
electrons.profile = constant
electrons.density = n0 # number of electrons per m^3
electrons.momentum_distribution_type = parse_momentum_function
electrons.momentum_function_ux(x,y,z) = "epsilon/kp*2*x/w0**2*exp(-(x**2+y**2)/w0**2)*sin(k0*z)"
electrons.momentum_function_uy(x,y,z) = "epsilon/kp*2*y/w0**2*exp(-(x**2+y**2)/w0**2)*sin(k0*z)"
electrons.momentum_function_uz(x,y,z) = "-epsilon/kp*k0*exp(-(x**2+y**2)/w0**2)*cos(k0*z)"
ions.charge = q_e
ions.mass = m_p
ions.injection_style = "NUniformPerCell"
ions.num_particles_per_cell_each_dim = 2 2 2
ions.xmin = 0.e-6
ions.xmax = 18.e-6
ions.zmin = -20.e-6
ions.zmax = +20.e-6
ions.profile = constant
ions.density = n0 # number of ions per m^3
ions.momentum_distribution_type = at_rest
# Diagnostics
diagnostics.diags_names = diag1 diag_parser_filter diag_uniform_filter diag_random_filter
diag1.intervals = max_step/2
diag1.diag_type = Full
diag1.fields_to_plot = jr jz Er Ez Bt
## diag_parser_filter is a diag used to test the particle filter function.
diag_parser_filter.intervals = max_step:max_step:
diag_parser_filter.diag_type = Full
diag_parser_filter.species = electrons
diag_parser_filter.electrons.plot_filter_function(t,x,y,z,ux,uy,uz) = "(uy-uz < 0) *
(sqrt(x**2+y**2)<10e-6) * (z > 0)"
## diag_uniform_filter is a diag used to test the particle uniform filter.
diag_uniform_filter.intervals = max_step:max_step:
diag_uniform_filter.diag_type = Full
diag_uniform_filter.species = electrons
diag_uniform_filter.electrons.uniform_stride = 3
## diag_random_filter is a diag used to test the particle random filter.
diag_random_filter.intervals = max_step:max_step:
diag_random_filter.diag_type = Full
diag_random_filter.species = electrons
diag_random_filter.electrons.random_fraction = 0.66
Note
TODO: This input file should be created, like the inputs_1d
file.
This example can be run as a WarpX executable using an input file: warpx.1d inputs_1d
Examples/Tests/langmuir/inputs_1d
# Maximum number of time steps
max_step = 80
# number of grid points
amr.n_cell = 128
# Maximum allowable size of each subdomain in the problem domain;
# this is used to decompose the domain for parallel calculations.
amr.max_grid_size = 64
# Maximum level in hierarchy (for now must be 0, i.e., one level in total)
amr.max_level = 0
# Geometry
geometry.dims = 1
geometry.prob_lo = -20.e-6 # physical domain
geometry.prob_hi = 20.e-6
# Boundary condition
boundary.field_lo = periodic
boundary.field_hi = periodic
warpx.serialize_initial_conditions = 1
# Verbosity
warpx.verbose = 1
# Algorithms
algo.field_gathering = energy-conserving
warpx.use_filter = 0
# Order of particle shape factors
algo.particle_shape = 1
# CFL
warpx.cfl = 0.8
# Parameters for the plasma wave
my_constants.epsilon = 0.01
my_constants.n0 = 2.e24 # electron and positron densities, #/m^3
my_constants.wp = sqrt(2.*n0*q_e**2/(epsilon0*m_e)) # plasma frequency
my_constants.kp = wp/clight # plasma wavenumber
my_constants.k = 2.*pi/20.e-6 # perturbation wavenumber
# Note: kp is calculated in SI for a density of 4e24 (i.e. 2e24 electrons + 2e24 positrons)
# k is calculated so as to have 2 periods within the 40e-6 wide box.
# Particles
particles.species_names = electrons positrons
electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 2
electrons.zmin = -20.e-6
electrons.zmax = 20.e-6
electrons.profile = constant
electrons.density = n0 # number of electrons per m^3
electrons.momentum_distribution_type = parse_momentum_function
electrons.momentum_function_ux(x,y,z) = "epsilon * k/kp * sin(k*x) * cos(k*y) * cos(k*z)"
electrons.momentum_function_uy(x,y,z) = "epsilon * k/kp * cos(k*x) * sin(k*y) * cos(k*z)"
electrons.momentum_function_uz(x,y,z) = "epsilon * k/kp * cos(k*x) * cos(k*y) * sin(k*z)"
positrons.charge = q_e
positrons.mass = m_e
positrons.injection_style = "NUniformPerCell"
positrons.num_particles_per_cell_each_dim = 2
positrons.zmin = -20.e-6
positrons.zmax = 20.e-6
positrons.profile = constant
positrons.density = n0 # number of positrons per m^3
positrons.momentum_distribution_type = parse_momentum_function
positrons.momentum_function_ux(x,y,z) = "-epsilon * k/kp * sin(k*x) * cos(k*y) * cos(k*z)"
positrons.momentum_function_uy(x,y,z) = "-epsilon * k/kp * cos(k*x) * sin(k*y) * cos(k*z)"
positrons.momentum_function_uz(x,y,z) = "-epsilon * k/kp * cos(k*x) * cos(k*y) * sin(k*z)"
# Diagnostics
diagnostics.diags_names = diag1 openpmd
diag1.intervals = 40
diag1.diag_type = Full
openpmd.intervals = 40
openpmd.diag_type = Full
openpmd.format = openpmd
Analyze
We run the following script to analyze correctness:
Script analysis_3d.py
Examples/Tests/langmuir/analysis_3d.py
#!/usr/bin/env python3
# Copyright 2019-2022 Jean-Luc Vay, Maxence Thevenet, Remi Lehe, Axel Huebl
#
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# This is a script that analyses the simulation results from
# the script `inputs.multi.rt`. This simulates a 3D periodic plasma wave.
# The electric field in the simulation is given (in theory) by:
# $$ E_x = \epsilon \,\frac{m_e c^2 k_x}{q_e}\sin(k_x x)\cos(k_y y)\cos(k_z z)\sin( \omega_p t)$$
# $$ E_y = \epsilon \,\frac{m_e c^2 k_y}{q_e}\cos(k_x x)\sin(k_y y)\cos(k_z z)\sin( \omega_p t)$$
# $$ E_z = \epsilon \,\frac{m_e c^2 k_z}{q_e}\cos(k_x x)\cos(k_y y)\sin(k_z z)\sin( \omega_p t)$$
import os
import re
import sys
import matplotlib.pyplot as plt
import yt
from mpl_toolkits.axes_grid1.axes_divider import make_axes_locatable
yt.funcs.mylog.setLevel(50)
import numpy as np
from scipy.constants import c, e, epsilon_0, m_e
sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI
# this will be the name of the plot file
fn = sys.argv[1]
# Parse test name and check if current correction (psatd.current_correction=1) is applied
current_correction = True if re.search( 'current_correction', fn ) else False
# Parse test name and check if Vay current deposition (algo.current_deposition=vay) is used
vay_deposition = True if re.search( 'Vay_deposition', fn ) else False
# Parse test name and check if div(E)/div(B) cleaning (warpx.do_div<e,b>_cleaning=1) is used
div_cleaning = True if re.search('div_cleaning', fn) else False
# Parameters (these parameters must match the parameters in `inputs.multi.rt`)
epsilon = 0.01
n = 4.e24
n_osc_x = 2
n_osc_y = 2
n_osc_z = 2
lo = [-20.e-6, -20.e-6, -20.e-6]
hi = [ 20.e-6, 20.e-6, 20.e-6]
Ncell = [64, 64, 64]
# Wave vector of the wave
kx = 2.*np.pi*n_osc_x/(hi[0]-lo[0])
ky = 2.*np.pi*n_osc_y/(hi[1]-lo[1])
kz = 2.*np.pi*n_osc_z/(hi[2]-lo[2])
# Plasma frequency
wp = np.sqrt((n*e**2)/(m_e*epsilon_0))
k = {'Ex':kx, 'Ey':ky, 'Ez':kz}
cos = {'Ex': (0,1,1), 'Ey':(1,0,1), 'Ez':(1,1,0)}
def get_contribution( is_cos, k, idim ):
du = (hi[idim]-lo[idim])/Ncell[idim]
u = lo[idim] + du*( 0.5 + np.arange(Ncell[idim]) )
if is_cos[idim] == 1:
return( np.cos(k*u) )
else:
return( np.sin(k*u) )
def get_theoretical_field( field, t ):
amplitude = epsilon * (m_e*c**2*k[field])/e * np.sin(wp*t)
cos_flag = cos[field]
x_contribution = get_contribution( cos_flag, kx, 0 )
y_contribution = get_contribution( cos_flag, ky, 1 )
z_contribution = get_contribution( cos_flag, kz, 2 )
E = amplitude * x_contribution[:, np.newaxis, np.newaxis] \
* y_contribution[np.newaxis, :, np.newaxis] \
* z_contribution[np.newaxis, np.newaxis, :]
return( E )
# Read the file
ds = yt.load(fn)
# Check that the particle selective output worked:
species = 'electrons'
print('ds.field_list', ds.field_list)
for field in ['particle_weight',
'particle_momentum_x']:
print('assert that this is in ds.field_list', (species, field))
assert (species, field) in ds.field_list
for field in ['particle_momentum_y',
'particle_momentum_z']:
print('assert that this is NOT in ds.field_list', (species, field))
assert (species, field) not in ds.field_list
species = 'positrons'
for field in ['particle_momentum_x',
'particle_momentum_y']:
print('assert that this is NOT in ds.field_list', (species, field))
assert (species, field) not in ds.field_list
t0 = ds.current_time.to_value()
data = ds.covering_grid(level = 0, left_edge = ds.domain_left_edge, dims = ds.domain_dimensions)
edge = np.array([(ds.domain_left_edge[2]).item(), (ds.domain_right_edge[2]).item(), \
(ds.domain_left_edge[0]).item(), (ds.domain_right_edge[0]).item()])
# Check the validity of the fields
error_rel = 0
for field in ['Ex', 'Ey', 'Ez']:
E_sim = data[('mesh',field)].to_ndarray()
E_th = get_theoretical_field(field, t0)
max_error = abs(E_sim-E_th).max()/abs(E_th).max()
print('%s: Max error: %.2e' %(field,max_error))
error_rel = max( error_rel, max_error )
# Plot the last field from the loop (Ez at iteration 40)
fig, (ax1, ax2) = plt.subplots(1, 2, dpi = 100)
# First plot (slice at y=0)
E_plot = E_sim[:,Ncell[1]//2+1,:]
vmin = E_plot.min()
vmax = E_plot.max()
cax1 = make_axes_locatable(ax1).append_axes('right', size = '5%', pad = '5%')
im1 = ax1.imshow(E_plot, origin = 'lower', extent = edge, vmin = vmin, vmax = vmax)
cb1 = fig.colorbar(im1, cax = cax1)
ax1.set_xlabel(r'$z$')
ax1.set_ylabel(r'$x$')
ax1.set_title(r'$E_z$ (sim)')
# Second plot (slice at y=0)
E_plot = E_th[:,Ncell[1]//2+1,:]
vmin = E_plot.min()
vmax = E_plot.max()
cax2 = make_axes_locatable(ax2).append_axes('right', size = '5%', pad = '5%')
im2 = ax2.imshow(E_plot, origin = 'lower', extent = edge, vmin = vmin, vmax = vmax)
cb2 = fig.colorbar(im2, cax = cax2)
ax2.set_xlabel(r'$z$')
ax2.set_ylabel(r'$x$')
ax2.set_title(r'$E_z$ (theory)')
# Save figure
fig.tight_layout()
fig.savefig('Langmuir_multi_analysis.png', dpi = 200)
tolerance_rel = 5e-2
print("error_rel : " + str(error_rel))
print("tolerance_rel: " + str(tolerance_rel))
assert( error_rel < tolerance_rel )
# Check relative L-infinity spatial norm of rho/epsilon_0 - div(E)
# with current correction (and periodic single box option) or with Vay current deposition
if current_correction:
tolerance = 1e-9
elif vay_deposition:
tolerance = 1e-3
if current_correction or vay_deposition:
rho = data[('boxlib','rho')].to_ndarray()
divE = data[('boxlib','divE')].to_ndarray()
error_rel = np.amax( np.abs( divE - rho/epsilon_0 ) ) / np.amax( np.abs( rho/epsilon_0 ) )
print("Check charge conservation:")
print("error_rel = {}".format(error_rel))
print("tolerance = {}".format(tolerance))
assert( error_rel < tolerance )
if div_cleaning:
ds_old = yt.load('Langmuir_multi_psatd_div_cleaning_plt000038')
ds_mid = yt.load('Langmuir_multi_psatd_div_cleaning_plt000039')
ds_new = yt.load(fn) # this is the last plotfile
ad_old = ds_old.covering_grid(level = 0, left_edge = ds_old.domain_left_edge, dims = ds_old.domain_dimensions)
ad_mid = ds_mid.covering_grid(level = 0, left_edge = ds_mid.domain_left_edge, dims = ds_mid.domain_dimensions)
ad_new = ds_new.covering_grid(level = 0, left_edge = ds_new.domain_left_edge, dims = ds_new.domain_dimensions)
rho = ad_mid['rho'].v.squeeze()
divE = ad_mid['divE'].v.squeeze()
F_old = ad_old['F'].v.squeeze()
F_new = ad_new['F'].v.squeeze()
# Check max norm of error on dF/dt = div(E) - rho/epsilon_0
# (the time interval between the old and new data is 2*dt)
dt = 1.203645751e-15
x = F_new - F_old
y = (divE - rho/epsilon_0) * 2 * dt
error_rel = np.amax(np.abs(x - y)) / np.amax(np.abs(y))
tolerance = 1e-2
print("Check div(E) cleaning:")
print("error_rel = {}".format(error_rel))
print("tolerance = {}".format(tolerance))
assert(error_rel < tolerance)
test_name = os.path.split(os.getcwd())[1]
if re.search( 'single_precision', fn ):
checksumAPI.evaluate_checksum(test_name, fn, rtol=1.e-3)
else:
checksumAPI.evaluate_checksum(test_name, fn)
Script analysis_2d.py
Examples/Tests/langmuir/analysis_2d.py
#!/usr/bin/env python3
# Copyright 2019 Jean-Luc Vay, Maxence Thevenet, Remi Lehe
#
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# This is a script that analyses the simulation results from
# the script `inputs.multi.rt`. This simulates a 3D periodic plasma wave.
# The electric field in the simulation is given (in theory) by:
# $$ E_x = \epsilon \,\frac{m_e c^2 k_x}{q_e}\sin(k_x x)\cos(k_y y)\cos(k_z z)\sin( \omega_p t)$$
# $$ E_y = \epsilon \,\frac{m_e c^2 k_y}{q_e}\cos(k_x x)\sin(k_y y)\cos(k_z z)\sin( \omega_p t)$$
# $$ E_z = \epsilon \,\frac{m_e c^2 k_z}{q_e}\cos(k_x x)\cos(k_y y)\sin(k_z z)\sin( \omega_p t)$$
import os
import re
import sys
import matplotlib.pyplot as plt
import yt
from mpl_toolkits.axes_grid1.axes_divider import make_axes_locatable
yt.funcs.mylog.setLevel(50)
import numpy as np
from scipy.constants import c, e, epsilon_0, m_e
sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI
# this will be the name of the plot file
fn = sys.argv[1]
# Parse test name and check if current correction (psatd.current_correction=1) is applied
current_correction = True if re.search( 'current_correction', fn ) else False
# Parse test name and check if Vay current deposition (algo.current_deposition=vay) is used
vay_deposition = True if re.search( 'Vay_deposition', fn ) else False
# Parse test name and check if particle_shape = 4 is used
particle_shape_4 = True if re.search('particle_shape_4', fn) else False
# Parameters (these parameters must match the parameters in `inputs.multi.rt`)
epsilon = 0.01
n = 4.e24
n_osc_x = 2
n_osc_z = 2
xmin = -20e-6; xmax = 20.e-6; Nx = 128
zmin = -20e-6; zmax = 20.e-6; Nz = 128
# Wave vector of the wave
kx = 2.*np.pi*n_osc_x/(xmax-xmin)
kz = 2.*np.pi*n_osc_z/(zmax-zmin)
# Plasma frequency
wp = np.sqrt((n*e**2)/(m_e*epsilon_0))
k = {'Ex':kx, 'Ez':kz}
cos = {'Ex': (0,1,1), 'Ez':(1,1,0)}
def get_contribution( is_cos, k ):
du = (xmax-xmin)/Nx
u = xmin + du*( 0.5 + np.arange(Nx) )
if is_cos == 1:
return( np.cos(k*u) )
else:
return( np.sin(k*u) )
def get_theoretical_field( field, t ):
amplitude = epsilon * (m_e*c**2*k[field])/e * np.sin(wp*t)
cos_flag = cos[field]
x_contribution = get_contribution( cos_flag[0], kx )
z_contribution = get_contribution( cos_flag[2], kz )
E = amplitude * x_contribution[:, np.newaxis ] \
* z_contribution[np.newaxis, :]
return( E )
# Read the file
ds = yt.load(fn)
t0 = ds.current_time.to_value()
data = ds.covering_grid(level = 0, left_edge = ds.domain_left_edge, dims = ds.domain_dimensions)
edge = np.array([(ds.domain_left_edge[1]).item(), (ds.domain_right_edge[1]).item(), \
(ds.domain_left_edge[0]).item(), (ds.domain_right_edge[0]).item()])
# Check the validity of the fields
error_rel = 0
for field in ['Ex', 'Ez']:
E_sim = data[('mesh',field)].to_ndarray()[:,:,0]
E_th = get_theoretical_field(field, t0)
max_error = abs(E_sim-E_th).max()/abs(E_th).max()
print('%s: Max error: %.2e' %(field,max_error))
error_rel = max( error_rel, max_error )
# Plot the last field from the loop (Ez at iteration 40)
fig, (ax1, ax2) = plt.subplots(1, 2, dpi = 100)
# First plot
vmin = E_sim.min()
vmax = E_sim.max()
cax1 = make_axes_locatable(ax1).append_axes('right', size = '5%', pad = '5%')
im1 = ax1.imshow(E_sim, origin = 'lower', extent = edge, vmin = vmin, vmax = vmax)
cb1 = fig.colorbar(im1, cax = cax1)
ax1.set_xlabel(r'$z$')
ax1.set_ylabel(r'$x$')
ax1.set_title(r'$E_z$ (sim)')
# Second plot
vmin = E_th.min()
vmax = E_th.max()
cax2 = make_axes_locatable(ax2).append_axes('right', size = '5%', pad = '5%')
im2 = ax2.imshow(E_th, origin = 'lower', extent = edge, vmin = vmin, vmax = vmax)
cb2 = fig.colorbar(im2, cax = cax2)
ax2.set_xlabel(r'$z$')
ax2.set_ylabel(r'$x$')
ax2.set_title(r'$E_z$ (theory)')
# Save figure
fig.tight_layout()
fig.savefig('Langmuir_multi_2d_analysis.png', dpi = 200)
if particle_shape_4:
# lower fidelity, due to smoothing
tolerance_rel = 0.07
else:
tolerance_rel = 0.05
print("error_rel : " + str(error_rel))
print("tolerance_rel: " + str(tolerance_rel))
assert( error_rel < tolerance_rel )
# Check relative L-infinity spatial norm of rho/epsilon_0 - div(E)
# with current correction (and periodic single box option) or with Vay current deposition
if current_correction:
tolerance = 1e-9
elif vay_deposition:
tolerance = 1e-3
if current_correction or vay_deposition:
rho = data[('boxlib','rho')].to_ndarray()
divE = data[('boxlib','divE')].to_ndarray()
error_rel = np.amax( np.abs( divE - rho/epsilon_0 ) ) / np.amax( np.abs( rho/epsilon_0 ) )
print("Check charge conservation:")
print("error_rel = {}".format(error_rel))
print("tolerance = {}".format(tolerance))
assert( error_rel < tolerance )
test_name = os.path.split(os.getcwd())[1]
checksumAPI.evaluate_checksum(test_name, fn)
Script analysis_rz.py
Examples/Tests/langmuir/analysis_rz.py
#!/usr/bin/env python3
# Copyright 2019 David Grote, Maxence Thevenet
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# This is a script that analyses the simulation results from
# the script `inputs.multi.rz.rt`. This simulates an RZ periodic plasma wave.
# The electric field in the simulation is given (in theory) by:
# $$ E_r = -\partial_r \phi = \epsilon \,\frac{mc^2}{e}\frac{2\,r}{w_0^2} \exp\left(-\frac{r^2}{w_0^2}\right) \sin(k_0 z) \sin(\omega_p t) $$
# $$ E_z = -\partial_z \phi = - \epsilon \,\frac{mc^2}{e} k_0 \exp\left(-\frac{r^2}{w_0^2}\right) \cos(k_0 z) \sin(\omega_p t) $$
# Unrelated to the Langmuir waves, we also test the plotfile particle filter function in this
# analysis script.
import os
import re
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import yt
yt.funcs.mylog.setLevel(50)
import numpy as np
import post_processing_utils
from scipy.constants import c, e, epsilon_0, m_e
sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI
# this will be the name of the plot file
fn = sys.argv[1]
test_name = os.path.split(os.getcwd())[1]
# Parse test name and check if current correction (psatd.current_correction) is applied
current_correction = True if re.search('current_correction', fn) else False
# Parameters (these parameters must match the parameters in `inputs.multi.rz.rt`)
epsilon = 0.01
n = 2.e24
w0 = 5.e-6
n_osc_z = 2
rmin = 0e-6; rmax = 20.e-6; Nr = 64
zmin = -20e-6; zmax = 20.e-6; Nz = 128
# Wave vector of the wave
k0 = 2.*np.pi*n_osc_z/(zmax-zmin)
# Plasma frequency
wp = np.sqrt((n*e**2)/(m_e*epsilon_0))
kp = wp/c
def Er( z, r, epsilon, k0, w0, wp, t) :
"""
Return the radial electric field as an array
of the same length as z and r, in the half-plane theta=0
"""
Er_array = \
epsilon * m_e*c**2/e * 2*r/w0**2 * \
np.exp( -r**2/w0**2 ) * np.sin( k0*z ) * np.sin( wp*t )
return( Er_array )
def Ez( z, r, epsilon, k0, w0, wp, t) :
"""
Return the longitudinal electric field as an array
of the same length as z and r, in the half-plane theta=0
"""
Ez_array = \
- epsilon * m_e*c**2/e * k0 * \
np.exp( -r**2/w0**2 ) * np.cos( k0*z ) * np.sin( wp*t )
return( Ez_array )
# Read the file
ds = yt.load(fn)
t0 = ds.current_time.to_value()
data = ds.covering_grid(level=0, left_edge=ds.domain_left_edge,
dims=ds.domain_dimensions)
# Get cell centered coordinates
dr = (rmax - rmin)/Nr
dz = (zmax - zmin)/Nz
coords = np.indices([Nr, Nz],'d')
rr = rmin + (coords[0] + 0.5)*dr
zz = zmin + (coords[1] + 0.5)*dz
# Check the validity of the fields
overall_max_error = 0
Er_sim = data[('boxlib','Er')].to_ndarray()[:,:,0]
Er_th = Er(zz, rr, epsilon, k0, w0, wp, t0)
max_error = abs(Er_sim-Er_th).max()/abs(Er_th).max()
print('Er: Max error: %.2e' %(max_error))
overall_max_error = max( overall_max_error, max_error )
Ez_sim = data[('boxlib','Ez')].to_ndarray()[:,:,0]
Ez_th = Ez(zz, rr, epsilon, k0, w0, wp, t0)
max_error = abs(Ez_sim-Ez_th).max()/abs(Ez_th).max()
print('Ez: Max error: %.2e' %(max_error))
overall_max_error = max( overall_max_error, max_error )
# Plot the last field from the loop (Ez at iteration 40)
plt.subplot2grid( (1,2), (0,0) )
plt.imshow( Ez_sim )
plt.colorbar()
plt.title('Ez, last iteration\n(simulation)')
plt.subplot2grid( (1,2), (0,1) )
plt.imshow( Ez_th )
plt.colorbar()
plt.title('Ez, last iteration\n(theory)')
plt.tight_layout()
plt.savefig(test_name+'_analysis.png')
error_rel = overall_max_error
tolerance_rel = 0.12
print("error_rel : " + str(error_rel))
print("tolerance_rel: " + str(tolerance_rel))
assert( error_rel < tolerance_rel )
# Check charge conservation (relative L-infinity norm of error) with current correction
if current_correction:
divE = data[('boxlib','divE')].to_ndarray()
rho = data[('boxlib','rho')].to_ndarray() / epsilon_0
error_rel = np.amax(np.abs(divE - rho)) / max(np.amax(divE), np.amax(rho))
tolerance = 1.e-9
print("Check charge conservation:")
print("error_rel = {}".format(error_rel))
print("tolerance = {}".format(tolerance))
assert( error_rel < tolerance )
## In the final part of the test, we verify that the diagnostic particle filter function works as
## expected in RZ geometry. For this, we only use the last simulation timestep.
dim = "rz"
species_name = "electrons"
parser_filter_fn = "diags/diag_parser_filter000080"
parser_filter_expression = "(py-pz < 0) * (r<10e-6) * (z > 0)"
post_processing_utils.check_particle_filter(fn, parser_filter_fn, parser_filter_expression,
dim, species_name)
uniform_filter_fn = "diags/diag_uniform_filter000080"
uniform_filter_expression = "ids%3 == 0"
post_processing_utils.check_particle_filter(fn, uniform_filter_fn, uniform_filter_expression,
dim, species_name)
random_filter_fn = "diags/diag_random_filter000080"
random_fraction = 0.66
post_processing_utils.check_random_filter(fn, random_filter_fn, random_fraction,
dim, species_name)
checksumAPI.evaluate_checksum(test_name, fn)
Script analysis_1d.py
Examples/Tests/langmuir/analysis_1d.py
#!/usr/bin/env python3
# Copyright 2019-2022 Jean-Luc Vay, Maxence Thevenet, Remi Lehe, Prabhat Kumar, Axel Huebl
#
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#
# This is a script that analyses the simulation results from
# the script `inputs.multi.rt`. This simulates a 1D periodic plasma wave.
# The electric field in the simulation is given (in theory) by:
# $$ E_z = \epsilon \,\frac{m_e c^2 k_z}{q_e}\sin(k_z z)\sin( \omega_p t)$$
import os
import re
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import yt
yt.funcs.mylog.setLevel(50)
import numpy as np
from scipy.constants import c, e, epsilon_0, m_e
sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI
# this will be the name of the plot file
fn = sys.argv[1]
# Parse test name and check if current correction (psatd.current_correction=1) is applied
current_correction = True if re.search( 'current_correction', fn ) else False
# Parse test name and check if Vay current deposition (algo.current_deposition=vay) is used
vay_deposition = True if re.search( 'Vay_deposition', fn ) else False
# Parameters (these parameters must match the parameters in `inputs.multi.rt`)
epsilon = 0.01
n = 4.e24
n_osc_z = 2
zmin = -20e-6; zmax = 20.e-6; Nz = 128
# Wave vector of the wave
kz = 2.*np.pi*n_osc_z/(zmax-zmin)
# Plasma frequency
wp = np.sqrt((n*e**2)/(m_e*epsilon_0))
k = {'Ez':kz}
cos = {'Ez':(1,1,0)}
def get_contribution( is_cos, k ):
du = (zmax-zmin)/Nz
u = zmin + du*( 0.5 + np.arange(Nz) )
if is_cos == 1:
return( np.cos(k*u) )
else:
return( np.sin(k*u) )
def get_theoretical_field( field, t ):
amplitude = epsilon * (m_e*c**2*k[field])/e * np.sin(wp*t)
cos_flag = cos[field]
z_contribution = get_contribution( cos_flag[2], kz )
E = amplitude * z_contribution
return( E )
# Read the file
ds = yt.load(fn)
t0 = ds.current_time.to_value()
data = ds.covering_grid(level=0, left_edge=ds.domain_left_edge,
dims=ds.domain_dimensions)
# Check the validity of the fields
error_rel = 0
for field in ['Ez']:
E_sim = data[('mesh',field)].to_ndarray()[:,0,0]
E_th = get_theoretical_field(field, t0)
max_error = abs(E_sim-E_th).max()/abs(E_th).max()
print('%s: Max error: %.2e' %(field,max_error))
error_rel = max( error_rel, max_error )
# Plot the last field from the loop (Ez at iteration 80)
plt.subplot2grid( (1,2), (0,0) )
plt.plot( E_sim )
#plt.colorbar()
plt.title('Ez, last iteration\n(simulation)')
plt.subplot2grid( (1,2), (0,1) )
plt.plot( E_th )
#plt.colorbar()
plt.title('Ez, last iteration\n(theory)')
plt.tight_layout()
plt.savefig('langmuir_multi_1d_analysis.png')
tolerance_rel = 0.05
print("error_rel : " + str(error_rel))
print("tolerance_rel: " + str(tolerance_rel))
assert( error_rel < tolerance_rel )
# Check relative L-infinity spatial norm of rho/epsilon_0 - div(E) when
# current correction (psatd.do_current_correction=1) is applied or when
# Vay current deposition (algo.current_deposition=vay) is used
if current_correction or vay_deposition:
rho = data[('boxlib','rho')].to_ndarray()
divE = data[('boxlib','divE')].to_ndarray()
error_rel = np.amax( np.abs( divE - rho/epsilon_0 ) ) / np.amax( np.abs( rho/epsilon_0 ) )
tolerance = 1.e-9
print("Check charge conservation:")
print("error_rel = {}".format(error_rel))
print("tolerance = {}".format(tolerance))
assert( error_rel < tolerance )
test_name = os.path.split(os.getcwd())[1]
checksumAPI.evaluate_checksum(test_name, fn)
Visualize
Note
This section is TODO.
Capacitive Discharge
The examples in this directory are based on the benchmark cases from Turner et al. (Phys. Plasmas 20, 013507, 2013) [11].
The Monte-Carlo collision (MCC) model can be used to simulate electron and ion collisions with a neutral background gas. In particular, this can be used to study capacitive discharges between parallel plates; a minimal collision setup is sketched after the note below. The implementation has been tested against the benchmark results from Turner et al. [11].
Note
This example needs additional calibration data for cross sections. Download this data alongside your inputs file and update the paths in the inputs file:
git clone https://github.com/ECP-WarpX/warpx-data.git
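As a minimal sketch of how these cross sections enter the setup, an MCC collision set for the electrons colliding with the helium background can be defined through PICMI as below. The densities, temperatures and threshold energy are taken from case 1 of the full script that follows; the complete configuration (including the ion collisions and both excitation processes) is in PICMI_inputs_1d.py.
from pywarpx import picmi
# Species involved in the collisions (defined in full in the script below).
electrons = picmi.Species(particle_type='electron', name='electrons')
ions = picmi.Species(particle_type='He', name='he_ions', charge='q_e', mass=6.67e-27)
# Cross-section files from the warpx-data repository cloned above.
cross_sec_direc = '../../../../warpx-data/MCC_cross_sections/He/'
# Electron collisions with the neutral He background: elastic scattering and
# ionization (threshold energy in eV); ionization creates new he_ions particles.
electron_colls = picmi.MCCCollisions(
    name='coll_elec',
    species=electrons,
    background_density=9.64e20,    # m^-3 (case 1 of Turner et al.)
    background_temperature=300.0,  # K
    background_mass=ions.mass,
    scattering_processes={
        'elastic': {'cross_section': cross_sec_direc + 'electron_scattering.dat'},
        'ionization': {'cross_section': cross_sec_direc + 'ionization.dat',
                       'energy': 24.55, 'species': ions},
    }
)
# The collision object is then passed to the simulation via
# picmi.Simulation(..., warpx_collisions=[electron_colls, ...]).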
Run
The 1D PICMI input file can be used to reproduce the results from Turner et al. for a given case, N from 1 to 4, by executing python3 PICMI_inputs_1d.py -n N, e.g.,
python3 PICMI_inputs_1d.py -n 1
For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
Examples/Physics_applications/capacitive_discharge/PICMI_inputs_1d.py
#!/usr/bin/env python3
#
# --- Copyright 2021 Modern Electron (DSMC test added in 2023 by TAE Technologies)
# --- Monte-Carlo Collision script to reproduce the benchmark tests from
# --- Turner et al. (2013) - https://doi.org/10.1063/1.4775084
import argparse
import sys
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse import linalg as sla
from pywarpx import callbacks, fields, libwarpx, particle_containers, picmi
constants = picmi.constants
class PoissonSolver1D(picmi.ElectrostaticSolver):
"""This solver is maintained as an example of the use of Python callbacks.
However, it is not necessarily needed since the 1D code has the direct tridiagonal
solver implemented."""
def __init__(self, grid, **kwargs):
"""Direct solver for the Poisson equation using superLU. This solver is
useful for 1D cases.
Arguments:
grid (picmi.Cartesian1DGrid): Instance of the grid on which the
solver will be installed.
"""
# Sanity check that this solver is appropriate to use
if not isinstance(grid, picmi.Cartesian1DGrid):
raise RuntimeError('Direct solver can only be used on a 1D grid.')
super(PoissonSolver1D, self).__init__(
grid=grid, method=kwargs.pop('method', 'Multigrid'),
required_precision=1, **kwargs
)
def solver_initialize_inputs(self):
"""Grab geometrical quantities from the grid. The boundary potentials
are also obtained from the grid using 'warpx_potential_zmin' for the
left_voltage and 'warpx_potential_zmax' for the right_voltage.
These can be given as floats or strings that can be parsed by the
WarpX parser.
"""
# grab the boundary potentials from the grid object
self.right_voltage = self.grid.potential_zmax
# set WarpX boundary potentials to None since we will handle it
# ourselves in this solver
self.grid.potential_xmin = None
self.grid.potential_xmax = None
self.grid.potential_ymin = None
self.grid.potential_ymax = None
self.grid.potential_zmin = None
self.grid.potential_zmax = None
super(PoissonSolver1D, self).solver_initialize_inputs()
self.nz = self.grid.number_of_cells[0]
self.dz = (self.grid.upper_bound[0] - self.grid.lower_bound[0]) / self.nz
self.nxguardphi = 1
self.nzguardphi = 1
self.phi = np.zeros(self.nz + 1 + 2*self.nzguardphi)
self.decompose_matrix()
callbacks.installpoissonsolver(self._run_solve)
def decompose_matrix(self):
"""Function to build the superLU object used to solve the linear
system."""
self.nsolve = self.nz + 1
# Set up the computation matrix in order to solve A*phi = rho
A = np.zeros((self.nsolve, self.nsolve))
idx = np.arange(self.nsolve)
A[idx, idx] = -2.0
A[idx[1:], idx[:-1]] = 1.0
A[idx[:-1], idx[1:]] = 1.0
A[0, 1] = 0.0
A[-1, -2] = 0.0
A[0, 0] = 1.0
A[-1, -1] = 1.0
A = csc_matrix(A, dtype=np.float64)
self.lu = sla.splu(A)
def _run_solve(self):
"""Function run on every step to perform the required steps to solve
Poisson's equation."""
# get rho from WarpX
self.rho_data = fields.RhoFPWrapper(0, False)[...]
# run superLU solver to get phi
self.solve()
# write phi to WarpX
fields.PhiFPWrapper(0, True)[...] = self.phi[:]
def solve(self):
"""The solution step. Includes getting the boundary potentials and
calculating phi from rho."""
left_voltage = 0.0
right_voltage = eval(
self.right_voltage, {
't': self.sim.extension.warpx.gett_new(0),
'sin': np.sin, 'pi': np.pi
}
)
# Construct b vector
rho = -self.rho_data / constants.ep0
b = np.zeros(rho.shape[0], dtype=np.float64)
b[:] = rho * self.dz**2
b[0] = left_voltage
b[-1] = right_voltage
phi = self.lu.solve(b)
self.phi[self.nzguardphi:-self.nzguardphi] = phi
self.phi[:self.nzguardphi] = left_voltage
self.phi[-self.nzguardphi:] = right_voltage
class CapacitiveDischargeExample(object):
'''The following runs a simulation of a parallel plate capacitor seeded
with a plasma in the spacing between the plates. A time varying voltage is
applied across the capacitor. The groups of 4 values below correspond to
the 4 cases simulated by Turner et al. (2013) in their benchmarks of
PIC-MCC codes.
'''
gap = 0.067 # m
freq = 13.56e6 # Hz
voltage = [450.0, 200.0, 150.0, 120.0] # V
gas_density = [9.64e20, 32.1e20, 96.4e20, 321e20] # m^-3
gas_temp = 300.0 # K
m_ion = 6.67e-27 # kg
plasma_density = [2.56e14, 5.12e14, 5.12e14, 3.84e14] # m^-3
elec_temp = 30000.0 # K
seed_nppc = 16 * np.array([32, 16, 8, 4])
nz = [128, 256, 512, 512]
dt = 1.0 / (np.array([400, 800, 1600, 3200]) * freq)
# Total simulation time in seconds
total_time = np.array([1280, 5120, 5120, 15360]) / freq
# Time (in seconds) between diagnostic evaluations
diag_interval = 32 / freq
def __init__(self, n=0, test=False, pythonsolver=False, dsmc=False):
"""Get input parameters for the specific case (n) desired."""
self.n = n
self.test = test
self.pythonsolver = pythonsolver
self.dsmc = dsmc
# Case specific input parameters
self.voltage = f"{self.voltage[n]}*sin(2*pi*{self.freq:.5e}*t)"
self.gas_density = self.gas_density[n]
self.plasma_density = self.plasma_density[n]
self.seed_nppc = self.seed_nppc[n]
self.nz = self.nz[n]
self.dt = self.dt[n]
self.max_steps = int(self.total_time[n] / self.dt)
self.diag_steps = int(self.diag_interval / self.dt)
if self.test:
self.max_steps = 50
self.diag_steps = 5
self.mcc_subcycling_steps = 2
self.rng = np.random.default_rng(23094290)
else:
self.mcc_subcycling_steps = None
self.rng = np.random.default_rng()
self.ion_density_array = np.zeros(self.nz + 1)
self.setup_run()
def setup_run(self):
"""Setup simulation components."""
#######################################################################
# Set geometry and boundary conditions #
#######################################################################
self.grid = picmi.Cartesian1DGrid(
number_of_cells=[self.nz],
warpx_max_grid_size=128,
lower_bound=[0],
upper_bound=[self.gap],
lower_boundary_conditions=['dirichlet'],
upper_boundary_conditions=['dirichlet'],
lower_boundary_conditions_particles=['absorbing'],
upper_boundary_conditions_particles=['absorbing'],
warpx_potential_hi_z=self.voltage,
)
#######################################################################
# Field solver #
#######################################################################
if self.pythonsolver:
self.solver = PoissonSolver1D(grid=self.grid)
else:
# This will use the tridiagonal solver
self.solver = picmi.ElectrostaticSolver(grid=self.grid)
#######################################################################
# Particle types setup #
#######################################################################
self.electrons = picmi.Species(
particle_type='electron', name='electrons',
initial_distribution=picmi.UniformDistribution(
density=self.plasma_density,
rms_velocity=[np.sqrt(constants.kb * self.elec_temp / constants.m_e)]*3,
)
)
self.ions = picmi.Species(
particle_type='He', name='he_ions',
charge='q_e', mass=self.m_ion,
initial_distribution=picmi.UniformDistribution(
density=self.plasma_density,
rms_velocity=[np.sqrt(constants.kb * self.gas_temp / self.m_ion)]*3,
)
)
if self.dsmc:
self.neutrals = picmi.Species(
particle_type='He', name='neutrals',
charge=0, mass=self.m_ion,
warpx_reflection_model_zlo=1.0,
warpx_reflection_model_zhi=1.0,
warpx_do_resampling=True,
warpx_resampling_trigger_max_avg_ppc=int(self.seed_nppc*1.5),
initial_distribution=picmi.UniformDistribution(
density=self.gas_density,
rms_velocity=[np.sqrt(constants.kb * self.gas_temp / self.m_ion)]*3,
)
)
#######################################################################
# Collision initialization #
#######################################################################
cross_sec_direc = '../../../../warpx-data/MCC_cross_sections/He/'
electron_colls = picmi.MCCCollisions(
name='coll_elec',
species=self.electrons,
background_density=self.gas_density,
background_temperature=self.gas_temp,
background_mass=self.ions.mass,
ndt=self.mcc_subcycling_steps,
scattering_processes={
'elastic' : {
'cross_section' : cross_sec_direc+'electron_scattering.dat'
},
'excitation1' : {
'cross_section': cross_sec_direc+'excitation_1.dat',
'energy' : 19.82
},
'excitation2' : {
'cross_section': cross_sec_direc+'excitation_2.dat',
'energy' : 20.61
},
'ionization' : {
'cross_section' : cross_sec_direc+'ionization.dat',
'energy' : 24.55,
'species' : self.ions
},
}
)
ion_scattering_processes={
'elastic': {'cross_section': cross_sec_direc+'ion_scattering.dat'},
'back': {'cross_section': cross_sec_direc+'ion_back_scatter.dat'},
# 'charge_exchange': {'cross_section': cross_sec_direc+'charge_exchange.dat'}
}
if self.dsmc:
ion_colls = picmi.DSMCCollisions(
name='coll_ion',
species=[self.ions, self.neutrals],
ndt=5, scattering_processes=ion_scattering_processes
)
else:
ion_colls = picmi.MCCCollisions(
name='coll_ion',
species=self.ions,
background_density=self.gas_density,
background_temperature=self.gas_temp,
ndt=self.mcc_subcycling_steps,
scattering_processes=ion_scattering_processes
)
#######################################################################
# Initialize simulation #
#######################################################################
self.sim = picmi.Simulation(
solver=self.solver,
time_step_size=self.dt,
max_steps=self.max_steps,
warpx_collisions=[electron_colls, ion_colls],
verbose=self.test
)
self.solver.sim = self.sim
self.sim.add_species(
self.electrons,
layout = picmi.GriddedLayout(
n_macroparticle_per_cell=[self.seed_nppc], grid=self.grid
)
)
self.sim.add_species(
self.ions,
layout = picmi.GriddedLayout(
n_macroparticle_per_cell=[self.seed_nppc], grid=self.grid
)
)
if self.dsmc:
self.sim.add_species(
self.neutrals,
layout = picmi.GriddedLayout(
n_macroparticle_per_cell=[self.seed_nppc//2], grid=self.grid
)
)
self.solver.sim_ext = self.sim.extension
if self.dsmc:
# Periodically reset neutral density to starting temperature
callbacks.installbeforecollisions(self.rethermalize_neutrals)
#######################################################################
# Add diagnostics for the CI test to be happy #
#######################################################################
if self.dsmc:
file_prefix = 'Python_dsmc_1d_plt'
else:
if self.pythonsolver:
file_prefix = 'Python_background_mcc_1d_plt'
else:
file_prefix = 'Python_background_mcc_1d_tridiag_plt'
species = [self.electrons, self.ions]
if self.dsmc:
species.append(self.neutrals)
particle_diag = picmi.ParticleDiagnostic(
species=species,
name='diag1',
period=0,
write_dir='.',
warpx_file_prefix=file_prefix
)
field_diag = picmi.FieldDiagnostic(
name='diag1',
grid=self.grid,
period=0,
data_list=['rho_electrons', 'rho_he_ions'],
write_dir='.',
warpx_file_prefix=file_prefix
)
self.sim.add_diagnostic(particle_diag)
self.sim.add_diagnostic(field_diag)
def rethermalize_neutrals(self):
# When using DSMC the neutral temperature will change due to collisions
# with the ions. This is not captured in the original MCC test.
# Re-thermalize the neutrals every 1000 steps
step = self.sim.extension.warpx.getistep(lev=0)
if step % 1000 != 10:
return
if not hasattr(self, 'neutral_cont'):
self.neutral_cont = particle_containers.ParticleContainerWrapper(
self.neutrals.name
)
ux_arrays = self.neutral_cont.uxp
uy_arrays = self.neutral_cont.uyp
uz_arrays = self.neutral_cont.uzp
vel_std = np.sqrt(constants.kb * self.gas_temp / self.m_ion)
for ii in range(len(ux_arrays)):
nps = len(ux_arrays[ii])
ux_arrays[ii][:] = vel_std * self.rng.normal(size=nps)
uy_arrays[ii][:] = vel_std * self.rng.normal(size=nps)
uz_arrays[ii][:] = vel_std * self.rng.normal(size=nps)
def _get_rho_ions(self):
# deposit the ion density in rho_fp
he_ions_wrapper = particle_containers.ParticleContainerWrapper('he_ions')
he_ions_wrapper.deposit_charge_density(level=0)
rho_data = self.rho_wrapper[...]
self.ion_density_array += rho_data / constants.q_e / self.diag_steps
def run_sim(self):
self.sim.step(self.max_steps - self.diag_steps)
self.rho_wrapper = fields.RhoFPWrapper(0, False)
callbacks.installafterstep(self._get_rho_ions)
self.sim.step(self.diag_steps)
if self.pythonsolver:
# confirm that the external solver was run
assert hasattr(self.solver, 'phi')
if libwarpx.amr.ParallelDescriptor.MyProc() == 0:
np.save(f'ion_density_case_{self.n+1}.npy', self.ion_density_array)
# query the particle z-coordinates if this is run during CI testing
# to cover that functionality
if self.test:
he_ions_wrapper = particle_containers.ParticleContainerWrapper('he_ions')
nparts = he_ions_wrapper.get_particle_count(local=True)
z_coords = np.concatenate(he_ions_wrapper.zp)
assert len(z_coords) == nparts
assert np.all(z_coords >= 0.0) and np.all(z_coords <= self.gap)
##########################
# parse input parameters
##########################
parser = argparse.ArgumentParser()
parser.add_argument(
'-t', '--test', help='toggle whether this script is run as a short CI test',
action='store_true',
)
parser.add_argument(
'-n', help='Test number to run (1 to 4)', required=False, type=int,
default=1
)
parser.add_argument(
'--pythonsolver', help='toggle whether to use the Python level solver',
action='store_true'
)
parser.add_argument(
'--dsmc', help='toggle whether to use DSMC for ions in place of MCC',
action='store_true'
)
args, left = parser.parse_known_args()
sys.argv = sys.argv[:1]+left
if args.n < 1 or args.n > 4:
raise AttributeError('Test number must be an integer from 1 to 4.')
run = CapacitiveDischargeExample(
n=args.n-1, test=args.test, pythonsolver=args.pythonsolver, dsmc=args.dsmc
)
run.run_sim()
Analyze
Once the simulation completes, an output file ion_density_case_N.npy (where N is the case number) will be created, which can be compared to the literature results as in the plot below. Running case 1 on four CPU processors takes roughly 20 minutes to complete.
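For a quick look at the result, the saved profile can be loaded and plotted directly. The sketch below assumes case 1 was run; the overlaid reference file name is purely illustrative (a digitized profile from the supplementary material of Turner et al. would have to be prepared separately).
import matplotlib.pyplot as plt
import numpy as np
# Time-averaged He+ density saved at the end of the run (case 1 corresponds to
# n=0 internally, so the file is named ion_density_case_1.npy).
ion_density = np.load('ion_density_case_1.npy')
# Nodal z coordinates across the 6.7 cm gap (nz = 128 cells for case 1).
z = np.linspace(0.0, 0.067, ion_density.shape[0])
plt.plot(z * 100, ion_density, label='WarpX')
# ref = np.load('turner_case1_reference.npy')  # hypothetical digitized reference
# plt.plot(ref[:, 0] * 100, ref[:, 1], 'k--', label='Turner et al. (2013)')
plt.xlabel('z (cm)')
plt.ylabel(r'ion density (m$^{-3}$)')
plt.legend()
plt.savefig('ion_density_case_1.png')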
Visualize
The figure below compares the ion density as calculated in WarpX (in June 2022 with PR #3118) to the literature results (which can be found in the supplementary materials of Turner et al.).
Kinetic-fluid Hybrid Models
WarpX includes a reduced plasma model in which electrons are treated as a massless fluid while ions are kinetically evolved, and Ohm’s law is used to calculate the electric field. This model is appropriate for problems in which ion kinetics dominate (ion cyclotron waves, for instance). See the theory section for more details. Several examples and benchmarks of this kinetic-fluid hybrid model are provided below. A few of the examples replicate the verification tests described in Muñoz et al. [1]. The hybrid-PIC model was added to WarpX in PR #3665; the figures in the examples below were generated at that time.
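For orientation, in such quasi-neutral hybrid models the electric field is typically obtained from the massless-electron momentum balance, which in its generic form reads
$$ \vec{E} = -\vec{u}_e \times \vec{B} - \frac{\nabla p_e}{e\,n_e} + \eta\,\vec{J}, \qquad \vec{u}_e = \vec{u}_i - \frac{\vec{J}}{e\,n_e}, \qquad \vec{J} = \frac{\nabla \times \vec{B}}{\mu_0}, $$
where \(\vec{u}_i\) is the ion bulk velocity, \(n_e\) the electron density, \(p_e\) the electron pressure and \(\eta\) the plasma resistivity. This is only a schematic form; the exact expression and discretization implemented in WarpX are documented in the theory section.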
Ohm solver: Electromagnetic modes
In this example, a simulation is seeded with a thermal plasma while an initial magnetic field is applied in either the \(z\) or \(x\) direction. The simulation is run for a large number of steps and the resulting fields are Fourier analyzed for Alfvén mode excitations.
Run
The same input script can be used for 1d, 2d or 3d Cartesian simulations and can reproduce either the parallel-propagating or the ion-Bernstein modes, as indicated below.
Script PICMI_inputs.py
Examples/Tests/ohm_solver_EM_modes/PICMI_inputs.py
#!/usr/bin/env python3
#
# --- Test script for the kinetic-fluid hybrid model in WarpX wherein ions are
# --- treated as kinetic particles and electrons as an isothermal, inertialess
# --- background fluid. The script is set up to produce either parallel or
# --- perpendicular (Bernstein) EM modes and can be run in 1d, 2d or 3d
# --- Cartesian geometries. See Section 4.2 and 4.3 of Munoz et al. (2018).
# --- As a CI test only a small number of steps are taken using the 1d version.
import argparse
import os
import sys
import dill
import numpy as np
from mpi4py import MPI as mpi
from pywarpx import callbacks, fields, libwarpx, picmi
constants = picmi.constants
comm = mpi.COMM_WORLD
simulation = picmi.Simulation(
warpx_serialize_initial_conditions=True,
verbose=0
)
class EMModes(object):
'''The following runs a simulation of a uniform plasma at a set
temperature (Te = Ti) with an external magnetic field applied in either the
z-direction (parallel to domain) or x-direction (perpendicular to domain).
The analysis script (in this same directory) analyzes the output field data
for EM modes. This input is based on the EM modes tests as described by
Munoz et al. (2018) and tests done by Scott Nicks at TAE Technologies.
'''
# Applied field parameters
B0 = 0.25 # Initial magnetic field strength (T)
beta = [0.01, 0.1] # Plasma beta, used to calculate temperature
# Plasma species parameters
m_ion = [100.0, 400.0] # Ion mass (electron masses)
vA_over_c = [1e-4, 1e-3] # ratio of Alfven speed and the speed of light
# Spatial domain
Nz = [1024, 1920] # number of cells in z direction
Nx = 8 # number of cells in x (and y) direction for >1 dimensions
# Temporal domain (if not run as a CI test)
LT = 300.0 # Simulation temporal length (ion cyclotron periods)
# Numerical parameters
NPPC = [1024, 256, 64] # Seed number of particles per cell
DZ = 1.0 / 10.0 # Cell size (ion skin depths)
DT = [5e-3, 4e-3] # Time step (ion cyclotron periods)
# Plasma resistivity - used to dampen the mode excitation
eta = [[1e-7, 1e-7], [1e-7, 1e-5], [1e-7, 1e-4]]
# Number of substeps used to update B
substeps = 20
def __init__(self, test, dim, B_dir, verbose):
"""Get input parameters for the specific case desired."""
self.test = test
self.dim = int(dim)
self.B_dir = B_dir
self.verbose = verbose or self.test
# sanity check
assert (dim > 0 and dim < 4), f"{dim}-dimensions not a valid input"
# get simulation parameters from the defaults given the direction of
# the initial B-field and the dimensionality
self.get_simulation_parameters()
# calculate various plasma parameters based on the simulation input
self.get_plasma_quantities()
self.dz = self.DZ * self.l_i
self.Lz = self.Nz * self.dz
self.Lx = self.Nx * self.dz
self.dt = self.DT * self.t_ci
if not self.test:
self.total_steps = int(self.LT / self.DT)
else:
# if this is a test case run for only a small number of steps
self.total_steps = 250
# output diagnostics 20 times per cyclotron period
self.diag_steps = int(1.0/20 / self.DT)
# dump all the current attributes to a dill pickle file
if comm.rank == 0:
with open(f'sim_parameters.dpkl', 'wb') as f:
dill.dump(self, f)
# print out plasma parameters
if comm.rank == 0:
print(
f"Initializing simulation with input parameters:\n"
f"\tT = {self.T_plasma:.3f} eV\n"
f"\tn = {self.n_plasma:.1e} m^-3\n"
f"\tB0 = {self.B0:.2f} T\n"
f"\tM/m = {self.m_ion:.0f}\n"
)
print(
f"Plasma parameters:\n"
f"\tl_i = {self.l_i:.1e} m\n"
f"\tt_ci = {self.t_ci:.1e} s\n"
f"\tv_ti = {self.v_ti:.1e} m/s\n"
f"\tvA = {self.vA:.1e} m/s\n"
)
print(
f"Numerical parameters:\n"
f"\tdz = {self.dz:.1e} m\n"
f"\tdt = {self.dt:.1e} s\n"
f"\tdiag steps = {self.diag_steps:d}\n"
f"\ttotal steps = {self.total_steps:d}\n"
)
self.setup_run()
def get_simulation_parameters(self):
"""Pick appropriate parameters from the defaults given the direction
of the B-field and the simulation dimensionality."""
if self.B_dir == 'z':
idx = 0
self.Bx = 0.0
self.By = 0.0
self.Bz = self.B0
elif self.B_dir == 'y':
idx = 1
self.Bx = 0.0
self.By = self.B0
self.Bz = 0.0
else:
idx = 1
self.Bx = self.B0
self.By = 0.0
self.Bz = 0.0
self.beta = self.beta[idx]
self.m_ion = self.m_ion[idx]
self.vA_over_c = self.vA_over_c[idx]
self.Nz = self.Nz[idx]
self.DT = self.DT[idx]
self.NPPC = self.NPPC[self.dim-1]
self.eta = self.eta[self.dim-1][idx]
def get_plasma_quantities(self):
"""Calculate various plasma parameters based on the simulation input."""
# Ion mass (kg)
self.M = self.m_ion * constants.m_e
# Cyclotron angular frequency (rad/s) and period (s)
self.w_ci = constants.q_e * abs(self.B0) / self.M
self.t_ci = 2.0 * np.pi / self.w_ci
# Alfven speed (m/s): vA = B / sqrt(mu0 * n * (M + m)) = c * omega_ci / w_pi
self.vA = self.vA_over_c * constants.c
self.n_plasma = (
(self.B0 / self.vA)**2 / (constants.mu0 * (self.M + constants.m_e))
)
# Ion plasma frequency (Hz)
self.w_pi = np.sqrt(
constants.q_e**2 * self.n_plasma / (self.M * constants.ep0)
)
# Skin depth (m)
self.l_i = constants.c / self.w_pi
# Ion thermal velocity (m/s) from beta = 2 * (v_ti / vA)**2
self.v_ti = np.sqrt(self.beta / 2.0) * self.vA
# Temperature (eV) from thermal speed: v_ti = sqrt(kT / M)
self.T_plasma = self.v_ti**2 * self.M / constants.q_e # eV
# Larmor radius (m)
self.rho_i = self.v_ti / self.w_ci
def setup_run(self):
"""Setup simulation components."""
#######################################################################
# Set geometry and boundary conditions #
#######################################################################
if self.dim == 1:
grid_object = picmi.Cartesian1DGrid
elif self.dim == 2:
grid_object = picmi.Cartesian2DGrid
else:
grid_object = picmi.Cartesian3DGrid
self.grid = grid_object(
number_of_cells=[self.Nx, self.Nx, self.Nz][-self.dim:],
warpx_max_grid_size=self.Nz,
lower_bound=[-self.Lx/2.0, -self.Lx/2.0, 0][-self.dim:],
upper_bound=[self.Lx/2.0, self.Lx/2.0, self.Lz][-self.dim:],
lower_boundary_conditions=['periodic']*self.dim,
upper_boundary_conditions=['periodic']*self.dim
)
simulation.time_step_size = self.dt
simulation.max_steps = self.total_steps
simulation.current_deposition_algo = 'direct'
simulation.particle_shape = 1
simulation.verbose = self.verbose
#######################################################################
# Field solver and external field #
#######################################################################
self.solver = picmi.HybridPICSolver(
grid=self.grid,
Te=self.T_plasma, n0=self.n_plasma, plasma_resistivity=self.eta,
substeps=self.substeps
)
simulation.solver = self.solver
B_ext = picmi.AnalyticInitialField(
Bx_expression=self.Bx,
By_expression=self.By,
Bz_expression=self.Bz
)
simulation.add_applied_field(B_ext)
#######################################################################
# Particle types setup #
#######################################################################
self.ions = picmi.Species(
name='ions', charge='q_e', mass=self.M,
initial_distribution=picmi.UniformDistribution(
density=self.n_plasma,
rms_velocity=[self.v_ti]*3,
)
)
simulation.add_species(
self.ions,
layout=picmi.PseudoRandomLayout(
grid=self.grid, n_macroparticles_per_cell=self.NPPC
)
)
#######################################################################
# Add diagnostics #
#######################################################################
if self.B_dir == 'z':
self.output_file_name = 'par_field_data.txt'
else:
self.output_file_name = 'perp_field_data.txt'
if self.test:
particle_diag = picmi.ParticleDiagnostic(
name='field_diag',
period=self.total_steps,
write_dir='.',
warpx_file_prefix='Python_ohms_law_solver_EM_modes_1d_plt',
# warpx_format = 'openpmd',
# warpx_openpmd_backend = 'h5'
)
simulation.add_diagnostic(particle_diag)
field_diag = picmi.FieldDiagnostic(
name='field_diag',
grid=self.grid,
period=self.total_steps,
data_list=['B', 'E', 'J_displacement'],
write_dir='.',
warpx_file_prefix='Python_ohms_law_solver_EM_modes_1d_plt',
# warpx_format = 'openpmd',
# warpx_openpmd_backend = 'h5'
)
simulation.add_diagnostic(field_diag)
if self.B_dir == 'z' or self.dim == 1:
line_diag = picmi.ReducedDiagnostic(
diag_type='FieldProbe',
probe_geometry='Line',
z_probe=0,
z1_probe=self.Lz,
resolution=self.Nz - 1,
name=self.output_file_name[:-4],
period=self.diag_steps,
path='diags/'
)
simulation.add_diagnostic(line_diag)
else:
# install a custom "reduced diagnostic" to save the average field
callbacks.installafterEsolve(self._record_average_fields)
try:
os.mkdir("diags")
except OSError:
# diags directory already exists
pass
with open(f"diags/{self.output_file_name}", 'w') as f:
f.write(
"[0]step() [1]time(s) [2]z_coord(m) "
"[3]Ez_lev0-(V/m) [4]Bx_lev0-(T) [5]By_lev0-(T)\n"
)
#######################################################################
# Initialize simulation #
#######################################################################
simulation.initialize_inputs()
simulation.initialize_warpx()
def _record_average_fields(self):
"""A custom reduced diagnostic to store the average E&M fields in a
similar format as the reduced diagnostic so that the same analysis
script can be used regardless of the simulation dimension.
"""
step = simulation.extension.warpx.getistep(lev=0) - 1
if step % self.diag_steps != 0:
return
Bx_warpx = fields.BxWrapper()[...]
By_warpx = fields.ByWrapper()[...]
Ez_warpx = fields.EzWrapper()[...]
if libwarpx.amr.ParallelDescriptor.MyProc() != 0:
return
t = step * self.dt
z_vals = np.linspace(0, self.Lz, self.Nz, endpoint=False)
if self.dim == 2:
Ez = np.mean(Ez_warpx[:-1], axis=0)
Bx = np.mean(Bx_warpx[:-1], axis=0)
By = np.mean(By_warpx[:-1], axis=0)
else:
Ez = np.mean(Ez_warpx[:-1, :-1], axis=(0, 1))
Bx = np.mean(Bx_warpx[:-1], axis=(0, 1))
By = np.mean(By_warpx[:-1], axis=(0, 1))
with open(f"diags/{self.output_file_name}", 'a') as f:
for ii in range(self.Nz):
f.write(
f"{step:05d} {t:.10e} {z_vals[ii]:.10e} {Ez[ii]:+.10e} "
f"{Bx[ii]:+.10e} {By[ii]:+.10e}\n"
)
##########################
# parse input parameters
##########################
parser = argparse.ArgumentParser()
parser.add_argument(
'-t', '--test', help='toggle whether this script is run as a short CI test',
action='store_true',
)
parser.add_argument(
'-d', '--dim', help='Simulation dimension', required=False, type=int,
default=1
)
parser.add_argument(
'--bdir', help='Direction of the B-field', required=False,
choices=['x', 'y', 'z'], default='z'
)
parser.add_argument(
'-v', '--verbose', help='Verbose output', action='store_true',
)
args, left = parser.parse_known_args()
sys.argv = sys.argv[:1]+left
run = EMModes(test=args.test, dim=args.dim, B_dir=args.bdir, verbose=args.verbose)
simulation.step()
For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
Execute (parallel propagating modes, initial B along z):
python3 PICMI_inputs.py --dim {1/2/3} --bdir z
Execute (perpendicular propagating, ion-Bernstein modes, initial B along x or y):
python3 PICMI_inputs.py --dim {1/2/3} --bdir {x/y}
Analyze
The following script reads the simulation output from the above example, performs Fourier transforms of the field data and compares the calculated spectrum to the theoretical dispersions.
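For the parallel-propagating case (initial field along \(z\)), the calculated spectrum of the circularly polarized field components is overlaid with the cold-plasma R and L mode dispersion relations used in the script,
$$ (k\,l_i)^2 = \frac{(\omega/\Omega_i)^2}{1-\omega/\Omega_i} \quad \text{(L mode)}, \qquad (k\,l_i)^2 = \frac{(\omega/\Omega_i)^2}{1+\omega/\Omega_i} \quad \text{(R mode)}, $$
where \(l_i\) is the ion skin depth and \(\Omega_i\) the ion cyclotron frequency. For the perpendicular case the spectrum is instead compared against Bernstein-mode curves digitized from Muñoz et al. [1].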
Script analysis.py
Examples/Tests/ohm_solver_EM_modes/analysis.py
#!/usr/bin/env python3
#
# --- Analysis script for the hybrid-PIC example producing EM modes.
import dill
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from pywarpx import picmi
constants = picmi.constants
matplotlib.rcParams.update({'font.size': 20})
# load simulation parameters
with open(f'sim_parameters.dpkl', 'rb') as f:
sim = dill.load(f)
if sim.B_dir == 'z':
field_idx_dict = {'z': 4, 'Ez': 7, 'Bx': 8, 'By': 9}
data = np.loadtxt("diags/par_field_data.txt", skiprows=1)
else:
if sim.dim == 1:
field_idx_dict = {'z': 4, 'Ez': 7, 'Bx': 8, 'By': 9}
else:
field_idx_dict = {'z': 2, 'Ez': 3, 'Bx': 4, 'By': 5}
data = np.loadtxt("diags/perp_field_data.txt", skiprows=1)
# step, t, z, Ez, Bx, By = raw_data.T
step = data[:,0]
num_steps = len(np.unique(step))
# get the spatial resolution
resolution = len(np.where(step == 0)[0]) - 1
# reshape to separate spatial and time coordinates
sim_data = data.reshape((num_steps, resolution+1, data.shape[1]))
z_grid = sim_data[1, :, field_idx_dict['z']]
idx = np.argsort(z_grid)[1:]
dz = np.mean(np.diff(z_grid[idx]))
dt = np.mean(np.diff(sim_data[:,0,1]))
data = np.zeros((num_steps, resolution, 3))
for i in range(num_steps):
data[i,:,0] = sim_data[i,idx,field_idx_dict['Bx']]
data[i,:,1] = sim_data[i,idx,field_idx_dict['By']]
data[i,:,2] = sim_data[i,idx,field_idx_dict['Ez']]
print(f"Data file contains {num_steps} time snapshots.")
print(f"Spatial resolution is {resolution}")
def get_analytic_R_mode(w):
return w / np.sqrt(1.0 + abs(w))
def get_analytic_L_mode(w):
return w / np.sqrt(1.0 - abs(w))
if sim.B_dir == 'z':
global_norm = (
1.0 / (2.0*constants.mu0)
/ ((3.0/2)*sim.n_plasma*sim.T_plasma*constants.q_e)
)
else:
global_norm = (
constants.ep0 / 2.0
/ ((3.0/2)*sim.n_plasma*sim.T_plasma*constants.q_e)
)
if sim.B_dir == 'z':
Bl = (data[:, :, 0] + 1.0j * data[:, :, 1]) / np.sqrt(2.0)
field_kw = np.fft.fftshift(np.fft.fft2(Bl))
else:
field_kw = np.fft.fftshift(np.fft.fft2(data[:, :, 2]))
w_norm = sim.w_ci
if sim.B_dir == 'z':
k_norm = 1.0 / sim.l_i
else:
k_norm = 1.0 / sim.rho_i
k = 2*np.pi * np.fft.fftshift(np.fft.fftfreq(resolution, dz)) / k_norm
w = 2*np.pi * np.fft.fftshift(np.fft.fftfreq(num_steps, dt)) / w_norm
w = -np.flipud(w)
# aspect = (xmax-xmin)/(ymax-ymin) / aspect_true
extent = [k[0], k[-1], w[0], w[-1]]
fig, ax1 = plt.subplots(1, 1, figsize=(10, 7.25))
if sim.B_dir == 'z' and sim.dim == 1:
vmin = -3
vmax = 3.5
else:
vmin = None
vmax = None
im = ax1.imshow(
np.log10(np.abs(field_kw**2) * global_norm), extent=extent,
aspect="equal", cmap='inferno', vmin=vmin, vmax=vmax
)
# Colorbars
fig.subplots_adjust(right=0.5)
cbar_ax = fig.add_axes([0.525, 0.15, 0.03, 0.7])
fig.colorbar(im, cax=cbar_ax, orientation='vertical')
#cbar_lab = r'$\log_{10}(\frac{|B_{R/L}|^2}{2\mu_0}\frac{2}{3n_0k_BT_e})$'
if sim.B_dir == 'z':
cbar_lab = r'$\log_{10}(\beta_{R/L})$'
else:
cbar_lab = r'$\log_{10}(\varepsilon_0|E_z|^2/(3n_0k_BT_e))$'
cbar_ax.set_ylabel(cbar_lab, rotation=270, labelpad=30)
if sim.B_dir == 'z':
# plot the L mode
ax1.plot(get_analytic_L_mode(w), np.abs(w), c='limegreen', ls='--', lw=1.25,
label='L mode:\n'+r'$(kl_i)^2=\frac{(\omega/\Omega_i)^2}{1-\omega/\Omega_i}$')
# plot the R mode
ax1.plot(get_analytic_R_mode(w), -np.abs(w), c='limegreen', ls='-.', lw=1.25,
label='R mode:\n'+r'$(kl_i)^2=\frac{(\omega/\Omega_i)^2}{1+\omega/\Omega_i}$')
ax1.plot(k,1.0+3.0*sim.v_ti/w_norm*k*k_norm, c='limegreen', ls=':', lw=1.25, label = r'$\omega = \Omega_i + 3v_{th,i} k$')
ax1.plot(k,1.0-3.0*sim.v_ti/w_norm*k*k_norm, c='limegreen', ls=':', lw=1.25)
else:
# digitized values from Munoz et al. (2018)
x = [0.006781609195402272, 0.1321379310344828, 0.2671034482758621, 0.3743678160919539, 0.49689655172413794, 0.6143908045977011, 0.766022988505747, 0.885448275862069, 1.0321149425287355, 1.193862068965517, 1.4417701149425288, 1.7736781609195402]
y = [-0.033194664836814436, 0.5306857657503109, 1.100227301968521, 1.5713856842646996, 2.135780760818287, 2.675601492473303, 3.3477291246729854, 3.8469357121413563, 4.4317021915340735, 5.1079898786293265, 6.10275764463696, 7.310074194793499]
ax1.plot(x, y, c='limegreen', ls='-.', lw=1.5, label="X mode")
x = [3.9732873563218387, 3.6515862068965514, 3.306275862068966, 2.895655172413793, 2.4318850574712645, 2.0747586206896553, 1.8520229885057473, 1.6589195402298849, 1.4594942528735633, 1.2911724137931033, 1.1551264367816092, 1.0335402298850576, 0.8961149425287356, 0.7419770114942528, 0.6141379310344828, 0.4913103448275862]
y = [1.1145945018655916, 1.1193978642192393, 1.1391259596002916, 1.162971222713042, 1.1986533430544237, 1.230389844319595, 1.2649997855641806, 1.3265857528841618, 1.3706737573444268, 1.4368486511986962, 1.4933310460179268, 1.5485268259210019, 1.6386327572157655, 1.7062658146416778, 1.7828194021529358, 1.8533687867221342]
ax1.plot(x, y, c='limegreen', ls=':', lw=2, label="Bernstein modes")
x = [3.9669885057471266, 3.6533333333333333, 3.3213563218390805, 2.9646896551724136, 2.6106436781609195, 2.2797011494252875, 1.910919540229885, 1.6811724137931034, 1.4499540229885057, 1.2577011494252872, 1.081057471264368, 0.8791494252873564, 0.7153103448275862]
y = [2.2274306300124374, 2.2428271218424327, 2.272505039241755, 2.3084873697302397, 2.3586224642964364, 2.402667581592829, 2.513873997512545, 2.5859673199811297, 2.6586610627439207, 2.7352146502551786, 2.8161427284813656, 2.887850066475104, 2.9455761890466183]
ax1.plot(x, y, c='limegreen', ls=':', lw=2)
x = [3.9764137931034487, 3.702022988505747, 3.459793103448276, 3.166712643678161, 2.8715862068965516, 2.5285057471264367, 2.2068505747126435, 1.9037011494252871, 1.6009885057471265, 1.3447816091954023, 1.1538850574712645, 0.9490114942528736]
y = [3.3231976669382854, 3.34875841660591, 3.378865205643951, 3.424454260839731, 3.474160483767209, 3.522194107303684, 3.6205343740618434, 3.7040356821203417, 3.785435519149119, 3.868851052879873, 3.9169704507440923, 3.952481022429987]
ax1.plot(x, y, c='limegreen', ls=':', lw=2)
x = [3.953609195402299, 3.7670114942528734, 3.5917471264367817, 3.39735632183908, 3.1724137931034484, 2.9408045977011494, 2.685977011494253, 2.4593563218390804, 2.2203218390804595, 2.0158850574712646, 1.834183908045977, 1.6522758620689655, 1.4937471264367814, 1.3427586206896551, 1.2075402298850575]
y = [4.427971008277223, 4.458335120298495, 4.481579963117039, 4.495861388686366, 4.544581206844791, 4.587425483552773, 4.638160998413175, 4.698631899472488, 4.757987734271133, 4.813955483123902, 4.862332203971352, 4.892481880173264, 4.9247759145687695, 4.947934983059571, 4.953124329888064]
ax1.plot(x, y, c='limegreen', ls=':', lw=2)
# ax1.legend(loc='upper left')
fig.legend(loc=7, fontsize=18)
if sim.B_dir == 'z':
ax1.set_xlabel(r'$k l_i$')
ax1.set_title('$B_{R/L} = B_x \pm iB_y$')
fig.suptitle("Parallel EM modes")
ax1.set_xlim(-3, 3)
ax1.set_ylim(-6, 3)
dir_str = 'par'
else:
ax1.set_xlabel(r'$k \rho_i$')
ax1.set_title('$E_z(k, \omega)$')
fig.suptitle(f"Perpendicular EM modes (ion Bernstein) - {sim.dim}D")
ax1.set_xlim(-3, 3)
ax1.set_ylim(0, 8)
dir_str = 'perp'
ax1.set_ylabel(r'$\omega / \Omega_i$')
plt.savefig(
f"spectrum_{dir_str}_{sim.dim}d_{sim.substeps}_substeps_{sim.eta}_eta.png",
bbox_inches='tight'
)
if not sim.test:
plt.show()
if sim.test:
import os
import sys
sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI
# this will be the name of the plot file
fn = sys.argv[1]
test_name = os.path.split(os.getcwd())[1]
checksumAPI.evaluate_checksum(test_name, fn)
Right and left circularly polarized electromagnetic waves are supported through the cyclotron motion of the ions, except in a region of thermal resonances as indicated on the plot below.

Calculated Alfvén wave spectrum with the theoretical dispersions overlaid.
Perpendicularly propagating modes, commonly referred to as ion Bernstein modes, are also supported.

Calculated ion Bernstein wave spectrum with the theoretical dispersion overlaid.
Ohm solver: Cylindrical normal modes
An RZ-geometry example case for normal modes propagating along an applied magnetic field in a cylinder is also available. The analytical solution for these modes is described in Stix [12], Chapter 6, Sec. 2.
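As a quick reference, the fast and slow branches of this dispersion relation, in the form evaluated by the analysis script further below, are
\[\omega_{\mathrm{fast/slow}} = v_A\sqrt{R^2 \pm \sqrt{R^4 - P^4}},\qquad R^2 = \tfrac{1}{2}\left[\nu_m^2\,(1+\kappa^2) + k^2\,(\kappa^2+2)\right],\qquad P^4 = k^2\,(\nu_m^2 + k^2),\]
where \(k\) is the axial wavenumber, \(\kappa = k\,l_i\), and \(\nu_m\) is the \(m\)-th zero of \(J_1\) divided by the cylinder radius \(L_r\).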
Run
The following script initializes a thermal plasma in a metallic cylinder with periodic boundaries at the cylinder ends.
Script PICMI_inputs_rz.py
Examples/Tests/ohm_solver_EM_modes/PICMI_inputs_rz.py
#!/usr/bin/env python3
#
# --- Test script for the kinetic-fluid hybrid model in WarpX wherein ions are
# --- treated as kinetic particles and electrons as an isothermal, inertialess
# --- background fluid. The script is set up to produce parallel normal EM modes
# --- in a metallic cylinder and is run in RZ geometry.
# --- As a CI test only a small number of steps are taken.
import argparse
import sys
import dill
import numpy as np
from mpi4py import MPI as mpi
from pywarpx import picmi
constants = picmi.constants
comm = mpi.COMM_WORLD
simulation = picmi.Simulation(verbose=0)
class CylindricalNormalModes(object):
'''The following runs a simulation of an uniform plasma at a set ion
temperature (and Te = 0) with an external magnetic field applied in the
z-direction (parallel to domain).
The analysis script (in this same directory) analyzes the output field
data for EM modes.
'''
# Applied field parameters
B0 = 0.5 # Initial magnetic field strength (T)
beta = 0.01 # Plasma beta, used to calculate temperature
# Plasma species parameters
m_ion = 400.0 # Ion mass (electron masses)
vA_over_c = 5e-3 # ratio of Alfven speed and the speed of light
# Spatial domain
Nz = 512 # number of cells in z direction
Nr = 128 # number of cells in r direction
# Temporal domain (if not run as a CI test)
LT = 800.0 # Simulation temporal length (ion cyclotron periods)
# Numerical parameters
NPPC = 8000 # Seed number of particles per cell
DZ = 0.4 # Cell size (ion skin depths)
DR = 0.4 # Cell size (ion skin depths)
DT = 0.02 # Time step (ion cyclotron periods)
# Plasma resistivity - used to dampen the mode excitation
eta = 5e-4
# Number of substeps used to update B
substeps = 20
def __init__(self, test, verbose):
"""Get input parameters for the specific case desired."""
self.test = test
self.verbose = verbose or self.test
# calculate various plasma parameters based on the simulation input
self.get_plasma_quantities()
if not self.test:
self.total_steps = int(self.LT / self.DT)
else:
# if this is a test case run for only a small number of steps
self.total_steps = 100
# and make the grid and particle count smaller
self.Nz = 128
self.Nr = 64
self.NPPC = 200
# output diagnostics 5 times per cyclotron period
self.diag_steps = max(10, int(1.0 / 5 / self.DT))
self.Lz = self.Nz * self.DZ * self.l_i
self.Lr = self.Nr * self.DR * self.l_i
self.dt = self.DT * self.t_ci
# dump all the current attributes to a dill pickle file
if comm.rank == 0:
with open(f'sim_parameters.dpkl', 'wb') as f:
dill.dump(self, f)
# print out plasma parameters
if comm.rank == 0:
print(
f"Initializing simulation with input parameters:\n"
f"\tT = {self.T_plasma:.3f} eV\n"
f"\tn = {self.n_plasma:.1e} m^-3\n"
f"\tB0 = {self.B0:.2f} T\n"
f"\tM/m = {self.m_ion:.0f}\n"
)
print(
f"Plasma parameters:\n"
f"\tl_i = {self.l_i:.1e} m\n"
f"\tt_ci = {self.t_ci:.1e} s\n"
f"\tv_ti = {self.v_ti:.1e} m/s\n"
f"\tvA = {self.vA:.1e} m/s\n"
)
print(
f"Numerical parameters:\n"
f"\tdt = {self.dt:.1e} s\n"
f"\tdiag steps = {self.diag_steps:d}\n"
f"\ttotal steps = {self.total_steps:d}\n",
flush=True
)
self.setup_run()
def get_plasma_quantities(self):
"""Calculate various plasma parameters based on the simulation input."""
# Ion mass (kg)
self.M = self.m_ion * constants.m_e
# Cyclotron angular frequency (rad/s) and period (s)
self.w_ci = constants.q_e * abs(self.B0) / self.M
self.t_ci = 2.0 * np.pi / self.w_ci
# Alfven speed (m/s): vA = B / sqrt(mu0 * n * (M + m)) = c * omega_ci / w_pi
self.vA = self.vA_over_c * constants.c
self.n_plasma = (
(self.B0 / self.vA)**2 / (constants.mu0 * (self.M + constants.m_e))
)
# Ion plasma frequency (Hz)
self.w_pi = np.sqrt(
constants.q_e**2 * self.n_plasma / (self.M * constants.ep0)
)
# Skin depth (m)
self.l_i = constants.c / self.w_pi
# Ion thermal velocity (m/s) from beta = 2 * (v_ti / vA)**2
self.v_ti = np.sqrt(self.beta / 2.0) * self.vA
# Temperature (eV) from thermal speed: v_ti = sqrt(kT / M)
self.T_plasma = self.v_ti**2 * self.M / constants.q_e # eV
# Larmor radius (m)
self.rho_i = self.v_ti / self.w_ci
def setup_run(self):
"""Setup simulation components."""
#######################################################################
# Set geometry and boundary conditions #
#######################################################################
self.grid = picmi.CylindricalGrid(
number_of_cells=[self.Nr, self.Nz],
warpx_max_grid_size=self.Nz,
lower_bound=[0, -self.Lz/2.0],
upper_bound=[self.Lr, self.Lz/2.0],
lower_boundary_conditions = ['none', 'periodic'],
upper_boundary_conditions = ['dirichlet', 'periodic'],
lower_boundary_conditions_particles = ['absorbing', 'periodic'],
upper_boundary_conditions_particles = ['reflecting', 'periodic']
)
simulation.time_step_size = self.dt
simulation.max_steps = self.total_steps
simulation.current_deposition_algo = 'direct'
simulation.particle_shape = 1
simulation.verbose = self.verbose
#######################################################################
# Field solver and external field #
#######################################################################
self.solver = picmi.HybridPICSolver(
grid=self.grid,
Te=0.0, n0=self.n_plasma, plasma_resistivity=self.eta,
substeps=self.substeps,
n_floor=self.n_plasma*0.05
)
simulation.solver = self.solver
B_ext = picmi.AnalyticInitialField(
Bz_expression=self.B0
)
simulation.add_applied_field(B_ext)
#######################################################################
# Particle types setup #
#######################################################################
self.ions = picmi.Species(
name='ions', charge='q_e', mass=self.M,
initial_distribution=picmi.UniformDistribution(
density=self.n_plasma,
rms_velocity=[self.v_ti]*3,
)
)
simulation.add_species(
self.ions,
layout=picmi.PseudoRandomLayout(
grid=self.grid, n_macroparticles_per_cell=self.NPPC
)
)
#######################################################################
# Add diagnostics #
#######################################################################
field_diag = picmi.FieldDiagnostic(
name='field_diag',
grid=self.grid,
period=self.diag_steps,
data_list=['B', 'E'],
write_dir='diags',
warpx_file_prefix='field_diags',
warpx_format='openpmd',
warpx_openpmd_backend='h5',
)
simulation.add_diagnostic(field_diag)
# add particle diagnostic for checksum
if self.test:
part_diag = picmi.ParticleDiagnostic(
name='diag1',
period=self.total_steps,
species=[self.ions],
data_list=['ux', 'uy', 'uz', 'weighting'],
write_dir='.',
warpx_file_prefix='Python_ohms_law_solver_EM_modes_rz_plt'
)
simulation.add_diagnostic(part_diag)
##########################
# parse input parameters
##########################
parser = argparse.ArgumentParser()
parser.add_argument(
'-t', '--test', help='toggle whether this script is run as a short CI test',
action='store_true',
)
parser.add_argument(
'-v', '--verbose', help='Verbose output', action='store_true',
)
args, left = parser.parse_known_args()
sys.argv = sys.argv[:1]+left
run = CylindricalNormalModes(test=args.test, verbose=args.verbose)
simulation.step()
The example can be executed using:
python3 PICMI_inputs_rz.py
Analyze
After the simulation completes, the following script can be used to analyze the field evolution and extract the normal mode dispersion relation. It performs a standard Fourier transform along the cylinder axis and a Hankel transform in the radial direction.
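As an aside, the radial part of this transform can be illustrated in isolation. The short sketch below builds the same first-order quasi-discrete Hankel projection on a Bessel-zero grid as the analysis script, but with made-up grid sizes and a synthetic test field, and checks that a pure radial mode is recovered.
import numpy as np
from scipy.special import j1, jn, jn_zeros
# Standalone sketch of the first-order quasi-discrete Hankel transform used below.
# Grid sizes, r_max and the test field are arbitrary and serve only as an illustration.
nr, nkr = 64, 8                   # radial samples and number of radial modes kept
r_max = 1.0                       # cylinder radius (arbitrary units)
alpha = jn_zeros(1, nr)           # zeros of J_1
r = alpha / alpha[-1] * r_max     # radial grid placed on scaled Bessel zeros
k_r = alpha[:nkr] / r_max         # radial wavenumbers of the kept modes
# Projection matrix: row m projects a sampled field onto the m-th J_1 radial mode
j_1M = alpha[-1]
A = (
    4.0 * np.pi * r_max**2 / j_1M**2
    * j1(np.outer(alpha[:nkr], alpha) / j_1M)
    / jn(2, alpha)**2
)
f = j1(k_r[2] * r)                # test field: a pure mode-3 radial profile
coeffs = A @ f
print(np.argmax(np.abs(coeffs)))  # prints 2, i.e. the transform isolates mode 3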
Script analysis_rz.py
Examples/Tests/ohm_solver_EM_modes/analysis_rz.py
#!/usr/bin/env python3
#
# --- Analysis script for the hybrid-PIC example producing EM modes.
import dill
import matplotlib.pyplot as plt
import numpy as np
import scipy.fft as fft
from matplotlib import colors
from openpmd_viewer import OpenPMDTimeSeries
from scipy.interpolate import RegularGridInterpolator
from scipy.special import j1, jn, jn_zeros
from pywarpx import picmi
constants = picmi.constants
# load simulation parameters
with open(f'sim_parameters.dpkl', 'rb') as f:
sim = dill.load(f)
diag_dir = "diags/field_diags"
ts = OpenPMDTimeSeries(diag_dir, check_all_files=True)
def transform_spatially(data_for_transform):
# interpolate from regular r-grid to special r-grid
interp = RegularGridInterpolator(
(info.z, info.r), data_for_transform,
method='linear'
)
data_interp = interp((zg, rg))
# Applying manual hankel in r
# Fmz = np.sum(proj*data_for_transform, axis=(2,3))
Fmz = np.einsum('ijkl,kl->ij', proj, data_interp)
# Standard fourier in z
Fmn = fft.fftshift(fft.fft(Fmz, axis=1), axes=1)
return Fmn
def process(it):
print(f"Processing iteration {it}", flush=True)
field, info = ts.get_field('E', 'y', iteration=it)
F_k = transform_spatially(field)
return F_k
# grab the first iteration to get the grids
Bz, info = ts.get_field('B', 'z', iteration=0)
nr = len(info.r)
nz = len(info.z)
nkr = 12 # number of radial modes to solve for
r_max = np.max(info.r)
# create r-grid with points spaced out according to zeros of the Bessel function
r_grid = jn_zeros(1, nr) / jn_zeros(1, nr)[-1] * r_max
zg, rg = np.meshgrid(info.z, r_grid)
# Setup Hankel Transform
j_1M = jn_zeros(1, nr)[-1]
r_modes = np.arange(nkr)
A = (
4.0 * np.pi * r_max**2 / j_1M**2
* j1(np.outer(jn_zeros(1, max(r_modes)+1)[r_modes], jn_zeros(1, nr)) / j_1M)
/ jn(2 ,jn_zeros(1, nr))**2
)
# No transformation for z
B = np.identity(nz)
# combine projection arrays
proj = np.einsum('ab,cd->acbd', A, B)
results = np.zeros((len(ts.t), nkr, nz), dtype=complex)
for ii, it in enumerate(ts.iterations):
results[ii] = process(it)
# now Fourier transform in time
F_kw = fft.fftshift(fft.fft(results, axis=0), axes=0)
dz = info.z[1] - info.z[0]
kz = 2*np.pi*fft.fftshift(fft.fftfreq(F_kw[0].shape[1], dz))
dt = ts.iterations[1] - ts.iterations[0]
omega = 2*np.pi*fft.fftshift(fft.fftfreq(F_kw.shape[0], sim.dt*dt))
# Save data for future plotting purposes
np.savez(
"diags/spectrograms.npz",
F_kw=F_kw, dz=dz, kz=kz, dt=dt, omega=omega
)
# plot the resulting dispersions
k = np.linspace(0, 250, 500)
kappa = k * sim.l_i
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(6.75, 5))
vmin = [2e-3, 1.5e-3, 7.5e-4, 5e-4]
vmax = 1.0
# plot m = 1
for ii, m in enumerate([1, 3, 6, 8]):
ax = axes.flatten()[ii]
ax.set_title(f"m = {m}", fontsize=11)
m -= 1
pm1 = ax.pcolormesh(
kz*sim.l_i, omega/sim.w_ci,
abs(F_kw[:, m, :])/np.max(abs(F_kw[:, m, :])),
norm=colors.LogNorm(vmin=vmin[ii], vmax=vmax),
cmap='inferno'
)
cb = fig.colorbar(pm1, ax=ax)
cb.set_label(r'Normalized $E_\theta(k_z, m, \omega)$')
# Get dispersion relation - see for example
# T. Stix, Waves in Plasmas (American Inst. of Physics, 1992), Chap 6, Sec 2
nu_m = jn_zeros(1, m+1)[-1] / sim.Lr
R2 = 0.5 * (nu_m**2 * (1.0 + kappa**2) + k**2 * (kappa**2 + 2.0))
P4 = k**2 * (nu_m**2 + k**2)
omega_fast = sim.vA * np.sqrt(R2 + np.sqrt(R2**2 - P4))
omega_slow = sim.vA * np.sqrt(R2 - np.sqrt(R2**2 - P4))
# Upper right corner
ax.plot(k*sim.l_i, omega_fast/sim.w_ci, 'w--', label = f"$\omega_{{fast}}$")
ax.plot(k*sim.l_i, omega_slow/sim.w_ci, color='white', linestyle='--', label = f"$\omega_{{slow}}$")
# Thermal resonance
thermal_res = sim.w_ci + 3*sim.v_ti*k
ax.plot(k*sim.l_i, thermal_res/sim.w_ci, color='magenta', linestyle='--', label = "$\omega = \Omega_i + 3v_{th,i}k$")
ax.plot(-k*sim.l_i, thermal_res/sim.w_ci, color='magenta', linestyle='--', label = "")
thermal_res = sim.w_ci - 3*sim.v_ti*k
ax.plot(k*sim.l_i, thermal_res/sim.w_ci, color='magenta', linestyle='--', label = "$\omega = \Omega_i + 3v_{th,i}k$")
ax.plot(-k*sim.l_i, thermal_res/sim.w_ci, color='magenta', linestyle='--', label = "")
for ax in axes.flatten():
ax.set_xlim(-1.75, 1.75)
ax.set_ylim(0, 1.6)
axes[0, 0].set_ylabel('$\omega/\Omega_{ci}$')
axes[1, 0].set_ylabel('$\omega/\Omega_{ci}$')
axes[1, 0].set_xlabel('$k_zl_i$')
axes[1, 1].set_xlabel('$k_zl_i$')
plt.savefig('normal_modes_disp.png', dpi=600)
if not sim.test:
plt.show()
else:
plt.close()
# check if power spectrum sampling match earlier results
amps = np.abs(F_kw[2, 1, len(kz)//2-2:len(kz)//2+2])
print("Amplitude sample: ", amps)
assert np.allclose(
amps, np.array([ 61.02377286, 19.80026021, 100.47687017, 10.83331295])
)
if sim.test:
import os
import sys
sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI
# this will be the name of the plot file
fn = sys.argv[1]
test_name = os.path.split(os.getcwd())[1]
checksumAPI.evaluate_checksum(test_name, fn, rtol=1e-6)
The following figure was produced with the above analysis script, showing excellent agreement between the calculated and theoretical dispersion relations.

Cylindrical normal mode dispersion comparing the calculated spectrum with the theoretical one.
Ohm solver: Ion Beam R Instability
In this example, a low-density ion beam interacts with a “core” plasma population, which induces an instability. Depending on the relative density of the beam and the core plasma, either a resonant or a non-resonant condition can be accessed.
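In the input script below, the beam and core drift velocities are chosen so that the total ion momentum vanishes. With a beam-to-core density ratio \(n_b/n_0\) of 0.02 (resonant case) or 0.1 (non-resonant case) and a relative drift \(U = 10\,v_A\), the drifts are
\[u_{\mathrm{beam}} = \frac{U}{1 + n_b/n_0},\qquad u_{\mathrm{core}} = -\frac{(n_b/n_0)\,U}{1 + n_b/n_0}.\]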
Run
The same input script can be used for 1d, 2d or 3d simulations, and can replicate either the resonant or the non-resonant condition, as indicated below.
Script PICMI_inputs.py
Examples/Tests/ohm_solver_ion_beam_instability/PICMI_inputs.py
#!/usr/bin/env python3
#
# --- Test script for the kinetic-fluid hybrid model in WarpX wherein ions are
# --- treated as kinetic particles and electrons as an isothermal, inertialess
# --- background fluid. The script simulates an ion beam instability wherein a
# --- low density ion beam interacts with background plasma. See Section 6.5 of
# --- Matthews (1994) and Section 4.4 of Munoz et al. (2018).
import argparse
import os
import sys
import time
import dill
import numpy as np
from mpi4py import MPI as mpi
from pywarpx import callbacks, fields, libwarpx, particle_containers, picmi
constants = picmi.constants
comm = mpi.COMM_WORLD
simulation = picmi.Simulation(
warpx_serialize_initial_conditions=True,
verbose=0
)
class HybridPICBeamInstability(object):
'''This input is based on the ion beam R instability test as described by
Munoz et al. (2018).
'''
# Applied field parameters
B0 = 0.25 # Initial magnetic field strength (T)
beta = 1.0 # Plasma beta, used to calculate temperature
# Plasma species parameters
m_ion = 100.0 # Ion mass (electron masses)
vA_over_c = 1e-4 # ratio of Alfven speed and the speed of light
# Spatial domain
Nz = 1024 # number of cells in z direction
Nx = 8 # number of cells in x (and y) direction for >1 dimensions
# Temporal domain (if not run as a CI test)
LT = 120.0 # Simulation temporal length (ion cyclotron periods)
# Numerical parameters
NPPC = [1024, 256, 64] # Seed number of particles per cell
DZ = 1.0 / 4.0 # Cell size (ion skin depths)
DT = 0.01 # Time step (ion cyclotron periods)
# Plasma resistivity - used to dampen the mode excitation
eta = 1e-7
# Number of substeps used to update B
substeps = 10
# Beam parameters
n_beam = [0.02, 0.1]
U_bc = 10.0 # relative drifts between beam and core in Alfven speeds
def __init__(self, test, dim, resonant, verbose):
"""Get input parameters for the specific case desired."""
self.test = test
self.dim = int(dim)
self.resonant = resonant
self.verbose = verbose or self.test
# sanity check
assert (dim > 0 and dim < 4), f"{dim}-dimensions not a valid input"
# calculate various plasma parameters based on the simulation input
self.get_plasma_quantities()
self.n_beam = self.n_beam[1 - int(resonant)]
self.u_beam = 1.0 / (1.0 + self.n_beam) * self.U_bc * self.vA
self.u_c = -1.0 * self.n_beam / (1.0 + self.n_beam) * self.U_bc * self.vA
self.n_beam = self.n_beam * self.n_plasma
self.dz = self.DZ * self.l_i
self.Lz = self.Nz * self.dz
self.Lx = self.Nx * self.dz
if self.dim == 3:
self.volume = self.Lx * self.Lx * self.Lz
self.N_cells = self.Nx * self.Nx * self.Nz
elif self.dim == 2:
self.volume = self.Lx * self.Lz
self.N_cells = self.Nx * self.Nz
else:
self.volume = self.Lz
self.N_cells = self.Nz
diag_period = 1 / 4.0 # Output interval (ion cyclotron periods)
self.diag_steps = int(diag_period / self.DT)
# if this is a test case run for only 25 cyclotron periods
if self.test:
self.LT = 25.0
self.total_steps = int(np.ceil(self.LT / self.DT))
self.dt = self.DT / self.w_ci
# dump all the current attributes to a dill pickle file
if comm.rank == 0:
with open('sim_parameters.dpkl', 'wb') as f:
dill.dump(self, f)
# print out plasma parameters
if comm.rank == 0:
print(
f"Initializing simulation with input parameters:\n"
f"\tT = {self.T_plasma*1e-3:.1f} keV\n"
f"\tn = {self.n_plasma:.1e} m^-3\n"
f"\tB0 = {self.B0:.2f} T\n"
f"\tM/m = {self.m_ion:.0f}\n"
)
print(
f"Plasma parameters:\n"
f"\tl_i = {self.l_i:.1e} m\n"
f"\tt_ci = {self.t_ci:.1e} s\n"
f"\tv_ti = {self.v_ti:.1e} m/s\n"
f"\tvA = {self.vA:.1e} m/s\n"
)
print(
f"Numerical parameters:\n"
f"\tdz = {self.dz:.1e} m\n"
f"\tdt = {self.dt:.1e} s\n"
f"\tdiag steps = {self.diag_steps:d}\n"
f"\ttotal steps = {self.total_steps:d}\n"
)
self.setup_run()
def get_plasma_quantities(self):
"""Calculate various plasma parameters based on the simulation input."""
# Ion mass (kg)
self.M = self.m_ion * constants.m_e
# Cyclotron angular frequency (rad/s) and period (s)
self.w_ci = constants.q_e * abs(self.B0) / self.M
self.t_ci = 2.0 * np.pi / self.w_ci
# Alfven speed (m/s): vA = B / sqrt(mu0 * n * (M + m)) = c * omega_ci / w_pi
self.vA = self.vA_over_c * constants.c
self.n_plasma = (
(self.B0 / self.vA)**2 / (constants.mu0 * (self.M + constants.m_e))
)
# Ion plasma frequency (Hz)
self.w_pi = np.sqrt(
constants.q_e**2 * self.n_plasma / (self.M * constants.ep0)
)
# Skin depth (m)
self.l_i = constants.c / self.w_pi
# Ion thermal velocity (m/s) from beta = 2 * (v_ti / vA)**2
self.v_ti = np.sqrt(self.beta / 2.0) * self.vA
# Temperature (eV) from thermal speed: v_ti = sqrt(kT / M)
self.T_plasma = self.v_ti**2 * self.M / constants.q_e # eV
# Larmor radius (m)
self.rho_i = self.v_ti / self.w_ci
def setup_run(self):
"""Setup simulation components."""
#######################################################################
# Set geometry and boundary conditions #
#######################################################################
if self.dim == 1:
grid_object = picmi.Cartesian1DGrid
elif self.dim == 2:
grid_object = picmi.Cartesian2DGrid
else:
grid_object = picmi.Cartesian3DGrid
self.grid = grid_object(
number_of_cells=[self.Nx, self.Nx, self.Nz][-self.dim:],
warpx_max_grid_size=self.Nz,
lower_bound=[-self.Lx/2.0, -self.Lx/2.0, 0][-self.dim:],
upper_bound=[self.Lx/2.0, self.Lx/2.0, self.Lz][-self.dim:],
lower_boundary_conditions=['periodic']*self.dim,
upper_boundary_conditions=['periodic']*self.dim
)
simulation.time_step_size = self.dt
simulation.max_steps = self.total_steps
simulation.current_deposition_algo = 'direct'
simulation.particle_shape = 1
simulation.verbose = self.verbose
#######################################################################
# Field solver and external field #
#######################################################################
self.solver = picmi.HybridPICSolver(
grid=self.grid, gamma=1.0,
Te=self.T_plasma/10.0,
n0=self.n_plasma+self.n_beam,
plasma_resistivity=self.eta, substeps=self.substeps
)
simulation.solver = self.solver
B_ext = picmi.AnalyticInitialField(
Bx_expression=0.0,
By_expression=0.0,
Bz_expression=self.B0
)
simulation.add_applied_field(B_ext)
#######################################################################
# Particle types setup #
#######################################################################
self.ions = picmi.Species(
name='ions', charge='q_e', mass=self.M,
initial_distribution=picmi.UniformDistribution(
density=self.n_plasma,
rms_velocity=[self.v_ti]*3,
directed_velocity=[0, 0, self.u_c]
)
)
simulation.add_species(
self.ions,
layout=picmi.PseudoRandomLayout(
grid=self.grid, n_macroparticles_per_cell=self.NPPC[self.dim-1]
)
)
self.beam_ions = picmi.Species(
name='beam_ions', charge='q_e', mass=self.M,
initial_distribution=picmi.UniformDistribution(
density=self.n_beam,
rms_velocity=[self.v_ti]*3,
directed_velocity=[0, 0, self.u_beam]
)
)
simulation.add_species(
self.beam_ions,
layout=picmi.PseudoRandomLayout(
grid=self.grid,
n_macroparticles_per_cell=self.NPPC[self.dim-1]/2
)
)
#######################################################################
# Add diagnostics #
#######################################################################
callbacks.installafterstep(self.energy_diagnostic)
callbacks.installafterstep(self.text_diag)
if self.test:
part_diag = picmi.ParticleDiagnostic(
name='diag1',
period=1250,
species=[self.ions, self.beam_ions],
data_list = ['ux', 'uy', 'uz', 'z', 'weighting'],
write_dir='.',
warpx_file_prefix='Python_ohms_law_solver_ion_beam_1d_plt',
)
simulation.add_diagnostic(part_diag)
field_diag = picmi.FieldDiagnostic(
name='diag1',
grid=self.grid,
period=1250,
data_list = ['Bx', 'By', 'Bz', 'Ex', 'Ey', 'Ez', 'Jx', 'Jy', 'Jz'],
write_dir='.',
warpx_file_prefix='Python_ohms_law_solver_ion_beam_1d_plt',
)
simulation.add_diagnostic(field_diag)
# output the full particle data at t*w_ci = 40
step = int(40.0 / self.DT)
parts_diag = picmi.ParticleDiagnostic(
name='parts_diag',
period=f"{step}:{step}",
species=[self.ions, self.beam_ions],
write_dir='diags',
warpx_file_prefix='Python_hybrid_PIC_plt',
warpx_format = 'openpmd',
warpx_openpmd_backend = 'h5'
)
simulation.add_diagnostic(parts_diag)
self.output_file_name = 'field_data.txt'
if self.dim == 1:
line_diag = picmi.ReducedDiagnostic(
diag_type='FieldProbe',
probe_geometry='Line',
z_probe=0,
z1_probe=self.Lz,
resolution=self.Nz - 1,
name=self.output_file_name[:-4],
period=self.diag_steps,
path='diags/'
)
simulation.add_diagnostic(line_diag)
else:
# install a custom "reduced diagnostic" to save the average field
callbacks.installafterEsolve(self._record_average_fields)
try:
os.mkdir("diags")
except OSError:
# diags directory already exists
pass
with open(f"diags/{self.output_file_name}", 'w') as f:
f.write("[0]step() [1]time(s) [2]z_coord(m) [3]By_lev0-(T)\n")
#######################################################################
# Initialize simulation #
#######################################################################
simulation.initialize_inputs()
simulation.initialize_warpx()
# create particle container wrapper for the ion species to access
# particle data
self.ion_container_wrapper = particle_containers.ParticleContainerWrapper(
self.ions.name
)
self.beam_ion_container_wrapper = particle_containers.ParticleContainerWrapper(
self.beam_ions.name
)
def _create_data_arrays(self):
self.prev_time = time.time()
self.start_time = self.prev_time
self.prev_step = 0
if libwarpx.amr.ParallelDescriptor.MyProc() == 0:
# allocate arrays for storing energy values
self.energy_vals = np.zeros((self.total_steps//self.diag_steps, 4))
def text_diag(self):
"""Diagnostic function to print out timing data and particle numbers."""
step = simulation.extension.warpx.getistep(lev=0) - 1
if not hasattr(self, "prev_time"):
self._create_data_arrays()
if step % (self.total_steps // 10) != 0:
return
wall_time = time.time() - self.prev_time
steps = step - self.prev_step
step_rate = steps / wall_time
status_dict = {
'step': step,
'nplive beam ions': self.beam_ion_container_wrapper.nps,
'nplive ions': self.ion_container_wrapper.nps,
'wall_time': wall_time,
'step_rate': step_rate,
"diag_steps": self.diag_steps,
'iproc': None
}
diag_string = (
"Step #{step:6d}; "
"{nplive beam ions} beam ions; "
"{nplive ions} core ions; "
"{wall_time:6.1f} s wall time; "
"{step_rate:4.2f} steps/s"
)
if libwarpx.amr.ParallelDescriptor.MyProc() == 0:
print(diag_string.format(**status_dict))
self.prev_time = time.time()
self.prev_step = step
def energy_diagnostic(self):
"""Diagnostic to get the total, magnetic and kinetic energies in the
simulation."""
step = simulation.extension.warpx.getistep(lev=0) - 1
if step % self.diag_steps != 1:
return
idx = (step - 1) // self.diag_steps
if not hasattr(self, "prev_time"):
self._create_data_arrays()
# get the simulation energies
Ec_par, Ec_perp = self._get_kinetic_energy(self.ion_container_wrapper)
Eb_par, Eb_perp = self._get_kinetic_energy(self.beam_ion_container_wrapper)
if libwarpx.amr.ParallelDescriptor.MyProc() != 0:
return
self.energy_vals[idx, 0] = Ec_par
self.energy_vals[idx, 1] = Ec_perp
self.energy_vals[idx, 2] = Eb_par
self.energy_vals[idx, 3] = Eb_perp
if step == self.total_steps:
np.save('diags/energies.npy', run.energy_vals)
def _get_kinetic_energy(self, container_wrapper):
"""Utility function to retrieve the total kinetic energy in the
simulation."""
try:
ux = np.concatenate(container_wrapper.get_particle_ux())
uy = np.concatenate(container_wrapper.get_particle_uy())
uz = np.concatenate(container_wrapper.get_particle_uz())
w = np.concatenate(container_wrapper.get_particle_weight())
except ValueError:
return 0.0, 0.0
my_E_perp = 0.5 * self.M * np.sum(w * (ux**2 + uy**2))
E_perp = comm.allreduce(my_E_perp, op=mpi.SUM)
my_E_par = 0.5 * self.M * np.sum(w * uz**2)
E_par = comm.allreduce(my_E_par, op=mpi.SUM)
return E_par, E_perp
def _record_average_fields(self):
"""A custom reduced diagnostic to store the average E&M fields in a
similar format as the reduced diagnostic so that the same analysis
script can be used regardless of the simulation dimension.
"""
step = simulation.extension.warpx.getistep(lev=0) - 1
if step % self.diag_steps != 0:
return
By_warpx = fields.ByWrapper()[...]
if libwarpx.amr.ParallelDescriptor.MyProc() != 0:
return
t = step * self.dt
z_vals = np.linspace(0, self.Lz, self.Nz, endpoint=False)
if self.dim == 2:
By = np.mean(By_warpx[:-1], axis=0)
else:
By = np.mean(By_warpx[:-1], axis=(0, 1))
with open(f"diags/{self.output_file_name}", 'a') as f:
for ii in range(self.Nz):
f.write(
f"{step:05d} {t:.10e} {z_vals[ii]:.10e} {By[ii]:+.10e}\n"
)
##########################
# parse input parameters
##########################
parser = argparse.ArgumentParser()
parser.add_argument(
'-t', '--test', help='toggle whether this script is run as a short CI test',
action='store_true',
)
parser.add_argument(
'-d', '--dim', help='Simulation dimension', required=False, type=int,
default=1
)
parser.add_argument(
'-r', '--resonant', help='Run the resonant case', required=False,
action='store_true',
)
parser.add_argument(
'-v', '--verbose', help='Verbose output', action='store_true',
)
args, left = parser.parse_known_args()
sys.argv = sys.argv[:1]+left
run = HybridPICBeamInstability(
test=args.test, dim=args.dim, resonant=args.resonant, verbose=args.verbose
)
simulation.step()
For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
For the resonant case, execute:
python3 PICMI_inputs.py --dim {1/2/3} --resonant
For the non-resonant case, execute:
python3 PICMI_inputs.py --dim {1/2/3}
Analyze
The following script reads the simulation output from the above example, performs Fourier transforms of the field data and outputs the figures shown below.
Script analysis.py
Examples/Tests/ohm_solver_ion_beam_instability/analysis.py
#!/usr/bin/env python3
#
# --- Analysis script for the hybrid-PIC example of ion beam R instability.
import dill
import h5py
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from pywarpx import picmi
constants = picmi.constants
matplotlib.rcParams.update({'font.size': 20})
# load simulation parameters
with open(f'sim_parameters.dpkl', 'rb') as f:
sim = dill.load(f)
if sim.resonant:
resonant_str = 'resonant'
else:
resonant_str = 'non resonant'
data = np.loadtxt("diags/field_data.txt", skiprows=1)
if sim.dim == 1:
field_idx_dict = {'z': 4, 'By': 8}
else:
field_idx_dict = {'z': 2, 'By': 3}
step = data[:,0]
num_steps = len(np.unique(step))
# get the spatial resolution
resolution = len(np.where(step == 0)[0]) - 1
# reshape to separate spatial and time coordinates
sim_data = data.reshape((num_steps, resolution+1, data.shape[1]))
z_grid = sim_data[1, :, field_idx_dict['z']]
idx = np.argsort(z_grid)[1:]
dz = np.mean(np.diff(z_grid[idx]))
dt = np.mean(np.diff(sim_data[:,0,1]))
data = np.zeros((num_steps, resolution))
for i in range(num_steps):
data[i,:] = sim_data[i,idx,field_idx_dict['By']]
print(f"Data file contains {num_steps} time snapshots.")
print(f"Spatial resolution is {resolution}")
# Create the stack time plot
fig, ax1 = plt.subplots(1, 1, figsize=(10, 5))
max_val = np.max(np.abs(data[:,:]/sim.B0))
extent = [0, sim.Lz/sim.l_i, 0, num_steps*dt*sim.w_ci] # num_steps*dt/sim.t_ci]
im = ax1.imshow(
data[:,:]/sim.B0, extent=extent, origin='lower',
cmap='seismic', vmin=-max_val, vmax=max_val, aspect="equal",
)
# Colorbar
fig.subplots_adjust(right=0.825)
cbar_ax = fig.add_axes([0.85, 0.2, 0.03, 0.6])
fig.colorbar(im, cax=cbar_ax, orientation='vertical', label='$B_y/B_0$')
ax1.set_xlabel("$z/l_i$")
ax1.set_ylabel("$t \Omega_i$ (rad)")
ax1.set_title(f"Ion beam R instability - {resonant_str} case")
plt.savefig(f"diags/ion_beam_R_instability_{resonant_str}_eta_{sim.eta}_substeps_{sim.substeps}.png")
plt.close()
if sim.resonant:
# Plot the 4th, 5th and 6th Fourier modes
field_kt = np.fft.fft(data[:, :], axis=1)
k = 2*np.pi * np.fft.fftfreq(resolution, dz) * sim.l_i
t_grid = np.arange(num_steps)*dt*sim.w_ci
plt.plot(t_grid, np.abs(field_kt[:, 4] / sim.B0), 'r', label=f'm = 4, $kl_i={k[4]:.2f}$')
plt.plot(t_grid, np.abs(field_kt[:, 5] / sim.B0), 'b', label=f'm = 5, $kl_i={k[5]:.2f}$')
plt.plot(t_grid, np.abs(field_kt[:, 6] / sim.B0), 'k', label=f'm = 6, $kl_i={k[6]:.2f}$')
# The theoretical growth rates for the 4th, 5th and 6th Fourier modes of
# the By-field was obtained from Fig. 12a of Munoz et al.
# Note the rates here are gamma / w_ci
gamma4 = 0.1915611861780133
gamma5 = 0.20087036355662818
gamma6 = 0.17123024228396777
# Draw the line of best fit with the theoretical growth rate (slope) in the
# window t*w_ci between 10 and 40
idx = np.where((t_grid > 10) & (t_grid < 40))
t_points = t_grid[idx]
A4 = np.exp(np.mean(np.log(np.abs(field_kt[idx, 4] / sim.B0)) - t_points*gamma4))
plt.plot(t_points, A4*np.exp(t_points*gamma4), 'r--', lw=3)
A5 = np.exp(np.mean(np.log(np.abs(field_kt[idx, 5] / sim.B0)) - t_points*gamma5))
plt.plot(t_points, A5*np.exp(t_points*gamma5), 'b--', lw=3)
A6 = np.exp(np.mean(np.log(np.abs(field_kt[idx, 6] / sim.B0)) - t_points*gamma6))
plt.plot(t_points, A6*np.exp(t_points*gamma6), 'k--', lw=3)
plt.grid()
plt.legend()
plt.yscale('log')
plt.ylabel('$|B_y/B_0|$')
plt.xlabel('$t\Omega_i$ (rad)')
plt.tight_layout()
plt.savefig(f"diags/ion_beam_R_instability_{resonant_str}_eta_{sim.eta}_substeps_{sim.substeps}_low_modes.png")
plt.close()
# check if the growth rate matches expectation
m4_rms_error = np.sqrt(np.mean(
(np.abs(field_kt[idx, 4] / sim.B0) - A4*np.exp(t_points*gamma4))**2
))
m5_rms_error = np.sqrt(np.mean(
(np.abs(field_kt[idx, 5] / sim.B0) - A5*np.exp(t_points*gamma5))**2
))
m6_rms_error = np.sqrt(np.mean(
(np.abs(field_kt[idx, 6] / sim.B0) - A6*np.exp(t_points*gamma6))**2
))
print("Growth rate RMS errors:")
print(f" m = 4: {m4_rms_error:.3e}")
print(f" m = 5: {m5_rms_error:.3e}")
print(f" m = 6: {m6_rms_error:.3e}")
if not sim.test:
with h5py.File('diags/Python_hybrid_PIC_plt/openpmd_004000.h5', 'r') as data:
timestep = str(np.squeeze([key for key in data['data'].keys()]))
z = np.array(data['data'][timestep]['particles']['ions']['position']['z'])
vy = np.array(data['data'][timestep]['particles']['ions']['momentum']['y'])
w = np.array(data['data'][timestep]['particles']['ions']['weighting'])
fig, ax1 = plt.subplots(1, 1, figsize=(10, 5))
im = ax1.hist2d(
z/sim.l_i, vy/sim.M/sim.vA, weights=w, density=True,
range=[[0, 250], [-10, 10]], bins=250, cmin=1e-5
)
# Colorbar
fig.subplots_adjust(bottom=0.15, right=0.815)
cbar_ax = fig.add_axes([0.83, 0.2, 0.03, 0.6])
fig.colorbar(im[3], cax=cbar_ax, orientation='vertical', format='%.0e', label='$f(z, v_y)$')
ax1.set_xlabel("$z/l_i$")
ax1.set_ylabel("$v_{y}/v_A$")
ax1.set_title(f"Ion beam R instability - {resonant_str} case")
plt.savefig(f"diags/ion_beam_R_instability_{resonant_str}_eta_{sim.eta}_substeps_{sim.substeps}_core_phase_space.png")
plt.close()
with h5py.File('diags/Python_hybrid_PIC_plt/openpmd_004000.h5', 'r') as data:
timestep = str(np.squeeze([key for key in data['data'].keys()]))
z = np.array(data['data'][timestep]['particles']['beam_ions']['position']['z'])
vy = np.array(data['data'][timestep]['particles']['beam_ions']['momentum']['y'])
w = np.array(data['data'][timestep]['particles']['beam_ions']['weighting'])
fig, ax1 = plt.subplots(1, 1, figsize=(10, 5))
im = ax1.hist2d(
z/sim.l_i, vy/sim.M/sim.vA, weights=w, density=True,
range=[[0, 250], [-10, 10]], bins=250, cmin=1e-5
)
# Colorbar
fig.subplots_adjust(bottom=0.15, right=0.815)
cbar_ax = fig.add_axes([0.83, 0.2, 0.03, 0.6])
fig.colorbar(im[3], cax=cbar_ax, orientation='vertical', format='%.0e', label='$f(z, v_y)$')
ax1.set_xlabel("$z/l_i$")
ax1.set_ylabel("$v_{y}/v_A$")
ax1.set_title(f"Ion beam R instability - {resonant_str} case")
plt.savefig(f"diags/ion_beam_R_instability_{resonant_str}_eta_{sim.eta}_substeps_{sim.substeps}_beam_phase_space.png")
plt.show()
if sim.test:
# physics based check - these error tolerances are not set from theory
# but from the errors that were present when the test was created. If these
# assert's fail, the full benchmark should be rerun (same as the test but
# without the `--test` argument) and the growth rates (up to saturation)
# compared to the theoretical ones to determine if the physics test passes.
# At creation, the full test (3d) had the following errors (ran on 1 V100):
# m4_rms_error = 3.329; m5_rms_error = 1.052; m6_rms_error = 2.583
assert np.isclose(m4_rms_error, 1.515, atol=0.01)
assert np.isclose(m5_rms_error, 0.718, atol=0.01)
assert np.isclose(m6_rms_error, 0.357, atol=0.01)
# checksum check
import os
import sys
sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI
# this will be the name of the plot file
fn = sys.argv[1]
test_name = os.path.split(os.getcwd())[1]
checksumAPI.evaluate_checksum(test_name, fn)
The figures below show the evolution of the y-component of the magnetic field as the beam and core plasma interact.


Evolution of \(B_y\) for resonant (top) and non-resonant (bottom) conditions.
The growth rates of the strongest growing modes for the resonant case are compared to theory (dashed lines) in the figure below.
Time series of the mode amplitudes for m = 4, 5, 6 from the simulation. The theoretical growth rates for these modes are also shown as dashed lines.
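In the analysis script, each mode amplitude is compared, over the window \(10 < t\,\Omega_i < 40\), to exponential growth at the theoretical rates digitized from Fig. 12a of Muñoz et al. [1] (only the prefactor is fitted):
\[\left|\frac{B_y(k_m, t)}{B_0}\right| \propto e^{\gamma_m t},\qquad \gamma_m/\Omega_i \approx 0.192,\ 0.201,\ 0.171 \quad\text{for } m = 4, 5, 6.\]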
Ohm solver: Ion Landau Damping
Landau damping is a well-known process in which electrostatic (acoustic) waves are damped by transferring energy to particles satisfying a resonance condition. The process can be simulated by seeding a plasma with a specific acoustic mode (density perturbation) and tracking the strength of the mode as a function of time.
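In the input script below, the perturbation is applied to the initial ion density,
\[n_i(z) = n_0\left[1 + \epsilon\cos(k_m z)\right],\qquad k_m = \frac{2\pi m}{L_z},\]
with \(\epsilon = 0.03\) and mode number \(m = 4\) by default.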
Run
The same input script can be used for 1d, 2d or 3d simulations and to sweep different temperature ratios.
Script PICMI_inputs.py
Examples/Tests/ohm_solver_ion_Landau_damping/PICMI_inputs.py
#!/usr/bin/env python3
#
# --- Test script for the kinetic-fluid hybrid model in WarpX wherein ions are
# --- treated as kinetic particles and electrons as an isothermal, inertialess
# --- background fluid. The script simulates ion Landau damping as described
# --- in section 4.5 of Munoz et al. (2018).
import argparse
import os
import sys
import time
import dill
import numpy as np
from mpi4py import MPI as mpi
from pywarpx import callbacks, fields, libwarpx, particle_containers, picmi
constants = picmi.constants
comm = mpi.COMM_WORLD
simulation = picmi.Simulation(
warpx_serialize_initial_conditions=True,
verbose=0
)
class IonLandauDamping(object):
'''This input is based on the ion Landau damping test as described by
Munoz et al. (2018).
'''
# Applied field parameters
B0 = 0.1 # Initial magnetic field strength (T)
beta = 2.0 # Plasma beta, used to calculate temperature
# Plasma species parameters
m_ion = 100.0 # Ion mass (electron masses)
vA_over_c = 1e-3 # ratio of Alfven speed and the speed of light
# Spatial domain
Nz = 256 # number of cells in z direction
Nx = 4 # number of cells in x (and y) direction for >1 dimensions
# Temporal domain (if not run as a CI test)
LT = 40.0 # Simulation temporal length (ion cyclotron periods)
# Numerical parameters
NPPC = [8192, 4096, 1024] # Seed number of particles per cell
DZ = 1.0 / 6.0 # Cell size (ion skin depths)
DT = 1e-3 # Time step (ion cyclotron periods)
# density perturbation strength
epsilon = 0.03
# Plasma resistivity - used to dampen the mode excitation
eta = 1e-7
# Number of substeps used to update B
substeps = 10
def __init__(self, test, dim, m, T_ratio, verbose):
"""Get input parameters for the specific case desired."""
self.test = test
self.dim = int(dim)
self.m = m
self.T_ratio = T_ratio
self.verbose = verbose or self.test
# sanity check
assert (dim > 0 and dim < 4), f"{dim}-dimensions not a valid input"
# calculate various plasma parameters based on the simulation input
self.get_plasma_quantities()
self.dz = self.DZ * self.l_i
self.Lz = self.Nz * self.dz
self.Lx = self.Nx * self.dz
diag_period = 1 / 16.0 # Output interval (ion cyclotron periods)
self.diag_steps = int(diag_period / self.DT)
self.total_steps = int(np.ceil(self.LT / self.DT))
# if this is a test case run for only 100 steps
if self.test:
self.total_steps = 100
self.dt = self.DT / self.w_ci # self.DT * self.t_ci
# dump all the current attributes to a dill pickle file
if comm.rank == 0:
with open('sim_parameters.dpkl', 'wb') as f:
dill.dump(self, f)
# print out plasma parameters
if comm.rank == 0:
print(
f"Initializing simulation with input parameters:\n"
f"\tT = {self.T_plasma*1e-3:.1f} keV\n"
f"\tn = {self.n_plasma:.1e} m^-3\n"
f"\tB0 = {self.B0:.2f} T\n"
f"\tM/m = {self.m_ion:.0f}\n"
)
print(
f"Plasma parameters:\n"
f"\tl_i = {self.l_i:.1e} m\n"
f"\tt_ci = {self.t_ci:.1e} s\n"
f"\tv_ti = {self.v_ti:.1e} m/s\n"
f"\tvA = {self.vA:.1e} m/s\n"
)
print(
f"Numerical parameters:\n"
f"\tdz = {self.dz:.1e} m\n"
f"\tdt = {self.dt:.1e} s\n"
f"\tdiag steps = {self.diag_steps:d}\n"
f"\ttotal steps = {self.total_steps:d}\n"
)
self.setup_run()
def get_plasma_quantities(self):
"""Calculate various plasma parameters based on the simulation input."""
# Ion mass (kg)
self.M = self.m_ion * constants.m_e
# Cyclotron angular frequency (rad/s) and period (s)
self.w_ci = constants.q_e * abs(self.B0) / self.M
self.t_ci = 2.0 * np.pi / self.w_ci
# Alfven speed (m/s): vA = B / sqrt(mu0 * n * (M + m)) = c * omega_ci / w_pi
self.vA = self.vA_over_c * constants.c
self.n_plasma = (
(self.B0 / self.vA)**2 / (constants.mu0 * (self.M + constants.m_e))
)
# Ion plasma frequency (Hz)
self.w_pi = np.sqrt(
constants.q_e**2 * self.n_plasma / (self.M * constants.ep0)
)
# Skin depth (m)
self.l_i = constants.c / self.w_pi
# Ion thermal velocity (m/s) from beta = 2 * (v_ti / vA)**2
self.v_ti = np.sqrt(self.beta / 2.0) * self.vA
# Temperature (eV) from thermal speed: v_ti = sqrt(kT / M)
self.T_plasma = self.v_ti**2 * self.M / constants.q_e # eV
# Larmor radius (m)
self.rho_i = self.v_ti / self.w_ci
def setup_run(self):
"""Setup simulation components."""
#######################################################################
# Set geometry and boundary conditions #
#######################################################################
if self.dim == 1:
grid_object = picmi.Cartesian1DGrid
elif self.dim == 2:
grid_object = picmi.Cartesian2DGrid
else:
grid_object = picmi.Cartesian3DGrid
self.grid = grid_object(
number_of_cells=[self.Nx, self.Nx, self.Nz][-self.dim:],
warpx_max_grid_size=self.Nz,
lower_bound=[-self.Lx/2.0, -self.Lx/2.0, 0][-self.dim:],
upper_bound=[self.Lx/2.0, self.Lx/2.0, self.Lz][-self.dim:],
lower_boundary_conditions=['periodic']*self.dim,
upper_boundary_conditions=['periodic']*self.dim,
warpx_blocking_factor=4
)
simulation.time_step_size = self.dt
simulation.max_steps = self.total_steps
simulation.current_deposition_algo = 'direct'
simulation.particle_shape = 1
simulation.verbose = self.verbose
#######################################################################
# Field solver and external field #
#######################################################################
self.solver = picmi.HybridPICSolver(
grid=self.grid, gamma=1.0,
Te=self.T_plasma/self.T_ratio,
n0=self.n_plasma,
plasma_resistivity=self.eta, substeps=self.substeps
)
simulation.solver = self.solver
#######################################################################
# Particle types setup #
#######################################################################
k_m = 2.0*np.pi*self.m / self.Lz
self.ions = picmi.Species(
name='ions', charge='q_e', mass=self.M,
initial_distribution=picmi.AnalyticDistribution(
density_expression=f"{self.n_plasma}*(1+{self.epsilon}*cos({k_m}*z))",
rms_velocity=[self.v_ti]*3
)
)
simulation.add_species(
self.ions,
layout=picmi.PseudoRandomLayout(
grid=self.grid, n_macroparticles_per_cell=self.NPPC[self.dim-1]
)
)
#######################################################################
# Add diagnostics #
#######################################################################
callbacks.installafterstep(self.text_diag)
if self.test:
particle_diag = picmi.ParticleDiagnostic(
name='diag1',
period=100,
write_dir='.',
species=[self.ions],
data_list = ['ux', 'uy', 'uz', 'x', 'z', 'weighting'],
warpx_file_prefix=f'Python_ohms_law_solver_landau_damping_{self.dim}d_plt',
)
simulation.add_diagnostic(particle_diag)
field_diag = picmi.FieldDiagnostic(
name='diag1',
grid=self.grid,
period=100,
write_dir='.',
data_list = ['Bx', 'By', 'Bz', 'Ex', 'Ey', 'Ez', 'Jx', 'Jy', 'Jz'],
warpx_file_prefix=f'Python_ohms_law_solver_landau_damping_{self.dim}d_plt',
)
simulation.add_diagnostic(field_diag)
self.output_file_name = 'field_data.txt'
# install a custom "reduced diagnostic" to save the average field
callbacks.installafterEsolve(self._record_average_fields)
try:
os.mkdir("diags")
except OSError:
# diags directory already exists
pass
with open(f"diags/{self.output_file_name}", 'w') as f:
f.write("[0]step() [1]time(s) [2]z_coord(m) [3]Ez_lev0-(V/m)\n")
self.prev_time = time.time()
self.start_time = self.prev_time
self.prev_step = 0
#######################################################################
# Initialize simulation #
#######################################################################
simulation.initialize_inputs()
simulation.initialize_warpx()
# get ion particle container wrapper
self.ion_part_container = particle_containers.ParticleContainerWrapper(
'ions'
)
def text_diag(self):
"""Diagnostic function to print out timing data and particle numbers."""
step = simulation.extension.warpx.getistep(lev=0) - 1
if step % (self.total_steps // 10) != 0:
return
wall_time = time.time() - self.prev_time
steps = step - self.prev_step
step_rate = steps / wall_time
status_dict = {
'step': step,
'nplive ions': self.ion_part_container.nps,
'wall_time': wall_time,
'step_rate': step_rate,
"diag_steps": self.diag_steps,
'iproc': None
}
diag_string = (
"Step #{step:6d}; "
"{nplive ions} core ions; "
"{wall_time:6.1f} s wall time; "
"{step_rate:4.2f} steps/s"
)
if libwarpx.amr.ParallelDescriptor.MyProc() == 0:
print(diag_string.format(**status_dict))
self.prev_time = time.time()
self.prev_step = step
def _record_average_fields(self):
"""A custom reduced diagnostic to store the average E&M fields in a
similar format as the reduced diagnostic so that the same analysis
script can be used regardless of the simulation dimension.
"""
step = simulation.extension.warpx.getistep(lev=0) - 1
if step % self.diag_steps != 0:
return
Ez_warpx = fields.EzWrapper()[...]
if libwarpx.amr.ParallelDescriptor.MyProc() != 0:
return
t = step * self.dt
z_vals = np.linspace(0, self.Lz, self.Nz, endpoint=False)
if self.dim == 1:
Ez = Ez_warpx
elif self.dim == 2:
Ez = np.mean(Ez_warpx, axis=0)
else:
Ez = np.mean(Ez_warpx, axis=(0, 1))
with open(f"diags/{self.output_file_name}", 'a') as f:
for ii in range(self.Nz):
f.write(
f"{step:05d} {t:.10e} {z_vals[ii]:.10e} {Ez[ii]:+.10e}\n"
)
##########################
# parse input parameters
##########################
parser = argparse.ArgumentParser()
parser.add_argument(
'-t', '--test', help='toggle whether this script is run as a short CI test',
action='store_true',
)
parser.add_argument(
'-d', '--dim', help='Simulation dimension', required=False, type=int,
default=1
)
parser.add_argument(
'-m', help='Mode number to excite', required=False, type=int,
default=4
)
parser.add_argument(
'--temp_ratio', help='Ratio of ion to electron temperature', required=False,
type=float, default=1.0/3
)
parser.add_argument(
'-v', '--verbose', help='Verbose output', action='store_true',
)
args, left = parser.parse_known_args()
sys.argv = sys.argv[:1]+left
run = IonLandauDamping(
test=args.test, dim=args.dim, m=args.m, T_ratio=args.temp_ratio,
verbose=args.verbose
)
simulation.step()
For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
Execute:
python3 PICMI_inputs.py --dim {1/2/3} --temp_ratio {value}
Analyze
The following script extracts the amplitude of the seeded mode as a function of time and compares it to the theoretical damping rate.
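Specifically, the script plots the amplitude of the seeded mode, normalized to its initial value, against \(t\,k_m v_{th,i}\) and overlays the expected exponential decay,
\[\frac{|E_z(k_m, t)|}{|E_z(k_m, 0)|} \approx e^{-\gamma\, t\, k_m v_{th,i}},\]
with \(\gamma\) interpolated, for the chosen \(T_i/T_e\), from the damping rates digitized from Fig. 14b of Muñoz et al. [1].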
Script analysis.py
Examples/Tests/ohm_solver_ion_Landau_damping/analysis.py
#!/usr/bin/env python3
#
# --- Analysis script for the hybrid-PIC example of ion Landau damping.
import dill
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from pywarpx import picmi
constants = picmi.constants
matplotlib.rcParams.update({'font.size': 20})
# load simulation parameters
with open(f'sim_parameters.dpkl', 'rb') as f:
sim = dill.load(f)
# theoretical damping rates were taken from Fig. 14b of Munoz et al.
theoretical_damping_rate = np.array([
[0.09456706, 0.05113443], [0.09864177, 0.05847507],
[0.10339559, 0.0659153 ], [0.10747029, 0.07359366],
[0.11290323, 0.08256106], [0.11833616, 0.09262114],
[0.12580645, 0.10541121], [0.13327674, 0.11825558],
[0.14006791, 0.13203098], [0.14889643, 0.14600538],
[0.15772496, 0.16379615], [0.16791171, 0.18026693],
[0.17606112, 0.19650209], [0.18828523, 0.21522808],
[0.19983022, 0.23349062], [0.21273345, 0.25209216],
[0.22835314, 0.27877403], [0.24465195, 0.30098317],
[0.25959253, 0.32186286], [0.27657046, 0.34254601],
[0.29626486, 0.36983567], [0.3139219 , 0.38984826],
[0.33157895, 0.40897973], [0.35195246, 0.43526107],
[0.37368421, 0.45662113], [0.39745331, 0.47902942],
[0.44974533, 0.52973074], [0.50747029, 0.57743925],
[0.57334465, 0.63246726], [0.64193548, 0.67634255]
])
expected_gamma = np.interp(
sim.T_ratio, theoretical_damping_rate[:, 0], theoretical_damping_rate[:, 1]
)
data = np.loadtxt("diags/field_data.txt", skiprows=1)
field_idx_dict = {'z': 2, 'Ez': 3}
step = data[:,0]
num_steps = len(np.unique(step))
# get the spatial resolution
resolution = len(np.where(step == 0)[0]) - 1
# reshape to separate spatial and time coordinates
sim_data = data.reshape((num_steps, resolution+1, data.shape[1]))
z_grid = sim_data[1, :, field_idx_dict['z']]
idx = np.argsort(z_grid)[1:]
dz = np.mean(np.diff(z_grid[idx]))
dt = np.mean(np.diff(sim_data[:,0,1]))
data = np.zeros((num_steps, resolution))
for i in range(num_steps):
data[i,:] = sim_data[i,idx,field_idx_dict['Ez']]
print(f"Data file contains {num_steps} time snapshots.")
print(f"Spatial resolution is {resolution}")
field_kt = np.fft.fft(data[:, :], axis=1)
t_norm = 2.0 * np.pi * sim.m / sim.Lz * sim.v_ti
# Plot the 4th Fourier mode
fig, ax1 = plt.subplots(1, 1, figsize=(10, 5))
t_points = np.arange(num_steps)*dt*t_norm
ax1.plot(
t_points, np.abs(field_kt[:, sim.m] / field_kt[0, sim.m]), 'r',
label=f'$T_i/T_e$ = {sim.T_ratio:.2f}'
)
# Plot a line showing the expected damping rate
t_points = t_points[np.where(t_points < 8)]
ax1.plot(
t_points, np.exp(-t_points*expected_gamma), 'k--', lw=2
)
ax1.grid()
ax1.legend()
ax1.set_yscale('log')
ax1.set_ylabel('$|E_z|/E_0$')
ax1.set_xlabel('t $(k_mv_{th,i})$')
ax1.set_xlim(0, 18)
ax1.set_title(f"Ion Landau damping - {sim.dim}d")
plt.tight_layout()
plt.savefig(f"diags/ion_Landau_damping_T_ratio_{sim.T_ratio}.png")
if sim.test:
import os
import sys
sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI
# this will be the name of the plot file
fn = sys.argv[1]
test_name = os.path.split(os.getcwd())[1]
checksumAPI.evaluate_checksum(test_name, fn)
The figure below shows a set of such simulations with parameters matching those described in section 4.5 of Muñoz et al. [1]. The straight lines show the theoretical damping rate for the given temperature ratios.

Decay of the seeded modes as a function of time for different ion-to-electron temperature ratios. The theoretical damping rates for the given modes are shown as dashed lines.
High-Performance Computing and Numerics
The following examples are commonly used to study the performance of WarpX, e.g., for computing efficiency, scalability, and I/O patterns. While all prior examples are used for such studies as well, the examples here need less explanation of the physics and less detailed tuning of load balancing, and they often simply scale (weakly or strongly) by changing the number of cells, the AMReX block size and the number of compute units.
Uniform Plasma
This example evolves a uniformly distributed, hot plasma over time.
Run
For MPI-parallel runs, prefix these lines with mpiexec -n 4 ... or srun -n 4 ..., depending on the system.
Note
TODO: This input file should be created following the inputs_3d file.
This example can be run as a WarpX executable using an input file: warpx.3d inputs_3d
You can copy this file from usage/examples/lwfa/inputs_3d.
#################################
####### GENERAL PARAMETERS ######
#################################
max_step = 10
amr.n_cell = 64 32 32
amr.max_grid_size = 32
amr.blocking_factor = 16
amr.max_level = 0
geometry.dims = 3
geometry.prob_lo = -20.e-6 -20.e-6 -20.e-6   # physical domain
geometry.prob_hi =  20.e-6  20.e-6  20.e-6

#################################
####### Boundary condition ######
#################################
boundary.field_lo = periodic periodic periodic
boundary.field_hi = periodic periodic periodic

#################################
############ NUMERICS ###########
#################################
warpx.serialize_initial_conditions = 1
warpx.verbose = 1
warpx.cfl = 1.0

# Order of particle shape factors
algo.particle_shape = 1

#################################
############ PLASMA #############
#################################
particles.species_names = electrons

electrons.species_type = electron
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 1 1 2
electrons.profile = constant
electrons.density = 1.e25   # number of electrons per m^3
electrons.momentum_distribution_type = "gaussian"
electrons.ux_th = 0.01      # uth the std of the (unitless) momentum
electrons.uy_th = 0.01      # uth the std of the (unitless) momentum
electrons.uz_th = 0.01      # uth the std of the (unitless) momentum

# Diagnostics
diagnostics.diags_names = diag1 chk
diag1.intervals = 4
diag1.diag_type = Full
diag1.electrons.variables = ux uy uz w
diag1.fields_to_plot = Bx By Bz Ex Ey Ez jx jy jz rho

chk.intervals = 6
chk.diag_type = Full
chk.format = checkpoint
Note
TODO: This input file should be created following the inputs_2d
file.
This example can be run as a WarpX executable using an input file: warpx.2d inputs_2d
You can copy this file from usage/examples/lwfa/inputs_2d.

#################################
####### GENERAL PARAMETERS ######
#################################
max_step = 10
amr.n_cell = 128 128
amr.max_grid_size = 64
amr.blocking_factor = 32
amr.max_level = 0
geometry.dims = 2
geometry.prob_lo = -20.e-6 -20.e-6    # physical domain
geometry.prob_hi =  20.e-6  20.e-6

#################################
####### Boundary condition ######
#################################
boundary.field_lo = periodic periodic
boundary.field_hi = periodic periodic

#################################
############ NUMERICS ###########
#################################
warpx.serialize_initial_conditions = 1
warpx.verbose = 1
warpx.cfl = 1.0
warpx.use_filter = 0

# Order of particle shape factors
algo.particle_shape = 1

#################################
############ PLASMA #############
#################################
particles.species_names = electrons

electrons.charge = -q_e
electrons.mass = m_e
electrons.injection_style = "NUniformPerCell"
electrons.num_particles_per_cell_each_dim = 2 2
electrons.profile = constant
electrons.density = 1.e25  # number of electrons per m^3
electrons.momentum_distribution_type = "gaussian"
electrons.ux_th = 0.01 # uth the std of the (unitless) momentum
electrons.uy_th = 0.01 # uth the std of the (unitless) momentum
electrons.uz_th = 0.01 # uth the std of the (unitless) momentum

# Diagnostics
diagnostics.diags_names = diag1
diag1.intervals = 10
diag1.diag_type = Full
Analyze
Note
This section is TODO.
Visualize
Note
This section is TODO.
Manipulating fields via Python
Note
TODO: The section needs to be sorted into either science cases (above) or later sections (workflows and Python API details).
An example of using Python to access the simulation charge density, solve the Poisson equation (using superLU) and write the resulting electrostatic potential back to the simulation is given in the input file below. This example uses the fields.py module included in the pywarpx library.
A second example initializes the fields by accessing their data through Python, advances the simulation for a chosen number of time steps, and plots the fields again through Python. The simulation runs with 128 regular cells, 8 guard cells, and 10 PML cells in each direction. Moreover, it uses div(E) and div(B) cleaning both in the regular grid and in the PML, and initializes all available electromagnetic fields (E, B, F, G) identically.
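As a minimal sketch of this kind of field access (the grid parameters are illustrative, and the initialize_inputs/initialize_warpx calls and the EzWrapper class are assumed from the pywarpx Python workflow rather than documented on this page):
from pywarpx import picmi, fields

grid = picmi.Cartesian3DGrid(
    number_of_cells=[32, 32, 32],
    lower_bound=[-20.e-6, -20.e-6, -20.e-6],
    upper_bound=[20.e-6, 20.e-6, 20.e-6],
    lower_boundary_conditions=['periodic', 'periodic', 'periodic'],
    upper_boundary_conditions=['periodic', 'periodic', 'periodic'])
solver = picmi.ElectromagneticSolver(grid=grid, method='Yee', cfl=0.99)
sim = picmi.Simulation(solver=solver, max_steps=10)

# Create the WarpX objects so that field data exists to read and write
sim.initialize_inputs()
sim.initialize_warpx()

# Wrap Ez as a NumPy-like array, modify it, and continue the run
Ez = fields.EzWrapper()
Ez[...] = 0.0
sim.step(10)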
Many Further Examples, Demos and Tests
WarpX runs over 200 integration tests on a variety of modeling cases, which validate and demonstrate its functionality. Please see the Examples/Tests/ directory for many more examples.
Example References
P. A. Muñoz, N. Jain, P. Kilian, and J. Büchner. A new hybrid code (CHIEF) implementing the inertial electron fluid equation without approximation. Computer Physics Communications, 224:245–264, 2018. URL: https://www.sciencedirect.com/science/article/pii/S0010465517303521, doi:10.1016/j.cpc.2017.10.012.
T. Tajima and J. M. Dawson. Laser accelerator by plasma waves. AIP Conference Proceedings, 91(1):69–93, Sep 1982. URL: https://doi.org/10.1063/1.33805, doi:10.1063/1.33805.
E. Esarey, P. Sprangle, J. Krall, and A. Ting. Overview of plasma-based accelerator concepts. IEEE Transactions on Plasma Science, 24(2):252–288, 1996. doi:10.1109/27.509991.
S. C. Wilks, A. B. Langdon, T. E. Cowan, M. Roth, M. Singh, S. Hatchett, M. H. Key, D. Pennington, A. MacKinnon, and R. A. Snavely. Energetic proton generation in ultra-intense laser–solid interactions. Physics of Plasmas, 8(2):542–549, Feb 2001. URL: https://doi.org/10.1063/1.1333697, doi:10.1063/1.1333697.
S. S. Bulanov, A. Brantov, V. Yu. Bychenkov, V. Chvykov, G. Kalinchenko, T. Matsuoka, P. Rousseau, S. Reed, V. Yanovsky, D. W. Litzenberg, K. Krushelnick, and A. Maksimchuk. Accelerating monoenergetic protons from ultrathin foils by flat-top laser pulses in the directed-Coulomb-explosion regime. Phys. Rev. E, 78:026412, Aug 2008. URL: https://link.aps.org/doi/10.1103/PhysRevE.78.026412, doi:10.1103/PhysRevE.78.026412.
A. Macchi, M. Borghesi, and M. Passoni. Ion acceleration by superintense laser-plasma interaction. Rev. Mod. Phys., 85:751–793, May 2013. URL: https://link.aps.org/doi/10.1103/RevModPhys.85.751, doi:10.1103/RevModPhys.85.751.
B. Dromey, S. Kar, M. Zepf, and P. Foster. The plasma mirror—A subpicosecond optical switch for ultrahigh power lasers. Review of Scientific Instruments, 75(3):645–649, Feb 2004. URL: https://doi.org/10.1063/1.1646737, doi:10.1063/1.1646737.
C. Rödel, M. Heyer, M. Behmke, M. Kübel, O. Jäckel, W. Ziegler, D. Ehrt, M. C. Kaluza, and G. G. Paulus. High repetition rate plasma mirror for temporal contrast enhancement of terawatt femtosecond laser pulses by three orders of magnitude. Applied Physics B, 103(2):295–302, Nov 2010. URL: http://dx.doi.org/10.1007/s00340-010-4329-7, doi:10.1007/s00340-010-4329-7.
V. Yakimenko, S. Meuren, F. Del Gaudio, C. Baumann, A. Fedotov, F. Fiuza, T. Grismayer, M. J. Hogan, A. Pukhov, L. O. Silva, and G. White. Prospect of studying nonperturbative qed with beam-beam collisions. Phys. Rev. Lett., 122:190404, May 2019. doi:10.1103/PhysRevLett.122.190404.
A. Le, W. Daughton, H. Karimabadi, and J. Egedal. Hybrid simulations of magnetic reconnection with kinetic ions and fluid electron pressure anisotropy. Physics of Plasmas, Mar 2016. 032114. URL: https://doi.org/10.1063/1.4943893, doi:10.1063/1.4943893.
M. M. Turner, A. Derzsi, Z. Donkó, D. Eremin, S. J. Kelly, T. Lafleur, and T. Mussenbrock. Simulation benchmarks for low-pressure plasmas: Capacitive discharges. Physics of Plasmas, Jan 2013. 013507. URL: https://doi.org/10.1063/1.4775084, doi:10.1063/1.4775084.
T. H. Stix. Waves in Plasmas. American Inst. of Physics, 1992. ISBN 978-0-88318-859-0. URL: https://books.google.com/books?id=OsOWJ8iHpmMC.
Parameters: Python (PICMI)
This section documents how to use WarpX as a Python script (e.g., python3 PICMI_script.py).
WarpX uses the PICMI standard for its Python input files. Complete example input files can be found in the examples section.
In the input file, instances of classes are created defining the various aspects of the simulation.
A variable of type pywarpx.picmi.Simulation
is the central object to which all other options are passed, defining the simulation time, field solver, registered species, etc.
Once the simulation is fully configured, it can be used in one of two modes. Interactive use is the most common and can be extended with custom runtime functionality:
step(): run WarpX directly from Python
write_input_file(): write a text input file for the compiled WarpX executable
When run directly from Python, one can also extend WarpX with further custom user logic. See the detailed workflow page on how to extend WarpX from Python.
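A minimal sketch of this workflow, loosely following the uniform-plasma example above (the Species, UniformDistribution and GriddedLayout classes are part of the PICMI standard and not documented on this page; parameter values are illustrative):
from pywarpx import picmi

grid = picmi.Cartesian3DGrid(
    number_of_cells=[64, 32, 32],
    lower_bound=[-20.e-6, -20.e-6, -20.e-6],
    upper_bound=[20.e-6, 20.e-6, 20.e-6],
    lower_boundary_conditions=['periodic', 'periodic', 'periodic'],
    upper_boundary_conditions=['periodic', 'periodic', 'periodic'],
    warpx_max_grid_size=32)

solver = picmi.ElectromagneticSolver(grid=grid, method='Yee', cfl=1.0)

electrons = picmi.Species(
    particle_type='electron', name='electrons',
    initial_distribution=picmi.UniformDistribution(
        density=1.e25,
        rms_velocity=[0.01*picmi.constants.c]*3))

sim = picmi.Simulation(solver=solver, max_steps=10,
                       verbose=1, particle_shape='linear')
sim.add_species(electrons,
                layout=picmi.GriddedLayout(grid=grid,
                                           n_macroparticle_per_cell=[1, 1, 2]))

# Run directly from Python ...
sim.step(10)
# ... or only write an input file for the compiled executable:
# sim.write_input_file(file_name='inputs_3d_picmi')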
Simulation and Grid Setup
- class pywarpx.picmi.Simulation(solver=None, time_step_size=None, max_steps=None, max_time=None, verbose=None, particle_shape='linear', gamma_boost=None, cpu_split=None, load_balancing=None, **kw)[source]
Creates a Simulation object
- Parameters:
solver (field solver instance) – This is the field solver to be used in the simulation. It should be an instance of field solver classes.
time_step_size (float) – Absolute time step size of the simulation [s]. Needed if the CFL is not specified elsewhere.
max_steps (integer) – Maximum number of time steps. Specify either this, or max_time, or use the step function directly.
max_time (float) – Maximum physical time to run the simulation [s]. Specify either this, or max_steps, or use the step function directly.
verbose (integer, optional) – Verbosity flag. A larger integer results in more verbose output
particle_shape ({'NGP', 'linear', 'quadratic', 'cubic'}) – Default particle shape for species added to this simulation
gamma_boost (float, optional) – Lorentz factor of the boosted simulation frame. Note that all input values should be in the lab frame.
Implementation specific documentation
See Input Parameters for more information.
- Parameters:
warpx_current_deposition_algo ({'direct', 'esirkepov', 'vay'}, optional) – Current deposition algorithm. The default depends on conditions.
warpx_charge_deposition_algo ({'standard'}, optional) – Charge deposition algorithm.
warpx_field_gathering_algo ({'energy-conserving', 'momentum-conserving'}, optional) – Field gathering algorithm. The default depends on conditions.
warpx_particle_pusher_algo ({'boris', 'vay', 'higuera'}, default='boris') – Particle pushing algorithm.
warpx_use_filter (bool, optional) – Whether to use filtering. The default depends on the conditions.
warpx_do_multi_J (bool, default=0) – Whether to use the multi-J algorithm, where current deposition and field update are performed multiple times within each time step.
warpx_do_multi_J_n_depositions (integer) – Number of sub-steps to use with the multi-J algorithm, when warpx_do_multi_J=1. Note that this input parameter is not optional and must always be set in all input files where warpx.do_multi_J=1. No default value is provided automatically.
warpx_grid_type ({'collocated', 'staggered', 'hybrid'}, default='staggered') – Whether to use a collocated grid (all fields defined at the cell nodes), a staggered grid (fields defined on a Yee grid), or a hybrid grid (fields and currents are interpolated back and forth between a staggered grid and a collocated grid, must be used with momentum-conserving field gathering algorithm).
warpx_do_current_centering (bool, optional) – If true, the current is deposited on a nodal grid and then centered to a staggered grid (Yee grid), using finite-order interpolation. Default: warpx.do_current_centering=0 with collocated or staggered grids, warpx.do_current_centering=1 with hybrid grids.
warpx_field_centering_nox/noy/noz (integer, optional) – The order of interpolation used with staggered or hybrid grids (warpx_grid_type=staggered or warpx_grid_type=hybrid) and momentum-conserving field gathering (warpx_field_gathering_algo=momentum-conserving) to interpolate the electric and magnetic fields from the cell centers to the cell nodes, before gathering the fields from the cell nodes to the particle positions. Default: warpx_field_centering_no<x,y,z>=2 with staggered grids, warpx_field_centering_no<x,y,z>=8 with hybrid grids (typically necessary to ensure stability in boosted-frame simulations of relativistic plasmas and beams).
warpx_current_centering_nox/noy/noz (integer, optional) – The order of interpolation used with hybrid grids (warpx_grid_type=hybrid) to interpolate the currents from the cell nodes to the cell centers when warpx_do_current_centering=1, before pushing the Maxwell fields on staggered grids. Default: warpx_current_centering_no<x,y,z>=8 with hybrid grids (typically necessary to ensure stability in boosted-frame simulations of relativistic plasmas and beams).
warpx_serialize_initial_conditions (bool, default=False) – Controls the random numbers used for initialization. This parameter should only be used for testing and continuous integration.
warpx_random_seed (string or int, optional) – (See documentation)
warpx_do_dynamic_scheduling (bool, default=True) – Whether to do dynamic scheduling with OpenMP
warpx_load_balance_intervals (string, default='0') – The intervals for doing load balancing
warpx_load_balance_efficiency_ratio_threshold (float, default=1.1) – (See documentation)
warpx_load_balance_with_sfc (bool, default=0) – (See documentation)
warpx_load_balance_knapsack_factor (float, default=1.24) – (See documentation)
warpx_load_balance_costs_update ({'heuristic' or 'timers'}, optional) – (See documentation)
warpx_costs_heuristic_particles_wt (float, optional) – (See documentation)
warpx_costs_heuristic_cells_wt (float, optional) – (See documentation)
warpx_use_fdtd_nci_corr (bool, optional) – Whether to use the NCI correction when using the FDTD solver
warpx_amr_check_input (bool, optional) – Whether AMReX should perform checks on the input (primarily related to the max grid size and blocking factors)
warpx_amr_restart (string, optional) – The name of the restart to use
warpx_amrex_the_arena_is_managed (bool, optional) – Whether to use managed memory in the AMReX Arena
warpx_amrex_the_arena_init_size (long int, optional) – The amount of memory in bytes to allocate in the Arena.
warpx_amrex_use_gpu_aware_mpi (bool, optional) – Whether to use GPU-aware MPI communications
warpx_zmax_plasma_to_compute_max_step (float, optional) – Sets the simulation run time based on the maximum z value
warpx_compute_max_step_from_btd (bool, default=0) – If specified, automatically calculates the number of iterations required in the boosted frame for all back-transformed diagnostics to be completed.
warpx_collisions (collision instance, optional) – The collision instance specifying the particle collisions
warpx_embedded_boundary (embedded boundary instance, optional) –
warpx_break_signals (list of strings) – Signals on which to break
warpx_checkpoint_signals (list of strings) – Signals on which to write out a checkpoint
warpx_numprocs (list of ints (1 in 1D, 2 in 2D, 3 in 3D)) – Domain decomposition on the coarsest level. The domain will be chopped into the exact number of pieces in each dimension as specified by this parameter. https://warpx.readthedocs.io/en/latest/usage/parameters.html#distribution-across-mpi-ranks-and-parallelization https://warpx.readthedocs.io/en/latest/usage/domain_decomposition.html#simple-method
warpx_sort_intervals (string, optional (defaults: -1 on CPU; 4 on GPU)) – Using the Intervals parser syntax, this string defines the timesteps at which particles are sorted. If <=0, do not sort particles. It is turned on on GPUs for performance reasons (to improve memory locality).
warpx_sort_particles_for_deposition (bool, optional (default: true for the CUDA backend, otherwise false)) – This option controls the type of sorting used if particle sorting is turned on, i.e. if sort_intervals is not <=0. If true, particles will be sorted by cell to optimize deposition with many particles per cell, in the order x -> y -> z -> ppc. If false, particles will be sorted by bin, using the sort_bin_size parameter below, in the order ppc -> x -> y -> z. true is recommended for best performance on NVIDIA GPUs, especially if there are many particles per cell.
warpx_sort_idx_type (list of int, optional (default: 0 0 0)) –
This controls the type of grid used to sort the particles when sort_particles_for_deposition is true. Possible values are:
idx_type = {0, 0, 0}: Sort particles to a cell centered grid,
idx_type = {1, 1, 1}: Sort particles to a node centered grid,
idx_type = {2, 2, 2}: Compromise between a cell and node centered grid.
In 2D (XZ and RZ), only the first two elements are read. In 1D, only the first element is read.
warpx_sort_bin_size (list of int, optional (default 1 1 1)) – If sort_intervals is activated and sort_particles_for_deposition is false, particles are sorted in bins of sort_bin_size cells. In 2D, only the first two elements are read.
warpx_used_inputs_file (string, optional) – The name of the text file that the used input parameters are written to.
- add_applied_field(applied_field)
Add an applied field
- Parameters:
applied_field (applied field instance) – One of the applied field instance. Specifies the properties of the applied field.
- add_laser(laser, injection_method)
Add a laser pulse to be injected in the simulation
- Parameters:
laser_profile (laser instance) – One of laser profile instances. Specifies the physical properties of the laser pulse (e.g. spatial and temporal profile, wavelength, amplitude, etc.).
injection_method (laser injection instance, optional) – Specifies how the laser is injected (numerically) into the simulation (e.g. through a laser antenna, or directly added to the mesh). This argument describes an algorithm, not a physical object. It is up to each code to define the default method of injection, if the user does not provide injection_method.
- add_species(species, layout, initialize_self_field=None)
Add species to be used in the simulation
- Parameters:
species (species instance) – An instance of one of the PICMI species objects. Defines species to be added from the physical point of view (e.g. charge, mass, initial distribution of particles).
layout (layout instance) – An instance of one of the PICMI particle layout objects. Defines how particles are added into the simulation, from the numerical point of view.
initialize_self_field (bool, optional) – Whether the initial space-charge fields of this species are calculated and added to the simulation
- step(nsteps=None, mpi_comm=None)[source]
Run the simulation for nsteps timesteps
- Parameters:
nsteps (integer, default=1) – The number of timesteps
- write_input_file(file_name='inputs')[source]
Write the parameters of the simulation, as defined in the PICMI input, into a code-specific input file.
This can be used for codes that are not Python-driven (e.g. compiled, pure C++ or Fortran codes) and expect a text input in a given format.
- Parameters:
file_name (string) – The path to the file that will be created
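For example, given a fully configured Simulation object sim (the file name below is illustrative):
# Generate a text input file instead of running from Python
sim.write_input_file(file_name='inputs_3d_picmi')
# The file can then be run with the compiled executable, e.g.:  warpx.3d inputs_3d_picmi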
- class pywarpx.picmi.Cartesian3DGrid(number_of_cells=None, lower_bound=None, upper_bound=None, lower_boundary_conditions=None, upper_boundary_conditions=None, nx=None, ny=None, nz=None, xmin=None, xmax=None, ymin=None, ymax=None, zmin=None, zmax=None, bc_xmin=None, bc_xmax=None, bc_ymin=None, bc_ymax=None, bc_zmin=None, bc_zmax=None, moving_window_velocity=None, refined_regions=[], lower_bound_particles=None, upper_bound_particles=None, xmin_particles=None, xmax_particles=None, ymin_particles=None, ymax_particles=None, zmin_particles=None, zmax_particles=None, lower_boundary_conditions_particles=None, upper_boundary_conditions_particles=None, bc_xmin_particles=None, bc_xmax_particles=None, bc_ymin_particles=None, bc_ymax_particles=None, bc_zmin_particles=None, bc_zmax_particles=None, guard_cells=None, pml_cells=None, **kw)[source]
Three-dimensional Cartesian grid. Parameters can be specified either as vectors or separately. (If both are specified, the vector is used.)
- Parameters:
number_of_cells (vector of integers) – Number of cells along each axis (number of nodes is number_of_cells+1)
lower_bound (vector of floats) – Position of the node at the lower bound [m]
upper_bound (vector of floats) – Position of the node at the upper bound [m]
lower_boundary_conditions (vector of strings) – Conditions at lower boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann
upper_boundary_conditions (vector of strings) – Conditions at upper boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann
nx (integer) – Number of cells along X (number of nodes=nx+1)
ny (integer) – Number of cells along Y (number of nodes=ny+1)
nz (integer) – Number of cells along Z (number of nodes=nz+1)
xmin (float) – Position of first node along X [m]
xmax (float) – Position of last node along X [m]
ymin (float) – Position of first node along Y [m]
ymax (float) – Position of last node along Y [m]
zmin (float) – Position of first node along Z [m]
zmax (float) – Position of last node along Z [m]
bc_xmin (string) – Boundary condition at min X: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
bc_xmax (string) – Boundary condition at max X: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
bc_ymin (string) – Boundary condition at min Y: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
bc_ymax (string) – Boundary condition at max Y: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
bc_zmin (string) – Boundary condition at min Z: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
bc_zmax (string) – Boundary condition at max Z: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
moving_window_velocity (vector of floats, optional) – Moving frame velocity [m/s]
refined_regions (list of lists, optional) – List of refined regions, each element being a list of the format [level, lo, hi, refinement_factor], with level being the refinement level, with 1 being the first level of refinement, 2 being the second etc, lo and hi being vectors of length 3 specifying the extent of the region, and refinement_factor defaulting to [2,2,2] (relative to next lower level)
lower_bound_particles (vector of floats, optional) – Position of particle lower bound [m]
upper_bound_particles (vector of floats, optional) – Position of particle upper bound [m]
xmin_particles (float, optional) – Position of min particle boundary along X [m]
xmax_particles (float, optional) – Position of max particle boundary along X [m]
ymin_particles (float, optional) – Position of min particle boundary along Y [m]
ymax_particles (float, optional) – Position of max particle boundary along Y [m]
zmin_particles (float, optional) – Position of min particle boundary along Z [m]
zmax_particles (float, optional) – Position of max particle boundary along Z [m]
lower_boundary_conditions_particles (vector of strings, optional) – Conditions at lower boundaries for particles, periodic, absorbing, reflect or thermal
upper_boundary_conditions_particles (vector of strings, optional) – Conditions at upper boundaries for particles, periodic, absorbing, reflect or thermal
bc_xmin_particles (string, optional) – Boundary condition at min X for particles: One of periodic, absorbing, reflect, thermal
bc_xmax_particles (string, optional) – Boundary condition at max X for particles: One of periodic, absorbing, reflect, thermal
bc_ymin_particles (string, optional) – Boundary condition at min Y for particles: One of periodic, absorbing, reflect, thermal
bc_ymax_particles (string, optional) – Boundary condition at max Y for particles: One of periodic, absorbing, reflect, thermal
bc_zmin_particles (string, optional) – Boundary condition at min Z for particles: One of periodic, absorbing, reflect, thermal
bc_zmax_particles (string, optional) – Boundary condition at max Z for particles: One of periodic, absorbing, reflect, thermal
guard_cells (vector of integers, optional) – Number of guard cells used along each direction
pml_cells (vector of integers, optional) – Number of Perfectly Matched Layer (PML) cells along each direction
References
absorbing_silver_mueller: A local absorbing boundary condition that works best under normal incidence angle. Based on the Silver-Mueller Radiation Condition, e.g., in:
A. K. Belhora and L. Pichon, “Maybe Efficient Absorbing Boundary Conditions for the Finite Element Solution of 3D Scattering Problems,” 1995
B. Engquist and A. Majda, “Absorbing boundary conditions for numerical simulation of waves,” 1977, https://doi.org/10.1073/pnas.74.5.1765
R. Lehe, “Electromagnetic wave propagation in Particle-In-Cell codes,” 2016, US Particle Accelerator School (USPAS) Summer Session, Self-Consistent Simulations of Beam and Plasma Systems, https://people.nscl.msu.edu/~lund/uspas/scs_2016/lec_adv/A1b_EM_Waves.pdf
Implementation specific documentation
See Input Parameters for more information.
- Parameters:
warpx_max_grid_size (integer, default=32) – Maximum block size in either direction
warpx_max_grid_size_x (integer, optional) – Maximum block size in x direction
warpx_max_grid_size_y (integer, optional) – Maximum block size in y direction
warpx_max_grid_size_z (integer, optional) – Maximum block size in z direction
warpx_blocking_factor (integer, optional) – Blocking factor (which controls the block size)
warpx_blocking_factor_x (integer, optional) – Blocking factor (which controls the block size) in the x direction
warpx_blocking_factor_y (integer, optional) – Blocking factor (which controls the block size) in the y direction
warpx_blocking_factor_z (integer, optional) – Blocking factor (which controls the block size) in the z direction
warpx_potential_lo_x (float, default=0.) – Electrostatic potential on the lower x boundary
warpx_potential_hi_x (float, default=0.) – Electrostatic potential on the upper x boundary
warpx_potential_lo_y (float, default=0.) – Electrostatic potential on the lower y boundary
warpx_potential_hi_y (float, default=0.) – Electrostatic potential on the upper y boundary
warpx_potential_lo_z (float, default=0.) – Electrostatic potential on the lower z boundary
warpx_potential_hi_z (float, default=0.) – Electrostatic potential on the upper z boundary
warpx_start_moving_window_step (int, default=0) – The timestep at which the moving window starts
warpx_end_moving_window_step (int, default=-1) – The timestep at which the moving window ends. If -1, the moving window will continue until the end of the simulation.
warpx_boundary_u_th (dict, default=None) – If a thermal boundary is used for particles, this dictionary should specify the thermal speed for each species in the form {<species>: u_th}. Note: u_th = sqrt(T*q_e/mass)/clight with T in eV.
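As a sketch, a 3D grid roughly equivalent to the uniform-plasma input file above; the warpx_ keywords correspond to amr.max_grid_size and amr.blocking_factor:
from pywarpx import picmi

grid = picmi.Cartesian3DGrid(
    number_of_cells=[64, 32, 32],
    lower_bound=[-20.e-6, -20.e-6, -20.e-6],
    upper_bound=[20.e-6, 20.e-6, 20.e-6],
    lower_boundary_conditions=['periodic', 'periodic', 'periodic'],
    upper_boundary_conditions=['periodic', 'periodic', 'periodic'],
    lower_boundary_conditions_particles=['periodic', 'periodic', 'periodic'],
    upper_boundary_conditions_particles=['periodic', 'periodic', 'periodic'],
    warpx_max_grid_size=32,
    warpx_blocking_factor=16)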
- class pywarpx.picmi.Cartesian2DGrid(number_of_cells=None, lower_bound=None, upper_bound=None, lower_boundary_conditions=None, upper_boundary_conditions=None, nx=None, ny=None, xmin=None, xmax=None, ymin=None, ymax=None, bc_xmin=None, bc_xmax=None, bc_ymin=None, bc_ymax=None, moving_window_velocity=None, refined_regions=[], lower_bound_particles=None, upper_bound_particles=None, xmin_particles=None, xmax_particles=None, ymin_particles=None, ymax_particles=None, lower_boundary_conditions_particles=None, upper_boundary_conditions_particles=None, bc_xmin_particles=None, bc_xmax_particles=None, bc_ymin_particles=None, bc_ymax_particles=None, guard_cells=None, pml_cells=None, **kw)[source]
Two-dimensional Cartesian grid. Parameters can be specified either as vectors or separately. (If both are specified, the vector is used.)
- Parameters:
number_of_cells (vector of integers) – Number of cells along each axis (number of nodes is number_of_cells+1)
lower_bound (vector of floats) – Position of the node at the lower bound [m]
upper_bound (vector of floats) – Position of the node at the upper bound [m]
lower_boundary_conditions (vector of strings) – Conditions at lower boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann
upper_boundary_conditions (vector of strings) – Conditions at upper boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann
nx (integer) – Number of cells along X (number of nodes=nx+1)
ny (integer) – Number of cells along Y (number of nodes=ny+1)
xmin (float) – Position of first node along X [m]
xmax (float) – Position of last node along X [m]
ymin (float) – Position of first node along Y [m]
ymax (float) – Position of last node along Y [m]
bc_xmin (vector of strings) – Boundary condition at min X: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
bc_xmax (vector of strings) – Boundary condition at max X: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
bc_ymin (vector of strings) – Boundary condition at min Y: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
bc_ymax (vector of strings) – Boundary condition at max Y: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
moving_window_velocity (vector of floats, optional) – Moving frame velocity [m/s]
refined_regions (list of lists, optional) – List of refined regions, each element being a list of the format [level, lo, hi, refinement_factor], with level being the refinement level, with 1 being the first level of refinement, 2 being the second etc, lo and hi being vectors of length 2 specifying the extent of the region, and refinement_factor defaulting to [2,2] (relative to next lower level)
lower_bound_particles (vector of floats, optional) – Position of particle lower bound [m]
upper_bound_particles (vector of floats, optional) – Position of particle upper bound [m]
xmin_particles (float, optional) – Position of min particle boundary along X [m]
xmax_particles (float, optional) – Position of max particle boundary along X [m]
ymin_particles (float, optional) – Position of min particle boundary along Y [m]
ymax_particles (float, optional) – Position of max particle boundary along Y [m]
lower_boundary_conditions_particles (vector of strings, optional) – Conditions at lower boundaries for particles, periodic, absorbing, reflect or thermal
upper_boundary_conditions_particles (vector of strings, optional) – Conditions at upper boundaries for particles, periodic, absorbing, reflect or thermal
bc_xmin_particles (string, optional) – Boundary condition at min X for particles: One of periodic, absorbing, reflect, thermal
bc_xmax_particles (string, optional) – Boundary condition at max X for particles: One of periodic, absorbing, reflect, thermal
bc_ymin_particles (string, optional) – Boundary condition at min Y for particles: One of periodic, absorbing, reflect, thermal
bc_ymax_particles (string, optional) – Boundary condition at max Y for particles: One of periodic, absorbing, reflect, thermal
guard_cells (vector of integers, optional) – Number of guard cells used along each direction
pml_cells (vector of integers, optional) – Number of Perfectly Matched Layer (PML) cells along each direction
References
absorbing_silver_mueller: A local absorbing boundary condition that works best under normal incidence angle. Based on the Silver-Mueller Radiation Condition, e.g., in:
A. K. Belhora and L. Pichon, “Maybe Efficient Absorbing Boundary Conditions for the Finite Element Solution of 3D Scattering Problems,” 1995
B. Engquist and A. Majda, “Absorbing boundary conditions for numerical simulation of waves,” 1977, https://doi.org/10.1073/pnas.74.5.1765
R. Lehe, “Electromagnetic wave propagation in Particle-In-Cell codes,” 2016, US Particle Accelerator School (USPAS) Summer Session, Self-Consistent Simulations of Beam and Plasma Systems, https://people.nscl.msu.edu/~lund/uspas/scs_2016/lec_adv/A1b_EM_Waves.pdf
Implementation specific documentation
See Input Parameters for more information.
- Parameters:
warpx_max_grid_size (integer, default=32) – Maximum block size in either direction
warpx_max_grid_size_x (integer, optional) – Maximum block size in x direction
warpx_max_grid_size_y (integer, optional) – Maximum block size in z direction
warpx_blocking_factor (integer, optional) – Blocking factor (which controls the block size)
warpx_blocking_factor_x (integer, optional) – Blocking factor (which controls the block size) in the x direction
warpx_blocking_factor_y (integer, optional) – Blocking factor (which controls the block size) in the z direction
warpx_potential_lo_x (float, default=0.) – Electrostatic potential on the lower x boundary
warpx_potential_hi_x (float, default=0.) – Electrostatic potential on the upper x boundary
warpx_potential_lo_z (float, default=0.) – Electrostatic potential on the lower z boundary
warpx_potential_hi_z (float, default=0.) – Electrostatic potential on the upper z boundary
warpx_start_moving_window_step (int, default=0) – The timestep at which the moving window starts
warpx_end_moving_window_step (int, default=-1) – The timestep at which the moving window ends. If -1, the moving window will continue until the end of the simulation.
warpx_boundary_u_th (dict, default=None) – If a thermal boundary is used for particles, this dictionary should specify the thermal speed for each species in the form {<species>: u_th}. Note: u_th = sqrt(T*q_e/mass)/clight with T in eV.
- class pywarpx.picmi.Cartesian1DGrid(number_of_cells=None, lower_bound=None, upper_bound=None, lower_boundary_conditions=None, upper_boundary_conditions=None, nx=None, xmin=None, xmax=None, bc_xmin=None, bc_xmax=None, moving_window_velocity=None, refined_regions=[], lower_bound_particles=None, upper_bound_particles=None, xmin_particles=None, xmax_particles=None, lower_boundary_conditions_particles=None, upper_boundary_conditions_particles=None, bc_xmin_particles=None, bc_xmax_particles=None, guard_cells=None, pml_cells=None, **kw)[source]
One-dimensional Cartesian grid. Parameters can be specified either as vectors or separately. (If both are specified, the vector is used.)
- Parameters:
number_of_cells (vector of integers) – Number of cells along each axis (number of nodes is number_of_cells+1)
lower_bound (vector of floats) – Position of the node at the lower bound [m]
upper_bound (vector of floats) – Position of the node at the upper bound [m]
lower_boundary_conditions (vector of strings) – Conditions at lower boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann
upper_boundary_conditions (vector of strings) – Conditions at upper boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann
nx (integer) – Number of cells along X (number of nodes=nx+1)
xmin (float) – Position of first node along X [m]
xmax (float) – Position of last node along X [m]
bc_xmin (vector of strings) – Boundary condition at min X: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
bc_xmax (vector of strings) – Boundary condition at max X: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
moving_window_velocity (vector of floats, optional) – Moving frame velocity [m/s]
refined_regions (list of lists, optional) – List of refined regions, each element being a list of the format [level, lo, hi, refinement_factor], with level being the refinement level, with 1 being the first level of refinement, 2 being the second etc, lo and hi being vectors of length 2 specifying the extent of the region, and refinement_factor defaulting to [2,2] (relative to next lower level)
lower_bound_particles (vector of floats, optional) – Position of particle lower bound [m]
upper_bound_particles (vector of floats, optional) – Position of particle upper bound [m]
xmin_particles (float, optional) – Position of min particle boundary along X [m]
xmax_particles (float, optional) – Position of max particle boundary along X [m]
lower_boundary_conditions_particles (vector of strings, optional) – Conditions at lower boundaries for particles, periodic, absorbing, reflect or thermal
upper_boundary_conditions_particles (vector of strings, optional) – Conditions at upper boundaries for particles, periodic, absorbing, reflect or thermal
bc_xmin_particles (string, optional) – Boundary condition at min X for particles: One of periodic, absorbing, reflect, thermal
bc_xmax_particles (string, optional) – Boundary condition at max X for particles: One of periodic, absorbing, reflect, thermal
guard_cells (vector of integers, optional) – Number of guard cells used along each direction
pml_cells (vector of integers, optional) – Number of Perfectly Matched Layer (PML) cells along each direction
References
absorbing_silver_mueller: A local absorbing boundary condition that works best under normal incidence angle. Based on the Silver-Mueller Radiation Condition, e.g., in:
A. K. Belhora and L. Pichon, “Maybe Efficient Absorbing Boundary Conditions for the Finite Element Solution of 3D Scattering Problems,” 1995
B. Engquist and A. Majda, “Absorbing boundary conditions for numerical simulation of waves,” 1977, https://doi.org/10.1073/pnas.74.5.1765
R. Lehe, “Electromagnetic wave propagation in Particle-In-Cell codes,” 2016, US Particle Accelerator School (USPAS) Summer Session, Self-Consistent Simulations of Beam and Plasma Systems, https://people.nscl.msu.edu/~lund/uspas/scs_2016/lec_adv/A1b_EM_Waves.pdf
Implementation specific documentation
See Input Parameters for more information.
- Parameters:
warpx_max_grid_size (integer, default=32) – Maximum block size in either direction
warpx_max_grid_size_x (integer, optional) – Maximum block size in longitudinal direction
warpx_blocking_factor (integer, optional) – Blocking factor (which controls the block size)
warpx_blocking_factor_x (integer, optional) – Blocking factor (which controls the block size) in the longitudinal direction
warpx_potential_lo_z (float, default=0.) – Electrostatic potential on the lower longitudinal boundary
warpx_potential_hi_z (float, default=0.) – Electrostatic potential on the upper longitudinal boundary
warpx_start_moving_window_step (int, default=0) – The timestep at which the moving window starts
warpx_end_moving_window_step (int, default=-1) – The timestep at which the moving window ends. If -1, the moving window will continue until the end of the simulation.
warpx_boundary_u_th (dict, default=None) – If a thermal boundary is used for particles, this dictionary should specify the thermal speed for each species in the form {<species>: u_th}. Note: u_th = sqrt(T*q_e/mass)/clight with T in eV.
- class pywarpx.picmi.CylindricalGrid(number_of_cells=None, lower_bound=None, upper_bound=None, lower_boundary_conditions=None, upper_boundary_conditions=None, nr=None, nz=None, n_azimuthal_modes=None, rmin=None, rmax=None, zmin=None, zmax=None, bc_rmin=None, bc_rmax=None, bc_zmin=None, bc_zmax=None, moving_window_velocity=None, refined_regions=[], lower_bound_particles=None, upper_bound_particles=None, rmin_particles=None, rmax_particles=None, zmin_particles=None, zmax_particles=None, lower_boundary_conditions_particles=None, upper_boundary_conditions_particles=None, bc_rmin_particles=None, bc_rmax_particles=None, bc_zmin_particles=None, bc_zmax_particles=None, guard_cells=None, pml_cells=None, **kw)[source]
Axisymmetric, cylindrical grid. Parameters can be specified either as vectors or separately. (If both are specified, the vector is used.)
- Parameters:
number_of_cells (vector of integers) – Number of cells along each axis (number of nodes is number_of_cells+1)
lower_bound (vector of floats) – Position of the node at the lower bound [m]
upper_bound (vector of floats) – Position of the node at the upper bound [m]
lower_boundary_conditions (vector of strings) – Conditions at lower boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann
upper_boundary_conditions (vector of strings) – Conditions at upper boundaries, periodic, open, dirichlet, absorbing_silver_mueller, or neumann
nr (integer) – Number of cells along R (number of nodes=nr+1)
nz (integer) – Number of cells along Z (number of nodes=nz+1)
n_azimuthal_modes (integer) – Number of azimuthal modes
rmin (float) – Position of first node along R [m]
rmax (float) – Position of last node along R [m]
zmin (float) – Position of first node along Z [m]
zmax (float) – Position of last node along Z [m]
bc_rmin (vector of strings) – Boundary condition at min R: One of open, dirichlet, absorbing_silver_mueller, or neumann
bc_rmax (vector of strings) – Boundary condition at max R: One of open, dirichlet, absorbing_silver_mueller, or neumann
bc_zmin (vector of strings) – Boundary condition at min Z: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
bc_zmax (vector of strings) – Boundary condition at max Z: One of periodic, open, dirichlet, absorbing_silver_mueller, or neumann
moving_window_velocity (vector of floats, optional) – Moving frame velocity [m/s]
refined_regions (list of lists, optional) – List of refined regions, each element being a list of the format [level, lo, hi, refinement_factor], with level being the refinement level, with 1 being the first level of refinement, 2 being the second etc, lo and hi being vectors of length 2 specifying the extent of the region, and refinement_factor defaulting to [2,2] (relative to next lower level)
lower_bound_particles (vector of floats, optional) – Position of particle lower bound [m]
upper_bound_particles (vector of floats, optional) – Position of particle upper bound [m]
rmin_particles (float, optional) – Position of min particle boundary along R [m]
rmax_particles (float, optional) – Position of max particle boundary along R [m]
zmin_particles (float, optional) – Position of min particle boundary along Z [m]
zmax_particles (float, optional) – Position of max particle boundary along Z [m]
lower_boundary_conditions_particles (vector of strings, optional) – Conditions at lower boundaries for particles, periodic, absorbing, reflect or thermal
upper_boundary_conditions_particles (vector of strings, optional) – Conditions at upper boundaries for particles, periodic, absorbing, reflect or thermal
bc_rmin_particles (string, optional) – Boundary condition at min R for particles: One of periodic, absorbing, reflect, thermal
bc_rmax_particles (string, optional) – Boundary condition at max R for particles: One of periodic, absorbing, reflect, thermal
bc_zmin_particles (string, optional) – Boundary condition at min Z for particles: One of periodic, absorbing, reflect, thermal
bc_zmax_particles (string, optional) – Boundary condition at max Z for particles: One of periodic, absorbing, reflect, thermal
guard_cells (vector of integers, optional) – Number of guard cells used along each direction
pml_cells (vector of integers, optional) – Number of Perfectly Matched Layer (PML) cells along each direction
References
absorbing_silver_mueller: A local absorbing boundary condition that works best under normal incidence angle. Based on the Silver-Mueller Radiation Condition, e.g., in:
A. K. Belhora and L. Pichon, “Maybe Efficient Absorbing Boundary Conditions for the Finite Element Solution of 3D Scattering Problems,” 1995
B. Engquist and A. Majda, “Absorbing boundary conditions for numerical simulation of waves,” 1977, https://doi.org/10.1073/pnas.74.5.1765
R. Lehe, “Electromagnetic wave propagation in Particle-In-Cell codes,” 2016, US Particle Accelerator School (USPAS) Summer Session, Self-Consistent Simulations of Beam and Plasma Systems, https://people.nscl.msu.edu/~lund/uspas/scs_2016/lec_adv/A1b_EM_Waves.pdf
Implementation specific documentation
This assumes that WarpX was compiled with USE_RZ = TRUE
See Input Parameters for more information.
- Parameters:
warpx_max_grid_size (integer, default=32) – Maximum block size in either direction
warpx_max_grid_size_x (integer, optional) – Maximum block size in radial direction
warpx_max_grid_size_y (integer, optional) – Maximum block size in longitudinal direction
warpx_blocking_factor (integer, optional) – Blocking factor (which controls the block size)
warpx_blocking_factor_x (integer, optional) – Blocking factor (which controls the block size) in the radial direction
warpx_blocking_factor_y (integer, optional) – Blocking factor (which controls the block size) in the longitudinal direction
warpx_potential_lo_r (float, default=0.) – Electrostatic potential on the lower radial boundary
warpx_potential_hi_r (float, default=0.) – Electrostatic potential on the upper radial boundary
warpx_potential_lo_z (float, default=0.) – Electrostatic potential on the lower longitudinal boundary
warpx_potential_hi_z (float, default=0.) – Electrostatic potential on the upper longitudinal boundary
warpx_reflect_all_velocities (bool, default=False) – Whether the signs of all of the particle velocity components are changed upon reflection at a boundary, or only the velocity normal to the surface
warpx_start_moving_window_step (int, default=0) – The timestep at which the moving window starts
warpx_end_moving_window_step (int, default=-1) – The timestep at which the moving window ends. If -1, the moving window will continue until the end of the simulation.
warpx_boundary_u_th (dict, default=None) – If a thermal boundary is used for particles, this dictionary should specify the thermal speed for each species in the form {<species>: u_th}. Note: u_th = sqrt(T*q_e/mass)/clight with T in eV.
- class pywarpx.picmi.EmbeddedBoundary(implicit_function=None, stl_file=None, stl_scale=None, stl_center=None, stl_reverse_normal=False, potential=None, cover_multiple_cuts=None, **kw)[source]
Custom class to handle the setup of embedded boundaries specific to WarpX. If embedded boundary initialization is added to picmistandard, this can be changed to inherit that functionality. The geometry can be specified either as an implicit function or as an STL file (ASCII or binary). In the latter case the geometry specified in the STL file can be scaled, translated, and inverted.
- Parameters:
implicit_function (string) – Analytic expression describing the embedded boundary
stl_file (string) – STL file path (string), file contains the embedded boundary geometry
stl_scale (float) – Factor by which the STL geometry is scaled
stl_center (vector of floats) – Vector by which the STL geometry is translated (in meters)
stl_reverse_normal (bool) – If True inverts the orientation of the STL geometry
potential (string, default=0.) – Analytic expression defining the potential. Can only be specified when the solver is electrostatic.
cover_multiple_cuts (bool, default=None) – Whether to cover cells with multiple cuts. (If False, this will raise an error if some cells have multiple cuts)
Parameters used in the analytic expressions should be given as additional keyword arguments.
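A brief sketch, assuming a spherical boundary described by an implicit function; radius is an extra keyword made available inside the expressions, and the sign convention of the implicit function is described in the Input Parameters documentation:
from pywarpx import picmi

embedded_boundary = picmi.EmbeddedBoundary(
    implicit_function="-(x**2 + y**2 + z**2 - radius**2)",
    potential="0.",
    radius=0.1)

# Passed to the simulation via the warpx_embedded_boundary argument:
# sim = picmi.Simulation(solver=solver, warpx_embedded_boundary=embedded_boundary, ...)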
Field solvers
Field solvers define the updates of electric and magnetic fields.
- class pywarpx.picmi.ElectromagneticSolver(grid, method=None, stencil_order=None, cfl=None, source_smoother=None, field_smoother=None, subcycling=None, galilean_velocity=None, divE_cleaning=None, divB_cleaning=None, pml_divE_cleaning=None, pml_divB_cleaning=None, **kw)[source]
Electromagnetic field solver
- Parameters:
grid (grid instance) – Grid object for the diagnostic
method ({'Yee', 'CKC', 'Lehe', 'PSTD', 'PSATD', 'GPSTD', 'DS', 'ECT'}) –
The advance method used to solve Maxwell’s equations. The default method is code dependent.
’Yee’: standard solver using the staggered Yee grid (https://doi.org/10.1109/TAP.1966.1138693)
’CKC’: solver with the extended Cole-Karkkainen-Cowan stencil with better dispersion properties (https://doi.org/10.1103/PhysRevSTAB.16.041303)
’Lehe’: CKC-style solver with modified dispersion (https://doi.org/10.1103/PhysRevSTAB.16.021301)
’PSTD’: Spectral solver with finite difference in time domain, e.g., Q. H. Liu, Microwave and Optical Technology Letters 15 (3) (1997) 158–165
’PSATD’: Spectral solver with analytic in time domain (https://doi.org/10.1016/j.jcp.2013.03.010)
’DS’: Directional Splitting after Yasuhiko Sentoku (https://doi.org/10.1140/epjd/e2014-50162-y)
’ECT’: Enlarged Cell Technique solver, allowing internal conductors (https://doi.org/10.1109/APS.2005.1551259)
stencil_order (vector of integers) – Order of stencil for each axis (-1=infinite)
cfl (float, optional) – Fraction of the Courant-Friedrichs-Lewy criterion [1]
source_smoother (smoother instance, optional) – Smoother object to apply to the sources
field_smoother (smoother instance, optional) – Smoother object to apply to the fields
subcycling (integer, optional) – Level of subcycling for the GPSTD solver
galilean_velocity (vector of floats, optional) – Velocity of Galilean reference frame [m/s]
divE_cleaning (bool, optional) – Solver uses div(E) cleaning if True
divB_cleaning (bool, optional) – Solver uses div(B) cleaning if True
pml_divE_cleaning (bool, optional) – Solver uses div(E) cleaning in the PML if True
pml_divB_cleaning (bool, optional) – Solver uses div(B) cleaning in the PML if True
Implementation specific documentation
See Input Parameters for more information.
- Parameters:
warpx_pml_ncell (integer, optional) – The depth of the PML, in number of cells
warpx_periodic_single_box_fft (bool, default=False) – Whether to do the spectral solver FFTs assuming a single simulation block
warpx_current_correction (bool, default=True) – Whether to do the current correction for the spectral solver. See documentation for exceptions to the default value.
warpx_psatd_update_with_rho (bool, optional) – Whether to update with the actual rho for the spectral solver
warpx_psatd_do_time_averaging (bool, optional) – Whether to do the time averaging for the spectral solver
warpx_psatd_J_in_time ({'constant', 'linear'}, default='constant') – This determines whether the current density is assumed to be constant or linear in time, within the time step over which the electromagnetic fields are evolved.
warpx_psatd_rho_in_time ({'linear'}, default='linear') – This determines whether the charge density is assumed to be linear in time, within the time step over which the electromagnetic fields are evolved.
warpx_do_pml_in_domain (bool, default=False) – Whether to do the PML boundaries within the domain (versus in the guard cells)
warpx_pml_has_particles (bool, default=False) – Whether to allow particles in the PML region
warpx_do_pml_j_damping (bool, default=False) – Whether to do damping of J in the PML
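A short sketch with illustrative parameters: a standard Yee solver near the CFL limit with a 10-cell PML, on a simple open-boundary grid:
from pywarpx import picmi

grid = picmi.Cartesian3DGrid(
    number_of_cells=[64, 64, 64],
    lower_bound=[-20.e-6, -20.e-6, -20.e-6],
    upper_bound=[20.e-6, 20.e-6, 20.e-6],
    lower_boundary_conditions=['open', 'open', 'open'],
    upper_boundary_conditions=['open', 'open', 'open'])

solver = picmi.ElectromagneticSolver(grid=grid, method='Yee', cfl=0.999,
                                     warpx_pml_ncell=10)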
- class pywarpx.picmi.ElectrostaticSolver(grid, method=None, required_precision=None, maximum_iterations=None, **kw)[source]
Electrostatic field solver
- Parameters:
grid (grid instance) – Grid object for the diagnostic
method (string) – One of ‘FFT’ or ‘Multigrid’
required_precision (float, optional) – Level of precision required for iterative solvers
maximum_iterations (integer, optional) – Maximum number of iterations for iterative solvers
Implementation specific documentation
See Input Parameters for more information.
- Parameters:
warpx_relativistic (bool, default=False) – Whether to use the relativistic solver or lab frame solver
warpx_absolute_tolerance (float, default=0.) – Absolute tolerance on the lab frame solver
warpx_self_fields_verbosity (integer, default=2) – Level of verbosity for the lab frame solver
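A short sketch with illustrative values, using Dirichlet boundaries with fixed potentials on the z boundaries of a 3D grid:
from pywarpx import picmi

grid = picmi.Cartesian3DGrid(
    number_of_cells=[64, 64, 64],
    lower_bound=[-1.e-3, -1.e-3, -1.e-3],
    upper_bound=[1.e-3, 1.e-3, 1.e-3],
    lower_boundary_conditions=['dirichlet', 'dirichlet', 'dirichlet'],
    upper_boundary_conditions=['dirichlet', 'dirichlet', 'dirichlet'],
    warpx_potential_lo_z=-100.,
    warpx_potential_hi_z=100.)

solver = picmi.ElectrostaticSolver(grid=grid, method='Multigrid',
                                   required_precision=1.e-7,
                                   warpx_self_fields_verbosity=0)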
Constants
For convenience, the PICMI interface defines the following constants, which can be used directly inside any PICMI script. The values are in SI units.
picmi.constants.c: The speed of light in vacuum.
picmi.constants.ep0: The vacuum permittivity \(\epsilon_0\).
picmi.constants.mu0: The vacuum permeability \(\mu_0\).
picmi.constants.q_e: The elementary charge (absolute value of the charge of an electron).
picmi.constants.m_e: The electron mass.
picmi.constants.m_p: The proton mass.
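For example, the constants can be used to derive physical quantities inside a PICMI script (the plasma-frequency formula below is standard; the density value is illustrative):
import numpy as np
from pywarpx import picmi

n0 = 1.e25  # electron density [m^-3]
omega_p = np.sqrt(n0 * picmi.constants.q_e**2
                  / (picmi.constants.m_e * picmi.constants.ep0))
plasma_wavelength = 2*np.pi*picmi.constants.c / omega_p  # corresponding plasma wavelength [m]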
Applied fields
Instances of the classes below need to be passed to the method add_applied_field of the Simulation class.
- class pywarpx.picmi.AnalyticInitialField(Ex_expression=None, Ey_expression=None, Ez_expression=None, Bx_expression=None, By_expression=None, Bz_expression=None, lower_bound=[None, None, None], upper_bound=[None, None, None], **kw)[source]
Describes an analytic applied field
The expressions should be in terms of the position and time, written as ‘x’, ‘y’, ‘z’, ‘t’. Parameters can be used in the expression with the values given as additional keyword arguments. Expressions should be relative to the lab frame.
- Parameters:
Ex_expression (string, optional) – Analytic expression describing Ex field [V/m]
Ey_expression (string, optional) – Analytic expression describing Ey field [V/m]
Ez_expression (string, optional) – Analytic expression describing Ez field [V/m]
Bx_expression (string, optional) – Analytic expression describing Bx field [T]
By_expression (string, optional) – Analytic expression describing By field [T]
Bz_expression (string, optional) – Analytic expression describing Bz field [T]
lower_bound (vector, optional) – Lower bound of the region where the field is applied [m].
upper_bound (vector, optional) – Upper bound of the region where the field is applied [m]
- class pywarpx.picmi.ConstantAppliedField(Ex=None, Ey=None, Ez=None, Bx=None, By=None, Bz=None, lower_bound=[None, None, None], upper_bound=[None, None, None], **kw)[source]
Describes a constant applied field
- Parameters:
Ex (float, default=0.) – Constant Ex field [V/m]
Ey (float, default=0.) – Constant Ey field [V/m]
Ez (float, default=0.) – Constant Ez field [V/m]
Bx (float, default=0.) – Constant Bx field [T]
By (float, default=0.) – Constant By field [T]
Bz (float, default=0.) – Constant Bz field [T]
lower_bound (vector, optional) – Lower bound of the region where the field is applied [m].
upper_bound (vector, optional) – Upper bound of the region where the field is applied [m]
- class pywarpx.picmi.AnalyticAppliedField(Ex_expression=None, Ey_expression=None, Ez_expression=None, Bx_expression=None, By_expression=None, Bz_expression=None, lower_bound=[None, None, None], upper_bound=[None, None, None], **kw)[source]
Describes an analytic applied field
The expressions should be in terms of the position and time, written as ‘x’, ‘y’, ‘z’, ‘t’. Parameters can be used in the expression with the values given as additional keyword arguments. Expressions should be relative to the lab frame.
- Parameters:
Ex_expression (string, optional) – Analytic expression describing Ex field [V/m]
Ey_expression (string, optional) – Analytic expression describing Ey field [V/m]
Ez_expression (string, optional) – Analytic expression describing Ez field [V/m]
Bx_expression (string, optional) – Analytic expression describing Bx field [T]
By_expression (string, optional) – Analytic expression describing By field [T]
Bz_expression (string, optional) – Analytic expression describing Bz field [T]
lower_bound (vector, optional) – Lower bound of the region where the field is applied [m].
upper_bound (vector, optional) – Upper bound of the region where the field is applied [m]
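A brief sketch of an analytic applied field; Bmax and Lramp are user-defined keyword parameters made available inside the expression, and sim stands for a previously configured Simulation:
from pywarpx import picmi

applied_field = picmi.AnalyticAppliedField(
    Bz_expression='Bmax*tanh(z/Lramp)',
    lower_bound=[None, None, 0.],
    Bmax=2.0, Lramp=1.e-3)

# sim.add_applied_field(applied_field)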
- class pywarpx.picmi.PlasmaLens(period, starts, lengths, strengths_E=None, strengths_B=None, **kw)[source]
Custom class to set up a plasma lens lattice. The applied fields depend only on the transverse position.
- Parameters:
period (float) – Periodicity of the lattice (in lab frame, in meters)
starts (list of floats) – The start of each lens relative to the periodic repeat
lengths (list of floats) – The length of each lens
strengths_E=None (list of floats, default = 0.) – The electric field strength of each lens
strengths_B=None (list of floats, default = 0.) – The magnetic field strength of each lens
The field that is applied depends on the transverse position of the particle, (x,y)
Ex = x*strengths_E
Ey = y*strengths_E
Bx = +y*strengths_B
By = -x*strengths_B
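A brief sketch of a lattice with a 0.5 m period containing a single 2 cm electric lens per period (values illustrative):
from pywarpx import picmi

plasma_lens = picmi.PlasmaLens(period=0.5,
                               starts=[0.1],
                               lengths=[0.02],
                               strengths_E=[1.e9])

# Registered with the simulation like any other applied field:
# sim.add_applied_field(plasma_lens)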
- class pywarpx.picmi.Mirror(x_front_location=None, y_front_location=None, z_front_location=None, depth=None, number_of_cells=None, **kw)[source]
Describes a perfectly reflecting mirror, where the E and B fields are zeroed out in a plane of finite thickness.
- Parameters:
x_front_location (float, optional (see comment below)) – Location in x of the front of the mirror [m]
y_front_location (float, optional (see comment below)) – Location in y of the front of the mirror [m]
z_front_location (float, optional (see comment below)) – Location in z of the front of the mirror [m]
depth (float, optional (see comment below)) – Depth of the mirror [m]
number_of_cells (integer, optional (see comment below)) – Minimum number of cells zeroed out
Only one of the [x,y,z]_front_location should be specified. The mirror will be set perpendicular to the respective direction and infinite in the others. The depth of the mirror will be the maximum of the specified depth and number_of_cells, or the code’s default value if neither are specified.
Diagnostics
- class pywarpx.picmi.ParticleDiagnostic(period, species=None, data_list=None, write_dir=None, step_min=None, step_max=None, parallelio=None, name=None, **kw)[source]
Defines the particle diagnostics in the simulation frame
- Parameters:
period (integer) – Period of time steps that the diagnostic is performed
species (species instance or list of species instances, optional) – Species to write out. If not specified, all species are written. Note that the name attribute must be defined for the species.
data_list (list of strings, optional) – The data to be written out. Possible values ‘position’, ‘momentum’, ‘weighting’. Defaults to the output list of the implementing code.
write_dir (string, optional) – Directory where data is to be written
step_min (integer, default=0) – Minimum step at which diagnostics could be written
step_max (integer, default=unbounded) – Maximum step at which diagnostics could be written
parallelio (bool, optional) – If set to True, particle diagnostics are dumped in parallel
name (string, optional) – Sets the base name for the diagnostic output files
Implementation specific documentation
See Input Parameters for more information.
- Parameters:
warpx_format ({plotfile, checkpoint, openpmd, ascent, sensei}, optional) – Diagnostic file format
warpx_openpmd_backend ({bp, h5, json}, optional) – Openpmd backend file format
warpx_file_prefix (string, optional) – Prefix on the diagnostic file name
warpx_file_min_digits (integer, optional) – Minimum number of digits for the time step number in the file name
warpx_random_fraction (float, optional) – Random fraction of particles to include in the diagnostic
warpx_uniform_stride (integer, optional) – Stride to down select to the particles to include in the diagnostic
warpx_plot_filter_function (string, optional) – Analytic expression used to down-select the particles to include in the diagnostic
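A brief sketch with illustrative values, writing the positions and momenta of a named species as openPMD (HDF5) output every 100 steps; installing the diagnostic via Simulation.add_diagnostic follows the PICMI standard (that method is not documented on this page):
from pywarpx import picmi

electrons = picmi.Species(particle_type='electron', name='electrons')

particle_diag = picmi.ParticleDiagnostic(
    name='diag1',
    period=100,
    species=[electrons],
    data_list=['position', 'momentum', 'weighting'],
    warpx_format='openpmd',
    warpx_openpmd_backend='h5')

# sim.add_diagnostic(particle_diag)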
- class pywarpx.picmi.FieldDiagnostic(grid, period, data_list=None, write_dir=None, step_min=None, step_max=None, number_of_cells=None, lower_bound=None, upper_bound=None, parallelio=None, name=None, **kw)[source]
Defines the electromagnetic field diagnostics in the simulation frame
- Parameters:
grid (grid instance) – Grid object for the diagnostic
period (integer) – Period of time steps that the diagnostic is performed
data_list (list of strings, optional) – List of quantities to write out. Possible values ‘rho’, ‘E’, ‘B’, ‘J’, ‘Ex’ etc. Defaults to the output list of the implementing code.
write_dir (string, optional) – Directory where data is to be written
step_min (integer, default=0) – Minimum step at which diagnostics could be written
step_max (integer, default=unbounded) – Maximum step at which diagnostics could be written
number_of_cells (vector of integers, optional) – Number of cells in each dimension. If not given, will be obtained from grid.
lower_bound (vector of floats, optional) – Lower corner of diagnostics box in each direction. If not given, will be obtained from grid.
upper_bound (vector of floats, optional) – Higher corner of diagnostics box in each direction. If not given, will be obtained from grid.
parallelio (bool, optional) – If set to True, field diagnostics are dumped in parallel
name (string, optional) – Sets the base name for the diagnostic output files
Implementation specific documentation
See Input Parameters for more information.
- Parameters:
warpx_plot_raw_fields (bool, optional) – Flag whether to dump the raw fields
warpx_plot_raw_fields_guards (bool, optional) – Flag whether the raw fields should include the guard cells
warpx_format ({plotfile, checkpoint, openpmd, ascent, sensei}, optional) – Diagnostic file format
warpx_openpmd_backend ({bp, h5, json}, optional) – Openpmd backend file format
warpx_file_prefix (string, optional) – Prefix on the diagnostic file name
warpx_file_min_digits (integer, optional) – Minimum number of digits for the time step number in the file name
warpx_dump_rz_modes (bool, optional) – Flag whether to dump the data for all RZ modes
warpx_particle_fields_to_plot (list of ParticleFieldDiagnostics) – List of ParticleFieldDiagnostic classes to install in the simulation. Error checking is handled in the class itself.
warpx_particle_fields_species (list of strings, optional) – Species for which to calculate particle_fields_to_plot functions. Fields will be calculated separately for each specified species. If not passed, default is all of the available particle species.
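A corresponding field diagnostic sketch, reusing the grid object that defines the simulation mesh (grid and sim are assumed to have been created earlier):
# Continuing the same PICMI script
field_diag = picmi.FieldDiagnostic(
    name='fields',
    grid=grid,                       # the grid object used by the solver (assumed)
    period=100,
    data_list=['E', 'B', 'rho'],
    warpx_format='openpmd',
)
sim.add_diagnostic(field_diag)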
- pywarpx.picmi.ElectrostaticFieldDiagnostic
alias of FieldDiagnostic
- class pywarpx.picmi.Checkpoint(period=1, write_dir=None, name=None, **kw)[source]
Sets up checkpointing of the simulation, allowing for later restarts
See Input Parameters for more information.
- Parameters:
warpx_file_prefix (string) – The prefix to the checkpoint directory names
warpx_file_min_digits (integer) – Minimum number of digits for the time step number in the checkpoint directory name.
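For example, a checkpoint written every 1000 steps might be set up like this minimal sketch (sim is the Simulation object, assumed created earlier):
# Continuing the same PICMI script
checkpoint = picmi.Checkpoint(
    period=1000,
    warpx_file_prefix='chk',   # checkpoint directories are named from this prefix plus the step number
)
sim.add_diagnostic(checkpoint)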
- class pywarpx.picmi.ReducedDiagnostic(diag_type, name=None, period=1, path=None, extension=None, separator=None, **kw)[source]
Sets up a reduced diagnostic in the simulation.
See Input Parameters for more information.
- Parameters:
diag_type (string) – The type of reduced diagnostic. See the link above for all the different types of reduced diagnostics available.
name (string) – The name of this diagnostic which will also be the name of the data file written to disk.
period (integer) – The simulation step interval at which to output this diagnostic.
path (string) – The file path in which the diagnostic file should be written.
extension (string) – The file extension used for the diagnostic output.
separator (string) – The separator between row values in the output file.
species (species instance) – The name of the species for which to calculate the diagnostic, required for diagnostic types ‘BeamRelevant’, ‘ParticleHistogram’, and ‘ParticleExtrema’
bin_number (integer) – For diagnostic type ‘ParticleHistogram’, the number of bins used for the histogram
bin_max (float) – For diagnostic type ‘ParticleHistogram’, the maximum value of the bins
bin_min (float) – For diagnostic type ‘ParticleHistogram’, the minimum value of the bins
normalization ({'unity_particle_weight', 'max_to_unity', 'area_to_unity'}, optional) – For diagnostic type ‘ParticleHistogram’, normalization method of the histogram.
histogram_function (string) – For diagnostic type ‘ParticleHistogram’, the function evaluated to produce the histogram data
filter_function (string, optional) – For diagnostic type ‘ParticleHistogram’, the function to filter whether particles are included in the histogram
reduced_function (string) – For diagnostic type ‘FieldReduction’, the function of the fields to evaluate
weighting_function (string, optional) – For diagnostic type ‘ChargeOnEB’, the function to weight contributions to the total charge
reduction_type ({'Maximum', 'Minimum', or 'Integral'}) – For diagnostic type ‘FieldReduction’, the type of reduction
probe_geometry ({'Point', 'Line', 'Plane'}, default='Point') – For diagnostic type ‘FieldProbe’, the geometry of the probe
integrate (bool, default=false) – For diagnostic type ‘FieldProbe’, whether the field is integrated
do_moving_window_FP (bool, default=False) – For diagnostic type ‘FieldProbe’, whether the moving window is followed
x_probe (floats) – For diagnostic type ‘FieldProbe’, a probe location. For ‘Point’, the location of the point. For ‘Line’, the start of the line. For ‘Plane’, the center of the square detector.
y_probe (floats) – For diagnostic type ‘FieldProbe’, a probe location. For ‘Point’, the location of the point. For ‘Line’, the start of the line. For ‘Plane’, the center of the square detector.
z_probe (floats) – For diagnostic type ‘FieldProbe’, a probe location. For ‘Point’, the location of the point. For ‘Line’, the start of the line. For ‘Plane’, the center of the square detector.
interp_order (integer) – For diagnostic type ‘FieldProbe’, the interpolation order for ‘Line’ and ‘Plane’
resolution (integer) – For diagnostic type ‘FieldProbe’, the number of points along the ‘Line’ or along each edge of the square ‘Plane’
x1_probe (floats) – For diagnostic type ‘FieldProbe’, the end point for ‘Line’
y1_probe (floats) – For diagnostic type ‘FieldProbe’, the end point for ‘Line’
z1_probe (floats) – For diagnostic type ‘FieldProbe’, the end point for ‘Line’
detector_radius (float) – For diagnostic type ‘FieldProbe’, the detector “radius” (half edge length) of the ‘Plane’
target_normal_x (floats) – For diagnostic type ‘FieldProbe’, the normal vector to the ‘Plane’. Only applicable in 3D
target_normal_y (floats) – For diagnostic type ‘FieldProbe’, the normal vector to the ‘Plane’. Only applicable in 3D
target_normal_z (floats) – For diagnostic type ‘FieldProbe’, the normal vector to the ‘Plane’. Only applicable in 3D
target_up_x (floats) – For diagnostic type ‘FieldProbe’, the vector specifying up in the ‘Plane’
target_up_y (floats) – For diagnostic type ‘FieldProbe’, the vector specifying up in the ‘Plane’
target_up_z (floats) – For diagnostic type ‘FieldProbe’, the vector specifying up in the ‘Plane’
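For instance, a histogram of the longitudinal momentum of a species could be set up as in the sketch below; the species object electrons is assumed to exist, the histogram is registered here with add_diagnostic as for the other diagnostics, and the parser variable used in histogram_function is an assumption (the available variables are described in the Input Parameters documentation):
from pywarpx import picmi
uz_hist = picmi.ReducedDiagnostic(
    diag_type='ParticleHistogram',
    name='uz_hist',
    period=100,
    species=electrons,          # assumed, created earlier
    bin_number=100,
    bin_min=0.,
    bin_max=10.,
    histogram_function='uz',    # assumed parser variable for the normalized z momentum
)
sim.add_diagnostic(uz_hist)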
Lab-frame diagnostics are used when running boosted-frame simulations.
- class pywarpx.picmi.LabFrameParticleDiagnostic(grid, num_snapshots, dt_snapshots, data_list=None, time_start=0.0, species=None, write_dir=None, parallelio=None, name=None, **kw)[source]
Defines the particle diagnostics in the lab frame
- Parameters:
grid (grid instance) – Grid object for the diagnostic
num_snapshots (integer) – Number of lab frame snapshots to make
dt_snapshots (float) – Time between each snapshot in lab frame
species (species instance or list of species instances, optional) – Species to write out. If not specified, all species are written. Note that the name attribute must be defined for the species.
data_list (list of strings, optional) – The data to be written out. Possible values ‘position’, ‘momentum’, ‘weighting’. Defaults to the output list of the implementing code.
time_start (float, default=0) – Time for the first snapshot in lab frame
write_dir (string, optional) – Directory where data is to be written
parallelio (bool, optional) – If set to True, particle diagnostics are dumped in parallel
name (string, optional) – Sets the base name for the diagnostic output files
Implementation specific documentation
See Input Parameters for more information.
- Parameters:
warpx_format (string, optional) – Passed to <diagnostic name>.format
warpx_openpmd_backend (string, optional) – Passed to <diagnostic name>.openpmd_backend
warpx_file_prefix (string, optional) – Passed to <diagnostic name>.file_prefix
warpx_intervals (integer or string) – Selects the snapshots to be made, instead of using num_snapshots (which makes all snapshots); when specified, num_snapshots is ignored.
warpx_file_min_digits (integer, optional) – Passed to <diagnostic name>.file_min_digits
warpx_buffer_size (integer, optional) – Passed to <diagnostic name>.buffer_size
- class pywarpx.picmi.LabFrameFieldDiagnostic(grid, num_snapshots, dt_snapshots, data_list=None, z_subsampling=1, time_start=0.0, write_dir=None, parallelio=None, name=None, **kw)[source]
Defines the electromagnetic field diagnostics in the lab frame
- Parameters:
grid (grid instance) – Grid object for the diagnostic
num_snapshots (integer) – Number of lab frame snapshots to make
dt_snapshots (float) – Time between each snapshot in lab frame
data_list (list of strings, optional) – List of quantities to write out. Possible values ‘rho’, ‘E’, ‘B’, ‘J’, ‘Ex’ etc. Defaults to the output list of the implementing code.
z_subsampling (integer, default=1) – A factor which is applied on the resolution of the lab frame reconstruction
time_start (float, default=0) – Time for the first snapshot in lab frame
write_dir (string, optional) – Directory where data is to be written
parallelio (bool, optional) – If set to True, field diagnostics are dumped in parallel
name (string, optional) – Sets the base name for the diagnostic output files
Implementation specific documentation
See Input Parameters for more information.
- Parameters:
warpx_format (string, optional) – Passed to <diagnostic name>.format
warpx_openpmd_backend (string, optional) – Passed to <diagnostic name>.openpmd_backend
warpx_file_prefix (string, optional) – Passed to <diagnostic name>.file_prefix
warpx_intervals (integer or string) – Selects the snapshots to be made, instead of using num_snapshots (which makes all snapshots); when specified, num_snapshots is ignored.
warpx_file_min_digits (integer, optional) – Passed to <diagnostic name>.file_min_digits
warpx_buffer_size (integer, optional) – Passed to <diagnostic name>.buffer_size
warpx_lower_bound (vector of floats, optional) – Passed to <diagnostic name>.lower_bound
warpx_upper_bound (vector of floats, optional) – Passed to <diagnostic name>.upper_bound
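A back-transformed field diagnostic for a boosted-frame run might look like this sketch (grid and sim are assumed to have been created earlier; values are illustrative):
# Continuing the same PICMI script
btd_fields = picmi.LabFrameFieldDiagnostic(
    name='lab_fields',
    grid=grid,
    num_snapshots=20,
    dt_snapshots=2.e-13,        # lab-frame time between snapshots [s]
    data_list=['E', 'B'],
    warpx_format='openpmd',
)
sim.add_diagnostic(btd_fields)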
Particles
Species objects are a collection of particles with similar properties. For instance, background plasma electrons, background plasma ions and an externally injected beam could each be their own particle species.
- class pywarpx.picmi.Species(particle_type=None, name=None, charge_state=None, charge=None, mass=None, initial_distribution=None, particle_shape=None, density_scale=None, method=None, **kw)[source]
Sets up the species to be simulated. The species charge and mass can be specified by setting the particle type or by setting them directly. If the particle type is specified, the charge or mass can be set to override the value from the type.
- Parameters:
particle_type (string, optional) – A string specifying an elementary particle, atom, or other, as defined in the openPMD 2 species type extension, openPMD-standard/EXT_SpeciesType.md
name (string, optional) – Name of the species
method ({'Boris', 'Vay', 'Higuera-Cary', 'Li', 'free-streaming', 'LLRK4'}) –
The particle advance method to use. A code-specific method can be specified using 'other:<method>'. The default is code dependent.
'Boris': Standard "leap-frog" Boris advance
'Vay': Vay advance
'Higuera-Cary': Higuera-Cary advance
'Li': Li advance
'free-streaming': Advance with no fields
'LLRK4': Landau-Lifschitz radiation reaction formula with RK-4
charge_state (float, optional) – Charge state of the species (applies only to atoms) [1]
charge (float, optional) – Particle charge, required when type is not specified, otherwise determined from type [C]
mass (float, optional) – Particle mass, required when type is not specified, otherwise determined from type [kg]
initial_distribution (distribution instance) – The initial distribution loaded at t=0. Must be one of the standard distributions implemented.
density_scale (float, optional) – A scale factor on the density given by the initial_distribution
particle_shape ({'NGP', 'linear', 'quadratic', 'cubic'}) – Particle shape used for deposition and gather. If not specified, the value from the Simulation object will be used. Other code-dependent values may be specified.
Implementation specific documentation
See Input Parameters for more information.
- Parameters:
warpx_boost_adjust_transverse_positions (bool, default=False) – Whether to adjust transverse positions when applying the boost to the simulation frame
warpx_self_fields_required_precision (float, default=1.e-11) – Relative precision on the electrostatic solver (when using the relativistic solver)
warpx_self_fields_absolute_tolerance (float, default=0.) – Absolute precision on the electrostatic solver (when using the relativistic solver)
warpx_self_fields_max_iters (integer, default=200) – Maximum number of iterations for the electrostatic solver for the species
warpx_self_fields_verbosity (integer, default=2) – Level of verbosity for the electrostatic solver
warpx_save_previous_position (bool, default=False) – Whether to save the old particle positions
warpx_do_not_deposit (bool, default=False) – Whether or not to deposit the charge and current density for this species
warpx_do_not_push (bool, default=False) – Whether or not to push this species
warpx_do_not_gather (bool, default=False) – Whether or not to gather the fields from grids for this species
warpx_random_theta (bool, default=True) – Whether or not to add a random angle in theta to the particles when in RZ mode.
warpx_reflection_model_xlo (string, default='0.') – Expression (in terms of the velocity “v”) specifying the probability that the particle will reflect on the lower x boundary
warpx_reflection_model_xhi (string, default='0.') – Expression (in terms of the velocity “v”) specifying the probability that the particle will reflect on the upper x boundary
warpx_reflection_model_ylo (string, default='0.') – Expression (in terms of the velocity “v”) specifying the probability that the particle will reflect on the lower y boundary
warpx_reflection_model_yhi (string, default='0.') – Expression (in terms of the velocity “v”) specifying the probability that the particle will reflect on the upper y boundary
warpx_reflection_model_zlo (string, default='0.') – Expression (in terms of the velocity “v”) specifying the probability that the particle will reflect on the lower z boundary
warpx_reflection_model_zhi (string, default='0.') – Expression (in terms of the velocity “v”) specifying the probability that the particle will reflect on the upper z boundary
warpx_save_particles_at_xlo (bool, default=False) – Whether to save particles lost at the lower x boundary
warpx_save_particles_at_xhi (bool, default=False) – Whether to save particles lost at the upper x boundary
warpx_save_particles_at_ylo (bool, default=False) – Whether to save particles lost at the lower y boundary
warpx_save_particles_at_yhi (bool, default=False) – Whether to save particles lost at the upper y boundary
warpx_save_particles_at_zlo (bool, default=False) – Whether to save particles lost at the lower z boundary
warpx_save_particles_at_zhi (bool, default=False) – Whether to save particles lost at the upper z boundary
warpx_save_particles_at_eb (bool, default=False) – Whether to save particles lost at the embedded boundary
warpx_do_resampling (bool, default=False) – Whether particles will be resampled
warpx_resampling_min_ppc (int, default=1) – Cells with fewer particles than this number will be skipped during resampling.
warpx_resampling_trigger_intervals (integer or string, default=0) – Timesteps at which to resample
warpx_resampling_trigger_max_avg_ppc (int, default=infinity) – Resampling will be done when the average number of particles per cell exceeds this number
warpx_resampling_algorithm (str, default="leveling_thinning") – Resampling algorithm to use.
warpx_resampling_algorithm_velocity_grid_type (str, default="spherical") – Type of grid to use when clustering particles in velocity space. Only applicable with the velocity_coincidence_thinning algorithm.
warpx_resampling_algorithm_delta_ur (float) – Size of velocity window used for clustering particles during grid-based merging, with velocity_grid_type == “spherical”.
warpx_resampling_algorithm_n_theta (int) – Number of bins to use in theta when clustering particle velocities during grid-based merging, with velocity_grid_type == “spherical”.
warpx_resampling_algorithm_n_phi (int) – Number of bins to use in phi when clustering particle velocities during grid-based merging, with velocity_grid_type == “spherical”.
warpx_resampling_algorithm_delta_u (array of floats or float) – Size of velocity window used in ux, uy and uz for clustering particles during grid-based merging, with velocity_grid_type == “cartesian”. If a single number is given the same du value will be used in all three directions.
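Putting some of these options together, an electron species could be created as in the following sketch (the distribution object plasma_dist comes from one of the distribution classes documented further below):
from pywarpx import picmi
electrons = picmi.Species(
    particle_type='electron',
    name='electrons',
    initial_distribution=plasma_dist,      # e.g. a UniformDistribution, see below
    warpx_save_particles_at_zhi=True,      # keep particles lost at the upper z boundary
)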
- class pywarpx.picmi.MultiSpecies(particle_types=None, names=None, charge_states=None, charges=None, masses=None, proportions=None, initial_distribution=None, particle_shape=None, **kw)[source]
INCOMPLETE: the proportions argument is not implemented. Multiple species that are initialized with the same distribution. Each parameter can be a list, giving a value for each species, or a single value which is given to all species. The species charge and mass can be specified by setting the particle type or by setting them directly. If the particle type is specified, the charge or mass can be set to override the value from the type.
- Parameters:
particle_types (list of strings, optional) – A string specifying an elementary particle, atom, or other, as defined in the openPMD 2 species type extension, openPMD-standard/EXT_SpeciesType.md
names (list of strings, optional) – Names of the species
charge_states (list of floats, optional) – Charge states of the species (applies only to atoms)
charges (list of floats, optional) – Particle charges, required when type is not specified, otherwise determined from type [C]
masses (list of floats, optional) – Particle masses, required when type is not specified, otherwise determined from type [kg]
proportions (list of floats, optional) – Proportions of the initial distribution made up by each species
initial_distribution (distribution instance) – Initial particle distribution, applied to all species
particle_shape ({'NGP', 'linear', 'quadratic', 'cubic'}) – Particle shape used for deposition and gather. If not specified, the value from the Simulation object will be used. Other code-dependent values may be specified.
Particle distributions can be used to initialize the particles in a particle species.
- class pywarpx.picmi.GaussianBunchDistribution(n_physical_particles, rms_bunch_size, rms_velocity=[0.0, 0.0, 0.0], centroid_position=[0.0, 0.0, 0.0], centroid_velocity=[0.0, 0.0, 0.0], velocity_divergence=[0.0, 0.0, 0.0], **kw)[source]
Describes a Gaussian distribution of particles
- Parameters:
n_physical_particles (integer) – Number of physical particles in the bunch
rms_bunch_size (vector of length 3 of floats) – RMS bunch size at t=0 [m]
rms_velocity (vector of length 3 of floats, default=[0.,0.,0.]) – RMS velocity spread at t=0 [m/s]
centroid_position (vector of length 3 of floats, default=[0.,0.,0.]) – Position of the bunch centroid at t=0 [m]
centroid_velocity (vector of length 3 of floats, default=[0.,0.,0.]) – Velocity (gamma*V) of the bunch centroid at t=0 [m/s]
velocity_divergence (vector of length 3 of floats, default=[0.,0.,0.]) – Expansion rate of the bunch at t=0 [m/s/m]
- class pywarpx.picmi.UniformDistribution(density, lower_bound=[None, None, None], upper_bound=[None, None, None], rms_velocity=[0.0, 0.0, 0.0], directed_velocity=[0.0, 0.0, 0.0], fill_in=None, **kw)[source]
Describes a uniform density distribution of particles
- Parameters:
density (float) – Physical number density [m^-3]
lower_bound (vector of length 3 of floats, optional) – Lower bound of the distribution [m]
upper_bound (vector of length 3 of floats, optional) – Upper bound of the distribution [m]
rms_velocity (vector of length 3 of floats, default=[0.,0.,0.]) – Thermal velocity spread [m/s]
directed_velocity (vector of length 3 of floats, default=[0.,0.,0.]) – Directed, average, velocity [m/s]
fill_in (bool, optional) – Flags whether to fill in the empty space opened up when the grid moves
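For example, a uniform plasma filling the region z > 0 with a small thermal spread (a minimal sketch; the numerical values are illustrative):
# Continuing the same PICMI script
plasma_dist = picmi.UniformDistribution(
    density=1.e24,                        # physical number density [m^-3]
    lower_bound=[None, None, 0.],         # plasma starts at z = 0, unbounded in x and y
    rms_velocity=[1.e5, 1.e5, 1.e5],      # thermal velocity spread [m/s]
    fill_in=True,                         # keep filling as the moving window advances
)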
- class pywarpx.picmi.AnalyticDistribution(density_expression, momentum_expressions=[None, None, None], lower_bound=[None, None, None], upper_bound=[None, None, None], rms_velocity=[0.0, 0.0, 0.0], directed_velocity=[0.0, 0.0, 0.0], fill_in=None, **kw)[source]
Describes a density distribution given by an analytic expression
- Parameters:
density_expression (string) – Analytic expression describing physical number density (string) [m^-3]. Expression should be in terms of the position, written as ‘x’, ‘y’, and ‘z’. Parameters can be used in the expression with the values given as keyword arguments.
momentum_expressions (list of strings) – Analytic expressions describing the gamma*velocity for each axis [m/s]. Expressions should be in terms of the position, written as ‘x’, ‘y’, and ‘z’. Parameters can be used in the expression with the values given as keyword arguments. For any axis not supplied (set to None), directed_velocity will be used.
lower_bound (vector of length 3 of floats, optional) – Lower bound of the distribution [m]
upper_bound (vector of length 3 of floats, optional) – Upper bound of the distribution [m]
rms_velocity (vector of length 3 of floats, default=[0.,0.,0.]) – Thermal velocity spread [m/s]
directed_velocity (vector of length 3 of floats, default=[0.,0.,0.]) – Directed, average, velocity [m/s]
fill_in (bool, optional) – Flags whether to fill in the empty space opened up when the grid moves
This example creates a distribution where the density is n0 below rmax and zero elsewhere:
dist = AnalyticDistribution(density_expression='((x**2+y**2)<rmax**2)*n0',
                            rmax=1., n0=1.e20, ...)
Implementation specific documentation
- Parameters:
warpx_momentum_spread_expressions (list of string) – Analytic expressions describing the gamma*velocity spread for each axis [m/s]. Expressions should be in terms of the position, written as ‘x’, ‘y’, and ‘z’. Parameters can be used in the expression with the values given as keyword arguments. For any axis not supplied (set to None), zero will be used.
- class pywarpx.picmi.ParticleListDistribution(x=0.0, y=0.0, z=0.0, ux=0.0, uy=0.0, uz=0.0, weight=0.0, **kw)[source]
Load particles at the specified positions and velocities
- Parameters:
x (float or list of floats, default=0.) – x positions of the particles [m]
y (float or list of floats, default=0.) – y positions of the particles [m]
z (float or list of floats, default=0.) – z positions of the particles [m]
ux (float or list of floats, default=0.) – ux velocities of the particles (ux = gamma*vx) [m/s]
uy (float or list of floats, default=0.) – uy velocities of the particles (uy = gamma*vy) [m/s]
uz (float or list of floats, default=0.) – uz velocities of the particles (uz = gamma*vz) [m/s]
weight (float) – Particle weight or list of weights, number of real particles per simulation particle
Particle layouts determine how to microscopically place macro particles in a grid cell.
- class pywarpx.picmi.GriddedLayout(n_macroparticle_per_cell, grid=None, **kw)[source]
Specifies a gridded layout of particles
- Parameters:
n_macroparticle_per_cell (vector of integers) – Number of particles per cell along each axis
grid (grid instance, optional) – Grid object specifying the grid to follow. If not specified, the underlying grid of the code is used.
- class pywarpx.picmi.PseudoRandomLayout(n_macroparticles=None, n_macroparticles_per_cell=None, seed=None, grid=None, **kw)[source]
Specifies a pseudo-random layout of the particles
- Parameters:
n_macroparticles (integer) – Total number of macroparticles to load. Either this argument or n_macroparticles_per_cell should be supplied.
n_macroparticles_per_cell (integer) – Number of macroparticles to load per cell. Either this argument or n_macroparticles should be supplied.
seed (integer, optional) – Pseudo-random number generator seed
grid (grid instance, optional) – Grid object specifying the grid to follow for n_macroparticles_per_cell. If not specified, the underlying grid of the code is used.
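A species is attached to the simulation together with a layout, for example (a minimal sketch reusing the electrons, grid and sim objects from the sketches above):
# Continuing the same PICMI script
layout = picmi.PseudoRandomLayout(n_macroparticles_per_cell=4, grid=grid)
sim.add_species(electrons, layout=layout)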
Other operations related to particles:
- class pywarpx.picmi.CoulombCollisions(name, species, CoulombLog=None, ndt=None, **kw)[source]
Custom class to handle setup of binary Coulomb collisions in WarpX. If collision initialization is added to picmistandard this can be changed to inherit that functionality.
- Parameters:
name (string) – Name of instance (used in the inputs file)
species (list of species instances) – The species involved in the collision. Must be of length 2.
CoulombLog (float, optional) – Value of the Coulomb log to use in the collision cross section. If not supplied, it is calculated from the local conditions.
ndt (integer, optional) – The collisions will be applied every “ndt” steps. Must be 1 or larger.
- class pywarpx.picmi.DSMCCollisions(name, species, scattering_processes, ndt=None, **kw)[source]
Custom class to handle setup of DSMC collisions in WarpX. If collision initialization is added to picmistandard this can be changed to inherit that functionality.
- Parameters:
name (string) – Name of instance (used in the inputs file)
species (species instance) – The species involved in the collision
scattering_processes (dictionary) – The scattering process to use and any needed information
ndt (integer, optional) – The collisions will be applied every “ndt” steps. Must be 1 or larger.
- class pywarpx.picmi.MCCCollisions(name, species, background_density, background_temperature, scattering_processes, background_mass=None, max_background_density=None, ndt=None, **kw)[source]
Custom class to handle setup of MCC collisions in WarpX. If collision initialization is added to picmistandard this can be changed to inherit that functionality.
- Parameters:
name (string) – Name of instance (used in the inputs file)
species (species instance) – The species involved in the collision
background_density (float or string) – The density of the background. A string expression as a function of (x, y, z, t) can be used.
background_temperature (float or string) – The temperature of the background. A string expression as a function of (x, y, z, t) can be used.
scattering_processes (dictionary) – The scattering process to use and any needed information
background_mass (float, optional) – The mass of the background particle. If not supplied, the default depends on the type of scattering process.
max_background_density (float) – The maximum background density. When the background_density is an expression, this must also be specified.
ndt (integer, optional) – The collisions will be applied every “ndt” steps. Must be 1 or larger.
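A sketch of an MCC setup for electrons colliding with a neutral background; the layout of the scattering_processes dictionary, the cross-section file name, and the temperature unit are assumptions, not part of the documented signature above:
from pywarpx import picmi
mcc_electrons = picmi.MCCCollisions(
    name='coll_elec',
    species=electrons,                    # assumed, created earlier
    background_density=1.e22,             # [m^-3]
    background_temperature=300.,          # background temperature (assumed here to be in Kelvin)
    scattering_processes={
        # hypothetical process name and cross-section file
        'elastic': {'cross_section': 'electron_elastic.dat'},
    },
    ndt=5,
)
# The collision object is then passed to the Simulation; the exact registration
# is code specific (see the Simulation class documentation).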
- class pywarpx.picmi.FieldIonization(model, ionized_species, product_species, **kw)[source]
Field ionization on an ion species
- Parameters:
model (string) – Ionization model, e.g. “ADK”
ionized_species (species instance) – Species that is ionized
product_species (species instance) – Species in which ionized electrons are stored.
Implementation specific documentation
WarpX only has the ADK ionization model implemented.
Laser Pulses
Laser profiles can be used to initialize laser pulses in the simulation.
- class pywarpx.picmi.GaussianLaser(wavelength, waist, duration, propagation_direction, polarization_direction, focal_position, centroid_position, a0=None, E0=None, phi0=None, zeta=None, beta=None, phi2=None, name=None, fill_in=True, **kw)[source]
Specifies a Gaussian laser distribution.
More precisely, the electric field near the focal plane is given by:
\[E(\boldsymbol{x},t) = a_0\times E_0\, \exp\left( -\frac{r^2}{w_0^2} - \frac{(z-z_0-ct)^2}{c^2\tau^2} \right) \cos[ k_0( z - z_0 - ct ) - \phi_{cep} ]\]
where \(k_0 = 2\pi/\lambda_0\) is the wavevector and \(E_0 = m_e c^2 k_0 / q_e\) is the field amplitude for \(a_0=1\).
Note
The additional terms that arise far from the focal plane (Gouy phase, wavefront curvature, …) are not included in the above formula for simplicity, but are of course taken into account by the code, when initializing the laser pulse away from the focal plane.
- Parameters:
wavelength (float) – Laser wavelength [m], defined as \(\lambda_0\) in the above formula
waist (float) – Waist of the Gaussian pulse at focus [m], defined as \(w_0\) in the above formula
duration (float) – Duration of the Gaussian pulse [s], defined as \(\tau\) in the above formula
propagation_direction (unit vector of length 3 of floats) – Direction of propagation [1]
polarization_direction (unit vector of length 3 of floats) – Direction of polarization [1]
focal_position (vector of length 3 of floats) – Position of the laser focus [m]
centroid_position (vector of length 3 of floats) – Position of the laser centroid at time 0 [m]
a0 (float) – Normalized vector potential at focus. Specify either a0 or E0 (E0 takes precedence).
E0 (float) – Maximum amplitude of the laser field [V/m]. Specify either a0 or E0 (E0 takes precedence).
phi0 (float) – Carrier envelope phase (CEP) [rad]
zeta (float) – Spatial chirp at focus (in the lab frame) [m.s]
beta (float) – Angular dispersion at focus (in the lab frame) [rad.s]
phi2 (float) – Temporal chirp at focus (in the lab frame) [s^2]
fill_in (bool, default=True) – Flags whether to fill in the empty space opened up when the grid moves
name (string, optional) – Optional name of the laser
- class pywarpx.picmi.AnalyticLaser(field_expression, wavelength, propagation_direction, polarization_direction, amax=None, Emax=None, name=None, fill_in=True, **kw)[source]
Specifies a laser with an analytically described distribution
- Parameters:
name=None (string, optional) – Optional name of the laser
field_expression (string) – Analytic expression describing the electric field of the laser [V/m]. The expression should be in terms of the position, 'X', 'Y', in the plane orthogonal to the propagation direction, and the time 't'. The expression should describe the full field, including the oscillatory component. Parameters can be used in the expression with the values given as keyword arguments.
wavelength (float) – Laser wavelength. This should be built into the expression, but some codes require a specified value for numerical purposes.
propagation_direction (unit vector of length 3 of floats) – Direction of propagation [1]
polarization_direction (unit vector of length 3 of floats) – Direction of polarization [1]
amax (float, optional) – Maximum normalized vector potential. Specify either amax or Emax (Emax takes precedence). This should be built into the expression, but some codes require a specified value for numerical purposes.
Emax (float, optional) – Maximum amplitude of the laser field [V/m]. Specify either amax or Emax (Emax takes precedence). This should be built into the expression, but some codes require a specified value for numerical purposes.
fill_in (bool, default=True) – Flags whether to fill in the empty space opened up when the grid moves
Laser injectors control where to initialize laser pulses on the simulation grid.
- class pywarpx.picmi.LaserAntenna(position, normal_vector=None, **kw)[source]
Specifies the laser antenna injection method
- Parameters:
position (vector of floats) – Position of the antenna launching the laser [m]
normal_vector (vector of floats, optional) – Vector normal to the antenna plane, defaults to the laser direction of propagation [1]
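A Gaussian pulse injected through a laser antenna could be set up as in this sketch (the numerical values are illustrative; sim is the Simulation object, assumed created earlier):
from pywarpx import picmi
laser = picmi.GaussianLaser(
    wavelength=0.8e-6,                      # [m]
    waist=5.e-6,                            # [m]
    duration=15.e-15,                       # [s]
    focal_position=[0., 0., 100.e-6],       # [m]
    centroid_position=[0., 0., -20.e-6],    # [m]
    propagation_direction=[0., 0., 1.],
    polarization_direction=[1., 0., 0.],
    a0=2.,
)
antenna = picmi.LaserAntenna(position=[0., 0., 0.], normal_vector=[0., 0., 1.])
sim.add_laser(laser, injection_method=antenna)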
Parameters: Inputs File
This section documents how to use WarpX with an inputs file (e.g., warpx.3d input_3d).
Complete example input files can be found in the examples section.
Note
WarpX input options are read via AMReX ParmParse.
Note
The AMReX parser (see Math parser and user-defined constants) is used for the right-hand-side of all input parameters that consist of one or more integers or floats, so expressions like <species_name>.density_max = "2.+1."
and/or using user-defined constants are accepted.
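For example, a user-defined constant can be declared in the inputs file and reused in such expressions (a minimal sketch; <species_name> is a placeholder as in the example above):
my_constants.n0 = 1.e24
<species_name>.density_max = "2.*n0"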
Overall simulation parameters
authors
(string: e.g."Jane Doe <jane@example.com>, Jimmy Joe <jimmy@example.com>"
)Authors of an input file / simulation setup. When provided, this information is added as metadata to (openPMD) output files.
max_step
(integer)The number of PIC cycles to perform.
stop_time
(float; in seconds)The maximum physical time of the simulation. Can be provided instead of
max_step
. If bothmax_step
andstop_time
are provided, both criteria are used and the simulation stops when the first criterion is hit.Note: in boosted-frame simulations,
stop_time
refers to the time in the boosted frame.
warpx.used_inputs_file
(string; default:warpx_used_inputs
)Name of a file that WarpX writes to archive the used inputs. The contents of this file will be an exact copy of all explicitly and implicitly used input parameters, including those extended and overwritten from the command line.
warpx.gamma_boost
(float)The Lorentz factor of the boosted frame in which the simulation is run. (The corresponding Lorentz transformation is assumed to be along
warpx.boost_direction
.)When using this parameter, the input parameters are interpreted as lab-frame quantities and automatically converted to the boosted frame. (See the corresponding documentation of each input parameter for exceptions.)
warpx.boost_direction
(string:x
,y
orz
)The direction of the Lorentz-transform for boosted-frame simulations (The direction
y
cannot be used in 2D simulations.)
warpx.zmax_plasma_to_compute_max_step
(float) optionalCan be useful when running in a boosted frame. If specified, automatically calculates the number of iterations required in the boosted frame for the lower z end of the simulation domain to reach
warpx.zmax_plasma_to_compute_max_step
(typically the plasma end, given in the lab frame). The value ofmax_step
is overwritten, and printed to standard output. Currently only works if the Lorentz boost and the moving window are along the z direction.
warpx.compute_max_step_from_btd
(integer; 0 by default) optionalCan be useful when computing back-transformed diagnostics. If specified, automatically calculates the number of iterations required in the boosted frame for all back-transformed diagnostics to be completed. If
max_step
,stop_time
, orwarpx.zmax_plasma_to_compute_max_step
are not specified, or the current values ofmax_step
and/orstop_time
are too low to fill all BTD snapshots, the values ofmax_step
and/orstop_time
are overwritten with the new values and printed to standard output.
warpx.random_seed
(string or int > 0) optional
If warpx.random_seed = random is provided, the random seed will be determined using std::random_device and std::clock(), so every simulation run produces different random numbers. If warpx.random_seed = n is provided, where n > 0 is required, the random seed for each MPI rank is (mpi_rank+1) * n, where mpi_rank starts from 0. Both n = 1 and warpx.random_seed = default produce the default random seed. Note that when GPU threading is used, one should not expect to obtain the same random numbers, even if a fixed warpx.random_seed is provided.
algo.evolve_scheme
(string, default: explicit)Specifies the evolve scheme used by WarpX.
explicit
: Use an explicit solver, such as the standard FDTD or PSATDimplicit_picard
: Use an implicit solver with exact energy conservation that uses a Picard iteration to solve the system. Note that this method is for demonstration only. It is inefficient and does not work well when \(\omega_{pe} \Delta t\) is close to or greater than one. The method is described in Angus et al., On numerical energy conservation for an implicit particle-in-cell method coupled with a binary Monte-Carlo algorithm for Coulomb collisions. The version implemented is an updated version that is relativistically correct, including the relativistic gamma factor for the particles. For exact energy conservation,algo.current_deposition = direct
must be used withinterpolation.galerkin_scheme = 0
, andalgo.current_deposition = Esirkepov
must be used withinterpolation.galerkin_scheme = 1
(which is the default, in which case charge will also be conserved).semi_implicit_picard
: Use an energy conserving semi-implicit solver that uses a Picard iteration to solve the system. Note that this method has the CFL limitation \(\Delta t < 1/\left(c\sqrt{\sum_i 1/\Delta x_i^2}\right)\). It is inefficient and does not work well or at all when \(\omega_{pe} \Delta t\) is close to or greater than one. The method is described in Chen et al., A semi-implicit, energy- and charge-conserving particle-in-cell algorithm for the relativistic Vlasov-Maxwell equations. For energy conservation, algo.current_deposition = direct
must be used withinterpolation.galerkin_scheme = 0
, andalgo.current_deposition = Esirkepov
must be used withinterpolation.galerkin_scheme = 1
(which is the default, in which case charge will also be conserved).
algo.max_picard_iterations
(integer, default: 10)When algo.evolve_scheme is either implicit_picard or semi_implicit_picard, this sets the maximum number of Picard iterations that are done each time step.
algo.picard_iteration_tolerance
(float, default: 1.e-7)When algo.evolve_scheme is either implicit_picard or semi_implicit_picard, this sets the convergence tolerance of the iterations, the maximum of the relative change of the L2 norm of the field from one iteration to the next. If this is set to zero, the maximum number of iterations will always be done with the change only calculated on the last iteration (for a slight optimization).
algo.require_picard_convergence
(bool, default: 1)When algo.evolve_scheme is either implicit_picard or semi_implicit_picard, this sets whether the iteration at each step is required to converge. If it is required, an abort is raised if it does not converge and the code then exits. If not, a warning is issued and the calculation continues.
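For example, the implicit Picard scheme described above could be requested with inputs lines like the following sketch (values are illustrative):
algo.evolve_scheme = implicit_picard
algo.max_picard_iterations = 25
algo.picard_iteration_tolerance = 1.e-8
algo.require_picard_convergence = 1
algo.current_deposition = direct
interpolation.galerkin_scheme = 0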
warpx.do_electrostatic
(string) optional (default none)Specifies the electrostatic mode. When turned on, instead of updating the fields at each iteration with the full Maxwell equations, the fields are recomputed at each iteration from the Poisson equation. There is no limitation on the timestep in this case, but electromagnetic effects (e.g. propagation of radiation, lasers, etc.) are not captured. There are several options:
labframe
: Poisson’s equation is solved in the lab frame with the charge density of all species combined. More specifically, the code solves:\[\boldsymbol{\nabla}^2 \phi = - \rho/\epsilon_0 \qquad \boldsymbol{E} = - \boldsymbol{\nabla}\phi\]labframe-electromagnetostatic
: Poisson’s equation is solved in the lab frame with the charge density of all species combined. Additionally the 3-component vector potential is solved in the Coulomb Gauge with the current density of all species combined to include self magnetic fields. More specifically, the code solves:\[\begin{split}\boldsymbol{\nabla}^2 \phi = - \rho/\epsilon_0 \qquad \boldsymbol{E} = - \boldsymbol{\nabla}\phi \\ \boldsymbol{\nabla}^2 \boldsymbol{A} = - \mu_0 \boldsymbol{j} \qquad \boldsymbol{B} = \boldsymbol{\nabla}\times\boldsymbol{A}\end{split}\]relativistic
: Poisson’s equation is solved for each species in their respective rest frame. The corresponding field is mapped back to the simulation frame and will produce both E and B fields. More specifically, in the simulation frame, this is equivalent to solving for each species\[\left(\boldsymbol{\nabla}^2 - (\boldsymbol{\beta}\cdot\boldsymbol{\nabla})^2\right)\phi = - \rho/\epsilon_0 \qquad \boldsymbol{E} = -\boldsymbol{\nabla}\phi + \boldsymbol{\beta}(\boldsymbol{\beta} \cdot \boldsymbol{\nabla}\phi) \qquad \boldsymbol{B} = -\frac{1}{c}\boldsymbol{\beta}\times\boldsymbol{\nabla}\phi\]where \(\boldsymbol{\beta}\) is the average (normalized) velocity of the considered species (which can be relativistic). See, e.g., Vay [1] for more information.
warpx.poisson_solver
(string) optional (default multigrid)multigrid
: Poisson’s equation is solved using an iterative multigrid (MLMG) solver.See the AMReX documentation for details of the MLMG solver (the default solver used with electrostatic simulations). The default behavior of the code is to check whether there is non-zero charge density in the system and if so force the MLMG solver to use the solution max norm when checking convergence. If there is no charge density, the MLMG solver will switch to using the initial guess max norm error when evaluating convergence and an absolute error tolerance of \(10^{-6}\) \(\mathrm{V/m}^2\) will be used (unless a different non-zero value is specified by the user via
warpx.self_fields_absolute_tolerance
).
fft
: Poisson’s equation is solved using an Integrated Green Function method (which requires FFT calculations). See these references for more details. It only works in 3D and requires the compilation flag
-DWarpX_PSATD=ON
. If mesh refinement is enabled, this solver only works on the coarsest level. On the refined patches, the Poisson equation is solved with the multigrid solver. In electrostatic mode, this solver requires open field boundary conditions (boundary.field_lo,hi = open
). In electromagnetic mode, this solver can be used to initialize the species’ self fields (<species_name>.initialize_self_fields=1
) provided that the field BCs are PML (boundary.field_lo,hi = PML
).
warpx.self_fields_required_precision
(float, default: 1.e-11)The relative precision with which the electrostatic space-charge fields should be calculated. More specifically, the space-charge fields are computed with an iterative Multi-Level Multi-Grid (MLMG) solver. This solver can fail to reach the default precision within a reasonable time. This only applies when warpx.do_electrostatic = labframe.
warpx.self_fields_absolute_tolerance
(float, default: 0.0)The absolute tolerance with which the space-charge fields should be calculated in units of \(\mathrm{V/m}^2\). More specifically, the acceptable residual with which the solution can be considered converged. In general this should be left as the default, but in cases where the simulation state changes very little between steps it can occur that the initial guess for the MLMG solver is so close to the converged value that it fails to improve that solution sufficiently to reach the
self_fields_required_precision
value.
warpx.self_fields_max_iters
(integer, default: 200)Maximum number of iterations used by the MLMG solver for the space-charge field calculation. In case MLMG converges but fails to reach the desired
self_fields_required_precision
, this parameter may be increased. This only applies when warpx.do_electrostatic = labframe.
warpx.self_fields_verbosity
(integer, default: 2)The verbosity used for MLMG solver for space-charge fields calculation. Currently MLMG solver looks for verbosity levels from 0-5. A higher number results in more verbose output.
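A typical lab-frame electrostatic configuration might combine these options as follows (a sketch; values are illustrative):
warpx.do_electrostatic = labframe
warpx.poisson_solver = multigrid
warpx.self_fields_required_precision = 1.e-11
warpx.self_fields_max_iters = 500
warpx.self_fields_verbosity = 2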
amrex.abort_on_out_of_gpu_memory
(0
or1
; default is1
for true)When running on GPUs, memory that does not fit on the device will be automatically swapped to host memory when this option is set to
0
. This will cause severe performance drops. Note that even with this set to1
WarpX will not catch all out-of-memory events yet when operating close to maximum device memory. Please also see the documentation in AMReX.
amrex.the_arena_is_managed
(0
or1
; default is0
for false)When running on GPUs, device memory that is accessed from the host will automatically be transferred with managed memory. This is useful for convenience during development, but sometimes has severe performance and memory-footprint implications if relied on (and can be affected by vendor bugs). For all regular WarpX operations, we therefore do explicit memory transfers without the need for managed memory and thus changed the AMReX default to false. Please also see the documentation in AMReX.
amrex.omp_threads
(system
,nosmt
or positive integer; default isnosmt
)An integer number can be set in lieu of the
OMP_NUM_THREADS
environment variable to control the number of OpenMP threads to use for theOMP
compute backend on CPUs. By default, we use thenosmt
option, which overwrites the OpenMP default of spawning one thread per logical CPU core, and instead only spawns a number of threads equal to the number of physical CPU cores on the machine. If set, the environment variableOMP_NUM_THREADS
takes precedence oversystem
andnosmt
, but not over integer numbers set in this option.
Signal Handling
WarpX can handle Unix (Linux/macOS) process signals. This can be useful to configure jobs on HPC and cloud systems to shut down cleanly when they are close to reaching their allocated walltime or to steer the simulation behavior interactively.
Allowed signal names are documented in the C++ standard and POSIX.
We follow the same naming, but remove the SIG
prefix, e.g., the WarpX signal configuration name for SIGINT
is INT
.
warpx.break_signals
(array of string, separated by spaces) optionalA list of signal names or numbers that the simulation should handle by cleanly terminating at the next timestep
warpx.checkpoint_signals
(array of string, separated by spaces) optionalA list of signal names or numbers that the simulation should handle by outputting a checkpoint at the next timestep. A diagnostic of type checkpoint must be configured.
Note
Certain signals are only available on specific platforms, please see the links above for details.
Typically supported on Linux and macOS are HUP
, INT
, QUIT
, ABRT
, USR1
, USR2
, TERM
, TSTP
, URG
, and IO
among others.
Signals to think about twice before overwriting in interactive simulations:
Note that INT
(interrupt) is the signal that Ctrl+C
sends on the terminal, which most people use to abort a process; once overwritten you need to abort interactive jobs with, e.g., Ctrl+\
(QUIT
) or sending the KILL
signal.
The TSTP
(terminal stop) command is sent interactively from Ctrl+Z
to temporarily send a process to sleep (until sent to the background with commands such as bg
or continued with fg
), overwriting it would thus disable that functionality.
The signals KILL
and STOP
cannot be used.
The FPE
and ILL
signals should not be overwritten in WarpX, as they are controlled by AMReX for debug workflows that catch invalid floating-point operations.
Tip
For example, the following logic can be added to Slurm batch scripts (signal name to number mapping here) to gracefully shut down 6 min prior to walltime.
If you have a checkpoint diagnostic in your inputs file, this will automatically write a checkpoint due to the default <diag_name>.dump_last_timestep = 1
option in WarpX.
#SBATCH --signal=1@360
srun ... \
warpx.break_signals=HUP \
> output.txt
For LSF batch systems, the equivalent job script lines are:
#BSUB -wa 'HUP' -wt '6'
jsrun ... \
warpx.break_signals=HUP \
> output.txt
Setting up the field mesh
amr.n_cell
(2 integers in 2D, 3 integers in 3D)The number of grid points along each direction (on the coarsest level)
amr.max_level
(integer, default:0
)When using mesh refinement, the number of refinement levels that will be used.
Use 0 in order to disable mesh refinement. Note: currently,
0
and1
are supported.
amr.ref_ratio
(integer per refined level, default:2
)When using mesh refinement, this is the refinement ratio per level. With this option, all directions are refined by the same ratio.
amr.ref_ratio_vect
(3 integers for x,y,z per refined level)When using mesh refinement, this can be used to set the refinement ratio per direction and level, relative to the previous level.
Example: for three levels, a value of
2 2 4 8 8 16
refines the first level by 2-fold in x and y and 4-fold in z compared to the coarsest level (level 0/mother grid); compared to the first level, the second level is refined 8-fold in x and y and 16-fold in z.
geometry.dims
(string)The dimensions of the simulation geometry. Supported values are
1
,2
,3
,RZ
. For3
, a cartesian geometry ofx
,y
,z
is modeled. For2
, the axes arex
andz
and all physics iny
is assumed to be translation symmetric. For1
, the only axis isz
and the dimensionsx
andy
are translation symmetric. ForRZ
, we apply an azimuthal mode decomposition, withwarpx.n_rz_azimuthal_modes
providing further control.Note that this value has to match the WarpX_DIMS compile-time option. If you installed WarpX from a package manager, then pick the right executable by name.
warpx.n_rz_azimuthal_modes
(integer; 1 by default)When using the RZ version, this is the number of azimuthal modes. The default is
1
, which corresponds to a perfectly axisymmetric simulation.
geometry.prob_lo
andgeometry.prob_hi
(2 floats in 2D, 3 floats in 3D; in meters)The extent of the full simulation box. This box is rectangular, and thus its extent is given here by the coordinates of the lower corner (
geometry.prob_lo
) and upper corner (geometry.prob_hi
). The first axis of the coordinates is x (or r with cylindrical) and the last is z.
warpx.do_moving_window
(integer; 0 by default)Whether to use a moving window for the simulation
warpx.moving_window_dir
(eitherx
,y
orz
)The direction of the moving window.
warpx.moving_window_v
(float)The speed of moving window, in units of the speed of light (i.e. use
1.0
for a moving window that moves exactly at the speed of light)
warpx.start_moving_window_step
(integer; 0 by default)The timestep at which the moving window starts.
warpx.end_moving_window_step
(integer; default is-1
for false)The timestep at which the moving window ends.
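For example, a 3D box with a moving window advancing along z at the speed of light could combine the parameters above as follows (a sketch; values are illustrative):
geometry.dims = 3
amr.n_cell = 64 64 128
geometry.prob_lo = -30.e-6 -30.e-6 -60.e-6
geometry.prob_hi =  30.e-6  30.e-6   0.e-6
warpx.do_moving_window = 1
warpx.moving_window_dir = z
warpx.moving_window_v = 1.0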
warpx.fine_tag_lo
andwarpx.fine_tag_hi
(2 floats in 2D, 3 floats in 3D; in meters) optionalWhen using static mesh refinement with 1 level, the extent of the refined patch. This patch is rectangular, and thus its extent is given here by the coordinates of the lower corner (
warpx.fine_tag_lo
) and upper corner (warpx.fine_tag_hi
).
warpx.ref_patch_function(x,y,z)
(string) optionalA function of x, y, z that defines the extent of the refined patch when using static mesh refinement with
amr.max_level
>0. Note that the function can be used to define distinct regions for refinement; however, the refined regions should be chosen such that the PML layers surrounding the patches do not overlap. For this reason, when defining distinct patches, please ensure that they are sufficiently separated.
warpx.refine_plasma
(integer) optional (default 0)Increase the number of macro-particles that are injected “ahead” of a mesh refinement patch in a moving window simulation.
Note: in development; only works with static mesh-refinement, specific to moving window plasma injection, and requires a single refined level.
warpx.n_current_deposition_buffer
(integer)When using mesh refinement: the particles that are located inside a refinement patch, but within
n_current_deposition_buffer
cells of the edge of this patch, will deposit their charge and current to the lower refinement level, instead of depositing to the refinement patch itself. See the mesh-refinement section for more details. If this variable is not explicitly set in the input script,n_current_deposition_buffer
is automatically set so as to be large enough to hold the particle shape, on the fine grid
warpx.n_field_gather_buffer
(integer, optional)Default:
warpx.n_field_gather_buffer = n_current_deposition_buffer + 1
(one cell larger thann_current_deposition_buffer
on the fine grid).When using mesh refinement, particles that are located inside a refinement patch, but within
n_field_gather_buffer
cells of the edge of the patch, gather the fields from the lower refinement level, instead of gathering the fields from the refinement patch itself. This avoids some of the spurious effects that can occur inside the refinement patch, close to its edge. See the section Mesh refinement for more details.
warpx.do_single_precision_comms
(integer; 0 by default)Perform MPI communications for field guard regions in single precision. Only meaningful for
WarpX_PRECISION=DOUBLE
.
particles.deposit_on_main_grid
(list of strings)When using mesh refinement: the particle species whose names are included in the list will deposit their charge/current directly on the main grid (i.e. the coarsest level), even if they are inside a refinement patch.
particles.gather_from_main_grid
(list of strings)When using mesh refinement: the particle species whose names are included in the list will gather their fields from the main grid (i.e. the coarsest level), even if they are inside a refinement patch.
Domain Boundary Conditions
boundary.field_lo
andboundary.field_hi
(2 strings for 2D, 3 strings for 3D, pml by default)Boundary conditions applied to fields at the lower and upper domain boundaries. Options are:
Periodic
: This option can be used to set periodic domain boundaries. Note that if the fields for lo in a certain dimension are set to periodic, then the corresponding upper boundary must also be set to periodic. If particle boundaries are not specified in the input file, then the particle boundaries will by default be set to periodic. If particle boundaries are specified, then they must be set to periodic in the directions with periodic field boundaries.pml
(default): This option can be used to add Perfectly Matched Layers (PML) around the simulation domain. See the PML theory section for more details. Additional pml algorithms can be explored using the parameterswarpx.do_pml_in_domain
,warpx.pml_has_particles
, andwarpx.do_pml_j_damping
.absorbing_silver_mueller
: This option can be used to set the Silver-Mueller absorbing boundary conditions. These boundary conditions are simpler and less computationally expensive than the pml, but are also less effective at absorbing the field. They only work with the Yee Maxwell solver.damped
: This is the recommended option in the moving direction when using the spectral solver with moving window (currently only supported along z). This boundary condition applies a damping factor to the electric and magnetic fields in the outer half of the guard cells, using a sine squared profile. As the spectral solver is by nature periodic, the damping prevents fields from wrapping around to the other end of the domain when the periodicity is not desired. This boundary condition is only valid when using the spectral solver.pec
: This option can be used to set a Perfect Electric Conductor at the simulation boundary. Please see the PEC theory section for more details. Note that PEC boundary is invalid at r=0 for the RZ solver. Please usenone
option. This boundary condition does not work with the spectral solver.none
: No boundary condition is applied to the fields with the electromagnetic solver. This option must be used for the RZ-solver at r=0.neumann
: For the electrostatic multigrid solver, a Neumann boundary condition (with gradient of the potential equal to 0) will be applied on the specified boundary.open
: For the electrostatic Poisson solver based on a Integrated Green Function method.
boundary.potential_lo_x/y/z
andboundary.potential_hi_x/y/z
(default 0)Gives the value of the electric potential at the boundaries, for
pec
boundaries. With electrostatic solvers (i.e., withwarpx.do_electrostatic = ...
), this is used in order to compute the potential in the simulation volume at each timestep. When using other solvers (e.g. Maxwell solver), setting these variables will trigger an electrostatic solve att=0
, to compute the initial electric field produced by the boundaries.
boundary.particle_lo and boundary.particle_hi (2 strings for 2D, 3 strings for 3D, absorbing by default)
Options are:
Absorbing: Particles leaving the boundary will be deleted.
Periodic: Particles leaving the boundary will re-enter from the opposite boundary. The field boundary condition must be consistently set to periodic and both lower and upper boundaries must be periodic.
Reflecting: Particles leaving the boundary are reflected from the boundary back into the domain. When boundary.reflect_all_velocities is false, only the sign of the normal velocity is changed; otherwise the sign of all velocity components is changed.
Thermal: Particles leaving the boundary are reflected from the boundary back into the domain and their velocities are thermalized. The tangential velocity components are sampled from a gaussian distribution and the component normal to the boundary is sampled from a gaussian flux distribution. The standard deviation for these distributions should be provided for each species using boundary.<species>.u_th. The same standard deviation is used to sample all components.
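As a hedged illustration (a fragment, not a complete input deck; "electrons" is a placeholder species name and the lowercase spelling of the option values is assumed):

boundary.particle_lo = absorbing absorbing thermal
boundary.particle_hi = absorbing absorbing thermal
boundary.electrons.u_th = 0.01

Here particles are deleted at the x and y boundaries and re-emitted thermally at the z boundaries.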
boundary.reflect_all_velocities
(bool) optional (default false)For a reflecting boundary condition, this flags whether the sign of only the normal velocity is changed or all velocities.
boundary.verboncoeur_axis_correction
(bool) optional (default true)Whether to apply the Verboncoeur correction on the charge and current density on axis when using RZ. For nodal values (rho and Jz), the cell volume for values on axis is calculated as \(\pi*\Delta r^2/4\). In Verboncoeur [2], it is shown that using \(\pi*\Delta r^2/3\) instead will give a uniform density if the particle density is uniform.
Additional PML parameters
warpx.pml_ncell
(int; default: 10)The depth of the PML, in number of cells.
warpx.do_similar_dm_pml
(int; default: 1) Whether or not to use an amrex::DistributionMapping for the PML grids that is similar to the mother grids, meaning that the mapping will be computed to minimize the communication costs between the PML and the mother grids.
warpx.pml_delta
(int; default: 10) The characteristic depth, in number of cells, over which the absorption coefficients of the PML increase.
warpx.do_pml_in_domain
(int; default: 0) Whether to create the PML inside the simulation area or outside. If inside, it allows the user to propagate particles in the PML and to use extended PML.
warpx.pml_has_particles
(int; default: 0)Whether to propagate particles in PML or not. Can only be done if PML are in simulation domain, i.e. if warpx.do_pml_in_domain = 1.
warpx.do_pml_j_damping
(int; default: 0)Whether to damp current in PML. Can only be used if particles are propagated in PML, i.e. if warpx.pml_has_particles = 1.
warpx.v_particle_pml
(float; default: 1)When
warpx.do_pml_j_damping = 1
, the assumed velocity of the particles to be absorbed in the PML, in units of the speed of light c.
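A hedged sketch of how these PML options fit together in an input deck (the values are illustrative; boundary.field_lo/hi are the field boundary parameters discussed above):

boundary.field_lo = pml pml pml
boundary.field_hi = pml pml pml
warpx.pml_ncell = 20
warpx.do_pml_in_domain = 1
warpx.pml_has_particles = 1
warpx.do_pml_j_damping = 1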
warpx.do_pml_dive_cleaning
(bool)Whether to use divergence cleaning for E in the PML region. The value must match
warpx.do_pml_divb_cleaning
(either both false or both true). This option seems to be necessary in order to avoid strong Nyquist instabilities in 3D simulations with the PSATD solver, open boundary conditions and PML in all directions. 2D simulations and 3D simulations with open boundary conditions and PML only in one direction might run well even without divergence cleaning. This option is implemented only for the Cartesian PSATD solver; it is turned on by default in this case.
warpx.do_pml_divb_cleaning
(bool)Whether to use divergence cleaning for B in the PML region. The value must match
warpx.do_pml_dive_cleaning
(either both false or both true). This option seems to be necessary in order to avoid strong Nyquist instabilities in 3D simulations with the PSATD solver, open boundary conditions and PML in all directions. 2D simulations and 3D simulations with open boundary conditions and PML only in one direction might run well even without divergence cleaning. This option is implemented only for the Cartesian PSATD solver; it is turned on by default in this case.
Embedded Boundary Conditions
warpx.eb_implicit_function
(string) A function of x, y, z that defines the surface of the embedded boundary. That surface lies where the function value is 0; the physics simulation area is where the function value is negative; the interior of the embedded boundary is where the function value is positive.
warpx.eb_potential(x,y,z,t)
(string)Gives the value of the electric potential at the surface of the embedded boundary, as a function of x, y, z and t. With electrostatic solvers (i.e., with
warpx.do_electrostatic = ...
), this is used in order to compute the potential in the simulation volume at each timestep. When using other solvers (e.g. Maxwell solver), setting this variable will trigger an electrostatic solve at t=0, to compute the initial electric field produced by the boundaries. Note that this function is also evaluated inside the embedded boundary. For this reason, it is important to define this function in such a way that it is constant inside the embedded boundary.
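As an example, a grounded cylindrical channel of radius R aligned with z could be sketched as below (hedged; R is a user-defined constant introduced only for this illustration, and the sign convention follows the description of warpx.eb_implicit_function above):

my_constants.R = 10.e-6
warpx.eb_implicit_function = "(x**2+y**2-R**2)"
warpx.eb_potential(x,y,z,t) = "0."

The function is negative inside the channel (the physics region) and positive outside of it (inside the embedded boundary).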
Distribution across MPI ranks and parallelization
warpx.numprocs
(2 ints for 2D, 3 ints for 3D) optional (default none)This optional parameter can be used to control the domain decomposition on the coarsest level. The domain will be chopped into the exact number of pieces in each dimension as specified by this parameter. If it’s not specified, the domain decomposition will be determined by the parameters that will be discussed below. If specified, the product of the numbers must be equal to the number of MPI processes.
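For example, with 16 MPI ranks one might write the following hedged sketch (the product of the three integers must equal the number of ranks):

warpx.numprocs = 2 2 4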
amr.max_grid_size
(integer) optional (default 128)Maximum allowable size of each subdomain (expressed in number of grid points, in each direction). Each subdomain has its own ghost cells, and can be handled by a different MPI rank ; several OpenMP threads can work simultaneously on the same subdomain.
If
max_grid_size
is such that the total number of subdomains is larger than the number of MPI ranks used, then some MPI ranks will handle several subdomains, thereby providing additional flexibility for load balancing. When using mesh refinement, this number applies to the subdomains of the coarsest level, but also to any of the finer levels.
algo.load_balance_intervals
(string) optional (default 0)Using the Intervals parser syntax, this string defines the timesteps at which WarpX should try to redistribute the work across MPI ranks, in order to have better load balancing. Use 0 to disable load_balancing.
When performing load balancing, WarpX measures the wall time for computational parts of the PIC cycle. It then uses this data to decide how to redistribute the subdomains across MPI ranks. (Each subdomain is unchanged, but its owner is changed in order to have better performance.) This relies on each MPI rank handling several (in fact many) subdomains (see
max_grid_size
).
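A minimal load-balancing setup might look like the hedged fragment below (the interval and grid size are illustrative and should be tuned to the problem; algo.load_balance_with_sfc, described further below, could additionally be set to 1 to use the space-filling-curve strategy):

amr.max_grid_size = 64
algo.load_balance_intervals = 100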
algo.load_balance_efficiency_ratio_threshold
(float) optional (default 1.1) Controls whether to adopt a proposed distribution mapping computed during a load balance. If the ratio of the proposed to current distribution mapping efficiency (i.e., average cost per MPI process; efficiency is a number in the range [0, 1]) is greater than the threshold value, the proposed distribution mapping is adopted. The suggested range of values is algo.load_balance_efficiency_ratio_threshold >= 1, which ensures that the new distribution mapping is adopted only if doing so would improve the load balance efficiency. The higher the threshold value, the more conservative is the criterion for adoption of a proposed distribution; for example, with algo.load_balance_efficiency_ratio_threshold = 1, the proposed distribution is adopted any time it improves load balancing; if instead algo.load_balance_efficiency_ratio_threshold = 2, the proposed distribution is adopted only if doing so would yield a 100% improvement in the load balance efficiency (with this threshold value, if the current efficiency is 0.45, the new distribution would only be adopted if the proposed efficiency were greater than 0.9).
algo.load_balance_with_sfc
(0 or 1) optional (default 0)If this is 1: use a Space-Filling Curve (SFC) algorithm in order to perform load-balancing of the simulation. If this is 0: the Knapsack algorithm is used instead.
algo.load_balance_knapsack_factor
(float) optional (default 1.24)Controls the maximum number of boxes that can be assigned to a rank during load balance when using the ‘knapsack’ policy for update of the distribution mapping; the maximum is load_balance_knapsack_factor*(average number of boxes per rank). For example, if there are 4 boxes per rank and load_balance_knapsack_factor=2, no more than 8 boxes can be assigned to any rank.
algo.load_balance_costs_update
(heuristic or timers) optional (default timers)
If this is heuristic: load balance costs are updated according to a measure of particles and cells assigned to each box of the domain. The cost \(c\) is computed as
\[c = n_{\text{particle}} \cdot w_{\text{particle}} + n_{\text{cell}} \cdot w_{\text{cell}},\]
where \(n_{\text{particle}}\) is the number of particles on the box, \(w_{\text{particle}}\) is the particle cost weight factor (controlled by algo.costs_heuristic_particles_wt), \(n_{\text{cell}}\) is the number of cells on the box, and \(w_{\text{cell}}\) is the cell cost weight factor (controlled by algo.costs_heuristic_cells_wt).
If this is timers: costs are updated according to in-code timers.
algo.costs_heuristic_particles_wt
(float) optional
Particle weight factor used in the Heuristic strategy for costs update; if running on GPU, the particle weight is set to a value determined from single-GPU tests on Summit, depending on the choice of solver (FDTD or PSATD) and order of the particle shape. If running on CPU, the default value is 0.9. If running on GPU, the default value is:

Solver / particle shape factor | 1 | 2 | 3
FDTD/CKC | 0.599 | 0.732 | 0.855
PSATD | 0.425 | 0.595 | 0.75
algo.costs_heuristic_cells_wt
(float) optional
Cell weight factor used in the Heuristic strategy for costs update; if running on GPU, the cell weight is set to a value determined from single-GPU tests on Summit, depending on the choice of solver (FDTD or PSATD) and order of the particle shape. If running on CPU, the default value is 0.1. If running on GPU, the default value is:

Solver / particle shape factor | 1 | 2 | 3
FDTD/CKC | 0.401 | 0.268 | 0.145
PSATD | 0.575 | 0.405 | 0.25
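To select the heuristic cost model explicitly and override the weights, one might write the hedged fragment below (the values shown are simply the CPU defaults mentioned above):

algo.load_balance_costs_update = heuristic
algo.costs_heuristic_particles_wt = 0.9
algo.costs_heuristic_cells_wt = 0.1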
warpx.do_dynamic_scheduling
(0 or 1) optional (default 1)Whether to activate OpenMP dynamic scheduling.
Math parser and user-defined constants
WarpX uses AMReX’s math parser that reads expressions in the input file. It can be used in all input parameters that consist of one or more integers or floats. Integer input expecting boolean, 0 or 1, are not parsed. Note that when multiple values are expected, the expressions are space delimited. For integer input values, the expressions are evaluated as real numbers and the final result rounded to the nearest integer. See this section of the AMReX documentation for a complete list of functions supported by the math parser.
WarpX constants
WarpX provides a few pre-defined constants, that can be used for any parameter that consists of one or more floats.
q_e | elementary charge
m_e | electron mass
m_p | proton mass
m_u | unified atomic mass unit (Dalton)
epsilon0 | vacuum permittivity
mu0 | vacuum permeability
clight | speed of light
kb | Boltzmann’s constant (J/K)
pi | math constant pi
See Source/Utils/WarpXConst.H
for the values.
User-defined constants
Users can define their own constants in the input file.
These constants can be used for any parameter that consists of one or more integers or floats.
User-defined constant names can contain only letters, numbers and the character _
.
The name of each constant has to begin with a letter. The following names are used
by WarpX, and cannot be used as user-defined constants: x
, y
, z
, X
, Y
, t
.
The values of the constants can include the predefined WarpX constants listed above as well as other user-defined constants.
For example:
my_constants.a0 = 3.0
my_constants.z_plateau = 150.e-6
my_constants.n0 = 1.e22
my_constants.wp = sqrt(n0*q_e**2/(epsilon0*m_e))
Coordinates
Besides, for profiles that depend on spatial coordinates (the plasma momentum distribution or the laser field, see below Particle initialization and Laser initialization), the parser will interpret some variables as spatial coordinates. These are specified in the input parameter, i.e., density_function(x,y,z)
and field_function(X,Y,t)
.
The parser reads python-style expressions between double quotes, for instance
"a0*x**2 * (1-y*1.e2) * (x>0)"
is a valid expression where a0
is a
user-defined constant (see above) and x
and y
are spatial coordinates. The names are case sensitive. The factor
(x>0)
is 1
where x>0
and 0
where x<=0
. It allows the user to
define functions by intervals.
Alternatively the expression above can be written as if(x>0, a0*x**2 * (1-y*1.e2), 0)
.
Particle initialization
particles.species_names
(strings, separated by spaces)The name of each species. This is then used in the rest of the input deck ; in this documentation we use <species_name> as a placeholder.
particles.photon_species
(strings, separated by spaces)List of species that are photon species, if any. This is required when compiling with QED=TRUE.
particles.use_fdtd_nci_corr
(0 or 1) optional (default 0)Whether to activate the FDTD Numerical Cherenkov Instability corrector. Not currently available in the RZ configuration.
particles.rigid_injected_species
(strings, separated by spaces)
List of species injected using the rigid injection method. The rigid injection method is useful when injecting a relativistic particle beam in boosted-frame simulations; see the input-output section for more details. For species injected using this method, particles are translated along the +z axis with constant velocity as long as their z coordinate verifies z<zinject_plane. When z>zinject_plane, particles are pushed in a standard way, using the specified pusher (see the parameter <species_name>.zinject_plane below).
particles.do_tiling
(bool) optional (default false if WarpX is compiled for GPUs, true otherwise)Controls whether tiling (‘cache blocking’) transformation is used for particles. Tiling should be on when using OpenMP and off when using GPUs.
<species_name>.species_type
(string) optional (default unspecified)
Type of physical species. Currently, the accepted species are "electron", "positron", "muon", "antimuon", "photon", "neutron", "proton", "alpha", "hydrogen1" (a.k.a. "protium"), "hydrogen2" (a.k.a. "deuterium"), "hydrogen3" (a.k.a. "tritium"), "helium", "helium3", "helium4", "lithium", "lithium6", "lithium7", "beryllium", "beryllium9", "boron", "boron10", "boron11", "carbon", "carbon12", "carbon13", "carbon14", "nitrogen", "nitrogen14", "nitrogen15", "oxygen", "oxygen16", "oxygen17", "oxygen18", "fluorine", "fluorine19", "neon", "neon20", "neon21", "neon22", "aluminium", "argon", "copper", "xenon" and "gold". The difference between "proton" and "hydrogen1" is that the mass of the latter also includes the mass of the bound electron (same for "alpha" and "helium4"). When only the name of an element is specified, the mass is a weighted average of the masses of the stable isotopes. For all the elements with Z < 11 we also provide the stable isotopes as an option for species_type (e.g., "helium3" and "helium4"). Either species_type or both mass and charge have to be specified.
<species_name>.charge
(float) optional (default NaN)
The charge of one physical particle of this species. If species_type is specified, the charge will be set to the physical value and charge is optional. When <species>.do_field_ionization = 1, the physical particle charge is equal to ionization_initial_level * charge, so the latter parameter should be equal to q_e (which is defined in WarpX as the elementary charge in coulombs).
<species_name>.mass
(float) optional (default NaN)The mass of one physical particle of this species. If
species_type
is specified, the mass will be set to the physical value andmass
is optional.
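A hedged fragment showing the two ways of specifying the physical properties of a species ("electrons" and "ions" are placeholder names; the use of the predefined constants q_e and m_u in the expressions relies on the math parser described above):

particles.species_names = electrons ions
electrons.species_type = electron
ions.charge = q_e
ions.mass = 2.0141*m_u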
<species_name>.xmin,ymin,zmin
and<species_name>.xmax,ymax,zmax
(float) optional (default unlimited)
When <species_name>.xmin and <species_name>.xmax are set, they delimit the region within which particles are injected. If periodic boundary conditions are used in direction i, then the default (i.e. if the range is not specified) range will be the simulation box, [geometry.prob_lo[i], geometry.prob_hi[i]].
<species_name>.injection_sources
(list of strings
) optionalNames of additional injection sources. By default, WarpX assumes one injection source per species, hence all of the input parameters below describing the injection are parameters directly of the species. However, this option allows additional sources, the names of which are specified here. For each source, the name of the source is added to the input parameters below. For instance, with
<species_name>.injection_sources = source1 source2
there can be the two input parameters<species_name>.source1.injection_style
and<species_name>.source2.injection_style
. For the parameters of each source, the parameter with the name of the source will be used. If it is not given, the value of the parameter without the source name will be used. This allows parameters used for all sources to be specified once. For example, if thesource1
andsource2
have the same value ofuz_m
, then it can be set using<species_name>.uz_m
instead of setting it for each source. Note that since by default<species_name>.injection_style = none
, all injection sources can be input this way. Note that if a moving window is used, the bulk velocity of all of the sources must be the same since it is used when updating the window.
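For instance, two sources sharing the same bulk momentum could be sketched as below (a hedged fragment, not a complete species definition; "electrons", "source1" and "source2" are placeholder names):

electrons.injection_sources = source1 source2
electrons.uz_m = 0.5
electrons.source1.injection_style = NRandomPerCell
electrons.source1.num_particles_per_cell = 4
electrons.source2.injection_style = NRandomPerCell
electrons.source2.num_particles_per_cell = 2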
<species_name>.injection_style
(string; default:none
)Determines how the (macro-)particles will be injected in the simulation. The number of particles per cell is always given with respect to the coarsest level (level 0/mother grid), even if particles are immediately assigned to a refined patch.
The options are listed below (a sketch of an example input follows the list):
NUniformPerCell
: injection with a fixed number of evenly-spaced particles per cell. This requires the additional parameter<species_name>.num_particles_per_cell_each_dim
.NRandomPerCell
: injection with a fixed number of randomly-distributed particles per cell. This requires the additional parameter<species_name>.num_particles_per_cell
.SingleParticle
: Inject a single macroparticle. This requires the additional parameters:<species_name>.single_particle_pos
(3 doubles, particle 3D position [meter])<species_name>.single_particle_u
(3 doubles, particle 3D normalized momentum, i.e. \(\gamma \beta\))<species_name>.single_particle_weight
( double, macroparticle weight, i.e. number of physical particles it represents)
MultipleParticles
: Inject multiple macroparticles. This requires the additional parameters:<species_name>.multiple_particles_pos_x
(list of doubles, X positions of the particles [meter])<species_name>.multiple_particles_pos_y
(list of doubles, Y positions of the particles [meter])<species_name>.multiple_particles_pos_z
(list of doubles, Z positions of the particles [meter])<species_name>.multiple_particles_ux
(list of doubles, X normalized momenta of the particles, i.e. \(\gamma \beta_x\))<species_name>.multiple_particles_uy
(list of doubles, Y normalized momenta of the particles, i.e. \(\gamma \beta_y\))<species_name>.multiple_particles_uz
(list of doubles, Z normalized momenta of the particles, i.e. \(\gamma \beta_z\))<species_name>.multiple_particles_weight
(list of doubles, macroparticle weights, i.e. number of physical particles each represents)
gaussian_beam
: Inject particle beam with gaussian distribution in space in all directions. This requires additional parameters:<species_name>.q_tot
(beam charge),<species_name>.npart
(number of macroparticles in the beam),<species_name>.x/y/z_m
(average position in x/y/z),<species_name>.x/y/z_rms
(standard deviation in x/y/z),
There are additional optional parameters:
<species_name>.x/y/z_cut
(optional, particles withabs(x-x_m) > x_cut*x_rms
are not injected, same for y and z.<species_name>.q_tot
is the charge of the un-cut beam, so that cutting the distribution is likely to result in a lower total charge),<species_name>.do_symmetrize
(optional, whether to symmetrize the beam)<species_name>.symmetrization_order
(order of symmetrization, default is 4, can be 4 or 8).
If
<species_name>.do_symmetrize
is 0, no symmetrization occurs. If<species_name>.do_symmetrize
is 1, then the beam is symmetrized according to the value of<species_name>.symmetrization_order
. If set to 4, symmetrization is in the x and y direction, (x,y) (-x,y) (x,-y) (-x,-y). If set to 8, symmetrization is also done with x and y exchanged, (y,x), (-y,x), (y,-x), (-y,-x)).<species_name>.focal_distance
(optional, distance between the beam centroid and the position of the focal plane of the beam, along the direction of the beam mean velocity; space charge is ignored in the initialization of the particles)
If
<species_name>.focal_distance
is specified,x_rms
,y_rms
andz_rms
are the sizes of the beam in the focal plane. Since the beam is not necessarily initialized close to its focal plane, the initial size of the beam will differ fromx_rms
,y_rms
,z_rms
. Usually, in accelerator physics the operative quantities are the normalized emittances \(\epsilon_{x,y}\) and beta functions \(\beta_{x,y}\). We assume that the beam travels along \(z\) and we mark the quantities evaluated at the focal plane with a \(*\). Therefore, the normalized transverse emittances and beta functions are related to the focal distance \(f = z - z^*\), the beam sizes \(\sigma_{x,y}\) (which in the code are x_rms, y_rms), the beam relativistic Lorentz factor \(\gamma\), and the normalized momentum spread \(\Delta u_{x,y}\) according to the equations below (Wiedemann [3]).
\[ \begin{align}\begin{aligned}\Delta u_{x,y} &= \frac{\epsilon^*_{x,y}}{\sigma^*_{x,y}},\\\sigma^*_{x,y} &= \sqrt{ \frac{ \epsilon^*_{x,y} \beta^*_{x,y} }{\gamma}},\\\sigma_{x,y}(z) &= \sigma^*_{x,y} \sqrt{1 + \left( \frac{z - z^*}{\beta^*_{x,y}} \right)^2}\end{aligned}\end{align} \]
external_file
: Inject macroparticles with properties (mass, charge, position, and momentum - \(\gamma \beta m c\)) read from an external openPMD file. With it users can specify the additional arguments:<species_name>.injection_file
(string) openPMD file name and<species_name>.charge
(double) optional (default is read from openPMD file) when set this will be the charge of the physical particle represented by the injected macroparticles.<species_name>.mass
(double) optional (default is read from openPMD file) when set this will be the mass of the physical particle represented by the injected macroparticles.<species_name>.z_shift
(double) optional (default is no shift) when set this value will be added to the longitudinal,z
, position of the particles.<species_name>.impose_t_lab_from_file
(bool) optional (default is false) only read if warpx.gamma_boost > 1., it allows to set t_lab for the Lorentz Transform as being the time stored in the openPMD file.
Warning:
q_tot!=0
is not supported with theexternal_file
injection style. If a value is provided, it is ignored and no re-scaling is done. The external file must include the speciesopenPMD::Record
labeledposition
andmomentum
(double arrays), with dimensionality and units set viaopenPMD::setUnitDimension
andsetUnitSI
. If the external file also containsopenPMD::Records
formass
andcharge
(constant double scalars) then the species will use these, unless overwritten in the input file (see<species_name>.mass
,<species_name>.charge
or<species_name>.species_type
). Theexternal_file
option is currently implemented for 2D, 3D and RZ geometries, with record components in the cartesian coordinates(x,y,z)
for 3D and RZ, and(x,z)
for 2D. For more information on the openPMD format and how to build WarpX with it, please visit the install section.NFluxPerCell
: Continuously inject a flux of macroparticles from a planar surface. This requires the additional parameters:<species_name>.flux_profile
(see the description of this parameter further below)<species_name>.surface_flux_pos
(double, location of the injection plane [meter])<species_name>.flux_normal_axis
(x, y, or z for 3D, x or z for 2D, or r, t, or z for RZ. When flux_normal_axis is r or t, the x and y components of the user-specified momentum distribution are interpreted as the r and t components respectively)<species_name>.flux_direction
(-1 or +1, direction of flux relative to the plane)<species_name>.num_particles_per_cell
(double)<species_name>.flux_tmin
(double, Optional time at which the flux will be turned on. Ignored when negative.)<species_name>.flux_tmax
(double, Optional time at which the flux will be turned off. Ignored when negative.)
none
: Do not inject macro-particles (for example, in a simulation that starts with neutral, ionizable atoms, one may want to create the electrons species – where ionized electrons can be stored later on – without injecting electron macro-particles).
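As an example of the injection styles above, a Gaussian beam could be set up with the hedged fragment below ("beam" is a placeholder species name and the numbers are illustrative only):

beam.injection_style = gaussian_beam
beam.x_m = 0.
beam.y_m = 0.
beam.z_m = -20.e-6
beam.x_rms = 2.e-6
beam.y_rms = 2.e-6
beam.z_rms = 5.e-6
beam.npart = 100000
beam.q_tot = -1.e-12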
<species_name>.num_particles_per_cell_each_dim
(3 integers in 3D and RZ, 2 integers in 2D) With the NUniformPerCell injection style, this specifies the number of particles along each axis within a cell. Note that for RZ, the three axes are radius, theta, and z, and that the recommended number of particles per theta is at least two times the number of azimuthal modes requested. (It is recommended to do a convergence scan of the number of particles per theta.)
<species_name>.random_theta
(bool) optional (default 1)When using RZ geometry, whether to randomize the azimuthal position of particles. This is used when
<species_name>.injection_style = NUniformPerCell
.
<species_name>.do_splitting
(bool) optional (default 0)Split particles of the species when crossing the boundary from a lower resolution domain to a higher resolution domain.
Currently implemented on CPU only.
<species_name>.do_continuous_injection
(0 or 1)Whether to inject particles during the simulation, and not only at initialization. This can be required with a moving window and/or when running in a boosted frame.
<species_name>.initialize_self_fields
(0 or 1)Whether to calculate the space-charge fields associated with this species at the beginning of the simulation. The fields are calculated for the mean gamma of the species.
<species_name>.self_fields_required_precision
(float, default: 1.e-11)The relative precision with which the initial space-charge fields should be calculated. More specifically, the initial space-charge fields are computed with an iterative Multi-Level Multi-Grid (MLMG) solver. For highly-relativistic beams, this solver can fail to reach the default precision within a reasonable time ; in that case, users can set a relaxed precision requirement through
self_fields_required_precision
.
<species_name>.self_fields_absolute_tolerance
(float, default: 0.0)The absolute tolerance with which the space-charge fields should be calculated in units of \(\mathrm{V/m}^2\). More specifically, the acceptable residual with which the solution can be considered converged. In general this should be left as the default, but in cases where the simulation state changes very little between steps it can occur that the initial guess for the MLMG solver is so close to the converged value that it fails to improve that solution sufficiently to reach the
self_fields_required_precision
value.
<species_name>.self_fields_max_iters
(integer, default: 200) Maximum number of iterations used by the MLMG solver for the initial space-charge field calculation. If MLMG converges but fails to reach the desired
self_fields_required_precision
, this parameter may be increased.
<species_name>.profile
(string)Density profile for this species. The options are:
constant
: Constant density profile within the box, or between<species_name>.xmin
and<species_name>.xmax
(and same in all directions). This requires additional parameter<species_name>.density
. i.e., the plasma density in \(m^{-3}\).predefined
: Predefined density profile. This requires additional parameters<species_name>.predefined_profile_name
and<species_name>.predefined_profile_params
. Currently, only a parabolic channel density profile is implemented.parse_density_function
: the density is given by a function in the input file. It requires additional argument<species_name>.density_function(x,y,z)
, which is a mathematical expression for the density of the species, e.g.electrons.density_function(x,y,z) = "n0+n0*x**2*1.e12"
wheren0
is a user-defined constant, see above. WARNING: wheredensity_function(x,y,z)
is close to zero, particles will still be injected betweenxmin
andxmax
etc., with a null weight. This is undesirable because it results in useless computing. To avoid this, see optiondensity_min
below.
<species_name>.flux_profile
(string)Defines the expression of the flux, when using
<species_name>.injection_style=NFluxPerCell
constant
: Constant flux. This requires the additional parameter<species_name>.flux
. i.e., the injection flux in \(m^{-2}.s^{-1}\).parse_flux_function
: the flux is given by a function in the input file. It requires the additional argument<species_name>.flux_function(x,y,z,t)
, which is a mathematical expression for the flux of the species.
<species_name>.density_min
(float) optional (default 0.)Minimum plasma density. No particle is injected where the density is below this value.
<species_name>.density_max
(float) optional (default infinity)Maximum plasma density. The density at each point is the minimum between the value given in the profile, and density_max.
<species_name>.radially_weighted
(bool) optional (default true) Whether particles' weights are varied with their radius. This only applies to cylindrical geometry. The only valid value is true.
<species_name>.momentum_distribution_type
(string) Distribution of the normalized momentum (u=p/mc) for this species. The options are listed below (a sketch of an example input follows the list):
at_rest
: Particles are initialized with zero momentum.constant
: constant momentum profile. This can be controlled with the additional parameters<species_name>.ux
,<species_name>.uy
and<species_name>.uz
, the normalized momenta in the x, y and z direction respectively, which are all0.
by default.uniform
: uniform probability distribution between a minimum and a maximum value. The x, y and z directions are sampled independently and the final momentum space is a cuboid. The parameters that control the minimum and maximum domain of the distribution are<species_name>.u<x,y,z>_min
and<species_name>.u<x,y,z>_max
in each direction respectively (e.g.,<species_name>.uz_min = 0.2
and<species_name>.uz_max = 0.4
to control the generation along thez
direction). All the parameters default to0
.gaussian
: gaussian momentum distribution in all 3 directions. This can be controlled with the additional arguments for the average momenta along each direction<species_name>.ux_m
,<species_name>.uy_m
and<species_name>.uz_m
as well as standard deviations along each direction<species_name>.ux_th
,<species_name>.uy_th
and<species_name>.uz_th
. These 6 parameters are all0.
by default.gaussianflux
: Gaussian momentum flux distribution, which is Gaussian in the plane and v*Gaussian normal to the plane. It can only be used wheninjection_style = NFluxPerCell
. This can be controlled with the additional arguments to specify the plane’s orientation,<species_name>.flux_normal_axis
and<species_name>.flux_direction
, for the average momenta along each direction<species_name>.ux_m
,<species_name>.uy_m
and<species_name>.uz_m
, as well as standard deviations along each direction<species_name>.ux_th
,<species_name>.uy_th
and<species_name>.uz_th
.ux_m
,uy_m
,uz_m
,ux_th
,uy_th
anduz_th
are all0.
by default.maxwell_boltzmann
: Maxwell-Boltzmann distribution that takes a dimensionless temperature parameter \(\theta\) as an input, where \(\theta = \frac{k_\mathrm{B} \cdot T}{m \cdot c^2}\), \(T\) is the temperature in Kelvin, \(k_\mathrm{B}\) is the Boltzmann constant, \(c\) is the speed of light, and \(m\) is the mass of the species. Theta is specified by a combination of<species_name>.theta_distribution_type
,<species_name>.theta
, and<species_name>.theta_function(x,y,z)
(see below). For values of \(\theta > 0.01\), errors due to ignored relativistic terms exceed 1%. Temperatures less than zero are not allowed. The plasma can be initialized to move at a bulk velocity \(\beta = v/c\). The speed is specified by the parameters<species_name>.beta_distribution_type
,<species_name>.beta
, and<species_name>.beta_function(x,y,z)
(see below). \(\beta\) can be positive or negative and is limited to the range \(-1 < \beta < 1\). The direction of the velocity field is given by<species_name>.bulk_vel_dir = (+/-) 'x', 'y', 'z'
, and must be the same across the domain. Please leave no whitespace between the sign and the character on input. A direction without a sign will be treated as positive. The MB distribution is initialized in the drifting frame by sampling three Gaussian distributions in each dimension using the Box-Muller method, and then the distribution is transformed to the simulation frame using the flipping method. The flipping method can be found in Zenitani 2015 section III. B. (Phys. Plasmas 22, 042116). By default, beta
is equal to0.
andbulk_vel_dir
is+x
.Note that though the particles may move at relativistic speeds in the simulation frame, they are not relativistic in the drift frame. This is as opposed to the Maxwell Juttner setting, which initializes particles with relativistic momentums in their drifting frame.
maxwell_juttner
: Maxwell-Juttner distribution for high temperature plasma that takes a dimensionless temperature parameter \(\theta\) as an input, where \(\theta = \frac{k_\mathrm{B} \cdot T}{m \cdot c^2}\), \(T\) is the temperature in Kelvin, \(k_\mathrm{B}\) is the Boltzmann constant, and \(m\) is the mass of the species. Theta is specified by a combination of<species_name>.theta_distribution_type
,<species_name>.theta
, and<species_name>.theta_function(x,y,z)
(see below). The Sobol method used to generate the distribution will not terminate for \(\theta \lesssim 0.1\), and the code will abort if it encounters a temperature below that threshold. The Maxwell-Boltzmann distribution is recommended for temperatures in the range \(0.01 < \theta < 0.1\). Errors due to relativistic effects can be expected to approximately between 1% and 10%. The plasma can be initialized to move at a bulk velocity \(\beta = v/c\). The speed is specified by the parameters<species_name>.beta_distribution_type
,<species_name>.beta
, and<species_name>.beta_function(x,y,z)
(see below). \(\beta\) can be positive or negative and is limited to the range \(-1 < \beta < 1\). The direction of the velocity field is given by<species_name>.bulk_vel_dir = (+/-) 'x', 'y', 'z'
, and must be the same across the domain. Please leave no whitespace between the sign and the character on input. A direction without a sign will be treated as positive. The MJ distribution will be initialized in the moving frame using the Sobol method, and then the distribution will be transformed to the simulation frame using the flipping method. Both the Sobol and the flipping method can be found in Zenitani 2015 (Phys. Plasmas 22, 042116). By default,beta
is equal to0.
andbulk_vel_dir
is+x
.Please take notice that particles initialized with this setting can be relativistic in two ways. In the simulation frame, they can drift with a relativistic speed beta. Then, in the drifting frame they are still moving with relativistic speeds due to high temperature. This is as opposed to the Maxwell Boltzmann setting, which initializes non-relativistic plasma in their relativistic drifting frame.
radial_expansion
: momentum depends on the radial coordinate linearly. This can be controlled with additional parameteru_over_r
which is the slope (0.
by default).parse_momentum_function
: the momentum \(u = (u_{x},u_{y},u_{z})=(\gamma v_{x}/c,\gamma v_{y}/c,\gamma v_{z}/c)\) is given by a function in the input file. It requires additional arguments<species_name>.momentum_function_ux(x,y,z)
,<species_name>.momentum_function_uy(x,y,z)
and<species_name>.momentum_function_uz(x,y,z)
, which gives the distribution of each component of the momentum as a function of space.gaussian_parse_momentum_function
: Gaussian momentum distribution where the mean and the standard deviation are given by functions of position in the input file. Both are assumed to be non-relativistic. The mean is the normalized momentum, \(u_m = \gamma v_m/c\). The standard deviation is normalized, \(u_th = v_th/c\). For example, this might be u_th = sqrt(T*q_e/mass)/clight given the temperature (in eV) and mass. It requires the following arguments:<species_name>.momentum_function_ux_m(x,y,z)
: mean \(u_{x}\)<species_name>.momentum_function_uy_m(x,y,z)
: mean \(u_{y}\)<species_name>.momentum_function_uz_m(x,y,z)
: mean \(u_{z}\)<species_name>.momentum_function_ux_th(x,y,z)
: standard deviation of \(u_{x}\)<species_name>.momentum_function_uy_th(x,y,z)
: standard deviation of \(u_{y}\)<species_name>.momentum_function_uz_th(x,y,z)
: standard deviation of \(u_{z}\)
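For instance, a drifting thermal species could be set up with the hedged fragment below ("electrons" is a placeholder name and the momenta are illustrative):

electrons.momentum_distribution_type = gaussian
electrons.ux_th = 0.01
electrons.uy_th = 0.01
electrons.uz_th = 0.01
electrons.uz_m = 0.1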
<species_name>.theta_distribution_type
(string) optional (defaultconstant
)Only read if
<species_name>.momentum_distribution_type
ismaxwell_boltzmann
ormaxwell_juttner
. See documentation for these distributions (above) for constraints on values of theta. Temperatures less than zero are not allowed.If
constant
, use a constant temperature, given by the required float parameter<species_name>.theta
.If
parser
, use a spatially-dependent analytic parser function, given by the required parameter<species_name>.theta_function(x,y,z)
.
<species_name>.beta_distribution_type
(string) optional (defaultconstant
)Only read if
<species_name>.momentum_distribution_type
ismaxwell_boltzmann
ormaxwell_juttner
. See documentation for these distributions (above) for constraints on values of beta.If
constant
, use a constant speed, given by the required float parameter<species_name>.beta
.If
parser
, use a spatially-dependent analytic parser function, given by the required parameter<species_name>.beta_function(x,y,z)
.
<species_name>.zinject_plane
(float)Only read if
<species_name>
is inparticles.rigid_injected_species
. Injection plane when using the rigid injection method. Seeparticles.rigid_injected_species
above.
<species_name>.rigid_advance
(bool)Only read if
<species_name>
is inparticles.rigid_injected_species
.If
false
, each particle is advanced with its own velocityvz
until it reacheszinject_plane
.If
true
, each particle is advanced with the average speed of the speciesvzbar
until it reacheszinject_plane
.
species_name.predefined_profile_name
(string)Only read if
<species_name>.profile
ispredefined
.If
parabolic_channel
, the plasma profile is a parabolic profile with cosine-like ramps at the beginning and the end of the profile. The density is given by\[n = n_0 n(x,y) n(z-z_0)\]with
\[n(x,y) = 1 + 4\frac{x^2+y^2}{k_p^2 R_c^4}\]where \(k_p\) is the plasma wavenumber associated with density \(n_0\). Here, with \(z_0\) as the start of the plasma, \(n(z-z_0)\) is a cosine-like up-ramp from \(0\) to \(L_{ramp,up}\), constant to \(1\) from \(L_{ramp,up}\) to \(L_{ramp,up} + L_{plateau}\) and a cosine-like down-ramp from \(L_{ramp,up} + L_{plateau}\) to \(L_{ramp,up} + L_{plateau}+L_{ramp,down}\). All parameters are given in
predefined_profile_params
.
<species_name>.predefined_profile_params
(list of float)Parameters for the predefined profiles.
If
species_name.predefined_profile_name
isparabolic_channel
,predefined_profile_params
contains a space-separated list of the following parameters, in this order: \(z_0\) \(L_{ramp,up}\) \(L_{plateau}\) \(L_{ramp,down}\) \(R_c\) \(n_0\)
<species_name>.do_backward_propagation
(bool)Inject a backward-propagating beam to reduce the effect of charge-separation fields when running in the boosted frame. See examples.
<species_name>.split_type
(int) optional (default 0)Splitting technique. When 0, particles are split along the simulation axes (4 particles in 2D, 6 particles in 3D). When 1, particles are split along the diagonals (4 particles in 2D, 8 particles in 3D).
<species_name>.do_not_deposit
(0 or 1 optional; default 0) If 1 is given, neither charge deposition nor current deposition will be done, so that this species does not contribute to the fields.
<species_name>.do_not_gather
(0 or 1 optional; default 0) If 1 is given, field gathering from the grids will not be done, so that this species is not affected by the fields on the grids.
<species_name>.do_not_push
(0 or 1 optional; default 0) If 1 is given, this species will not be pushed by any pusher during the simulation.
<species_name>.addIntegerAttributes
(list of string)User-defined integer particle attribute for species,
species_name
. These integer attributes will be initialized with user-defined functions when the particles are generated. If the user-defined integer attribute is<int_attrib_name>
then the following required parameter must be specified to initialize the attribute. *<species_name>.attribute.<int_attrib_name>(x,y,z,ux,uy,uz,t)
(string)t
represents the physical time in seconds during the simulation.x
,y
,z
represent particle positions in the unit of meter.ux
,uy
,uz
represent the particle momenta in the unit of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor, \(v/c\) is the particle velocity normalized by the speed of light. E.g. Ifelectrons.addIntegerAttributes = upstream
andelectrons.upstream(x,y,z,ux,uy,uz,t) = (x>0.0)*1
is provided, then an integer attribute upstream is added to all electron particles and, when these particles are generated, the particles with position greater than 0 are assigned a value of 1.
<species_name>.addRealAttributes
(list of string)User-defined real particle attribute for species,
species_name
. These real attributes will be initialized with user-defined functions when the particles are generated. If the user-defined real attribute is<real_attrib_name>
then the following required parameter must be specified to initialize the attribute.<species_name>.attribute.<real_attrib_name>(x,y,z,ux,uy,uz,t)
(string)t
represents the physical time in seconds during the simulation.x
,y
,z
represent particle positions in the unit of meter.ux
,uy
,uz
represent the particle momenta in the unit of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor, \(v/c\) is the particle velocity normalized by the speed of light.
<species>.save_particles_at_xlo/ylo/zlo
,<species>.save_particles_at_xhi/yhi/zhi
and<species>.save_particles_at_eb
(0 or 1 optional, default 0) If 1, particles of this species will be copied to the scraped particle buffer for the specified boundary if they leave the simulation domain in the specified direction. If USE_EB=TRUE, the
save_particles_at_eb
flag can be set to 1 to also save particle data for the particles of this species that impact the embedded boundary. The scraped particle buffer can be used to track particle fluxes out of the simulation. The particle data can be written out by setting up aBoundaryScrapingDiagnostic
. It is also accessible via the Python interface. The functionget_particle_boundary_buffer
, found in thepicmi.Simulation
class assim.extension.get_particle_boundary_buffer()
, can be used to access the scraped particle buffer. An entry is included for every particle in the buffer of the timestep at which the particle was scraped. This can be accessed by passing the argumentcomp_name="step_scraped"
to the above mentioned function.Note
When accessing the data via Python, the scraped particle buffer relies on the user to clear the buffer after processing the data. The buffer will grow unbounded as particles are scraped and therefore could lead to memory issues if not periodically cleared. To clear the buffer call
clear_buffer()
.
<species>.do_field_ionization
(0 or 1) optional (default 0)Do field ionization for this species (using the ADK theory).
<species>.do_adk_correction
(0 or 1) optional (default 0)Whether to apply the correction to the ADK theory proposed by Zhang, Lan and Lu in Q. Zhang et al. (Phys. Rev. A 90, 043410, 2014). If so, the probability of ionization is modified using an empirical model that should be more accurate in the regime of high electric fields. Currently, this is only implemented for Hydrogen, although Argon is also available in the same reference.
<species>.physical_element
(string)Only read if do_field_ionization = 1. Symbol of chemical element for this species. Example: for Helium, use
physical_element = He
. All the elements up to atomic number Z=100 (Fermium) are supported.
<species>.ionization_product_species
(string)Only read if do_field_ionization = 1. Name of species in which ionized electrons are stored. This species must be created as a regular species in the input file (in particular, it must be in particles.species_names).
<species>.ionization_initial_level
(int) optional (default 0)Only read if do_field_ionization = 1. Initial ionization level of the species (must be smaller than the atomic number of chemical element given in physical_element).
<species>.do_classical_radiation_reaction
(int) optional (default 0)Enables Radiation Reaction (or Radiation Friction) for the species. Species must be either electrons or positrons. Boris pusher must be used for the simulation. If both
<species>.do_classical_radiation_reaction
and<species>.do_qed_quantum_sync
are enabled, then the classical module will be used when the particle’s chi parameter is belowqed_qs.chi_min
, the discrete quantum module otherwise.
<species>.do_qed_quantum_sync
(int) optional (default 0)Enables Quantum synchrotron emission for this species. Quantum synchrotron lookup table should be either generated or loaded from disk to enable this process (see “Lookup tables for QED modules” section below). <species> must be either an electron or a positron species. This feature requires to compile with QED=TRUE
<species>.do_qed_breit_wheeler
(int) optional (default 0)Enables non-linear Breit-Wheeler process for this species. Breit-Wheeler lookup table should be either generated or loaded from disk to enable this process (see “Lookup tables for QED modules” section below). <species> must be a photon species. This feature requires to compile with QED=TRUE
<species>.qed_quantum_sync_phot_product_species
(string)If an electron or a positron species has the Quantum synchrotron process, a photon product species must be specified (the name of an existing photon species must be provided) This feature requires to compile with QED=TRUE
<species>.qed_breit_wheeler_ele_product_species
(string)If a photon species has the Breit-Wheeler process, an electron product species must be specified (the name of an existing electron species must be provided) This feature requires to compile with QED=TRUE
<species>.qed_breit_wheeler_pos_product_species
(string)If a photon species has the Breit-Wheeler process, a positron product species must be specified (the name of an existing positron species must be provided). This feature requires to compile with QED=TRUE
<species>.do_resampling
(0 or 1) optional (default 0)If 1 resampling is performed for this species. This means that the number of macroparticles will be reduced at specific timesteps while preserving the distribution function as much as possible (details depend on the chosen resampling algorithm). This can be useful in situations with continuous creation of particles (e.g. with ionization or with QED effects). At least one resampling trigger (see below) must be specified to actually perform resampling.
<species>.resampling_algorithm
(string) optional (default leveling_thinning)The algorithm used for resampling:
leveling_thinning
This algorithm is defined in Muraviev et al. [4]. It has one parameter:<species>.resampling_algorithm_target_ratio
(float) optional (default 1.5)This roughly corresponds to the ratio between the number of particles before and after resampling.
velocity_coincidence_thinning
The particles are sorted into phase space cells and merged, similar to the approach described in Vranic et al. [5]. It has three parameters:<species>.resampling_algorithm_delta_ur
(float)The width of momentum cells used in clustering particles, in m/s.
<species>.resampling_algorithm_n_theta
(int)The number of cell divisions to use in the \(\theta\) direction when clustering the particle velocities.
<species>.resampling_algorithm_n_phi
(int)The number of cell divisions to use in the \(\phi\) direction when clustering the particle velocities.
<species>.resampling_min_ppc
(int) optional (default 1)Resampling is not performed in cells with a number of macroparticles strictly smaller than this parameter.
<species>.resampling_trigger_intervals
(string) optional (default 0)Using the Intervals parser syntax, this string defines timesteps at which resampling is performed.
<species>.resampling_trigger_max_avg_ppc
(float) optional (default infinity) Resampling is performed every time the number of macroparticles per cell of the species, averaged over the whole simulation domain, exceeds this parameter.
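A hedged example of enabling resampling for a species ("electrons" is a placeholder name; the interval and target ratio are illustrative):

electrons.do_resampling = 1
electrons.resampling_algorithm = leveling_thinning
electrons.resampling_algorithm_target_ratio = 1.5
electrons.resampling_trigger_intervals = 200
electrons.resampling_trigger_max_avg_ppc = 100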
Cold Relativistic Fluid initialization
fluids.species_names
(strings, separated by spaces)Defines the names of each fluid species. It is a required input to create and evolve fluid species using the cold relativistic fluid equations. Most of the parameters described in the section “Particle initialization” can also be used to initialize fluid properties (e.g. initial density distribution). For fluid-specific inputs we use <fluid_species_name> as a placeholder. Also see external fields for how to specify these for fluids as the function names differ.
Laser initialization
lasers.names
(list of string)Name of each laser. This is then used in the rest of the input deck ; in this documentation we use <laser_name> as a placeholder. The parameters below must be provided for each laser pulse.
<laser_name>.position
(3 floats in 3D and 2D ; in meters) The coordinates of one of the points of the antenna that will emit the laser. The plane of the antenna is entirely defined by
<laser_name>.position
and<laser_name>.direction
.<laser_name>.position
also corresponds to the origin of the coordinate system for the laser transverse profile. For instance, for a Gaussian laser profile, the peak of intensity will be at the position given by
. This variable can thus be used to shift the position of the laser pulse transversally.Note
In 2D,
<laser_name>.position
is still given by 3 numbers, but the second number is ignored.When running a boosted-frame simulation, provide the value of
<laser_name>.position
in the laboratory frame, and usewarpx.gamma_boost
to automatically perform the conversion to the boosted frame. Note that, in this case, the laser antenna will be moving, in the boosted frame.
<laser_name>.polarization
(3 floats in 3D and 2D)The coordinates of a vector that points in the direction of polarization of the laser. The norm of this vector is unimportant, only its direction matters.
Note
Even in 2D, all 3 components of this vector are important (i.e. the polarization can be orthogonal to the plane of the simulation).
<laser_name>.direction
(3 floats in 3D)The coordinates of a vector that points in the propagation direction of the laser. The norm of this vector is unimportant, only its direction matters.
The plane of the antenna that will emit the laser is orthogonal to this vector.
Warning
When running boosted-frame simulations,
<laser_name>.direction
should be parallel towarpx.boost_direction
, for now.
<laser_name>.e_max
(float ; in V/m)Peak amplitude of the laser field, in the focal plane.
For a laser with a wavelength \(\lambda = 0.8\,\mu m\), the peak amplitude is related to \(a_0\) by:
\[E_{max} = a_0 \frac{2 \pi m_e c^2}{e\lambda} = a_0 \times (4.0 \cdot 10^{12} \;V.m^{-1})\]When running a boosted-frame simulation, provide the value of
<laser_name>.e_max
in the laboratory frame, and usewarpx.gamma_boost
to automatically perform the conversion to the boosted frame.
<laser_name>.a0
(float ; dimensionless)Peak normalized amplitude of the laser field, in the focal plane (given in the lab frame, just as
e_max
above). See the description of<laser_name>.e_max
for the conversion betweena0
ande_max
. Eithera0
ore_max
must be specified.
<laser_name>.wavelength
(float; in meters)The wavelength of the laser in vacuum.
When running a boosted-frame simulation, provide the value of
<laser_name>.wavelength
in the laboratory frame, and usewarpx.gamma_boost
to automatically perform the conversion to the boosted frame.
<laser_name>.profile
(string)The spatio-temporal shape of the laser. The options that are currently implemented are:
"Gaussian"
: The transverse and longitudinal profiles are Gaussian."parse_field_function"
: the laser electric field is given by a function in the input file. It requires additional argument<laser_name>.field_function(X,Y,t)
, which is a mathematical expression , e.g.<laser_name>.field_function(X,Y,t) = "a0*X**2 * (X>0) * cos(omega0*t)"
wherea0
andomega0
are user-defined constants, see above. The profile passed here is the full profile, not only the laser envelope. t
is time andX
andY
are coordinates orthogonal to<laser_name>.direction
(not necessarily the x and y coordinates of the simulation). All parameters above are required, but none of the parameters below are used when<laser_name>.parse_field_function=1
. Even though<laser_name>.wavelength
and<laser_name>.e_max
should be included in the laser function, they still have to be specified as they are used for numerical purposes."from_file"
: the electric field of the laser is read from an external file. Currently both the lasy format as well as a custom binary format are supported. It requires to provide the name of the file to load setting the additional parameter<laser_name>.binary_file_name
or<laser_name>.lasy_file_name
(string). It accepts an optional parameter<laser_name>.time_chunk_size
(int), supported for both lasy and binary files; this allows to read only time_chunk_size timesteps from the file. New timesteps are read as soon as they are needed.The default value is automatically set to the number of timesteps contained in the file (i.e. only one read is performed at the beginning of the simulation). It also accepts the optional parameter
<laser_name>.delay
(float; in seconds), which allows delaying (delay > 0
) or anticipating (delay < 0
) the laser by the specified amount of time.Details about the usage of the lasy format: lasy can produce either 3D Cartesian files or RZ files. WarpX can read both types of files independently of the geometry in which it was compiled (e.g. WarpX compiled with
WarpX_DIMS=RZ
can read 3D Cartesian lasy files). In the case where WarpX is compiled in 2D (or 1D) Cartesian, the laser antenna will emit the field values that correspond to the slicey=0
in the lasy file (andx=0
in the 1D case). One can generate a lasy file from Python, see an example atExamples/Tests/laser_injection_from_file
.Details about the usage of the binary format: The external binary file should provide E(x,y,t) on a rectangular (necessarily uniform) grid. The code performs a bi-linear (in 2D) or tri-linear (in 3D) interpolation to set the field values. x,y,t are meant to be in S.I. units, while the field value is meant to be multiplied by
<laser_name>.e_max
(i.e. in most cases the maximum of abs(E(x,y,t)) should be 1, so that the maximum field intensity can be set straightforwardly with<laser_name>.e_max
). The binary file has to respect the following format:flag
to indicate the grid is uniform (1 byte, 0 means non-uniform, !=0 means uniform) - only uniform is supportednt
, number of timesteps (uint32_t
, must be >=2)nx
, number of points along x (uint32_t
, must be >=2)ny
, number of points along y (uint32_t
, must be 1 for 2D simulations and >=2 for 3D simulations)timesteps
(double[2]=[t_min,t_max]
)x_coords
(double[2]=[x_min,x_max]
)y_coords
(double[1]
in 2D,double[2]=[y_min,y_max]
in 3D)field_data
(double[nt * nx * ny]
, withnt
being the slowest coordinate).
A binary file can be generated from Python, see an example at
Examples/Tests/laser_injection_from_file
<laser_name>.profile_t_peak
(float; in seconds)The time at which the laser reaches its peak intensity, at the position given by
<laser_name>.position
(only used for the"gaussian"
profile)When running a boosted-frame simulation, provide the value of
<laser_name>.profile_t_peak
in the laboratory frame, and usewarpx.gamma_boost
to automatically perform the conversion to the boosted frame.
<laser_name>.profile_duration
(float ; in seconds)The duration of the laser pulse for the
"gaussian"
profile, defined as \(\tau\) below:\[E(\boldsymbol{x},t) \propto \exp\left( -\frac{(t-t_{peak})^2}{\tau^2} \right)\]Note that \(\tau\) relates to the full width at half maximum (FWHM) of intensity, which is closer to pulse length measurements in experiments, as \(\tau = \mathrm{FWHM}_I / \sqrt{2\ln(2)}\) \(\approx \mathrm{FWHM}_I / 1.1774\).
For a chirped laser pulse (i.e. with a non-zero
<laser_name>.phi2
),profile_duration
is the Fourier-limited duration of the pulse, not the actual duration of the pulse. See the documentation for<laser_name>.phi2
for more detail.When running a boosted-frame simulation, provide the value of
<laser_name>.profile_duration
in the laboratory frame, and usewarpx.gamma_boost
to automatically perform the conversion to the boosted frame.
<laser_name>.profile_waist
(float ; in meters)The waist of the transverse Gaussian \(w_0\), i.e. defined such that the electric field of the laser pulse in the focal plane is of the form:
\[E(\boldsymbol{x},t) \propto \exp\left( -\frac{\boldsymbol{x}_\perp^2}{w_0^2} \right)\]
<laser_name>.profile_focal_distance
(float; in meters)The distance from
laser_position
to the focal plane (where the distance is defined along the direction given by <laser_name>.direction). Use a negative number for a defocusing laser instead of a focusing laser. When running a boosted-frame simulation, provide the value of <laser_name>.profile_focal_distance in the laboratory frame, and use warpx.gamma_boost to automatically perform the conversion to the boosted frame.
<laser_name>.phi0
(float; in radians) optional (default 0.)The Carrier Envelope Phase, i.e. the phase of the laser oscillation, at the position where the laser envelope is maximum (only used for the
"gaussian"
profile)
<laser_name>.stc_direction
(3 floats) optional (default 1. 0. 0.)Direction of laser spatio-temporal couplings. See definition in Akturk et al. [6].
<laser_name>.zeta
(float; in meters.seconds) optional (default 0.)Spatial chirp at focus in direction
<laser_name>.stc_direction
. See definition in Akturk et al. [6].
<laser_name>.beta
(float; in seconds) optional (default 0.)Angular dispersion (or angular chirp) at focus in direction
<laser_name>.stc_direction
. See definition in Akturk et al. [6].
<laser_name>.phi2
(float; in seconds**2) optional (default 0.)The amount of temporal chirp \(\phi^{(2)}\) at focus (in the lab frame). Namely, a wave packet centered on the frequency \((\omega_0 + \delta \omega)\) will reach its peak intensity at \(z(\delta \omega) = z_0 - c \phi^{(2)} \, \delta \omega\). Thus, a positive \(\phi^{(2)}\) corresponds to positive chirp, i.e. red part of the spectrum in the front of the pulse and blue part of the spectrum in the back. More specifically, the electric field in the focal plane is of the form:
\[E(\boldsymbol{x},t) \propto Re\left[ \exp\left( -\frac{(t-t_{peak})^2}{\tau^2 + 2i\phi^{(2)}} + i\omega_0 (t-t_{peak}) + i\phi_0 \right) \right]\]
where \(\tau\) is given by <laser_name>.profile_duration and represents the Fourier-limited duration of the laser pulse. Thus, the actual duration of the chirped laser pulse is:
\[\tau' = \sqrt{ \tau^2 + 4 (\phi^{(2)})^2/\tau^2 }\]
See also the definition in Akturk et al. [6].
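As an illustration, the Gaussian-profile parameters above might be combined as follows for a laser assumed to be named laser1 (declared via lasers.names, with its profile, wavelength and injection geometry set as described earlier in this section); all values are placeholders, not recommendations:
laser1.e_max = 1.e12                    # peak field amplitude used to scale the profile
laser1.profile_t_peak = 30.e-15         # time of peak intensity at laser1.position, in s
laser1.profile_duration = 15.e-15       # Fourier-limited duration tau, in s
laser1.profile_waist = 5.e-6            # transverse waist w_0, in m
laser1.profile_focal_distance = 1.e-3   # distance to the focal plane, in m
laser1.phi0 = 0.                        # carrier-envelope phase, in radians
laser1.phi2 = 0.                        # temporal chirp at focus, in s^2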
<laser_name>.do_continuous_injection
(0 or 1) optional (default 0).Whether or not to use continuous injection. If the antenna starts outside of the simulation domain but enters it at some point (due to moving window or moving antenna in the boosted frame), use this so that the laser antenna is injected when it reaches the box boundary. If running in a boosted frame, this requires the boost direction, moving window direction and laser propagation direction to be along z. If not running in a boosted frame, this requires the moving window and laser propagation directions to be the same (x, y or z)
<laser_name>.min_particles_per_mode
(int) optional (default 4)When using the RZ version, this specifies the minimum number of particles per angular mode. The laser particles are loaded into radial spokes, with the number of spokes given by min_particles_per_mode*(warpx.n_rz_azimuthal_modes-1).
lasers.deposit_on_main_grid
(int) optional (default 0)When using mesh refinement, whether the antenna that emits the laser deposits charge/current only on the main grid (i.e. level 0), or also on the higher mesh-refinement levels.
warpx.num_mirrors
(int) optional (default 0) Users can specify perfect mirror conditions inside the simulation domain. The number of mirrors is given by warpx.num_mirrors. The mirrors are orthogonal to the z direction. The following parameters are required when warpx.num_mirrors is > 0.
warpx.mirror_z
(list of float) required ifwarpx.num_mirrors>0
z
location of the front of the mirrors.
warpx.mirror_z_width
(list of float) required ifwarpx.num_mirrors>0
z
width of the mirrors.
warpx.mirror_z_npoints
(list of int) required ifwarpx.num_mirrors>0
In the boosted frame, depending on gamma_boost,
warpx.mirror_z_width
can be smaller than the cell size, so that the mirror would not work. This parameter is the minimum number of points for the mirror. If mirror_z_width < dz/cell_size, the upper bound of the mirror is increased so that it contains at least mirror_z_npoints.
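For example, a single mirror could be specified as follows (values are placeholders):
warpx.num_mirrors = 1
warpx.mirror_z = 150.e-6          # z location of the front of the mirror
warpx.mirror_z_width = 1.e-6      # z width of the mirror
warpx.mirror_z_npoints = 4        # minimum number of grid points covered by the mirror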
External fields
Applied to the grid
The external fields defined with input parameters that start with warpx.B_ext_grid_init_
or warpx.E_ext_grid_init_
are applied to the grid directly. In particular, these fields can be seen in the diagnostics that output the fields on the grid.
When using an electromagnetic field solver, these fields are applied to the grid at the beginning of the simulation, and serve as initial condition for the Maxwell solver.
When using an electrostatic or magnetostatic field solver, these fields are added to the fields computed by the Poisson solver, at each timestep.
warpx.B_ext_grid_init_style
(string) optional This parameter determines the type of initialization for the external magnetic field. By default, the external magnetic field (Bx,By,Bz) is initialized to (0.0, 0.0, 0.0). The string can be set to “constant” if a constant magnetic field is required to be set at initialization. If set to “constant”, then an additional parameter, namely warpx.B_external_grid, must be specified. If set to parse_B_ext_grid_function, then a mathematical expression can be used to initialize the external magnetic field on the grid. It requires additional parameters in the input file, namely warpx.Bx_external_grid_function(x,y,z), warpx.By_external_grid_function(x,y,z), and warpx.Bz_external_grid_function(x,y,z), to initialize the external magnetic field for each of the three components on the grid. Constants required in the expression can be set using my_constants. For example, if warpx.Bx_external_grid_function(x,y,z)=Bo*x + delta*(y + z), then the constants Bo and delta required in the above equation can be set using my_constants.Bo= and my_constants.delta= in the input file. For a two-dimensional simulation, it is assumed that the first dimension is x and the second dimension is z, and the value of y is set to zero. Note that the current implementation of the parser for the external B-field does not work with RZ and the code will abort with an error message.
If B_ext_grid_init_style is set to read_from_file, an additional parameter indicating the path of an openPMD data file, warpx.read_fields_from_path, must be specified, from which external B field data can be loaded into WarpX. One can refer to input files in Examples/Tests/LoadExternalField for more information. Regarding how to prepare the openPMD data file, one can refer to the openPMD-example-datasets.
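For instance, the parser-based initialization could look like this (constant names and values are illustrative):
my_constants.Bo = 1.
my_constants.delta = 0.1
warpx.B_ext_grid_init_style = parse_B_ext_grid_function
warpx.Bx_external_grid_function(x,y,z) = "Bo*x + delta*(y + z)"
warpx.By_external_grid_function(x,y,z) = 0.
warpx.Bz_external_grid_function(x,y,z) = Bo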
warpx.E_ext_grid_init_style
(string) optional This parameter determines the type of initialization for the external electric field. By default, the external electric field (Ex,Ey,Ez) is initialized to (0.0, 0.0, 0.0). The string can be set to “constant” if a constant electric field is required to be set at initialization. If set to “constant”, then an additional parameter, namely warpx.E_external_grid, must be specified in the input file. If set to parse_E_ext_grid_function, then a mathematical expression can be used to initialize the external electric field on the grid. It requires additional parameters in the input file, namely warpx.Ex_external_grid_function(x,y,z), warpx.Ey_external_grid_function(x,y,z), and warpx.Ez_external_grid_function(x,y,z), to initialize the external electric field for each of the three components on the grid. Constants required in the expression can be set using my_constants. For example, if warpx.Ex_external_grid_function(x,y,z)=Eo*x + delta*(y + z), then the constants Eo and delta required in the above equation can be set using my_constants.Eo= and my_constants.delta= in the input file. For a two-dimensional simulation, it is assumed that the first dimension is x and the second dimension is z, and the value of y is set to zero. Note that the current implementation of the parser for the external E-field does not work with RZ and the code will abort with an error message.
If E_ext_grid_init_style is set to read_from_file, an additional parameter indicating the path of an openPMD data file, warpx.read_fields_from_path, must be specified, from which external E field data can be loaded into WarpX. One can refer to input files in Examples/Tests/LoadExternalField for more information. Regarding how to prepare the openPMD data file, one can refer to the openPMD-example-datasets. Note that if both B_ext_grid_init_style and E_ext_grid_init_style are set to read_from_file, the openPMD file specified by warpx.read_fields_from_path should contain both B and E external field data.
warpx.E_external_grid
&warpx.B_external_grid
(list of 3 floats)required when
warpx.E_ext_grid_init_style="constant"
and when warpx.B_ext_grid_init_style="constant", respectively. External uniform and constant electrostatic and magnetostatic fields added to the grid at initialization. Use with caution, as these fields are used by the field solver. In particular, do not use any boundary condition other than periodic.
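A constant-field initialization might read as follows (values illustrative; remember the periodic-boundary caveat above):
warpx.B_ext_grid_init_style = constant
warpx.B_external_grid = 0. 0. 1.5      # uniform Bx By Bz
warpx.E_ext_grid_init_style = constant
warpx.E_external_grid = 0. 0. 0.       # uniform Ex Ey Ez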
warpx.maxlevel_extEMfield_init
(default is maximum number of levels in the simulation)With this parameter, the externally applied electric and magnetic fields will not be applied for levels greater than
warpx.maxlevel_extEMfield_init
. For some mesh-refinement simulations, the external fields are only applied to the parent grid and not the refined patches. In such cases, warpx.maxlevel_extEMfield_init
can be set to 0. In that case, the other levels have external field values of 0.
Applied to Particles
The external fields defined with input parameters that start with warpx.B_ext_particle_init_
or warpx.E_ext_particle_init_
are applied to the particles directly, at each timestep. As a result, these fields cannot be seen in the diagnostics that output the fields on the grid.
particles.E_ext_particle_init_style
&particles.B_ext_particle_init_style
(string) optional (default “none”) These parameters determine the type of the external electric and magnetic fields, respectively, that are applied directly to the particles at every timestep. The field values are specified in the lab frame. With the default none style, no field is applied. Possible values are constant, parse_E_ext_particle_function or parse_B_ext_particle_function, or repeated_plasma_lens.
constant: a constant field is applied, given by the input parameters particles.E_external_particle or particles.B_external_particle, which are lists of the field components.
parse_E_ext_particle_function or parse_B_ext_particle_function: the field is specified as an analytic expression that is a function of space (x,y,z) and time (t), relative to the lab frame. The E-field is specified by the input parameters:
particles.Ex_external_particle_function(x,y,z,t)
particles.Ey_external_particle_function(x,y,z,t)
particles.Ez_external_particle_function(x,y,z,t)
The B-field is specified by the input parameters:
particles.Bx_external_particle_function(x,y,z,t)
particles.By_external_particle_function(x,y,z,t)
particles.Bz_external_particle_function(x,y,z,t)
Note that the position is defined in Cartesian coordinates, as a function of (x,y,z), even for RZ.
repeated_plasma_lens: apply a series of plasma lenses. The properties of the lenses are defined in the lab frame by the input parameters:
repeated_plasma_lens_period, the period length of the repeat, a single float number,
repeated_plasma_lens_starts, the start of each lens relative to the period, an array of floats,
repeated_plasma_lens_lengths, the length of each lens, an array of floats,
repeated_plasma_lens_strengths_E, the electric focusing strength of each lens, an array of floats, when particles.E_ext_particle_init_style is set to repeated_plasma_lens,
repeated_plasma_lens_strengths_B, the magnetic focusing strength of each lens, an array of floats, when particles.B_ext_particle_init_style is set to repeated_plasma_lens.
The repeated lenses are only defined for \(z > 0\). Once the number of lenses specified in the input are exceeded, the repeated lens stops.
The applied field is uniform longitudinally (along z) with a hard edge, where residence corrections are used for more accurate field calculation. On the time step when a particle enters or leaves each lens, the field applied is scaled by the fraction of the time step spent within the lens. The fields are of the form \(E_x = \mathrm{strength} \cdot x\), \(E_y = \mathrm{strength} \cdot y\), and \(E_z = 0\), and \(B_x = \mathrm{strength} \cdot y\), \(B_y = -\mathrm{strength} \cdot x\), and \(B_z = 0\).
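As a sketch, a repeated plasma-lens setup could be written as follows; the lens parameters are given here exactly as named above, and all values are placeholders:
particles.E_ext_particle_init_style = repeated_plasma_lens
repeated_plasma_lens_period = 0.5             # period length of the repeat
repeated_plasma_lens_starts = 0.1 0.3         # start of each lens relative to the period
repeated_plasma_lens_lengths = 0.02 0.02      # length of each lens
repeated_plasma_lens_strengths_E = 1.e5 1.e5  # electric focusing strength of each lens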
Applied to Cold Relativistic Fluids
The external fields defined with input parameters that start with warpx.B_ext_init_
or warpx.E_ext_init_
are applied to the fluids directly, at each timestep. As a result, these fields cannot be seen in the diagnostics that output the fields on the grid.
<fluid_species_name>.E_ext_init_style
&<fluid_species_name>.B_ext_init_style
(string) optional (default “none”) These parameters determine the type of the external electric and magnetic fields, respectively, that are applied directly to the cold relativistic fluids at every timestep. The field values are specified in the lab frame. With the default none style, no field is applied. Possible values are parse_E_ext_function or parse_B_ext_function.
parse_E_ext_function or parse_B_ext_function: the field is specified as an analytic expression that is a function of space (x,y,z) and time (t), relative to the lab frame. The E-field is specified by the input parameters:
<fluid_species_name>.Ex_external_function(x,y,z,t)
<fluid_species_name>.Ey_external_function(x,y,z,t)
<fluid_species_name>.Ez_external_function(x,y,z,t)
The B-field is specified by the input parameters:
<fluid_species_name>.Bx_external_function(x,y,z,t)
<fluid_species_name>.By_external_function(x,y,z,t)
<fluid_species_name>.Bz_external_function(x,y,z,t)
Note that the position is defined in Cartesian coordinates, as a function of (x,y,z), even for RZ.
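For a hypothetical fluid species named fluid_electrons, an applied field might be set as follows (expression and constants illustrative):
my_constants.Eo = 1.e5
fluid_electrons.E_ext_init_style = parse_E_ext_function
fluid_electrons.Ex_external_function(x,y,z,t) = 0.
fluid_electrons.Ey_external_function(x,y,z,t) = 0.
fluid_electrons.Ez_external_function(x,y,z,t) = "Eo*(1. + 0.1*z)"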
Accelerator Lattice
Several accelerator lattice elements can be defined as described below. The elements are defined relative to the z axis and in the lab frame, starting at z = 0. They are described using a simplified MAD-like syntax. Note that elements of the same type cannot overlap each other.
lattice.elements
(list of strings
) optional (default: no elements)A list of names (one name per lattice element), in the order that they appear in the lattice.
lattice.reverse
(boolean
) optional (default:false
)Reverse the list of elements in the lattice.
<element_name>.type
(string
)Indicates the element type for this lattice element. This should be one of:
drift for a free drift. This requires this additional parameter:
<element_name>.ds (float, in meters) the segment length
quad for a hard-edged quadrupole. This applies a quadrupole field that is uniform within the z extent of the element, with a sharp cut-off at the ends. This uses residence corrections, with the field scaled by the amount of time within the element for particles entering or leaving it, to increase the accuracy. This requires these additional parameters:
<element_name>.ds (float, in meters) the segment length
<element_name>.dEdx (float, in volts/meter^2) optional (default: 0.) the electric quadrupole field gradient. The field applied to the particles will be Ex = dEdx*x and Ey = -dEdx*y.
<element_name>.dBdx (float, in Tesla/meter) optional (default: 0.) the magnetic quadrupole field gradient. The field applied to the particles will be Bx = dBdx*y and By = dBdx*x.
plasmalens for a field modeling a plasma lens. This applies a radially directed plasma lens field that is uniform within the z extent of the element, with a sharp cut-off at the ends. This uses residence corrections, with the field scaled by the amount of time within the element for particles entering or leaving it, to increase the accuracy. This requires these additional parameters:
<element_name>.ds (float, in meters) the segment length
<element_name>.dEdx (float, in volts/meter^2) optional (default: 0.) the electric field gradient. The field applied to the particles will be Ex = dEdx*x and Ey = dEdx*y.
<element_name>.dBdx (float, in Tesla/meter) optional (default: 0.) the magnetic field gradient. The field applied to the particles will be Bx = dBdx*y and By = -dBdx*x.
line a sub-lattice (line) of elements to append to the lattice.
<element_name>.elements (list of strings) optional (default: no elements) A list of names (one name per lattice element), in the order that they appear in the lattice.
<element_name>.reverse (boolean) optional (default: false) Reverse the list of elements in the line before appending to the lattice.
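A minimal hypothetical lattice with one drift and one quadrupole, using the element parameters above (names and values are placeholders):
lattice.elements = dr1 q1
dr1.type = drift
dr1.ds = 0.1            # segment length, in meters
q1.type = quad
q1.ds = 0.05            # segment length, in meters
q1.dBdx = 10.           # magnetic quadrupole field gradient, in Tesla/meter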
Collision models
WarpX provides several particle collision models, using varying degrees of approximation. Details about the collision models can be found in the theory section.
collisions.collision_names
(strings, separated by spaces)The name of each collision type. This is then used in the rest of the input deck; in this documentation we use
<collision_name>
as a placeholder.
<collision_name>.type
(string) optionalThe type of collision. The types implemented are:
pairwisecoulomb for pair-wise Coulomb collisions, the default if unspecified. This provides a pair-wise relativistic elastic Monte Carlo binary Coulomb collision model, following the algorithm given by Pérez et al. [7]. When the RZ mode is used, warpx.n_rz_azimuthal_modes must be set to 1 at the moment, since the current implementation of the collision module assumes axisymmetry.
nuclearfusion for fusion reactions. This implements the pair-wise fusion model by Higginson et al. [8]. Currently, WarpX supports deuterium-deuterium, deuterium-tritium, deuterium-helium and proton-boron fusion. When initializing the reactant and product species, you need to use species_type (see the documentation for this parameter), so that WarpX can identify the type of reaction to use (e.g. <species_name>.species_type = 'deuterium').
dsmc for pair-wise, non-Coulomb collisions between kinetic species. This is a “direct simulation Monte Carlo” treatment of collisions between kinetic species. See DSMC section.
background_mcc for collisions between particles and a neutral background. This is a relativistic Monte Carlo treatment for particles colliding with a neutral background gas. See MCC section.
background_stopping for slowing of ions due to collisions with electrons or ions. This implements the approximate formulae as derived in Introduction to Plasma Physics, from Goldston and Rutherford, section 14.2.
<collision_name>.species
(strings)If using
dsmc
,pairwisecoulomb
ornuclearfusion
, this should be the name(s) of the species between which the collision will be considered. (Provide only one name for intra-species collisions.) If using the background_mcc or background_stopping type, this should be the name of the species for which collisions with a background will be included. In this case, only one species name should be given.
<collision_name>.product_species
(strings)Only for
nuclearfusion
. The name(s) of the species in which to add the new macroparticles created by the reaction.
<collision_name>.ndt
(int) optionalExecute collision every # time steps. The default value is 1.
<collision_name>.CoulombLog
(float) optionalOnly for
pairwisecoulomb
. A provided fixed Coulomb logarithm of the collision type<collision_name>
. For example, a typical Coulomb logarithm has a form of \(\ln(\lambda_D/R)\), where \(\lambda_D\) is the Debye length, \(R\approx1.4A^{1/3}\) is the effective Coulombic radius of the nucleus, \(A\) is the mass number. If this is not provided, or if a non-positive value is provided, a Coulomb logarithm will be computed automatically according to the algorithm in Pérez et al. [7].
<collision_name>.fusion_multiplier
(float) optional.Only for
nuclearfusion
. Increasing fusion_multiplier creates more macroparticles of fusion products, but with lower weight (in such a way that the corresponding total number of physical particles remains the same). This can improve the statistics of the simulation, in cases where fusion reactions are very rare. More specifically, in a fusion reaction between two macroparticles with weights w_1 and w_2, the weight of the product macroparticles will be min(w_1,w_2)/fusion_multiplier. (The weights of the reactant macroparticles are reduced correspondingly after the reaction.) See Higginson et al. [8] for more details. The default value of fusion_multiplier is 1.
<collision_name>.fusion_probability_threshold
(float) optional.Only for
nuclearfusion
. If the fusion multiplier is too high and results in a fusion probability that approaches 1 (for a given collision between two macroparticles), then there is a risk of underestimating the total fusion yield. In these cases, WarpX reduces the fusion multiplier used in that given collision. fusion_probability_threshold is the fusion probability threshold above which WarpX reduces the fusion multiplier.
<collision_name>.fusion_probability_target_value
(float) optional.Only for
nuclearfusion
. When the probability of fusion for a given collision exceeds fusion_probability_threshold, WarpX reduces the fusion multiplier for that collision such that the fusion probability approaches fusion_probability_target_value.
<collision_name>.background_density
(float)Only for
background_mcc
andbackground_stopping
. The density of the background in \(m^{-3}\). One can also provide <collision_name>.background_density(x,y,z,t) using the parser initialization style for a spatially and temporally varying density. With background_mcc, if a function is used for the background density, the input parameter <collision_name>.max_background_density must also be provided to calculate the maximum collision probability.
<collision_name>.background_temperature
(float)Only for
background_mcc
andbackground_stopping
. The temperature of the background in Kelvin. One can also provide <collision_name>.background_temperature(x,y,z,t) using the parser initialization style for a spatially and temporally varying temperature.
<collision_name>.background_mass
(float) optionalOnly for
background_mcc
andbackground_stopping
. The mass of the background gas in kg. With background_mcc, if not given, the mass of the colliding species will be used, unless ionization is included, in which case the mass of the product species will be used. With background_stopping and background_type set to electrons, if not given, this defaults to the electron mass. With background_type set to ions, the mass must be given.
<collision_name>.background_charge_state
(float)Only for
background_stopping
, where it is required when background_type is set to ions. This specifies the charge state of the background ions.
<collision_name>.background_type
(string)Only for
background_stopping
, where it is required, the type of the background. The possible values are electrons and ions. When electrons, equation 14.12 from Goldston and Rutherford is used. This formula is based on Coulomb collisions with the approximations that \(M_b >> m_e\) and \(V << v_{thermal\_e}\), and the assumption that the electrons have a Maxwellian distribution with temperature \(T_e\):
\[\frac{dV}{dt} = - \frac{2^{1/2}n_eZ_b^2e^4m_e^{1/2}\log\Lambda}{12\pi^{3/2}\epsilon_0M_bT_e^{3/2}}V\]
where \(V\) is each velocity component, \(n_e\) is the background density, \(Z_b\) is the ion charge state, \(e\) is the electron charge, \(m_e\) is the background mass, \(\log\Lambda=\log((12\pi/Z_b)(n_e\lambda_{de}^3))\), \(\lambda_{de}\) is the Debye length, and \(M_b\) is the ion mass. The equation is integrated over a time step, giving \(V(t+dt) = V(t)\,\exp(-\alpha\, dt)\), where \(\alpha\) is the factor multiplying \(V\).
When ions, equation 14.20 is used. This formula is based on Coulomb collisions with the approximations that \(M_b >> M\) and \(V >> v_{thermal\_i}\). The background ion temperature only appears in the \(\log\Lambda\) term:
\[\frac{dW_b}{dt} = - \frac{2^{1/2}n_iZ^2Z_b^2e^4M_b^{1/2}\log\Lambda}{8\pi\epsilon_0MW_b^{1/2}}\]
where \(W_b\) is the ion energy, \(n_i\) is the background density, \(Z\) is the charge state of the background ions, \(Z_b\) is the ion charge state, \(e\) is the electron charge, \(M_b\) is the ion mass, \(\log\Lambda=\log((12\pi/Z_b)(n_i\lambda_{di}^3))\), \(\lambda_{di}\) is the Debye length, and \(M\) is the background ion mass. The equation is integrated over a time step, giving \(W_b(t+dt) = \left(W_b(t)^{3/2} - \tfrac{3}{2}\beta\, dt\right)^{2/3}\), where \(\beta\) is the term on the r.h.s. except \(W_b\).
<collision_name>.scattering_processes
(strings separated by spaces)Only for
dsmc
and background_mcc. The scattering processes that should be included. Available options are elastic, back & charge_exchange for ions, and elastic, excitationX & ionization for electrons. Multiple excitation events can be included for electrons, corresponding to excitation to different levels; the X above can be changed to a unique identifier for each excitation process. For each scattering process specified, a path to a cross-section data file must also be given. We use <scattering_process> as a placeholder going forward.
<collision_name>.<scattering_process>_cross_section
(string)Only for
dsmc
and background_mcc
. Path to the file containing cross-section data for the given scattering processes. The cross-section file must have exactly 2 columns of data, the first containing equally spaced energies in eV and the second the corresponding cross-section in \(m^2\). The energy column should represent the kinetic energy of the colliding particles in the center-of-mass frame.
<collision_name>.<scattering_process>_energy
(float)Only for
background_mcc
. If the scattering process is either excitationX or ionization, the energy cost of that process must be given in eV.
<collision_name>.ionization_species
(float)Only for
background_mcc
. If the scattering process is ionization, the produced species must also be given. For example, if argon properties are used for the background gas, a species of argon ions should be specified here.
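Putting several of these parameters together, a hypothetical electron-neutral background_mcc setup might look like the following; species names, file paths, energies and densities are placeholders:
collisions.collision_names = coll_elec
coll_elec.type = background_mcc
coll_elec.species = electrons
coll_elec.background_density = 1.e22                  # background density, in m^-3
coll_elec.background_temperature = 300.               # background temperature, in Kelvin
coll_elec.scattering_processes = elastic excitation1 ionization
coll_elec.elastic_cross_section = cross_sections/elastic.dat
coll_elec.excitation1_cross_section = cross_sections/excitation1.dat
coll_elec.excitation1_energy = 11.5                   # energy cost of the excitation, in eV
coll_elec.ionization_cross_section = cross_sections/ionization.dat
coll_elec.ionization_energy = 15.8                    # energy cost of ionization, in eV
coll_elec.ionization_species = neutral_ions           # species receiving the ionization products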
Numerics and algorithms
This section describes the input parameters used to select numerical methods and algorithms for your simulation setup.
Time step
warpx.cfl
(float) optional (default 0.999)The ratio between the actual timestep that is used in the simulation and the Courant-Friedrichs-Lewy (CFL) limit. (e.g. for warpx.cfl=1, the timestep will be exactly equal to the CFL limit.) This parameter will only be used with the electromagnetic solver.
warpx.const_dt
(float)Allows direct specification of the time step size, in units of seconds. When the electrostatic solver is being used, this must be supplied. This can be used with the electromagnetic solver, overriding
warpx.cfl
, but it is up to the user to ensure that the CFL condition is met.
Filtering
warpx.use_filter
(0 or 1; default: 1, except for RZ FDTD)Whether to smooth the charge and currents on the mesh, after depositing them from the macro-particles. This uses a bilinear filter (see the filtering section). The default is 1 in all cases, except for simulations in RZ geometry using the FDTD solver. With the RZ PSATD solver, the filtering is done in \(k\)-space.
Warning
Known bug: filter currently not working with FDTD solver in RZ geometry (see https://github.com/ECP-WarpX/WarpX/issues/1943).
warpx.filter_npass_each_dir
(3 int) optional (default 1 1 1)Number of passes along each direction for the bilinear filter. In 2D simulations, only the first two values are read.
warpx.use_filter_compensation
(0 or 1; default: 0)Whether to add compensation when applying filtering. This is only supported with the RZ spectral solver.
Particle push, charge and current deposition, field gathering
algo.current_deposition
(string, optional)This parameter selects the algorithm for the deposition of the current density. Available options are:
direct, esirkepov, and vay. The default choice is esirkepov for FDTD Maxwell solvers, but direct for the standard or Galilean PSATD solver (i.e. with algo.maxwell_solver = psatd), for the hybrid-PIC solver (i.e. with algo.maxwell_solver = hybrid), and for diagnostics output with the electrostatic solvers (i.e. with warpx.do_electrostatic = ...). Note that vay is only available for algo.maxwell_solver = psatd.
direct
The current density is deposited as described in the section Current deposition. This deposition scheme does not conserve charge.
esirkepov
The current density is deposited as described in Esirkepov [9]. This deposition scheme guarantees charge conservation for shape factors of arbitrary order.
vay
The current density is deposited as described in Vay et al. [10] (see section Current deposition for more details). This option guarantees charge conservation only when used in combination with
psatd.periodic_single_box_fft=1
, that is, only for periodic single-box simulations with global FFTs without guard cells. The implementation for domain decomposition with local FFTs over guard cells is planned but not yet completed.
algo.charge_deposition
(string, optional)The algorithm for the charge density deposition. Available options are:
standard
: standard charge deposition algorithm, described in the particle-in-cell theory section.
algo.field_gathering
(string, optional)The algorithm for field gathering. Available options are:
energy-conserving: gathers directly from the grid points (either staggered or nodal grid points, depending on warpx.grid_type).
momentum-conserving: first averages the fields from the grid points to the nodes, and then gathers from the nodes.
Default: algo.field_gathering = energy-conserving with collocated or staggered grids (note that energy-conserving and momentum-conserving are equivalent with collocated grids), and algo.field_gathering = momentum-conserving with hybrid grids.
algo.particle_pusher
(string, optional)The algorithm for the particle pusher. Available options are:
algo.particle_shape
(integer; 1, 2, 3, or 4)The order of the shape factors (splines) for the macro-particles along all spatial directions: 1 for linear, 2 for quadratic, 3 for cubic, 4 for quartic. Low-order shape factors result in faster simulations, but may lead to more noisy results. High-order shape factors are computationally more expensive, but may increase the overall accuracy of the results. For production runs it is generally safer to use high-order shape factors, such as cubic order.
Note that this input parameter is not optional and must always be set in all input files provided that there is at least one particle species (set in input as
particles.species_names
) or one laser species (set in input as lasers.names
) in the simulation. No default value is provided automatically.
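For reference, a typical set of algorithm choices might be written as follows (a sketch; values illustrative):
algo.maxwell_solver = yee
algo.current_deposition = esirkepov      # charge-conserving deposition
algo.field_gathering = energy-conserving
algo.particle_shape = 3                  # cubic shape factors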
Maxwell solver
Two families of Maxwell solvers are implemented in WarpX, based on the Finite-Difference Time-Domain method (FDTD) or the Pseudo-Spectral Analytical Time-Domain method (PSATD), respectively.
algo.maxwell_solver
(string, optional)The algorithm for the Maxwell field solver. Available options are:
yee: Yee FDTD solver.
ckc: (not available in RZ geometry) Cole-Karkkainen solver with Cowan coefficients (see Cowan et al. [12]).
psatd: Pseudo-spectral solver (see theory).
ect: Enlarged cell technique (conformal finite difference solver; see Xiao and Liu [13]).
hybrid: The E-field will be solved using Ohm’s law and a kinetic-fluid hybrid model (see theory).
none: No field solve will be performed.
If
algo.maxwell_solver
is not specified, yee
is the default.
algo.em_solver_medium
(string, optional) The medium for evaluating the Maxwell solver. Available options are:
vacuum: vacuum properties are used in the Maxwell solver.
macroscopic: the macroscopic Maxwell equations are evaluated. If this option is selected, then the corresponding properties of the medium must be provided using macroscopic.sigma, macroscopic.epsilon, and macroscopic.mu for each case where the initialization style is constant. Otherwise, if the initialization style uses the parser, macroscopic.sigma_function(x,y,z), macroscopic.epsilon_function(x,y,z) and/or macroscopic.mu_function(x,y,z) must be provided using the parser initialization style for spatially varying macroscopic properties.
If
algo.em_solver_medium
is not specified, vacuum
is the default.
Maxwell solver: PSATD method
psatd.nox
,psatd.noy
,psatd.noz
(integer) optional (default 16 for all)The order of accuracy of the spatial derivatives, when using the code compiled with a PSATD solver. If
psatd.periodic_single_box_fft
is used, these can be set toinf
for infinite-order PSATD.
psatd.nx_guard
,psatd.ny_guard
,psatd.nz_guard
(integer) optionalThe number of guard cells to use with PSATD solver. If not set by users, these values are calculated automatically and determined empirically and equal the order of the solver for collocated grids and half the order of the solver for staggered grids.
psatd.periodic_single_box_fft
(0 or 1; default: 0)If true, this will not incorporate the guard cells into the box over which FFTs are performed. This is only valid when WarpX is run with periodic boundaries and a single box. In this case, using psatd.periodic_single_box_fft is equivalent to using a global FFT over the whole domain. Therefore, all the approximations that are usually made when using local FFTs with guard cells (for problems with multiple boxes) become exact in the case of the periodic, single-box FFT without guard cells.
psatd.current_correction
(0 or 1; default: 1, with the exceptions mentioned below)If true, a current correction scheme in Fourier space is applied in order to guarantee charge conservation. The default value is
psatd.current_correction=1, unless a charge-conserving current deposition scheme is used (by setting algo.current_deposition=esirkepov or algo.current_deposition=vay) or unless the div(E) cleaning scheme is used (by setting warpx.do_dive_cleaning=1).
If psatd.v_galilean is zero, the spectral solver used is the standard PSATD scheme described in Vay et al. [10] and the current correction reads
\[\widehat{\boldsymbol{J}}^{\,n+1/2}_{\mathrm{correct}} = \widehat{\boldsymbol{J}}^{\,n+1/2} - \bigg(\boldsymbol{k}\cdot\widehat{\boldsymbol{J}}^{\,n+1/2} - i \frac{\widehat{\rho}^{n+1} - \widehat{\rho}^{n}}{\Delta{t}}\bigg) \frac{\boldsymbol{k}}{k^2}\]
If psatd.v_galilean is non-zero, the spectral solver used is the Galilean PSATD scheme described in Lehe et al. [14] and the current correction reads
\[\widehat{\boldsymbol{J}}^{\,n+1/2}_{\mathrm{correct}} = \widehat{\boldsymbol{J}}^{\,n+1/2} - \bigg(\boldsymbol{k}\cdot\widehat{\boldsymbol{J}}^{\,n+1/2} - (\boldsymbol{k}\cdot\boldsymbol{v}_G) \,\frac{\widehat\rho^{n+1} - \widehat\rho^{n}\theta^2}{1 - \theta^2}\bigg) \frac{\boldsymbol{k}}{k^2}\]
where \(\theta=\exp(i\,\boldsymbol{k}\cdot\boldsymbol{v}_G\,\Delta{t}/2)\).
This option is currently implemented only for the standard PSATD, Galilean PSATD, and averaged Galilean PSATD schemes, while it is not yet available for the multi-J algorithm.
psatd.update_with_rho
(0 or 1)If true, the update equation for the electric field is expressed in terms of both the current density and the charge density, namely \(\widehat{\boldsymbol{J}}^{\,n+1/2}\), \(\widehat\rho^{n}\), and \(\widehat\rho^{n+1}\). If false, instead, the update equation for the electric field is expressed in terms of the current density \(\widehat{\boldsymbol{J}}^{\,n+1/2}\) only. If charge is expected to be conserved (by setting, for example,
psatd.current_correction=1
), then the two formulations are expected to be equivalent.
If psatd.v_galilean is zero, the spectral solver used is the standard PSATD scheme described in Vay et al. [10]:
if psatd.update_with_rho=0, the update equation for the electric field reads
\[\begin{split}\begin{split} \widehat{\boldsymbol{E}}^{\,n+1}= & \: C \widehat{\boldsymbol{E}}^{\,n} + i \, \frac{S c}{k} \boldsymbol{k}\times\widehat{\boldsymbol{B}}^{\,n} - \frac{S}{\epsilon_0 c \, k} \widehat{\boldsymbol{J}}^{\,n+1/2} \\[0.2cm] & +\frac{1-C}{k^2} (\boldsymbol{k}\cdot\widehat{\boldsymbol{E}}^{\,n}) \boldsymbol{k} + \frac{1}{\epsilon_0 k^2} \left(\frac{S}{c \, k}-\Delta{t}\right) (\boldsymbol{k}\cdot\widehat{\boldsymbol{J}}^{\,n+1/2}) \boldsymbol{k} \end{split}\end{split}\]
if psatd.update_with_rho=1, the update equation for the electric field reads
\[\begin{split}\begin{split} \widehat{\boldsymbol{E}}^{\,n+1}= & \: C\widehat{\boldsymbol{E}}^{\,n} + i \, \frac{S c}{k} \boldsymbol{k}\times\widehat{\boldsymbol{B}}^{\,n} - \frac{S}{\epsilon_0 c \, k} \widehat{\boldsymbol{J}}^{\,n+1/2} \\[0.2cm] & + \frac{i}{\epsilon_0 k^2} \left(C-\frac{S}{c\,k}\frac{1}{\Delta{t}}\right) \widehat{\rho}^{n} \boldsymbol{k} - \frac{i}{\epsilon_0 k^2} \left(1-\frac{S}{c \, k} \frac{1}{\Delta{t}}\right)\widehat{\rho}^{n+1} \boldsymbol{k} \end{split}\end{split}\]
The coefficients \(C\) and \(S\) are defined in Vay et al. [10].
If psatd.v_galilean is non-zero, the spectral solver used is the Galilean PSATD scheme described in Lehe et al. [14]:
if psatd.update_with_rho=0, the update equation for the electric field reads
\[\begin{split}\begin{split} \widehat{\boldsymbol{E}}^{\,n+1} = & \: \theta^{2} C \widehat{\boldsymbol{E}}^{\,n} + i \, \theta^{2} \frac{S c}{k} \boldsymbol{k}\times\widehat{\boldsymbol{B}}^{\,n} + \frac{i \, \nu \, \theta \, \chi_1 - \theta^{2} S}{\epsilon_0 c \, k} \widehat{\boldsymbol{J}}^{\,n+1/2} \\[0.2cm] & + \theta^{2} \frac{\chi_2-\chi_3}{k^{2}} (\boldsymbol{k}\cdot\widehat{\boldsymbol{E}}^{\,n}) \boldsymbol{k} + i \, \frac{\chi_2\left(\theta^{2}-1\right)}{\epsilon_0 c \, k^{3} \nu} (\boldsymbol{k}\cdot\widehat{\boldsymbol{J}}^{\,n+1/2}) \boldsymbol{k} \end{split}\end{split}\]
if psatd.update_with_rho=1, the update equation for the electric field reads
\[\begin{split}\begin{split} \widehat{\boldsymbol{E}}^{\,n+1} = & \: \theta^{2} C \widehat{\boldsymbol{E}}^{\,n} + i \, \theta^{2} \frac{S c}{k} \boldsymbol{k}\times\widehat{\boldsymbol{B}}^{\,n} + \frac{i \, \nu \, \theta \, \chi_1 - \theta^{2} S}{\epsilon_0 c \, k} \widehat{\boldsymbol{J}}^{\,n+1/2} \\[0.2cm] & + i \, \frac{\theta^{2} \chi_3}{\epsilon_0 k^{2}} \widehat{\rho}^{\,n} \boldsymbol{k} - i \, \frac{\chi_2}{\epsilon_0 k^{2}} \widehat{\rho}^{\,n+1} \boldsymbol{k} \end{split}\end{split}\]
The coefficients \(C\), \(S\), \(\theta\), \(\nu\), \(\chi_1\), \(\chi_2\), and \(\chi_3\) are defined in Lehe et al. [14].
The default value for psatd.update_with_rho is 1 if psatd.v_galilean is non-zero and 0 otherwise. The option psatd.update_with_rho=0 is not implemented with the following algorithms: comoving PSATD (psatd.v_comoving), time averaging (psatd.do_time_averaging=1), div(E) cleaning (warpx.do_dive_cleaning=1), and multi-J (warpx.do_multi_J=1). Note that the update with and without rho is also supported in RZ geometry.
psatd.J_in_time
(constant
orlinear
; defaultconstant
)This determines whether the current density is assumed to be constant or linear in time, within the time step over which the electromagnetic fields are evolved.
psatd.rho_in_time
(linear
; defaultlinear
)This determines whether the charge density is assumed to be linear in time, within the time step over which the electromagnetic fields are evolved.
psatd.v_galilean
(3 floats, in units of the speed of light; default0. 0. 0.
)Defines the Galilean velocity. A non-zero velocity activates the Galilean algorithm, which suppresses numerical Cherenkov instabilities (NCI) in boosted-frame simulations (see the section Numerical Stability and alternate formulation in a Galilean frame for more information). This requires the code to be compiled with the spectral solver. It also requires the use of the direct current deposition algorithm (by setting
algo.current_deposition = direct
).
psatd.use_default_v_galilean
(0 or 1; default: 0)This can be used in boosted-frame simulations only and sets the Galilean velocity along the \(z\) direction automatically as \(v_{G} = -\sqrt{1-1/\gamma^2}\), where \(\gamma\) is the Lorentz factor of the boosted frame (set by
warpx.gamma_boost
). See the section Numerical Stability and alternate formulation in a Galilean frame for more information on the Galilean algorithm for boosted-frame simulations.
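As an example, a boosted-frame run using the Galilean PSATD scheme might combine the options above as follows (the Lorentz factor is illustrative; warpx.gamma_boost is documented elsewhere in this section):
algo.maxwell_solver = psatd
algo.current_deposition = direct        # required by the Galilean algorithm
warpx.gamma_boost = 10.                 # Lorentz factor of the boosted frame
psatd.use_default_v_galilean = 1        # sets v_G = -sqrt(1 - 1/gamma^2) along z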
psatd.v_comoving
(3 floating-point values, in units of the speed of light; default0. 0. 0.
)Defines the comoving velocity in the comoving PSATD scheme. A non-zero comoving velocity selects the comoving PSATD algorithm, which suppresses the numerical Cherenkov instability (NCI) in boosted-frame simulations, under certain assumptions. This option requires that WarpX is compiled with
USE_PSATD = TRUE
. It also requires the use of direct current deposition (algo.current_deposition = direct
) and has neither been implemented nor tested with other current deposition schemes.
psatd.do_time_averaging
(0 or 1; default: 0)Whether to use an averaged Galilean PSATD algorithm or standard Galilean PSATD.
warpx.do_multi_J
(0 or 1; default: 0)Whether to use the multi-J algorithm, where current deposition and field update are performed multiple times within each time step. The number of sub-steps is determined by the input parameter
warpx.do_multi_J_n_depositions
. Unlike sub-cycling, field gathering is performed only once per time step, as in regular PIC cycles. When warpx.do_multi_J = 1, we perform linear interpolation of two distinct currents deposited at the beginning and the end of the time step, instead of using one single current deposited at half time. For simulations with strong numerical Cherenkov instability (NCI), it is recommended to use the multi-J algorithm in combination with psatd.do_time_averaging = 1
.
warpx.do_multi_J_n_depositions
(integer)Number of sub-steps to use with the multi-J algorithm, when
warpx.do_multi_J = 1
. Note that this input parameter is not optional and must always be set in all input files wherewarpx.do_multi_J = 1
. No default value is provided automatically.
Maxwell solver: macroscopic media
algo.macroscopic_sigma_method
(string, optional)The algorithm for updating electric field when
algo.em_solver_medium
is macroscopic. Available options are:
backwardeuler is a fully-implicit, first-order in time scheme for E-update (default).
laxwendroff is the semi-implicit, second-order in time scheme for E-update.
Comparing the two methods, Lax-Wendroff is more prone to developing oscillations and requires a smaller timestep for stability. On the other hand, Backward Euler is more robust but it is first-order accurate in time compared to the second-order Lax-Wendroff method.
macroscopic.sigma_function(x,y,z)
,macroscopic.epsilon_function(x,y,z)
,macroscopic.mu_function(x,y,z)
(string)To initialize spatially varying conductivity, permittivity, and permeability, respectively, using a mathematical function in the input. Constants required in the mathematical expression can be set using
my_constants
. These parameters are parsed ifalgo.em_solver_medium=macroscopic
.
macroscopic.sigma
,macroscopic.epsilon
,macroscopic.mu
(double)To initialize a constant conductivity, permittivity, and permeability of the computational medium, respectively. The default values are the corresponding values in vacuum.
Maxwell solver: kinetic-fluid hybrid
hybrid_pic_model.elec_temp
(float)If
algo.maxwell_solver
is set to hybrid
, this sets the electron temperature, in eV, used to calculate the electron pressure (see here).
hybrid_pic_model.n0_ref
(float)If
algo.maxwell_solver
is set to hybrid
, this sets the reference density, in \(m^{-3}\), used to calculate the electron pressure (see here).
hybrid_pic_model.gamma
(float) optional (default5/3
)If
algo.maxwell_solver
is set to hybrid
, this sets the exponent used to calculate the electron pressure (see here).
hybrid_pic_model.plasma_resistivity(rho,J)
(float or str) optional (default0
)If
algo.maxwell_solver
is set to hybrid
, this sets the plasma resistivity in \(\Omega m\).
hybrid_pic_model.plasma_hyper_resistivity
(float or str) optional (default0
)If
algo.maxwell_solver
is set to hybrid
, this sets the plasma hyper-resistivity in \(\Omega m^3\).
hybrid_pic_model.J[x/y/z]_external_grid_function(x, y, z, t)
(float or str) optional (default0
)If
algo.maxwell_solver
is set to hybrid
, this sets the external current (on the grid) in \(A/m^2\).
hybrid_pic_model.n_floor
(float) optional (default1
)If
algo.maxwell_solver
is set to hybrid
, this sets the plasma density floor, in \(m^{-3}\), which is useful since the generalized Ohm’s law used to calculate the E-field includes a \(1/n\) term.
hybrid_pic_model.substeps
(int) optional (default10
)If
algo.maxwell_solver
is set to hybrid
, this sets the number of sub-steps to take during the B-field update.
Note
Based on results from Stanier et al. [15] it is recommended to use linear particles when using the hybrid-PIC model.
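A hypothetical kinetic-fluid hybrid configuration combining the parameters above (all values are placeholders):
algo.maxwell_solver = hybrid
algo.particle_shape = 1                             # linear particles, as recommended above
hybrid_pic_model.elec_temp = 10.                    # electron temperature, in eV
hybrid_pic_model.n0_ref = 1.e19                     # reference density, in m^-3
hybrid_pic_model.plasma_resistivity(rho,J) = 1.e-6  # plasma resistivity, in Ohm m
hybrid_pic_model.n_floor = 1.e16                    # density floor, in m^-3
hybrid_pic_model.substeps = 10                      # sub-steps for the B-field update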
Grid types (collocated, staggered, hybrid)
warpx.grid_type
(string,collocated
,staggered
orhybrid
)Whether to use a collocated grid (all fields defined at the cell nodes), a staggered grid (fields defined on a Yee grid), or a hybrid grid (fields and currents are interpolated back and forth between a staggered grid and a nodal grid, must be used with momentum-conserving field gathering algorithm,
algo.field_gathering = momentum-conserving
). The option hybrid is currently not supported in RZ geometry.
Default: warpx.grid_type = staggered.
interpolation.galerkin_scheme
(0 or 1)Whether to use a Galerkin scheme when gathering fields to particles. When set to
1
, the interpolation orders used for field-gathering are reduced for certain field components along certain directions. For example, \(E_z\) is gathered using algo.particle_shape along \((x,y)\) and algo.particle_shape - 1 along \(z\). See equations (21)-(23) of Godfrey and Vay [16] and associated references for details.
Default: interpolation.galerkin_scheme = 0 with collocated grids and/or momentum-conserving field gathering, interpolation.galerkin_scheme = 1 otherwise.
Warning
The default behavior should not normally be changed. At present, this parameter is intended mainly for testing and development purposes.
warpx.field_centering_nox
,warpx.field_centering_noy
,warpx.field_centering_noz
(integer, optional)The order of interpolation used with staggered or hybrid grids (
warpx.grid_type = staggered
or warpx.grid_type = hybrid) and momentum-conserving field gathering (algo.field_gathering = momentum-conserving) to interpolate the electric and magnetic fields from the cell centers to the cell nodes, before gathering the fields from the cell nodes to the particle positions.
Default: warpx.field_centering_no<x,y,z> = 2 with staggered grids, warpx.field_centering_no<x,y,z> = 8 with hybrid grids (typically necessary to ensure stability in boosted-frame simulations of relativistic plasmas and beams).
warpx.current_centering_nox
,warpx.current_centering_noy
,warpx.current_centering_noz
(integer, optional)The order of interpolation used with hybrid grids (
warpx.grid_type = hybrid
) to interpolate the currents from the cell nodes to the cell centers when warpx.do_current_centering = 1, before pushing the Maxwell fields on staggered grids.
Default: warpx.current_centering_no<x,y,z> = 8 with hybrid grids (typically necessary to ensure stability in boosted-frame simulations of relativistic plasmas and beams).
warpx.do_current_centering
(bool, 0 or 1)If true, the current is deposited on a nodal grid and then centered to a staggered grid (Yee grid), using finite-order interpolation.
Default:
warpx.do_current_centering = 0
with collocated or staggered grids, warpx.do_current_centering = 1
with hybrid grids.
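For instance, a hybrid-grid setup as described above might read (a sketch; the last two lines restate the hybrid-grid defaults explicitly):
warpx.grid_type = hybrid
algo.field_gathering = momentum-conserving   # required with hybrid grids
warpx.do_current_centering = 1               # default with hybrid grids
warpx.current_centering_noz = 8              # default centering order with hybrid grids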
Additional parameters
warpx.do_dive_cleaning
(0 or 1; default: 0) Whether to use modified Maxwell equations that progressively eliminate the error in \(div(E)-\rho\). This can be useful when using a current deposition algorithm which is not strictly charge-conserving, or when using mesh refinement. These modified Maxwell equations will cause the error to propagate (at the speed of light) to the boundaries of the simulation domain, where it can be absorbed.
warpx.do_subcycling
(0 or 1; default: 0)Whether or not to use sub-cycling. Different refinement levels have a different cell size, which results in different Courant–Friedrichs–Lewy (CFL) limits for the time step. By default, when using mesh refinement, the same time step is used for all levels. This time step is taken as the CFL limit of the finest level. Hence, for coarser levels, the timestep is only a fraction of the CFL limit for this level, which may lead to numerical artifacts. With sub-cycling, each level evolves with its own time step, set to its own CFL limit. In practice, it means that when level 0 performs one iteration, level 1 performs two iterations. Currently, this option is only supported when
amr.max_level = 1
. More information can be found at https://ieeexplore.ieee.org/document/8659392.
warpx.override_sync_intervals
(string) optional (default 1)Using the Intervals parser syntax, this string defines the timesteps at which synchronization of sources (rho and J) and fields (E and B) on grid nodes at box boundaries is performed. Since the grid nodes at the interface between two neighbor boxes are duplicated in both boxes, an instability can occur if they have too different values. This option makes sure that they are synchronized periodically. Note that if Perfectly Matched Layers (PML) are used, synchronization of the E and B fields is performed at every timestep regardless of this parameter.
warpx.use_hybrid_QED
(bool; default: 0)Will use the Hybrid QED Maxwell solver when pushing fields: a QED correction is added to the field solver to solve non-linear Maxwell’s equations, according to Grismayer et al. [17]. Note that this option can only be used with the PSATD build. Furthermore, one must set
warpx.grid_type = collocated
(which otherwise would be staggered
by default).
warpx.quantum_xi
(float; default: 1.3050122e-52) Overwrites the actual quantum parameter used in Maxwell’s QED equations. Assigning a value here will make the simulation unphysical, but will allow QED effects to become more apparent. Note that this option only has an effect if the warpx.use_hybrid_QED flag is also enabled.
warpx.do_device_synchronize
(bool) optional (default 1) When running on an accelerated platform, whether to call
amrex::Gpu::synchronize()
around profiling regions. This allows the profiler to give meaningful timers, but (hardly) slows down the simulation.
warpx.sort_intervals
(string) optional (defaults: -1 on CPU; 4 on GPU) Using the Intervals parser syntax, this string defines the timesteps at which particles are sorted. If <=0, particles are not sorted. Sorting is turned on by default on GPUs for performance reasons (to improve memory locality).
warpx.sort_particles_for_deposition
(bool) optional (default:true
for the CUDA backend, otherwisefalse
)This option controls the type of sorting used if particle sorting is turned on, i.e. if
sort_intervals
is not<=0
. Iftrue
, particles will be sorted by cell to optimize deposition with many particles per cell, in the order x -> y -> z -> ppc. If false, particles will be sorted by bin, using the sort_bin_size parameter below, in the order ppc -> x -> y -> z. true is recommended for best performance on NVIDIA GPUs, especially if there are many particles per cell.
warpx.sort_idx_type
(list of int) optional (default:0 0 0
)This controls the type of grid used to sort the particles when
sort_particles_for_deposition
is true. Possible values are:
idx_type = {0, 0, 0}: Sort particles to a cell centered grid
idx_type = {1, 1, 1}: Sort particles to a node centered grid
idx_type = {2, 2, 2}: Compromise between a cell and node centered grid.
In 2D (XZ and RZ), only the first two elements are read. In 1D, only the first element is read.
warpx.sort_bin_size
(list of int) optional (default1 1 1
)If
sort_intervals
is activated and sort_particles_for_deposition is false, particles are sorted in bins of sort_bin_size cells. In 2D, only the first two elements are read.
warpx.do_shared_mem_charge_deposition
(bool) optional (default false)If activated, charge deposition will allocate and use small temporary buffers on which to accumulate deposited charge values from particles. On GPUs these buffers will reside in
__shared__
memory, which is faster than the usual__global__
memory. Performance impact will depend on the relative overhead of assigning the particles to bins small enough to fit in the space available for the temporary buffers.
warpx.do_shared_mem_current_deposition
(bool) optional (default false)If activated, current deposition will allocate and use small temporary buffers on which to accumulate deposited current values from particles. On GPUs these buffers will reside in
__shared__
memory, which is faster than the usual__global__
memory. Performance impact will depend on the relative overhead of assigning the particles to bins small enough to fit in the space available for the temporary buffers. Performance is mostly improved when there is lots of contention between particles writing to the same cell (e.g. for high particles per cell). This feature is only available for CUDA and HIP, and is only recommended for 3D or 2D.
warpx.shared_tilesize
(list of int) optional (default 6 6 8 in 3D; 14 14 in 2D; 1s otherwise)Used to tune performance when
do_shared_mem_current_deposition or do_shared_mem_charge_deposition is enabled. shared_tilesize is the size of the temporary buffer allocated in shared memory for a threadblock. A larger tilesize requires more shared memory, but gives more work to each threadblock, which can lead to higher occupancy, and allows for more buffered writes to __shared__ instead of __global__. The defaults in 2D and 3D are chosen from experimentation, but can be improved upon for specific problems. The other defaults are not optimized and should always be fine-tuned for the problem.
warpx.shared_mem_current_tpb
(int) optional (default 128)Used to tune performance when
do_shared_mem_current_deposition
is enabled. shared_mem_current_tpb
controls the number of threads per block (tpb), i.e. the number of threads operating on a shared buffer.
Diagnostics and output
In-situ visualization
WarpX has four types of diagnostics:
FullDiagnostics
consist of dumps of fields and particles at given iterations,
BackTransformedDiagnostics
are used when running a simulation in a boosted frame, to reconstruct output data to the lab frame,
BoundaryScrapingDiagnostics
are used to collect the particles that are absorbed at the boundary, throughout the simulation, and
ReducedDiags
allow the user to compute some reduced quantity (particle temperature, max of a field) and write a small amount of data to text files.
Similar to what is done for physical species, WarpX has a class Diagnostics that allows users to initialize different diagnostics, each of them with different fields, resolution and period.
This currently applies to standard diagnostics, but should be extended to back-transformed diagnostics and reduced diagnostics (and others) in the near future.
Full Diagnostics
FullDiagnostics
consist of dumps of fields and particles at given iterations.
Similar to what is done for physical species, WarpX has a class Diagnostics that allows users to initialize different diagnostics, each of them with different fields, resolution and period.
The user specifies the number of diagnostics and the name of each of them, and then specifies options for each of them separately.
Note that some parameters (those that do not start with a <diag_name>.
prefix) apply to all diagnostics.
This should be changed in the future.
In-situ capabilities can be used by turning on Sensei or Ascent (provided they are installed) through the output format, see below.
diagnostics.enable
(0 or 1, optional, default 1)Whether to enable or disable diagnostics. This flag overwrites all other diagnostics input parameters.
diagnostics.diags_names
(list of string, optional, default empty) Name of each diagnostic. Example:
diagnostics.diags_names = diag1 my_second_diag
.
<diag_name>.intervals
(string)Using the Intervals parser syntax, this string defines the timesteps at which data is dumped. Use a negative number or 0 to disable data dumping. example:
diag1.intervals = 10,20:25:1
. Note that by default the last timestep is dumped regardless of this parameter. This can be changed using the parameter <diag_name>.dump_last_timestep
described below.
<diag_name>.dump_last_timestep
(bool optional, default 1)If this is 1, the last timestep is dumped regardless of
<diag_name>.intervals
.
<diag_name>.diag_type
(string)Type of diagnostics.
Full, BackTransformed, and BoundaryScraping. Example: diag1.diag_type = Full or diag1.diag_type = BackTransformed
<diag_name>.format
(string optional, defaultplotfile
)Flush format. Possible values are:
plotfile for the native AMReX format.
checkpoint for a checkpoint file; only works with <diag_name>.diag_type = Full.
openpmd for the openPMD format. Requires building WarpX with USE_OPENPMD=TRUE (see instructions).
ascent for in-situ visualization using Ascent.
sensei for in-situ visualization using Sensei.
example:
diag1.format = openpmd
.
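As an illustration, the following is a minimal sketch of a full-diagnostic setup using only parameters documented in this section; the diagnostic name diag1 and the chosen values are arbitrary.
# Minimal sketch of a full diagnostic (name "diag1" and values are illustrative)
diagnostics.diags_names = diag1
diag1.diag_type = Full
diag1.intervals = 100
diag1.format = plotfile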
<diag_name>.sensei_config
(string)Only read if
<diag_name>.format = sensei
. Points to the SENSEI XML file which selects and configures the desired back end.
<diag_name>.sensei_pin_mesh
(integer; 0 by default)Only read if
<diag_name>.format = sensei
. When set to 1, the lower left corner of the mesh is pinned to 0., 0., 0.
<diag_name>.openpmd_backend
(bp
,h5
orjson
) optional, only used if<diag_name>.format = openpmd
I/O backend for openPMD data dumps.
bp
is the ADIOS I/O library,h5
is the HDF5 format, andjson
is a simple text format.json
only works with serial/single-rank jobs. When WarpX is compiled with openPMD support, the first available backend in the order given above is taken.
<diag_name>.openpmd_encoding
(optional,v
(variable based),f
(file based) org
(group based) ) only read if<diag_name>.format = openpmd
.openPMD file output encoding. File based: one file per timestep (slower); group/variable based: one file for all steps (faster).
variable based
is an experimental feature with ADIOS2 and not supported for back-transformed diagnostics. Default:f
(full diagnostics)
<diag_name>.adios2_operator.type
(zfp
,blosc
) optional,ADIOS2 I/O operator type for openPMD data dumps.
<diag_name>.adios2_operator.parameters.*
optional,ADIOS2 I/O operator parameters for openPMD data dumps.
A typical example for ADIOS2 output using lossless compression with
blosc
using thezstd
compressor and 6 CPU threads per MPI rank (e.g. for a GPU run with spare CPU resources):<diag_name>.adios2_operator.type = blosc <diag_name>.adios2_operator.parameters.compressor = zstd <diag_name>.adios2_operator.parameters.clevel = 1 <diag_name>.adios2_operator.parameters.doshuffle = BLOSC_BITSHUFFLE <diag_name>.adios2_operator.parameters.threshold = 2048 <diag_name>.adios2_operator.parameters.nthreads = 6 # per MPI rank (and thus per GPU)
or for the lossy ZFP compressor using very strong compression per scalar:
<diag_name>.adios2_operator.type = zfp <diag_name>.adios2_operator.parameters.precision = 3
<diag_name>.adios2_engine.type
(bp4
,sst
,ssc
,dataman
) optional,ADIOS2 Engine type for openPMD data dumps. See full list of engines at ADIOS2 readthedocs
<diag_name>.adios2_engine.parameters.*
optional,ADIOS2 Engine parameters for openPMD data dumps.
Example parameters for the BP engine include setting the number of writers (
NumAggregators
) and transparently redirecting data to burst buffers. A detailed list of engine-specific parameters is available in the official ADIOS2 documentation.<diag_name>.adios2_engine.parameters.NumAggregators = 2048 <diag_name>.adios2_engine.parameters.BurstBufferPath="/mnt/bb/username"
<diag_name>.fields_to_plot
(list of strings, optional)Fields written to output. Possible scalar fields:
part_per_cell
rho
phi
F
part_per_grid
divE
divB
andrho_<species_name>
, where<species_name>
must match the name of one of the available particle species. Note thatphi
will only be written out when do_electrostatic==labframe. Also, note that for<diag_name>.diag_type = BackTransformed
, the only scalar field currently supported isrho
. Possible vector field components in Cartesian geometry:Ex
Ey
Ez
Bx
By
Bz
jx
jy
jz
. Possible vector field components in RZ geometry:Er
Et
Ez
Br
Bt
Bz
jr
jt
jz
. The default is<diag_name>.fields_to_plot = Ex Ey Ez Bx By Bz jx jy jz
in Cartesian geometry and<diag_name>.fields_to_plot = Er Et Ez Br Bt Bz jr jt jz
in RZ geometry. When the special valuenone
is specified, no fields are written out. Note that the fields are averaged on the cell centers before they are written to file.
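For example, a hypothetical selection that writes only a subset of the field components plus the total charge density and the charge density of a species named electrons (the species name is an assumption for illustration) could look like:
# Illustrative field selection (assumes a species named "electrons" exists)
diag1.fields_to_plot = Ex Ez By rho rho_electrons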
<diag_name>.dump_rz_modes
(0 or 1) optional (default 0)Whether to save all modes when in RZ. When
openpmd_backend = openpmd
, this parameter is ignored and all modes are saved. Otherwise, a 2D Cartesian slice of the fields is reconstructed for output at \(\theta=0\).
<diag_name>.particle_fields_to_plot
(list of strings, optional)Names of per-cell diagnostics of particle properties to calculate and output as additional fields. Note that the deposition onto the grid does not respect the particle shape factor, but instead uses nearest-grid point interpolation. Default is none. Parser functions for these field names are specified by
<diag_name>.particle_fields.<field_name>(x,y,z,ux,uy,uz)
. Also, note that this option is only available for<diag_name>.diag_type = Full
<diag_name>.particle_fields_species
(list of strings, optional)Species for which to calculate
particle_fields_to_plot
. Fields will be calculated separately for each specified species. The default is a list of all of the available particle species.
<diag_name>.particle_fields.<field_name>.do_average
(0 or 1) optional (default 1)Whether the diagnostic is an average or a sum. With an average, the sum over the specified function is divided by the sum of the particle weights in each cell.
<diag_name>.particle_fields.<field_name>(x,y,z,ux,uy,uz)
(parser string)Parser function to be calculated for each particle per cell. The averaged field written is
\[\texttt{<field_name>_<species>} = \frac{\sum_{i=1}^N w_i \, f(x_i,y_i,z_i,u_{x,i},u_{y,i},u_{z,i})}{\sum_{i=1}^N w_i}\]where \(w_i\) is the particle weight, \(f()\) is the parser function, and \((x_i,y_i,z_i)\) are particle positions in units of a meter. The sums are over all particles of type
<species>
in a cell (ignoring the particle shape factor) that satisfy<diag_name>.particle_fields.<field_name>.filter(x,y,z,ux,uy,uz)
. When<diag_name>.particle_fields.<field_name>.do_average
is 0, the division by the sum over particle weights is not done. In 1D or 2D, the particle coordinates will follow the WarpX convention. \((u_{x,i},u_{y,i},u_{z,i})\) are components of the particle four-momentum. \(u = \gamma v/c\), \(\gamma\) is the Lorentz factor, \(v\) is the particle velocity and \(c\) is the speed of light. For photons, we use the standardized momentum \(u = p/(m_{e}c)\), where \(p\) is the momentum of the photon and \(m_{e}\) the mass of an electron.
<diag_name>.particle_fields.<field_name>.filter(x,y,z,ux,uy,uz)
(parser string, optional)Parser function returning a boolean for whether to include a particle in the diagnostic. If not specified, all particles will be included (see above). The function arguments are the same as above.
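As a sketch (assuming a species named electrons exists), the following computes the weighted per-cell average of uz, restricted to forward-moving particles, using the parameters documented above:
# Sketch of a per-cell particle-field diagnostic (species name "electrons" is assumed)
diag1.particle_fields_to_plot = uz_avg
diag1.particle_fields_species = electrons
diag1.particle_fields.uz_avg(x,y,z,ux,uy,uz) = uz
diag1.particle_fields.uz_avg.do_average = 1
diag1.particle_fields.uz_avg.filter(x,y,z,ux,uy,uz) = (uz > 0)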
<diag_name>.plot_raw_fields
(0 or 1) optional (default 0)By default, the fields written in the plot files are averaged on the cell centers. When
<diag_name>.plot_raw_fields = 1
, then the raw (i.e. non-averaged) fields are also saved in the output files. Only works with<diag_name>.format = plotfile
. See this section in the yt documentation for more details on how to view raw fields.
<diag_name>.plot_raw_fields_guards
(0 or 1) optional (default 0)Only used when
<diag_name>.plot_raw_fields = 1
. Whether to include the guard cells in the output of the raw fields. Only works with<diag_name>.format = plotfile
.
<diag_name>.coarsening_ratio
(list of int) optional (default 1 1 1)Reduce size of the selected diagnostic fields output by this ratio in each dimension. (For a ratio of N, this is done by averaging the fields over N or (N+1) points depending on the staggering). If
blocking_factor
andmax_grid_size
are used for the domain decomposition, as detailed in the domain decomposition section,coarsening_ratio
should be an integer divisor ofblocking_factor
. Ifwarpx.numprocs
is used instead, the total number of cells in a given dimension must be a multiple of thecoarsening_ratio
multiplied bynumprocs
in that dimension.
<diag_name>.file_prefix
(string) optional (default diags/<diag_name>)Root for output file names. Supports sub-directories.
<diag_name>.file_min_digits
(int) optional (default 6)The minimum number of digits used for the iteration number appended to the diagnostic file names.
<diag_name>.diag_lo
(list float, 1 per dimension) optional (default -infinity -infinity -infinity)Lower corner of the output fields (if smaller than
warpx.dom_lo
, then set towarpx.dom_lo
). Currently, when thediag_lo
is different fromwarpx.dom_lo
, particle output is disabled.
<diag_name>.diag_hi
(list float, 1 per dimension) optional (default +infinity +infinity +infinity)Higher corner of the output fields (if larger than
warpx.dom_hi
, then set towarpx.dom_hi
). Currently, when thediag_hi
is different fromwarpx.dom_hi
, particle output is disabled.
<diag_name>.write_species
(0 or 1) optional (default 1)Whether to write species output or not. For checkpoint format, always set this parameter to 1.
<diag_name>.species
(list of string, default all physical species in the simulation)Which species are dumped in this diagnostic.
<diag_name>.<species_name>.variables
(list of strings separated by spaces, optional)List of particle quantities to write to output. Choices are
w
for the particle weight andux
uy
uz
for the particle momenta. When using the lab-frame electrostatic solver,phi
(electrostatic potential, on the macroparticles) is also available. By default, all particle quantities (exceptphi
) are written. If<diag_name>.<species_name>.variables = none
, no particle data are written, except for particle positions, which are always included.
<diag_name>.<species_name>.random_fraction
(float) optionalIf provided
<diag_name>.<species_name>.random_fraction = a
, only a random fraction of the particles of this species will be dumped in diag<diag_name>
, i.e. a particle is dumped if rand() < a, where rand() denotes a random number generator. The value a provided should be between 0 and 1.
<diag_name>.<species_name>.uniform_stride
(int) optionalIf provided
<diag_name>.<species_name>.uniform_stride = n
, every n-th particle of this species will be dumped, selected uniformly. The value provided should be an integer greater than or equal to 0.
<diag_name>.<species_name>.plot_filter_function(t,x,y,z,ux,uy,uz)
(string) optionalUsers can provide an expression returning a boolean for whether a particle is dumped. t represents the physical time in seconds during the simulation. x, y, z represent particle positions in units of meters. ux, uy, uz represent particle momenta in units of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor and \(v/c\) is the particle velocity normalized by the speed of light. E.g., if (x>0.0)*(uz<10.0) is provided, only particles located at positions x greater than 0 and having momentum uz less than 10 will be dumped.
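Putting a few of these per-species options together, a sketch for a species named electrons (an assumed name) that writes only weights and momenta for a random subset of forward-moving particles:
# Sketch of per-species output controls (species name "electrons" is assumed)
diag1.species = electrons
diag1.electrons.variables = w ux uy uz
diag1.electrons.random_fraction = 0.1
diag1.electrons.plot_filter_function(t,x,y,z,ux,uy,uz) = (uz > 0.0)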
amrex.async_out
(0 or 1) optional (default 0)Whether to use asynchronous IO when writing plotfiles. This only has an effect when using the AMReX plotfile format. Please see the data analysis section for more information.
amrex.async_out_nfiles
(int) optional (default 64)The maximum number of files to write to when using asynchronous IO. To use asynchronous IO with more than
amrex.async_out_nfiles
MPI ranks, WarpX must be configured with-DWarpX_MPI_THREAD_MULTIPLE=ON
. Please see the data analysis section for more information.
warpx.field_io_nfiles
andwarpx.particle_io_nfiles
(int) optional (default 1024)The maximum number of files to use when writing field and particle data to plotfile directories.
warpx.mffile_nstreams
(int) optional (default 4)Limit the number of concurrent readers per file.
BackTransformed Diagnostics
BackTransformed
diagnostics are used when running a simulation in a boosted frame, to reconstruct output data in the lab frame. This option can be set using<diag_name>.diag_type = BackTransformed
. We support the following list of options from Full Diagnostics<diag_name>.format
,<diag_name>.openpmd_backend
,<diag_name>.dump_rz_modes
,<diag_name>.file_prefix
,<diag_name>.diag_lo
,<diag_name>.diag_hi
,<diag_name>.write_species
,<diag_name>.species
.Additional options for this diagnostic include:
<diag_name>.num_snapshots_lab
(integer)Only used when
<diag_name>.diag_type
isBackTransformed
. The number of lab-frame snapshots that will be written. Only this option orintervals
should be specified; a run-time error occurs if the user attempts to set bothnum_snapshots_lab
andintervals
.
<diag_name>.intervals
(string)Only used when
<diag_name>.diag_type
isBackTransformed
. Using the Intervals parser syntax, this string defines the lab frame times at which data is dumped, given as multiples of the step sizedt_snapshots_lab
ordz_snapshots_lab
described below. Example:btdiag1.intervals = 10:11,20:24:2
andbtdiag1.dt_snapshots_lab = 1.e-12
indicate to dump at lab times1e-11
,1.1e-11
,2e-11
,2.2e-11
, and2.4e-11
seconds. Note that the stop interval, the second number in the slice, must always be specified. Only this option ornum_snapshots_lab
should be specified; a run-time error occurs if the user attempts to set bothnum_snapshots_lab
andintervals
.
<diag_name>.dt_snapshots_lab
(float, in seconds)Only used when
<diag_name>.diag_type
isBackTransformed
. The time interval in between the lab-frame snapshots (where this time interval is expressed in the laboratory frame).
<diag_name>.dz_snapshots_lab
(float, in meters)Only used when
<diag_name>.diag_type
isBackTransformed
. Distance between the lab-frame snapshots (expressed in the laboratory frame).dt_snapshots_lab
is then computed bydt_snapshots_lab = dz_snapshots_lab/c
. Either dt_snapshots_lab or dz_snapshots_lab is required.
<diag_name>.buffer_size
(integer)Only used when
<diag_name>.diag_type
isBackTransformed
. The default size of the back-transformed diagnostic buffers used to generate lab-frame data is 256. That is, when the multifab with lab-frame data has 256 z-slices, the data will be flushed out. However, if many lab-frame snapshots are required for diagnostics and visualization, the GPU may run out of memory with many large boxes of size 256 in the z-direction. This input parameter can then be used to set a smaller buffer size, preferably a multiple of 8, such that a large number of lab-frame snapshots can be generated without running out of GPU memory. The downside of using a small buffer size is that the I/O time may increase due to frequent flushes of the lab-frame data. The other option is to keep the default buffer size and use slices to reduce the memory footprint and maintain optimum I/O performance.
<diag_name>.do_back_transformed_fields
(0 or 1) optional (default 1)Only used when
<diag_name>.diag_type
isBackTransformed
. Whether to back transform the fields or not. Note that forBackTransformed
diagnostics, at least one of the options<diag_name>.do_back_transformed_fields
or<diag_name>.do_back_transformed_particles
must be 1.
<diag_name>.do_back_transformed_particles
(0 or 1) optional (default 1)Only used when
<diag_name>.diag_type
isBackTransformed
. Whether to back transform the particle data or not. Note that forBackTransformed
diagnostics, at least one of the options<diag_name>.do_back_transformed_fields
or<diag_name>.do_back_transformed_particles
must be 1. If<diag_name>.write_species = 0
, then<diag_name>.do_back_transformed_particles
will be set to 0 in the simulation and particles will not be backtransformed.
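As a sketch, a back-transformed diagnostic that writes 20 lab-frame snapshots separated by 1 ps to openPMD files could be set up as follows (the diagnostic name and values are illustrative only):
# Sketch of a BackTransformed diagnostic (name and values are illustrative)
diagnostics.diags_names = btdiag1
btdiag1.diag_type = BackTransformed
btdiag1.format = openpmd
btdiag1.num_snapshots_lab = 20
btdiag1.dt_snapshots_lab = 1.e-12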
Boundary Scraping Diagnostics
BoundaryScrapingDiagnostics
are used to collect the particles that are absorbed at the boundaries, throughout the simulation.
This diagnostic type is specified by setting <diag_name>.diag_type
= BoundaryScraping
.
Currently, the only supported output format is openPMD, so the user also needs to set <diag>.format=openpmd
and WarpX must be compiled with openPMD turned on.
The data that is to be collected and recorded is controlled per species and per boundary by setting one or more of the flags to 1
,
<species>.save_particles_at_xlo/ylo/zlo
, <species>.save_particles_at_xhi/yhi/zhi
, and <species>.save_particles_at_eb
.
(Note that this diagnostic does not save any fields; it only saves particles.)
The data collected at each boundary is written out to a subdirectory of the diagnostics directory with the name of the boundary, for example, particles_at_xlo
, particles_at_zhi
, or particles_at_eb
.
By default, all of the collected particle data is written out at the end of the simulation. Optionally, the <diag_name>.intervals
parameter can be given to specify writing out the data more often.
This can be important if a large number of particles are lost, since it avoids filling up memory with the accumulated lost-particle data.
In addition to their usual attributes, the saved particles have
an integer attribute
stepScraped
, which indicates the PIC iteration at which each particle was absorbed at the boundary; a real attributedeltaTimeScraped
, which indicates the time between the time associated with stepScraped and the exact time when each particle hits the boundary; and three real attributesnx
,ny
,nz
, which represent the three components of the normal to the boundary at the point of contact of the particle (not saved if the particle reaches a non-EB boundary).
BoundaryScrapingDiagnostics
can be used with <diag_name>.<species>.random_fraction
, <diag_name>.<species>.uniform_stride
, and <diag_name>.<species>.plot_filter_function
, which have the same behavior as for FullDiagnostics
. For BoundaryScrapingDiagnostics
, these filters are applied at the time the data is written to file. An implication of this is that more particles may initially be accumulated in memory than are ultimately written. t
in plot_filter_function
refers to the time the diagnostic is written rather than the time the particle crossed the boundary.
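A sketch, assuming a species named electrons, that collects particles absorbed at the upper z boundary and at the embedded boundary and writes openPMD output every 100 steps:
# Sketch of a BoundaryScraping diagnostic (species name "electrons" and diag name "scrape" are assumed)
electrons.save_particles_at_zhi = 1
electrons.save_particles_at_eb = 1
diagnostics.diags_names = scrape
scrape.diag_type = BoundaryScraping
scrape.format = openpmd
scrape.intervals = 100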
Reduced Diagnostics
ReducedDiags
allow the user to compute some reduced quantity (particle temperature, max of a field) and write a small amount of data to text files.
warpx.reduced_diags_names
(strings, separated by spaces)The names given by the user to the reduced diagnostics; these are also the names of the output .txt files. Reduced diagnostics aim to produce simple outputs of the time history of some physical quantities. If
warpx.reduced_diags_names
is not provided in the input file, no reduced diagnostics will be done. This is then used in the rest of the input deck; in this documentation we use<reduced_diags_name>
as a placeholder.
<reduced_diags_name>.type
(string)The type of reduced diagnostics associated with this
<reduced_diags_name>
. For example,ParticleEnergy
,FieldEnergy
, etc. All available types are described below in detail. For all reduced diagnostics, the first and the second columns in the output file are the time step and the corresponding physical time in seconds, respectively.ParticleEnergy
This type computes the total and mean relativistic particle kinetic energy among all species:
\[E_p = \sum_{i=1}^N w_i \, \left( \sqrt{|\boldsymbol{p}_i|^2 c^2 + m_0^2 c^4} - m_0 c^2 \right)\]where \(\boldsymbol{p}_i\) is the relativistic momentum of the \(i\)-th particle, \(c\) is the speed of light, \(m_0\) is the rest mass, \(N\) is the number of particles, and \(w_i\) is the weight of the \(i\)-th particle.
The output columns are the total energy of all species, the total energy per species, the total mean energy \(E_p / \sum_i w_i\) of all species, and the total mean energy per species.
ParticleMomentum
This type computes the total and mean relativistic particle momentum among all species:
\[\boldsymbol{P}_p = \sum_{i=1}^N w_i \, \boldsymbol{p}_i\]where \(\boldsymbol{p}_i\) is the relativistic momentum of the \(i\)-th particle, \(N\) is the number of particles, and \(w_i\) is the weight of the \(i\)-th particle.
The output columns are the components of the total momentum of all species, the total momentum per species, the total mean momentum \(\boldsymbol{P}_p / \sum_i w_i\) of all species, and the total mean momentum per species.
FieldEnergy
This type computes the electromagnetic field energy
\[E_f = \frac{1}{2} \sum_{\text{cells}} \left( \varepsilon_0 |\boldsymbol{E}|^2 + \frac{|\boldsymbol{B}|^2}{\mu_0} \right) \Delta V\]where \(\boldsymbol{E}\) is the electric field, \(\boldsymbol{B}\) is the magnetic field, \(\varepsilon_0\) is the vacuum permittivity, \(\mu_0\) is the vacuum permeability, \(\Delta V\) is the cell volume (or cell area in 2D), and the sum is over all cells.
The output columns are the total field energy \(E_f\), the \(\boldsymbol{E}\) field energy, and the \(\boldsymbol{B}\) field energy, at each mesh refinement level.
FieldMomentum
This type computes the electromagnetic field momentum
\[\boldsymbol{P}_f = \varepsilon_0 \sum_{\text{cells}} \left( \boldsymbol{E} \times \boldsymbol{B} \right) \Delta V\]where \(\boldsymbol{E}\) is the electric field, \(\boldsymbol{B}\) is the magnetic field, \(\varepsilon_0\) is the vacuum permittivity, \(\Delta V\) is the cell volume (or cell area in 2D), and the sum is over all cells.
The output columns are the components of the total field momentum \(\boldsymbol{P}_f\) at each mesh refinement level.
Note that the fields are not averaged on the cell centers before their energy is computed.
FieldMaximum
This type computes the maximum value of each component of the electric and magnetic fields and of the norm of the electric and magnetic field vectors. Note that measuring maximum fields in a plasma can be very noisy in PIC simulations; this diagnostic is better suited to the analysis of scenarios such as an electromagnetic wave propagating in vacuum.
The output columns are the maximum value of the \(E_x\) field, the maximum value of the \(E_y\) field, the maximum value of the \(E_z\) field, the maximum value of the norm \(|E|\) of the electric field, the maximum value of the \(B_x\) field, the maximum value of the \(B_y\) field, the maximum value of the \(B_z\) field and the maximum value of the norm \(|B|\) of the magnetic field, at mesh refinement levels from 0 to \(n\).
Note that the fields are averaged on the cell centers before their maximum values are computed.
FieldProbe
This type computes the value of each component of the electric and magnetic fields and of the Poynting vector (a measure of electromagnetic flux) at points in the domain.
Multiple geometries for point probes can be specified via
<reduced_diags_name>.probe_geometry = ...
:Point
(default): a single pointLine
: a line of points with equal spacingPlane
: a plane of points with equal spacing
Point: The point where the fields are measured is specified through the input parameters
<reduced_diags_name>.x_probe
,<reduced_diags_name>.y_probe
and<reduced_diags_name>.z_probe
.Line: probe a 1 dimensional line of points to create a line detector. Initial input parameters
x_probe
,y_probe
, andz_probe
designate one end of the line detector, while the far end is specified via<reduced_diags_name>.x1_probe
,<reduced_diags_name>.y1_probe
,<reduced_diags_name>.z1_probe
. Additionally,<reduced_diags_name>.resolution
must be defined to give the number of detector points along the line (equally spaced) to probe.Plane: probe a 2 dimensional plane of points to create a square plane detector. Initial input parameters
x_probe
,y_probe
, andz_probe
designate the center of the detector. The detector plane is normal to a vector specified by<reduced_diags_name>.target_normal_x
,<reduced_diags_name>.target_normal_y
, and<reduced_diags_name>.target_normal_z
. Note that it is not necessary to specify thetarget_normal
vector in a 2D simulation (the only supported normal is iny
). The top of the plane is perpendicular to an “up” vector denoted by<reduced_diags_name>.target_up_x
,<reduced_diags_name>.target_up_y
, and<reduced_diags_name>.target_up_z
. The detector has a square radius to be determined by<reduced_diags_name>.detector_radius
. Similarly to the line detector, the plane detector requires a resolution<reduced_diags_name>.resolution
, which denotes the number of detector particles along each side of the square detector.The output columns are the value of the \(E_x\) field, the value of the \(E_y\) field, the value of the \(E_z\) field, the value of the \(B_x\) field, the value of the \(B_y\) field, the value of the \(B_z\) field and the value of the Poynting Vector \(|S|\) of the electromagnetic fields, at mesh refinement levels from 0 to \(n\), at point (\(x\), \(y\), \(z\)).
The fields are always interpolated to the measurement point. The interpolation order can be set by specifying
<reduced_diags_name>.interp_order
, defaulting to1
. In RZ geometry, this only saves the 0’th azimuthal mode component of the fields. Time integrated electric and magnetic field components can instead be obtained by specifying<reduced_diags_name>.integrate = true
. The integration is done every time step even when the data is written out less often. In a moving window simulation, the FieldProbe can be set to follow the moving frame by specifying<reduced_diags_name>.do_moving_window_FP = 1
(default 0).
Warning: the FieldProbe reduced diagnostic does not yet add a Lorentz back transformation for boosted frame simulations. Thus, it records field data in the boosted frame, not (yet) in the lab frame.
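For instance, a sketch of a line probe with 100 equally spaced points between two assumed end points, recorded every 10 steps (the diagnostic name and coordinates are illustrative):
# Sketch of a FieldProbe reduced diagnostic with Line geometry (coordinates are illustrative)
warpx.reduced_diags_names = probe1
probe1.type = FieldProbe
probe1.intervals = 10
probe1.probe_geometry = Line
probe1.x_probe = 0.0
probe1.y_probe = 0.0
probe1.z_probe = 0.0
probe1.x1_probe = 0.0
probe1.y1_probe = 0.0
probe1.z1_probe = 100.e-6
probe1.resolution = 100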
RhoMaximum
This type computes the maximum and minimum values of the total charge density as well as the maximum absolute value of the charge density of each charged species. Please be aware that measuring maximum charge densities might be very noisy in PIC simulations.
The output columns are the maximum value of the \(\rho\) field, the minimum value of the \(\rho\) field, and the maximum value of the absolute value \(|\rho|\) of the charge density of each charged species.
Note that the charge densities are averaged on the cell centers before their maximum values are computed.
FieldReduction
This type computes an arbitrary reduction of the positions, the current density, and the electromagnetic fields.
<reduced_diags_name>.reduced_function(x,y,z,Ex,Ey,Ez,Bx,By,Bz,jx,jy,jz)
(string)An analytic function to be reduced must be provided, using the math parser.
<reduced_diags_name>.reduction_type
(string)The type of reduction to be performed. It must be either
Maximum
,Minimum
orIntegral
.Integral
computes the spatial integral of the function defined in the parser by summing its value on all grid points and multiplying the result by the volume of a cell. Please also be aware that measuring maximum quantities might be very noisy in PIC simulations.
The only output column is the reduced value.
Note that the fields are averaged on the cell centers before the reduction is performed.
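For example, a sketch that records the maximum of the squared electric-field amplitude (a quantity chosen here purely for illustration):
# Sketch of a FieldReduction reduced diagnostic (name and function are illustrative)
warpx.reduced_diags_names = max_E2
max_E2.type = FieldReduction
max_E2.intervals = 50
max_E2.reduced_function(x,y,z,Ex,Ey,Ez,Bx,By,Bz,jx,jy,jz) = Ex*Ex + Ey*Ey + Ez*Ez
max_E2.reduction_type = Maximum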
ParticleNumber
This type computes the total number of macroparticles and of physical particles (i.e. the sum of their weights) in the whole simulation domain (for each species and summed over all species). It can be useful in particular for simulations with creation (ionization, QED processes) or removal (resampling) of particles.
The output columns are total number of macroparticles summed over all species, total number of macroparticles of each species, sum of the particles’ weight summed over all species, sum of the particles’ weight of each species.
BeamRelevant
This type computes properties of a particle beam relevant for particle accelerators, like position, momentum, emittance, etc.
<reduced_diags_name>.species
must be provided, such that the diagnostics are done for this (beam-like) species only.The output columns (for 3D-XYZ) are the following, where the average is done over the whole species (typical usage: the particle beam is in a separate species):
[0]: simulation step (iteration).
[1]: time (s).
[2], [3], [4]: The mean values of beam positions (m) \(\langle x \rangle\), \(\langle y \rangle\), \(\langle z \rangle\).
[5], [6], [7]: The mean values of beam relativistic momenta (kg m/s) \(\langle p_x \rangle\), \(\langle p_y \rangle\), \(\langle p_z \rangle\).
[8]: The mean Lorentz factor \(\langle \gamma \rangle\).
[9], [10], [11]: The RMS values of beam positions (m) \(\delta_x = \sqrt{ \langle (x - \langle x \rangle)^2 \rangle }\), \(\delta_y = \sqrt{ \langle (y - \langle y \rangle)^2 \rangle }\), \(\delta_z = \sqrt{ \langle (z - \langle z \rangle)^2 \rangle }\).
[12], [13], [14]: The RMS values of beam relativistic momenta (kg m/s) \(\delta_{px} = \sqrt{ \langle (p_x - \langle p_x \rangle)^2 \rangle }\), \(\delta_{py} = \sqrt{ \langle (p_y - \langle p_y \rangle)^2 \rangle }\), \(\delta_{pz} = \sqrt{ \langle (p_z - \langle p_z \rangle)^2 \rangle }\).
[15]: The RMS value of the Lorentz factor \(\sqrt{ \langle (\gamma - \langle \gamma \rangle)^2 \rangle }\).
[16], [17], [18]: beam projected transverse RMS normalized emittance (m) \(\epsilon_x = \dfrac{1}{mc} \sqrt{\delta_x^2 \delta_{px}^2 - \Big\langle (x-\langle x \rangle) (p_x-\langle p_x \rangle) \Big\rangle^2}\), \(\epsilon_y = \dfrac{1}{mc} \sqrt{\delta_y^2 \delta_{py}^2 - \Big\langle (y-\langle y \rangle) (p_y-\langle p_y \rangle) \Big\rangle^2}\), \(\epsilon_z = \dfrac{1}{mc} \sqrt{\delta_z^2 \delta_{pz}^2 - \Big\langle (z-\langle z \rangle) (p_z-\langle p_z \rangle) \Big\rangle^2}\).
[19], [20]: Twiss alpha for the transverse directions \(\alpha_x = - \Big\langle (x-\langle x \rangle) (p_x-\langle p_x \rangle) \Big\rangle \Big/ \epsilon_x\), \(\alpha_y = - \Big\langle (y-\langle y \rangle) (p_y-\langle p_y \rangle) \Big\rangle \Big/ \epsilon_y\).
[21], [22]: beta function for the transverse directions (m) \(\beta_x = \dfrac{{\delta_x}^2}{\epsilon_x}\), \(\beta_y = \dfrac{{\delta_y}^2}{\epsilon_y}\).
[23]: The charge of the beam (C).
For 2D-XZ, \(\langle y \rangle\), \(\delta_y\), and \(\epsilon_y\) are not output.
LoadBalanceCosts
This type computes the cost, used in load balancing, for each box on the domain. The cost \(c\) is computed as
\[c = n_{\text{particle}} \cdot w_{\text{particle}} + n_{\text{cell}} \cdot w_{\text{cell}},\]where \(n_{\text{particle}}\) is the number of particles on the box, \(w_{\text{particle}}\) is the particle cost weight factor (controlled by
algo.costs_heuristic_particles_wt
), \(n_{\text{cell}}\) is the number of cells on the box, and \(w_{\text{cell}}\) is the cell cost weight factor (controlled byalgo.costs_heuristic_cells_wt
).
LoadBalanceEfficiency
This type computes the load balance efficiency, given the present costs and distribution mapping. Load balance efficiency is computed as the mean cost over all ranks, divided by the maximum cost over all ranks. Until costs are recorded, load balance efficiency is output as -1; at earliest, the load balance efficiency can be output starting at step 2, since costs are not recorded until step 1.
ParticleHistogram
This type computes a user defined particle histogram.
<reduced_diags_name>.species
(string)A species name must be provided, such that the diagnostics are done for this species.
<reduced_diags_name>.histogram_function(t,x,y,z,ux,uy,uz)
(string)A histogram function must be provided. t represents the physical time in seconds during the simulation. x, y, z represent particle positions in the unit of meter. ux, uy, uz represent the particle momenta in the unit of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor, \(v/c\) is the particle velocity normalized by the speed of light. E.g.
x
produces the position (density) distribution in x.ux
produces the momentum distribution in x,sqrt(ux*ux+uy*uy+uz*uz)
produces the speed distribution. The default value of the histogram without normalization is \(f = \sum\limits_{i=1}^N w_i\), where \(\sum\limits_{i=1}^N\) is the sum over \(N\) particles in that bin, \(w_i\) denotes the weight of the ith particle.
<reduced_diags_name>.bin_number
(int > 0)This is the number of bins used for the histogram.
<reduced_diags_name>.bin_max
(float)This is the maximum value of the bins.
<reduced_diags_name>.bin_min
(float)This is the minimum value of the bins.
<reduced_diags_name>.normalization
(optional)This provides options to normalize the histogram:
unity_particle_weight
uses unity particle weight to compute the histogram, such that the values of the histogram are the number of counted macroparticles in that bin, i.e. \(f = \sum\limits_{i=1}^N 1\), \(N\) is the number of particles in that bin.max_to_unity
will normalize the histogram such that its maximum value is one.area_to_unity
will normalize the histogram such that the area under the histogram is one, so the histogram is also the probability density function.If nothing is provided, the macroparticle weight will be used to compute the histogram, and no normalization will be done.
<reduced_diags_name>.filter_function(t,x,y,z,ux,uy,uz)
(string) optionalUsers can provide an expression returning a boolean for whether a particle is taken into account when calculating the histogram. t represents the physical time in seconds during the simulation. x, y, z represent particle positions in the unit of meter. ux, uy, uz represent particle momenta in the unit of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor, \(v/c\) is the particle velocity normalized by the speed of light. E.g. If provided (x>0.0)*(uz<10.0) only those particles located at positions x greater than 0, and those having momentum uz less than 10, will be taken into account when calculating the histogram.
The output columns are the values of the 1st bin, the 2nd bin, …, the nth bin. An example input file and a Python script for loading the histogram data are given in
Examples/Tests/initial_distribution/
.
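As a sketch (the species name electrons and the bin ranges are assumptions for illustration), a longitudinal momentum histogram could be set up as:
# Sketch of a ParticleHistogram reduced diagnostic (species and ranges are illustrative)
warpx.reduced_diags_names = histuz
histuz.type = ParticleHistogram
histuz.intervals = 100
histuz.species = electrons
histuz.histogram_function(t,x,y,z,ux,uy,uz) = uz
histuz.bin_number = 100
histuz.bin_min = 0.0
histuz.bin_max = 10.0
histuz.normalization = area_to_unity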
ParticleHistogram2D
This type computes a user defined, 2D particle histogram.
<reduced_diags_name>.species
(string)A species name must be provided, such that the diagnostics are done for this species.
<reduced_diags_name>.file_min_digits
(int) optional (default 6)The minimum number of digits used for the iteration number appended to the diagnostic file names.
<reduced_diags_name>.histogram_function_abs(t,x,y,z,ux,uy,uz,w)
(string)A histogram function must be provided for the abscissa axis. t represents the physical time in seconds during the simulation. x, y, z represent particle positions in the unit of meter. ux, uy, uz represent the particle velocities in the unit of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor, \(v/c\) is the particle velocity normalized by the speed of light. w represents the weight.
<reduced_diags_name>.histogram_function_ord(t,x,y,z,ux,uy,uz,w)
(string)A histogram function must be provided for the ordinate axis.
<reduced_diags_name>.bin_number_abs
(int > 0) and<reduced_diags_name>.bin_number_ord
(int > 0)These are the number of bins used for the histogram for the abscissa and ordinate axis respectively.
<reduced_diags_name>.bin_max_abs
(float) and<reduced_diags_name>.bin_max_ord
(float)These are the maximum value of the bins for the abscissa and ordinate axis respectively. Particles with values outside of these ranges are discarded.
<reduced_diags_name>.bin_min_abs
(float) and<reduced_diags_name>.bin_min_ord
(float)These are the minimum value of the bins for the abscissa and ordinate axis respectively. Particles with values outside of these ranges are discarded.
<reduced_diags_name>.filter_function(t,x,y,z,ux,uy,uz,w)
(string) optionalUsers can provide an expression returning a boolean for whether a particle is taken into account when calculating the histogram. t represents the physical time in seconds during the simulation. x, y, z represent particle positions in the unit of meter. ux, uy, uz represent particle velocities in the unit of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor, \(v/c\) is the particle velocity normalized by the speed of light. w represents the weight.
<reduced_diags_name>.value_function(t,x,y,z,ux,uy,uz,w)
(string) optionalUsers can provide an expression for the weight used to calculate the number of particles per cell associated with the selected abscissa and ordinate functions and/or the filter function. t represents the physical time in seconds during the simulation. x, y, z represent particle positions in the unit of meter. ux, uy, uz represent particle velocities in the unit of \(\gamma v/c\), where \(\gamma\) is the Lorentz factor, \(v/c\) is the particle velocity normalized by the speed of light. w represents the weight.
The output is a
<reduced_diags_name>
folder containing a set of openPMD files. An example input file and a Python script for loading the 2D-histogram data are given inExamples/Tests/histogram2D/
.
ParticleExtrema
This type computes the minimum and maximum values of particle position, momentum, gamma, weight, and the \(\chi\) parameter for QED species.
<reduced_diags_name>.species
must be provided, such that the diagnostics are done for this species only.The output columns are minimum and maximum position \(x\), \(y\), \(z\); minimum and maximum momentum \(p_x\), \(p_y\), \(p_z\); minimum and maximum gamma \(\gamma\); minimum and maximum weight \(w\); minimum and maximum \(\chi\).
Note that when the QED parameter \(\chi\) is computed, field gather is carried out at every output, so the time of the diagnostic may be long depending on the simulation size.
ChargeOnEB
This type computes the total surface charge on the embedded boundary (in Coulombs), by using the formula
\[Q_{tot} = \epsilon_0 \iint dS \cdot E\]where the integral is performed over the surface of the embedded boundary.
When providing
<reduced_diags_name>.weighting_function(x,y,z)
, the computed integral is weighted:\[Q = \epsilon_0 \iint dS \cdot E \times weighting(x, y, z)\]In particular, by choosing a weighting function which returns either 1 or 0, it is possible to compute the charge on only some part of the embedded boundary.
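A sketch, using an assumed weighting function that selects only the z > 0 part of the embedded boundary:
# Sketch of a ChargeOnEB reduced diagnostic with an illustrative weighting
warpx.reduced_diags_names = eb_charge
eb_charge.type = ChargeOnEB
eb_charge.intervals = 100
eb_charge.weighting_function(x,y,z) = (z > 0.0)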
ColliderRelevant
This diagnostic computes properties of two colliding beams that are relevant for particle colliders. Two species must be specified. Photon species are not supported yet. It is assumed that the two species propagate and collide along the
z
direction. The output columns (for 3D-XYZ) are the following, where the minimum, average and maximum are done over the whole species:[0]: simulation step (iteration).
[1]: time (s).
[2]: time derivative of the luminosity (\(m^{-2}s^{-1}\)) defined as:
\[\frac{dL}{dt} = 2 c \iiint n_1(x,y,z) n_2(x,y,z) dx dy dz\]where \(n_1\), \(n_2\) are the number densities of the two colliding species.
[3], [4], [5]: If QED is enabled, the minimum, average and maximum values of the quantum parameter \(\chi\) of species 1: \(\chi_{min}\), \(\langle \chi \rangle\), \(\chi_{max}\). If QED is not enabled, these numbers are not computed.
[6], [7]: The average and standard deviation of the values of the transverse coordinate \(x\) (m) of species 1: \(\langle x \rangle\), \(\sqrt{\langle (x - \langle x \rangle)^2 \rangle}\).
[8], [9]: The average and standard deviation of the values of the transverse coordinate \(y\) (m) of species 1: \(\langle y \rangle\), \(\sqrt{\langle (y - \langle y \rangle)^2 \rangle}\).
[10], [11], [12], [13]: The minimum, average, maximum and standard deviation of the angle \(\theta_x = \angle (u_x, u_z)\) (rad) of species 1: \({\theta_x}_{min}\), \(\langle \theta_x \rangle\), \({\theta_x}_{max}\), \(\sqrt{\langle (\theta_x - \langle \theta_x \rangle)^2 \rangle}\).
[14], [15], [16], [17]: The minimum, average, maximum and standard deviation of the angle \(\theta_y = \angle (u_y, u_z)\) (rad) of species 1: \({\theta_y}_{min}\), \(\langle \theta_y \rangle\), \({\theta_y}_{max}\), \(\sqrt{\langle (\theta_y - \langle \theta_y \rangle)^2 \rangle}\).
[18], …, [32]: Analogous quantities for species 2.
For 2D-XZ, \(y\)-related quantities are not output. For 1D-Z, \(x\)-related and \(y\)-related quantities are not output. RZ geometry is not supported yet.
<reduced_diags_name>.intervals
(string)Using the Intervals Parser syntax, this string defines the timesteps at which reduced diagnostics are written to file.
<reduced_diags_name>.path
(string) optional (default ./diags/reducedfiles/)The path where the output file will be stored.
<reduced_diags_name>.extension
(string) optional (default txt)The extension of the output file.
<reduced_diags_name>.separator
(string) optional (default a whitespace)The separator between row values in the output file. The default separator is a whitespace.
<reduced_diags_name>.precision
(integer) optional (default 14)The precision used when writing out the data to the text files.
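Putting the general options together, a sketch of a BeamRelevant reduced diagnostic (the species name beam is an assumption) written every 100 steps:
# Sketch combining the general reduced-diagnostics options (species name "beam" is assumed)
warpx.reduced_diags_names = beamdiag
beamdiag.type = BeamRelevant
beamdiag.species = beam
beamdiag.intervals = 100
beamdiag.path = ./diags/reducedfiles/
beamdiag.precision = 14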
Lookup tables and other settings for QED modules
Lookup tables store pre-computed values for functions used by the QED modules. This feature requires compiling with QED=TRUE (and also with QED_TABLE_GEN=TRUE for table generation).
qed_bw.lookup_table_mode
(string)There are three options to prepare the lookup table required by the Breit-Wheeler module:
builtin
: a built-in table is used (Warning: the table gives reasonable results but its resolution is quite low).generate
: a new table is generated. This option requires the Boost math library (version >= 1.66) and compiling withQED_TABLE_GEN=TRUE
. All the following parameters must be specified (table 1 is used to evolve the optical depth of the photons, while table 2 is used for pair generation):qed_bw.tab_dndt_chi_min
(float): minimum chi parameter for lookup table 1 ( used for the evolution of the optical depth of the photons)qed_bw.tab_dndt_chi_max
(float): maximum chi parameter for lookup table 1qed_bw.tab_dndt_how_many
(int): number of points to be used for lookup table 1qed_bw.tab_pair_chi_min
(float): minimum chi parameter for lookup table 2 ( used for pair generation)qed_bw.tab_pair_chi_max
(float): maximum chi parameter for lookup table 2qed_bw.tab_pair_chi_how_many
(int): number of points to be used for chi axis in lookup table 2qed_bw.tab_pair_frac_how_many
(int): number of points to be used for the second axis in lookup table 2 (the second axis is the ratio between the quantum parameter of the less energetic particle of the pair and the quantum parameter of the photon).qed_bw.save_table_in
(string): where to save the lookup table
Alternatively, the lookup table can be generated using a standalone tool (see qed tools section).
load
: a lookup table is loaded from a pre-generated binary file. The following parameter must be specified:qed_bw.load_table_from
(string): name of the lookup table file to read from.
qed_qs.lookup_table_mode
(string)There are three options to prepare the lookup table required by the Quantum Synchrotron module:
builtin
: a built-in table is used (Warning: the table gives reasonable results but its resolution is quite low).generate
: a new table is generated. This option requires the Boost math library (version >= 1.66) and compiling withQED_TABLE_GEN=TRUE
. All the following parameters must be specified (table 1 is used to evolve the optical depth of the particles, while table 2 is used for photon emission):qed_qs.tab_dndt_chi_min
(float): minimum chi parameter for lookup table 1 ( used for the evolution of the optical depth of electrons and positrons)qed_qs.tab_dndt_chi_max
(float): maximum chi parameter for lookup table 1qed_qs.tab_dndt_how_many
(int): number of points to be used for lookup table 1qed_qs.tab_em_chi_min
(float): minimum chi parameter for lookup table 2 ( used for photon emission)qed_qs.tab_em_chi_max
(float): maximum chi parameter for lookup table 2qed_qs.tab_em_chi_how_many
(int): number of points to be used for chi axis in lookup table 2qed_qs.tab_em_frac_how_many
(int): number of points to be used for the second axis in lookup table 2 (the second axis is the ratio between the quantum parameter of the photon and the quantum parameter of the charged particle).qed_qs.tab_em_frac_min
(float): minimum value to be considered for the second axis of lookup table 2qed_qs.save_table_in
(string): where to save the lookup table
Alternatively, the lookup table can be generated using a standalone tool (see qed tools section).
load
: a lookup table is loaded from a pre-generated binary file. The following parameter must be specified:qed_qs.load_table_from
(string): name of the lookup table file to read from.
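A sketch of loading pre-generated lookup tables for both modules (the file names are assumptions for illustration):
# Sketch: load pre-generated QED lookup tables (file names are illustrative)
qed_bw.lookup_table_mode = load
qed_bw.load_table_from = bw_table.bin
qed_qs.lookup_table_mode = load
qed_qs.load_table_from = qs_table.bin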
qed_bw.chi_min
(float): minimum chi parameter to be considered by the Breit-Wheeler engine (suggested value: 0.01)
qed_qs.chi_min
(float): minimum chi parameter to be considered by the Quantum Synchrotron engine (suggested value: 0.001)
qed_qs.photon_creation_energy_threshold
(float) optional (default 2)Energy threshold for photon particle creation in \(m_e c^2\) units.
warpx.do_qed_schwinger
(bool) optional (default 0)If this is 1, Schwinger electron-positron pairs can be generated in vacuum in the cells where the EM field is high enough. Activating the Schwinger process requires the code to be compiled with
QED=TRUE
andPICSAR
. Ifwarpx.do_qed_schwinger = 1
, Schwinger product species must be specified withqed_schwinger.ele_product_species
andqed_schwinger.pos_product_species
. Schwinger process requires eitherwarpx.grid_type = collocated
oralgo.field_gathering=momentum-conserving
(so that different field components are computed at the same location in the grid) and does not currently support mesh refinement, cylindrical coordinates or single precision.
qed_schwinger.ele_product_species
(string)If Schwinger process is activated, an electron product species must be specified (the name of an existing electron species must be provided).
qed_schwinger.pos_product_species
(string)If Schwinger process is activated, a positron product species must be specified (the name of an existing positron species must be provided).
qed_schwinger.y_size
(float; in meters)If Schwinger process is activated with
DIM=2D
, a transverse size must be specified. It is used to convert the pair production rate per unit volume into an actual number of created particles. This value should correspond to the typical transverse extent for which the EM field has a very high value (e.g. the beam waist for a focused laser beam).
qed_schwinger.xmin,ymin,zmin
andqed_schwinger.xmax,ymax,zmax
(float) optional (default unlimited)When
qed_schwinger.xmin
andqed_schwinger.xmax
are set, they delimit the region within which Schwinger pairs can be created. The same is applicable in the other directions.
qed_schwinger.threshold_poisson_gaussian
(integer) optional (default 25)If the expected number of physical pairs created in a cell at a given timestep is smaller than this threshold, a Poisson distribution is used to draw the actual number of physical pairs created. Otherwise a Gaussian distribution is used. Note that, regardless of this parameter, the number of macroparticles created is at most one per cell per timestep per species (with a weight corresponding to the number of physical pairs created).
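A sketch enabling the Schwinger process, with assumed product species names ele_schwinger and pos_schwinger (which must correspond to electron and positron species defined elsewhere in the input deck) and an illustrative production region:
# Sketch of Schwinger pair-production settings (species names and region are assumed)
warpx.do_qed_schwinger = 1
qed_schwinger.ele_product_species = ele_schwinger
qed_schwinger.pos_product_species = pos_schwinger
qed_schwinger.zmin = -5.e-6
qed_schwinger.zmax = 5.e-6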
Checkpoints and restart
WarpX supports checkpoints/restart via AMReX.
The checkpoint capability can be turned on with regular diagnostics: <diag_name>.format = checkpoint
.
amr.restart
(string)Name of the checkpoint file to restart from. Returns an error if the folder does not exist or if it is not properly formatted.
warpx.write_diagnostics_on_restart
(bool) optional (default false)When true, write the diagnostics after restart at the time of the restart.
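A sketch: write a checkpoint every 1000 steps through a regular diagnostic, and later restart from one of the resulting folders (the diagnostic name and folder name below are illustrative only):
# Writing checkpoints via a regular diagnostic (name "chk" is illustrative)
diagnostics.diags_names = chk
chk.diag_type = Full
chk.format = checkpoint
chk.intervals = 1000
# Restarting from a previously written checkpoint folder (folder name is illustrative)
amr.restart = diags/chk001000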
Intervals parser
WarpX can parse time step interval expressions of the form start:stop:period
, e.g.
1:2:3, 4::, 5:6, :, ::10
.
A comma is used as a separator between groups of intervals, which we call slices.
The resulting time steps are the union set of all given slices.
White spaces are ignored.
A single slice can have 0, 1 or 2 colons :
, just as numpy slices, but with inclusive upper bound for stop
.
For 0 colons, the given value is the period.
For 1 colon, the given string is of the type
start:stop
For 2 colons, the given string is of the type
start:stop:period
Any value that is not given is set to its default.
The default is 0 for the start, std::numeric_limits<int>::max() for the stop, and 1 for the period.
For the 1 and 2 colon syntax, actually having values in the string is optional
(this means that ::5
, 100 ::10
and 100 :
are all valid syntaxes).
All values can be expressions that will be parsed in the same way as other integer input parameters.
Examples
something_intervals = 50
-> do something at timesteps 0, 50, 100, 150, etc. (equivalent tosomething_intervals = ::50
)something_intervals = 300:600:100
-> do something at timesteps 300, 400, 500 and 600.something_intervals = 300::50
-> do something at timesteps 300, 350, 400, 450, etc.something_intervals = 105:108,205:208
-> do something at timesteps 105, 106, 107, 108, 205, 206, 207 and 208. (equivalent tosomething_intervals = 105 : 108 : , 205 : 208 :
)something_intervals = :
orsomething_intervals = ::
-> do something at every timestep.something_intervals = 167:167,253:253,275:425:50
-> do something at timesteps 167, 253, 275, 325, 375 and 425.
This is essentially the Python slicing syntax, except that the stop is inclusive
(0:100
contains 100) and that no colon means that the given value is the period.
Note that if a given period is zero or negative, the corresponding slice is disregarded.
For example, something_intervals = -1
deactivates something
and
something_intervals = ::-1,100:1000:25
is equivalent to something_intervals = 100:1000:25
.
Testing and Debugging
When developing, testing and debugging WarpX, the following options can be considered.
warpx.verbose
(0
or1
; default is1
for true)Controls how much information is printed to the terminal when running WarpX.
warpx.always_warn_immediately
(0
or1
; default is0
for false)If set to
1
, WarpX immediately prints every warning message as soon as it is generated. It is mainly intended for debug purposes, in case a simulation crashes before a global warning report can be printed.
warpx.abort_on_warning_threshold
(string:low
,medium
orhigh
) optionalOptional threshold to abort as soon as a warning is raised. If the threshold is set, warning messages with priority greater than or equal to the threshold trigger an immediate abort. It is mainly intended for debug purposes, and is best used with
warpx.always_warn_immediately=1
.
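For example, a debugging run that prints warnings immediately and aborts on medium-priority warnings can be sketched as:
# Sketch of debugging-oriented settings
warpx.verbose = 1
warpx.always_warn_immediately = 1
warpx.abort_on_warning_threshold = medium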
amrex.abort_on_unused_inputs
(0
or1
; default is0
for false)When set to
1
, this option causes the simulation to fail after its completion if there were unused parameters. It is mainly intended for continuous integration and automated testing to check that all tests and inputs are adapted to API changes.
amrex.use_profiler_syncs
(0
or1
; default is0
for false)Adds a synchronization at the start of communication, so any load imbalance will be caught there (the timer is called
SyncBeforeComms
), then the comm operation will run. This will slow down the run.
warpx.serialize_initial_conditions
(0 or 1) optional (default 0)Serialize the initial conditions for reproducible testing, e.g., in our continuous integration tests. This mainly controls whether or not OpenMP threading is used for particle initialization.
warpx.safe_guard_cells
(0 or 1) optional (default 0)Run in safe mode, exchanging more guard cells, and more often in the PIC loop (for debugging).
ablastr.fillboundary_always_sync
(0 or 1) optional (default 0)Run all
FillBoundary
operations onMultiFab
to force-synchronize shared nodal points. This slightly increases communication cost and can help to spot missingnodal_sync
flags in these operations.
Workflows
This section collects typical user workflows and best practices for WarpX.
Extend a Simulation with Python
When running WarpX directly from Python it is possible to interact with the simulation.
For instance, with the step()
method of the simulation class, one could run sim.step(nsteps=1)
in a loop:
# Preparation: set up the simulation
# sim = picmi.Simulation(...)
# ...
steps = 1000
for _ in range(steps):
    sim.step(nsteps=1)
    # do something custom with the sim object
As a more flexible alternative, one can install callback functions, which will execute a given Python function at a specific location in the WarpX simulation loop.
Callback Locations
These are the functions which allow installing user created functions so that they are called at various places along the time step.
The following three functions allow the user to install, uninstall and verify the different callback types.
installcallback()
: Installs a function to be called at that specified timeuninstallcallback()
: Uninstalls the function (so it won’t be called anymore)isinstalled()
: Checks if the function is installed
These functions all take a callback location name (string) and a function or instance method as arguments. Note that if an instance method is used, an extra reference to the method’s object is saved.
Functions can be called at the following times:
beforeInitEsolve
: before the initial solve for the E fields (i.e. before the PIC loop starts)afterinit
: immediately after the init is completebeforeEsolve
: before the solve for E fieldspoissonsolver
: In place of the computePhi call but only in an electrostatic simulationafterEsolve
: after the solve for E fieldsafterBpush
: after the B field advance for electromagnetic solversafterEpush
: after the E field advance for electromagnetic solversbeforedeposition
: before the particle deposition (for charge and/or current)afterdeposition
: after particle deposition (for charge and/or current)beforestep
: before the time stepafterstep
: after the time stepafterdiagnostics
: after diagnostic outputoncheckpointsignal
: on a checkpoint signalonbreaksignal
: on a break signal. These callbacks will be the last ones executed before the simulation ends.particlescraper
: just after the particle boundary conditions are applied but before lost particles are processedparticleloader
: at the time that the standard particle loader is calledparticleinjection
: called when particle injection happens, after the position advance and before deposition is called, allowing a user defined particle distribution to be injected each time step
Example that calls the Python function myplots after each step:
from pywarpx.callbacks import installcallback

def myplots():
    # do something here
    pass

installcallback('afterstep', myplots)

# run simulation
sim.step(nsteps=100)
The install can also be done using a Python decorator, which has the prefix callfrom.
To use a decorator, the syntax is as follows; this will install the function myplots to be called after each step.
The above example is equivalent to the following:
from pywarpx.callbacks import callfromafterstep

@callfromafterstep
def myplots():
    # do something here
    pass

# run simulation
sim.step(nsteps=100)
- pywarpx.callbacks.installcallback(name, f)[source]
Installs a function to be called at that specified time.
Adds a function to the list of functions called by this callback.
- pywarpx.callbacks.isinstalled(name, f)[source]
Checks if a function is installed for this callback.
- pywarpx.callbacks.uninstallcallback(name, f)[source]
Uninstalls the function (so it won’t be called anymore).
Removes the function from the list of functions called by this callback.
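For completeness, a small sketch of checking and removing an installed callback at runtime, reusing the myplots function defined above:
from pywarpx.callbacks import installcallback, isinstalled, uninstallcallback

installcallback('afterstep', myplots)
print(isinstalled('afterstep', myplots))   # True while installed

# stop calling the function for the remainder of the run
uninstallcallback('afterstep', myplots)
print(isinstalled('afterstep', myplots))   # now False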
pyAMReX
Many of the following classes are provided through pyAMReX. After the simulation is initialized, the pyAMReX module can be accessed via
from pywarpx import picmi, libwarpx
# ... simulation definition ...
# equivalent to
# import amrex.space3d as amr
# for a 3D simulation
amr = libwarpx.amr # picks the right 1d, 2d or 3d variant
Full details for pyAMReX APIs are documented here. Important APIs include:
amr.ParallelDescriptor: MPI-parallel rank information
amr.MultiFab: MPI-parallel field data
amr.ParticleContainer_*: MPI-parallel particle data for a particle species
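As a small illustration, one can query the MPI layout through pyAMReX once the simulation is initialized; this is a minimal sketch, and the ParallelDescriptor calls shown mirror the AMReX C++ API and are assumed to be available in your pyAMReX version:
# reusing the amr module handle from the block above
rank = amr.ParallelDescriptor.MyProc()     # this process' MPI rank
nprocs = amr.ParallelDescriptor.NProcs()   # total number of MPI ranks
print(f"This is rank {rank} of {nprocs}")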
Data Access
While the simulation is running, callbacks can have read and write access to the WarpX simulation data in situ.
An important object in the pywarpx.picmi module for data access is Simulation.extension.warpx, which is available only during the simulation run.
This object is the Python equivalent to the C++ WarpX simulation class.
- class pywarpx.callbacks.WarpX
- getistep(lev: int)
Get the current step on mesh-refinement level lev.
- gett_new(lev: int)
Get the current physical time on mesh-refinement level lev.
- getdt(lev: int)
Get the current physical time step size on mesh-refinement level lev.
- multifab(multifab_name: str)
Return MultiFabs by name, e.g., "Efield_aux[x][level=0]", "Efield_cp[x][level=0]", …
The physical fields in WarpX have the following naming:
_fp are the “fine” patches, the regular resolution of a current mesh-refinement level
_aux are temporary (auxiliary) patches at the same resolution as _fp. They usually include contributions from other levels and can be interpolated for gather routines of particles.
_cp are “coarse” patches, at the same resolution (but not necessarily the same values) as the _fp of level - 1 (only for level 1 and higher).
- multi_particle_container()
- get_particle_boundary_buffer()
- set_potential_on_domain_boundary(potential_[lo/hi]_[x/y/z]: str)
The potential on the domain boundaries can be modified when using the electrostatic solver. This function updates the strings and function parsers which set the domain boundary potentials during the Poisson solve.
- set_potential_on_eb(potential: str)
The embedded boundary (EB) conditions can be modified when using the electrostatic solver. This sets the EB potential string and updates the function parser.
- evolve(numsteps=-1)
Evolve the simulation the specified number of steps.
- finalize(finalize_mpi=1)
Call finalize for WarpX and AMReX. Registered to run at program exit.
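For example, inside a callback one could query the current step, time and time step size on the coarsest level, using the methods documented above (sim is the picmi.Simulation object, as in the callback examples earlier):
warpx = sim.extension.warpx   # only available while the simulation is running

step = warpx.getistep(0)   # current step on mesh-refinement level 0
t_new = warpx.gett_new(0)  # current physical time on level 0
dt = warpx.getdt(0)        # current time step size on level 0
print(f"step {step}: t = {t_new:.3e} s, dt = {dt:.3e} s")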
The WarpX class also provides read and write access to field MultiFab and ParticleContainer data, shown in the following examples.
Fields
This example accesses the \(E_x(x,y,z)\) field at level 0 after every time step and modifies its values.
from pywarpx import picmi
from pywarpx.callbacks import callfromafterstep

# Preparation: set up the simulation
# sim = picmi.Simulation(...)
# ...

@callfromafterstep
def set_E_x():
    warpx = sim.extension.warpx

    # data access
    E_x_mf = warpx.multifab(f"Efield_fp[x][level=0]")

    # compute
    # iterate over mesh-refinement levels
    for lev in range(warpx.finest_level + 1):
        # grow (aka guard/ghost/halo) regions
        ngv = E_x_mf.n_grow_vect

        # get every local block of the field
        for mfi in E_x_mf:
            # global index space box, including guards
            bx = mfi.tilebox().grow(ngv)
            print(bx)  # note: global index space of this block

            # numpy representation: non-copying view, including the
            # guard/ghost region; .to_cupy() for GPU!
            E_x_np = E_x_mf.array(mfi).to_numpy()

            # notes on indexing in E_x_np:
            # - numpy uses locally zero-based indexing
            # - layout is F_CONTIGUOUS by default, just like AMReX

            # notes:
            # Only the next lines are the "HOT LOOP" of the computation.
            # For efficiency, use numpy array operation for speed on CPUs.
            # For GPUs use .to_cupy() above and compute with cupy or numba.
            E_x_np[()] = 42.0

sim.step(nsteps=100)
For further details on how to access GPU data or compute on E_x, please see the pyAMReX documentation.
High-Level Field Wrapper
Note
TODO
Note
TODO: What are the benefits of using the high-level wrapper? TODO: What are the limitations (e.g., in memory usage or compute scalability) of using the high-level wrapper?
Particles
from pywarpx import picmi
from pywarpx.callbacks import callfromafterstep

# Preparation: set up the simulation
# sim = picmi.Simulation(...)
# ...

@callfromafterstep
def my_after_step_callback():
    warpx = sim.extension.warpx
    Config = sim.extension.Config

    # data access
    multi_pc = warpx.multi_particle_container()
    pc = multi_pc.get_particle_container_from_name("electrons")

    # compute
    # iterate over mesh-refinement levels
    for lvl in range(pc.finest_level + 1):
        # get every local chunk of particles
        for pti in pc.iterator(pc, level=lvl):
            # compile-time and runtime attributes in SoA format
            soa = pti.soa().to_cupy() if Config.have_gpu else \
                  pti.soa().to_numpy()

            # notes:
            # Only the next lines are the "HOT LOOP" of the computation.
            # For speed, use array operation.

            # write to all particles in the chunk
            # note: careful, if you change particle positions, you might need to
            #       redistribute particles before continuing the simulation step
            soa.real[0][()] = 0.30  # x
            soa.real[1][()] = 0.35  # y
            soa.real[2][()] = 0.40  # z

            # all other attributes: weight, momentum x, y, z, ...
            for soa_real in soa.real[3:]:
                soa_real[()] = 42.0

            # by default empty unless ionization or QED physics is used
            # or other runtime attributes were added manually
            for soa_int in soa.int:
                soa_int[()] = 12

sim.step(nsteps=100)
For further details on how to access GPU data or compute on electrons, please see the pyAMReX documentation.
High-Level Particle Wrapper
Note
TODO: What are the benefits of using the high-level wrapper? TODO: What are the limitations (e.g., in memory usage or compute scalability) of using the high-level wrapper?
Particles can be added to the simulation at specific positions and with specific attribute values:
from pywarpx import particle_containers, picmi
# ...
electron_wrapper = particle_containers.ParticleContainerWrapper("electrons")
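For instance, a couple of electrons could be injected with explicit (made-up) positions, proper velocities and weights, using the add_particles() method documented below:
import numpy as np

electron_wrapper.add_particles(
    x=np.array([0.0, 1.0e-6]),    # positions in m
    y=np.zeros(2),
    z=np.array([1.0e-6, 2.0e-6]),
    ux=np.zeros(2),               # proper velocities in m/s
    uy=np.zeros(2),
    uz=np.full(2, 1.0e7),
    w=np.full(2, 1.0e5),          # particle weights
)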
- class pywarpx.particle_containers.ParticleContainerWrapper(species_name)[source]
Wrapper around particle containers. This provides a convenient way to query and set data in the particle containers.
- Parameters:
species_name (string) – The name of the species to be accessed.
- add_particles(x=None, y=None, z=None, ux=None, uy=None, uz=None, w=None, unique_particles=True, **kwargs)[source]
A function for adding particles to the WarpX simulation.
- Parameters:
species_name (str) – The type of species for which particles will be added
x (arrays or scalars) – The particle positions (m) (default = 0.)
y (arrays or scalars) – The particle positions (m) (default = 0.)
z (arrays or scalars) – The particle positions (m) (default = 0.)
ux (arrays or scalars) – The particle proper velocities (m/s) (default = 0.)
uy (arrays or scalars) – The particle proper velocities (m/s) (default = 0.)
uz (arrays or scalars) – The particle proper velocities (m/s) (default = 0.)
w (array or scalars) – Particle weights (default = 0.)
unique_particles (bool) – True means the added particles are duplicated by each process; False means the number of added particles is independent of the number of processes (default = True)
kwargs (dict) – Containing an entry for all the extra particle attribute arrays. If an attribute is not given it will be set to 0.
- add_real_comp(pid_name, comm=True)[source]
Add a real component to the particle data array.
- Parameters:
pid_name (str) – Name that can be used to identify the new component
comm (bool) – Should the component be communicated
- deposit_charge_density(level, clear_rho=True, sync_rho=True)[source]
Deposit this species’ charge density in rho_fp in order to access that data via pywarpx.fields.RhoFPWrapper().
- Parameters:
species_name (str) – The species name that will be deposited.
level (int) – Which AMR level to retrieve scraped particle data from.
clear_rho (bool) – If True, zero out rho_fp before deposition.
sync_rho (bool) – If True, perform MPI exchange and properly set boundary cells for rho_fp.
- get_particle_count(local=False)[source]
Get the number of particles of this species in the simulation.
- Parameters:
local (bool) – If True the particle count on this processor will be returned. Default False.
- Returns:
An integer count of the number of particles
- Return type:
int
- get_particle_cpu(level=0, copy_to_host=False)[source]
Return a list of numpy or cupy arrays containing the particle ‘cpu’ numbers on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle cpus
- Return type:
List of arrays
- get_particle_id(level=0, copy_to_host=False)[source]
Return a list of numpy or cupy arrays containing the particle ‘id’ numbers on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle ids
- Return type:
List of arrays
- get_particle_idcpu(level=0, copy_to_host=False)[source]
Return a list of numpy or cupy arrays containing the particle ‘idcpu’ numbers on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle idcpu
- Return type:
List of arrays
- get_particle_idcpu_arrays(level, copy_to_host=False)[source]
This returns a list of numpy or cupy arrays containing the particle idcpu data on each tile for this process.
Unless copy_to_host is specified, the data for the arrays are not copied, but share the underlying memory buffer with WarpX. The arrays are fully writeable.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle array data
- Return type:
List of arrays
- get_particle_int_arrays(comp_name, level, copy_to_host=False)[source]
This returns a list of numpy or cupy arrays containing the particle int array data on each tile for this process.
Unless copy_to_host is specified, the data for the arrays are not copied, but share the underlying memory buffer with WarpX. The arrays are fully writeable.
- Parameters:
comp_name (str) – The component of the array data that will be returned
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle array data
- Return type:
List of arrays
- get_particle_r(level=0, copy_to_host=False)[source]
Return a list of numpy or cupy arrays containing the particle ‘r’ positions on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle r position
- Return type:
List of arrays
- get_particle_real_arrays(comp_name, level, copy_to_host=False)[source]
This returns a list of numpy or cupy arrays containing the particle real array data on each tile for this process.
Unless copy_to_host is specified, the data for the arrays are not copied, but share the underlying memory buffer with WarpX. The arrays are fully writeable.
- Parameters:
comp_name (str) – The component of the array data that will be returned
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle array data
- Return type:
List of arrays
- get_particle_theta(level=0, copy_to_host=False)[source]
Return a list of numpy or cupy arrays containing the particle theta on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle theta position
- Return type:
List of arrays
- get_particle_ux(level=0, copy_to_host=False)[source]
Return a list of numpy or cupy arrays containing the particle x momentum on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle x momentum
- Return type:
List of arrays
- get_particle_uy(level=0, copy_to_host=False)[source]
Return a list of numpy or cupy arrays containing the particle y momentum on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle y momentum
- Return type:
List of arrays
- get_particle_uz(level=0, copy_to_host=False)[source]
Return a list of numpy or cupy arrays containing the particle z momentum on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle z momentum
- Return type:
List of arrays
- get_particle_weight(level=0, copy_to_host=False)[source]
Return a list of numpy or cupy arrays containing the particle weight on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle weight
- Return type:
List of arrays
- get_particle_x(level=0, copy_to_host=False)[source]
Return a list of numpy or cupy arrays containing the particle ‘x’ positions on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle x position
- Return type:
List of arrays
- get_particle_y(level=0, copy_to_host=False)[source]
Return a list of numpy or cupy arrays containing the particle ‘y’ positions on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle y position
- Return type:
List of arrays
- get_particle_z(level=0, copy_to_host=False)[source]
Return a list of numpy or cupy arrays containing the particle ‘z’ positions on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle z position
- Return type:
List of arrays
- get_species_charge_sum(local=False)[source]
Returns the total charge in the simulation due to the given species.
- Parameters:
local (bool) – If True return total charge per processor
- property idcpu
Return a list of numpy or cupy arrays containing the particle ‘idcpu’ numbers on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle idcpu
- Return type:
List of arrays
- property nps
Get the number of particles of this species in the simulation.
- Parameters:
local (bool) – If True the particle count on this processor will be returned. Default False.
- Returns:
An integer count of the number of particles
- Return type:
int
- property rp
Return a list of numpy or cupy arrays containing the particle ‘r’ positions on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle r position
- Return type:
List of arrays
- property thetap
Return a list of numpy or cupy arrays containing the particle theta on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle theta position
- Return type:
List of arrays
- property uxp
Return a list of numpy or cupy arrays containing the particle x momentum on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle x momentum
- Return type:
List of arrays
- property uyp
Return a list of numpy or cupy arrays containing the particle y momentum on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle y momentum
- Return type:
List of arrays
- property uzp
Return a list of numpy or cupy arrays containing the particle z momentum on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle z momentum
- Return type:
List of arrays
- property wp
Return a list of numpy or cupy arrays containing the particle weight on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle weight
- Return type:
List of arrays
- property xp
Return a list of numpy or cupy arrays containing the particle ‘x’ positions on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle x position
- Return type:
List of arrays
- property yp
Return a list of numpy or cupy arrays containing the particle ‘y’ positions on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle y position
- Return type:
List of arrays
- property zp
Return a list of numpy or cupy arrays containing the particle ‘z’ positions on each tile.
- Parameters:
level (int) – The refinement level to reference (default=0)
copy_to_host (bool) – For GPU-enabled runs, one can either return the GPU arrays (the default) or force a device-to-host copy.
- Returns:
The requested particle z position
- Return type:
List of arrays
The get_particle_real_arrays(), get_particle_int_arrays() and get_particle_idcpu_arrays() functions are called by several utility functions of the form get_particle_{comp_name}, where comp_name is one of x, y, z, r, theta, id, cpu, weight, ux, uy or uz.
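As a short, hedged usage example, reusing the electron_wrapper created above:
import numpy as np

# total number of electrons across all MPI ranks
n_total = electron_wrapper.get_particle_count()

# per-tile arrays on refinement level 0; on GPU runs these are device
# (cupy) arrays unless copy_to_host=True is passed
x_tiles = electron_wrapper.xp                                          # same as get_particle_x()
uz_tiles = electron_wrapper.get_particle_uz(level=0, copy_to_host=True)
w_tiles = electron_wrapper.wp                                          # particle weights

# e.g., local mean longitudinal momentum over all tiles on this rank
if uz_tiles:
    mean_uz = np.mean(np.concatenate(uz_tiles))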
Diagnostics
Various diagnostics are also accessible from Python.
This includes getting the deposited or total charge density from a given species as well as accessing the scraped particle buffer.
See the example in Examples/Tests/ParticleBoundaryScrape for a reference on how to interact with scraped particle data.
- class pywarpx.particle_containers.ParticleBoundaryBufferWrapper[source]
Wrapper around particle boundary buffer containers. This provides a convenient way to query data in the particle boundary buffer containers.
- get_particle_boundary_buffer(species_name, boundary, comp_name, level)[source]
This returns a list of numpy or cupy arrays containing the particle array data for a species that has been scraped by a specific simulation boundary.
The data for the arrays are not copied, but share the underlying memory buffer with WarpX. The arrays are fully writeable.
An example of a simple case of particle-boundary interaction (reflection) can be found at https://github.com/ECP-WarpX/WarpX/blob/319e55b10ad4f7c71b84a4fb21afbafe1f5b65c2/Examples/Tests/particle_boundary_interaction/PICMI_inputs_rz.py.
- Parameters:
species_name (str) – The species name that the data will be returned for.
boundary (str) – The boundary from which to get the scraped particle data in the form x/y/z_hi/lo or eb.
comp_name (str) – The component of the array data that will be returned: “x”, “y”, “z”, “ux”, “uy”, “uz”, “w”, “stepScraped”, “deltaTimeScraped”, and, if boundary=’eb’, also “nx”, “ny”, “nz”.
level (int) – Which AMR level to retrieve scraped particle data from.
- get_particle_boundary_buffer_size(species_name, boundary, local=False)[source]
This returns the number of particles that have been scraped so far in the simulation from the specified boundary and of the specified species.
- Parameters:
species_name (str) – Return the number of scraped particles of this species
boundary (str) – The boundary from which to get the scraped particle data in the form x/y/z_hi/lo
local (bool) – Whether to only return the number of particles in the current processor’s buffer
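For instance, the scraped-particle buffer of a hypothetical electrons species can be queried like this (a minimal sketch using the methods documented above):
from pywarpx import particle_containers

buffer_wrapper = particle_containers.ParticleBoundaryBufferWrapper()

# number of "electrons" scraped so far at the upper z boundary
n_scraped = buffer_wrapper.get_particle_boundary_buffer_size("electrons", "z_hi")

# per-tile arrays of the z positions of those scraped particles, on level 0
z_scraped = buffer_wrapper.get_particle_boundary_buffer("electrons", "z_hi", "z", 0)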
Modify Solvers
From Python, one can also replace numerical solvers in the PIC loop or add new physical processes into the time step loop. Examples:
Capacitive Discharge: replaces the Poisson solver of an electrostatic simulation (default: MLMG) with a Python function that uses SuperLU to directly solve the Poisson equation.
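The general pattern is to install a user function at the poissonsolver callback location described above, which then runs in place of the default computePhi call; a minimal sketch (the actual field solve is omitted):
from pywarpx.callbacks import installcallback

def solve():
    # Read the charge density (e.g., via pywarpx.fields.RhoFPWrapper()),
    # solve the Poisson equation with a solver of your choice (for
    # instance a direct sparse solver), and write the potential back.
    pass

installcallback('poissonsolver', solve)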
Domain Decomposition
WarpX relies on a spatial domain decomposition for MPI parallelization. It provides two different ways for users to specify this decomposition, a simple way recommended for most users, and a flexible way recommended if more control is desired. The flexible method is required for dynamic load balancing to be useful.
1. Simple Method
The first and simplest method is to provide the warpx.numprocs = nx ny nz
parameter, either at the command line or somewhere in your inputs deck. In this case, WarpX will split up the overall problem domain into exactly the specified number of subdomains, or Boxes in the AMReX terminology, with the data defined on each Box having its own guard cells. The product of nx, ny, and nz
must be exactly the desired number of MPI ranks. Note that, because there is exactly one Box per MPI rank when run this way, dynamic load balancing will not be possible, as there is no way of shifting Boxes around to achieve a more even load. This is the approach recommended for new users as it is the easiest to use.
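For example, a 3D run on 8 MPI ranks could request a 2 x 2 x 2 decomposition with a single inputs line (a minimal sketch):
warpx.numprocs = 2 2 2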
Note
If warpx.numprocs
is not specified, WarpX will fall back on using the amr.max_grid_size
and amr.blocking_factor
parameters, described below.
2. More General Method
The second way of specifying the domain decomposition provides greater flexibility and enables dynamic load balancing, but is not as easy to use. In this method, the user specifies inputs parameters amr.max_grid_size
and amr.blocking_factor
, which can be thought of as the maximum and minimum allowed Box sizes. Now, the overall problem domain (specified by the amr.ncell
input parameter) will be broken up into some number of Boxes with the specified characteristics. By default, WarpX will make the Boxes as big as possible given the constraints.
For example, if amr.ncell = 768 768 768
, amr.max_grid_size = 128
, and amr.blocking_factor = 32
, then AMReX will make 6 Boxes in each direction, for a total of 216 (the amr.blocking_factor
does not factor in yet; however, see the section on mesh refinement below). If this problem is then run on 54 MPI ranks, there will be 4 boxes per rank initially. This problem could be run on as many as 216 ranks without performing any splitting.
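The arithmetic of this example can be checked quickly (an illustrative calculation, not WarpX input):
n_cell = 768                               # cells per direction in this example
max_grid_size = 128                        # maximum Box size per direction
boxes_per_dim = n_cell // max_grid_size    # 6 Boxes per direction
n_boxes = boxes_per_dim**3                 # 216 Boxes in total
boxes_per_rank = n_boxes // 54             # 4 Boxes per rank on 54 MPI ranks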
Note
Both amr.ncell
and amr.max_grid_size
must be divisible by amr.blocking_factor
, in each direction.
When WarpX is run using this approach to domain decomposition, the number of MPI ranks does not need to be exactly equal to the number of Boxes. Note also that if you run WarpX with more MPI ranks than there are Boxes on the base level, WarpX will attempt to split the available Boxes until there is at least one for each rank to work on; this may cause it to violate the constraints of amr.max_grid_size and amr.blocking_factor.
Note
The AMReX documentation on Grid Creation may also be helpful.
You can also specify a separate max_grid_size and blocking_factor for each direction, using the parameters amr.max_grid_size_x, amr.max_grid_size_y, etc. This allows you to request, for example, a “pencil” type domain decomposition that is long in one direction. Note that, in RZ geometry, the parameters corresponding to the longitudinal direction are amr.max_grid_size_y and amr.blocking_factor_y.
3. Performance Considerations
In terms of performance, there is in general a trade-off. Having many small boxes provides flexibility in terms of load balancing; however, the cost is increased time spent in communication due to surface-to-volume effects and increased kernel launch overhead when running on GPUs. The ideal number of boxes per rank depends on how important dynamic load balancing is for your problem. If your problem is intrinsically well-balanced, like in a uniform plasma, then having a few large boxes is best. But if the problem is non-uniform and achieving a good load balance is critical for performance, having more, smaller Boxes can be worth it. In general, we find that running with something in the range of 4-8 Boxes per process is a good compromise for most problems.
Note
For specific information on the dynamic load balancer used in WarpX, visit the Load Balancing page on the AMReX documentation.
The best values for these parameters can also depend strongly on a number of numerical parameters:
Algorithms used (Maxwell/spectral field solver, filters, order of the particle shape factor)
Number of guard cells (that depends on the particle shape factor and the type and order of the Maxwell solver, the filters used, etc.)
Number of particles per cell, and the number of species
and the details of the on-node parallelization and computer architecture used for the run:
GPU or CPU
Number of OpenMP threads
Amount of high-bandwidth memory.
Because these parameters put additional constraints on the domain size for a
simulation, it can be cumbersome to calculate the number of cells and the
physical size of the computational domain for a given resolution. This
Python script
does it
automatically.
When using the RZ spectral solver, the values of amr.max_grid_size and amr.blocking_factor are constrained, since the solver requires that the full radial extent be within each block.
For the radial direction, any input is ignored and the max grid size and blocking factor are both set equal to the number of radial cells.
For the longitudinal direction, the blocking factor has a minimum size of 8, allowing the computational domain of each block to be large enough relative to the guard cells for reasonable performance, but the max grid size and blocking factor must also be small enough so that there will be at least one block per processor.
If the max grid size and/or blocking factor are too large, they will be silently reduced as needed.
If there are so many processors that there are not enough blocks for the number of processors, WarpX will abort.
4. Mesh Refinement
With mesh refinement, the above picture is more complicated, as in general the number of boxes can not be predicted at the start of the simulation. The decomposition of the base level will proceed as outlined above. The refined region, however, will be covered by some number of Boxes whose sizes are consistent with amr.max_grid_size
and amr.blocking_factor
. With mesh refinement, the blocking factor is important, as WarpX may decide to use Boxes smaller than amr.max_grid_size
so as not to over-refine outside of the requested area. Note that you can specify a vector of values to make these parameters vary by level. For example, amr.max_grid_size = 128 64
will make the max grid size be 128 on level 0 and 64 on level 1.
In general, the above performance considerations apply - varying these values such that there are 4-8 Boxes per rank on each level is a good guideline.
Visualizing a distribution mapping
WarpX provides via reduced diagnostics an output LoadBalanceCosts, which allows for visualization of a simulation’s distribution mapping and computational costs. Here we demonstrate the workflow for generating this data and using it to plot distribution mappings and load balance costs.
Generating the data
To generate ‘Load Balance Costs’ reduced diagnostics output, WarpX should be run with the following lines added to the input file (the name of the reduced diagnostics file, LBC, and interval in steps to output reduced diagnostics data, 100, may be changed as needed):
warpx.reduced_diags_names = LBC
LBC.type = LoadBalanceCosts
LBC.intervals = 100
The line warpx.reduced_diags_names = LBC sets the name of the reduced diagnostics output file to LBC. The next line LBC.type = LoadBalanceCosts tells WarpX that the reduced diagnostics is a LoadBalanceCosts diagnostic, and instructs WarpX to record costs and rank layouts. The final line, LBC.intervals = 100, controls the interval for output of this reduced diagnostic’s data.
Loading and plotting the data
After generating data (called LBC_knapsack.txt and LBC_sfc.txt in the example
below), the following Python code, along with a helper class in
plot_distribution_mapping.py
can be used to read the data:
# Math
import numpy as np
import random
# Plotting
import matplotlib.pyplot as plt
import matplotlib as mpl
from matplotlib.colors import ListedColormap
from mpl_toolkits.axes_grid1 import make_axes_locatable
# Data handling
import plot_distribution_mapping as pdm
sim_knapsack = pdm.SimData('LBC_knapsack.txt',  # Data directory
                           [2800],              # Files to process
                           is_3D=False          # if this is a 2D sim
                           )
sim_sfc = pdm.SimData('LBC_sfc.txt', [2800])

# Set reduced diagnostics data for step 2800
for sim in [sim_knapsack, sim_sfc]:
    sim(2800)
For 2D data, the following function can be used for visualization of distribution mappings:
# Plotting -- we know beforehand the data is 2D
def plot(sim):
    """
    Plot MPI rank layout for a set of `LoadBalanceCosts` reduced diagnostics
    (2D) data.

    Arguments:
    sim -- SimData class with data (2D) loaded for desired iteration
    """
    # Make first cmap
    cmap = plt.cm.nipy_spectral
    cmaplist = [cmap(i) for i in range(cmap.N)][::-1]
    unique_ranks = np.unique(sim.rank_arr)
    sz = len(unique_ranks)
    cmap = mpl.colors.LinearSegmentedColormap.from_list(
        'my_cmap', cmaplist, sz)  # create the new map

    # Make cmap from 1 --> 96 then randomize
    cmaplist = [cmap(i) for i in range(sz)]
    random.Random(6).shuffle(cmaplist)
    cmap = mpl.colors.LinearSegmentedColormap.from_list(
        'my_cmap', cmaplist, sz)  # create the new map

    # Define the bins and normalize
    bounds = np.linspace(0, sz, sz + 1)
    norm = mpl.colors.BoundaryNorm(bounds, sz)

    my, mx = sim.rank_arr.shape
    xcoord, ycoord = np.linspace(0,mx,mx+1), np.linspace(0,my,my+1)
    im = plt.pcolormesh(xcoord, ycoord, sim.rank_arr,
                        cmap=cmap, norm=norm)

    # Grid lines
    plt.ylabel('$j$')
    plt.xlabel('$i$')
    plt.minorticks_on()
    plt.hlines(ycoord, xcoord[0], xcoord[-1],
               alpha=0.7, linewidth=0.3, color='lightgrey')
    plt.vlines(xcoord, ycoord[0], ycoord[-1],
               alpha=0.7, linewidth=0.3, color='lightgrey')
    plt.gca().set_aspect('equal')

    # Center rank label
    for j in range(my):
        for i in range(mx):
            text = plt.gca().text(i+0.5, j+0.5, int(sim.rank_arr[j][i]),
                                  ha="center", va="center",
                                  color="w", fontsize=8)

    # Colorbar
    divider = make_axes_locatable(plt.gca())
    cax = divider.new_horizontal(size="5%", pad=0.05)
    plt.gcf().add_axes(cax)
    cb = plt.gcf().colorbar(im, label='rank', cax=cax, orientation="vertical")
    minorticks = np.linspace(0, 1, len(unique_ranks) + 1)
    cb.ax.yaxis.set_ticks(minorticks, minor=True)
The function can be used as follows:
fig, axs = plt.subplots(1, 2, figsize=(12, 6))
plt.sca(axs[0])
plt.title('Knapsack')
plot(sim_knapsack)
plt.sca(axs[1])
plt.title('SFC')
plot(sim_sfc)
plt.tight_layout()
This generates plots like in [fig:knapsack_sfc_distribution_mapping_2D]:

Sample distribution mappings from simulations with knapsack (left) and space-filling curve (right) policies for update of the distribution mapping when load balancing.
Similarly, the computational costs per box can be plotted with the following code:
fig, axs = plt.subplots(1, 2, figsize=(12, 6))
plt.sca(axs[0])
plt.title('Knapsack')
plt.pcolormesh(sim_knapsack.cost_arr)
plt.sca(axs[1])
plt.title('SFC')
plt.pcolormesh(sim_sfc.cost_arr)
for ax in axs:
    plt.sca(ax)
    plt.ylabel('$j$')
    plt.xlabel('$i$')
    ax.set_aspect('equal')
plt.tight_layout()
This generates plots like in [fig:knapsack_sfc_costs_2D]:

Sample computational cost per box from simulations with knapsack (left) and space-filling curve (right) policies for update of the distribution mapping when load balancing.
Loading 3D data works the same as loading 2D data, but this time the cost and rank arrays will be 3 dimensional. Here we load and plot some example 3D data (LBC_3D.txt) from a simulation run on 4 MPI ranks. Particles fill the box from \(k=0\) to \(k=1\).
sim_3D = pdm.SimData('LBC_3D.txt', [1,2,3])
sim_3D(1)
# Plotting -- we know beforehand the data is 3D
def plot_3D(sim, j0):
    """
    Plot MPI rank layout for a set of `LoadBalanceCosts` reduced diagnostics
    (3D) data.

    Arguments:
    sim -- SimData class with data (3D) loaded for desired iteration
    j0 -- slice along j direction to plot ik slice
    """
    # Make first cmap
    cmap = plt.cm.viridis
    cmaplist = [cmap(i) for i in range(cmap.N)][::-1]
    unique_ranks = np.unique(sim.rank_arr)
    sz = len(unique_ranks)
    cmap = mpl.colors.LinearSegmentedColormap.from_list(
        'my_cmap', cmaplist, sz)  # create the new map

    # Make cmap from 1 --> 96 then randomize
    cmaplist = [cmap(i) for i in range(sz)]
    random.Random(6).shuffle(cmaplist)
    cmap = mpl.colors.LinearSegmentedColormap.from_list(
        'my_cmap', cmaplist, sz)  # create the new map

    # Define the bins and normalize
    bounds = np.linspace(0, sz, sz + 1)
    norm = mpl.colors.BoundaryNorm(bounds, sz)

    mz, my, mx = sim.rank_arr.shape
    xcoord, ycoord, zcoord = (np.linspace(0,mx,mx+1), np.linspace(0,my,my+1),
                              np.linspace(0,mz,mz+1))
    im = plt.pcolormesh(xcoord, zcoord, sim.rank_arr[:,j0,:],
                        cmap=cmap, norm=norm)

    # Grid lines
    plt.ylabel('$k$')
    plt.xlabel('$i$')
    plt.minorticks_on()
    plt.hlines(zcoord, xcoord[0], xcoord[-1],
               alpha=0.7, linewidth=0.3, color='lightgrey')
    plt.vlines(xcoord, zcoord[0], zcoord[-1],
               alpha=0.7, linewidth=0.3, color='lightgrey')
    plt.gca().set_aspect('equal')

    # Center rank label
    for k in range(mz):
        for i in range(mx):
            text = plt.gca().text(i+0.5, k+0.5, int(sim.rank_arr[k][j0][i]),
                                  ha="center", va="center",
                                  color="red", fontsize=8)

    # Colorbar
    divider = make_axes_locatable(plt.gca())
    cax = divider.new_horizontal(size="5%", pad=0.05)
    plt.gcf().add_axes(cax)
    cb = plt.gcf().colorbar(im, label='rank', cax=cax, orientation="vertical")
    ticks = np.linspace(0, 1, len(unique_ranks) + 1)
    cb.ax.yaxis.set_ticks(ticks)
    cb.ax.yaxis.set_ticklabels([0, 1, 2, 3, " "])
fig, axs = plt.subplots(2, 2, figsize=(8, 8))
for j, ax in enumerate(axs.flatten()):
    plt.sca(ax)
    plt.title('j={}'.format(j))
    plot_3D(sim_3D, j)

plt.tight_layout()
This generates plots like in [fig:distribution_mapping_3D]:

Sample distribution mappings from 3D simulations, visualized as slices in the \(ik\) plane along \(j\).
Debugging the code
Sometimes, the code does not give you the result that you are expecting. This can be due to a variety of reasons: misunderstandings of or changes in the input parameters, system-specific quirks, or bugs. You might also want to debug your code as you implement new features in WarpX during development.
This section gives step-by-step guidance on how to systematically check what might be going wrong.
Debugging Workflow
Try the following steps to debug a simulation:
- Check the output text file, usually called output.txt: are there warnings or errors present? On an HPC system, look for the job output and error files, usually called WarpX.e... and WarpX.o.... Read long messages from the top and follow potential guidance.
- If your simulation already created output data files: check if they look reasonable before the problem occurred. Are the initial conditions of the simulation as you expected? Do you spot numerical artifacts or instabilities that could point to missing resolution or unexpected/incompatible numerical parameters?
- Did the job output files indicate a crash? Check the Backtrace.<mpirank> files for the location of the code that triggered the crash. Backtraces are read from bottom (high-level) to top (most specific line that crashed).
- Was this a segmentation fault in C++, but the run was controlled from Python (PICMI)? To get the last called Python line for the backtrace, run again and add the Python faulthandler, e.g., with python3 -X faulthandler PICMI_your_script_here.py.
- Try to make the reproducible scenario as small as possible by modifying the inputs file. Reduce the number of cells, particles and MPI processes to something as small and as quick to execute as possible. The next steps in debugging will increase runtime, so you will benefit from a fast reproducer.
- Consider adding runtime debug options that can narrow down typical causes in numerical implementations.
- In case of a crash, backtraces can be more detailed if you re-compile with debug flags: for example, try compiling with -DCMAKE_BUILD_TYPE=RelWithDebInfo (some slowdown) or even -DCMAKE_BUILD_TYPE=Debug (this will make the simulation way slower) and rerun.
- If debug builds are too costly, try instead compiling with -DAMReX_ASSERTIONS=ON to activate more checks and rerun.
- If the problem looks like a memory violation, this could be from an invalid field or particle index access. Try compiling with -DAMReX_BOUND_CHECK=ON (this will make the simulation very slow) and rerun.
- If it looks like random memory might be used, try initializing memory with signaling Not-a-Number (NaN) values through the runtime option fab.init_snan = 1. Further useful runtime options are amrex.fpe_trap_invalid, amrex.fpe_trap_zero and amrex.fpe_trap_overflow (see details in the AMReX link below).
- On Nvidia GPUs, if you suspect the problem might be a race condition due to a missing host/device synchronization, set the environment variable export CUDA_LAUNCH_BLOCKING=1 and rerun.
- Consider simplifying your input options and re-adding more options after having found a working baseline.
For more information, see also the AMReX Debugging Manual.
Last but not least: the community of WarpX developers and users can help if you get stuck. Collect your findings from the steps above; describe where and what you are running and how you installed the code; describe the issue you are seeing, with details, the input files used, and what you already tried. Can you reproduce the problem with a smaller setup (less parallelism and/or less resolution)? Report these details in a WarpX GitHub issue.
Debuggers
See the AMReX debugger section on additional runtime parameters to
disable backtraces
rethrow exceptions
avoid AMReX-level signal handling
You will need to set those runtime options to work directly with debuggers.
Typical Error Messages
By default, the code is run in Release mode (see compilation options). That means, code errors will likely show up as symptoms of earlier errors in the code instead of directly showing the underlying line that caused the error.
For instance, we have these checks in Release mode:
Particles shape does not fit within tile (CPU) or guard cells (GPU) used for charge deposition
Particles shape does not fit within tile (CPU) or guard cells (GPU) used for current deposition
These prevent particles whose positions violate the local definitions of guard cells from causing confusing errors in charge/current deposition.
In such a case, as described above, rebuild and rerun in Debug mode before searching further for the bug.
Usually, the bug comes from NaN or infinite numbers assigned to particles or fields earlier in the code, or from ill-defined guard sizes.
Building in Debug mode will likely move the first thrown error to an earlier location in the code, which is then closer to the underlying cause.
Then, continue following the workflow above, adding more compilation guards and runtime flags that can trap array bound violations and invalid floating point values.
Generate QED lookup tables using the standalone tool
We provide tools to generate QED lookup tables and to convert them into a human-readable format.
Such tools can be compiled with cmake by setting the flag WarpX_QED_TOOLS=ON (this requires both the PICSAR and Boost libraries).
The tools are compiled alongside the WarpX executable in the folder bin. We report here the help message displayed by the tools:
$ ./qed_table_reader -h
### QED Table Reader ###
Command line options:
-h [NO ARG] Prints all command line arguments
-i [STRING] Name of the file to open
--table [STRING] Either BW (Breit-Wheeler) or QS (Quantum Synchrotron)
--mode [STRING] Precision of the calculations: either DP (double) or SP (single)
-o [STRING] filename to save the lookup table in human-readable format
$ ./qed_table_generator -h
### QED Table Generator ###
Command line options:
-h [NO ARG] Prints all command line arguments
--table [STRING] Either BW (Breit-Wheeler) or QS (Quantum Synchrotron)
--mode [STRING] Precision of the calculations: either DP (double) or SP (single)
--dndt_chi_min [DOUBLE] minimum chi parameter for the dNdt table
--dndt_chi_max [DOUBLE] maximum chi parameter for the dNdt table
--dndt_how_many [INTEGR] number of points in the dNdt table
--pair_chi_min [DOUBLE] minimum chi for the pair production table (BW only)
--pair_chi_max [DOUBLE] maximum chi for the pair production table (BW only)
--pair_chi_how_many [INTEGR] number of chi points in the pair production table (BW only)
--pair_frac_how_many [INTEGR] number of frac points in the pair production table (BW only)
--em_chi_min [DOUBLE] minimum chi for the photon emission table (QS only)
--em_chi_max [DOUBLE] maximum chi for the photon emission production table (QS only)
--em_frac_min [DOUBLE] minimum frac for the photon emission production table (QS only)
--em_chi_how_many [INTEGR] number of chi points in the photon emission table (QS only)
--em_frac_how_many [INTEGR] number of frac points in the photon emission table (QS only)
-o [STRING] filename to save the lookup table
These tools are meant to be compatible with WarpX: qed_table_generator should generate tables that can be loaded into WarpX, and qed_table_reader should be able to read tables generated with WarpX.
It is not safe to use these tools to generate a table on a machine with a different endianness from the machine where the table will be used.
Plot timestep duration
We provide a simple Python script to generate plots of the timestep duration from the standard output of WarpX (provided that warpx.verbose is set to 1): plot_timestep_duration.py.
If the standard output of a simulation has been redirected to a file named log_file, the script can be used as follows:
python plot_timestep_duration.py log_file
The script generates two pictures: log_file_ts_duration.png, which shows the duration of each timestep in seconds as a function of the timestep number, and log_file_ts_cumulative_duration.png, which shows the total duration of the simulation as a function of the timestep number.
Predicting the Number of Guard Cells for PSATD Simulations
When the computational domain is decomposed in parallel subdomains and the pseudo-spectral analytical time-domain (PSATD) method is used to solve Maxwell’s equations (by setting algo.maxwell_solver = psatd
in the input file), the number of guard cells used to exchange fields between neighboring subdomains can be chosen based on the extent of the stencil of the leading term in Maxwell’s equations, in Fourier space. A measure of such stencil can be obtained by computing the inverse Fourier transform of the given term along a chosen axis and by averaging the result over the remaining axes in Fourier space. The idea is to look at how quickly such stencils fall off to machine precision, with respect to their extension in units of grid cells, and identify consequently the number of cells after which the stencils will be truncated, with the aim of balancing numerical accuracy and locality. See (Zoni et al., 2021) for reference.
A user can run the Python script Stencil.py, located in ./Tools/DevUtils, in order to compute such stencils and estimate the number of guard cells needed for a given PSATD simulation with domain decomposition. In particular, the script computes the minimum number of guard cells for a given error threshold, that is, the minimum number of guard cells such that the stencil measure is not larger than the error threshold. The user can modify the input parameters set in the main function in order to reproduce the simulation setup. These parameters include: cell size, time step, spectral order, Lorentz boost, whether the PSATD algorithm is based on the Galilean scheme, and error threshold (this is not an input parameter of a WarpX simulation, but rather an empirical error threshold chosen to balance numerical accuracy and locality, as mentioned above).
Archiving
Archiving simulation inputs, scripts and output data is a common need for computational physicists. Here are some popular tools and workflows to make archiving easy.
HPC Systems: HPSS
A very common tape filesystem is HPSS, e.g., on NERSC or OLCF.
What’s in my archive file system?
hsi ls
Already something in my archive location?
hsi ls 2019/cool_campaign/
works as usual.
Let’s create a neat directory structure:
new directory on the archive:
hsi mkdir 2021
create sub-dirs per campaign as usual:
hsi mkdir 2021/reproduce_paper
Create an archive of a simulation:
htar -cvf 2021/reproduce_paper/sim_042.tar /global/cfs/cdirs/m1234/ahuebl/reproduce_paper/sim_042
This copies all files over to the tape filesystem and stores them as a single .tar archive.
The first argument is the new archive .tar file on the archive file system; all following arguments (there can be multiple, separated by spaces) are locations of directories and files on the parallel file system.
Don’t be confused: these tools also create an index .tar.idx file alongside the archive; just leave that file be and don’t interact with it.
Change permissions of your archive, so your team can read your files:
Check the Unix permissions via hsi ls -al 2021/ and hsi ls -al 2021/reproduce_paper/
Files must be group (g) readable (r):
hsi chmod g+r 2021/reproduce_paper/sim_042.tar
Directories must be group (g) readable (r) and group accessible (x):
hsi chmod -R g+rx 2021
Restore things:
mkdir here_we_restore
cd here_we_restore
htar -xvf 2021/reproduce_paper/sim_042.tar
This copies the .tar file back from tape to our parallel filesystem and extracts its content in the current directory.
Argument meaning: -c create; -x extract; -v verbose; -f tar filename.
That’s it, folks!
Note
Sometimes, for large directories, htar takes a while.
You could then consider running it as part of a (single-node/single-CPU) job script.
Desktops/Laptops: Cloud Drives
Even for small simulation runs, it is worth creating data archives. A good location for such an archive might be the cloud storage provided by one’s institution.
Tools like rclone can help with this, e.g., to quickly sync a large number of directories to a Google Drive.
Asynchronous File Copies: Globus
The scientific data service Globus makes it easy to perform large-scale data copies, between HPC centers as well as local computers, through a graphical user interface. Copies can be kicked off asynchronously, often use dedicated internet backbones, and are verified once transfers are complete.
Many HPC centers also add their archives as a storage endpoint, and one can download a client program to add one’s own desktop/laptop as an endpoint as well.
Scientific Data for Publications
It is good practice to make computational results accessible, scrutinizable and ideally even reusable.
For data artifacts up to approximately 50 GB, consider using free services like Zenodo and Figshare to store supplementary materials of your publications.
For more information, see the open science movement, open data and open access.
Note
More information, guidance and templates will be posted here in the future.
Training a Surrogate Model from WarpX Data
Suppose we have a WarpX simulation that we wish to replace with a neural network surrogate model, for example, a simulation determined by the following input script.
In this section we walk through a workflow for data processing and model training, using data from this input script as an example.
The simulation output is stored in an online Zenodo archive, in the lab_particle_diags
directory.
In the example scripts provided here, the data is downloaded from the Zenodo archive, properly formatted, and used to train a neural network.
This workflow was developed and first presented in Sandberg et al. [1], Sandberg et al. [2].
It assumes you have an up-to-date environment with PyTorch and openPMD.
Data Cleaning
It is important to inspect the data for artifacts and to check that the input/output data make sense. If we plot the final phase space of the particle beam, shown in Fig. 18, we see outlying particles. Looking closer at the z-pz space, we see that some particles were not trapped in the accelerating region of the wake and have much less energy than the rest of the beam.

The final phase space projections of a particle beam through a laser-plasma acceleration element where some beam particles were not accelerated.
To assist our neural network in learning dynamics of interest, we filter out these particles.
It is sufficient for our purposes to select particles that are not too far back, setting particle_selection={'z':[0.280025, None]}.
After filtering, we can see in Fig. 19 that the beam phase space projections are much cleaner – this is the beam we want to train on.

The final phase space projections of a particle beam through a laser-plasma acceleration element after filtering out outlying particles.
A particle tracker is set up to make sure we consistently filter out these particles from both the initial and final data.
iteration = ts.iterations[survivor_select_index]
pt = ParticleTracker(ts,
                     species=species,
                     iteration=iteration,
                     select=particle_selection)
This data cleaning ensures that the particle data is distributed in a single blob, as is optimal for training neural networks.
Create Normalized Dataset
Having chosen training data we are content with, we now need to format the data, normalize it, and store the normalized data as well as the normalizations. The script below will take the openPMD data we have selected and format, normalize, and store it.
Load openPMD Data
First the openPMD data is loaded, using the particle selector as chosen above. The neural network will make predictions from the initial phase space coordinates, using the final phase space coordinates to measure how well it is making predictions. Hence we load two sets of particle data, the source and target particle arrays.
iteration = ts.iterations[source_index]
source_data = ts.get_particle(species=species,
                              iteration=iteration,
                              var_list=['x','y','z','ux','uy','uz'],
                              select=pt)

iteration = ts.iterations[target_index]
target_data = ts.get_particle(species=species,
                              iteration=iteration,
                              var_list=['x','y','z','ux','uy','uz'],
                              select=pt)
Normalize Data
Neural networks learn better on appropriately normalized data. Here we subtract out the mean in each coordinate direction and divide by the standard deviation in each coordinate direction, for normalized data that is centered on the origin with unit variance.
target_means = np.zeros(6)
target_stds = np.zeros(6)
source_means = np.zeros(6)
source_stds = np.zeros(6)
for jj in range(6):
    source_means[jj] = source_data[jj].mean()
    source_stds[jj] = source_data[jj].std()
    source_data[jj] -= source_means[jj]
    source_data[jj] /= source_stds[jj]

for jj in range(6):
    target_means[jj] = target_data[jj].mean()
    target_stds[jj] = target_data[jj].std()
    target_data[jj] -= target_means[jj]
    target_data[jj] /= target_stds[jj]
openPMD to PyTorch Data
With the data normalized, it must be stored in a form PyTorch recognizes. The openPMD data are 6 lists of arrays, one for each of the 6 phase space coordinates \(x, y, z, p_x, p_y,\) and \(p_z\). These data are converted to an \(N\times 6\) numpy array and then to a PyTorch \(N\times 6\) tensor.
source_data = torch.tensor(np.column_stack(source_data))
target_data = torch.tensor(np.column_stack(target_data))
Save Normalizations and Normalized Data
The data is split into training and testing subsets. We take most of the data (70%) for training, meaning that data is used to update the neural network parameters. The testing data is reserved to determine how well the neural network generalizes; that is, how well the neural network performs on data that wasn’t used to update the neural network parameters. With the data split and properly normalized, it and the normalizations are saved to file for use in training and inference.
full_dataset = torch.utils.data.TensorDataset(source_data.float(), target_data.float())

n_samples = full_dataset.tensors[0].size(0)
n_train = int(training_frac*n_samples)
n_test = n_samples - n_train

train_data, test_data = torch.utils.data.random_split(full_dataset, [n_train, n_test])

torch.save({'dataset':full_dataset,
            'train_indices':train_data.indices,
            'test_indices':test_data.indices,
            'source_means':source_means,
            'source_stds':source_stds,
            'target_means':target_means,
            'target_stds':target_stds,
            'times':times,
            },
           dataset_fullpath_filename
           )
Neural Network Structure
It was found in Sandberg et al. [2] that a reasonable surrogate model is obtained with shallow feedforward neural networks consisting of about 5 hidden layers and 700-900 nodes per layer. The example shown here uses 3 hidden layers and 20 nodes per layer and is trained for 10 epochs.
Some utility functions for creating neural networks are provided in the script below. These are mostly convenience wrappers and utilities for working with PyTorch neural network objects. This script is imported in the training scripts shown later.
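For orientation, a feedforward network of the shape described above can be assembled directly with PyTorch; the make_mlp helper below is an illustrative sketch, not the actual API of the utility script referenced above:
import torch.nn as nn

def make_mlp(n_in=6, n_out=6, n_hidden_layers=3, n_hidden_nodes=20,
             activation=nn.ReLU):
    """Build a simple fully-connected feedforward network."""
    layers = [nn.Linear(n_in, n_hidden_nodes), activation()]
    for _ in range(n_hidden_layers - 1):
        layers += [nn.Linear(n_hidden_nodes, n_hidden_nodes), activation()]
    layers.append(nn.Linear(n_hidden_nodes, n_out))
    return nn.Sequential(*layers)

model = make_mlp()   # 3 hidden layers, 20 nodes per layer, as in this example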
Train and Save Neural Network
The script below trains the neural network on the dataset just created. In subsequent sections we discuss the various parts of the training process.
Training Function
In the training function, the model weights are updated. Iterating through batches, the loss function is evaluated on each batch. PyTorch provides automatic differentiation, so the direction of steepest descent is determined when the loss function is evaluated and the loss.backward() function is invoked. The optimizer uses this information to update the weights in the optimizer.step() call. The training loop then resets the optimizer, updates the summed error for the whole dataset with the error on the batch, and continues iterating through batches. Note that this function returns the sum of all errors across the entire dataset, which is later divided by the size of the dataset in the training loop.
def train(model, optimizer, train_loader, loss_fun):
    model.train()
    total_loss = 0.
    for batch_idx, (data, target) in enumerate(train_loader):
        # evaluate network with data
        output = model(data)
        # compute loss
        # sum the differences squared, take mean afterward
        loss = loss_fun(output, target, reduction='sum')
        # backpropagation: step optimizer and reset gradients
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        total_loss += loss.item()
    return total_loss
Testing Function
The testing function just evaluates the neural network on the testing data that has not been used to update the model parameters. This testing function requires that the testing dataset is small enough to be loaded all at once. The PyTorch dataloader can load data in batches if this size assumption is not satisfied. The error, measured by the loss function, is returned by the testing function to be aggregated and stored. Note that this function returns the sum of all errors across the entire dataset, which is later divided by the size of the dataset in the training loop.
def test_dataset(model, test_source, test_target, loss_fun):
    model.eval()
    with torch.no_grad():
        output = model(test_source)
        return loss_fun(output, test_target, reduction='sum').item()
Training Loop
The full training loop performs n_epochs iterations. At each iteration the training and testing functions are called, the respective errors are divided by the size of the respective dataset and recorded, and a status update is printed to the console.
for epoch in range(n_epochs):
    if do_print:
        t1 = time.time()
    ave_train_loss = train(model, optimizer, train_loader_device, loss_fun) / data_dim / training_set_size
    ave_test_loss = test_dataset(model, test_source_device, test_target_device, loss_fun) / data_dim / testing_set_size
    train_loss_list.append(ave_train_loss)
    test_loss_list.append(ave_test_loss)
    if do_print:
        t2 = time.time()
        print('Train Epoch: {:04d} \tTrain Loss: {:.6f} \tTest Loss: {:.6f}, this epoch: {:.3f} s'.format(
            epoch + 1, ave_train_loss, ave_test_loss, t2-t1))
Save Neural Network Parameters
The model weights are saved after training to record the updates to the model parameters. Additionally, we save some model metainformation with the model for convenience, including the model hyperparameters, the training and testing losses, and how long the training took.
model.to(device='cpu')
torch.save({
    'n_hidden_layers': n_hidden_layers,
    'n_hidden_nodes': n_hidden_nodes,
    'activation': activation_type,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'train_loss_list': train_loss_list,
    'test_loss_list': test_loss_list,
    'training_time': training_time,
}, f'models/{species}_model.pt')
Evaluate
In this section we show two ways to diagnose how well the neural network is learning the data. First we consider the train-test loss curves, shown in Fig. 20. This figure shows the model error on the training data (in blue) and on the testing data (in green) as a function of the number of epochs seen. The training data is used to update the model parameters, so the training error should be lower than the testing error. A key feature to look for in the train-test loss curve is the inflection point in the test loss trend. The testing data is set aside as a sample of data the neural network hasn't seen before, so the testing error serves as a metric of model generalizability. When the test loss starts to flatten or even increase, the neural network is no longer improving its ability to generalize to new data.

Training (in blue) and testing (in green) loss curves versus number of training epochs.

A comparison of model prediction (yellow-red dots, colored by mean-squared error) with simulation output (black dots).
A visual inspection of the model prediction can be seen in Fig. 21. This plot compares the model prediction on the testing data, with dots colored by mean-squared error, to the actual simulation output in black. The model obtained with the hyperparameters chosen here trains quickly but is not very accurate. A more accurate model is obtained with 5 hidden layers and 900 nodes per layer, as discussed in Sandberg et al. [2].
These figures can be generated with the following Python script.
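The plotting script shipped with WarpX is not reproduced here. As a minimal sketch, assuming the checkpoint layout saved above (models/<species>_model.pt containing train_loss_list and test_loss_list; the species name used below is illustrative), the train-test loss curves of Fig. 20 can be plotted with:

import torch
import matplotlib.pyplot as plt

species = 'beam'  # hypothetical species name; use the one chosen when saving the model
checkpoint = torch.load(f'models/{species}_model.pt', map_location='cpu')

# training loss in blue, testing loss in green, as in Fig. 20
plt.semilogy(checkpoint['train_loss_list'], color='C0', label='training loss')
plt.semilogy(checkpoint['test_loss_list'], color='C2', label='testing loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.savefig('train_test_loss.png', dpi=150)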
Surrogate Usage in Accelerator Physics
A neural network such as the one we trained here can be incorporated in other BLAST codes. Consider this example using neural network surrogates of WarpX simulations in ImpactX.
R. Sandberg, R. Lehe, C. E. Mitchell, M. Garten, J. Qiang, J.-L. Vay, and A. Huebl. Hybrid beamline element ML-training for surrogates in the ImpactX beam-dynamics code. In Proc. 14th International Particle Accelerator Conference, number 14 in IPAC'23 - 14th International Particle Accelerator Conference, 2885–2888. Venice, Italy, May 2023. JACoW Publishing, Geneva, Switzerland. URL: https://indico.jacow.org/event/41/contributions/2276, doi:10.18429/JACoW-IPAC2023-WEPA101.
R. Sandberg, R. Lehe, C. E. Mitchell, M. Garten, A. Myers, J. Qiang, J.-L. Vay, and A. Huebl. Synthesizing Particle-in-Cell Simulations Through Learning and GPU Computing for Hybrid Particle Accelerator Beamlines. 2024. accepted. URL: https://arxiv.org/abs/2402.17248, doi:10.48550/arXiv.2402.17248.
Optimizing with Optimas
optimas is an open-source Python library that enables highly scalable parallel optimization, from a typical laptop to exascale HPC systems. While a WarpX simulation can provide insight into some physics, it remains a single point evaluation in the space of input parameters. If you have a simulation ready for use but would like to (i) scan some input parameters uniformly, e.g., for a tolerance study, (ii) sample the space of input parameters randomly within a given span, or (iii) tune some input parameters to optimize an output quantity, e.g., beam emittance or energy spread, optimas provides these capabilities and takes care of task monitoring with fault tolerance on multiple platforms (optimas targets modern HPC platforms like Perlmutter and Frontier).
A more detailed description of optimas is provided in the optimas documentation. In particular, the online optimas documentation provides an example optimization with optimas that runs WarpX simulations.
FAQ
This section lists frequently asked usage questions.
What is “MPI initialized with thread support level …”?
When WarpX starts up, it reports information about the MPI processes, CPU threads or GPUs in use, and further capabilities. For instance, a parallel, multi-process, multi-threaded CPU run could output:
MPI initialized with 4 MPI processes
MPI initialized with thread support level 3
OMP initialized with 8 OMP threads
AMReX (22.10-20-g3082028e4287) initialized
...
The 1st line is the number of parallel MPI processes (also called MPI ranks).
The 2nd line reports the level of support for calling MPI functions from threads. We currently only use this for optional, asynchronous IO with AMReX plotfiles. In the past, requesting MPI threading support came with performance penalties, but we have not observed any on recent systems. Thus, we request it by default, but you can override it with a compile-time option if that ever becomes necessary.
The 3rd line is the number of CPU OpenMP (OMP) threads per MPI process. After that, information on software versions follow.
How do I suppress tiny profiler output if I do not care to see it?
Via AMReX_TINY_PROFILE=OFF (see: build options and then AMReX build options). We change the default in cmake/dependencies/AMReX.cmake.
Note that the tiny profiler adds literally no overhead to the simulation runtime, thus we enable it by default.
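For example, with a typical CMake build of WarpX (the build directory name and parallelism below are illustrative), the option can be passed at configure time:

# configure with the tiny profiler disabled, then build
cmake -S . -B build -DAMReX_TINY_PROFILE=OFF
cmake --build build -j 4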
What design principles should I keep in mind when creating an input file?
Leave a cushion between lasers, particles, and the edge of computational domain.
The laser antenna and plasma species zmin can be less than or greater than geometry.prob_lo, but should not be exactly equal to it; see the illustrative inputs excerpt below.
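As an illustration only (the species name, geometry, and all numerical values below are made up), an inputs excerpt that leaves such a cushion between the plasma and the domain edges could look like:

# illustrative 1D excerpt: keep zmin/zmax strictly inside the domain,
# not exactly at its edges
geometry.dims    = 1
geometry.prob_lo = -50.e-6
geometry.prob_hi =  50.e-6

electrons.zmin   = -49.e-6   # a small cushion away from the lower domain edge
electrons.zmax   =  49.e-6   # a small cushion away from the upper domain edge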
What do I need to know about using the boosted frame?
The input deck can be designed in the lab frame and little modification to the physical set-up is needed – most of the work is done internally. Here are a few practical items to assist in designing boosted frame simulations:
Ions must be explicitly included
Best practice is to separate counter-propagating objects; things moving to the right should start with \(z \leq 0\) and things stationary or moving to the left (moving to the left in the boosted frame) should start with \(z > 0\)
Don’t forget the general design principles listed above
The boosted frame simulation begins at boosted time \(t'=0\)
Numerics and algorithms need to be adjusted, as there are numerical instabilities that arise in the boosted frame. For example, set particles.use_fdtd_nci_corr=1 for an FDTD simulation or psatd.use_default_v_galilean=1 for a PSATD simulation. Be careful, as this is overly simplistic and these options will not work in all cases. Please see the input parameters documentation and the examples for more information.
An in-depth discussion of the boosted frame is provided in the moving window and optimal Lorentz boosted frame section.
What about Back-transformed diagnostics (BTD)?
![Minkowski diagram of the back-transformed diagnostic (BTD)](https://user-images.githubusercontent.com/10621396/198702232-9dd595ad-479e-4170-bd25-51e2b72cd50a.png)
[fig:BTD_features] Minkowski diagram indicating several features of the back-transformed diagnostic (BTD). The diagram explains why the first BTD begins to fill at boosted time \(t'=0\) but this doesn’t necessarily correspond to lab time \(t=0\), how the BTD grid spacing is determined by the boosted time step \(\Delta t'\) (hence why the snapshot length doesn’t correspond to the grid spacing and length in the input script), and how the BTD snapshots complete when the effective snapshot length is covered in the boosted frame.
Several BTD quantities differ slightly from the lab frame domain described in the input deck. In the following discussion, we will use a subscript input (e.g. \(\Delta z_{\rm input}\)) to denote properties of the lab frame domain.
The first back-transformed diagnostic (BTD) snapshot may not occur at \(t=0\). Rather, it occurs at \(t_0=\frac{z_{\max}}{c}\,\beta(1+\beta)\gamma^2\). This is the first time at which the boosted frame can complete the snapshot.
The grid spacing of the BTD snapshot is different from the grid spacing indicated in the input script. It is given by \(\Delta z_{\rm grid,snapshot}=\frac{c\Delta t_{\rm boost}}{\gamma\beta}\). For a CFL-limited time step, \(\Delta z_{\rm grid,snapshot}\approx \frac{1+\beta}{\beta} \Delta z_{\rm input}\approx 2 \Delta z_{\rm input}\). Hence in many common use cases at large boost, it is expected that the BTD snapshot has a grid spacing twice what is expressed in the input script.
The effective length of the BTD snapshot may be longer than anticipated from the input script because the grid spacing is different. Additionally, the number of grid points in the BTD snapshot is a multiple of <BTD>.buffer_size, whereas the number of grid cells specified in the input deck may not be.

The code may require longer than anticipated to complete a BTD snapshot. The code starts filling the \(i^{th}\) snapshot around step \(j_{\rm BTD start}={\rm ceil}\left( i\gamma(1-\beta)\frac{\Delta t_{\rm snapshot}}{\Delta t_{\rm boost}}\right)\). The code then saves information for one BTD cell every time step in the boosted frame simulation. The \(i^{th}\) snapshot is completed and saved \(n_{z,{\rm snapshot}}=n_{\rm buffers}\cdot ({\rm buffer\ size})\) time steps after it begins, which is when the effective snapshot length is covered by the simulation.
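The following short Python sketch evaluates the expressions above for illustrative (not prescriptive) values of the boost and domain parameters:

import math

# illustrative numbers only
gamma_boost = 10.0
beta = math.sqrt(1.0 - 1.0 / gamma_boost**2)
c = 299792458.0          # speed of light, m/s
z_max = 100.e-6          # upper edge of the lab-frame domain, m
dz_input = 0.05e-6       # lab-frame grid spacing, m

# time of the first BTD snapshot, t0 = z_max/c * beta*(1+beta)*gamma^2
t0 = z_max / c * beta * (1.0 + beta) * gamma_boost**2
# BTD snapshot grid spacing, approx. (1+beta)/beta * dz_input
dz_snapshot = (1.0 + beta) / beta * dz_input

print(f"first snapshot completes at t0 = {t0:.3e} s")
print(f"BTD grid spacing ~ {dz_snapshot:.3e} m (~{dz_snapshot/dz_input:.2f} x dz_input)")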
What kinds of RZ output do you support?
In RZ geometry, the level of detail in the output depends on the output format configured in the inputs file.
openPMD output supports the detailed RZ modes and reconstructs representations on-the-fly in post-processing, e.g., in openPMD-viewer or other tools. For some tools, this is in development.
AMReX plotfiles and other in situ methods output a 2D reconstructed Cartesian slice at \(\theta=0\) by default (and can opt in to dump raw modes).
Data Analysis
Output formats
WarpX can write diagnostics data either in plotfile format or in openPMD format.
Plotfiles are AMReX’ native data format, while openPMD is implemented in popular community formats such as ADIOS and HDF5.
This section describes some of the tools available to visualize the data.
Asynchronous IO
When using the AMReX plotfile format, users can set the amrex.async_out=1
option to perform the IO in a non-blocking fashion, meaning that the simulation
will continue to run while an IO thread controls writing the data to disk.
This can significantly reduce the overall time spent in IO. This is primarily intended for
large runs on supercomputers (e.g. at OLCF or NERSC); depending on the MPI
implementation you are using, you may not see a benefit on your workstation.
When writing plotfiles, each rank will write to a separate file, up to some maximum number (by default, 64). This maximum can be adjusted using the amrex.async_out_nfiles inputs parameter. To use asynchronous IO with more than amrex.async_out_nfiles MPI ranks, WarpX must be configured with -DWarpX_MPI_THREAD_MULTIPLE=ON. Please see the building instructions for details.
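For reference, a minimal inputs-file excerpt enabling asynchronous plotfile output could look like the following (the value 64 simply restates the default mentioned above):

# WarpX inputs file excerpt (illustrative values)
amrex.async_out        = 1    # write AMReX plotfiles asynchronously
amrex.async_out_nfiles = 64   # maximum number of files written per plotfile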
In Situ Capabilities
WarpX includes so-called reduced diagnostics. Reduced diagnostics create observables on-the-fly, such as energy histograms or particle beam statistics and are easily visualized in post-processing.
In addition, WarpX also has in situ visualization capabilities (i.e., visualizing the data directly from the simulation, without dumping data files to disk).
In situ Visualization with SENSEI
SENSEI is a lightweight framework for in situ data analysis. SENSEI’s data model and API provide uniform access to and run-time selection of a diverse set of visualization and analysis back ends including VisIt Libsim, ParaView Catalyst, VTK-m, Ascent, ADIOS, yt, and Python.
SENSEI uses an XML file to select and configure one or more back ends at run time. Run time selection of the back end via XML means one user can access Catalyst, another Libsim, yet another Python with no changes to the code.
System Architecture
SENSEI’s in situ architecture enables the use of a diverse set of back ends, which can be selected at run time via an XML configuration file.
The three major architectural components in SENSEI are data adaptors which present simulation data in SENSEI’s data model, analysis adaptors which present the back end data consumers to the simulation, and bridge code from which the simulation manages adaptors and periodically pushes data through the system. SENSEI comes equipped with a number of analysis adaptors enabling use of popular analysis and visualization libraries such as VisIt Libsim, ParaView Catalyst, Python, and ADIOS to name a few. AMReX contains SENSEI data adaptors and bridge code making it easy to use in AMReX based simulation codes.
SENSEI provides a configurable analysis adaptor which uses an XML file to select and configure one or more back ends at run time. Run time selection of the back end via XML means one user can access Catalyst, another Libsim, yet another Python with no changes to the code. This is depicted in Fig. 23. On the left side of the figure AMReX produces data, the bridge code pushes the data through the configurable analysis adaptor to the back end that was selected at run time.
Compiling with GNU Make
For codes making use of AMReX’s build system add the following variable to the
code’s main GNUmakefile
.
USE_SENSEI_INSITU = TRUE
When set, AMReX’s make files will query environment variables for the lists of compiler and linker flags, include directories, and link libraries. These lists can be quite elaborate when using more sophisticated back ends, and are best set automatically using the sensei_config command line tool that should be installed with SENSEI. Prior to invoking make, use the following command to set these variables:
source sensei_config
Typically, the sensei_config tool is in the user’s PATH after loading the desired SENSEI module. After configuring the build environment with sensei_config, proceed as usual.
make -j4 -f GNUmakefile
ParmParse Configuration
Once an AMReX code has been compiled with SENSEI features enabled, it will need to be enabled and configured at runtime. This is done using ParmParse input file. The supported parameters are described in the following table.
| parameter | description | default |
|---|---|---|
| insitu.int | turns in situ processing on or off and controls how often data is processed. | 0 |
| insitu.start | controls when in situ processing starts. | 0 |
| insitu.config | points to the SENSEI XML file which selects and configures the desired back end. | |
| insitu.pin_mesh | when 1, the lower left corner of the mesh is pinned to 0.,0.,0. | 0 |
A typical use case is to enable SENSEI by setting insitu.int to be greater than 1, and insitu.config to point SENSEI to an XML file that selects and configures the desired back end.
insitu.int = 2
insitu.config = render_iso_catalyst.xml
Back-end Selection and Configuration
The back end is selected and configured at run time using the SENSEI XML file. The XML sets parameters specific to SENSEI and to the chosen back end. Many of the back ends have sophisticated configuration mechanisms which SENSEI makes use of. For example the following XML configuration was used on NERSC’s Cori with WarpX to render 10 iso surfaces, shown in Fig. 24, using VisIt Libsim.
<sensei>
<analysis type="libsim" frequency="1" mode="batch"
session="beam_j_pin.session"
image-filename="beam_j_pin_%ts" image-width="1200" image-height="900"
image-format="png" enabled="1"/>
</sensei>
The session attribute names a session file that contains VisIt specific runtime configuration. The session file is generated using VisIt GUI on a representative dataset. Usually this data set is generated in a low resolution run of the desired simulation.
Rendering of 10 3D iso-surfaces of j using VisIt Libsim. The upper left quadrant has been clipped away to reveal inner structure.
The same run and visualization was repeated using ParaView Catalyst, shown in Fig. 25, by providing the following XML configuration.
<sensei>
<analysis type="catalyst" pipeline="pythonscript"
filename="beam_j.py" enabled="1" />
</sensei>
Here the filename attribute is used to pass Catalyst a Catalyst specific configuration that was generated using the ParaView GUI on a representative dataset.
Rendering of 10 3D iso-surfaces of j using ParaView Catalyst. The upper left quadrant has been clipped away to reveal inner structure.
The renderings in these runs were configured using a representative dataset
which was obtained by running the simulation for a few time steps at a lower
spatial resolution. When using VisIt Libsim the following XML configures the
VTK writer to write the simulation data in VTK format. At the end of the run a
.visit
file that VisIt can open will be generated.
<sensei>
<analysis type="PosthocIO" mode="visit" writer="xml"
ghost_array_name="avtGhostZones" output_dir="./"
enabled="1">
</analysis>
</sensei>
When using ParaView Catalyst the following XML configures the VTK writer to
write the simulation data in VTK format. At the end of the run a .pvd
file that ParaView can open will be generated.
<sensei>
<analysis type="PosthocIO" mode="paraview" writer="xml"
ghost_array_name="vtkGhostType" output_dir="./"
enabled="1">
</analysis>
</sensei>
Obtaining SENSEI
SENSEI is hosted on Kitware’s GitLab site at https://gitlab.kitware.com/sensei/sensei. It’s best to check out the latest release rather than working on the develop branch.
To ease the burden of wrangling back end installs SENSEI provides two platforms with all dependencies pre-installed, a VirtualBox VM, and a NERSC Cori deployment. New users are encouraged to experiment with one of these.
SENSEI VM
The SENSEI VM comes with all of SENSEI’s dependencies and the major back ends such as VisIt and ParaView installed. The VM is the easiest way to test things out. It also can be used to see how installs were done and the environment configured.
The SENSEI VM can be downloaded here.
The SENSEI VM uses modules to manage the build and run environment. Load the SENSEI modulefile for the back-end you wish to use. The following table describes the available installs and which back-ends are supported in each.
| modulefile | back-end(s) |
|---|---|
| sensei/2.1.1-catalyst-shared | ParaView Catalyst, ADIOS, Python |
| sensei/2.1.1-libsim-shared | VisIt Libsim, ADIOS, Python |
| sensei/2.1.1-vtk-shared | VTK-m, ADIOS, Python |
NERSC Cori
SENSEI is deployed at NERSC on Cori. The NERSC deployment includes the major back ends such as ADIOS, ParaView Catalyst, VisIt Libsim, and Python.
The SENSEI installs uses modules to manage the build and run environment. Load the SENSEI modulefile for the back-end you wish to use. The following table describes the available installs and which back-ends are supported in each.
| modulefile | back-end(s) |
|---|---|
| sensei/2.1.0-catalyst-shared | ParaView Catalyst, ADIOS, Python |
| sensei/2.1.0-libsim-shared | VisIt Libsim, ADIOS, Python |
| sensei/2.1.0-vtk-shared | VTK-m, ADIOS, Python |
To access the SENSEI modulefiles on Cori, first add the SENSEI install to the search path:
module use /usr/common/software/sensei/modulefiles
3D LPA Example
This section shows an example of using SENSEI and three different back ends on a 3D LPA simulation. The instructions are specifically for NERSC Cori, but also work with the SENSEI VM. The primary difference between working through the examples on Cori or in the VM is that different versions of the software are installed.
Rendering with VisIt Libsim
First, log into Cori and clone the git repositories.
cd $SCRATCH
mkdir warpx
cd warpx/
git clone https://github.com/ECP-WarpX/WarpX.git WarpX-libsim
git clone https://github.com/AMReX-Codes/amrex
git clone https://github.com/ECP-WarpX/picsar.git
cd WarpX-libsim
vim GNUmakefile
Next, edit the makefile to turn the SENSEI features on.
USE_SENSEI_INSITU=TRUE
Then, load the SENSEI VisIt module, bring SENSEI’s build requirements into the environment, and compile WarpX.
module use /usr/common/software/sensei/modulefiles/
module load sensei/2.1.0-libsim-shared
source sensei_config
make -j8
Download the WarpX input deck, the SENSEI XML configuration, and the VisIt session files. The inputs file configures WarpX, the XML file configures SENSEI, and the session file configures VisIt. The inputs and XML files are written by hand, while the session file is generated in the VisIt GUI on a representative data set.
wget https://data.kitware.com/api/v1/item/5c05d48e8d777f2179d22f20/download -O inputs.3d
wget https://data.kitware.com/api/v1/item/5c05d4588d777f2179d22f16/download -O beam_j_pin.xml
wget https://data.kitware.com/api/v1/item/5c05d4588d777f2179d22f0e/download -O beam_j_pin.session
To run the demo, submit an interactive job to the batch queue, and launch WarpX.
salloc -C haswell -N 1 -t 00:30:00 -q debug
./Bin/main3d.gnu.TPROF.MPI.OMP.ex inputs.3d
Rendering with ParaView Catalyst
First, log into Cori and clone the git repositories.
cd $SCRATCH
mkdir warpx
cd warpx/
git clone https://github.com/ECP-WarpX/WarpX.git WarpX-catalyst
git clone --branch development https://github.com/AMReX-Codes/amrex
git clone https://github.com/ECP-WarpX/picsar.git
cd WarpX-catalyst
vim GNUmakefile
Next, edit the makefile to turn the SENSEI features on.
USE_SENSEI_INSITU=TRUE
Then, load the SENSEI ParaView module, bring SENSEI’s build requirements into the environment, and compile WarpX.
module use /usr/common/software/sensei/modulefiles/
module load sensei/2.1.0-catalyst-shared
source sensei_config
make -j8
Download the WarpX input deck, the SENSEI XML configuration, and the ParaView session files. The inputs file configures WarpX, the XML file configures SENSEI, and the session file configures ParaView. The inputs and XML files are written by hand, while the session file is generated in the ParaView GUI on a representative data set.
wget https://data.kitware.com/api/v1/item/5c05b3fd8d777f2179d2067d/download -O inputs.3d
wget https://data.kitware.com/api/v1/item/5c05b3fd8d777f2179d20675/download -O beam_j.xml
wget https://data.kitware.com/api/v1/item/5c05b3fc8d777f2179d2066d/download -O beam_j.py
To run the demo, submit an interactive job to the batch queue, and launch WarpX.
salloc -C haswell -N 1 -t 00:30:00 -q debug
./Bin/main3d.gnu.TPROF.MPI.OMP.ex inputs.3d
In situ Calculation with Python
SENSEI’s Python back-end loads a user provided script file containing callbacks for the Initialize, Execute, and Finalize phases of the run. During the execute phase, the simulation pushes data through SENSEI. SENSEI forwards this data to the user provided Python function. SENSEI’s MPI communicator is made available to the user’s function via a global variable comm.
Here is a template for the user provided Python code.
# YOUR IMPORTS HERE

# SET DEFAULTS OF GLOBAL VARIABLES THAT INFLUENCE RUNTIME BEHAVIOR HERE

def Initialize():
    """ Initialization code """
    # YOUR CODE HERE
    return

def Execute(dataAdaptor):
    """ Use sensei::DataAdaptor instance passed in
        dataAdaptor to access and process simulation data """
    # YOUR CODE HERE
    return

def Finalize():
    """ Finalization code """
    # YOUR CODE HERE
    return
Initialize and Finalize are optional and will be called if they are provided. Execute is required. SENSEI’s DataAdaptor API is used to obtain data and metadata from the simulation. Data is passed through VTK objects. In WarpX, the vtkOverlappingAMR VTK dataset is used.
The following script shows a simple integration of a scalar quantity over the valid cells of the mesh. The result is saved in a CSV format.
import numpy as np, matplotlib.pyplot as plt
from vtk.util.numpy_support import *
from vtk import vtkDataObject
from mpi4py import MPI
import sys

# default values of control parameters
array = ''
out_file = ''

def Initialize():
    # rank zero writes the result
    if comm.Get_rank() == 0:
        fn = out_file if out_file else 'integrate_%s.csv'%(array)
        f = open(fn, 'w')
        f.write('# time, %s\n'%(array))
        f.close()
    return

def Execute(adaptor):
    # get the mesh and arrays we need
    dobj = adaptor.GetMesh('mesh', False)
    adaptor.AddArray(dobj, 'mesh', vtkDataObject.CELL, array)
    adaptor.AddGhostCellsArray(dobj, 'mesh')
    time = adaptor.GetDataTime()

    # integrate over the local blocks
    varint = 0.
    it = dobj.NewIterator()
    while not it.IsDoneWithTraversal():
        # get the local data block and its props
        blk = it.GetCurrentDataObject()
        # get the array container
        atts = blk.GetCellData()
        # get the data array
        var = vtk_to_numpy(atts.GetArray(array))
        # get ghost cell mask
        ghost = vtk_to_numpy(atts.GetArray('vtkGhostType'))
        ii = np.where(ghost == 0)[0]
        # integrate over valid cells and accumulate over blocks
        varint += np.sum(var[ii])*np.prod(blk.GetSpacing())
        it.GoToNextItem()

    # reduce integral to rank 0
    varint = comm.reduce(varint, root=0, op=MPI.SUM)

    # rank zero writes the result
    if comm.Get_rank() == 0:
        fn = out_file if out_file else 'integrate_%s.csv'%(array)
        f = open(fn, 'a+')
        f.write('%s, %s\n'%(time, varint))
        f.close()
    return
The following XML configures SENSEI’s Python back-end.
<sensei>
<analysis type="python" script_file="./integrate.py" enabled="1">
<initialize_source>
array='rho'
out_file='rho.csv'
</initialize_source>
</analysis>
</sensei>
The script_file
attribute sets the file path to load the user’s Python
code from, and the initialize_source
element contains Python code that
controls runtime behavior specific to each user provided script.
In situ Visualization with Ascent
Ascent is a system designed to meet the in-situ visualization and analysis needs of simulation code teams running multi-physics calculations on many-core HPC architectures. It provides rendering runtimes that can leverage many-core CPUs and GPUs to render images of simulation meshes.
Compiling with GNU Make
After building and installing Ascent according to the instructions at Building Ascent, you can enable support for it in WarpX by changing the line
USE_ASCENT_INSITU=FALSE
in GNUmakefile
to
USE_ASCENT_INSITU=TRUE
Furthermore, you must ensure that either the ASCENT_DIR
shell environment variable contains the directory where Ascent is installed or you must specify this location when invoking make, i.e.,
make -j 8 USE_ASCENT_INSITU=TRUE ASCENT_DIR=/path/to/ascent/install
Inputs File Configuration
Once WarpX has been compiled with Ascent support, it will need to be enabled and configured at runtime.
This is done using our usual inputs file (read with amrex::ParmParse). The supported parameters are part of the FullDiagnostics, with the <diag_name>.format parameter set to ascent.
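A minimal sketch of such a diagnostics block (the diagnostic name diag1 and the output interval below are illustrative) is:

# WarpX inputs file excerpt
diagnostics.diags_names = diag1
diag1.diag_type = Full
diag1.intervals = 100
diag1.format    = ascent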
Visualization/Analysis Pipeline Configuration
Ascent uses the file ascent_actions.yaml
to configure analysis and visualization pipelines.
Ascent looks for the ascent_actions.yaml
file in the current working directory.
For example, the following ascent_actions.yaml
file extracts an isosurface of the field Ex
for 15
levels and saves the resulting images to levels_<nnnn>.png
.
Ascent Actions provides an overview over all available analysis and visualization actions.
-
  action: "add_pipelines"
  pipelines:
    p1:
      f1:
        type: "contour"
        params:
          field: "Ex"
          levels: 15
-
  action: "add_scenes"
  scenes:
    scene1:
      image_prefix: "levels_%04d"
      plots:
        plot1:
          type: "pseudocolor"
          pipeline: "p1"
          field: "Ex"
Here is another ascent_actions.yaml
example that renders isosurfaces and particles:
-
  action: "add_pipelines"
  pipelines:
    p1:
      f1:
        type: "contour"
        params:
          field: "Bx"
          levels: 3
-
  action: "add_scenes"
  scenes:
    scene1:
      plots:
        plot1:
          type: "pseudocolor"
          pipeline: "p1"
          field: "Bx"
        plot2:
          type: "pseudocolor"
          field: "particle_electrons_Bx"
          points:
            radius: 0.0000005
      renders:
        r1:
          camera:
            azimuth: 100
            elevation: 10
          image_prefix: "out_render_3d_%06d"
Finally, here is a more complex ascent_actions.yaml
example that creates the same images as the prior example, but adds a trigger that creates a Cinema Database at cycle 300
:
-
  action: "add_triggers"
  triggers:
    t1:
      params:
        condition: "cycle() == 300"
        actions_file: "trigger.yaml"
-
  action: "add_pipelines"
  pipelines:
    p1:
      f1:
        type: "contour"
        params:
          field: "jy"
          iso_values: [ 1000000000000.0, -1000000000000.0]
-
  action: "add_scenes"
  scenes:
    scene1:
      plots:
        plot1:
          type: "pseudocolor"
          pipeline: "p1"
          field: "jy"
        plot2:
          type: "pseudocolor"
          field: "particle_electrons_w"
          points:
            radius: 0.0000002
      renders:
        r1:
          camera:
            azimuth: 100
            elevation: 10
          image_prefix: "out_render_jy_part_w_3d_%06d"
When the trigger condition is met, cycle() == 300, the actions in trigger.yaml are also executed:
-
  action: "add_pipelines"
  pipelines:
    p1:
      f1:
        type: "contour"
        params:
          field: "jy"
          iso_values: [ 1000000000000.0, -1000000000000.0]
-
  action: "add_scenes"
  scenes:
    scene1:
      plots:
        plot1:
          type: "pseudocolor"
          pipeline: "p1"
          field: "jy"
        plot2:
          type: "pseudocolor"
          field: "particle_electrons_w"
          points:
            radius: 0.0000001
      renders:
        r1:
          type: "cinema"
          phi: 10
          theta: 10
          db_name: "cinema_out"
You can view the Cinema Database result by opening cinema_databases/cinema_out/index.html
.
Replay
With Ascent/Conduit, one can store intermediate data, before the rendering step is applied, to so-called Conduit Blueprint HDF5 files. These files can be “replayed”, i.e., rendered without running the simulation again. VisIt 3.0+ also supports those files.
Replay is a utility that allows the user to replay a simulation from the aforementioned files and render them with Ascent. Replay enables the user or developer to pick specific time steps and load them for Ascent visualization, without running the simulation again.
We will guide you through the replay procedure.
Get Blueprint Files
To use replay, you first need Conduit Blueprint HDF5 files. The following block can be used in an ascent action to extract Conduit Blueprint HDF5 files from a simulation run.
-
  action: "add_extracts"
  extracts:
    e1:
      type: "relay"
      params:
        path: "conduit_blueprint"
        protocol: "blueprint/mesh/hdf5"
The output in the WarpX run directory will look as in the following listing.
The .root
file is a metadata file and the corresponding directory contains the conduit blueprint data in an internal format that is based on HDF5.
conduit_blueprint.cycle_000000/
conduit_blueprint.cycle_000000.root
conduit_blueprint.cycle_000050/
conduit_blueprint.cycle_000050.root
conduit_blueprint.cycle_000100/
conduit_blueprint.cycle_000100.root
In order to select a few time steps after the fact, a so-called cycles file can be created. A cycles file is a simple text file that lists one root file per line, e.g.:
conduit_blueprint.cycle_000100.root
conduit_blueprint.cycle_000050.root
Run Replay
For Ascent Replay, two command line tools are provided in the utilities/replay directory of the Ascent installation.
There are two versions of replay: the MPI-parallel version replay_mpi and a serial version, replay_ser.
Use an MPI-parallel replay with data sets created with MPI-parallel builds of WarpX.
Here we use replay_mpi
as an example.
The options for replay are:
--root: specifies the Blueprint root file to load
--cycles: specifies a text file containing a list of Blueprint root files to load
--actions: specifies the name of the actions file to use (default: ascent_actions.yaml)
Instead of starting a simulation that generates data for Ascent, we now execute replay_ser
/replay_mpi
.
Replay will loop over the files listed in cycles
in the order in which they appear in the cycles file.
For example, for a small data example that fits on a single computer:
./replay_ser --root=conduit_blueprint.cycle_000400.root --actions=ascent_actions.yaml
This will replay the data of WarpX step 400 (“cycle” 400). A whole set of steps can be replayed with the above-mentioned cycles file:
./replay_ser --cycles=warpx_list.txt --actions=ascent_actions.yaml
For larger examples, e.g. on a cluster with Slurm batch system, a parallel launch could look like this:
# one step
srun -n 8 ./replay_mpi --root=conduit_blueprint.cycle_000400.root --actions=ascent_actions.yaml
# multiple steps
srun -n 8 ./replay_mpi --cycles=warpx_list.txt --actions=ascent_actions.yaml
Example Actions
A visualization of the electric field component \(E_x\) (variable: Ex
) with a contour plot and with added particles can be obtained with the following Ascent Action.
This action can be used both in replay as well as in situ runs.
-
  action: "add_pipelines"
  pipelines:
    clipped_volume:
      f0:
        type: "contour"
        params:
          field: "Ex"
          levels: 16
      f1:
        type: "clip"
        params:
          topology: topo  # name of the amr mesh
          multi_plane:
            point1:
              x: 0.0
              y: 0.0
              z: 0.0
            normal1:
              x: 0.0
              y: -1.0
              z: 0.0
            point2:
              x: 0.0
              y: 0.0
              z: 0.0
            normal2:
              x: -0.7
              y: -0.7
              z: 0.0
    sampled_particles:
      f1:
        type: histsampling
        params:
          field: particle_electrons_uz
          bins: 64
          sample_rate: 0.90
      f2:
        type: "clip"
        params:
          topology: particle_electrons  # particle data
          multi_plane:
            point1:
              x: 0.0
              y: 0.0
              z: 0.0
            normal1:
              x: 0.0
              y: -1.0
              z: 0.0
            point2:
              x: 0.0
              y: 0.0
              z: 0.0
            normal2:
              x: -0.7
              y: -0.7
              z: 0.0

# Uncomment this block if you want to create "Conduit Blueprint files" that can
# be used with Ascent "replay" after the simulation run.
# Replay is a workflow to visualize individual steps without running the simulation again.
#-
#  action: "add_extracts"
#  extracts:
#    e1:
#      type: "relay"
#      params:
#        path: "./conduit_blueprint"
#        protocol: "blueprint/mesh/hdf5"

-
  action: "add_scenes"
  scenes:
    scene1:
      plots:
        p0:
          type: "pseudocolor"
          field: "particle_electrons_uz"
          pipeline: "sampled_particles"
        p1:
          type: "pseudocolor"
          field: "Ex"
          pipeline: "clipped_volume"
      renders:
        image1:
          bg_color: [1.0, 1.0, 1.0]
          fg_color: [0.0, 0.0, 0.0]
          image_prefix: "lwfa_Ex_e-uz_%06d"
          camera:
            azimuth: 20
            elevation: 30
            zoom: 2.5
There are more Ascent Actions examples available for you to play.
Workflow
Note
This section is in-progress. TODOs: finalize acceptance testing; update 3D LWFA example
In the preparation of simulations, it is generally useful to run small, under-resolved versions of the planned simulation layout first.
Ascent replay is helpful in the setup of an in situ visualization pipeline during this process.
In the following, a Jupyter-based workflow is shown that can be used to quickly iterate on the design of an ascent_actions.yaml file, repeatedly rendering the same (small) data.
First, run a small simulation, e.g. on a local computer, and create conduit blueprint files (see above).
Second, copy the Jupyter Notebook file ascent_replay_warpx.ipynb
into the simulation output directory.
Third, download and start a Docker container with a prepared Jupyter installation and Ascent Python bindings from the simulation output directory:
docker pull alpinedav/ascent-jupyter:latest
docker run -v$PWD:/home/user/ascent/install-debug/examples/ascent/tutorial/ascent_intro/notebooks/replay -p 8000:8000 -p 8888:8888 -p 9000:9000 -p 10000:10000 -t -i alpinedav/ascent-jupyter:latest
Now, access Jupyter Lab via: http://localhost:8888/lab (password: learn
).
Inside the Jupyter Lab is a replay/
directory, which mounts the outer working directory.
You can now open ascent_replay_warpx.ipynb
and execute all cells.
The last two cells are the replay action that can be quickly iterated: change replay_actions.yaml
cell and execute both.
Note
Keep an eye on the terminal: if a replay action is erroneous, it will show up in the terminal that started the Docker container. (TODO: We might want to catch that inside Python and print it in Jupyter instead.)
If you remove a “key” from the replay action, you might see an error in the AscentViewer. Restart and execute all cells in that case.
If you like the 3D rendering of laser wakefield acceleration
on the WarpX documentation front page (which is
also the avatar of the ECP-WarpX organization), you can find the serial
analysis script video_yt.py
as well
as a parallel analysis script
video_yt.py
used to make a similar
rendering for a beam-driven wakefield simulation, running parallel.
Staggering in Data Output
Warning: currently, quantities in the output file for iteration n
are not all defined at the same physical time due to the staggering in time in WarpX.
The table below provides the physical time at which each quantity in the output file is written, in units of time step, for time step n.

| quantity | staggering |
|---|---|
| E | n |
| B | n |
| j | n-1/2 |
| rho | n |
| position | n |
| momentum | n-1/2 |
yt-project
yt is a Python package that can help in analyzing and visualizing WarpX data (among other data formats). It is convenient to use yt within a Jupyter notebook.
Data Support
yt primarily supports WarpX through plotfiles. There is also support for openPMD HDF5 files in yt (w/o mesh refinement).
Installation
From the terminal, install the latest version of yt:
python3 -m pip install cython
python3 -m pip install --upgrade yt
Alternatively, yt can be installed via their installation script, see yt installation web page.
Visualizing the data
Once data (“plotfiles”) has been created by the simulation, open a Jupyter notebook from the terminal:
jupyter notebook
Then use the following commands in the first cell of the notebook to import yt and load the first plot file:
import yt
ds = yt.load('./diags/plotfiles/plt00000/')
The list of field data and particle data stored can be seen with:
ds.field_list
For a quick start-up, the most useful commands for post-processing can be found
in our Jupyter notebook
Visualization.ipynb
Field data
Field data can be visualized using yt.SlicePlot (see the docstring of this function here). For instance, in order to plot the field Ex in a slice orthogonal to y (1):
yt.SlicePlot( ds, 1, 'Ex', origin='native' )
Note
yt.SlicePlot creates a 2D plot with the same aspect ratio as the physical size of the simulation box. Sometimes this can lead to very elongated plots that are difficult to read. You can modify the aspect ratio with the aspect argument; for instance:
yt.SlicePlot( ds, 1, 'Ex', aspect=1./10 )
Alternatively, the data can be obtained as a numpy array.
For instance, in order to obtain the field jz (on level 0) as a numpy array:
ad0 = ds.covering_grid(level=0, left_edge=ds.domain_left_edge, dims=ds.domain_dimensions)
jz_array = ad0['jz'].to_ndarray()
Particle data
Particle data can be visualized using yt.ParticlePhasePlot
(see the docstring
here).
For instance, in order to plot the particles’ x
and y
positions:
yt.ParticlePhasePlot( ds.all_data(), 'particle_position_x', 'particle_position_y', 'particle_weight')
Alternatively, the data can be obtained as a numpy array.
For instance, in order to obtain the array of position x as a numpy array:
ad = ds.all_data()
x = ad['particle_position_x'].to_ndarray()
Further information
A lot more information can be obtained from the yt documentation, and the corresponding notebook tutorials here.
Out-of-the-box plotting script
A ready-to-use python script for plotting simulation results is available at
plot_parallel.py
. Feel free to
use it out-of-the-box or to modify it to suit your needs.
Dependencies
Most of its dependencies are standard Python packages, that come with a default
Anaconda installation or can be installed with pip
or conda
:
os, sys, argparse, matplotlib, scipy.
Additional dependencies are yt >= 4.0.1
and mpi4py
.
Run serial
Executing the script with
python plot_parallel.py
will loop through plotfiles named plt?????
(e.g., plt00000
, plt00100
etc.)
and save one image per plotfile. For a 2D simulation, a 2D colormap of the Ez
field is plotted by default, with 1/20 of particles of each species (with different colors).
For a 3D simulation, a 2D colormap of the central slices in y is plotted, and particles
are handled the same way.
The script reads command-line options (which field and particle species, rendering with yt or matplotlib, etc.). For the full list of options, run
python plot_parallel.py --help
In particular, option --plot_Ey_max_evolution
shows you how to plot the evolution of
a scalar quantity over time (by default, the max of the Ey field). Feel free to modify it
to plot the evolution of other quantities.
Run parallel
To execute the script in parallel, you can run for instance
mpirun -np 4 python plot_parallel.py --parallel
In this case, MPI ranks will share the plotfiles to process as evenly as possible.
Note that each plotfile is still processed in serial. When option
--plot_Ey_max_evolution
is on, the scalar quantity is gathered to rank 0, and
rank 0 plots the image.
If all dependencies are satisfied, the script can be used on Summit or Cori. For instance, the following batch script illustrates how to submit a post-processing batch job on Cori haswell with some options:
#!/bin/bash
# Copyright 2019 Maxence Thevenet
#
# This file is part of WarpX.
#
# License: BSD-3-Clause-LBNL
#SBATCH --job-name=postproc
#SBATCH --time=00:20:00
#SBATCH -C haswell
#SBATCH -N 8
#SBATCH -q regular
#SBATCH -e postproce.txt
#SBATCH -o postproco.txt
#SBATCH --mail-type=end
#SBATCH --account=m2852
export OMP_NUM_THREADS=1
# Requires python3 and yt > 3.5
srun -n 32 -c 16 python plot_parallel.py --path <path/to/plotfiles> --plotlib=yt --parallel
Advanced Visualization of Plotfiles With yt (for developers)
This section contains yt commands for advanced users. The Particle-In-Cell method uses a
staggered grid (see the particle-in-cell theory), so that the x, y, and z components of the
electric and magnetic fields are all defined at different locations in space. Regular output
(see the yt-project page, or the notebook at WarpX/Tools/PostProcessing/Visualization.ipynb
for an example)
returns cell-centered data for convenience, which involves an additional operation. It is sometimes
useful to access the raw data directly. Furthermore,
the WarpX implementation for mesh refinement contains a number of grids for each level (coarse,
fine and auxiliary, see the theory for more details), and it is sometimes useful to access each of
them (regular output returns the auxiliary grid only). This page provides information on how to read the
raw data of all grids.
Write Raw Data
For a given diagnostic the user has the option to write the raw data by setting <diag_name>.plot_raw_fields = 1
.
Moreover, the user has the option to write also the values of the fields in the guard cells by setting <diag_name>.plot_raw_fields_guards = 1
.
Please refer to Input Parameters for more information.
Read Raw Data
Meta-data relevant to this topic (for example, number and locations of grids in the simulation) are accessed with
import yt
# get yt dataset
ds = yt.load( './plotfiles/plt00004' )
# Index of data in the plotfile
ds_index = ds.index
# Print the number of grids in the simulation
ds_index.grids.shape
# Left and right physical boundary of each grid
ds_index.grid_left_edge
ds_index.grid_right_edge
# List available fields
ds.field_list
When <diag_name>.plot_raw_fields = 1
, here are some useful commands to access properties of a grid and the Ex field on the fine patch:
# store grid number 2 into my_grid
my_grid = ds.index.grids[2]
# Get left and right edges of my_grid
my_grid.LeftEdge
my_grid.RightEdge
# Get Level of my_grid
my_grid.Level
# left edge of the grid, in number of points
my_grid.start_index
Return the Ex
field on the fine patch of grid my_grid
:
my_field = my_grid['raw', 'Ex_fp'].squeeze().v
For a 2D plotfile, my_field
has shape (nx,nz,2)
. The last component stands for the
two values on the edges of each cell for the electric field, due to field staggering. Numpy
function squeeze
removes empty components. While yt
arrays are unit-aware, it is
sometimes useful to extract the data into unitless numpy arrays. This is achieved with .v
.
In the case of Ex_fp
, the staggering is on direction x
, so that
my_field[:,:-1,1] == my_field[:,1:,0]
.
All combinations of the fields (E
or B
), the component (x
, y
or z
) and the
grid (_fp
for fine, _cp
for coarse and _aux
for auxiliary) can be accessed in this
way, i.e., my_grid['raw', 'Ey_aux']
or my_grid['raw', 'Bz_cp']
are valid queries.
Read Raw Data With Guard Cells
When the output includes the data in the guard cells, the user can read such data using the post-processing tool read_raw_data.py
, available in Tools/PostProcessing/
, as illustrated in the following example:
from read_raw_data import read_data
# Load all data saved in a given path
path = './diags/diag00200/'
data = read_data(path)
# Load Ex_fp on mesh refinement level 0
level = 0
field = 'Ex_fp'
# data[level] is a dictionary, data[level][field] is a numpy array
my_field = data[level][field]
Note that a list of all available raw fields written to output, that is, a list of all valid strings that the variable field
in the example above can be assigned to, can be obtained by calling data[level].keys()
.
In order to plot a 2D slice of the data with methods like matplotlib.axes.Axes.imshow
, one might want to pass the correct extent
(the bounding box in data coordinates that the image will fill), including the guard cells. One way to set the correct extent
is illustrated in the following example (case of a 2D slice in the (x,z)
plane):
import yt
import numpy as np
from read_raw_data import read_data
# Load all data saved in a given path
path = './diags/diag00200/'
data = read_data(path)
# Load Ex_fp on mesh refinement level 0
level = 0
field = 'Ex_fp'
# data[level] is a dictionary, data[level][field] is a numpy array
my_field = data[level][field]
# Set the number of cells in the valid domain
# by loading the standard output data with yt
ncells = yt.load(path).domain_dimensions
# Set the number of dimensions automatically (2D or 3D)
dim = 2 if (ncells[2] == 1) else 3
xdir = 0
zdir = 1 if (dim == 2) else 2
# Set the extent (bounding box in data coordinates, including guard cells)
# to be passed to matplotlib.axes.Axes.imshow
left_edge_x = 0 - (my_field.shape[xdir] - ncells[xdir]) // 2
right_edge_x = ncells[xdir] + (my_field.shape[xdir] - ncells[xdir]) // 2
left_edge_z = 0 - (my_field.shape[zdir] - ncells[zdir]) // 2
right_edge_z = ncells[zdir] + (my_field.shape[zdir] - ncells[zdir]) // 2
extent = np.array([left_edge_z, right_edge_z, left_edge_x, right_edge_x])
openPMD-viewer
openPMD-viewer is an open-source Python package to access openPMD data.
It allows you to:
Quickly browse through the data, with a GUI-type interface in the Jupyter notebook
Access the data as numpy arrays, for more detailed analysis
Installation
openPMD-viewer can be installed via conda
or pip
:
conda install -c conda-forge openpmd-viewer openpmd-api
python3 -m pip install openPMD-viewer openPMD-api
Usage
openPMD-viewer can be used either in simple Python scripts or in Jupyter. For interactive plots in Jupyter notebook or Jupyter Lab, add this “cell magic” to the first line of your notebook:
%matplotlib widget
If none of those work, e.g. because ipympl is not properly installed, you can as a last resort always try %matplotlib inline
for non-interactive plots.
In both interactive and scripted usage, you can import openPMD-viewer, and load the data with the following commands:
from openpmd_viewer import OpenPMDTimeSeries
ts = OpenPMDTimeSeries('./diags/diag1/')
Note
If you are using the Jupyter notebook, then you can start a pre-filled notebook, which already contains the above lines, by typing in a terminal:
openPMD_notebook
When using the Jupyter notebook, you can quickly browse through the data by using the command:
ts.slider()
You can also access the particle and field data as numpy arrays with the methods ts.get_field and ts.get_particle.
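For example, continuing from the ts object created above (the field name Ez and the species name electrons below are illustrative and must match your diagnostics):

# pick the last output iteration
iteration = ts.iterations[-1]

# get a field as a numpy array, plus axis metadata
Ez, info = ts.get_field(field='Ez', iteration=iteration)

# get particle quantities as numpy arrays
x, uz = ts.get_particle(var_list=['x', 'uz'], species='electrons', iteration=iteration)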
See the openPMD-viewer tutorials here for more info.
openPMD-api
openPMD-api is an open-source C++ and Python API for openPMD data.
Please see the openPMD-api manual for a quick introduction; a minimal reading sketch follows below.
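As a minimal sketch (the file pattern diags/diag1/openpmd_%T.h5 is an assumption and must be adapted to your diagnostics settings and backend), the Python bindings can be used to list what a series contains:

import openpmd_api as io

# open an existing openPMD series for reading; adjust the file pattern
series = io.Series("diags/diag1/openpmd_%T.h5", io.Access.read_only)

# print the meshes available in each output iteration
for index, it in series.iterations.items():
    mesh_names = [name for name, _ in it.meshes.items()]
    print("iteration", index, "contains meshes:", mesh_names)

del series  # closes the underlying files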
3D Visualization: ParaView
WarpX results can be visualized by ParaView, an open source visualization and analysis software. ParaView can be downloaded and installed from https://www.paraview.org. Use the latest version for best results.
Tutorials
ParaView is a powerful, general parallel rendering program. If this is your first time using ParaView, consider starting with a tutorial.
openPMD
WarpX’ openPMD files can be visualized with ParaView 5.9+. ParaView supports ADIOS1, ADIOS2 and HDF5 files, as it implements (like WarpX) against openPMD-api.
For openPMD output, WarpX automatically creates a .pmd file per diagnostic, which can be opened with ParaView.
Tip
When you first open ParaView, adjust its global Settings (Linux: under menu item Edit): General -> Advanced -> Search for data -> Data Processing Options. Check the box Auto Convert Properties.
This will simplify application of filters, e.g., contouring of components of vector fields, without first adding a calculator that extracts a single component or magnitude.
Warning
WarpX issue 21162:
We currently load WarpX field data with a rotation.
Please apply rotation of 0 -90 0
to mesh data.
Warning
ParaView issue 21837:
In order to visualize particle traces with the Temporal Particles To Pathlines
, you need to apply the Merge Blocks
filter first.
If you have multiple species, you may have to extract the species you want with Extract Block
before applying Merge Blocks
.
Plotfiles (AMReX)
ParaView also supports visualizing AMReX plotfiles. Please see the AMReX documentation for more details.
3D Visualization: VisIt
WarpX results can be visualized by VisIt, an open source visualization and analysis software. VisIt can be downloaded and installed from https://wci.llnl.gov/simulation/computer-codes/visit.
openPMD (HDF5)
WarpX’ openPMD files can be visualized with VisIt 3.1.0+.
VisIt supports openPMD HDF5 files, but requires the files to be renamed from .h5 to .opmd in order to be automatically detected.
Plotfiles (AMReX)
Assuming that you ran a 2D simulation, here are instructions for making a simple plot from a given plotfile:
Open the header file: Run VisIt, then select "File" -> "Open file …", then select the Header file associated with the plotfile of interest (e.g., plt10000/Header).
View the data: Select "Add" -> "Pseudocolor" -> "Ez" and select "Draw". You can select other variables to draw, such as jx, jy, jz, Ex, …
View the grid structure: Select "Subset" -> "levels". Then double click the text "Subset-levels", enable the "Wireframe" option, select "Apply", select "Dismiss", and then select "Draw".
Save the image: Select "File" -> "Set save options", then customize the image format to your liking, then click "Save".
Your image should look similar to the one below

In 3D, you must apply the "Operators" -> "Slicing" -> "ThreeSlice" operator. You can left-click and drag over the image to rotate it and generate the view you like.
To make a movie, you must first create a text file named movie.visit
with a
list of the Header files for the individual frames.
The next step is to run VisIt, select “File” -> “Open file
…”, then select movie.visit
. Create an image to your liking and press the
“play” button on the VCR-like control panel to preview all the frames. To save
the movie, choose “File” -> “Save movie …”, and follow the instructions on the screen.
VisualPIC
VisualPIC is an open-source Python GUI for visual data analysis, especially for advanced accelerator simulations. It supports WarpX’ data through openPMD files.
Installation
mamba install -c conda-forge python vtk pyvista pyqt
python3 -m pip install git+https://github.com/AngelFP/VisualPIC.git@dev
Usage
VisualPIC provides a Python data reader API and plotting capabilities. It is designed for small to medium-size data sets that fit in the RAM of a single computer.
Plotting can be performed via command line tools or scripted with Python. The command line tools are:
vpic [options] <path/to/diagnostics/>: 2D matplotlib plotter, e.g., for particle phase space
vpic3d [options] <path/to/diagnostics/>: 3D VTK renderer
Example: vpic3d -s beam -rho -Ez diags/diag1/
could be used to visualize the witness beam, plasma density, and accelerating field of an LWFA.
Example: vpic3d -Ex diags/diag1/
could be used to visualize the transverse focusing field \(E_x\) in a plasma wake behind a laser pulse (linearly polarized in \(E_y\)), see below:

The Python script controlled rendering allows more flexible options, such as selecting and cutting views, rendering directly into an image file, looping for animations, etc. As with matplotlib scripts, Python script scenes can also be used to open a GUI and then browse time series interactively. The VisualPIC examples provide showcases for scripting.
Repository
The source code can be found at https://github.com/AngelFP/VisualPIC.
PICViewer

PICViewer is a visualization GUI implemented on PyQt. The toolkit provides various easy-to-use functions for data analysis of Warp/WarpX simulations.
It works for both plotfiles and openPMD files.
Main features
2D/3D openPMD or WarpX data visualization,
Multi-plot panels (up to 6 rows x 5 columns) which can be controlled independently or synchronously
Interactive mouse functions (panel selection, image zoom-in, local data selection, etc)
Animation from a single or multiple panel(s)
Saving your job configuration and loading it later
Interface to use VisIt, yt, or mayavi for 3D volume rendering (currently updating)
Required software
python 2.7 or higher: http://docs.continuum.io/anaconda/install.
PyQt5
conda install pyqt
h5py
matplotlib
numpy
yt
python3 -m pip install git+https://github.com/yt-project/yt.git --user
numba
Installation
python3 -m pip install picviewer
You need to install yt and PySide separately.
You can install from the source for the latest update,
python3 -m pip install git+https://bitbucket.org/ecp_warpx/picviewer/
To install manually
Clone this repository
git clone https://bitbucket.org/ecp_warpx/picviewer/
Switch to the cloned directory with cd picviewer and type python setup.py install
To run
You can start PICViewer from any directory. Type picviewer in the command line. Select a folder where your data files are located.
You can directly open your data: move to the folder where your data files are located (cd [your data folder]) and type picviewer in the command line.
Note
We currently seek a new maintainer for PICViewer. Please contact us if you are interested.
Reduced diagnostics
WarpX has optional reduced diagnostics that typically return one value (e.g., particle energy) per time step.
A simple and quick way to read the data using Python is
data = numpy.genfromtxt("filename.txt")
where data
is a two dimensional array, data[i][j]
gives the data in the ith row and the jth column.
A Python function to read the data is available from module read_raw_data
in WarpX/Tools/PostProcessing/
:
from read_raw_data import read_reduced_diags
filename = 'EF.txt'
metadata, data = read_reduced_diags( filename )
# list available diagnostics
data.keys()
# Print total field energy on level 0
data['total_lev0']
# Print units for the total field energy on level 0
metadata['units']['total_lev0']
In addition, for the reduced diagnostic type ParticleHistogram, another Python function is available:
from read_raw_data import read_reduced_diags_histogram
filename = 'velocity_distribution.txt'
metadata_dict, data_dict, bin_value, bin_data = read_reduced_diags_histogram( filename )
# 1-D array of the ith bin value
bin_value[i]
# 2-D array of the jth bin data at the ith time
bin_data[i][j]
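For a quick visual check, the arrays returned above can be plotted directly with matplotlib. The following is a minimal sketch; the file name and the choice of the last time index are assumptions made for illustration, and read_raw_data must be importable (e.g., by adding WarpX/Tools/PostProcessing/ to PYTHONPATH).
# Minimal sketch: plot one time slice of a ParticleHistogram reduced diagnostic
import matplotlib.pyplot as plt
from read_raw_data import read_reduced_diags_histogram

metadata_dict, data_dict, bin_value, bin_data = read_reduced_diags_histogram('velocity_distribution.txt')
i = -1  # hypothetical choice: the last recorded time step
plt.plot(bin_value, bin_data[i])
plt.xlabel('bin value')
plt.ylabel('bin data')
plt.savefig('velocity_distribution.png')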
Another available reduced diagnostic is ParticleHistogram2D. It computes a 2D histogram of particle data with user-specified axes and value functions. The output data is stored in openPMD files gathered in a hist2D/ folder.
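That openPMD output can be inspected with any openPMD-compatible reader. Below is a minimal sketch using the openPMD-api Python bindings; the file pattern (folder, prefix and backend) and the assumption of a scalar 2D record are illustrative and should be adapted to the settings in the input file.
# Minimal sketch: inspect ParticleHistogram2D openPMD output with openPMD-api
import openpmd_api as io

# hypothetical file pattern; adjust the folder, prefix and backend (.h5/.bp)
series = io.Series('diags/hist2D/openpmd_%T.h5', io.Access.read_only)
for index, it in series.iterations.items():
    for name, mesh in it.meshes.items():
        rc = mesh[io.Mesh_Record_Component.SCALAR]  # assuming a scalar 2D record
        hist = rc.load_chunk()
        series.flush()
        print(index, name, hist.shape)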
Workflows
This section collects typical user workflows and best practices for data analysis with WarpX.
Port Tunneling
SSH port tunneling (port forwarding) is a secure way to access a computational service of a remote computer. A typical workflow where you might need port tunneling is for Jupyter data analysis, e.g., when analyzing data on your desktop computer but working from your laptop.
Before getting started here, please note that many HPC centers offer a pre-installed Jupyter service, where no tunnel is needed. For example, see the NERSC Jupyter and OLCF Jupyter services.
Introduction
When running a service such as Jupyter from your command line, it will open a local (web) port.
The IPv4 address of your local computer is always 127.0.0.1, also known by the alias localhost.
As a secure default, you cannot connect from outside your local computer to this port. This prevents misconfigurations where one could, in the worst case, connect to your open port without authentication and execute commands with your user privileges.
One way to access your remote Jupyter desktop service from your laptop is to forward the port started remotely via an encrypted SSH connection to a local port on your current laptop. The following section will explain the detailed workflow.
Workflow
you connect via SSH to your desktop at work, in a terminal (A) as usual
e.g., ssh username@your-computers-hostname.dhcp.lbl.gov
start Jupyter locally in headless mode, e.g.,
jupyter lab --no-browser
this will show you a 127.0.0.1 (aka localhost) URL, by default on TCP port 8888
you cannot reach that URL with your browser, because you are not sitting at that computer
You now start a second terminal (B) locally, which forwards the remote port 8888 to your local laptop
this step must be done after Jupyter has been started on the desktop
ssh -L <laptop-port>:<IP-as-seen-on-desktop>:<desktop-port> <desktop-ip> -N
concretely:
ssh -L 8888:localhost:8888 your-computers-hostname.dhcp.lbl.gov -N
(a convenient ~/.ssh/config shortcut for this command is sketched at the end of this workflow)
note: Jupyter on the desktop will increase the port number if 8888 is already in use.
note: take another port on your laptop if local Jupyter instances are still running there
Now open the browser on your local laptop and open the URL from Jupyter that contains .../127.0.0.1:8888/...
To close the connection down, do this:
stop Jupyter in terminal A: Ctrl+C and confirm with y, Enter
Ctrl+C the SSH tunnel in terminal B
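As a convenience, the port forwarding can also be stored in your local ~/.ssh/config, so that the tunnel can be re-established with one short command. The following is a sketch only; the host alias is a hypothetical name, and the host name, user and ports must be adapted to your machines.
# ~/.ssh/config on the laptop (sketch)
Host desktop-jupyter
    HostName your-computers-hostname.dhcp.lbl.gov
    User username
    LocalForward 8888 localhost:8888
With this entry in place, ssh -N desktop-jupyter replaces the long ssh -L command above.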

Theory
Introduction

Plasma laser-driven (top) and charged-particles-driven (bottom) acceleration (rendering from 3-D Particle-In-Cell simulations). A laser beam (red and blue disks in top picture) or a charged particle beam (red dots in bottom picture) propagating (from left to right) through an under-dense plasma (not represented) displaces electrons, creating a plasma wakefield that supports very high electric fields (pale blue and yellow). These electric fields, which can be orders of magnitude larger than with conventional techniques, can be used to accelerate a short charged particle beam (white) to high-energy over a very short distance.
Computer simulations have had a profound impact on the design and understanding of past and present plasma acceleration experiments [1, 2, 3, 4, 5]. Accurate modeling of wake formation, electron self-trapping and acceleration require fully kinetic methods (usually Particle-In-Cell) using large computational resources due to the wide range of space and time scales involved. Numerical modeling complements and guides the design and analysis of advanced accelerators, and can reduce development costs significantly. Despite the major recent experimental successes [6, 7, 8, 9], the various advanced acceleration concepts need significant progress to fulfill their potential. To this end, large-scale simulations will continue to be a key component toward reaching a detailed understanding of the complex interrelated physics phenomena at play.
For such simulations, the most popular algorithm is the Particle-In-Cell (or PIC) technique, which represents electromagnetic fields on a grid and particles by a sample of macroparticles. However, these simulations are extremely computationally intensive, due to the need to resolve the evolution of a driver (laser or particle beam) and an accelerated beam into a structure that is orders of magnitude longer and wider than the accelerated beam. Various techniques or reduced models have been developed to allow multidimensional simulations at manageable computational costs: quasistatic approximation [10, 11, 12, 13, 14], ponderomotive guiding center (PGC) models [11, 12, 14, 15, 16], simulation in an optimal Lorentz boosted frame [17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29], expanding the fields into a truncated series of azimuthal modes [30, 31, 32, 33, 34], fluid approximation [12, 15, 35] and scaled parameters [4, 36].
F. S. Tsung, W. Lu, M. Tzoufras, W. B. Mori, C. Joshi, J. M. Vieira, L. O. Silva, and R. A. Fonseca. Simulation Of Monoenergetic Electron Generation Via Laser Wakefield Accelerators For 5-25 TW Lasers. Physics of Plasmas, 13(5):56708, May 2006. doi:10.1063/1.2198535.
C. G. R. Geddes, D. L. Bruhwiler, J. R. Cary, W. B. Mori, J.-L. Vay, S. F. Martins, T. Katsouleas, E. Cormier-Michel, W. M. Fawley, C. Huang, X. Wang, B. Cowan, V. K. Decyk, E. Esarey, R. A. Fonseca, W. Lu, P. Messmer, P. Mullowney, K. Nakamura, K. Paul, G. R. Plateau, C. B. Schroeder, L. O. Silva, C. Toth, F. S. Tsung, M. Tzoufras, T. Antonsen, J. Vieira, and W. P. Leemans. Computational Studies And Optimization Of Wakefield Accelerators. In Journal of Physics: Conference Series, volume 125, 012002 (11 Pp.). 2008.
C. G.R. Geddes, E. Cormier-Michel, E. H. Esarey, C. B. Schroeder, J.-L. Vay, W. P. Leemans, D. L. Bruhwiler, J. R. Cary, B. Cowan, M. Durant, P. Hamill, P. Messmer, P. Mullowney, C. Nieter, K. Paul, S. Shasharina, S. Veitzer, G. Weber, O. Rubel, D. Ushizima, W. Bethel, and J. Wu. Laser Plasma Particle Accelerators: Large Fields For Smaller Facility Sources. In Scidac Review 13, number 13, 13–21. 2009. URL: https://www.osti.gov/biblio/971264.
C. G. R. Geddes, E. Cormier-Michel, E. Esarey, C. B. Schroeder, and W. P. Leemans. Scaled Simulation Design Of High Quality Laser Wakefield Accelerator Stages. In Proc. Particle Accelerator Conference. Vancouver, Canada, 2009.
C. Huang, W. An, V. K. Decyk, W. Lu, W. B. Mori, F. S. Tsung, M. Tzoufras, S. Morshed, T. Antonsen, B. Feng, T. Katsouleas, R. A. Fonseca, S. F. Martins, J. Vieira, L. O. Silva, E. Esarey, C. G. R. Geddes, W. P. Leemans, E. Cormier-Michel, J.-L. Vay, D. L. Bruhwiler, B. Cowan, J. R. Cary, and K. Paul. Recent Results And Future Challenges For Large Scale Particle-In-Cell Simulations Of Plasma-Based Accelerator Concepts. Journal of Physics: Conference Series, 180(1):012005 (11 Pp.), 2009.
W. P. Leemans, A. J. Gonsalves, H.-S. Mao, K. Nakamura, C. Benedetti, C. B. Schroeder, Cs. Tóth, J. Daniels, D. E. Mittelberger, S. S. Bulanov, J.-L. Vay, C. G. R. Geddes, and E. Esarey. Multi-GeV Electron Beams from Capillary-Discharge-Guided Subpetawatt Laser Pulses in the Self-Trapping Regime. Phys. Rev. Lett., 113(24):245002, Dec 2014. URL: http://link.aps.org/doi/10.1103/PhysRevLett.113.245002, doi:10.1103/PhysRevLett.113.245002.
I. Blumenfeld, C. E. Clayton, F.-J. Decker, M. J. Hogan, C. Huang, R. Ischebeck, R. Iverson, C. Joshi, T. Katsouleas, N. Kirby, W. Lu, K. A. Marsh, W. B. Mori, P. Muggli, E. Oz, R. H. Siemann, D. Walz, and M. Zhou. Energy doubling of 42 GeV electrons in a metre-scale plasma wakefield accelerator. Nature, 445(7129):741–744, Feb 2007. URL: http://dx.doi.org/10.1038/nature05538.
S. V. Bulanov, J. J. Wilkens, T. Z. Esirkepov, G. Korn, G. Kraft, S. D. Kraft, M. Molls, and V. S. Khoroshkov. Laser ion acceleration for hadron therapy. Physics-Uspekhi, 57(12):1149, 2014. URL: http://stacks.iop.org/1063-7869/57/i=12/a=1149.
S. Steinke, J. van Tilborg, C. Benedetti, C. G. R. Geddes, C. B. Schroeder, J. Daniels, K. K. Swanson, A. J. Gonsalves, K. Nakamura, N. H. Matlis, B. H. Shaw, E. Esarey, and W. P. Leemans. Multistage coupling of independent laser-plasma accelerators. Nature, 530(7589):190–193, Feb 2016. URL: http://dx.doi.org/10.1038/nature16525, doi:10.1038/nature16525.
P. Sprangle, E. Esarey, and A. Ting. Nonlinear theory of intense laser-plasma interactions. Physical Review Letters, 64(17):2011–2014, Apr 1990.
T. M. Antonsen and P. Mora. Self-Focusing And Raman-Scattering Of Laser-Pulses In Tenuous Plasmas. Physical Review Letters, 69(15):2204–2207, Oct 1992. doi:10.1103/Physrevlett.69.2204.
J. Krall, A. Ting, E. Esarey, and P. Sprangle. Enhanced Acceleration In A Self-Modulated-Laser Wake-Field Accelerator. Physical Review E, 48(3):2157–2161, Sep 1993. doi:10.1103/Physreve.48.2157.
P. Mora and T. M. Antonsen. Kinetic Modeling Of Intense, Short Laser Pulses Propagating In Tenuous Plasmas. Phys. Plasmas, 4(1):217–229, Jan 1997. doi:10.1063/1.872134.
C. Huang, V. K. Decyk, C. Ren, M. Zhou, W. Lu, W. B. Mori, J. H. Cooley, T. M. Antonsen, Jr, and T. Katsouleas. Quickpic: A Highly Efficient Particle-In-Cell Code For Modeling Wakefield Acceleration In Plasmas. Journal of Computational Physics, 217(2):658–679, Sep 2006. doi:10.1016/J.Jcp.2006.01.039.
C. Benedetti, C. B. Schroeder, E. Esarey, C. G. R. Geddes, and W. P. Leemans. Efficient Modeling Of Laser-Plasma Accelerators With Inf&Rno. Aip Conference Proceedings, 1299:250–255, 2010. doi:10.1063/1.3520323.
B. M. Cowan, D. L. Bruhwiler, E. Cormier-Michel, E. Esarey, C. G. R. Geddes, P. Messmer, and K. M. Paul. Characteristics Of An Envelope Model For Laser-Plasma Accelerator Simulation. Journal of Computational Physics, 230(1):61–86, 2011. doi:Doi: 10.1016/J.Jcp.2010.09.009.
J.-L. Vay. Noninvariance Of Space- And Time-Scale Ranges Under A Lorentz Transformation And The Implications For The Study Of Relativistic Interactions. Physical Review Letters, 98(13):130405/1–4, 2007.
D. L. Bruhwiler, J. R. Cary, B. M. Cowan, K. Paul, C. G. R. Geddes, P. J. Mullowney, P. Messmer, E. Esarey, E. Cormier-Michel, W. Leemans, and J.-L. Vay. New Developments In The Simulation Of Advanced Accelerator Concepts. In Aip Conference Proceedings, volume 1086, 29–37. 2009.
J.-L. Vay, D. L. Bruhwiler, C. G. R. Geddes, W. M. Fawley, S. F. Martins, J. R. Cary, E. Cormier-Michel, B. Cowan, R. A. Fonseca, M. A. Furman, W. Lu, W. B. Mori, and L. O. Silva. Simulating Relativistic Beam And Plasma Systems Using An Optimal Boosted Frame. Journal of Physics: Conference Series, 180(1):012006 (5 Pp.), 2009.
J.-L. Vay, W. M. Fawley, C. G. R. Geddes, E. Cormier-Michel, and D. P. Grote. Application of the reduction of scale range in a Lorentz boosted frame to the numerical simulation of particle acceleration devices. In Proc. Particle Accelerator Conference. Vancouver, Canada, 2009.
S. F. Martins, R. A. Fonseca, L. O. Silva, and W. B. Mori. Boosted Frame PIC Simulations of LWFA: Towards the Energy Frontier. In Proc. Particle Accelerator Conference. Vancouver, Canada, 2009.
J.-L. Vay, C. G. R. Geddes, C. Benedetti, D. L. Bruhwiler, E. Cormier-Michel, B. M. Cowan, J. R. Cary, and D. P. Grote. Modeling Laser Wakefield Accelerators In A Lorentz Boosted Frame. AIP Conference Proceedings, 1299(1):244–249, Nov 2010. URL: https://doi.org/10.1063/1.3520322, doi:10.1063/1.3520322.
S. F. Martins, R. A. Fonseca, W. Lu, W. B. Mori, and L. O. Silva. Exploring Laser-Wakefield-Accelerator Regimes For Near-Term Lasers Using Particle-In-Cell Simulation In Lorentz-Boosted Frames. Nature Physics, 6(4):311–316, Apr 2010. doi:10.1038/Nphys1538.
S. F. Martins, R. A. Fonseca, J. Vieira, L. O. Silva, W. Lu, and W. B. Mori. Modeling Laser Wakefield Accelerator Experiments With Ultrafast Particle-In-Cell Simulations In Boosted Frames. Physics of Plasmas, 17(5):56705, May 2010. doi:10.1063/1.3358139.
S. F. Martins, R. A. Fonseca, L. O. Silva, W. Lu, and W. B. Mori. Numerical Simulations Of Laser Wakefield Accelerators In Optimal Lorentz Frames. Computer Physics Communications, 181(5):869–875, May 2010. doi:10.1016/J.Cpc.2009.12.023.
J.-L. Vay, C. G. R. Geddes, E. Cormier-Michel, and D. P. Grote. Numerical Methods For Instability Mitigation In The Modeling Of Laser Wakefield Accelerators In A Lorentz-Boosted Frame. Journal of Computational Physics, 230(15):5908–5929, Jul 2011. doi:10.1016/J.Jcp.2011.04.003.
J.-L. Vay, C. G. R. Geddes, E. Cormier-Michel, and D. P. Grote. Effects of hyperbolic rotation in Minkowski space on the modeling of plasma accelerators in a Lorentz boosted frame. Physics of Plasmas, 18(3):030701, Mar 2011. URL: https://doi.org/10.1063/1.3559483, doi:10.1063/1.3559483.
J.-L. Vay, C. G. R. Geddes, E. Esarey, C. B. Schroeder, W. P. Leemans, E. Cormier-Michel, and D. P. Grote. Modeling Of 10 GeV-1 TeV Laser-Plasma Accelerators Using Lorentz Boosted Simulations. Physics of Plasmas, Dec 2011. doi:10.1063/1.3663841.
P. Yu, X. Xu, A. Davidson, A. Tableman, T. Dalichaouch, F. Li, M. D. Meyers, W. An, F. S. Tsung, V. K. Decyk, F. Fiuza, J. Vieira, R. A. Fonseca, W. Lu, L. O. Silva, and W. B. Mori. Enabling Lorentz boosted frame particle-in-cell simulations of laser wakefield acceleration in quasi-3D geometry. Journal of Computational Physics, 2016. doi:10.1016/j.jcp.2016.04.014.
B. B. Godfrey. The IPROP Three-Dimensional Beam Propagation Code. Defense Technical Information Center, 1985.
A. F. Lifschitz, X. Davoine, E. Lefebvre, J. Faure, C. Rechatin, and V. Malka. Particle-in-Cell modelling of laser-plasma interaction using Fourier decomposition. Journal of Computational Physics, 228(5):1803–1814, 2009. URL: http://www.sciencedirect.com/science/article/pii/S0021999108005950, doi:http://dx.doi.org/10.1016/j.jcp.2008.11.017.
A. Davidson, A. Tableman, W. An, F. S. Tsung, W. Lu, J. Vieira, R. A. Fonseca, L. O. Silva, and W. B. Mori. Implementation of a hybrid particle code with a PIC description in r–z and a gridless description in \(\phi\) into OSIRIS. Journal of Computational Physics, 281:1063–1077, 2015. doi:10.1016/j.jcp.2014.10.064.
R. Lehe, M. Kirchen, I. A. Andriyash, B. B. Godfrey, and J.-L. Vay. A spectral, quasi-cylindrical and dispersion-free Particle-In-Cell algorithm. Computer Physics Communications, 203:66–82, 2016. doi:10.1016/j.cpc.2016.02.007.
I. A. Andriyash, R. Lehe, and A. Lifschitz. Laser-plasma interactions with a Fourier-Bessel particle-in-cell method. Physics of Plasmas, 23(3):, 2016. doi:10.1063/1.4943281.
B. A. Shadwick, C. B. Schroeder, and E. Esarey. Nonlinear Laser Energy Depletion In Laser-Plasma Accelerators. Physics of Plasmas, 16(5):56704, May 2009. doi:10.1063/1.3124185.
E. Cormier-Michel, C. G. R. Geddes, E. Esarey, C. B. Schroeder, D. L. Bruhwiler, K. Paul, B. Cowan, and W. P. Leemans. Scaled Simulations Of A 10 GeV Accelerator. In Aip Conference Proceedings, volume 1086, 297–302. 2009.
Particle-in-Cell Method
![Overview of the Particle-In-Cell (PIC) method](_images/PIC.png)
The Particle-In-Cell (PIC) method follows the evolution of a collection of charged macro-particles (positively charged in blue on the left plot, negatively charged in red) that evolve self-consistently with their electromagnetic (or electrostatic) fields. The core PIC algorithm involves four operations at each time step: 1) evolve the velocity and position of the particles using the Newton-Lorentz equations, 2) deposit the charge and/or current densities through interpolation from the particles distributions onto the grid, 3) evolve Maxwell’s wave equations (for electromagnetic) or solve Poisson’s equation (for electrostatic) on the grid, 4) interpolate the fields from the grid onto the particles for the next particle push. Additional “add-ons” operations are inserted between these core operations to account for additional physics (e.g. absorption/emission of particles, addition of external forces to account for accelerator focusing or accelerating component) or numerical effects (e.g. smoothing/filtering of the charge/current densities and/or fields on the grid).
In the electromagnetic particle-in-cell method [1, 2], the electromagnetic fields are solved on a grid, usually using Maxwell’s equations
given here in natural units (\(\epsilon_0=\mu_0=c=1\)), where \(t\) is time, \(\mathbf{E}\) and \(\mathbf{B}\) are the electric and magnetic field components, and \(\rho\) and \(\mathbf{J}\) are the charge and current densities. The charged particles are advanced in time using the Newton-Lorentz equations of motion
where \(m\), \(q\), \(\mathbf{x}\), \(\mathbf{v}\) and \(\gamma=1/\sqrt{1-v^{2}}\) are respectively the mass, charge, position, velocity and relativistic factor of the particle given in natural units (\(c=1\)). The charge and current densities are interpolated on the grid from the particles’ positions and velocities, while the electric and magnetic field components are interpolated from the grid to the particles’ positions for the velocity update.
Particle push
A centered finite-difference discretization of the Newton-Lorentz equations of motion is given by
In order to close the system, \(\bar{\mathbf{v}}^{i}\) must be expressed as a function of the other quantities. The two implementations that have become the most popular are presented below.
Boris relativistic velocity rotation
The solution proposed by Boris [3] is given by
where \(\bar{\gamma}^{i}\) is defined by \(\bar{\gamma}^{i} \equiv (\gamma^{i+1/2}+\gamma^{i-1/2} )/2\).
The system (8, 9) is solved very efficiently following Boris’ method, where the electric field push is decoupled from the magnetic push. Setting \(\mathbf{u}=\gamma\mathbf{v}\), the velocity is updated using the following sequence:
where \(\mathbf{t}=\left(q\Delta t/2m\right)\mathbf{B}^{i}/\bar{\gamma}^{i}\) and where \(\bar{\gamma}^{i}\) can be calculated as \(\bar{\gamma}^{i}=\sqrt{1+(\mathbf{u}^-/c)^2}\).
The Boris implementation is second-order accurate, time-reversible and fast. Its implementation is very widespread and used in the vast majority of PIC codes.
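For concreteness, the velocity update sequence above can be sketched in a few lines of Python/NumPy. This is an illustration of the Boris rotation only, not WarpX's implementation; the function name and the per-particle field arrays are assumptions, and natural units (\(c=1\)) are used as in the text.
import numpy as np

def boris_push(u, E, B, q, m, dt, c=1.0):
    # u = gamma*v, shape (N, 3); E, B are the fields gathered at the particles
    qmdt2 = q * dt / (2.0 * m)
    u_minus = u + qmdt2 * E                                           # first half electric push
    gamma_bar = np.sqrt(1.0 + np.sum((u_minus / c)**2, axis=-1, keepdims=True))
    t = qmdt2 * B / gamma_bar                                         # rotation vector
    u_prime = u_minus + np.cross(u_minus, t)
    s = 2.0 * t / (1.0 + np.sum(t * t, axis=-1, keepdims=True))
    u_plus = u_minus + np.cross(u_prime, s)                           # magnetic rotation
    return u_plus + qmdt2 * E                                         # second half electric push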
Vay Lorentz-invariant formulation
It was shown in Vay [4] that the Boris formulation is not Lorentz invariant and can lead to significant errors in the treatment of relativistic dynamics. A Lorentz invariant formulation is obtained by considering the following velocity average
This gives a system that is solvable analytically (see Vay [4] for a detailed derivation), giving the following velocity update:
where
This Lorentz invariant formulation is particularly well suited for the modeling of ultra-relativistic charged particle beams, where the accurate account of the cancellation of the self-generated electric and magnetic fields is essential, as shown in Vay [4].
Field solve
Various methods are available for solving Maxwell’s equations on a grid, based on finite-differences, finite-volume, finite-element, spectral, or other discretization techniques that apply most commonly on single structured or unstructured meshes and less commonly on multiblock multiresolution grid structures. In this chapter, we summarize the widespread second order finite-difference time-domain (FDTD) algorithm, its extension to non-standard finite-differences as well as the pseudo-spectral analytical time-domain (PSATD) and pseudo-spectral time-domain (PSTD) algorithms. Extension to multiresolution (or mesh refinement) PIC is described in, e.g., Vay et al. [5], Vay et al. [6].
![Yee grid layout and leapfrog time integration](_images/Yee_grid.png)
(left) Layout of field components on the staggered “Yee” grid. Current densities and electric fields are defined on the edges of the cells and magnetic fields on the faces. (right) Time integration using a second-order finite-difference “leapfrog” integrator.
Finite-Difference Time-Domain (FDTD)
The most popular algorithm for electromagnetic PIC codes is the Finite-Difference Time-Domain (or FDTD) solver
The differential operator is defined as \(\nabla=D_{x}\mathbf{\hat{x}}+D_{y}\mathbf{\hat{y}}+D_{z}\mathbf{\hat{z}}\) and the finite-difference operators in time and space are defined respectively as
where \(\Delta t\) and \(\Delta x\) are respectively the time step and the grid cell size along \(x\), \(n\) is the time index and \(i\), \(j\) and \(k\) are the spatial indices along \(x\), \(y\) and \(z\) respectively. The difference operators along \(y\) and \(z\) are obtained by circular permutation. The equations in brackets are given for completeness, as they are often not actually solved, thanks to the usage of a so-called charge conserving algorithm, as explained below. As shown in Fig. 28, the quantities are given on a staggered (or “Yee”) grid [7], where the electric field components are located between nodes and the magnetic field components are located in the center of the cell faces. Knowing the current densities at half-integer steps, the electric field components are updated alternately with the magnetic field components at integer and half-integer steps respectively.
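To make the staggering and leapfrogging concrete, here is a minimal 1D vacuum sketch in the natural units used above (\(c=1\)), with periodic boundaries and no particles. It is an illustration of the Yee update only, not of WarpX's solver; the grid size and time step below are arbitrary example values.
import numpy as np

nx, dx = 200, 1.0
dt = 0.9 * dx                 # satisfies the 1D Courant condition dt <= dx (c = 1)
Ey = np.zeros(nx)             # E at integer grid points and integer time steps
Bz = np.zeros(nx)             # B at half-integer grid points and half-integer time steps
Ey[nx // 2] = 1.0             # initial perturbation

for _ in range(500):
    Bz -= dt / dx * (np.roll(Ey, -1) - Ey)   # Faraday: dB/dt = -curl E
    Ey -= dt / dx * (Bz - np.roll(Bz, 1))    # Ampere (source-free): dE/dt = curl B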
Non-Standard Finite-Difference Time-Domain (NSFDTD)
An implementation of the source-free Maxwell’s wave equations for narrow-band applications based on non-standard finite-differences (NSFD) was introduced in Cole [8], Cole [9], and was adapted for wideband applications in Karkkainen et al. [10]. At the Courant limit for the time step and for a given set of parameters, the stencil proposed in Karkkainen et al. [10] has no numerical dispersion along the principal axes, provided that the cell size is the same along each dimension (i.e. cubic cells in 3D). The “Cole-Karkkainen” (or CK) solver uses the non-standard finite difference formulation (based on extended stencils) of the Maxwell-Ampere equation and can be implemented as follows [11]:
Eqs. (19) and (20) are not solved explicitly but are verified via appropriate initial conditions and the current deposition procedure. The NSFD differential operator is given by
where
with
Here \(G\) is a sample vector component, while \(\alpha\), \(\beta\) and \(\xi\) are constant scalars satisfying \(\alpha+4\beta+4\xi=1\). As with the FDTD algorithm, the quantities with half-integer indices are located between the nodes (electric field components) or in the center of the cell faces (magnetic field components). The operators along \(y\) and \(z\), i.e. \(D_{y}\), \(D_{z}\), \(D_{y}^{*}\), \(D_{z}^{*}\), \(S_{y}^{1}\), \(S_{z}^{1}\), \(S_{y}^{2}\), and \(S_{z}^{2}\), are obtained by circular permutation of the indices.
Assuming cubic cells (\(\Delta x=\Delta y=\Delta z\)), the coefficients given in Karkkainen et al. [10] (\(\alpha=7/12\), \(\beta=1/12\) and \(\xi=1/48\)) allow for the Courant condition to be at \(\Delta t=\Delta x\), which equates to having no numerical dispersion along the principal axes. The algorithm reduces to the FDTD algorithm with \(\alpha=1\) and \(\beta=\xi=0\). An extension to non-cubic cells is provided in 3-D by Cowan et al. [12] and in 2-D by Pukhov [13]. An alternative NSFDTD implementation that enables superluminous waves is also given in Lehe et al. [14].
As mentioned above, a key feature of the algorithms based on NSFDTD is that some implementations [10, 12] enable the time step \(\Delta t=\Delta x\) along one or more axes and no numerical dispersion along those axes. However, as shown in Vay et al. [11], an instability develops at the Nyquist wavelength at (or very near) such a timestep. It is also shown in the same paper that removing the Nyquist component in all the source terms using a bilinear filter (see description of the filter below) suppresses this instability.
Pseudo Spectral Analytical Time Domain (PSATD)
Maxwell’s equations in Fourier space are given by
where \(\tilde{a}\) is the Fourier Transform of the quantity \(a\). As with the real space formulation, provided that the continuity equation \(\partial\tilde{\rho}/\partial t+i\mathbf{k}\cdot\mathbf{\tilde{J}}=0\) is satisfied, then the last two equations will automatically be satisfied at any time if satisfied initially and do not need to be explicitly integrated.
Decomposing the electric field and current between longitudinal and transverse components
gives
with \(\mathbf{\hat{k}}=\mathbf{k}/k\).
If the sources are assumed to be constant over a time interval \(\Delta t\), the system of equations is solvable analytically and is given by (see Haber et al. [15] for the original formulation and Vay et al. [16] for a more detailed derivation):
with \(C=\cos\left(k\Delta t\right)\) and \(S=\sin\left(k\Delta t\right)\).
Combining the transverse and longitudinal components gives
For fields generated by the source terms without the self-consistent dynamics of the charged particles, this algorithm is free of numerical dispersion and is not subject to a Courant condition. Furthermore, this solution is exact for any time step size subject to the assumption that the current source is constant over that time step.
As shown in Vay et al. [16], by expanding the coefficients \(S_{h}\) and \(C_{h}\) in Taylor series and keeping the leading terms, the PSATD formulation reduces to the perhaps better known pseudo-spectral time-domain (PSTD) formulation [17, 18]:
The dispersion relation of the PSTD solver is given by \(\sin(\frac{\omega\Delta t}{2})=\frac{k\Delta t}{2}.\) In contrast to the PSATD solver, the PSTD solver is subject to numerical dispersion for a finite time step and to a Courant condition that is given by \(\Delta t\leq \frac{2}{\pi}\left(\frac{1}{\Delta x^{2}}+\frac{1}{\Delta y^{2}}+\frac{1}{\Delta z^{2}}\right)^{-1/2}\).
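For a quick reference, the PSTD stability limit above can be evaluated directly (in the natural units \(c=1\) used here); the cell sizes below are arbitrary example values.
import numpy as np

dx = dy = dz = 0.04   # example cell sizes (c = 1)
dt_max = (2.0 / np.pi) / np.sqrt(1.0 / dx**2 + 1.0 / dy**2 + 1.0 / dz**2)
print('PSTD Courant limit: dt <=', dt_max)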
The PSATD and PSTD formulations that were just given apply to the field components located at the nodes of the grid. As noted in Ohmura and Okamura [19], they can also be easily recast on a staggered Yee grid by multiplication of the field components by the appropriate phase factors to shift them from the collocated to the staggered locations. The choice between a collocated and a staggered formulation is application-dependent.
Spectral solvers used to be very popular from the 1970s to the early 1990s, before being replaced by finite-difference methods with the advent of parallel supercomputers that favored local methods. However, it was shown recently that standard domain decomposition with Fast Fourier Transforms that are local to each subdomain can be used effectively with spectral PIC methods [16], at the cost of truncation errors in the guard cells that can be neglected. A detailed analysis of the effectiveness of the method, with exact evaluation of the magnitude of the truncation error, is given in Vincenti and Vay [20] for stencils of arbitrary order (up to the infinite “spectral” order).
WarpX also includes a kinetic-fluid hybrid model in which the electric field is calculated using Ohm’s law instead of directly evolving Maxwell’s equations. This approach allows reduced physics simulations to be done with significantly lower spatial and temporal resolution than in the standard, fully kinetic, PIC. Details of this model can be found in the section Kinetic-fluid hybrid model.
Current deposition
The current densities are deposited on the computational grid from the particle position and velocities, employing splines of various orders [21].
In most applications, it is essential to prevent the accumulation of errors resulting from the violation of the discretized Gauss’ Law. This is accomplished by providing a method for depositing the current from the particles to the grid that preserves the discretized Gauss’ Law, or by providing a mechanism for “divergence cleaning” [1, 22, 23, 24, 25]. For the former, schemes that allow a deposition of the current that is exact when combined with the Yee solver are given in Villasenor and Buneman [26] for linear splines and in Esirkepov [27] for splines of arbitrary order.
The NSFDTD formulations given above and in Vay et al. [11], Cowan et al. [12], Pukhov [13], Lehe et al. [14] apply to the Maxwell-Faraday equation, while the discretized Maxwell-Ampere equation uses the FDTD formulation. Consequently, the charge conserving algorithms developed for current deposition [26, 27] apply readily to those NSFDTD-based formulations. More details concerning those implementations, including the expressions for the numerical dispersion and Courant condition are given in Vay et al. [11], Cowan et al. [12], Pukhov [13], Lehe et al. [14].
Current correction
In the case of the pseudospectral solvers, the current deposition algorithm generally does not satisfy the discretized continuity equation in Fourier space:
In this case, a Boris correction [1] can be applied in \(k\) space in the form
where \(\mathbf{\tilde{E}}_{c}\) is the corrected field. Alternatively, a correction to the current can be applied (with some similarity to the current deposition presented by Morse and Nielson in their potential-based model in Morse and Nielson [28]) using
where \(\mathbf{\tilde{J}}_{c}\) is the corrected current. In this case, the transverse component of the current is left untouched while the longitudinal component is effectively replaced by the one obtained from integration of the continuity equation, ensuring that the corrected current satisfies the continuity equation. The advantage of correcting the current rather than the electric field is that it is more local and thus more compatible with domain decomposition of the fields for parallel computation [16].
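A minimal sketch of this current correction on a periodic grid with plain NumPy FFTs is given below. The finite-difference approximation \(\partial\tilde{\rho}/\partial t\approx(\tilde{\rho}^{n+1}-\tilde{\rho}^{n})/\Delta t\) and the function name are assumptions made for illustration, not a description of WarpX's implementation.
import numpy as np

def correct_current(Jx, Jy, Jz, rho_old, rho_new, dxyz, dt):
    # Replace the longitudinal part of J in k space so that the discrete
    # continuity equation d(rho)/dt + i k.J = 0 is satisfied (periodic grid).
    shape = rho_old.shape
    k = np.meshgrid(*[2.0 * np.pi * np.fft.fftfreq(n, d=d) for n, d in zip(shape, dxyz)],
                    indexing='ij')
    Jk = [np.fft.fftn(J) for J in (Jx, Jy, Jz)]
    drho_dt = np.fft.fftn((rho_new - rho_old) / dt)
    k2 = sum(ki**2 for ki in k)
    k2[(0,) * rho_old.ndim] = 1.0                                   # avoid division by zero at k = 0
    err = sum(ki * Jki for ki, Jki in zip(k, Jk)) - 1j * drho_dt    # vanishes when continuity holds
    Jk_c = [Jki - err * ki / k2 for ki, Jki in zip(k, Jk)]
    return [np.fft.ifftn(Jki).real for Jki in Jk_c]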
Vay deposition
Alternatively, an exact current deposition can be written for the pseudo-spectral solvers, following the geometrical interpretation of existing methods in real space [26, 27, 28].
The Vay deposition scheme is the generalization of the Esirkepov deposition scheme for the spectral case with arbitrary-order stencils [16]. The current density \(\widehat{\boldsymbol{J}}^{\,n+1/2}\) in Fourier space is computed as \(\widehat{\boldsymbol{J}}^{\,n+1/2} = i \, \widehat{\boldsymbol{D}} / \boldsymbol{k}\) when \(\boldsymbol{k} \neq 0\) and set to zero otherwise. The quantity \(\boldsymbol{D}\) is deposited in real space by averaging the currents over all possible grid paths between the initial position \(\boldsymbol{x}^{\,n}\) and the final position \(\boldsymbol{x}^{\,n+1}\) and is defined as
2D Cartesian geometry:
3D Cartesian geometry:
Here, \(w_i\) represents the weight of the \(i\)-th macro-particle and \(\Gamma\) represents its shape factor. Note that in 2D Cartesian geometry, \(D_y\) is effectively \(J_y\) and does not require additional operations in Fourier space.
Field gather
In general, the field is gathered from the mesh onto the macroparticles using splines of the same order as for the current deposition \(\mathbf{S}=\left(S_{x},S_{y},S_{z}\right)\). Three variations are considered:
“momentum conserving”: fields are interpolated from the grid nodes to the macroparticles using \(\mathbf{S}=\left(S_{nx},S_{ny},S_{nz}\right)\) for all field components (if the fields are known at staggered positions, they are first interpolated to the nodes on an auxiliary grid),
“energy conserving (or Galerkin)”: fields are interpolated from the staggered Yee grid to the macroparticles using \(\left(S_{nx-1},S_{ny},S_{nz}\right)\) for \(E_{x}\), \(\left(S_{nx},S_{ny-1},S_{nz}\right)\) for \(E_{y}\), \(\left(S_{nx},S_{ny},S_{nz-1}\right)\) for \(E_{z}\), \(\left(S_{nx},S_{ny-1},S_{nz-1}\right)\) for \(B_{x}\), \(\left(S_{nx-1},S_{ny},S_{nz-1}\right)\) for \(B_{y}\) and \(\left(S_{nx-1},S_{ny-1},S_{nz}\right)\) for \(B_{z}\) (if the fields are known at the nodes, they are first interpolated to the staggered positions on an auxiliary grid),
“uniform”: fields are interpolated directly from the Yee grid to the macroparticles using \(\mathbf{S}=\left(S_{nx},S_{ny},S_{nz}\right)\) for all field components (if the fields are known at the nodes, they are first interpolated to the staggered positions on an auxiliary grid).
As shown in Birdsall and Langdon [1], Hockney and Eastwood [2], Lewis [29], the momentum and energy conserving schemes conserve momentum and energy respectively at the limit of infinitesimal time steps and generally offer better conservation of the respective quantities for a finite time step. The uniform scheme does not conserve momentum nor energy in the sense defined for the others but is given for completeness, as it has been shown to offer some interesting properties in the modeling of relativistically drifting plasmas [30].
Filtering
It is common practice to apply digital filtering to the charge or current density in Particle-In-Cell simulations as a complement or an alternative to using higher order splines [1]. A commonly used filter in PIC simulations is the three-point filter
where \(\phi^{f}\) is the filtered quantity. This filter is called a bilinear filter when \(\alpha=0.5\). Assuming \(\phi=e^{jkx}\) and \(\phi^{f}=g\left(\alpha,k\right)e^{jkx}\), the filter gain \(g\) is given as a function of the filtering coefficient \(\alpha\) and the wavenumber \(k\) by
The total attenuation \(G\) for \(n\) successive applications of filters of coefficients \(\alpha_{1}\)…\(\alpha_{n}\) is given by
A sharper cutoff in \(k\) space is provided by using \(\alpha_{n}=n-\sum_{i=1}^{n-1}\alpha_{i}\), so that \(G\approx1+O\left(k^{4}\right)\). Such a step is called a “compensation” step [1]. For the bilinear filter (\(\alpha=1/2\)), the compensation factor is \(\alpha_{c}=2-1/2=3/2\). For a succession of \(n\) applications of the bilinear filter, it is \(\alpha_{c}=n/2+1\).
It is sometimes necessary to filter over a relatively wide band of wavelengths, necessitating either a large number of passes of the bilinear filter or the use of filters acting on many points. The former can become very intensive computationally while the latter is problematic for parallel computations using domain decomposition, as the footprint of the filter may eventually surpass the size of subdomains. A workaround is to use a combination of filters of limited footprint. A solution based on the combination of three-point filters with various strides was proposed in Vay et al. [11] and operates as follows.
The bilinear filter provides complete suppression of the signal at the grid Nyquist wavelength (twice the grid cell size). Suppression of the signal at integer multiples of the Nyquist wavelength can be obtained by using a stride \(s\) in the filter
for which the gain is given by
For a given stride, the gain is given by the gain of the bilinear filter shifted in \(k\) space, with the pole \(g=0\) shifted from the wavelength \(\lambda=2/\Delta x\) to \(\lambda=2s/\Delta x\), with additional poles, as given by \(sk\Delta x=\arccos\left(\frac{\alpha}{\alpha-1}\right)\pmod{2\pi}\). The resulting filter is a pass-band filter between the poles, but since the poles are spread at different integer values in \(k\) space, a wide-band low-pass filter can be constructed by combining filters using different strides. As shown in Vay et al. [11], the successive application of 4 passes + compensation of filters with strides 1, 2 and 4 has a nearly equivalent fall-off in gain as 80 passes + compensation of a bilinear filter. Yet, the strided filter solution needs only 15 passes of a three-point filter, compared to 81 passes for an equivalent n-pass bilinear filter, yielding a gain of 5.4 in the number of operations in favor of the combination of filters with strides. The width of the filter with stride 4 extends over only 9 points, compared to 81 points for a single-pass equivalent filter, hence giving a gain of 9 in compactness for the strided filter combination in comparison to the single-pass filter with a large stencil, resulting in more favorable scaling with the number of computational cores for parallel calculations.
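The following is a minimal 1D sketch of the three-point filter with an optional stride and a compensation pass; it assumes periodic boundaries for simplicity (a simplification relative to a production code) and only illustrates the formulas above.
import numpy as np

def three_point_filter(phi, alpha=0.5, stride=1):
    # phi_f[j] = alpha*phi[j] + (1 - alpha)*(phi[j - s] + phi[j + s])/2
    return alpha * phi + 0.5 * (1.0 - alpha) * (np.roll(phi, stride) + np.roll(phi, -stride))

def bilinear_with_compensation(phi, n_passes=4, stride=1):
    # n passes of the bilinear filter (alpha = 1/2), then one compensation
    # pass with alpha_c = n/2 + 1
    for _ in range(n_passes):
        phi = three_point_filter(phi, alpha=0.5, stride=stride)
    return three_point_filter(phi, alpha=n_passes / 2.0 + 1.0, stride=stride)
Combining calls with strides 1, 2 and 4 reproduces the wide-band low-pass behavior described above.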
C. K. Birdsall and A. B. Langdon. Plasma Physics Via Computer Simulation. Adam-Hilger, 1991. ISBN 0 07 005371 5.
R. W. Hockney and J. W. Eastwood. Computer simulation using particles. Routledge, 1988. ISBN 0-85274-392-0.
J. P. Boris. Relativistic Plasma Simulation-Optimization of a Hybrid Code. In Proc. Fourth Conf. Num. Sim. Plasmas, 3–67. Naval Res. Lab., Wash., D. C., 1970.
J.-L. Vay. Simulation Of Beams Or Plasmas Crossing At Relativistic Velocity. Physics of Plasmas, 15(5):56701, May 2008. doi:10.1063/1.2837054.
J.-L. Vay, D. P. Grote, R. H. Cohen, and A. Friedman. Novel methods in the particle-in-cell accelerator code-framework warp. Computational Science and Discovery, 5(1):014019 (20 pp.), 2012.
J.-L. Vay, J.-C. Adam, and A. Heron. Asymmetric Pml For The Absorption Of Waves. Application To Mesh Refinement In Electromagnetic Particle-In-Cell Plasma Simulations. Computer Physics Communications, 164(1-3):171–177, Dec 2004. doi:10.1016/J.Cpc.2004.06.026.
K. S. Yee. Numerical Solution Of Initial Boundary Value Problems Involving Maxwells Equations In Isotropic Media. Ieee Transactions On Antennas And Propagation, Ap14(3):302–307, 1966.
J. B. Cole. A High-Accuracy Realization Of The Yee Algorithm Using Non-Standard Finite Differences. Ieee Transactions On Microwave Theory And Techniques, 45(6):991–996, Jun 1997.
J. B. Cole. High-Accuracy Yee Algorithm Based On Nonstandard Finite Differences: New Developments And Verifications. Ieee Transactions On Antennas And Propagation, 50(9):1185–1191, Sep 2002. doi:10.1109/Tap.2002.801268.
M. Karkkainen, E. Gjonaj, T. Lau, and T. Weiland. Low-Dispersionwake Field Calculation Tools. In Proc. Of International Computational Accelerator Physics Conference, 35–40. Chamonix, France, 2006.
J.-L. Vay, C. G. R. Geddes, E. Cormier-Michel, and D. P. Grote. Numerical Methods For Instability Mitigation In The Modeling Of Laser Wakefield Accelerators In A Lorentz-Boosted Frame. Journal of Computational Physics, 230(15):5908–5929, Jul 2011. doi:10.1016/J.Jcp.2011.04.003.
B. M. Cowan, D. L. Bruhwiler, J. R. Cary, E. Cormier-Michel, and C. G. R. Geddes. Generalized algorithm for control of numerical dispersion in explicit time-domain electromagnetic simulations. Physical Review Special Topics-Accelerators And Beams, Apr 2013. doi:10.1103/PhysRevSTAB.16.041303.
A. Pukhov. Three-dimensional electromagnetic relativistic particle-in-cell code VLPL (Virtual Laser Plasma Lab). Journal of Plasma Physics, 61(3):425–433, Apr 1999. doi:10.1017/S0022377899007515.
R. Lehe, A. Lifschitz, C. Thaury, V. Malka, and X. Davoine. Numerical growth of emittance in simulations of laser-wakefield acceleration. Physical Review Special Topics-Accelerators And Beams, Feb 2013. doi:10.1103/PhysRevSTAB.16.021301.
I. Haber, R. Lee, H. H. Klein, and J. P. Boris. Advances In Electromagnetic Simulation Techniques. In Proc. Sixth Conf. Num. Sim. Plasmas, 46–48. Berkeley, Ca, 1973.
J.-L. Vay, I. Haber, and B. B. Godfrey. A domain decomposition method for pseudo-spectral electromagnetic simulations of plasmas. Journal of Computational Physics, 243:260–268, Jun 2013. doi:10.1016/j.jcp.2013.03.010.
J. M. Dawson. Particle Simulation Of Plasmas. Reviews Of Modern Physics, 55(2):403–447, 1983. doi:10.1103/RevModPhys.55.403.
Q. H. Liu. The PSTD Algorithm: A Time-Domain Method Requiring Only Two Cells Per Wavelength. Microwave And Optical Technology Letters, 15(3):158–165, Jun 1997. doi:10.1002/(Sici)1098-2760(19970620)15:3<158::Aid-Mop11>3.3.Co;2-T.
Y. Ohmura and Y. Okamura. Staggered Grid Pseudo-Spectral Time-Domain Method For Light Scattering Analysis. Piers Online, 6(7):632–635, 2010.
H. Vincenti and J.-L. Vay. Detailed analysis of the effects of stencil spatial variations with arbitrary high-order finite-difference Maxwell solver. Computer Physics Communications, 200:147–167, Mar 2016. doi:10.1016/j.cpc.2015.11.009.
H. Abe, N. Sakairi, R. Itatani, and H. Okuda. High-Order Spline Interpolations In The Particle Simulation. Journal of Computational Physics, 63(2):247–267, Apr 1986.
A. B. Langdon. On Enforcing Gauss Law In Electromagnetic Particle-In-Cell Codes. Computer Physics Communications, 70(3):447–450, Jul 1992.
B. Marder. A Method For Incorporating Gauss Law Into Electromagnetic Pic Codes. Journal of Computational Physics, 68(1):48–55, Jan 1987.
J.-L. Vay and C. Deutsch. Charge Compensated Ion Beam Propagation In A Reactor Sized Chamber. Physics of Plasmas, 5(4):1190–1197, Apr 1998.
C. D. Munz, P. Omnes, R. Schneider, E. Sonnendrucker, and U. Voss. Divergence Correction Techniques For Maxwell Solvers Based On A Hyperbolic Model. Journal of Computational Physics, 161(2):484–511, Jul 2000. doi:10.1006/Jcph.2000.6507.
J. Villasenor and O. Buneman. Rigorous Charge Conservation For Local Electromagnetic-Field Solvers. Computer Physics Communications, 69(2-3):306–316, 1992.
T. Z. Esirkepov. Exact Charge Conservation Scheme For Particle-In-Cell Simulation With An Arbitrary Form-Factor. Computer Physics Communications, 135(2):144–153, Apr 2001.
R. L. Morse and C. W. Nielson. Numerical Simulation Of Weibel Instability In One And 2 Dimensions. Phys. Fluids, 14(4):830–&, 1971. doi:10.1063/1.1693518.
H. R. Lewis. Variational algorithms for numerical simulation of collisionless plasma with point particles including electromagnetic interactions. Journal of Computational Physics, 10(3):400–419, 1972. URL: http://www.sciencedirect.com/science/article/pii/0021999172900447, doi:http://dx.doi.org/10.1016/0021-9991(72)90044-7.
B. B. Godfrey and J.-L. Vay. Numerical stability of relativistic beam multidimensional PIC simulations employing the Esirkepov algorithm. Journal of Computational Physics, 248(0):33–46, 2013. URL: http://www.sciencedirect.com/science/article/pii/S0021999113002556, doi:10.1016/j.jcp.2013.04.006.
Mesh refinement
![Sketches of the mesh refinement implementation in WarpX for the electrostatic and electromagnetic solvers](_images/ICNSP_2011_Vay_fig1.png)
Sketches of the implementation of mesh refinement in WarpX with the electrostatic (left) and electromagnetic (right) solvers. In both cases, the charge/current from particles are deposited at the finest levels first, then interpolated recursively to coarser levels. In the electrostatic case, the potential is calculated first at the coarsest level \(L_0\), the solution interpolated to the boundaries of the refined patch \(r\) at the next level \(L_{1}\) and the potential calculated at \(L_1\). The procedure is repeated iteratively up to the highest level. In the electromagnetic case, the fields are computed independently on each grid and patch without interpolation at boundaries. Patches are terminated by absorbing layers (PML) to prevent the reflection of electromagnetic waves. Additional coarse patch \(c\) and fine grid \(a\) are needed so that the full solution is obtained by substitution on \(a\) as \(F_{n+1}(a)=F_{n+1}(r)+I[F_n( s )-F_{n+1}( c )]\) where \(F\) is the field, and \(I\) is a coarse-to-fine interpolation operator. In both cases, the field solution at a given level \(L_n\) is unaffected by the solution at higher levels \(L_{n+1}\) and up, allowing for mitigation of some spurious effects (see text) by providing a transition zone via extension of the patches by a few cells beyond the desired refined area (red & orange rectangles) in which the field is interpolated onto particles from the coarser parent level only.
The mesh refinement methods that have been implemented in WarpX were developed according to the following principles: i) avoidance of spurious effects from mesh refinement, or minimization of such effects; ii) user controllability of the spurious effects’ relative magnitude; iii) simplicity of implementation. The two main generic issues that were identified are: a) spurious self-force on macroparticles close to the mesh refinement interface [1, 2]; b) reflection (and possible amplification) of short wavelength electromagnetic waves at the mesh refinement interface [3]. The two effects are due to the loss of translation invariance introduced by the asymmetry of the grid on each side of the mesh refinement interface.
In addition, for some implementations where the field that is computed at a given level is affected by the solution at finer levels, there are cases where the procedure violates the integral of Gauss’ Law around the refined patch, leading to long range errors [1, 2]. As will be shown below, in the procedure that has been developed in WarpX, the field at a given refinement level is not affected by the solution at finer levels, and is thus not affected by this type of error.
Electrostatic
A cornerstone of the Particle-In-Cell method is that given a particle lying in a hypothetical infinite grid, if the grid is regular and symmetrical, and if the order of field gathering matches the order of charge (or current) deposition, then there is no self-force of the particle acting on itself: a) anywhere if using the so-called “momentum conserving” gathering scheme; b) on average within one cell if using the “energy conserving” gathering scheme [4]. A breaking of the regularity and/or symmetry in the grid, whether it is from the use of irregular meshes or mesh refinement, and whether one uses finite difference, finite volume or finite elements, results in a net spurious self-force (which does not average to zero over one cell) for a macroparticle close to the point of irregularity (mesh refinement interface for the current purpose) [1, 2].
A sketch of the implementation of mesh refinement in WarpX is given in Fig. 29. Given the solution of the electric potential at a refinement level \(L_n\), it is interpolated onto the boundaries of the grid patch(es) at the next refined level \(L_{n+1}\). The electric potential is then computed at level \(L_{n+1}\) by solving the Poisson equation. This procedure necessitates the knowledge of the charge density at every level of refinement. For efficiency, the charge of the macroparticles is deposited on the highest-level patch that contains them, and the charge density of each patch is added recursively to lower levels, down to the lowest.

Position history of one charged particle attracted by its image induced by a nearby metallic (Dirichlet) boundary. The particle is initialized at rest. Without a refinement patch (reference case), the particle is accelerated by its image, is reflected specularly at the wall, then decelerates until it reaches its initial position at rest. If the particle is initialized inside a refinement patch, the particle is initially accelerated toward the wall but is spuriously reflected before it reaches the boundary of the patch, whether using the method implemented in WarpX or the MC method. Providing a surrounding transition region 2 or 4 cells wide, in which the potential is interpolated from the parent coarse solution, significantly reduces the effect of the spurious self-force.
The presence of the self-force is illustrated on a simple test case that was introduced in Vay et al. [1] and also used in Colella and Norgaard [2]: a single macroparticle is initialized at rest within a single refinement patch four cells away from the patch refinement boundary. The patch at level \(L_1\) has \(32\times32\) cells and is centered relative to the lowest \(64\times64\) grid at level \(L_0\) (“main grid”), while the macroparticle is centered in one direction but not in the other. The boundaries of the main grid are perfectly conducting, so that the macroparticle is attracted to the closest wall by its image. Specular reflection is applied when the particle reaches the boundary so that the motion is cyclic. The test was performed with WarpX using either linear or quadratic interpolation when gathering the main grid solution onto the refined patch boundary. It was also performed using another method from P. McCorquodale et al (labeled “MC” in this paper) based on the algorithm given in Mccorquodale et al. [5], which employs a more elaborate procedure involving two-ways interpolations between the main grid and the refined patch. A reference case was also run using a single \(128\times128\) grid with no refined patch, in which it is observed that the particle propagates toward the closest boundary at an accelerated pace, is reflected specularly at the boundary, then slows down until it reaches its initial position at zero velocity. The particle position histories are shown for the various cases in Fig. 30. In all the cases using the refinement patch, the particle was spuriously reflected near the patch boundary and was effectively trapped in the patch. We notice that linear interpolation performs better than quadratic, and that the simple method implemented in WarpX performs better than the other proposed method for this test (see discussion below).

(left) Maps of the magnitude of the spurious self-force \(\epsilon\) in arbitrary units within one quarter of the refined patch, defined as \(\epsilon=\sqrt{(E_x-E_x^{ref})^2+(E_y-E_y^{ref})^2}\), where \(E_x\) and \(E_y\) are the electric field components within the patch experienced by one particle at a given location and \(E_x^{ref}\) and \(E_y^{ref}\) are the electric field from a reference solution. The map is given for the WarpX and the MC mesh refinement algorithms and for linear and quadratic interpolation at the patch refinement boundary. (right) Lineouts of the maximum (taken over neighboring cells) of the spurious self-force. Close to the interface boundary (x=0), the spurious self-force decreases at a rate close to one order of magnitude per cell (red line), then at about one order of magnitude per six cells (green line).
The magnitude of the spurious self-force as a function of the macroparticle position was mapped and is shown in Fig. 31 for the WarpX and MC algorithms using linear or quadratic interpolations between grid levels. It is observed that the magnitude of the spurious self-force decreases rapidly with the distance between the particle and the refined patch boundary, at a rate approaching one order of magnitude per cell for the four cells closest to the boundary and about one order of magnitude per six cells beyond. The method implemented in WarpX offers a weaker spurious force on average and especially at the cells that are the closest to the coarse-fine interface where it is the largest and thus matters most. We notice that the magnitude of the spurious self-force depends strongly on the distance to the edge of the patch and to the nodes of the underlying coarse grid, but weakly on the order of deposition and size of the patch.
A method was devised and implemented in WarpX for reducing the magnitude of spurious self-forces near the coarse-fine boundaries as follows. Noting that the coarse grid solution is unaffected by the presence of the patch and is thus free of self-force, extra “transition” cells are added around the “effective” refined area. Within the effective area, the particles gather the potential in the fine grid. In the extra transition cells surrounding the refinement patch, the force is gathered directly from the coarse grid (an option, which has not yet been implemented, would be to interpolate between the coarse and fine grid field solutions within the transition zone so as to provide continuity of the force experienced by the particles at the interface). The number of cells allocated in the transition zones is controllable by the user in WarpX, giving the opportunity to check whether the spurious self-force is affecting the calculation by repeating it using different thicknesses of the transition zones. The control of the spurious force using the transition zone is illustrated in Fig. 30, where the calculation with WarpX using linear interpolation at the patch interface was repeated using transition regions of either two or four cells (measured in refined patch cell units). Using two extra cells allowed the particle to be free of spurious trapping within the refined area and follow a trajectory that is close to the reference one, and using four extra cells improved further to the point where the resulting trajectory becomes indistinguishable from the reference one. We note that an alternative method was devised for reducing the magnitude of self-force near the coarse-fine boundaries for the MC method, by using a special deposition procedure near the interface [2].
Electromagnetic
The method that is used for electrostatic mesh refinement is not directly applicable to electromagnetic calculations. As was shown in section 3.4 of Vay [3], refinement schemes relying solely on interpolation between coarse and fine patches lead to the reflection with amplification of the short wavelength modes that fall below the cutoff of the Nyquist frequency of the coarse grid. Unless these modes are damped heavily or prevented from occurring at their source, they may affect particle motion and their effect can escalate if trapped within a patch, via multiple successive reflections with amplification.
To circumvent this issue, an additional coarse patch (with the same resolution as the parent grid) is added, as shown in Fig. 29 and described in Vay et al. [6]. Both the fine and the coarse grid patches are terminated by Perfectly Matched Layers, reducing wave reflection by orders of magnitude, controllable by the user [7, 8]. The source current resulting from the motion of charged macroparticles within the refined region is accumulated on the fine patch and is then interpolated onto the coarse patch and added onto the parent grid. The process is repeated recursively from the finest level down to the coarsest. The Maxwell equations are then solved for one time interval on the entire set of grids, by default for one time step using the time step of the finest grid. The field on the coarse and fine patches only contain the contributions from the particles that have evolved within the refined area but not from the current sources outside the area. The total contribution of the field from sources within and outside the refined area is obtained by adding the field from the refined grid \(F(r)\), and adding an interpolation \(I\) of the difference between the relevant subset \(s\) of the field in the parent grid \(F(s)\) and the field of the coarse grid \(F( c )\), on an auxiliary grid \(a\), i.e. \(F(a)=F(r)+I[F(s)-F( c )]\). The field on the parent grid subset \(F(s)\) contains contributions from sources from both within and outside of the refined area. Thus, in effect, there is substitution of the coarse field resulting from sources within the patch area by its fine resolution counterpart. The operation is carried out recursively starting at the coarsest level up to the finest. An option has been implemented in which various grid levels are pushed with different time steps, given as a fixed fraction of the individual grid Courant conditions (assuming same cell aspect ratio for all grids and refinement by integer factors). In this case, the fields from the coarse levels, which are advanced less often, are interpolated in time.
The substitution method has two potential drawbacks due to the inexact cancellation between the coarse and fine patches of: (i) the remnants of ghost fixed charges created by the particles entering and leaving the patches (this effect is due to the use of the electromagnetic solver and is different from the spurious self-force that was described for the electrostatic case); (ii) when using a Maxwell solver with a low-order stencil, the electromagnetic waves traveling on each patch at slightly different velocities due to numerical dispersion. The first issue results in an effective spurious multipole field whose magnitude decreases very rapidly with the distance to the patch boundary, similarly to the spurious self-force in the electrostatic case. Hence, adding a few extra transition cells surrounding the patches mitigates this effect very effectively. The tunability of WarpX’s electromagnetic finite-difference and pseudo-spectral solvers provides the means to optimize the numerical dispersion so as to minimize the second effect for a given application, which has been demonstrated on the laser-plasma interaction test case presented in Vay et al. [6]. Both effects and their mitigation are described in more detail in Vay et al. [6].
Caustics are supported anywhere on the grid with an accuracy that is set by the local resolution, and will be adequately resolved if the grid resolution supports the necessary modes from their sources to the points of wavefront crossing. The mesh refinement method that is implemented in WarpX has the potential to provide higher efficiency than the standard use of fixed gridding, by offering a path toward adaptive gridding following wavefronts.
J.-L. Vay, P. Colella, P. McCorquodale, B. Van Straalen, A. Friedman, and D. P. Grote. Mesh Refinement For Particle-In-Cell Plasma Simulations: Applications To And Benefits For Heavy Ion Fusion. Laser And Particle Beams, 20(4):569–575, Dec 2002. doi:10.1017/S0263034602204139.
P. Colella and P. C. Norgaard. Controlling Self-Force Errors At Refinement Boundaries For AMR-PIC. Journal of Computational Physics, 229(4):947–957, Feb 2010. doi:10.1016/J.Jcp.2009.07.004.
J.-L. Vay. An Extended FDTD Scheme For The Wave Equation: Application To Multiscale Electromagnetic Simulation. Journal of Computational Physics, 167(1):72–98, Feb 2001.
C. K. Birdsall and A. B. Langdon. Plasma Physics Via Computer Simulation. Adam-Hilger, 1991. ISBN 0 07 005371 5.
P. McCorquodale, P. Colella, D. P. Grote, and J.-L. Vay. A Node-Centered Local Refinement Algorithm For Poisson's Equation In Complex Geometries. Journal of Computational Physics, 201(1):34–60, Nov 2004. doi:10.1016/J.Jcp.2004.04.022.
J.-L. Vay, J.-C. Adam, and A. Heron. Asymmetric PML For The Absorption Of Waves. Application To Mesh Refinement In Electromagnetic Particle-In-Cell Plasma Simulations. Computer Physics Communications, 164(1-3):171–177, Dec 2004. doi:10.1016/J.Cpc.2004.06.026.
J. P. Berenger. Three-Dimensional Perfectly Matched Layer For The Absorption Of Electromagnetic Waves. Journal of Computational Physics, 127(2):363–379, Sep 1996.
J.-L. Vay. Asymmetric Perfectly Matched Layer For The Absorption Of Waves. Journal of Computational Physics, 183(2):367–399, Dec 2002. doi:10.1006/Jcph.2002.7175.
Boundary conditions
Perfectly Matched Layer: open boundary condition for electromagnetic waves
For the transverse electric (TE) case, Berenger’s original Perfectly Matched Layer (PML) paper [1] writes
This can be generalized to
For \(c_{x}=c_{y}=c^{*}_{x}=c^{*}_{y}=c\) and \(\overline{\sigma }_{x}=\overline{\sigma }_{y}=\overline{\sigma }_{x}^{*}=\overline{\sigma }_{y}^{*}=0\), this system reduces to the Berenger PML medium, while adding the additional constraint \(\sigma _{x}=\sigma _{y}=\sigma _{x}^{*}=\sigma _{y}^{*}=0\) leads to the system of Maxwell equations in vacuum.
Propagation of a Plane Wave in an APML Medium
We consider a plane wave of magnitude (\(E_{0},H_{zx0},H_{zy0}\)) and angular frequency \(\omega\) propagating in the APML medium at an angle \(\varphi\) relative to the x axis
where \(\alpha\) and \(\beta\) are two complex constants to be determined.
Introducing Eqs. (36), (37), (38) and (39) into Eqs. (31), (32), (33) and (34) gives
Defining \(Z=E_{0}/\left( H_{zx0}+H_{zy0}\right)\) and using Eqs. (40) and (41), we get
Adding \(H_{zx0}\) and \(H_{zy0}\) from Eqs. (42) and (43) and substituting the expressions for \(\alpha\) and \(\beta\) from Eqs. (44) and (45) yields
If \(c_{x}=c^{*}_{x}\), \(c_{y}=c^{*}_{y}\), \(\overline{\sigma }_{x}=\overline{\sigma }^{*}_{x}\), \(\overline{\sigma }_{y}=\overline{\sigma }^{*}_{y}\), \(\frac{\sigma _{x}}{\varepsilon _{0}}=\frac{\sigma ^{*}_{x}}{\mu _{0}}\) and \(\frac{\sigma _{y}}{\varepsilon _{0}}=\frac{\sigma ^{*}_{y}}{\mu _{0}}\) then
which is the impedance of vacuum. Hence, like the PML, given some restrictions on the parameters, the APML does not generate any reflection at any angle and any frequency. As for the PML, this property is not retained after discretization, as shown subsequently.
Calling \(\psi\) any component of the field and \(\psi _{0}\) its magnitude, we get from Eqs. (36), (44), (45) and (46) that
We assume that we have an APML layer of thickness \(\delta\) (measured along \(x\)) and that \(\sigma _{y}=\overline{\sigma }_{y}=0\) and \(c_{y}=c.\) Using (47), we determine that the coefficient of reflection given by this layer is
which happens to be the same as the PML theoretical coefficient of reflection if we assume \(c_{x}=c\). Hence, it follows that for the purpose of wave absorption, the term \(\overline{\sigma }_{x}\) seems to be of no interest. However, although this conclusion is true at the infinitesimal limit, it does not hold for the discretized counterpart.
Discretization
In the following we set \(\varepsilon_0 = \mu_0 = 1\). We discretize Eqs. (26), (27), (28), and (29) to obtain
and this can be solved to obtain the following leapfrog integration equations
If we account for higher order \(\Delta t\) terms, a better approximation is given by
More generally, this becomes
If we set
then this becomes
When the generalized conductivities are zero, the update equations are
as expected.
Perfect Electrical Conductor
This boundary can be used to model a dielectric or metallic surface. For the electromagnetic solve, at PEC, the tangential electric field and the normal magnetic field are set to 0. In the guard-cell region, the tangential electric field is set equal and opposite to the respective field component in the mirror location across the PEC boundary, and the normal electric field is set equal to the field component in the mirror location in the domain across the PEC boundary. Similarly, the tangential (and normal) magnetic field components are set equal (and opposite) to the respective magnetic field components in the mirror locations across the PEC boundary.
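As a concrete illustration of the mirroring just described, here is a minimal one-dimensional sketch for the guard cells at a lower-\(x\) PEC boundary (plain Python; the index convention and node-centered layout are simplifying assumptions and do not reflect WarpX’s staggered grids):

```python
import numpy as np

def apply_pec_lower_x(E_tan, E_nrm, n_guard):
    """Fill n_guard guard cells below a PEC boundary located at index n_guard:
    the tangential E-field is mirrored with opposite sign (and vanishes on the
    boundary itself), while the normal E-field is mirrored with the same sign."""
    bnd = n_guard
    E_tan[bnd] = 0.0
    for g in range(1, n_guard + 1):
        E_tan[bnd - g] = -E_tan[bnd + g]   # equal and opposite across the boundary
        E_nrm[bnd - g] = +E_nrm[bnd + g]   # equal across the boundary
    return E_tan, E_nrm

# Example with 2 guard cells on a small node-centered array.
E_tan = np.arange(8, dtype=float)
E_nrm = np.arange(8, dtype=float)
apply_pec_lower_x(E_tan, E_nrm, n_guard=2)
```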
The PEC boundary condition also impacts the deposition of charge and current density. On the boundary, the charge density and parallel current density are set to zero. If a reflecting boundary condition is used for the particles, density overlapping with the PEC will be reflected back into the domain (for both charge and current density). If absorbing boundaries are used, an image charge (equal weight but opposite charge) is considered in the mirror location across the boundary, and the density from that charge is also deposited in the simulation domain. Fig. 32 shows the effect of this. The left boundary is absorbing while the right boundary is reflecting.

PEC boundary current deposition along the x-axis. The left boundary is absorbing while the right boundary is reflecting.
J. P. Berenger. A Perfectly Matched Layer For The Absorption Of Electromagnetic-Waves. Journal of Computational Physics, 114(2):185–200, Oct 1994.
Moving window and optimal Lorentz boosted frame
The simulations of plasma accelerators from first principles are extremely computationally intensive, due to the need to resolve the evolution of a driver (laser or particle beam) and an accelerated particle beam into a plasma structure that is orders of magnitude longer and wider than the accelerated beam. As is customary in the modeling of particle beam dynamics in standard particle accelerators, a moving window is commonly used to follow the driver, the wake and the accelerated beam. This results in huge savings, by avoiding the meshing of the entire plasma that is orders of magnitude longer than the other length scales of interest.
A first principle simulation of a short driver beam (laser or charged particles) propagating through a plasma that is orders of magnitude longer necessitates a very large number of time steps. Recasting the simulation in a frame of reference that is moving close to the speed of light in the direction of the driver beam leads to simulating a driver beam that appears longer propagating through a plasma that appears shorter than in the laboratory. Thus, this relativistic transformation of space and time reduces the disparity of scales, and thereby the number of time steps to complete the simulation, by orders of magnitude.
Even using a moving window, however, a full PIC simulation of a plasma accelerator can be extraordinarily demanding computationally, as many time steps are needed to resolve the crossing of the short driver beam with the plasma column. As it turns out, choosing an optimal frame of reference that travels close to the speed of light in the direction of the laser or particle beam (as opposed to the usual choice of the laboratory frame) enables speedups by orders of magnitude [1, 2]. This is a result of the properties of Lorentz contraction and dilation of space and time. In the frame of the laboratory, a very short driver (laser or particle) beam propagates through a much longer plasma column, necessitating millions to tens of millions of time steps for parameters in the range of the BELLA or FACET-II experiments. As sketched in Fig. 33, in a frame moving with the driver beam in the plasma at velocity \(v=\beta c\) (where \(c\) is the speed of light in vacuum), the beam length is now elongated by \(\approx(1+\beta)\gamma\) while the plasma contracts by \(\gamma\) (where \(\gamma=1/\sqrt{1-\beta^2}\) is the relativistic factor associated with the frame velocity). The number of time steps that is needed to simulate a “longer” beam through a “shorter” plasma is now reduced by up to \(\approx(1+\beta) \gamma^2\) (a detailed derivation of the speedup is given below).
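As a quick back-of-the-envelope illustration of the \(\approx(1+\beta)\gamma^2\) reduction quoted above (not the detailed derivation given below), one can tabulate the estimate for a few frame velocities:

```python
import numpy as np

# Rough estimate of the reduction in the number of time steps, ~ (1 + beta) * gamma**2,
# for a frame moving at v = beta*c. Purely illustrative numbers.
for gamma in (2.0, 10.0, 100.0):
    beta = np.sqrt(1.0 - 1.0 / gamma**2)
    print(f"gamma = {gamma:6.1f}  ->  step reduction ~ {(1.0 + beta) * gamma**2:10.1f}")
```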
The modeling of a plasma acceleration stage in a boosted frame involves the fully electromagnetic modeling of a plasma propagating at near the speed of light, for which Numerical Cerenkov [3, 4] is a potential issue, as explained in more detail below. In addition, for a frame of reference moving in the direction of the accelerated beam (or equivalently the wake of the laser), waves emitted by the plasma in the forward direction expand while the ones emitted in the backward direction contract, following the properties of the Lorentz transformation. If one had to resolve both forward and backward propagating waves emitted from the plasma, there would be no gain in selecting a frame different from the laboratory frame. However, the physics of interest for a laser wakefield is the laser driving the wake, the wake, and the accelerated beam. Backscatter is weak in the short-pulse regime, and does not interact as strongly with the beam as do the forward propagating waves which stay in phase for a long period. It is thus often assumed that the backward propagating waves can be neglected in the modeling of plasma accelerator stages. The accuracy of this assumption has been demonstrated by comparison between explicit codes which include both forward and backward waves and envelope or quasistatic codes which neglect backward waves [5, 6, 7].
Theoretical speedup dependency with the frame boost
The derivation that is given here reproduces the one given in Vay et al. [2], where the obtainable speedup is derived as an extension of the formula that was derived earlier [1], additionally taking into account the group velocity of the laser as it traverses the plasma.
Assuming that the simulation box is a fixed number of plasma periods long, which implies the standard use of a moving window following the wake and accelerated beam, the speedup is given by the ratio of the time taken by the laser pulse and the plasma to cross each other, divided by the shortest time scale of interest, that is, the laser period. To first order, the wake velocity \(v_w\) is set by the 1D group velocity of the laser driver, which in the linear (low intensity) limit, is given by [8]:
where \(\omega_p=\sqrt{(n_e e^2)/(\epsilon_0 m_e)}\) is the plasma frequency, \(\omega=2\pi c/\lambda\) is the laser frequency, \(n_e\) is the plasma density, \(\lambda\) is the laser wavelength in vacuum, \(\epsilon_0\) is the permittivity of vacuum, \(c\) is the speed of light in vacuum, and \(e\) and \(m_e\) are respectively the charge and mass of the electron.
In practice, the runs are typically stopped when the last electron beam macro-particle exits the plasma, and a measure of the total time of the simulation is then given by
where \(\lambda_p\approx 2\pi c/\omega_p\) is the wake wavelength, \(L\) is the plasma length, \(v_w\) and \(v_p=\beta_p c\) are respectively the velocity of the wake and of the plasma relative to the frame of reference, and \(\eta\) is an adjustable parameter for taking into account the fraction of the wake which exited the plasma at the end of the simulation. For a beam injected into the \(n^{th}\) bucket, \(\eta\) would be set to \(n-1/2\). If positrons were considered, they would be injected half a wake period ahead of the electron injection position for a given period, and one would have \(\eta=n-1\). The numerical cost \(R_t\) scales as the ratio of the total time to the shortest timescale of interest, which is the inverse of the laser frequency, and is thus given by
In the laboratory, \(v_p=0\) and the expression simplifies to
In a frame moving at \(\beta c\), the quantities become
where \(\gamma=1/\sqrt{1-\beta^2}\).
The expected speedup from performing the simulation in a boosted frame is given by the ratio of \(R_{lab}\) and \(R_t^*\)
We note that assuming that \(\beta_w\approx1\) (which is a valid approximation for most practical cases of interest) and that \(\gamma<<\gamma_w\), this expression is consistent with the expression derived earlier [1] for the laser-plasma acceleration case, which states that \(R_t^*=\alpha R_t/\left(1+\beta\right)\) with \(\alpha=\left(1-\beta+l/L\right)/\left(1+l/L\right)\), where \(l\) is the laser length which is generally proportional to \(\eta \lambda_p\), and \(S=R_t/R_t^*\). However, higher values of \(\gamma\) are of interest for maximum speedup, as shown below.
For intense lasers (\(a\sim 1\)) typically used for acceleration, the energy gain is limited by dephasing [9], which occurs over a scale length \(L_d \sim \lambda_p^3/2\lambda^2\). Acceleration is compromised beyond \(L_d\) and in practice, the plasma length is proportional to the dephasing length, i.e. \(L= \xi L_d\). In most cases, \(\gamma_w^2>>1\), which allows the approximations \(\beta_w\approx1-\lambda^2/2\lambda_p^2\), and \(L=\xi \lambda_p^3/2\lambda^2\approx \xi \gamma_w^2 \lambda_p/2>>\eta \lambda_p\), so that Eq.(48) becomes
For low values of \(\gamma\), i.e. when \(\gamma<<\gamma_w\), Eq.(49) reduces to
Conversely, if \(\gamma\rightarrow\infty\), Eq.(49) becomes
Finally, in the frame of the wake, i.e. when \(\gamma=\gamma_w\), assuming that \(\beta_w\approx1\), Eq.(49) gives
Since \(\eta\) and \(\xi\) are of order unity, and the practical regimes of most interest satisfy \(\gamma_w^2>>1\), the speedup that is obtained by using the frame of the wake will be near the maximum obtainable value given by Eq.(51).
Note that without the use of a moving window, the relativistic effects that are at play in the time domain would also be at play in the spatial domain [1], and the \(\gamma^2\) scaling would transform to \(\gamma^4\). Hence, it is important to use a moving window even in simulations in a Lorentz boosted frame. For very high values of the boosted-frame \(\gamma\), the optimal velocity of the moving window may vanish (i.e. no moving window) or even reverse.
Numerical Stability and alternate formulation in a Galilean frame
The numerical Cherenkov instability (NCI) [10] is the most serious numerical instability affecting multidimensional PIC simulations of relativistic particle beams and streaming plasmas [11, 12, 13, 14, 15, 16]. It arises from coupling between possibly numerically distorted electromagnetic modes and spurious beam modes, the latter due to the mismatch between the Lagrangian treatment of particles and the Eulerian treatment of fields [17].
In recent papers the electromagnetic dispersion relations for the numerical Cherenkov instability were derived and solved for both FDTD [15, 18] and PSATD [19, 20] algorithms.
Several solutions have been proposed to mitigate the NCI [19, 20, 21, 22, 23, 24]. Although these solutions efficiently reduce the numerical instability, they typically introduce either strong smoothing of the currents and fields, or arbitrary numerical corrections, which are tuned specifically against the NCI and go beyond the natural discretization of the underlying physical equation. Therefore, it is sometimes unclear to what extent these added corrections could impact the physics at stake for a given resolution.
For instance, NCI-specific corrections include periodically smoothing the electromagnetic field components [11], using a special time step [12, 13] or applying a wide-band smoothing of the current components [12, 13, 25]. Another set of mitigation methods involves scaling the deposited currents by a carefully-designed wavenumber-dependent factor [18, 20] or slightly modifying the ratio of electric and magnetic fields (\(E/B\)) before gathering their value onto the macroparticles [19, 22]. Yet another set of NCI-specific corrections [23, 24] consists in combining a small timestep \(\Delta t\), a sharp low-pass spatial filter, and a spectral or high-order scheme that is tuned so as to create a small, artificial “bump” in the dispersion relation [23]. While most mitigation methods have only been applied to Cartesian geometry, this last set of methods [23, 24] has the remarkable property that it can be applied [24] to both Cartesian geometry and quasi-cylindrical geometry (i.e. cylindrical geometry with azimuthal Fourier decomposition [26, 27, 28]). However, the use of a small timestep proportionally slows down the progress of the simulation, and the artificial “bump” is again an arbitrary correction that departs from the underlying physics.
A new scheme was recently proposed, in Kirchen et al. [29], Lehe et al. [30], which completely eliminates the NCI for a plasma drifting at a uniform relativistic velocity – with no arbitrary correction – by simply integrating the PIC equations in Galilean coordinates (also known as comoving coordinates). More precisely, in the new method, the Maxwell equations in Galilean coordinates are integrated analytically, using only natural hypotheses, within the PSATD framework (Pseudo-Spectral-Analytical-Time-Domain [4, 31]).
The idea of the proposed scheme is to perform a Galilean change of coordinates, and to carry out the simulation in the new coordinates:
\[\boldsymbol{x}' = \boldsymbol{x} - \boldsymbol{v}_{gal}\,t\]
where \(\boldsymbol{x} = x\,\boldsymbol{u}_x + y\,\boldsymbol{u}_y + z\,\boldsymbol{u}_z\) and \(\boldsymbol{x}' = x'\,\boldsymbol{u}_x + y'\,\boldsymbol{u}_y + z'\,\boldsymbol{u}_z\) are the position vectors in the standard and Galilean coordinates respectively.
When choosing \(\boldsymbol{v}_{gal}= \boldsymbol{v}_0\), where \(\boldsymbol{v}_0\) is the speed of the bulk of the relativistic plasma, the plasma does not move with respect to the grid in the Galilean coordinates \(\boldsymbol{x}'\) – or, equivalently, in the standard coordinates \(\boldsymbol{x}\), the grid moves along with the plasma. The heuristic intuition behind this scheme is that these coordinates should prevent the discrepancy between the Lagrangian and Eulerian point of view, which gives rise to the NCI [17].
An important remark is that the Galilean change of coordinates in Eq. (53) is a simple translation. Thus, when used in the context of Lorentz-boosted simulations, it does of course preserve the relativistic dilatation of space and time which gives rise to the characteristic computational speedup of the boosted-frame technique.
Another important remark is that the Galilean scheme is not equivalent to a moving window (and in fact the Galilean scheme can be independently combined with a moving window). Whereas in a moving window, gridpoints are added and removed so as to effectively translate the boundaries, in the Galilean scheme the gridpoints themselves are translated and the physical equations are modified accordingly. Most importantly, the assumed time evolution of the current \(\boldsymbol{J}\) within one timestep is different in a standard PSATD scheme with moving window and in a Galilean PSATD scheme [30].
In the Galilean coordinates \(\boldsymbol{x}'\), the equations of particle motion and the Maxwell equations take the form
where \(\boldsymbol{\nabla'}\) denotes a spatial derivative with respect to the Galilean coordinates \(\boldsymbol{x}'\).
Integrating these equations from \(t=n\Delta t\) to \(t=(n+1)\Delta t\) results in the following update equations (see Lehe et al. [30] for the details of the derivation):
where we used the short-hand notations \(\mathbf{\tilde{E}}^n \equiv \mathbf{\tilde{E}}(\boldsymbol{k}, n\Delta t)\), \(\mathbf{\tilde{B}}^n \equiv \mathbf{\tilde{B}}(\boldsymbol{k}, n\Delta t)\) as well as:
Note that, in the limit \(\boldsymbol{v}_{gal}=\boldsymbol{0}\), Eqs. (58) and (59) reduce to the standard PSATD equations [4], as expected. As shown in Kirchen et al. [29], Lehe et al. [30], the elimination of the NCI with the new Galilean integration is verified empirically via PIC simulations of uniform drifting plasmas and laser-driven plasma acceleration stages, and confirmed by a theoretical analysis of the instability.
J.-L. Vay. Noninvariance Of Space- And Time-Scale Ranges Under A Lorentz Transformation And The Implications For The Study Of Relativistic Interactions. Physical Review Letters, 98(13):130405/1–4, 2007.
J.-L. Vay, C. G. R. Geddes, E. Esarey, C. B. Schroeder, W. P. Leemans, E. Cormier-Michel, and D. P. Grote. Modeling Of 10 GeV-1 TeV Laser-Plasma Accelerators Using Lorentz Boosted Simulations. Physics of Plasmas, Dec 2011. doi:10.1063/1.3663841.
J. P. Boris and R. Lee. Nonphysical Self Forces In Some Electromagnetic Plasma-Simulation Algorithms. Journal of Computational Physics, 12(1):131–136, 1973.
I. Haber, R. Lee, H. H. Klein, and J. P. Boris. Advances In Electromagnetic Simulation Techniques. In Proc. Sixth Conf. Num. Sim. Plasmas, 46–48. Berkeley, Ca, 1973.
C. G. R. Geddes, D. L. Bruhwiler, J. R. Cary, W. B. Mori, J.-L. Vay, S. F. Martins, T. Katsouleas, E. Cormier-Michel, W. M. Fawley, C. Huang, X. Wang, B. Cowan, V. K. Decyk, E. Esarey, R. A. Fonseca, W. Lu, P. Messmer, P. Mullowney, K. Nakamura, K. Paul, G. R. Plateau, C. B. Schroeder, L. O. Silva, C. Toth, F. S. Tsung, M. Tzoufras, T. Antonsen, J. Vieira, and W. P. Leemans. Computational Studies And Optimization Of Wakefield Accelerators. In Journal of Physics: Conference Series, volume 125, 012002 (11 Pp.). 2008.
C. G. R. Geddes, E. Cormier-Michel, E. Esarey, C. B. Schroeder, and W. P. Leemans. Scaled Simulation Design Of High Quality Laser Wakefield Accelerator Stages. In Proc. Particle Accelerator Conference. Vancouver, Canada, 2009.
B. Cowan, D. Bruhwiler, E. Cormier-Michel, E. Esarey, C. G. R. Geddes, P. Messmer, and K. Paul. Laser Wakefield Simulation Using A Speed-Of-Light Frame Envelope Model. In Aip Conference Proceedings, volume 1086, 309–314. 2009.
E. Esarey, C. B. Schroeder, and W. P. Leemans. Physics Of Laser-Driven Plasma-Based Electron Accelerators. Rev. Mod. Phys., 81(3):1229–1285, 2009. doi:10.1103/Revmodphys.81.1229.
C. B. Schroeder, C. Benedetti, E. Esarey, and W. P. Leemans. Nonlinear Pulse Propagation And Phase Velocity Of Laser-Driven Plasma Waves. Physical Review Letters, 106(13):135002, Mar 2011. doi:10.1103/Physrevlett.106.135002.
B. B. Godfrey. Numerical Cherenkov Instabilities In Electromagnetic Particle Codes. Journal of Computational Physics, 15(4):504–521, 1974.
S. F. Martins, R. A. Fonseca, L. O. Silva, W. Lu, and W. B. Mori. Numerical Simulations Of Laser Wakefield Accelerators In Optimal Lorentz Frames. Computer Physics Communications, 181(5):869–875, May 2010. doi:10.1016/J.Cpc.2009.12.023.
J.-L. Vay, C. G. R. Geddes, C. Benedetti, D. L. Bruhwiler, E. Cormier-Michel, B. M. Cowan, J. R. Cary, and D. P. Grote. Modeling Laser Wakefield Accelerators In A Lorentz Boosted Frame. AIP Conference Proceedings, 1299(1):244–249, Nov 2010. URL: https://doi.org/10.1063/1.3520322, doi:10.1063/1.3520322.
J.-L. Vay, C. G. R. Geddes, E. Cormier-Michel, and D. P. Grote. Numerical Methods For Instability Mitigation In The Modeling Of Laser Wakefield Accelerators In A Lorentz-Boosted Frame. Journal of Computational Physics, 230(15):5908–5929, Jul 2011. doi:10.1016/J.Jcp.2011.04.003.
L. Sironi and A. Spitkovsky. No Title. 2011.
B. B. Godfrey and J.-L. Vay. Numerical stability of relativistic beam multidimensional PIC simulations employing the Esirkepov algorithm. Journal of Computational Physics, 248(0):33–46, 2013. URL: http://www.sciencedirect.com/science/article/pii/S0021999113002556, doi:10.1016/j.jcp.2013.04.006.
X. Xu, P. Yu, S. F. Martins, F. S. Tsung, V. K. Decyk, J. Vieira, R. A. Fonseca, W. Lu, L. O. Silva, and W. B. Mori. Numerical instability due to relativistic plasma drift in EM-PIC simulations. Computer Physics Communications, 184(11):2503–2514, 2013. URL: http://www.sciencedirect.com/science/article/pii/S0010465513002312, doi:10.1016/j.cpc.2013.07.003.
B. B. Godfrey. Canonical Momenta And Numerical Instabilities In Particle Codes. Journal of Computational Physics, 19(1):58–76, 1975.
B. B. Godfrey and J.-L. Vay. Suppressing the numerical Cherenkov instability in FDTD PIC codes. Journal of Computational Physics, 267:1–6, 2014.
B. B. Godfrey, J.-L. Vay, and I. Haber. Numerical stability analysis of the pseudo-spectral analytical time-domain PIC algorithm. Journal of Computational Physics, 258:689–704, 2014.
B. B. Godfrey, J.-L. Vay, and I. Haber. Numerical stability improvements for the pseudospectral EM PIC algorithm. IEEE Transactions on Plasma Science, 42(5):1339–1344, 2014.
B. B. Godfrey, J.-L. Vay, and I. Haber. Numerical stability analysis of the pseudo-spectral analytical time-domain PIC algorithm. Journal of Computational Physics, 258(0):689–704, 2014. URL: http://www.sciencedirect.com/science/article/pii/S0021999113007298, doi:10.1016/j.jcp.2013.10.053.
B. B. Godfrey and J.-L. Vay. Improved numerical Cherenkov instability suppression in the generalized PSTD PIC algorithm. Computer Physics Communications, 196:221–225, 2015.
P. Yu, X. Xu, V. K. Decyk, F. Fiuza, J. Vieira, F. S. Tsung, R. A. Fonseca, W. Lu, L. O. Silva, and W. B. Mori. Elimination of the numerical Cerenkov instability for spectral EM-PIC codes. Computer Physics Communications, 192:32–47, Jul 2015. doi:10.1016/j.cpc.2015.02.018.
P. Yu, X. Xu, A. Tableman, V. K. Decyk, F. S. Tsung, F. Fiuza, A. Davidson, J. Vieira, R. A. Fonseca, W. Lu, L. O. Silva, and W. B. Mori. Mitigation of numerical Cerenkov radiation and instability using a hybrid finite difference-FFT Maxwell solver and a local charge conserving current deposit. Computer Physics Communications, 197:144–152, Dec 2015. doi:10.1016/j.cpc.2015.08.026.
J.-L. Vay, C. G. R. Geddes, E. Cormier-Michel, and D. P. Grote. Effects of hyperbolic rotation in Minkowski space on the modeling of plasma accelerators in a Lorentz boosted frame. Physics of Plasmas, 18(3):030701, Mar 2011. URL: https://doi.org/10.1063/1.3559483, doi:10.1063/1.3559483.
A. F. Lifschitz, X. Davoine, E. Lefebvre, J. Faure, C. Rechatin, and V. Malka. Particle-in-Cell modelling of laser-plasma interaction using Fourier decomposition. Journal of Computational Physics, 228(5):1803–1814, 2009. URL: http://www.sciencedirect.com/science/article/pii/S0021999108005950, doi:http://dx.doi.org/10.1016/j.jcp.2008.11.017.
A. Davidson, A. Tableman, W. An, F. S. Tsung, W. Lu, J. Vieira, R. A. Fonseca, L. O. Silva, and W. B. Mori. Implementation of a hybrid particle code with a PIC description in r–z and a gridless description in ϕ into OSIRIS. Journal of Computational Physics, 281:1063–1077, 2015. doi:10.1016/j.jcp.2014.10.064.
R. Lehe, M. Kirchen, I. A. Andriyash, B. B. Godfrey, and J.-L. Vay. A spectral, quasi-cylindrical and dispersion-free Particle-In-Cell algorithm. Computer Physics Communications, 203:66–82, 2016. doi:10.1016/j.cpc.2016.02.007.
M. Kirchen, R. Lehe, B. B. Godfrey, I. Dornmair, S. Jalas, K. Peters, J.-L. Vay, and A. R. Maier. Stable discrete representation of relativistically drifting plasmas. Physics of Plasmas, 23(10):100704, Oct 2016. URL: https://doi.org/10.1063/1.4964770, doi:10.1063/1.4964770.
R. Lehe, M. Kirchen, B. B. Godfrey, A. R. Maier, and J.-L. Vay. Elimination of numerical Cherenkov instability in flowing-plasma particle-in-cell simulations by using Galilean coordinates. Phys. Rev. E, 94:053305, Nov 2016. URL: https://link.aps.org/doi/10.1103/PhysRevE.94.053305, doi:10.1103/PhysRevE.94.053305.
J.-L. Vay, I. Haber, and B. B. Godfrey. A domain decomposition method for pseudo-spectral electromagnetic simulations of plasmas. Journal of Computational Physics, 243:260–268, Jun 2013. doi:10.1016/j.jcp.2013.03.010.
Inputs and Outputs
Initialization of the plasma columns and drivers (laser or particle beam) is performed via the specification of multidimensional functions that describe the initial state with, if needed, a time dependence, or from reconstruction of distributions based on experimental data. Care is needed when initializing quantities in parallel to avoid double counting and ensure smoothness of the distributions at the interface of computational domains. When the sum of the initial distributions of charged particles is not charge neutral, initial fields are generally computed using a static approximation with Poisson solves accompanied by proper relativistic scalings [1, 2].
Outputs include dumps of particle and field quantities at regular intervals, histories of particle distribution moments, spectra, etc., and plots of the various quantities. In parallel simulations, the diagnostic subroutines need to handle additional complexity from the domain decomposition, as well as the large amounts of data that may necessitate some form of data reduction before saving to disk.
Simulations in a Lorentz boosted frame require additional considerations, as described below.
Inputs and outputs in a boosted frame simulation

(top) Snapshot of a particle beam showing “frozen” (grey spheres) and “active” (colored spheres) macroparticles traversing the injection plane (red rectangle). (bottom) Snapshot of the beam macroparticles (colored spheres) passing through the background of electrons (dark brown streamlines) and the diagnostic stations (red rectangles). The electrons, the injection plane and the diagnostic stations are fixed in the laboratory plane, and are thus counter-propagating to the beam in a boosted frame.
The input and output data are often known from, or compared to, experimental data. Thus, calculating in a frame other than the laboratory entails transformations of the data between the calculation frame and the laboratory frame. This section describes the procedures that have been implemented in the Particle-In-Cell framework Warp [3] to handle the input and output of data between the frame of calculation and the laboratory frame [4]. Simultaneity of events between two frames is valid only for a plane that is perpendicular to the relative motion of the frame. As a result, the input/output processes involve the input of data (particles or fields) through a plane, as well as output through a series of planes, all of which are perpendicular to the direction of the relative velocity between the frame of calculation and the other frame of choice.
Input in a boosted frame simulation
Particles -
Particles are launched through a plane using a technique that is generic and applies to Lorentz boosted frame simulations in general, including plasma acceleration, and is illustrated using the case of a positively charged particle beam propagating through a background of cold electrons in an assumed continuous transverse focusing system, leading to a well-known growing transverse “electron cloud” instability [5]. In the laboratory frame, the electron background is initially at rest and a moving window is used to follow the beam progression. Traditionally, the beam macroparticles are initialized all at once in the window, while background electron macroparticles are created continuously in front of the beam on a plane that is perpendicular to the beam velocity. In a frame moving at some fraction of the beam velocity in the laboratory frame, the beam initial conditions at a given time in the calculation frame are generally unknown and one must initialize the beam differently. However, one can take advantage of the fact that the beam initial conditions are often known for a given plane in the laboratory, either directly, or via simple calculation or projection from the conditions at a given time in the laboratory frame. Given the position and velocity \(\{x,y,z,v_x,v_y,v_z\}\) for each beam macroparticle at time \(t=0\) for a beam moving at the average velocity \(v_b=\beta_b c\) (where \(c\) is the speed of light) in the laboratory, and using the standard synchronization (\(z=z'=0\) at \(t=t'=0\)) between the laboratory and the calculation frames, the procedure for transforming the beam quantities for injection in a boosted frame moving at velocity \(\beta c\) in the laboratory is as follows (the superscript \('\) relates to quantities known in the boosted frame while the superscript \(^*\) relates to quantities that are known at a given longitudinal position \(z^*\) but different times of arrival):
project positions at \(z^*=0\) assuming ballistic propagation
\[\begin{split}\begin{aligned} t^* &= \left(z-\bar{z}\right)/v_z \label{Eq:t*}\\ x^* &= x-v_x t^* \label{Eq:x*}\\ y^* &= y-v_y t^* \label{Eq:y*}\\ z^* &= 0 \label{Eq:z*}\end{aligned}\end{split}\]the velocity components being left unchanged,
apply Lorentz transformation from laboratory frame to boosted frame
\[\begin{split}\begin{aligned} t'^* &= -\gamma t^* \label{Eq:tp*}\\ x'^* &= x^* \label{Eq:xp*}\\ y'^* &= y^* \label{Eq:yp*}\\ z'^* &= \gamma\beta c t^* \label{Eq:zp*}\\ v'^*_x&=\frac{v_x^*}{\gamma\left(1-\beta \beta_b\right)} \label{Eq:vxp*}\\ v'^*_y&=\frac{v_y^*}{\gamma\left(1-\beta \beta_b\right)} \label{Eq:vyp*}\\ v'^*_z&=\frac{v_z^*-\beta c}{1-\beta \beta_b} \label{Eq:vzp*}\end{aligned}\end{split}\]where \(\gamma=1/\sqrt{1-\beta^2}\). With the knowledge of the time at which each beam macroparticle crosses the plane into consideration, one can inject each beam macroparticle in the simulation at the appropriate location and time.
synchronize macroparticles in boosted frame, obtaining their positions at a fixed \(t'=0\) (before any particle is injected)
\[\begin{aligned} z' &= z'^*-\bar{v}'^*_z t'^* \label{Eq:zp}\end{aligned}\]This additional step is needed for setting the electrostatic or electromagnetic fields at the plane of injection. In a Particle-In-Cell code, the three-dimensional fields are calculated by solving the Maxwell equations (or static approximation like Poisson, Darwin or other [1]) on a grid on which the source term is obtained from the macroparticles distribution. This requires generation of a three-dimensional representation of the beam distribution of macroparticles at a given time before they cross the injection plane at \(z'^*\). This is accomplished by expanding the beam distribution longitudinally such that all macroparticles (so far known at different times of arrival at the injection plane) are synchronized to the same time in the boosted frame. To keep the beam shape constant, the particles are “frozen” until they cross that plane: the three velocity components and the two position components perpendicular to the boosted frame velocity are kept constant, while the remaining position component is advanced at the average beam velocity. As particles cross the plane of injection, they become regular “active” particles with full 6-D dynamics.
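As an illustration, a minimal sketch of these three steps for a single macroparticle is given below. It is plain Python following the equations above, not WarpX code; the mean quantities \(\bar{z}\) and \(\bar{v}'^*_z\) are assumed to have been computed over the beam in a prior pass.

```python
import math

def inject_beam_particle(x, y, z, vx, vy, vz, zbar, beta, beta_b, vzp_star_mean, c=299792458.0):
    """Sketch of the three-step transformation described above (illustrative, not WarpX code).

    (x, y, z, vx, vy, vz): macroparticle position/velocity in the laboratory at t = 0.
    zbar:           mean longitudinal beam position in the laboratory at t = 0.
    beta:           boosted-frame velocity / c.
    beta_b:         mean beam velocity / c in the laboratory.
    vzp_star_mean:  mean of v'^*_z over the beam (computed in a prior pass).
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)

    # 1) project to the injection plane z* = 0, assuming ballistic propagation
    t_star = (z - zbar) / vz
    x_star, y_star = x - vx * t_star, y - vy * t_star

    # 2) Lorentz transformation from the laboratory frame to the boosted frame
    tp_star = -gamma * t_star
    zp_star = gamma * beta * c * t_star
    vxp = vx / (gamma * (1.0 - beta * beta_b))
    vyp = vy / (gamma * (1.0 - beta * beta_b))
    vzp = (vz - beta * c) / (1.0 - beta * beta_b)

    # 3) synchronize to a common t' = 0 ("frozen" particle position)
    zp = zp_star - vzp_star_mean * tp_star

    return x_star, y_star, zp, vxp, vyp, vzp, tp_star
```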
A snapshot of a beam that has passed partly through the injection plane is shown in Fig. 34 (top). As the frozen beam macroparticles pass through the injection plane (which moves opposite to the beam in the boosted frame), they are converted to “active” macroparticles. The charge or current density is accumulated from the active and the frozen particles, thus ensuring that the fields at the plane of injection are consistent.
Laser -
Similarly to the particle beam, the laser is injected through a plane perpendicular to the axis of propagation of the laser (by default \(z\)). The electric field \(E_\perp\) that is to be emitted is given by the formula
where \(E_0\) is the amplitude of the laser electric field, \(f\left(x,y,t\right)\) is the laser envelope, \(\omega\) is the laser frequency, \(\phi\left(x,y,\omega\right)\) is a phase function to account for focusing, defocusing or injection at an angle, and \(t\) is time. By default, the laser envelope is a three-dimensional gaussian of the form
where \(\sigma_x\), \(\sigma_y\) and \(\sigma_z\) are the dimensions of the laser pulse; or it can be defined arbitrarily by the user at runtime. If \(\phi\left(x,y,\omega\right)=0\), the laser is injected at a waist and parallel to the axis \(z\).
If, for convenience, the injection plane is moving at constant velocity \(\beta_s c\), the formula is modified to take the Doppler effect on frequency and amplitude into account and becomes
The injection of a laser of frequency \(\omega\) is considered for a simulation using a boosted frame moving at \(\beta c\) with respect to the laboratory. Assuming that the laser is injected at a plane that is fixed in the laboratory, and thus moving at \(\beta_s=-\beta\) in the boosted frame, the injection in the boosted frame is given by
since \(E'_0/E_0=\omega'/\omega=1/\left[\left(1+\beta\right)\gamma\right]\).
The electric field is then converted into currents that get injected via a 2D array of macro-particles, with a pair of one positive and one negative macro-particle for each array cell in the plane of injection, whose weights and motion are governed by \(E_\perp\left(x',y',t'\right)\). Injecting using these paired macroparticles offers the advantage of automatically including the longitudinal component that arises from emitting into a boosted frame, and of automatically verifying the discrete Gauss’ law thanks to the use of a charge-conserving (e.g. Esirkepov) current deposition scheme [6].
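For reference, the frequency and amplitude rescaling quoted above can be evaluated with a few lines of code (a hedged illustration; the function name and interface are made up for this sketch):

```python
import math

def boosted_laser_scaling(omega_lab, E0_lab, beta):
    """Laser frequency and amplitude as seen in a frame boosted at velocity beta*c
    along the propagation axis, using omega'/omega = E0'/E0 = 1/[(1+beta)*gamma]."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    scale = 1.0 / ((1.0 + beta) * gamma)
    return omega_lab * scale, E0_lab * scale

# Example: a boost with gamma = 10 reduces the laser frequency by roughly 20x.
print(boosted_laser_scaling(omega_lab=2.35e15, E0_lab=1.0e12, beta=math.sqrt(1 - 1 / 100)))
```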
Output in a boosted frame simulation
Some quantities, e.g. charge or dimensions perpendicular to the boost velocity, are Lorentz invariant. Those quantities are thus readily available from standard diagnostics in the boosted frame calculations. Quantities that do not fall in this category are recorded at a number of regularly spaced “stations”, immobile in the laboratory frame, at a succession of time intervals to record data history, or averaged over time. A visual example is given on Fig. 34 (bottom). Since the space-time locations of the diagnostic grids in the laboratory frame generally do not coincide with the space-time positions of the macroparticles and grid nodes used for the calculation in a boosted frame, some interpolation is performed at runtime during the data collection process. As a complement or an alternative, selected particle or field quantities can be dumped at regular intervals and quantities are reconstructed in the laboratory frame during a post-processing phase. The choice of the methods depends on the requirements of the diagnostics and particular implementations.
J.-L. Vay. Simulation Of Beams Or Plasmas Crossing At Relativistic Velocity. Physics of Plasmas, 15(5):56701, May 2008. doi:10.1063/1.2837054.
B. M. Cowan, D. L. Bruhwiler, J. R. Cary, E. Cormier-Michel, and C. G. R. Geddes. Generalized algorithm for control of numerical dispersion in explicit time-domain electromagnetic simulations. Physical Review Special Topics-Accelerators And Beams, Apr 2013. doi:10.1103/PhysRevSTAB.16.041303.
D. P. Grote, A. Friedman, J.-L. Vay, and I. Haber. The Warp Code: Modeling High Intensity Ion Beams. In Aip Conference Proceedings, number 749, 55–58. 2005.
J.-L. Vay, C. G. R. Geddes, E. Esarey, C. B. Schroeder, W. P. Leemans, E. Cormier-Michel, and D. P. Grote. Modeling Of 10 GeV-1 TeV Laser-Plasma Accelerators Using Lorentz Boosted Simulations. Physics of Plasmas, Dec 2011. doi:10.1063/1.3663841.
J.-L. Vay. Noninvariance Of Space- And Time-Scale Ranges Under A Lorentz Transformation And The Implications For The Study Of Relativistic Interactions. Physical Review Letters, 98(13):130405/1–4, 2007.
T. Z. Esirkepov. Exact Charge Conservation Scheme For Particle-In-Cell Simulation With An Arbitrary Form-Factor. Computer Physics Communications, 135(2):144–153, Apr 2001.
Collisions
WarpX includes several different models to capture collisional processes including collisions between kinetic particles (Coulomb collisions, DSMC, nuclear fusion) as well as collisions between kinetic particles and a fixed (i.e. non-evolving) background species (MCC, background stopping).
Background Monte Carlo Collisions (MCC)
Several types of collisions between simulation particles and a neutral background gas are supported including elastic scattering, back scattering, charge exchange, excitation collisions and impact ionization.
The so-called null collision strategy is used in order to minimize the computational burden of the MCC module. This strategy is standard in PIC-MCC and a detailed description can be found elsewhere, for example in Birdsall [1]. In short, the maximum collision probability is found over a sensible range of energies and is used to pre-select the appropriate number of macroparticles for collision consideration. Only these pre-selected particles are then individually considered for a collision based on their energy and the cross-sections of all the different collisional processes included.
The MCC implementation assumes that the background neutral particles are thermal, and are moving at non-relativistic velocities in the lab frame. For each simulation particle considered for a collision, a velocity vector for a neutral particle is randomly chosen given the user-specified neutral temperature. The particle velocity is then boosted to the stationary frame of the neutral through a Galilean transformation. The energy of the collision is calculated using the particle utility function ParticleUtils::getCollisionEnergy(), as
\[\begin{split}\begin{aligned} E_{coll} &= \sqrt{(\gamma mc^2 + Mc^2)^2 - (mu)^2} - (mc^2 + Mc^2) \\ &= \frac{2Mmu^2}{M + m + \sqrt{M^2+m^2+2\gamma mM}}\frac{1}{\gamma + 1} \end{aligned}\end{split}\]
where \(u\) is the speed of the particle as tracked in WarpX (i.e. \(u = \gamma v\) with \(v\) the particle speed), while \(m\) and \(M\) are the rest masses of the simulation and background species, respectively. The Lorentz factor is defined in the usual way, \(\gamma \equiv \sqrt{1 + u^2/c^2}\). Note that if \(\gamma\to1\) the above expression reduces to the classical equation \(E_{coll} = \frac{1}{2}\frac{Mm}{M+m} u^2\). The collision cross-sections for all scattering processes are evaluated at the energy as calculated above.
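A standalone transcription of this formula, with the classical limit as a quick sanity check, might look as follows (illustrative Python only, not the ParticleUtils::getCollisionEnergy() source; the numerical masses are arbitrary example values):

```python
import math

def collision_energy(u, m, M, c=299792458.0):
    """Relativistic collision energy for a projectile with momentum-per-mass u = gamma*v
    hitting a target of rest mass M at rest (transcription of the formula above)."""
    gamma = math.sqrt(1.0 + (u / c)**2)
    return (2.0 * M * m * u**2
            / (M + m + math.sqrt(M**2 + m**2 + 2.0 * gamma * m * M))
            / (gamma + 1.0))

# Sanity check: for u << c this approaches the classical reduced-mass expression.
m_e, m_Ar = 9.109e-31, 6.63e-26   # kg (electron, argon) -- illustrative values
u = 1.0e5                          # m/s, non-relativistic
E_classical = 0.5 * m_e * m_Ar / (m_e + m_Ar) * u**2
print(collision_energy(u, m_e, m_Ar), E_classical)
```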
Once a particle is selected for a specific collision process, that process determines how the particle is scattered as outlined below.
Direct Simulation Monte Carlo (DSMC)
The algorithm by which binary collisions are treated is outlined below; a schematic sketch of the per-cell pairing step follows the list. The description assumes collisions between different species.
Particles from both species are sorted by grid-cells.
The order of the particles in each cell is shuffled.
Within each cell, particles are paired to form collision partners. Particles of the species with fewer members in a given cell are split in half so that each particle has exactly one partner of the other species.
Each collision pair is considered for a collision using the same logic as in the MCC description above.
Particles that are chosen for collision are scattered according to the selected collision process.
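A schematic sketch of the shuffling and pairing steps within one cell is given below (illustrative Python; the macroparticle-weight splitting and the collision test itself are omitted, and the data layout is an assumption):

```python
import random
from itertools import cycle

def pair_particles_in_cell(species_a, species_b, rng=random):
    """Illustrative per-cell pairing: shuffle both lists, then give every particle of
    the larger population exactly one partner from the (repeated) smaller population.
    The macroparticle-weight splitting mentioned above is not shown here."""
    rng.shuffle(species_a)
    rng.shuffle(species_b)
    small, large = sorted((species_a, species_b), key=len)
    return list(zip(cycle(small), large))

# Example: 3 ions and 5 neutrals in one cell -> 5 collision pairs.
pairs = pair_particles_in_cell(["i1", "i2", "i3"], ["n1", "n2", "n3", "n4", "n5"])
print(pairs)
```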
Scattering processes
Charge exchange
This process can occur when an ion and a neutral (of the same species) collide and results in the exchange of an electron. The ion velocity is simply replaced with the neutral velocity and vice versa.
Elastic scattering
The elastic option uses isotropic scattering, i.e., with a differential cross section that is independent of angle.
This scattering process, as well as the ones below that relate to it, is performed in the center-of-momentum (COM) frame. Designating the COM velocity of the particle as \(\vec{u}_c\) and its lab-frame velocity as \(\vec{u}_l\), the transformation from the lab frame to the COM frame is done with a general Lorentz boost (see function ParticleUtils::doLorentzTransform()):
\[\begin{split}\begin{bmatrix} \gamma_c c \\ u_{cx} \\ u_{cy} \\ u_{cz} \end{bmatrix} = \begin{bmatrix} \gamma & -\gamma\beta_x & -\gamma\beta_y & -\gamma\beta_z \\ -\gamma\beta_x & 1+(\gamma-1)\frac{\beta_x^2}{\beta^2} & (\gamma-1)\frac{\beta_x\beta_y}{\beta^2} & (\gamma-1)\frac{\beta_x\beta_z}{\beta^2} \\ -\gamma\beta_y & (\gamma-1)\frac{\beta_x\beta_y}{\beta^2} & 1 +(\gamma-1)\frac{\beta_y^2}{\beta^2} & (\gamma-1)\frac{\beta_y\beta_z}{\beta^2} \\ -\gamma\beta_z & (\gamma-1)\frac{\beta_x\beta_z}{\beta^2} & (\gamma-1)\frac{\beta_y\beta_z}{\beta^2} & 1+(\gamma-1)\frac{\beta_z^2}{\beta^2} \\ \end{bmatrix} \begin{bmatrix} \gamma_l c \\ u_{lx} \\ u_{ly} \\ u_{lz} \end{bmatrix}\end{split}\]
where \(\gamma\) is the Lorentz factor of the relative speed between the lab frame and the COM frame, \(\beta_i = v^{COM}_i/c\) is the i’th component of the relative velocity between the lab frame and the COM frame with
\[\vec{v}^{COM} = \frac{m \vec{u_c}}{\gamma_u m + M}\]
The particle velocity in the COM frame is then isotropically scattered using the function ParticleUtils::RandomizeVelocity(). After the direction of the velocity vector has been appropriately changed, it is transformed back to the lab frame with the reversed Lorentz transform as was done above, followed by the reverse Galilean transformation using the starting neutral velocity.
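A minimal numpy transcription of this boost matrix might look as follows (an illustrative sketch, not the actual ParticleUtils::doLorentzTransform() implementation):

```python
import numpy as np

def lorentz_boost(gamma_l, u_l, v_com, c=299792458.0):
    """Boost the four-vector (gamma_l*c, u_l) from the lab frame to the COM frame
    using the matrix above. u_l is the lab-frame momentum-per-mass (gamma*v) and
    v_com the relative velocity between the two frames."""
    beta = np.asarray(v_com, dtype=float) / c
    b2 = beta @ beta
    if b2 == 0.0:                      # frames coincide; nothing to do
        return gamma_l, np.asarray(u_l, dtype=float)
    gamma = 1.0 / np.sqrt(1.0 - b2)

    # Assemble the 4x4 boost matrix shown above.
    L = np.empty((4, 4))
    L[0, 0] = gamma
    L[0, 1:] = L[1:, 0] = -gamma * beta
    L[1:, 1:] = np.eye(3) + (gamma - 1.0) * np.outer(beta, beta) / b2

    boosted = L @ np.concatenate(([gamma_l * c], u_l))
    return boosted[0] / c, boosted[1:]   # (gamma_c, u_c)
```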
Back scattering
The process is the same as for elastic scattering above except that the scattering angle is fixed at \(\pi\), meaning the particle velocity in the COM frame is updated to \(-\vec{u}_c\).
Excitation
The process is also the same as for elastic scattering except the excitation energy cost is subtracted from the particle energy. This is done by reducing the velocity before a scattering angle is chosen.
Benchmarks
See the MCC example for a benchmark of the MCC implementation against literature results.
Particle cooling due to elastic collisions
It is straightforward to determine the energy a projectile loses during an elastic collision with another body, as a function of scattering angle, through energy and momentum conservation. See for example Lim [2] for a derivation. The result is that given a projectile with mass \(m\), a target with mass \(M\), a scattering angle \(\theta\), and collision energy \(E\), the post-collision energy of the projectile is given by
\[\begin{split}\begin{aligned} E_{final} = E - &[(E + mc^2)\sin^2\theta + Mc^2 - \cos(\theta)\sqrt{M^2c^4 - m^2c^4\sin^2\theta}] \\ &\times \frac{E(E+2mc^2)}{(E+mc^2+Mc^2)^2 - E(E+2mc^2)\cos^2\theta} \end{aligned}\end{split}\]
The impact of incorporating relativistic effects in the MCC routine can be seen in the plots below where high energy collisions are considered with both a classical and relativistic implementation of MCC. It is observed that the classical version of MCC reproduces the classical limit of the above equation but especially for ions, this result differs substantially from the fully relativistic result.
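For quick checks, the expression above can be transcribed directly (an illustrative Python snippet, not WarpX source code):

```python
import math

def post_collision_energy(E, theta, m, M, c=299792458.0):
    """Projectile energy after an elastic collision, transcribing the formula above.
    E is the collision energy (J), theta the scattering angle, m and M the
    projectile and target rest masses."""
    mc2, Mc2 = m * c**2, M * c**2
    num = ((E + mc2) * math.sin(theta)**2 + Mc2
           - math.cos(theta) * math.sqrt(Mc2**2 - mc2**2 * math.sin(theta)**2))
    frac = E * (E + 2.0 * mc2) / ((E + mc2 + Mc2)**2 - E * (E + 2.0 * mc2) * math.cos(theta)**2)
    return E - num * frac
```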

C. K. Birdsall. Particle-in-cell charged-particle simulations, plus Monte Carlo collisions with neutral atoms, PIC-MCC. IEEE Transactions on Plasma Science, 19(2):65–85, 1991. doi:10.1109/27.106800.
C.-H. Lim. The interaction of energetic charged particles with gas and boundaries in the particle simulation of plasmas. 2007. URL: https://search.library.berkeley.edu/permalink/01UCS_BER/s4lks2/cdi_proquest_miscellaneous_35689087.
Kinetic-fluid Hybrid Model
Many problems in plasma physics fall in a class where both electron kinetics and electromagnetic waves do not play a critical role in the solution. Examples of such situations include the study of collisionless magnetic reconnection and instabilities driven by ion temperature anisotropy, to mention only two. For these kinds of problems the computational cost of resolving the electron dynamics can be avoided by modeling the electrons as a neutralizing fluid rather than kinetic particles. By further using Ohm’s law to compute the electric field rather than evolving it with the Maxwell-Faraday equation, light waves can be stepped over. The simulation resolution can then be set by the ion time and length scales (commonly the ion cyclotron period \(1/\Omega_i\) and ion skin depth \(l_i\), respectively), which can reduce the total simulation time drastically compared to a simulation that has to resolve the electron Debye length and CFL-condition based on the speed of light.
Many authors have described variations of the kinetic ion & fluid electron model, generally referred to as particle-fluid hybrid or just hybrid-PIC models. The implementation in WarpX is described in detail in Groenewald et al. [1]. This description follows mostly from that reference.
Model
The basic justification for the hybrid model is that the system to which it is applied is dominated by ion kinetics, with ions moving much slower than electrons and photons. In this scenario two critical approximations can be made: neutrality (\(n_e=n_i\)) can be assumed, and the Maxwell-Ampere equation can be simplified by neglecting the displacement current term [2], giving,
\[\mu_0\vec{J} = \vec{\nabla}\times\vec{B},\]
where \(\vec{J} = \sum_{s\neq e}\vec{J}_s + \vec{J}_e + \vec{J}_{ext}\) is the total electrical current, i.e. the sum of electron and ion currents as well as any external current (not captured through plasma particles). Since ions are treated in the regular PIC manner, the ion current, \(\sum_{s\neq e}\vec{J}_s\), is known during a simulation. Therefore, given the magnetic field, the electron current can be calculated.
The electron momentum transport equation (obtained from multiplying the Vlasov equation by mass and integrating over velocity), also called the generalized Ohm’s law, is given by:
\[en_e\vec{E} = \frac{m}{e}\frac{\partial \vec{J}_e}{\partial t} + \frac{m}{e^2}\left( \vec{U}_e\cdot\nabla \right) \vec{J}_e - \nabla\cdot {\overleftrightarrow P}_e - \vec{J}_e\times\vec{B}+\vec{R}_e\]
where \(\vec{U}_e = \vec{J}_e/(en_e)\) is the electron fluid velocity, \({\overleftrightarrow P}_e\) is the electron pressure tensor and \(\vec{R}_e\) is the drag force due to collisions between electrons and ions. Applying the above momentum equation to the Maxwell-Faraday equation (\(\frac{\partial\vec{B}}{\partial t} = -\nabla\times\vec{E}\)) and substituting in \(\vec{J}\) calculated from the Maxwell-Ampere equation, gives,
\[\frac{\partial\vec{J}_e}{\partial t} = -\frac{1}{\mu_0}\nabla\times\left(\nabla\times\vec{E}\right) - \frac{\partial\vec{J}_{ext}}{\partial t} - \sum_{s\neq e}\frac{\partial\vec{J}_s}{\partial t}.\]
Plugging this back into the generalized Ohm’s law gives:
\[\begin{split}\left(en_e +\frac{m}{e\mu_0}\nabla\times\nabla\times\right)\vec{E} =& - \frac{m}{e}\left( \frac{\partial\vec{J}_{ext}}{\partial t} + \sum_{s\neq e}\frac{\partial\vec{J}_s}{\partial t} \right) \\ &+ \frac{m}{e^2}\left( \vec{U}_e\cdot\nabla \right) \vec{J}_e - \nabla\cdot {\overleftrightarrow P}_e - \vec{J}_e\times\vec{B}+\vec{R}_e.\end{split}\]
If we now further assume electrons are inertialess (i.e. \(m=0\)), the above equation simplifies to,
\[en_e\vec{E} = -\vec{J}_e\times\vec{B}-\nabla\cdot{\overleftrightarrow P}_e+\vec{R}_e.\]
Making the further simplifying assumptions that the electron pressure is isotropic and that the electron drag term can be written using a simple resistivity (\(\eta\)) and hyper-resistivity (\(\eta_h\)) i.e. \(\vec{R}_e = en_e(\eta-\eta_h \nabla^2)\vec{J}\), brings us to the implemented form of Ohm’s law:
\[\vec{E} = -\frac{1}{en_e}\left( \vec{J}_e\times\vec{B} + \nabla P_e \right)+\eta\vec{J}-\eta_h \nabla^2\vec{J}.\]
Lastly, if an electron temperature is given from which the electron pressure can be calculated, the model is fully constrained and can be evolved given initial conditions.
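Schematically, evaluating the E-field from this Ohm’s law amounts to combining a handful of precomputed arrays, as in the sketch below (illustrative only; the actual WarpX implementation works on staggered AMReX grids and computes the derivative terms with finite-difference stencils):

```python
import numpy as np

def ohms_law_E(n_e, J_e, B, grad_Pe, J, lap_J, eta, eta_h, e=1.602176634e-19):
    """Sketch of E = -(J_e x B + grad(P_e))/(e*n_e) + eta*J - eta_h*laplacian(J).
    Vector fields have shape (..., 3); n_e has shape (...,). The derivative terms
    grad_Pe and lap_J are assumed to be precomputed elsewhere."""
    return (-(np.cross(J_e, B) + grad_Pe) / (e * n_e)[..., None]
            + eta * J - eta_h * lap_J)
```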
Implementation details
Note
Various verification tests of the hybrid model implementation can be found in the examples section.
The kinetic-fluid hybrid extension mostly uses the same routines as the standard electromagnetic PIC algorithm, with the only exception that the E-field is calculated from the above equation rather than being updated from the full Maxwell-Ampere equation. The function WarpX::HybridPICEvolveFields() handles the logic to update the E&M fields when the “hybridPIC” model is used. This function is executed after particle pushing and deposition (charge and current density) has been completed. Therefore, based on the usual time-staggering in the PIC algorithm, when HybridPICEvolveFields() is called at timestep \(t=t_n\), the quantities \(\rho^n\), \(\rho^{n+1}\), \(\vec{J}_i^{n-1/2}\) and \(\vec{J}_i^{n+1/2}\) are all known.
Field update
The field update is done in three steps as described below.
First half step
First, the E-field at \(t=t_n\) is calculated, for which the current density needs to be interpolated to the correct time using \(\vec{J}_i^n = 1/2(\vec{J}_i^{n-1/2}+ \vec{J}_i^{n+1/2})\). The electron pressure is simply calculated using \(\rho^n\), and the B-field is already known at the correct time since it was calculated for \(t=t_n\) at the end of the last step. Once \(\vec{E}^n\) is calculated, it is used to push \(\vec{B}^n\) forward in time (using the Maxwell-Faraday equation) to \(\vec{B}^{n+1/2}\).
Second half step
Next, the E-field is recalculated to get \(\vec{E}^{n+1/2}\). This is done using the known fields \(\vec{B}^{n+1/2}\), \(\vec{J}_i^{n+1/2}\) and the interpolated charge density \(\rho^{n+1/2}=1/2(\rho^n+\rho^{n+1})\) (which is also used to calculate the electron pressure). As before, the B-field is then pushed forward to get \(\vec{B}^{n+1}\) using the newly calculated \(\vec{E}^{n+1/2}\) field.
Extrapolation step
Obtaining the E-field at timestep \(t=t_{n+1}\) is a well-documented issue for the hybrid model. Currently the approach in WarpX is to simply extrapolate \(\vec{J}_i\) forward in time, using
\[\vec{J}_i^{n+1} = \frac{3}{2}\vec{J}_i^{n+1/2} - \frac{1}{2}\vec{J}_i^{n-1/2}.\]
With this extrapolation all fields required to calculate \(\vec{E}^{n+1}\) are known and the simulation can proceed.
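The overall three-step sequence can be summarized in the following schematic outline (not the WarpX::HybridPICEvolveFields() source; calc_E and push_B stand in for the Ohm’s-law evaluation and the Maxwell-Faraday push):

```python
def hybrid_field_update(B, rho_n, rho_np1, Ji_nmhalf, Ji_phalf, dt, calc_E, push_B):
    """Schematic outline of the three-step hybrid field update described above."""
    # First half step: fields at t_n
    Ji_n = 0.5 * (Ji_nmhalf + Ji_phalf)
    E_n = calc_E(rho_n, Ji_n, B)                  # Ohm's law at t_n
    B_nphalf = push_B(B, E_n, 0.5 * dt)           # Maxwell-Faraday to t_{n+1/2}

    # Second half step: fields at t_{n+1/2}
    rho_nphalf = 0.5 * (rho_n + rho_np1)
    E_nphalf = calc_E(rho_nphalf, Ji_phalf, B_nphalf)
    B_np1 = push_B(B_nphalf, E_nphalf, 0.5 * dt)  # to t_{n+1}

    # Extrapolation step: estimate J_i at t_{n+1}, then get E^{n+1}
    Ji_np1 = 1.5 * Ji_phalf - 0.5 * Ji_nmhalf
    E_np1 = calc_E(rho_np1, Ji_np1, B_np1)
    return E_np1, B_np1
```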
Sub-stepping
It is also well known that hybrid PIC routines require the B-field to be updated with a smaller timestep than is needed for the particles. A 4th-order Runge-Kutta scheme is used to update the B-field, and this RK scheme is repeated a number of times during each half step outlined above. The number of sub-steps can be specified by the user through a runtime simulation parameter (see the input parameters section).
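As an illustration of this sub-stepping structure (a sketch only, not the WarpX implementation; all names below are hypothetical), one half step of size \(\Delta t/2\) can be advanced in n_sub classical RK4 stages, with the right-hand side \(\partial\vec{B}/\partial t = -\nabla\times\vec{E}\) supplied as a callback:

#include <cstddef>
#include <functional>
#include <vector>

using Field = std::vector<double>;  // stand-in for a discretized B-field component

// Advance B over an interval dt_half using n_sub classical RK4 sub-steps.
// rhs(B) must return dB/dt = -curl(E[B]), with E obtained from Ohm's law.
void substep_rk4 (Field& B, double dt_half, int n_sub,
                  const std::function<Field(const Field&)>& rhs)
{
    const double h = dt_half / n_sub;
    auto axpy = [] (const Field& x, double a, const Field& y) {
        Field r(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) { r[i] = x[i] + a*y[i]; }
        return r;
    };
    for (int n = 0; n < n_sub; ++n) {
        const Field k1 = rhs(B);
        const Field k2 = rhs(axpy(B, 0.5*h, k1));
        const Field k3 = rhs(axpy(B, 0.5*h, k2));
        const Field k4 = rhs(axpy(B, h, k3));
        for (std::size_t i = 0; i < B.size(); ++i) {
            B[i] += h/6.0 * (k1[i] + 2.0*k2[i] + 2.0*k3[i] + k4[i]);
        }
    }
}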
Electron pressure
The electron pressure is assumed to be a scalar quantity and calculated using the given input parameters, \(T_{e0}\), \(n_0\) and \(\gamma\) using
\[P_e = n_0T_{e0}\left( \frac{n_e}{n_0} \right)^\gamma.\]
The isothermal limit is given by \(\gamma = 1\) while \(\gamma = 5/3\) (default) produces the adiabatic limit.
Electron current
WarpX’s displacement current diagnostic can be used to output the electron current in the kinetic-fluid hybrid model: in the absence of kinetic electrons, and under the assumption of zero displacement current, that diagnostic simply evaluates the hybrid model’s electron current.
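Equivalently (restating the relations above rather than introducing anything new), with zero displacement current the diagnostic evaluates
\[\vec{J}_e = \frac{1}{\mu_0}\nabla\times\vec{B} - \vec{J}_{ext} - \sum_{s\neq e}\vec{J}_s,\]
which is precisely the electron current used by the hybrid model.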
R. E. Groenewald, A. Veksler, F. Ceccherini, A. Necas, B. S. Nicks, D. C. Barnes, T. Tajima, and S. A. Dettrick. Accelerated kinetic model for global macro stability studies of high-beta fusion reactors. Physics of Plasmas, 30(12):122508, Dec 2023. doi:10.1063/5.0178288.
C. W. Nielson and H. R. Lewis. Particle-Code Models in the Nonradiative Limit. In J. Killeen, editor, Controlled Fusion, volume 16 of Methods in Computational Physics: Advances in Research and Applications, pages 367–388. Elsevier, 1976. doi:10.1016/B978-0-12-460816-0.50015-4.
Cold Relativistic Fluid Model
An alternative to representing the plasma as macroparticles is the cold relativistic fluid model. The cold relativistic fluid model is typically faster to compute than particles and is useful to replace particles when kinetic effects are negligible. This can be done for certain parts of the plasma, such as the background plasma, while still representing particle beams as groups of macroparticles. The two models then couple through Maxwell’s equations.
In the cold limit (zero internal temperature and pressure) of a relativistic plasma, the Maxwell-Fluid equations govern the plasma evolution: continuity and momentum equations are solved per species \(s\), the fields are updated via Maxwell’s equations, and the fluids are coupled to the fields through their charge and current densities, together with the particle quantities calculated by the PIC algorithm.
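For orientation, the standard cold relativistic fluid system for a species \(s\), written in terms of the number density \(N_s\) and the momentum per unit mass \(\vec{U}_s = \gamma_s\vec{V}_s\), reads (see the theory section for the exact form and discretization used in WarpX):
\[\frac{\partial N_s}{\partial t} + \nabla\cdot\left(N_s\vec{V}_s\right) = 0, \qquad \frac{\partial\left(N_s\vec{U}_s\right)}{\partial t} + \nabla\cdot\left(N_s\vec{V}_s\otimes\vec{U}_s\right) = \frac{q_s N_s}{m_s}\left(\vec{E} + \vec{V}_s\times\vec{B}\right),\]
with \(\vec{V}_s = \vec{U}_s/\gamma_s\) and \(\gamma_s = \sqrt{1 + |\vec{U}_s|^2/c^2}\); the fluid charge and current densities, \(\rho_s = q_s N_s\) and \(\vec{J}_s = q_s N_s \vec{V}_s\), are added to the particle contributions as sources in Maxwell’s equations.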
Implementation details
Fluid Loop embedded within the overall PIC loop.
The fluid timeloop is embedded inside the standard PIC timeloop and consists of the following steps:
1. Higuera and Cary push of the momentum
2. Non-inertial (momentum source) terms (only in cylindrical geometry)
3. Boundary conditions and MPI communications
4. MUSCL scheme for advection terms
5. Current and charge deposition
Fig. 35 gives a visual representation of these steps, and we describe each of them in more detail below.
- Step 0: Preparation
Before the fluid loop begins, it is assumed that the program is in the state where the fields \(\mathbf{E}\) and \(\mathbf{B}\) are available at the integer timestep. The fluids themselves are represented by arrays of fluid quantities (density and momentum density, \(\mathbf{Q} \equiv \{ N, NU_x, NU_y, NU_z \}\)) known on a nodal grid and at the half-integer timestep.
- Step 1: Higuera and Cary Push
The time staggering of the fields is used by the momentum source term, which is solved with a Higuera and Cary push [1]. We do not adopt spatial grid staggering: all discretized fluid quantities exist on the nodal grid. External fields can be included at this step.
- Step 2: Non-inertial Terms
In RZ, the divergence of the flux terms has additional non-zero elements outside of the derivatives. These terms are Strang split and are time integrated via equation 2.18 from Shu and Osher [2], which is the SSP-RK3 integrator.
- Step 3: Boundary Conditions and Communications
At this point, the code applies boundary conditions (assuming Neumann boundary conditions for the fluid quantities) and exchanges guard cells between MPI ranks, in preparation for the derivative terms in the next step.
- Step 4: Advective Push
For the advective term, a MUSCL scheme with low-diffusion minmod slope limiting is used. We further recast the conservative equations in terms of the primitive variables, \(\{ N, U_x, U_y, U_z \}\), which we found to be more stable than the conservative variables for the MUSCL reconstruction. Details of the scheme can be found in Van Leer [3].
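For orientation, a minimal sketch of the minmod limiter used in MUSCL-type reconstructions is shown below (the WarpX implementation additionally uses a simplified slope averaging, see the note below):

#include <cmath>

// minmod limiter: returns 0 when the one-sided slopes disagree in sign,
// otherwise the slope with the smaller magnitude. This suppresses spurious
// oscillations in the MUSCL reconstruction.
double minmod (double slope_left, double slope_right)
{
    if (slope_left * slope_right <= 0.0) { return 0.0; }
    return (std::abs(slope_left) < std::abs(slope_right)) ? slope_left : slope_right;
}

// Limited slope of a primitive variable q in cell i, from its neighbors.
double limited_slope (double q_im1, double q_i, double q_ip1)
{
    return minmod(q_i - q_im1, q_ip1 - q_i);
}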
- Step 5: Current and Charge Deposition
Once this series of steps is complete and the fluids have been evolved by an entire timestep, the current and charge is deposited onto the grid and added to the total current and charge densities.
Note
The algorithm is safe with zero fluid density.
It also implements a positivity limiter on the density to prevent negative density regions from forming.
There is currently no ability to perform azimuthal mode decomposition in RZ.
Mesh refinement is not supported for the fluids.
The implemented MUSCL scheme has a simplified slope averaging, see the extended writeup for details.
More details on the precise implementation are available here, WarpX_Cold_Rel_Fluids.pdf.
Warning
If using the fluid model with the Kinetic-Fluid Hybrid model or the electrostatic solver, there is a known issue that the fluids deposit at a half-timestep offset in the charge-density.
A. V. Higuera and J. R. Cary. Structure-preserving second-order integration of relativistic charged particle trajectories in electromagnetic fields. Physics of Plasmas, 24(5):052104, 2017. doi:10.1063/1.4979989.
C.-W. Shu and S. Osher. Efficient implementation of essentially non-oscillatory shock-capturing schemes. Journal of Computational Physics, 77(2):439–471, 1988. doi:10.1016/0021-9991(88)90177-5.
B. Van Leer. On the Relation between the Upwind-Differencing Schemes of Godunov, Engquist-Osher and Roe, pages 33–52. Springer Berlin Heidelberg, 1997. doi:10.1007/978-3-642-60543-7_3.
Development
Contribute to WarpX
We welcome new contributors! Here is how to participate in the WarpX development.
Git workflow
The WarpX project uses git for version control. If you are new to git, you can follow this tutorial.
Configure your GitHub Account & Development Machine
First, let’s set up your Git environment and GitHub account.
Go to https://github.com/settings/profile and add your real name and affiliation
Go to https://github.com/settings/emails and add & verify the professional e-mails you want to be associated with.
Configure git on the machine you develop on to use the same spelling of your name and email:
git config --global user.name "FIRSTNAME LASTNAME"
git config --global user.email EMAIL@EXAMPLE.com
Go to https://github.com/settings/keys and add the SSH public key of the machine you develop on. (Check out the GitHub guide to generating SSH keys or troubleshoot common SSH problems. )
Make your own fork
First, fork the WarpX “mainline” repo on GitHub by pressing the Fork button on the top right of the page. A fork is a copy of WarpX on GitHub, which is under your full control.
Then, we create local copies, for development:
# Clone the mainline WarpX source code to your local computer.
# You cannot write to this repository, but you can read from it.
git clone git@github.com:ECP-WarpX/WarpX.git
cd WarpX
# rename what we just cloned: call it "mainline"
git remote rename origin mainline
# Add your own fork. You can get this address on your fork's Github page.
# Here is where you will publish new developments, so that they can be
# reviewed and integrated into "mainline" later on.
# "myGithubUsername" needs to be replaced with your user name on GitHub.
git remote add myGithubUsername git@github.com:myGithubUsername/WarpX.git
Now you are free to play with your fork (for additional information, you can visit the Github fork help page).
Note
We only need to do the above steps for the first time.
Let’s Develop
You are all set! Now, the basic WarpX development workflow is:
Implement your changes and push them on a new branch branch_name on your fork.
Create a Pull Request from branch branch_name on your fork to branch development on the main WarpX repo.
Create a branch branch_name (the branch name should reflect the piece of code you want to add, like fix-spectral-solver) with
# start from an up-to-date development branch
git checkout development
git pull mainline development
# create a fresh branch
git checkout -b branch_name
and do the coding you want.
It is probably a good time to look at the AMReX documentation and at the Doxygen reference pages:
WarpX Doxygen: https://warpx.readthedocs.io/en/latest/_static/doxyhtml
AMReX Doxygen: https://amrex-codes.github.io/amrex/doxygen
PICSAR Doxygen: (todo)
Once you are done developing, add the files you created and/or modified to the git
staging area with
git add <file_I_created> <and_file_I_modified>
Build your changes
If you changed C++ files, then now is a good time to test those changes by compiling WarpX locally. Follow the developer instructions in our manual to set up a local development environment, then compile and run WarpX.
Commit & push your changes
Periodically commit your changes with
git commit
The commit message is super important in order to follow the developments during code-review and identify bugs. A typical format is:
This is a short, 40-character title
After a newline, you can write arbitrary paragraphs. You
usually limit the lines to 70 characters, but if you don't, then
nothing bad will happen.
The most important part is really that you find a descriptive title
and add an empty newline after it.
For the moment, commits are on your local repo only. You can push them to your fork with
git push -u myGithubUsername branch_name
If you want to synchronize your branch with the development
branch (this is useful when the development
branch is being modified while you are working on branch_name
), you can use
git pull mainline development
and fix any conflict that may occur.
Submit a Pull Request
A Pull Request (PR) is the way to efficiently visualize the changes you made and to propose your new feature/improvement/fix to the WarpX project.
Right after you push changes, a banner should appear on the GitHub page of your fork, with your <branch_name>.
Click on the compare & pull request button to prepare your PR.
It is time to communicate your changes: write a title and a description for your PR. People who review your PR are happy to know
what feature/fix you propose, and why
how you made it (added new/edited files, created a new class that inherits from…)
how you tested it and what output you got
and anything else relevant to your PR (attach images and scripts, link papers, etc.)
Press Create pull request. Now you can navigate through your PR, which highlights the changes you made.
Please DO NOT write large pull requests, as they are very difficult and time-consuming to review. As much as possible, split them into small, targeted PRs. For example, if you find typos in the documentation, open a pull request that only fixes typos. If you want to fix a bug, make a small pull request that only fixes that bug.
If you want to implement a feature and are not too sure how to split it, just open an issue about your plans and ping other WarpX developers on it to chime in. Generally, write helper functionality first, test it and then write implementation code. Submit tests, documentation changes and implementation of a feature together for pull request review.
Even before your work is ready to merge, it can be convenient to create a PR (so you can use GitHub tools to visualize your changes). In this case, please put the [WIP] tag (for Work-In-Progress) at the beginning of the PR title.
You can also use the GitHub project tab in your fork to organize the work into separate tasks/PRs and share it with the WarpX community to get feedback.
Include a test to your PR
A new feature is great, a working new feature is even better! Please test your code and add your test to the automated test suite. It’s the way to protect your work from adventurous developers. Instructions are given in the testing section of our developer’s documentation.
Include documentation about your PR
Now, let users know about your new feature by describing its usage in the WarpX documentation.
Our documentation uses Sphinx, and it is located in Docs/source/
.
For instance, if you introduce a new runtime parameter in the input file, you can add it to Docs/source/running_cpp/parameters.rst.
If Sphinx is installed on your computer, you should be able to generate the HTML documentation with
make html
in Docs/. Then open Docs/build/html/index.html with your favorite web browser and look for your changes.
Once your code is ready with documentation and automated test, congratulations! You can create the PR (or remove the [WIP] tag if you already created it).
Reviewers will interact with you if they have comments/questions.
Style and conventions
For indentation, WarpX uses four spaces (no tabs)
Some text editors automatically modify the files you open. We recommend turning on the options that remove trailing spaces and replace tabs with 4 spaces.
The number of characters per line should be <100. Exception: in documentation files (.rst/.md) use one sentence per line, independent of its number of characters, which will allow easier edits.
Use a space before and after the assignment operator (=).
To define a function, use a space between the name of the function and the parenthesis, e.g., myfunction (). When calling a function, no space should be used, i.e., just use myfunction(). The reason this is beneficial is that when we do a git grep to search for myfunction (), we can clearly see the locations where myfunction () is defined and where myfunction() is called. Also, using git grep "myfunction ()" searches only the files in the git repo, which is more efficient compared to the grep "myfunction ()" command that searches through all the files in a directory, including plotfiles for example.
To define a class, use class on the same line as the name of the class, e.g., class MyClass. The reason this is beneficial is that when we do a git grep to search for class MyClass, we can clearly see the locations where class MyClass is defined and where MyClass is used.
When defining a function or class, make sure the starting { token appears on a new line.
Use curly braces for single statement blocks. For example:
for (int n = 0; n < 10; ++n) {
    amrex::Print() << "Like this!";
}

for (int n = 0; n < 10; ++n) { amrex::Print() << "Or like this!"; }
but not
for (int n = 0; n < 10; ++n) amrex::Print() << "Not like this.";

for (int n = 0; n < 10; ++n)
    amrex::Print() << "Nor like this.";
It is recommended that style changes are not included in the PR where new code is added. This is to avoid any errors that may be introduced in a PR just to make a style change.
WarpX uses the CamelCase convention for file names and class names, rather than snake_case.
The names of all member variables should be prefixed with m_. This is particularly useful to avoid capturing member variables by value in a lambda function, which causes the whole object to be copied to GPU when running on a GPU-accelerated architecture. This convention should be used for all new pieces of code, and it should be applied progressively to old code.
#include directives in C++ have a distinct order to avoid bugs, see the WarpX repo structure for details.
For all new code, we should avoid relying on using namespace amrex; and all amrex types should be prefixed with amrex::. Inside limited scopes, AMReX type literals can be included with using namespace amrex::literals;. Ideally, old code should be modified accordingly.
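As a small illustration of the last point (a sketch, not WarpX code), amrex types are spelled with the amrex:: prefix and floating-point literals use the _rt suffix from amrex::literals inside a limited scope:

#include <AMReX_REAL.H>

// Illustration only: prefix amrex types explicitly and use amrex::Real
// literals (the _rt suffix) inside a limited scope.
amrex::Real half_of (amrex::Real value)
{
    using namespace amrex::literals; // limited scope: enables 0.5_rt
    return 0.5_rt * value;
}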
Implementation Details
AMReX basics (excessively basic)
WarpX is built on the Adaptive Mesh Refinement (AMR) library AMReX. This section provides a very sporadic description of the main AMReX classes and concepts relevant for WarpX, which can serve as a reminder. Please read the AMReX basics doc page, from which this section is largely inspired.
amrex::Box: Dimension-dependent lower and upper indices defining a rectangular volume in 3D (or surface in 2D) in the index space. Box is a lightweight meta-data class, with useful member functions.
amrex::BoxArray: Collection of Box on a single AMR level. The information of which MPI rank owns which Box in a BoxArray is in DistributionMapping.
amrex::FArrayBox: Fortran-ordered array of floating-point amrex::Real elements defined on a Box. A FArrayBox can represent scalar data or vector data, with ncomp components.
amrex::MultiFab: Collection of FABs (= FArrayBox) on a single AMR level, distributed over MPI ranks. The concept of ghost cells is defined at the MultiFab level.
amrex::ParticleContainer: A collection of particles, typically for particles of a physical species. Particles in a ParticleContainer are organized per Box. Particles in a Box are organized per tile (this feature is off when running on GPU). Particles within a tile are stored in several structures, each being contiguous in memory: (i) a Struct-of-Array (SoA) for amrex::ParticleReal data such as positions, weight, momentum, etc., (ii) a Struct-of-Array (SoA) for int data, such as ionization levels, and (iii) a Struct-of-Array (SoA) for a uint64_t unique identifier index per particle (containing a 40-bit id and a 24-bit cpu sub-identifier, as assigned at particle creation time). This id is also used to check if a particle is active/valid or marked for removal.
The simulation domain is decomposed into several Box, and each MPI rank owns (and performs operations on) the fields and particles defined on a few of these Box, but has the metadata of all of them. For convenience, AMReX provides iterators to easily iterate over all FArrayBox (or even tile-by-tile, optionally) in a MultiFab owned by the MPI rank (MFIter), or over all particles in a ParticleContainer on a per-box basis (ParIter, or its derived class WarpXParIter). These are respectively done in loops like:
// mf is a pointer to MultiFab
for ( amrex::MFIter mfi(*mf, false); mfi.isValid(); ++mfi ) { ... }
and
// *this is a pointer to a ParticleContainer
for (WarpXParIter pti(*this, lev); pti.isValid(); ++pti) { ... }
When looping over FArrayBox in a MultiFab, the iterator provides functions to retrieve the metadata of the Box on which the FAB is defined (MFIter::box(), MFIter::tilebox() or variations) or the particles defined on this Box (ParIter::GetParticles()).
WarpX Structure
Repo Organization
All the WarpX source code is located in Source/
.
All sub-directories have a pretty straightforward name.
The PIC loop is part of the WarpX class, in function WarpX::Evolve
implemented in Source/WarpXEvolve.cpp
.
The core of the PIC loop (i.e., without diagnostics etc.) is in WarpX::OneStep_nosub
(when subcycling is OFF) or WarpX::OneStep_sub1
(when subcycling is ON, with method 1).
Here is a visual representation of the repository structure.
Code organization
The main WarpX class is WarpX, implemented in Source/WarpX.cpp
.
Build System
WarpX uses the CMake build system generator.
Each sub-folder contains a file CMakeLists.txt
with the names of the source files (.cpp
) that are added to the build.
Do not list header files (.H
) here.
For experienced developers, we also support AMReX's GNUmake build script collection. The file Make.package in each sub-folder has the same purpose as the CMakeLists.txt file; please add new .cpp files to both.
C++ Includes
All WarpX header files need to be specified relative to the Source/ directory, e.g.,
#include "Utils/WarpXConst.H"
Files in the same directory as the including header file can be included with
#include "FileName.H"
By default, in a MyName.cpp
source file we do not include headers already included in MyName.H
. Besides this exception, if a function or a class
is used in a source file, the header file containing its declaration must be included, unless the inclusion of a facade header is more appropriate. This is
sometimes the case for AMReX headers. For instance AMReX_GpuLaunch.H
is a façade header for AMReX_GpuLaunchFunctsC.H
and AMReX_GpuLaunchFunctsG.H
, which
contain respectively the CPU and the GPU implementation of some methods, and which should not be included directly.
Whenever possible, forward declarations headers are included instead of the actual headers, in order to save compilation time (see dedicated section below). In WarpX forward
declaration headers have the suffix *_fwd.H
, while in AMReX they have the suffix *Fwd.H
.
The include order (see PR #874 and PR #1947) and proper quotation marks are:
In a MyName.cpp file:
#include "MyName.H" (its header), then
(further) WarpX header files #include "...", then
WarpX forward declaration header files #include "..._fwd.H", then
AMReX header files #include <...>, then
AMReX forward declaration header files #include <...Fwd.H>, then
PICSAR header files #include <...>, then
other third party includes #include <...>, then
standard library includes, e.g. #include <vector>
In a MyName.H file:
#include "MyName_fwd.H" (the corresponding forward declaration header, if it exists), then
WarpX header files #include "...", then
WarpX forward declaration header files #include "..._fwd.H", then
AMReX header files #include <...>, then
AMReX forward declaration header files #include <...Fwd.H>, then
PICSAR header files #include <...>, then
other third party includes #include <...>, then
standard library includes, e.g. #include <vector>
Each of these groups of header files should ideally be sorted alphabetically, and a blank line should be placed between the groups.
For details why this is needed, please see PR #874, PR #1947, the LLVM guidelines, and include-what-you-use.
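Putting the ordering together, a hypothetical MyName.cpp could start as follows ("MyName.H" and "MyHelper_fwd.H" are placeholders; the other headers are real and only serve as examples):

// MyName.cpp -- hypothetical example of the include order described above
#include "MyName.H"               // own header first

#include "Utils/WarpXConst.H"     // (further) WarpX header files

#include "MyHelper_fwd.H"         // WarpX forward declaration headers

#include <AMReX_MultiFab.H>       // AMReX header files

#include <AMReX_BaseFwd.H>        // AMReX forward declaration headers

#include <vector>                 // standard library includes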
Forward Declaration Headers
Forward declarations can be used when a header file needs only to know that a given class exists, without any further detail (e.g., when only a pointer to an instance of
that class is used). Forward declaration headers are a convenient way to organize forward declarations. If a forward declaration is needed for a given class MyClass
, declared in MyClass.H
,
the forward declaration should appear in a header file named MyClass_fwd.H
, placed in the same folder containing MyClass.H
. As for regular header files, forward declaration headers must have
include guards. Below we provide a simple example:
MyClass_fwd.H
:
#ifndef MY_CLASS_FWD_H
#define MY_CLASS_FWD_H
class MyClass;
#endif // MY_CLASS_FWD_H
MyClass.H
:
#ifndef MY_CLASS_H
#define MY_CLASS_H
#include "MyClass_fwd.H"
#include "someHeader.H"
class MyClass {
public:
    void stuff ();
};
#endif // MY_CLASS_H
MyClass.cpp
:
#include "MyClass.H"
void MyClass::stuff () { /* stuff */ }
Usage: in SomeType.H
#ifndef SOMETYPE_H
#define SOMETYPE_H
#include "MyClass_fwd.H" // all info we need here
#include <memory>
struct SomeType {
std::unique_ptr<MyClass> p_my_class;
};
#endif // SOMETYPE_H
Usage: in somewhere.cpp
#include "SomeType.H"
#include "MyClass.H" // because we call "stuff()" we really need
// to know the full declaration of MyClass
void somewhere ()
{
SomeType s;
s.p_my_class = std::make_unique<MyClass>();
s.p_my_class->stuff();
}
All files that only need to know the type SomeType
from SomeType.H
but do not access the implementation details of MyClass
will benefit from improved compilation times.
Dimensionality
This section describes the handling of dimensionality in WarpX.
Build Options
Dimensions | CMake Option
---|---
3D3V | WarpX_DIMS=3
2D3V | WarpX_DIMS=2
1D3V | WarpX_DIMS=1
RZ | WarpX_DIMS=RZ
Note that one can build multiple WarpX dimensions at once via -DWarpX_DIMS="1;2;RZ;3"
.
See building from source for further details.
Defines
Depending on the build variant of WarpX, the following preprocessor macros will be set:
Macro | 3D3V | 2D3V | 1D3V | RZ
---|---|---|---|---
WARPX_DIM_3D | defined | undefined | undefined | undefined
WARPX_DIM_1D_Z | undefined | undefined | defined | undefined
WARPX_DIM_XZ | undefined | defined | undefined | undefined
WARPX_DIM_RZ | undefined | undefined | undefined | defined
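A typical use of these macros (a sketch; the index values follow the axis conventions listed below):

// Sketch: select the array index of the z direction at compile time,
// depending on which dimensionality macro is defined.
#if defined(WARPX_DIM_3D)
    static constexpr int zdir = 2;   // (x, y, z)
#elif defined(WARPX_DIM_XZ) || defined(WARPX_DIM_RZ)
    static constexpr int zdir = 1;   // (x, z) or (r, z)
#elif defined(WARPX_DIM_1D_Z)
    static constexpr int zdir = 0;   // (z)
#endif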
At the same time, the following conventions will apply:
Convention | 3D3V | 2D3V | 1D3V | RZ
---|---|---|---|---
Fields | | | |
AMReX Box dimensions | 3 | 2 | 1 | 2
WarpX axis labels | x, y, z | x, z | z | r, z
Particles | | | |
AMReX | | | |
WarpX position names | | | |
extra SoA attribute | | | | theta
Please see the following sections for particle SoA details.
Conventions
In 2D3V, we assume that the position of a particle in y
is equal to 0
.
In 1D3V, we assume that the position of a particle in x
and y
is equal to 0
.
Fields
Note
Add info on staggering and domain decomposition. Synchronize with section initialization
.
The main fields are the electric field Efield
, the magnetic field Bfield
, the current density current
and the charge density rho
. When a divergence-cleaner is used, we add another field F
(containing \(\vec \nabla \cdot \vec E - \rho\)).
Due to the AMR strategy used in WarpX (see section Theory: AMR for a complete description), each field on a given refinement level lev (except for the coarsest 0) is defined on:
the fine patch (suffix _fp, the actual resolution on lev).
the coarse patch (suffix _cp, same physical domain with the resolution of MR level lev-1).
the auxiliary grid (suffix _aux, same resolution as _fp), from which the fields are gathered from the grids to particle positions. For this reason, only E and B are defined on this _aux grid (not the current density or charge density).
In some conditions, i.e., when buffers are used for the field gather (for numerical reasons), a copy of E and B on the auxiliary grid _aux of the level below lev-1 is stored in fields with suffix _cax (for coarse aux).
As an example, the structures for the electric field are Efield_fp, Efield_cp, Efield_aux (and optionally Efield_cax).
Declaration
All the fields described above are public members of class WarpX
, defined in WarpX.H
. They are defined as an amrex::Vector
(over MR levels) of std::array
(for the 3 spatial components \(E_x\), \(E_y\), \(E_z\)) of std::unique_ptr
of amrex::MultiFab
, i.e.:
amrex::Vector<std::array< std::unique_ptr<amrex::MultiFab>, 3 > > Efield_fp;
Hence, Ex
on MR level lev
is a pointer to an amrex::MultiFab
. The other fields are organized in the same way.
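For example, a minimal sketch (assuming it appears inside a WarpX member function, where Efield_fp and the MR level lev are in scope):

// Access the x-component of the fine-patch E-field on MR level lev.
amrex::MultiFab& Ex = *Efield_fp[lev][0];        // component index: 0 = x, 1 = y, 2 = z
const int ncomp = Ex.nComp();                    // number of MultiFab components
const amrex::IntVect ngv = Ex.nGrowVect();       // number of guard cells per direction
amrex::ignore_unused(ncomp, ngv);                // sketch only: values not used here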
Allocation and initialization
The MultiFab
constructor (for, e.g., Ex
on level lev
) is called in WarpX::AllocLevelMFs
.
By default, the MultiFab
are set to 0
at initialization. They can be assigned a different value in WarpX::InitLevelData
.
Field solver
The field solver is performed in WarpX::EvolveE
for the electric field and WarpX::EvolveB
for the magnetic field, called from WarpX::OneStep_nosub
in WarpX::Evolve
. This section describes the FDTD field push. It is implemented in Source/FieldSolver/FiniteDifferenceSolver/
.
As for all cell-wise operations, the field push is done as follows (this is split into multiple functions in the actual implementation, to avoid code duplication):
// Loop over MR levels
for (int lev = 0; lev <= finest_level; ++lev) {
// Get pointer to MultiFab Ex on level lev
MultiFab* Ex = Efield_fp[lev][0].get();
// Loop over boxes (or tiles if not on GPU)
for ( MFIter mfi(*Ex, TilingIfNotGPU()); mfi.isValid(); ++mfi ) {
// Apply field solver on the FAB
}
}
The innermost step // Apply field solver on the FAB
could be done with 3 nested for
loops for the 3 dimensions (in 3D). However, for portability reasons (see section Developers: Portability), this is done in two steps: (i) extract AMReX data structures into plain-old-data simple structures, and (ii) call a general ParallelFor
function (translated into nested loops on CPU or a kernel launch on GPU, for instance):
// Get Box corresponding to the current MFIter
const Box& tex = mfi.tilebox(Ex_nodal_flag);
// Extract the FArrayBox into a simple structure, for portability
Array4<Real> const& Exfab = Ex->array(mfi);
// Loop over cells and perform stencil operation
amrex::ParallelFor(tex,
    [=] AMREX_GPU_DEVICE (int i, int j, int k)
    {
        Exfab(i, j, k) += c2 * dt * (
            - T_Algo::DownwardDz(By, coefs_z, n_coefs_z, i, j, k)
            + T_Algo::DownwardDy(Bz, coefs_y, n_coefs_y, i, j, k)
            - PhysConst::mu0 * jx(i, j, k) );
    }
);
where T_Algo::DownwardDz
and T_Algo::DownwardDy
represent the discretized derivative
for a given algorithm (represented by the template parameter T_Algo
). The available
discretization algorithms can be found in Source/FieldSolver/FiniteDifferenceSolver/FiniteDifferenceAlgorithms
.
Guard cells exchanges
Communications are mostly handled in Source/Parallelization/.
For E and B guard cell exchanges, the main functions are variants of amrex::FillBoundary(amrex::MultiFab, ...) (or amrex::MultiFab::FillBoundary(...)) that fill the guard cells of all amrex::FArrayBox in an amrex::MultiFab with valid cells of the corresponding amrex::FArrayBox neighbors of the same amrex::MultiFab. There are a number of FillBoundaryE, FillBoundaryB, etc. functions. Under the hood, amrex::FillBoundary calls amrex::ParallelCopy, which is also sometimes called directly in WarpX.
are accumulated (added) rather than just copied. This is done using amrex::MultiFab::SumBoundary
, and mostly located in Source/Parallelization/WarpXSumGuardCells.H
.
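A minimal sketch of this accumulation (not the WarpX routine itself, which additionally handles filtering and MR levels):

#include <AMReX_Geometry.H>
#include <AMReX_MultiFab.H>

// Add the values deposited in the guard (ghost) cells of each box into the
// valid cells of the neighboring boxes, respecting periodic boundaries.
void sum_current_guard_cells (amrex::MultiFab& j, const amrex::Geometry& geom)
{
    j.SumBoundary(geom.periodicity());
}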
Interpolations for MR
This is mostly implemented in Source/Parallelization
, see the following functions (you may complain to the authors if the documentation is empty)
-
void WarpX::SyncCurrent(const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_fp, const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_cp, const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_buffer)
Apply filter and sum guard cells across MR levels. If current centering is used, center the current from a nodal grid to a staggered grid. For each MR level beyond level 0, interpolate the fine-patch current onto the coarse-patch current at the same level. Then, for each MR level, including level 0, apply filter and sum guard cells across levels.
- Parameters:
J_fp – [inout] reference to fine-patch current
MultiFab
(all MR levels)J_cp – [inout] reference to coarse-patch current
MultiFab
(all MR levels)J_buffer – [inout] reference to buffer current
MultiFab
(all MR levels)
-
void WarpX::RestrictCurrentFromFineToCoarsePatch(const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_fp, const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_cp, int lev)
Fills the values of the current on the coarse patch by averaging the values of the current of the fine patch (on the same level).
-
void WarpX::AddCurrentFromFineLevelandSumBoundary(const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_fp, const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_cp, const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J_buffer, int lev)
Filter
General functions for filtering can be found in Source/Filter/
, where the main Filter
class is defined (see below). All filters (so far there are two of them) in WarpX derive from this class.
-
class Filter
Subclassed by BilinearFilter, NCIGodfreyFilter
Bilinear filter
The multi-pass bilinear filter (applied on the current density) is implemented in Source/Filter/
, and class WarpX
holds an instance of this class in member variable WarpX::bilinear_filter
. For performance reasons (to avoid creating too many guard cells), this filter is directly applied in communication routines, see WarpX::AddCurrentFromFineLevelandSumBoundary
above and
-
void WarpX::ApplyFilterJ(const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &current, int lev, int idim)
-
void WarpX::SumBoundaryJ(const amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &current, int lev, int idim, const amrex::Periodicity &period)
Godfrey’s anti-NCI filter for FDTD simulations
This filter is applied on the electric and magnetic field (on the auxiliary grid) to suppress the Numerical Cherenkov Instability when running FDTD. It is implemented in Source/Filter/
, and there are two different stencils, one for Ex
, Ey
and Bz
and the other for Ez
, Bx
and By
.
-
class NCIGodfreyFilter : public Filter
Class for Godfrey’s filter to suppress Numerical Cherenkov Instability.
It derives from the base class Filter. The filter stencil is initialized in the method ComputeStencils. Computing the stencil requires reading parameters from a table, where each line stands for a value of c*dt/dz. The filter is applied using the base class’ method ApplyStencil.
The class WarpX holds two corresponding instances of this class in member variables WarpX::nci_godfrey_filter_exeybz and WarpX::nci_godfrey_filter_bxbyez. It is a 9-point stencil (in the z direction only), for which the coefficients are computed using tabulated values (depending on dz/dx) in Source/Utils/NCIGodfreyTables.H, see variable table_nci_godfrey_galerkin_Ex_Ey_Bz. The filter is applied in PhysicalParticleContainer::Evolve, right after the field gather and before the particle push, see
-
void PhysicalParticleContainer::applyNCIFilter(int lev, const amrex::Box &box, amrex::Elixir &exeli, amrex::Elixir &eyeli, amrex::Elixir &ezeli, amrex::Elixir &bxeli, amrex::Elixir &byeli, amrex::Elixir &bzeli, amrex::FArrayBox &filtered_Ex, amrex::FArrayBox &filtered_Ey, amrex::FArrayBox &filtered_Ez, amrex::FArrayBox &filtered_Bx, amrex::FArrayBox &filtered_By, amrex::FArrayBox &filtered_Bz, const amrex::FArrayBox &Ex, const amrex::FArrayBox &Ey, const amrex::FArrayBox &Ez, const amrex::FArrayBox &Bx, const amrex::FArrayBox &By, const amrex::FArrayBox &Bz, amrex::FArrayBox const *&ex_ptr, amrex::FArrayBox const *&ey_ptr, amrex::FArrayBox const *&ez_ptr, amrex::FArrayBox const *&bx_ptr, amrex::FArrayBox const *&by_ptr, amrex::FArrayBox const *&bz_ptr)
Apply NCI Godfrey filter to all components of E and B before gather.
The NCI Godfrey filter is applied to Ex, the result is stored in filtered_Ex, and the pointer ex_ptr is modified (before this function is called it points to Ex; after this function is called it points to filtered_Ex). The same applies to the other field components.
- Parameters:
lev – MR level
box – box onto which the filter is applied
exeli – safeguard Elixir object (to avoid de-allocating too early —between ParIter iterations— on GPU) for field Ex
eyeli – safeguard Elixir object (to avoid de-allocating too early —between ParIter iterations— on GPU) for field Ey
ezeli – safeguard Elixir object (to avoid de-allocating too early —between ParIter iterations— on GPU) for field Ez
bxeli – safeguard Elixir object (to avoid de-allocating too early —between ParIter iterations— on GPU) for field Bx
byeli – safeguard Elixir object (to avoid de-allocating too early —between ParIter iterations— on GPU) for field By
bzeli – safeguard Elixir object (to avoid de-allocating too early —between ParIter iterations— on GPU) for field Bz
filtered_Ex – Array containing filtered value
filtered_Ey – Array containing filtered value
filtered_Ez – Array containing filtered value
filtered_Bx – Array containing filtered value
filtered_By – Array containing filtered value
filtered_Bz – Array containing filtered value
Ex – Field array before filtering (not modified)
Ey – Field array before filtering (not modified)
Ez – Field array before filtering (not modified)
Bx – Field array before filtering (not modified)
By – Field array before filtering (not modified)
Bz – Field array before filtering (not modified)
ex_ptr – pointer to the Ex field (modified)
ey_ptr – pointer to the Ey field (modified)
ez_ptr – pointer to the Ez field (modified)
bx_ptr – pointer to the Bx field (modified)
by_ptr – pointer to the By field (modified)
bz_ptr – pointer to the Bz field (modified)
Particles
Particle containers
Particle structures and functions are defined in Source/Particles/
. WarpX uses the Particle
class from AMReX for single particles. An ensemble of particles (e.g., a plasma species, or laser particles) is stored as a WarpXParticleContainer
(see description below) in a per-box (and even per-tile on CPU) basis.
-
class WarpXParticleContainer : public NamedComponentParticleContainer<amrex::DefaultAllocator>
WarpXParticleContainer is the base polymorphic class from which all concrete particle container classes (that store a collection of particles) derive. Derived classes can be used for plasma particles, photon particles, or non-physical particles (e.g., for the laser antenna). It derives from amrex::ParticleContainerPureSoA<PIdx::nattribs>, where the template arguments stand for the number of int and amrex::Real SoA data in amrex::SoAParticle.
SoA amrex::Real: positions x, y, z, momentum ux, uy, uz, … see PIdx for details; more can be added at runtime
SoA int: 0 attributes by default, but can be added at runtime
SoA uint64_t: idcpu, a global 64-bit index, with a 40-bit local id and a 24-bit cpu id (both set at creation).
WarpXParticleContainer contains the main functions for initialization, interaction with the grid (field gather and current deposition) and particle push.
Note: many functions are pure virtual (meaning they MUST be defined in derived classes, e.g., Evolve) or actual functions (e.g. CurrentDeposition).
Subclassed by LaserParticleContainer, PhysicalParticleContainer
Physical species are stored in PhysicalParticleContainer
, that derives from WarpXParticleContainer
. In particular, the main function to advance all particles in a physical species is PhysicalParticleContainer::Evolve
(see below).
-
virtual void PhysicalParticleContainer::Evolve(int lev, const amrex::MultiFab &Ex, const amrex::MultiFab &Ey, const amrex::MultiFab &Ez, const amrex::MultiFab &Bx, const amrex::MultiFab &By, const amrex::MultiFab &Bz, amrex::MultiFab &jx, amrex::MultiFab &jy, amrex::MultiFab &jz, amrex::MultiFab *cjx, amrex::MultiFab *cjy, amrex::MultiFab *cjz, amrex::MultiFab *rho, amrex::MultiFab *crho, const amrex::MultiFab *cEx, const amrex::MultiFab *cEy, const amrex::MultiFab *cEz, const amrex::MultiFab *cBx, const amrex::MultiFab *cBy, const amrex::MultiFab *cBz, amrex::Real t, amrex::Real dt, DtType a_dt_type = DtType::Full, bool skip_deposition = false, PushType push_type = PushType::Explicit) override
Finally, all particle species (physical plasma species PhysicalParticleContainer
, photon species PhotonParticleContainer
or non-physical species LaserParticleContainer
) are stored in MultiParticleContainer
. The class WarpX
holds one instance of MultiParticleContainer
as a member variable, called WarpX::mypc
(where mypc stands for “my particle containers”):
-
class MultiParticleContainer
The class MultiParticleContainer holds multiple instances of the polymorphic class WarpXParticleContainer, stored in its member variable “allcontainers”. The class WarpX typically has a single (pointer to an) instance of MultiParticleContainer.
MultiParticleContainer typically has two types of functions:
Functions that loop over all instances of WarpXParticleContainer in allcontainers and calls the corresponding function (for instance, MultiParticleContainer::Evolve loops over all particles containers and calls the corresponding WarpXParticleContainer::Evolve function).
Functions that specifically handle multiple species (for instance ReadParameters or mapSpeciesProduct).
Loop over particles
A typical loop over particles reads:
// pc is a std::unique_ptr<WarpXParticleContainer>
// Loop over MR levels
for (int lev = 0; lev <= finest_level; ++lev) {
// Loop over particles, box by box
for (WarpXParIter pti(*this, lev); pti.isValid(); ++pti) {
// Do something on particles
// [MY INNER LOOP]
}
}
The innermost step [MY INNER LOOP]
typically calls amrex::ParallelFor
to perform operations on all particles in a portable way. The innermost loop in the code snippet above could look like:
// Get Struct-Of-Array particle data, also called attribs
// (x, y, z, ux, uy, uz, w)
auto& attribs = pti.GetAttribs();
auto& x = attribs[PIdx::x];
// [...]
// Number of particles in this box
const long np = pti.numParticles();
Link fields and particles?
In WarpX, the loop over boxes through a MultiFab
iterator MFIter
and the loop over boxes through a ParticleContainer
iterator WarpXParIter
are consistent.
On a loop over boxes in a MultiFab (MFIter), it can be useful to access particle data in a GPU-friendly way. This can be done by:
// Index of grid (= box)
const int grid_id = mfi.index();
// Index of tile within the grid
const int tile_id = mfi.LocalTileIndex();
// Get GPU-friendly arrays of particle data
auto& ptile = GetParticles(lev)[std::make_pair(grid_id,tile_id)];
// Only need attribs (i.e., SoA data)
auto& soa = ptile.GetStructOfArrays();
// As an example, let's get the ux momentum
const ParticleReal * const AMREX_RESTRICT ux = soa.GetRealData(PIdx::ux).data();
On a loop over particles it can be useful to access the fields on the box we are looping over (typically when we use both field and particle data on the same box, for field gather or current deposition for instance). This is done for instance by adding this snippet in [MY INNER LOOP]
:
// E is a reference to, say, WarpX::Efield_aux
// Get the Ex field on the grid
const FArrayBox& exfab = (*E[lev][0])[pti];
// Let's be generous and also get the underlying box (i.e., index info)
const Box& box = pti.validbox();
Main functions
-
virtual void PhysicalParticleContainer::PushPX(WarpXParIter &pti, amrex::FArrayBox const *exfab, amrex::FArrayBox const *eyfab, amrex::FArrayBox const *ezfab, amrex::FArrayBox const *bxfab, amrex::FArrayBox const *byfab, amrex::FArrayBox const *bzfab, amrex::IntVect ngEB, int, long offset, long np_to_push, int lev, int gather_lev, amrex::Real dt, ScaleFields scaleFields, DtType a_dt_type = DtType::Full)
-
void WarpXParticleContainer::DepositCurrent(amrex::Vector<std::array<std::unique_ptr<amrex::MultiFab>, 3>> &J, amrex::Real dt, amrex::Real relative_time)
Deposit current density.
- Parameters:
J – [inout] vector of current densities (one three-dimensional array of pointers to MultiFabs per mesh refinement level)
dt – [in] Time step for particle level
relative_time – [in] Time at which to deposit J, relative to the time of the current positions of the particles. When different than 0, the particle position will be temporarily modified to match the time of the deposition.
Note
The current deposition is used both by PhysicalParticleContainer
and LaserParticleContainer
, so it is in the parent class WarpXParticleContainer
.
Buffers
To reduce numerical artifacts at the boundary of a mesh-refinement patch, WarpX has an option to use buffers: When particles evolve on the fine level, they gather from the coarse level (e.g., Efield_cax
, a copy of the aux
data from the level below) if they are located on the fine level but fewer than WarpX::n_field_gather_buffer
cells away from the coarse-patch boundary. Similarly, when particles evolve on the fine level, they deposit on the coarse level (e.g., Efield_cp
) if they are located on the fine level but fewer than WarpX::n_current_deposition_buffer
cells away from the coarse-patch boundary.
WarpX::gather_buffer_masks
and WarpX::current_buffer_masks
contain masks indicating if a cell is in the interior of the fine-resolution patch or in the buffers. Particles are then partitioned depending on this mask in
-
void PhysicalParticleContainer::PartitionParticlesInBuffers(long &nfine_current, long &nfine_gather, long np, WarpXParIter &pti, int lev, amrex::iMultiFab const *current_masks, amrex::iMultiFab const *gather_masks)
Note
Buffers are complex!
Particle attributes
WarpX adds the following particle attributes by default to WarpX particles.
These attributes are stored in Struct-of-Array (SoA) locations of the AMReX particle containers: one SoA for amrex::ParticleReal
attributes, one SoA for int
attributes and one SoA for a uint64_t
global particle index per particle.
The data structures for those are either pre-described at compile-time (CT) or runtime (RT).
Attribute name | Description | Where | When | Notes
---|---|---|---|---
x | Particle position. | SoA | CT |
y | Particle position. | SoA | CT |
z | Particle position. | SoA | CT |
id | CPU-local particle index where the particle was created. | SoA | CT | First 40 bits of idcpu
cpu | CPU index where the particle was created. | SoA | CT | Last 24 bits of idcpu
 | PIC iteration of the last step before the particle hits the boundary. | SoA | RT | Added when there is particle-boundary interaction. Saved in the boundary buffers.
 | Difference of time between the … | SoA | RT | Added when there is particle-boundary interaction. Saved in the boundary buffers.
 | Normal components to the boundary on the position where the particle hits the boundary. | SoA | RT | Added when there is particle-boundary interaction. Saved in the boundary buffers.
ionizationLevel | Ion ionization level | SoA | RT | Added when ionization physics is used.
opticalDepthQSR | QED: optical depth of the Quantum-Synchrotron process | SoA | RT | Added when PICSAR QED physics is used.
opticalDepthBW | QED: optical depth of the Breit-Wheeler process | SoA | RT | Added when PICSAR QED physics is used.
WarpX allows extra runtime attributes to be added to particle containers (through AddRealComp("attrname")
or AddIntComp("attrname")
).
The attribute name can then be used to access the values of that attribute.
For example, using a particle iterator, pti
, to loop over the particles the command pti.GetAttribs(particle_comps["attrname"]).dataPtr();
will return the values of the "attrname"
attribute.
User-defined integer or real attributes are initialized when particles are generated in AddPlasma()
.
The attribute is initialized with a required user-defined parser function.
Please see the input options addIntegerAttributes and addRealAttributes for the user-facing documentation.
Commonly used runtime attributes are described in the table below and are all part of SoA particle storage:
Attribute name | Description | Default value
---|---|---
 | The coordinates of the particles at the previous timestep. | user-defined
 | The coordinates of the particles when they were created. | user-defined
A Python example that adds runtime options can be found in Examples/Tests/particle_data_python
Note
Only use _
to separate components of vectors!
Accelerator lattice
The files in this directory handle the accelerator lattice. These are fields of various types and configurations. The lattice is laid out along the z-axis.
The AcceleratorLattice has the instances of the accelerator element types and handles the input of the data.
The LatticeElementFinder manages the application of the fields to the particles. It maintains index lookup tables that allow rapidly determining which elements the particles are in.
The classes for each element type are in the subdirectory LatticeElements.
Host and device classes
The LatticeElementFinder and each of the element types have two classes, one that lives on the host and one that can be trivially copied to the device. This dual structure is needed because of the complex data structures describing both the accelerator elements and the index lookup tables. The host level classes manage the data structures, reading in and setting up the data. The host classes copy the data to the device and maintain the pointers to that data on the device. The device level classes grab pointers to the appropriate data (on the device) needed when fetching the data for the particles.
External fields
The lattice fields are applied to the particles from the GetExternalEBField class. If a lattice is defined, the GetExternalEBField class gets the lattice element finder device level instance associated with the grid being operated on. The fields are applied from that instance, which calls the “get_field” method for each lattice element type that is defined for each particle.
Adding new element types
A number of places need to be touched when adding a new element type. The best method is to look for every place where the “quad” element is referenced and duplicate the code for the new element type. Changes will only be needed within the AcceleratorLattice directory.
Initialization
Note
Section almost empty!!
General simulation initialization
Regular simulation
Running in a boosted frame
Field initialization
Particle initialization
Diagnostics
Regular Diagnostics (plotfiles)
Note
Section empty!
Back-Transformed Diagnostics
Note
Section empty!
Moving Window
Note
Section empty!
QED
Quantum synchrotron
Note
Section empty!
Breit-Wheeler
Note
Section empty!
Schwinger process
If the code is compiled with QED and the user activates the Schwinger process in the input file,
electron-positron pairs can be created in vacuum in the function
MultiParticleContainer::doQEDSchwinger
:
-
void MultiParticleContainer::doQEDSchwinger()
If Schwinger process is activated, this function is called at every timestep in Evolve and is used to create Schwinger electron-positron pairs. Within this function we loop over all cells to calculate the number of created physical pairs. If this number is higher than 0, we create a single particle per species in this cell, with a weight corresponding to the number of physical particles.
MultiParticleContainer::doQEDSchwinger in turn calls the function filterCreateTransformFromFAB.
filterCreateTransformFromFAB
proceeds in three steps.
In the filter phase, we loop on every cell and calculate the number of physical pairs created within
the time step dt as a function of the electromagnetic field at the given cell position.
This probabilistic calculation is done via a wrapper that calls the PICSAR
library.
In the create phase, the particles are created at the desired positions, currently at the cell nodes.
In the transform phase, we assign a weight to the particles depending on the number of physical
pairs created.
At most one macroparticle is created per cell per timestep per species, with a weight corresponding to
the total number of physical pairs created.
So far the Schwinger module requires using warpx.grid_type = collocated
or
algo.field_gathering = momentum-conserving
(so that the auxiliary fields are calculated on the nodes)
and is not compatible with either mesh refinement, RZ coordinates or single precision.
Portability
Note
Section empty!
Warning logger
The ⚠️ warning logger ⚠️ allows grouping the warning messages raised during the simulation, in order to display them together in a list (e.g., right after step 1 and at the end of the simulation).
General description
If no warning messages are raised, the warning list should look as follows:
**** WARNINGS ******************************************************************
* GLOBAL warning list after [ FIRST STEP ]
*
* No recorded warnings.
********************************************************************************
On the contrary, if warning messages are raised, the list should look as follows:
**** WARNINGS ******************************************************************
* GLOBAL warning list after [ FIRST STEP ]
*
* --> [!! ] [Species] [raised once]
* Both 'electrons.charge' and electrons.species_type' are specified.
* electrons.charge' will take precedence.
* @ Raised by: ALL
*
* --> [!! ] [Species] [raised once]
* Both 'electrons.mass' and electrons.species_type' are specified.
* electrons.mass' will take precedence.
* @ Raised by: ALL
*
********************************************************************************
Here, GLOBAL
indicates that warning messages are gathered across all the MPI ranks (specifically
after the FIRST STEP
).
Each entry of the warning list follows this format:
* --> [PRIORITY] [TOPIC] [raised COUNTER]
* MULTILINE MESSAGE
* MULTILINE MESSAGE
* @ Raised by: WHICH_RANKS
where:
[PRIORITY] can be [! ] (low priority), [!! ] (medium priority) or [!!!] (high priority). It indicates the importance of the warning.
[TOPIC] indicates which part of the code is concerned by the warning (e.g., particles, laser, parallelization…)
MULTILINE MESSAGE is an arbitrary text message. It can span multiple lines. Text is wrapped automatically.
COUNTER indicates the number of times the warning was raised across all the MPI ranks. This means that if we run WarpX with 2048 MPI ranks and each rank raises the same warning once, the displayed message will be [raised 2048 times]. Possible values are once, twice, XX times.
WHICH_RANKS can be either ALL or a sequence of rank IDs. It is the list of the MPI ranks which have raised the warning message.
Entries are sorted first by priority (high priority first), then by topic (alphabetically) and finally by text message (alphabetically).
How to record a warning for later display
In the code, instead of using amrex::Warning
to immediately print a warning message, the following method should be called:
ablastr::warn_manager::WMRecordWarning(
"QED",
"Using default value (2*me*c^2) for photon energy creation threshold",
ablastr::warn_manager::WarnPriority::low);
In this example, QED
is the topic, Using [...]
is the warning message and ablastr::warn_manager::WarnPriority::low
is the priority.
RecordWarning is not a collective call and should also be thread-safe (it can be called in OpenMP loops).
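For instance (a hypothetical sketch built around the call shown above), a warning can be recorded from inside an OpenMP-parallel particle loop:

#include "ablastr/warn_manager/WarnManager.H"

// Hypothetical sketch: record a warning from worker threads; the messages
// are only gathered and printed later by PrintGlobalWarnings.
void check_particles (int np)
{
#ifdef AMREX_USE_OMP
#pragma omp parallel for
#endif
    for (int ip = 0; ip < np; ++ip) {
        const bool suspicious = false; // placeholder for an actual check
        if (suspicious) {
            ablastr::warn_manager::WMRecordWarning(
                "Particles",
                "Hypothetical warning raised from a particle loop",
                ablastr::warn_manager::WarnPriority::low);
        }
    }
}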
In case the user wants to also print the warning messages immediately, the runtime parameter warpx.always_warn_immediately
can be set to 1
.
The Warning manager is a singleton class defined in Source/ablastr/warn_manager/WarnManager.H.
How to print the warning list
The warning list can be printed as follows:
amrex::Print() << ablastr::warn_manager::GetWMInstance().PrintGlobalWarnings("THE END");
where the string is a temporal marker that appears in the warning list. At the moment this is done right after step one and at the end of the simulation. Calling this method triggers several collective calls that allow merging all the warnings recorded by all the MPI ranks.
Implementation details
How warning messages are recorded
Warning messages are stored by each rank as a map associating each message with a counter. A message is defined by its priority, its topic and its text. Given two messages, if any of these components differ between the two, the messages are considered as different.
How the global warning list is generated
In order to generate the global warning list we follow the strategy outlined below.
Each MPI rank has a map<Msg, counter>, associating each message with a counter, which counts how many times the warning has been raised on that rank.
When PrintGlobalWarnings is called, the MPI ranks send to the I/O rank the number of different warnings that they have observed. The I/O rank finds the rank with the most warnings and broadcasts 📢 this information back to all the others. This rank, referred to in the following as the gather rank, will lead 👑 the generation of the global warning list.
The gather rank serializes its warning messages [📝,📝,📝,📝,📝…] into a byte array 📦 and broadcasts 📢 this array to all the other ranks.
The other ranks unpack this byte array 📦, obtaining a list of messages [📝,📝,📝,📝,📝…]
For each message seen by the gather rank, each rank prepares a vector containing the number of times it has seen that message (i.e., the counter in map<Msg, counter> if Msg is in the map): [1️⃣,0️⃣,1️⃣,4️⃣,0️⃣…]
In addition, each rank prepares a vector containing the messages seen only by that rank, associated with the corresponding counter: [(📝,1️⃣), (📝,4️⃣),…]
Each rank sends 📨 this byte array to the gather rank, which puts them together in a large byte vector [📦,📦,📦,📦,📦…]
The gather rank parses these byte arrays, adding the counters of the other ranks to its own counters, adding new messages to the message list, and keeping track of which rank has generated which warning 📜
If the gather rank is also the I/O rank, then we are done 🎉, since the rank has a list of messages, global counters and ranks lists [(📝,4️⃣,📜 ), (📝,1️⃣,📜 ),… ]
If the gather rank is not the I/O rank, then it packs the list into a byte array and sends 📨 it to the I/O rank, which unpacks it: gather rank [(📝,4️⃣,📜 ), (📝,1️⃣,📜 ),… ] –> 📦 –> 📨 –> 📦 –> [(📝,4️⃣,📜 ), (📝,1️⃣,📜 ),… ] I/O rank
This procedure is described in more detail in these slides.
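The counter-merging step can be sketched in plain Python (no actual MPI here: ranks are simulated with a list, the warning topics and texts are placeholders, and the packing into byte arrays and the communication calls are omitted):
from collections import Counter

# Per-rank maps from (priority, topic, text) to counter, as described above.
# The entries below are placeholder examples, not real WarpX warnings.
per_rank_warnings = [
    {("high", "PML", "Placeholder warning text A"): 2},
    {("low", "QED", "Placeholder warning text B"): 1},
    {},
    {("low", "QED", "Placeholder warning text B"): 3},
]

global_counters = Counter()
raised_by = {}
for rank, warnings in enumerate(per_rank_warnings):
    for msg, count in warnings.items():
        global_counters[msg] += count               # sum the counters across ranks
        raised_by.setdefault(msg, []).append(rank)  # remember which ranks raised it

for msg, count in global_counters.items():
    print(msg, "raised", count, "times by ranks", raised_by[msg])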
How to test the warning logger
In order to test the warning logger, it is possible to inject “artificial” warnings via the input file. For instance, the following input file
#################################
####### GENERAL PARAMETERS ######
#################################
max_step = 10
amr.n_cell = 128 128
amr.max_grid_size = 64
amr.blocking_factor = 32
amr.max_level = 0
geometry.dims = 2
geometry.prob_lo = -20.e-6 -20.e-6 # physical domain
geometry.prob_hi = 20.e-6 20.e-6
#################################
####### Boundary condition ######
#################################
boundary.field_lo = periodic periodic
boundary.field_hi = periodic periodic
#################################
############ NUMERICS ###########
#################################
warpx.serialize_initial_conditions = 1
warpx.verbose = 1
warpx.cfl = 1.0
warpx.use_filter = 0
# Order of particle shape factors
algo.particle_shape = 1
#################################
######## DEBUG WARNINGS #########
#################################
warpx.test_warnings = w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 w11 w12 w13 w14 w15 w16 w17 w18 w19 w20 w21 w22
w1.topic = "Priority Sort Test"
w1.msg = "Test that priority is correctly sorted"
w1.priority = "low"
w1.all_involved = 1
w2.topic = "Priority Sort Test"
w2.msg = "Test that priority is correctly sorted"
w2.priority = "medium"
w2.all_involved = 1
w3.topic = "Priority Sort Test"
w3.msg = "Test that priority is correctly sorted"
w3.priority = "high"
w3.all_involved = 1
w4.topic = "ZZA Topic sort Test"
w4.msg = "Test that topic is correctly sorted"
w4.priority = "medium"
w4.all_involved = 1
w5.topic = "ZZB Topic sort Test"
w5.msg = "Test that topic is correctly sorted"
w5.priority = "medium"
w5.all_involved = 1
w6.topic = "ZZC Topic sort Test"
w6.msg = "Test that topic is correctly sorted"
w6.priority = "medium"
w6.all_involved = 1
w7.topic = "Msg sort Test"
w7.msg = "AAA Test that msg is correctly sorted"
w7.priority = "medium"
w7.all_involved = 1
w8.topic = "Msg sort Test"
w8.msg = "BBB Test that msg is correctly sorted"
w8.priority = "medium"
w8.all_involved = 1
w9.topic = "Long line"
w9.msg = "Test very long line: a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a"
w9.priority = "medium"
w9.all_involved = 1
w10.topic = "Repeated warnings"
w10.msg = "Test repeated warnings"
w10.priority = "high"
w10.all_involved = 1
w11.topic = "Repeated warnings"
w11.msg = "Test repeated warnings"
w11.priority = "high"
w11.all_involved = 1
w12.topic = "Repeated warnings"
w12.msg = "Test repeated warnings"
w12.priority = "high"
w12.all_involved = 1
w13.topic = "Not all involved (0)"
w13.msg = "Test warnings raised by a fraction of ranks"
w13.priority = "high"
w13.all_involved = 0
w13.who_involved = 0
w14.topic = "Not all involved (0)"
w14.msg = "Test warnings raised by a fraction of ranks"
w14.priority = "high"
w14.all_involved = 0
w14.who_involved = 0
w15.topic = "Not all involved (1)"
w15.msg = "Test warnings raised by a fraction of ranks"
w15.priority = "high"
w15.all_involved = 0
w15.who_involved = 1
w16.topic = "Not all involved (1,2)"
w16.msg = "Test warnings raised by a fraction of ranks"
w16.priority = "high"
w16.all_involved = 0
w16.who_involved = 1 2
w17.topic = "Different counters"
w17.msg = "Test that different counters are correctly summed"
w17.priority = "low"
w17.all_involved = 1
w18.topic = "Different counters"
w18.msg = "Test that different counters are correctly summed"
w18.priority = "low"
w18.all_involved = 1
w19.topic = "Different counters"
w19.msg = "Test that different counters are correctly summed"
w19.priority = "low"
w19.all_involved = 0
w19.who_involved = 0
w20.topic = "Different counters B"
w20.msg = "Test that different counters are correctly summed"
w20.priority = "low"
w20.all_involved = 1
w21.topic = "Different counters B"
w21.msg = "Test that different counters are correctly summed"
w21.priority = "low"
w21.all_involved = 1
w22.topic = "Different counters B"
w22.msg = "Test that different counters are correctly summed"
w22.priority = "low"
w22.all_involved = 0
w22.who_involved = 1
should generate the following warning list (if run on 4 MPI ranks):
**** WARNINGS ******************************************************************
* GLOBAL warning list after [ THE END ]
*
* --> [!!!] [Not all involved (0)] [raised twice]
* Test warnings raised by a fraction of ranks
* @ Raised by: 0
*
* --> [!!!] [Not all involved (1)] [raised once]
* Test warnings raised by a fraction of ranks
* @ Raised by: 1
*
* --> [!!!] [Not all involved (1,2)] [raised twice]
* Test warnings raised by a fraction of ranks
* @ Raised by: 1 2
*
* --> [!!!] [Priority Sort Test] [raised 4 times]
* Test that priority is correctly sorted
* @ Raised by: ALL
*
* --> [!!!] [Repeated warnings] [raised 12 times]
* Test repeated warnings
* @ Raised by: ALL
*
* --> [!! ] [Long line] [raised 4 times]
* Test very long line: a a a a a a a a a a a a a a a a a a a a a a a a a a a
* a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
* @ Raised by: ALL
*
* --> [!! ] [Msg sort Test] [raised 4 times]
* AAA Test that msg is correctly sorted
* @ Raised by: ALL
*
* --> [!! ] [Msg sort Test] [raised 4 times]
* BBB Test that msg is correctly sorted
* @ Raised by: ALL
*
* --> [!! ] [Priority Sort Test] [raised 4 times]
* Test that priority is correctly sorted
* @ Raised by: ALL
*
* --> [!! ] [ZZA Topic sort Test] [raised 4 times]
* Test that topic is correctly sorted
* @ Raised by: ALL
*
* --> [!! ] [ZZB Topic sort Test] [raised 4 times]
* Test that topic is correctly sorted
* @ Raised by: ALL
*
* --> [!! ] [ZZC Topic sort Test] [raised 4 times]
* Test that topic is correctly sorted
* @ Raised by: ALL
*
* --> [! ] [Different counters] [raised 9 times]
* Test that different counters are correctly summed
* @ Raised by: ALL
*
* --> [! ] [Different counters B] [raised 9 times]
* Test that different counters are correctly summed
* @ Raised by: ALL
*
* --> [! ] [Priority Sort Test] [raised 4 times]
* Test that priority is correctly sorted
* @ Raised by: ALL
*
********************************************************************************
Processing PICMI Input Options
The input parameters in a WarpX PICMI file are processed in two layers. The first layer is the Python level API, which mirrors the C++ application input structure; the second is the translation from the PICMI input to the equivalent app (AMReX) input file parameters.
The two layers are described below.
Input parameters
In a C++ input file, each of the parameters has a prefix, for example geometry
in geometry.prob_lo
.
For each of these prefixes, an instance of a Python class is created and the parameters saved as attributes.
This construction is used since the lines in the input file look very much like a Python assignment statement,
assigning attributes of class instances, for example geometry.dims = 3
.
Many of the prefix instances are predefined, for instance geometry
is created in the file Python/pywarpx/Geometry.py
.
In that case, geometry
is an instance of the class Bucket
(specified in Python/pywarpx/Bucket.py
),
the general class for prefixes.
It is called Bucket
since its main purpose is a place to hold attributes.
Most of the instances are instances of the Bucket
class.
There are exceptions, such as constants
and diagnostics
where extra processing is needed.
There can also be instances created as needed.
For example, for the particle species, an instance is created for each species listed in particles.species_names
.
This gives a place to hold the parameters for the species, e.g., electrons.mass
.
The instances are then used to generate the input parameters.
Each instance can generate a list of strings, one for each attribute.
This happens in the Bucket.attrlist
method.
The strings will be the lines as in an input file, for example "electrons.mass = m_e"
.
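Schematically, and only as a simplified stand-in for the real Bucket class, a prefix instance can be pictured as turning its attributes into input-file lines:
class PrefixBucket:
    """Simplified stand-in for pywarpx's Bucket: holds attributes for one prefix."""

    def __init__(self, prefix):
        object.__setattr__(self, "_prefix", prefix)
        object.__setattr__(self, "_attrs", {})

    def __setattr__(self, name, value):
        self._attrs[name] = value  # e.g. electrons.mass = "m_e"

    def attrlist(self):
        # One "prefix.name = value" string per attribute, as in an input file.
        return [f"{self._prefix}.{name} = {value}" for name, value in self._attrs.items()]


electrons = PrefixBucket("electrons")
electrons.mass = "m_e"
print(electrons.attrlist())  # ['electrons.mass = m_e']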
The lists for each instance are gathered into one long list in the warpx
instance (of the class WarpX
defined in
Python/pywarpx/WarpX.py
).
This instance has access to all of the predefined instances as well as lists of the generated instances.
In both of the ways that WarpX can be run with Python, that list of input parameter strings will be generated.
This is done in the routine WarpX.create_argv_list
in Python/pywarpx/WarpX.py
.
If WarpX will be run directly in Python, that list will be sent to the amrex_init
routine as the argv
.
This is as if all of the input parameters had been specified on the command line.
If Python is only used as a preprocessor to generate the input file, the list contains the strings that are written out to create the input file.
There are two input parameters that do not have prefixes, max_step
and stop_time
.
These are handled via keyword arguments in the WarpX.create_argv_list
method.
Conversion from PICMI
In the PICMI implementation, defined in Python/pywarpx/picmi.py
, for each PICMI class, a class was written that
inherits the PICMI class and does the processing of the input.
Each of the WarpX classes has two methods, init
and initialize_inputs
.
The init
method is called during the creation of the class instances that happens in the user’s PICMI input file.
This is part of the standard - each of the PICMI classes calls the method handle_init
from the constructor __init__
routines.
The main purpose is to process application specific keyword arguments (those that start with warpx_
for example).
These are then passed into the init
methods.
In the WarpX implementation, in the init
, each of the WarpX-specific arguments is saved as an attribute of the implementation
class instances.
It is in the second method, initialize_inputs
, where the PICMI input parameters are translated into WarpX input parameters.
This method is called later during the initialization.
The prefix instances described above are all accessible in the implementation classes (via the pywarpx
module).
For each PICMI input quantity, the appropriate WarpX input parameters are set in the prefix classes.
As needed, for example in the Species
class, the dynamic prefix instances are created and the attributes set.
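Schematically, each WarpX implementation class follows a two-phase pattern like the simplified stand-in below (this is not the code in picmi.py, and warpx_some_option is a hypothetical keyword used only for illustration):
class SpeciesSketch:
    """Simplified stand-in for a WarpX PICMI implementation class (not the real one)."""

    def init(self, kw):
        # Phase 1, called at instance creation in the user's PICMI input file:
        # pick up WarpX-specific keyword arguments (prefixed with warpx_) and
        # store them on the implementation instance.
        # 'warpx_some_option' is a hypothetical name used only for illustration.
        self.some_option = kw.pop("warpx_some_option", None)

    def initialize_inputs(self):
        # Phase 2, called later during initialization: translate the PICMI
        # parameters (and the saved warpx_ options) into attributes of the
        # dynamically created prefix instance, e.g. the 'electrons' bucket.
        pass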
Simulation class
The Simulation
class ties it all together.
In a PICMI input file, all information is passed into the Simulation
class instance, either through the constructor
or through add_
methods.
Its initialize_inputs
routine initializes the input parameters it handles and also calls the initialize_inputs
methods of all of the PICMI class instances that have been passed in, such as the field solver, the particles species,
and the diagnostics.
As with other PICMI classes, the init
routine is called by the constructor and initialize_inputs
is called during
initialization.
The initialization happens when either the write_input_file method or the step method is called.
After initialize_inputs
is finished, the attributes of the prefix instances have been filled in, and the process described
above happens, where the prefix instances are looped over to generate the list of input parameter strings (that is either written
out to a file or passed in as argv
).
The two parameters that do not have a prefix, max_step
and stop_time
, are passed into the warpx
method as keyword
arguments.
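Tying the pieces above together, a minimal PICMI script might look like the sketch below. The grid, solver and numerical values are illustrative choices, warpx_current_deposition_algo is shown as an example of a WarpX-specific (warpx_-prefixed) option, and depending on the WarpX version additional arguments (e.g. particle boundary conditions) may be required:
from pywarpx import picmi

# Illustrative 2D setup; sizes and boundary conditions are arbitrary examples.
grid = picmi.Cartesian2DGrid(
    number_of_cells=[64, 64],
    lower_bound=[-20.e-6, -20.e-6],
    upper_bound=[20.e-6, 20.e-6],
    lower_boundary_conditions=['periodic', 'periodic'],
    upper_boundary_conditions=['periodic', 'periodic'],
)
solver = picmi.ElectromagneticSolver(grid=grid, method='Yee', cfl=1.0)

sim = picmi.Simulation(
    solver=solver,
    max_steps=10,
    warpx_current_deposition_algo='esirkepov',  # example of a warpx_ keyword
)

# Python as a preprocessor: write the equivalent app input file ...
sim.write_input_file(file_name='inputs_from_picmi')
# ... or run WarpX directly from Python:
# sim.step()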
Python runtime interface
The Python interface provides low and high level access to much of the data in WarpX. With the low level access, a user has direct access to the underlying memory contained in the MultiFabs and in the particle arrays. The high level provides a more user friendly interface.
High level interface
There are two python modules that provide convenient access to the fields and the particles.
Fields
The fields
module provides wrappers around most of the MultiFabs that are defined in the WarpX class.
For a list of all of the available wrappers, see the file Python/pywarpx/fields.py
.
For each MultiFab, there is a function that will return a wrapper around the data.
For instance, the function ExWrapper
returns a wrapper around the x
component of the MultiFab vector Efield_aux
.
from pywarpx import fields
Ex = fields.ExWrapper()
By default, this wraps the MultiFab for level 0. The level
argument can be specified for other levels.
By default, the wrapper only includes the valid cells. To include the ghost cells, set the argument include_ghosts=True
.
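For example (assuming a mesh-refinement run in which level 1 exists):
from pywarpx import fields

# Wrap Ex on refinement level 1 and include the ghost cells
# (level 1 is only an example; it must exist in the running simulation).
Ex_lev1 = fields.ExWrapper(level=1, include_ghosts=True)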
The wrapper provides access to the data via global indexing.
Using standard array indexing (with exceptions) with square brackets, the data can be accessed using indices that are relative to the full domain (across the MultiFab and across processors).
With multiple processors, the result is broadcast to all processors.
This example will return the Bz
field at all points along x
at the specified y
and z
indices.
from pywarpx import fields
Bz = fields.BzWrapper()
Bz_along_x = Bz[:,5,6]
The same global indexing can be done to set values. This example will set the values over a range in y
and z
at the
specified x
. The data will be scattered appropriately to the underlying FABs.
from pywarpx import fields
Jy = fields.JyFPWrapper()
Jy[5,6:20,8:30] = 7.
The code does error checking to ensure that the specified indices are within the bounds of the global domain.
Note that negative indices are handled differently than with numpy arrays because of the possibility of having ghost cells.
With ghost cells, the lower ghost cells are accessed using negative indices (since 0
is the index of the lower bound of the
valid cells). Without ghost cells, a negative index will always raise an out of bounds error since there are no ghost cells.
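As a sketch, assuming a 3D simulation and a wrapper created with include_ghosts=True:
from pywarpx import fields

Ex = fields.ExWrapper(include_ghosts=True)
# With ghost cells included, index -1 in x refers to the first lower ghost cell,
# since index 0 is the lower bound of the valid cells.
Ex_lower_ghosts = Ex[-1, :, :]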
Under the covers, the wrapper object has a list of numpy arrays that have pointers to the underlying data, one array for each FAB. When data is being fetched, it loops over that list to gather the data. The result is then gathered among all processors. Note that the result is not writeable, in the sense that changing it won’t change the underlying data since it is a copy. When the data is set, using the global indexing, a similar process is done where the processors loop over their FABs and set the data at the appropriate indices.
The wrappers are always up to date since whenever an access is done (either a get or a set), the list of numpy arrays for the FABs is regenerated. In this case, efficiency is sacrificed for consistency.
If it is needed, the list of numpy arrays associated with the FABs can be obtained using the wrapper method _getfields
.
Additionally, there are the methods _getlovects
and _gethivects
that get the list of the bounds of each of the arrays.
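For instance, a sketch of inspecting the locally owned per-FAB arrays (the exact shapes and bounds depend on the domain decomposition of the running simulation):
from pywarpx import fields

Jy = fields.JyFPWrapper()
arrays = Jy._getfields()   # list of numpy arrays, one per local FAB
los = Jy._getlovects()     # lower-bound index vectors of each array
his = Jy._gethivects()     # upper-bound index vectors of each array
for lo, hi, arr in zip(los, his, arrays):
    print(lo, hi, arr.shape)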
Particles
This is still in development.
Tip
A tutorial-style overview of the code structure can also be found in a developer presentation from 03/2020. It contains information about the code structure, a step-by-step description of what happens in a simulation (initialization and iterations) as well as slides on topics relevant to WarpX development.
Information in the following pages are generally more up-to-date, but the slides above might still be useful.
C++ Objects & Functions
We generate the documentation of C++ objects and functions from our C++ source code by adding Doxygen strings.
WarpX and ABLASTR: C++ Doxygen
This documentation dynamically links to objects described in dependencies:
AMReX: C++ Doxygen and Manual
openPMD-api: C++ Doxygen and Manual
PICSAR-QED: C++ Doxygen is TODO
GNUmake Build System (Legacy)
CMake is our primary build system. In this section, we describe our legacy build scripts - do not use them unless you used them before.
WarpX is built on AMReX, which also provides support for a Linux-centric set of build scripts implemented in GNUmake. Since we sometimes need to move fast and test highly experimental compilers and Unix derivatives on core components of WarpX, this set of build scripts is used by some of our experienced developers.
Warning
In the long term, these scripts do not scale to the full feature set of WarpX and its dependencies. Please see the CMake-based developer section instead.
This page describes the most basic build with GNUmake files and points to instructions for more advanced builds.
Downloading the source code
Clone the source codes of WarpX, and its dependencies AMReX and PICSAR into one
single directory (e.g. warpx_directory
):
mkdir warpx_directory
cd warpx_directory
git clone https://github.com/ECP-WarpX/WarpX.git
git clone https://github.com/ECP-WarpX/picsar.git
git clone https://github.com/ECP-WarpX/warpx-data.git
git clone https://github.com/AMReX-Codes/amrex.git
Note
The warpx-data repository is currently only needed for MCC cross-sections.
Basic compilation
WarpX requires a C/C++ compiler (e.g., GNU, LLVM or Intel) and an MPI implementation (e.g., OpenMPI or MPICH).
Start a GNUmake build by cd
-ing into the directory WarpX
and typing
make -j 4
This will generate an executable file in the Bin
directory.
Compile-time vs. run-time options
WarpX has multiple compile-time and run-time options. The compilation
options are set in the file GNUmakefile
. The default
options correspond to an optimized code for 3D geometry. The main compile-time
options are:
DIM=3 or 2: Geometry of the simulation (note that running an executable compiled for 3D with a 2D input file will crash).
DEBUG=FALSE or TRUE: Compiling in DEBUG mode can help tremendously during code development.
USE_PSATD=FALSE or TRUE: Compile the Pseudo-Spectral Analytical Time Domain Maxwell solver. Requires an FFT library.
USE_RZ=FALSE or TRUE: Compile for 2D axisymmetric geometry.
COMP=gcc or intel: Compiler.
USE_MPI=TRUE or FALSE: Whether to compile with MPI support.
USE_OMP=TRUE or FALSE: Whether to compile with OpenMP support.
USE_GPU=TRUE or FALSE: Whether to compile for Nvidia GPUs (requires CUDA).
USE_OPENPMD=TRUE or FALSE: Whether to support openPMD for I/O (requires openPMD-api).
MPI_THREAD_MULTIPLE=TRUE or FALSE: Whether to initialize MPI with thread multiple support. Required to use asynchronous IO with more than amrex.async_out_nfiles (by default, 64) MPI tasks. Please see data formats for more information.
PRECISION=FLOAT USE_SINGLE_PRECISION_PARTICLES=TRUE: Switch from default double precision to single precision (experimental).
For a description of these different options, see the corresponding page in the AMReX documentation.
Alternatively, instead of modifying the file GNUmakefile
, you can directly pass the options on the command line; for instance:
make -j 4 USE_OMP=FALSE
In order to clean a previously compiled version (typically useful for troubleshooting, if you encounter unexpected compilation errors):
make realclean
before re-attempting compilation.
Advanced GNUmake instructions
Building WarpX with support for openPMD output
WarpX can dump data in the openPMD format. This feature currently requires a parallel version of HDF5 to be installed; therefore, we recommend using Spack to facilitate the installation.
More specifically, we recommend that you try installing the openPMD-api library 0.15.1 or newer using spack (first section below). If this fails, a back-up solution is to install parallel HDF5 with spack, and then install the openPMD-api library from source.
In order to install spack, you can simply do:
git clone https://github.com/spack/spack.git
export SPACK_ROOT=$PWD/spack
. $SPACK_ROOT/share/spack/setup-env.sh
You may want to auto-activate spack when you open a new terminal by adding this to your $HOME/.bashrc
file:
echo -e "# activate spack package manager\n. ${SPACK_ROOT}/share/spack/setup-env.sh" >> $HOME/.bashrc
WarpX Development Environment with Spack
Create and activate a Spack environment with all software needed to build WarpX
spack env create warpx-dev # you do this once
spack env activate warpx-dev
spack add gmake
spack add mpi
spack add openpmd-api
spack add pkg-config
spack install
This will download and compile all dependencies.
Whenever you need this development environment in the future, just repeat the quick spack env activate warpx-dev
step.
For example, we can now compile WarpX by cd
-ing into the WarpX
folder and typing:
spack env activate warpx-dev
make -j 4 USE_OPENPMD=TRUE
You will also need to load the same spack environment when running WarpX, for instance:
spack env activate warpx-dev
mpirun -np 4 ./warpx.exe inputs
You can check which Spack environments exist and if one is still active with
spack env list # already created environments
spack env st # is an environment active?
Installing openPMD-api from source
You can also build openPMD-api from source, e.g. to build against the module environment of a supercomputer cluster.
First, load the according modules of the cluster to support the openPMD-api dependencies. You can find the required and optional dependencies here.
You usually just need a C++ compiler, CMake, and one or more file backend libraries, such as HDF5 and/or ADIOS2.
If optional dependencies are installed in non-system paths, one needs to hint their installation location with an environment variable during the build phase:
# optional: only if you manually installed HDF5 and/or ADIOS2 in custom directories
export HDF5_ROOT=$HOME/path_to_installed_software/hdf5-1.12.0/
export ADIOS2_ROOT=$HOME/path_to_installed_software/adios2-2.7.1/
Then, in the $HOME/warpx_directory/
, download and build openPMD-api:
git clone https://github.com/openPMD/openPMD-api.git
mkdir openPMD-api-build
cd openPMD-api-build
cmake ../openPMD-api -DopenPMD_USE_PYTHON=OFF -DCMAKE_INSTALL_PREFIX=$HOME/warpx_directory/openPMD-install/ -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_RPATH='$ORIGIN'
cmake --build . --target install
Finally, compile WarpX:
cd ../WarpX
# Note that on some systems, /lib might need to be replaced with /lib64.
export PKG_CONFIG_PATH=$HOME/warpx_directory/openPMD-install/lib/pkgconfig:$PKG_CONFIG_PATH
export CMAKE_PREFIX_PATH=$HOME/warpx_directory/openPMD-install:$CMAKE_PREFIX_PATH
make -j 4 USE_OPENPMD=TRUE
Note
If you compile with CMake, all you need to add is the -DWarpX_OPENPMD=ON
option (on by default), and we will download and build openPMD-api on-the-fly.
When running WarpX, we will recall where you installed openPMD-api via RPATHs, so you just need to load the same module environment as used for building (same MPI, HDF5, ADIOS2, for instance).
# module load ... (compiler, MPI, HDF5, ADIOS2, ...)
mpirun -np 4 ./warpx.exe inputs
Building the spectral solver
By default, the code is compiled with a finite-difference (FDTD) Maxwell solver. In order to run the code with a spectral solver, you need to:
Install (or load) an MPI-enabled version of FFTW. For instance, for Debian, this can be done with
apt-get install libfftw3-dev libfftw3-mpi-dev
Set the environment variable FFTW_HOME to the path for FFTW. For instance, for Debian, this is done with
export FFTW_HOME=/usr/
Set USE_PSATD=TRUE when compiling:
make -j 4 USE_PSATD=TRUE
See Building WarpX to use RZ geometry for using the spectral solver with USE_RZ. Additional steps are needed.
PSATD is compatible with single precision, but please note that, on CPU, FFTW needs to be compiled with option --enable-float
.
Building WarpX to use RZ geometry
WarpX can be built to run with RZ geometry. Both an FDTD solver (the default) and a PSATD solver are available. Both solvers allow multiple azimuthal modes.
To select RZ geometry, set the flag USE_RZ = TRUE when compiling:
make -j 4 USE_RZ=TRUE
Note that this sets DIM=2, which is required with USE_RZ=TRUE. The executable produced will have “RZ” as a suffix.
RZ geometry with spectral solver
Additional steps are needed to build the spectral solver. Some of the steps
are the same as is done for the Cartesian spectral solver, setting up the FFTW
package and setting USE_PSATD=TRUE
.
Install (or load) an MPI-enabled version of FFTW. For instance, for Debian, this can be done with
apt-get install libfftw3-dev libfftw3-mpi-dev
Set the environment variable FFTW_HOME to the path for FFTW. For instance, for Debian, this is done with
export FFTW_HOME=/usr/
Download and build the blaspp and lapackpp packages. These can be obtained from GitHub:
git clone https://github.com/icl-utk-edu/blaspp.git
git clone https://github.com/icl-utk-edu/lapackpp.git
The two packages can be built in multiple ways. A recommended method is to follow the cmake instructions provided in the INSTALL.md that comes with the packages. They can also be installed using spack.
Set the environment variables BLASPP_HOME and LAPACKPP_HOME to the locations where the package libraries were installed. For example, using bash:
export BLASPP_HOME=/location/of/installation/blaspp
export LAPACKPP_HOME=/location/of/installation/lapackpp
In some cases, the blas and lapack libraries need to be specified. If needed, this can be done by setting the BLAS_LIB and LAPACK_LIB environment variables appropriately. For example, using bash:
export BLAS_LIB=-lblas
Set USE_PSATD=TRUE when compiling:
make -j 4 USE_RZ=TRUE USE_PSATD=TRUE
Building WarpX with GPU support (Linux only)
Warning
In order to build WarpX on a specific GPU cluster (e.g. Summit), look for the corresponding specific instructions, instead of those on this page.
In order to build WarpX with GPU support, make sure that you have cuda and mpich installed on your system. (Compiling with openmpi currently fails.) Then compile WarpX with the option USE_GPU=TRUE, e.g.
make -j 4 USE_GPU=TRUE
Installing WarpX as a Python package
A full Python installation of WarpX can be done, which includes a build of all of the C++ code, or a pure Python version can be made which only installs the Python scripts. WarpX requires Python version 3.8 or newer.
For a full Python installation of WarpX
WarpX’ Python bindings depend on numpy
, periodictable
, picmistandard
, and mpi4py
.
Type
make -j 4 USE_PYTHON_MAIN=TRUE
or edit the GNUmakefile
and set USE_PYTHON_MAIN=TRUE
, and type
make -j 4
Additional compile time options can be specified as needed.
This will compile the code, and install the Python bindings and the Python scripts as a package (named pywarpx
) in your standard Python installation (i.e. in your site-packages
directory).
If you do not have write permission to the default Python installation (e.g. typical on computer clusters), there are two options. The recommended option is to use a virtual environment, which provides the most flexibility and robustness.
Alternatively, add the --user
install option to have WarpX installed elsewhere.
make -j 4 PYINSTALLOPTIONS=--user
With --user
, the default location will be in your home directory, ~/.local
, or the location defined by the environment variable PYTHONUSERBASE
.
In HPC environments, it is often recommended to install codes in scratch or work space which typically have faster disk access.
The different dimensioned versions of WarpX, 3D, 2D, and RZ, can coexist in the Python installation.
The appropriate one will be imported depending on the input file.
Note, however, that other options will overwrite one another - for example, compiling with DEBUG=TRUE will replace the version compiled with DEBUG=FALSE.
For a pure Python installation
This avoids the compilation of the C++ code and is recommended when only using the Python input files as preprocessors.
This installation depends on numpy
, periodictable
, and picmistandard
.
Go into the Python
subdirectory and run
python setup.py install
This installs the Python scripts as a package (named pywarpx
) in your standard Python installation (i.e. in your site-packages
directory).
If you do not have write permission to the default Python installation (e.g. typical on computer clusters), there are two options.
The recommended option is to use a virtual environment, which provides the most flexibility and robustness.
Alternatively, add the --user
install option to have WarpX installed elsewhere.
python setup.py install --user
With --user
, the default location will be in your home directory, ~/.local
, or the location defined by the environment variable PYTHONUSERBASE
.
Building WarpX with Spack
As mentioned in the install section, WarpX can be installed using Spack. From the Spack web page: “Spack is a package management tool designed to support multiple versions and configurations of software on a wide variety of platforms and environments.”
Note
Quick-start hint for macOS users:
Before getting started with Spack, please check what you manually installed in /usr/local
.
If you find entries in bin/
, lib/
et al. that look like you manually installed MPI, HDF5 or other software at some point, then remove those files first.
If you find software such as MPI in the same directories that are shown as symbolic links then it is likely you brew installed software before. Run brew unlink … on such packages first to avoid software incompatibilities.
Spack is available from github. Spack only needs to be cloned and can be used right away - there are no installation steps. You can add binary caches for faster builds:
spack mirror add rolling https://binaries.spack.io/develop
spack buildcache keys --install --trust
Do not miss out on the official Spack tutorial if you are new to Spack.
The spack command, spack/bin/spack
, can be used directly or spack/bin
can be added to your PATH
environment variable.
WarpX is built with the single command
spack install warpx
This will build the 3-D version of WarpX using the development
branch.
At the very end of the output from build sequence, Spack tells you where the WarpX executable has been placed.
Alternatively, spack load warpx
can be called, which will put the executable in your PATH
environment variable.
WarpX can be built in several variants, see
spack info warpx
spack info py-warpx
for all available options.
For example
spack install warpx dims=2 build_type=Debug
will build the 2-D version and also turns debugging on.
See spack help --spec
for all syntax details.
Also, please consult the basic usage section of the Spack package manager for an extended introduction to Spack.
The Python version of WarpX is available through the py-warpx
package.
Workflows
Profiling the Code
Profiling allows us to find the bottle-necks of the code as it is currently implemented. Bottle-necks are the parts of the code that may delay the simulation, making it more computationally expensive. Once found, we can update the related code sections and improve its efficiency. Profiling tools can also be used to check how load balanced the simulation is, i.e. if the work is well distributed across all MPI ranks used. Load balancing can be activated in WarpX by setting input parameters, see the parallelization input parameter section.
AMReX’s Tiny Profiler
By default, WarpX uses the AMReX baseline tool, the TINYPROFILER, to evaluate the time information for different parts of the code (functions) between the different MPI ranks. The results, timers, are stored into four tables in the standard output, stdout, that are located below the simulation steps information and above the warnings regarding unused input file parameters (if there were any).
The timers are displayed in tables for which the columns correspond to:
name of the function
number of times it is called in total
minimum of time spent exclusively/inclusively in it, between all ranks
average of time, between all ranks
maximum time, between all ranks
maximum percentage of time spent, across all ranks
If the simulation is well load balanced, the minimum, average and maximum times should be identical.
The top two tables refer to the complete simulation information. The bottom two are related to the Evolve() section of the code (where each time step is computed).
Each set of two tables shows the exclusive (top) and inclusive (bottom) information, depending on whether the time spent in nested sections of the code is included.
Note
When creating performance-related issues on the WarpX GitHub repo, please include Tiny Profiler tables (besides the usual issue description, input file and submission script), or (even better) the whole standard output.
For more detailed information please visit the AMReX profiling documentation. There is a script located here that parses the Tiny Profiler output and generates a JSON file that can be used with Hatchet in order to analyze performance.
AMReX’s Full Profiler
The Tiny Profiler provides a summary across all MPI ranks. However, when analyzing load-balancing, it can be useful to have more detailed information about the behavior of each individual MPI rank. The workflow for doing so is the following:
Compile WarpX with full profiler support:
cmake -S . -B build -DAMReX_BASE_PROFILE=YES -DAMReX_TRACE_PROFILE=YES -DAMReX_COMM_PROFILE=YES -DAMReX_TINY_PROFILE=OFF
cmake --build build -j 4
Warning
Please note that the AMReX build options for AMReX_TINY_PROFILE (our default: ON) and full profiling traces via AMReX_BASE_PROFILE are mutually exclusive. Further tracing options are sub-options of AMReX_BASE_PROFILE.
To turn on the tiny profiler again, remove the build directory or turn off AMReX_BASE_PROFILE again:
cmake -S . -B build -DAMReX_BASE_PROFILE=OFF -DAMReX_TINY_PROFILE=ON
Run the simulation to be profiled. Note that the WarpX executable will create a new folder bl_prof, which contains the profiling data.
Note
When using the full profiler, it is usually useful to profile only a few PIC iterations (e.g. 10-20 PIC iterations), in order to improve readability. If the interesting PIC iterations occur only late in a simulation, you can run the first part of the simulation without profiling, then create a checkpoint, and then restart the simulation for 10-20 steps with the full profiler on.
Note
The next steps can be done on a local computer (even if the simulation itself ran on an HPC cluster). In this case, simply copy the folder bl_prof to your local computer.
In order to visualize the profiling data, install amrvis using spack:
spack install amrvis dims=2 +profiling
Then create timeline database from the bl_prof data and open it:
<amrvis-executable> -timelinepf bl_prof/
<amrvis-executable> pltTimeline/
In the above, <amrvis-executable> should be replaced by the actual name of your amrvis executable, which can be found by starting to type amrvis and then using Tab completion, in a Terminal.
This will pop up a window with the timeline. Here are a few guidelines to navigate it:
Use the horizontal scroller to find the area where the 10-20 PIC steps occur.
In order to zoom in on an area, you can drag and drop with the mouse, and then hit Ctrl-S on the keyboard.
You can directly click on the timeline to see which actual MPI call is being performed. (Note that the colorbar can be misleading.)
Nvidia Nsight-Systems
Vendor homepage and product manual.
Nsight-Systems provides system level profiling data, including CPU and GPU interactions. It runs quickly, and provides a convenient visualization of profiling results including NVTX timers.
Perlmutter Example
Example of how to create traces on a multi-GPU system that uses the Slurm scheduler (e.g., NERSC’s Perlmutter system). You can either run this on an interactive node or use the Slurm batch script header documented here.
# GPU-aware MPI
export MPICH_GPU_SUPPORT_ENABLED=1
# 1 OpenMP thread
export OMP_NUM_THREADS=1
export TMPDIR="$PWD/tmp"
rm -rf ${TMPDIR} profiling*
mkdir -p ${TMPDIR}
# record
srun --ntasks=4 --gpus=4 --cpu-bind=cores \
nsys profile -f true \
-o profiling_%q{SLURM_TASK_PID} \
-t mpi,cuda,nvtx,osrt,openmp \
--mpi-impl=mpich \
./warpx.3d.MPI.CUDA.DP.QED \
inputs_3d \
warpx.numprocs=1 1 4 amr.n_cell=512 512 2048 max_step=10
Note
If everything went well, you will obtain as many output files named profiling_<number>.nsys-rep
as active MPI ranks.
Each MPI rank’s performance trace can be analyzed with the Nsight System graphical user interface (GUI).
In WarpX, every MPI rank is associated with one GPU, each of which creates one trace file.
Warning
The last line of the sbatch file has to match the data of your input files.
Summit Example
Example of how to create traces on a multi-GPU system that uses the
jsrun
scheduler (e.g., OLCF’s Summit system):
# nsys: remove old traces
rm -rf profiling* tmp-traces
# nsys: a location where we can write temporary nsys files to
export TMPDIR=$PWD/tmp-traces
mkdir -p $TMPDIR
# WarpX: one OpenMP thread per MPI rank
export OMP_NUM_THREADS=1
# record
jsrun -n 4 -a 1 -g 1 -c 7 --bind=packed:$OMP_NUM_THREADS \
nsys profile -f true \
-o profiling_%p \
-t mpi,cuda,nvtx,osrt,openmp \
--mpi-impl=openmpi \
./warpx.3d.MPI.CUDA.DP.QED inputs_3d \
warpx.numprocs=1 1 4 amr.n_cell=512 512 2048 max_step=10
Warning
Sep 10th, 2021 (OLCFHELP-3580):
The Nsight-Systems (nsys) version installed on Summit does not record details of GPU kernels.
This is reported to Nvidia and OLCF.
Details
In these examples, the individual lines for recording a trace profile are:
srun: execute multi-GPU runs with srun (Slurm's mpiexec wrapper), here for four GPUs
-f true: overwrite previously written trace profiles
-o: record one profile file per MPI rank (per GPU); if you run mpiexec/mpirun with OpenMPI directly, replace SLURM_TASK_PID with OMPI_COMM_WORLD_RANK
-t: select a couple of APIs to trace
--mpi-impl: optional, hint the MPI flavor
./warpx...: select the WarpX executable and a good inputs file
warpx.numprocs=...: make the run short, reasonably small, and run only a few steps
Now open the created trace files (per rank) in the Nsight-Systems GUI. This can be done on another system than the one that recorded the traces. For example, if you record on a cluster and open the analysis GUI on your laptop, it is recommended to make sure that versions of Nsight-Systems match on the remote and local system.
Nvidia Nsight-Compute
Vendor homepage and product manual.
Nsight-Compute captures fine grained information at the kernel level concerning resource utilization. By default, it collects a lot of data and runs slowly (can be a few minutes per step), but provides detailed information about occupancy, and memory bandwidth for a kernel.
Example
Example of how to create traces on a single-GPU system. A jobscript for Perlmutter is shown, but the SBATCH headers are not strictly necessary as the command only profiles a single process. This can also be run on an interactive node, or without a workload management system.
#!/bin/bash -l
#SBATCH -t 00:30:00
#SBATCH -N 1
#SBATCH -J ncuProfiling
#SBATCH -A <your account>
#SBATCH -q regular
#SBATCH -C gpu
#SBATCH --ntasks-per-node=1
#SBATCH --gpus-per-task=1
#SBATCH --gpu-bind=map_gpu:0
#SBATCH --mail-user=<email>
#SBATCH --mail-type=ALL
# record
dcgmi profile --pause
ncu -f -o out \
--target-processes all \
--set detailed \
--nvtx --nvtx-include="WarpXParticleContainer::DepositCurrent::CurrentDeposition/" \
./warpx input max_step=1 \
&> warpxOut.txt
Note
To collect full statistics, Nsight-Compute reruns kernels, temporarily saving device memory in host memory. This makes it slower than Nsight-Systems, so the provided script profiles only a single step of a single process. This is generally enough to extract relevant information.
Details
In the example above, the individual lines for recording a trace profile are:
dcgmi profile --pause: other profiling tools can't be collecting data, see this Q&A.
-f: overwrite previously written trace profiles.
-o: output file for profiling.
--target-processes all: required for multiprocess code.
--set detailed: controls what profiling data is collected. If only interested in a few things, this can improve profiling speed. detailed gets pretty much everything.
--nvtx: collects NVTX data. See note.
--nvtx-include: tells the profiler to only profile the given sections. You can also use -k to profile only a given kernel.
./warpx...: select the WarpX executable and a good inputs file.
Now open the created trace file in the Nsight-Compute GUI. As with Nsight-Systems, this can be done on another system than the one that recorded the traces. For example, if you record on a cluster and open the analysis GUI on your laptop, it is recommended to make sure that versions of Nsight-Compute match on the remote and local system.
Note
nvtx-include syntax is very particular. The trailing / in the example is significant. For full information, see Nvidia's documentation on NVTX filtering.
Testing the code
When adding a new feature, you want to make sure that (i) you did not break the existing code and (ii) your contribution gives correct results. While existing capabilities are tested regularly remotely (when commits are pushed to an open PR on CI, and every night on local clusters), it can also be useful to run tests on your custom input file. This section details how to use both automated and custom tests.
Continuous Integration in WarpX
Configuration
Our regression tests are using the suite published and documented at AMReX-Codes/regression_testing.
Most of the configuration of our regression tests happens in Regression/WarpX-tests.ini
.
We slightly modify this file in Regression/prepare_file_ci.py
.
For example, if you would like to change the compilation to build on Nvidia GPUs, modify this block to add -DWarpX_COMPUTE=CUDA
:
[source]
dir = /home/regtester/AMReX_RegTesting/warpx
branch = development
cmakeSetupOpts = -DAMReX_ASSERTIONS=ON -DAMReX_TESTING=ON -DWarpX_COMPUTE=CUDA
We also support changing compilation options via the usual build environment variables.
For instance, compiling with clang++ -Werror
would be:
export CXX=$(which clang++)
export CXXFLAGS="-Werror"
Run Pre-Commit Tests Locally
When proposing code changes to WarpX, we perform a couple of automated stylistic and correctness checks on the code change. You can run those locally before you push to save some time; install them once like this:
python -m pip install -U pre-commit
pre-commit install
See pre-commit.com and our .pre-commit-config.yaml
file in the repository for more details.
Run the test suite locally
Once your new feature is ready, there are ways to check that you did not break anything. WarpX has automated tests running every time a commit is added to an open pull request. The list of automated tests is defined in ./Regression/WarpX-tests.ini.
For easier debugging, it can be convenient to run the tests on your local machine by executing the script ./run_test.sh from WarpX’s root folder, as illustrated in the examples below:
# Example:
# run all tests defined in ./Regression/WarpX-tests.ini
./run_test.sh
# Example:
# run only the test named 'pml_x_yee'
./run_test.sh pml_x_yee
# Example:
# run only the tests named 'pml_x_yee', 'pml_x_ckc' and 'pml_x_psatd'
./run_test.sh pml_x_yee pml_x_ckc pml_x_psatd
Note that the script ./run_test.sh runs the tests with the exact same compile-time options and runtime options used to run the tests remotely.
Moreover, the script ./run_test.sh compiles all the executables that are necessary in order to run the chosen tests.
The default number of threads allotted for compiling is set with numMakeJobs = 8
in ./Regression/WarpX-tests.ini.
However, when running the tests on a local machine, it is usually possible and convenient to allot more threads for compiling, in order to speed up the builds.
This can be accomplished by setting the environment variable WARPX_CI_NUM_MAKE_JOBS
, with the preferred number of threads that fits your local machine, e.g. export WARPX_CI_NUM_MAKE_JOBS=16
(or less if your machine is smaller).
On public CI, we overwrite the value to WARPX_CI_NUM_MAKE_JOBS=2
, in order to avoid overloading the available remote resources.
Note that this will not change the number of threads used to run each test, but only the number of threads used to compile each executable necessary to run the tests.
Once the execution of ./run_test.sh is completed, you can find all the relevant files associated with each test in one single directory.
For example, if you run the single test pml_x_yee
, as shown above, on 04/30/2021, you can find all relevant files in the directory ./test_dir/rt-WarpX/WarpX-tests/2021-04-30/pml_x_yee/
.
The content of this directory will look like the following (possibly including backtraces if the test crashed at runtime):
$ ls ./test_dir/rt-WarpX/WarpX-tests/2021-04-30/pml_x_yee/
analysis_pml_yee.py # Python analysis script
inputs_2d # input file
main2d.gnu.TEST.TPROF.MTMPI.OMP.QED.ex # executable
pml_x_yee.analysis.out # Python analysis output
pml_x_yee.err.out # error output
pml_x_yee.make.out # build output
pml_x_yee_plt00000/ # data output (initialization)
pml_x_yee_plt00300/ # data output (last time step)
pml_x_yee.run.out # test output
Add a test to the suite
There are three steps to follow to add a new automated test (illustrated here for PML boundary conditions):
An input file for your test, in folder Examples/Tests/…. For the PML test, the input file is at Examples/Tests/pml/inputs_2d. You can also re-use an existing input file (even better!) and pass specific parameters at runtime (see below).
A Python script that reads simulation output and tests correctness versus theory or calibrated results. For the PML test, see Examples/Tests/pml/analysis_pml_yee.py. It typically ends with the Python statement assert( error<0.01 ).
If you need a new Python package dependency for testing, add it in Regression/requirements.txt.
Add an entry to Regression/WarpX-tests.ini, so that a WarpX simulation runs your test in the continuous integration process, and the Python script is executed to assess the correctness. For the PML test, the entry is
[pml_x_yee]
buildDir = .
inputFile = Examples/Tests/pml/inputs2d
runtime_params = warpx.do_dynamic_scheduling=0 algo.maxwell_solver=yee
dim = 2
addToCompileString =
cmakeSetupOpts = -DWarpX_DIMS=2
restartTest = 0
useMPI = 1
numprocs = 2
useOMP = 1
numthreads = 1
compileTest = 0
doVis = 0
analysisRoutine = Examples/Tests/pml/analysis_pml_yee.py
If you re-use an existing input file, you can add arguments to runtime_params
, like runtime_params = amr.max_level=1 amr.n_cell=32 512 max_step=100 plasma_e.zmin=-200.e-6
.
Note
If you added analysisRoutine = Examples/analysis_default_regression.py
, then run the new test case locally and add the checksum file for the expected output.
Note
We run those tests on our continuous integration services, which at the moment only have 2 virtual CPU cores.
Thus, make sure that the product of numprocs
and numthreads
for a test is <=2
.
Useful tool for plotfile comparison: fcompare
AMReX provides fcompare
, an executable that takes two plotfiles
as input and returns the absolute and relative difference for each field between these two plotfiles. For some changes in the code, it is very convenient to run the same input file with an old and your current version, and fcompare
the plotfiles at the same iteration. To use it:
# Compile the executable
cd <path to AMReX>/Tools/Plotfile/ # This may change
make -j 8
# Run the executable to compare old and new versions
<path to AMReX>/Tools/Plotfile/fcompare.gnu.ex old/plt00200 new/plt00200
which should return something like
variable name absolute error relative error
(||A - B||) (||A - B||/||A||)
----------------------------------------------------------------------------
level = 0
jx 1.044455105e+11 1.021651316
jy 4.08631977e+16 7.734299273
jz 1.877301764e+14 1.073458933
Ex 4.196315448e+10 1.253551615
Ey 3.330698083e+12 6.436470137
Ez 2.598167798e+10 0.6804387128
Bx 273.8687473 2.340209782
By 152.3911863 1.10952567
Bz 37.43212767 2.1977289
part_per_cell 15 0.9375
Ex_fp 4.196315448e+10 1.253551615
Ey_fp 3.330698083e+12 6.436470137
Ez_fp 2.598167798e+10 0.6804387128
Bx_fp 273.8687473 2.340209782
By_fp 152.3911863 1.10952567
Bz_fp 37.43212767 2.1977289
Documentation
Doxygen documentation
WarpX uses Doxygen documentation. Whenever you create a new class, please document it where it is declared (typically in the header file):
/** \brief A brief title
*
* few-line description explaining the purpose of MyClass.
*
* If you are kind enough, also quickly explain how things in MyClass work.
* (typically a few more lines)
*/
class MyClass
{ ... }
Doxygen reads this docstring, so please be accurate with the syntax! See Doxygen manual for more information. Similarly, please document functions when you declare them (typically in a header file) like:
/** \brief A brief title
*
* few-line description explaining the purpose of my_function.
*
* \param[in,out] my_int a pointer to an integer variable on which
* my_function will operate.
* \return what is the meaning and value range of the returned value
*/
int MyClass::my_function (int* my_int);
An online version of this documentation is linked here.
Breathe documentation
Your Doxygen documentation is not only useful for people looking into the code, it is also part of the WarpX online documentation based on Sphinx!
This is done using the Python module Breathe, which allows you to write Doxygen documentation directly in the source and have it included in your Sphinx documentation, by calling Breathe functions.
For instance, the following line will get the Doxygen documentation for WarpXParticleContainer
in Source/Particles/WarpXParticleContainer.H
and include it in the html page generated by Sphinx:
.. doxygenclass:: WarpXParticleContainer
Building the documentation
To build the documentation on your local computer, you will need to install Doxygen as well as the Python module breathe. First, make sure you are in the root directory of WarpX’s source and install the Python requirements:
python3 -m pip install -r Docs/requirements.txt
You will also need Doxygen (macOS: brew install doxygen
; Ubuntu: sudo apt install doxygen
).
Then, to compile the documentation, use
cd Docs/
make html
# This will first compile the Doxygen documentation (execute doxygen)
# and then build html pages from rst files using sphinx and breathe.
Open the created build/html/index.html
file with your favorite browser.
Rebuild and refresh as needed.
Checksum regression tests
WarpX has checksum regression tests: as part of CI testing, when running a given test, the checksum module computes one aggregated number per field (Ex_checksum = np.sum(np.abs(Ex))
) and compares it to a reference (benchmark). This should be sensitive enough to make the test fail if your PR causes a significant difference, print meaningful error messages, and give you a chance to fix a bug or reset the benchmark if needed.
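As a minimal sketch of this aggregation, with stand-in data (the actual module reads the fields from the plotfile or openPMD output of a test):
import numpy as np

# Stand-in field data; in the real checksum module these arrays come from the
# output of a test run.
fields_data = {
    "Ex": np.random.rand(64, 64),
    "Ey": np.random.rand(64, 64),
}
# One aggregated number per field: the sum of absolute values.
checksums = {name: float(np.sum(np.abs(arr))) for name, arr in fields_data.items()}
print(checksums)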
The checksum module is located in Regression/Checksum/
, and the benchmarks are stored as human-readable JSON files in Regression/Checksum/benchmarks_json/
, with one file per benchmark (for instance, test Langmuir_2d
has a corresponding benchmark Regression/Checksum/benchmarks_json/Langmuir_2d.json
).
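Since the benchmarks are plain JSON, they can also be inspected directly; the sketch below assumes it is run from the root of the WarpX repository, and the exact key layout may vary between tests:
import json

with open("Regression/Checksum/benchmarks_json/Langmuir_2d.json") as f:
    benchmark = json.load(f)

# Print the stored reference values (typically grouped by mesh level and species).
for group, values in benchmark.items():
    print(group, values)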
For more details on the implementation, the Python files in Regression/Checksum/
should be well documented.
From a user point of view, you should only need to use checksumAPI.py
. It contains Python functions that can be imported and used from an analysis Python script. It can also be executed directly as a Python script. Here are recipes for the main tasks related to checksum regression tests in WarpX CI.
Include a checksum regression test in an analysis Python script
This relies on the function evaluate_checksum
:
- checksumAPI.evaluate_checksum(test_name, output_file, output_format='plotfile', rtol=1e-09, atol=1e-40, do_fields=True, do_particles=True)
Compare output file checksum with benchmark. Read checksum from output file, read benchmark corresponding to test_name, and assert their equality.
- Parameters:
test_name (string) – Name of test, as found between [] in .ini file.
output_file (string) – Output file from which the checksum is computed.
output_format (string) – Format of the output file (plotfile, openpmd).
rtol (float, default=1.e-9) – Relative tolerance for the comparison.
atol (float, default=1.e-40) – Absolute tolerance for the comparison.
do_fields (bool, default=True) – Whether to compare fields in the checksum.
do_particles (bool, default=True) – Whether to compare particles in the checksum.
For an example, see
#!/usr/bin/env python3
import os
import re
import sys
sys.path.insert(1, '../../../../warpx/Regression/Checksum/')
import checksumAPI
# this will be the name of the plot file
fn = sys.argv[1]
# Get name of the test
test_name = os.path.split(os.getcwd())[1]
# Run checksum regression test
if re.search( 'single_precision', fn ):
checksumAPI.evaluate_checksum(test_name, fn, rtol=2.e-6)
else:
checksumAPI.evaluate_checksum(test_name, fn)
This can also be included in an existing analysis script. Note that the plotfile must be <test name>_plt?????
, as is generated by the CI framework.
Evaluate a checksum regression test from a bash terminal
You can execute checksumAPI.py
as a Python script for that, and pass the plotfile that you want to evaluate, as well as the test name (so the script knows which benchmark to compare it to).
./checksumAPI.py --evaluate --output-file <path/to/plotfile> --output-format <'openpmd' or 'plotfile'> --test-name <test name>
See additional options
--skip-fields: if you don't want the fields to be compared (in that case, the benchmark must not have fields)
--skip-particles: same thing for particles
--rtol: relative tolerance for the comparison
--atol: absolute tolerance for the comparison (a sum of both is used by numpy.isclose())
Create/Reset a benchmark with new values that you know are correct
Create/Reset a benchmark from a plotfile generated locally
This is using checksumAPI.py
as a Python script.
./checksumAPI.py --reset-benchmark --output-file <path/to/plotfile> --output-format <'openpmd' or 'plotfile'> --test-name <test name>
See additional options
--skip-fields: if you don't want the benchmark to have fields
--skip-particles: same thing for particles
Since this will automatically change the JSON file stored on the repo, make a separate commit just for this file, and if possible commit it under the Tools
name:
git add <test name>.json
git commit -m "reset benchmark for <test name> because ..." --author="Tools <warpx@lbl.gov>"
Reset a benchmark from the Azure pipeline output on Github
Alternatively, the benchmarks can be reset using the output of the Azure continuous integration (CI) tests on Github. The output can be accessed by following the steps below:
On the Github page of the Pull Request, find (one of) the pipeline(s) failing due to benchmarks that need to be updated and click on “Details”.
Click on “View more details on Azure pipelines”.
Click on “Build & test”.
From this output, there are two options to reset the benchmarks:
For each of the tests failing due to benchmark changes, the output contains the content of the new benchmark file. This content can be copied and pasted into the corresponding benchmark file. For instance, if the failing test is LaserAcceleration_BTD, this content can be pasted into the file Regression/Checksum/benchmarks_json/LaserAcceleration_BTD.json.
If there are many tests failing in a single Azure pipeline, it might become more convenient to update the benchmarks automatically. WarpX provides a script for this, located in Tools/DevUtils/update_benchmarks_from_azure_output.py. This script can be used by following the steps below:
From the Azure output, click on “View raw log”.
This should lead to a page with the raw log text. Save it as a text file on your local computer.
On your local computer, go to the WarpX folder and cd to the Tools/DevUtils folder.
Run the command python update_benchmarks_from_azure_output.py /path/to/azure_output.txt. The benchmarks included in that Azure output should now be updated.
Repeat this for every Azure pipeline (e.g. cartesian2d, cartesian3d, qed) that contains benchmarks that need to be updated.
Fast, Local Compilation
For simplicity, WarpX compilation with CMake by default downloads, configures and compiles compatible versions of central dependencies such as AMReX, PICSAR, openPMD-api, pyAMReX and pybind11 on-the-fly, which is called a superbuild.
In some scenarios, e.g., when compiling without internet, with slow internet access, or when working on WarpX and its dependencies, modifications to the superbuild strategy might be preferable. In the below workflows, you as the developer need to make sure to use compatible versions of the dependencies you provide.
Compiling From Local Sources
This workflow is best for developers that make changes to WarpX, AMReX, PICSAR, openPMD-api and/or pyAMReX at the same time. For instance, use this if you add a feature in AMReX and want to try it in WarpX before it is proposed as a pull request for inclusion in AMReX.
Instead of downloading the source code of the above dependencies, one can also use an already cloned source copy.
For instance, clone these dependencies to $HOME/src:
cd $HOME/src
git clone https://github.com/ECP-WarpX/WarpX.git warpx
git clone https://github.com/AMReX-Codes/amrex.git
git clone https://github.com/openPMD/openPMD-api.git
git clone https://github.com/ECP-WarpX/picsar.git
git clone https://github.com/AMReX-Codes/pyamrex.git
git clone https://github.com/pybind/pybind11.git
Now modify the dependencies as needed in their source locations, update sources if you cloned them earlier, etc. When building WarpX, the following CMake flags will use the respective local sources:
cd src/warpx
rm -rf build
cmake -S . -B build \
-DWarpX_PYTHON=ON \
-DWarpX_amrex_src=$HOME/src/amrex \
-DWarpX_openpmd_src=$HOME/src/openPMD-api \
-DWarpX_picsar_src=$HOME/src/picsar \
-DWarpX_pyamrex_src=$HOME/src/pyamrex \
-DWarpX_pybind11_src=$HOME/src/pybind11
cmake --build build -j 8
cmake --build build -j 8 --target pip_install
Compiling With Pre-Compiled Dependencies
This workflow is the fastest way to compile WarpX when you only want to change code in WarpX and the central dependencies above are already available in the right configurations (e.g., with or without MPI or GPU support) from a module system or package manager.
Instead of downloading the source code of the above central dependencies, or using a local copy of their source, we can compile and install those dependencies once. By setting the CMAKE_PREFIX_PATH environment variable to the respective dependency install location prefixes, we can instruct CMake to find their install locations and configurations.
WarpX supports this with the following CMake flags:
cd src/warpx
rm -rf build
cmake -S . -B build \
-DWarpX_PYTHON=ON \
-DWarpX_amrex_internal=OFF \
-DWarpX_openpmd_internal=OFF \
-DWarpX_picsar_internal=OFF \
-DWarpX_pyamrex_internal=OFF \
-DWarpX_pybind11_internal=OFF
cmake --build build -j 8
cmake --build build -j 8 --target pip_install
As background, this is also how WarpX is built in package managers such as Spack and Conda-Forge.
Faster Python Builds
The Python bindings of WarpX and AMReX (pyAMReX) use pybind11. Since pybind11 relies heavily on C++ metaprogramming, speeding up the generated binding code requires that we perform a link-time optimization (LTO) step, also known as interprocedural optimization (IPO).
For fast local development cycles, one can skip LTO/IPO with the following flags:
cd src/warpx
cmake -S . -B build \
-DWarpX_PYTHON=ON \
-DWarpX_PYTHON_IPO=OFF \
-DpyAMReX_IPO=OFF
cmake --build build -j 8 --target pip_install
Note
We might transition to nanobind in the future, which does not rely on LTO/IPO for optimal binaries. You can contribute to this pyAMReX pull request to help explore this library (and check whether it works for the HPC/GPU compilers that we need to support).
For robustness, our pip_install target performs a regular wheel build and then installs it with pip.
Each time, this step checks whether the WarpX dependencies are properly installed, to avoid broken installations.
When developing without internet access, or in rapid development cycles after the first pip_install has succeeded, this pip check can be skipped by using the pip_install_nodeps target instead:
cmake --build build -j 8 --target pip_install_nodeps
CCache
WarpX builds will automatically search for CCache to speed up subsequent compilations in development cycles. Make sure a recent CCache version is installed to make use of this feature.
For power developers who switch a lot between fundamentally different WarpX configurations (e.g., 1D to 3D, GPU and CPU builds, many branches with different bases, or developing AMReX and WarpX at the same time), also consider increasing the CCache cache size and changing the cache directory if needed, e.g., due to storage quota constraints or to choose a fast(er) filesystem for the cache files.
The clang-tidy linter
Clang-tidy CI test
WarpX’s CI tests include several checks performed with the clang-tidy linter (currently version 15 of this tool). The complete list of checks enforced in CI tests can be found in the .clang-tidy configuration file.
Run clang-tidy linter locally
We provide a script to run clang-tidy locally. The script can be run as follows, provided that all the requirements to compile WarpX are met (see building from source). The script generates a simple wrapper to ensure that clang-tidy is only applied to WarpX source files and compiles WarpX in 1D, 2D, 3D, and RZ using this wrapper. By default, WarpX is compiled in single precision with the PSATD solver, the QED module, the QED table generator and embedded boundaries enabled, in order to find more potential issues with the clang-tidy tool.
A few optional environment variables can be set to tune the behavior of the script:
WARPX_TOOLS_LINTER_PARALLEL: sets the number of cores to be used for the compilation
CLANG, CLANGXX, and CLANGTIDY: set the version of the compiler and of the linter
Note: clang v15 is currently used in CI tests. It is therefore recommended to use this version. Otherwise, a newer version may find issues not currently covered by CI tests (checks are opt-in) while older versions may not find all the issues.
export WARPX_TOOLS_LINTER_PARALLEL=12
export CLANG=clang-15
export CLANGXX=clang++-15
export CLANGTIDY=clang-tidy-15
./Tools/Linter/runClangTidy.sh
FAQ
This section lists frequently asked developer questions.
What is 0.0_rt?
It’s a C++ floating-point literal for zero of type amrex::Real.
We use literals to define constants with a specific type, in this case the zero value.
There is also 0.0_prt, which is a literal zero of type amrex::ParticleReal.
In standard C++, you know: 0.0 (literal double), 0.0f (literal float) and 0.0L (literal long double).
We do not use those, so that we can configure the floating point precision at compile time and use different precisions for fields (amrex::Real) and particles (amrex::ParticleReal).
You can also write things like 42.0_prt if you need a value other than zero.
We use these C++ user literals ([1], [2], [3]) because we want to avoid that double operations, e.g., 3. / 4., implicit casts, or even worse, integer operations, e.g., 3 / 4, sneak into the code base and make results wrong or slower.
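For illustration, here is a minimal, self-contained sketch of how such user-defined literals behave. This is not the actual AMReX code; the Real and ParticleReal aliases below are stand-ins for the compile-time configured types:
#include <iostream>

// Stand-ins for the compile-time configured precision types (assumption for this sketch):
using Real = float;          // cf. amrex::Real in a single-precision build
using ParticleReal = double; // cf. amrex::ParticleReal in a double-precision build

// User-defined literals, analogous in spirit to _rt and _prt:
constexpr Real operator""_rt (long double x) { return static_cast<Real>(x); }
constexpr ParticleReal operator""_prt (long double x) { return static_cast<ParticleReal>(x); }

int main () {
    auto field_zero = 0.0_rt;   // has type Real, whatever precision was configured
    auto particle_w = 42.0_prt; // has type ParticleReal
    std::cout << field_zero + particle_w << std::endl;
    return 0;
}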
Do you worry about using size_t vs. uint vs. int for indexing things?
std::size_t is the C++ unsigned integer type used for all container sizes; it is close to, but not necessarily the same as, uint, depending on the platform.
For “hot” inner loops, you want to use int instead of an unsigned integer type. Why? Because signed integer overflow is (intentionally) undefined behavior in C++, so int needs no overflow handling, which lets compilers vectorize more easily: they do not need to check for an overflow every time the loop’s control/condition section is reached.
C++20 adds support for std::ssize (signed size), but we currently require C++17 for builds.
Thus, sometimes you need to static_cast<int>(...).
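As a minimal illustration (not WarpX code), the container size can be converted once so that the hot loop runs over a signed int index:
#include <cstdio>
#include <vector>

int main () {
    std::vector<double> charge_density(100, 1.0); // hypothetical field data

    // Convert the unsigned size_t once; the hot loop then uses a signed int.
    const int n = static_cast<int>(charge_density.size());
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        sum += charge_density[i];
    }
    std::printf("sum = %g\n", sum);
    return 0;
}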
What does std::make_unique do?
make_unique is a C++ factory function that creates a std::unique_ptr<T>.
Follow-up: Why use this over just my_ptr = new <class>?
Because so-called smart pointers, such as std::unique_ptr<T>, delete themselves automatically when they go out of scope.
That means: no memory leaks, because you cannot forget to delete them again.
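A short, hypothetical example (the Diagnostics class is made up for this sketch) of why the factory function is preferred over raw new:
#include <iostream>
#include <memory>

struct Diagnostics {              // hypothetical class, not from WarpX
    Diagnostics ()  { std::cout << "allocated\n"; }
    ~Diagnostics () { std::cout << "freed automatically\n"; }
};

int main () {
    {
        auto diags = std::make_unique<Diagnostics>(); // no raw new, no raw pointer ownership
        // ... use diags-> ...
    } // diags goes out of scope here: the object is deleted without an explicit delete
    std::cout << "no leak possible\n";
    return 0;
}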
Why name header files .H instead of .h?
This is just a convention that we follow throughout the code base, which slightly simplifies what we need to parse in our various build systems. We inherited it from AMReX. Generally speaking, C++ file endings can be arbitrary; we just keep them consistent to avoid confusion in the code base.
To be explicit and avoid confusion (with C/ObjC), we might change them all to .hpp and .cpp/.cxx at some point, but for now .H and .cpp is what we do (as in AMReX).
What are #include "..._fwd.H" and #include <...Fwd.H> files?
These are C++ forward declarations.
In C++, #include statements copy the referenced header file literally into place, which can significantly increase the time to compile a .cpp file into an object file, especially with transitive header files including each other.
In order to reduce compile time, we define forward declarations in WarpX and AMReX for commonly used, large classes. The C++ standard library also uses that concept, e.g., in <iosfwd>.
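A compressed, single-file sketch of the pattern (file names and classes are hypothetical, not actual WarpX headers):
// ---- MyField_fwd.H : cheap to include, only declares the name ----
class MyField;                     // forward declaration: no members, no further includes

// ---- SomeInterface.H : pointers/references only need the forward declaration ----
struct SomeInterface {
    void attach (MyField* field);
    MyField* m_field = nullptr;
};

// ---- MyField.H : the full (potentially expensive) definition ----
class MyField {
public:
    double value = 0.0;
};

// ---- SomeInterface.cpp : only here the complete type is required ----
void SomeInterface::attach (MyField* field) { m_field = field; }

int main () {
    MyField f;
    SomeInterface s;
    s.attach(&f);
    return 0;
}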
What does const int /*i_buffer*/ mean in an argument list?
This is often seen in a derived class that overrides an interface method.
It means we do not name the parameter because we do not use it in this override.
But we keep the name as a comment /* ... */, so that we know what we ignored when looking at the definition of the overriding method.
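A minimal, hypothetical example of an override that ignores a parameter while keeping its name visible as a comment:
#include <iostream>

struct Diagnostic {                              // hypothetical interface
    virtual ~Diagnostic () = default;
    virtual void Flush (const int i_buffer) = 0;
};

struct NullDiagnostic : public Diagnostic {
    // The buffer index is not used in this override, so the parameter is unnamed,
    // but its name stays as a comment for readability.
    void Flush (const int /*i_buffer*/) override {
        std::cout << "nothing to flush\n";
    }
};

int main () {
    NullDiagnostic d;
    d.Flush(0);
    return 0;
}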
What is Pinned Memory?
We need pinned aka “page locked” host memory when we:
do asynchronous copies between the host and device
want to write to CPU memory from a GPU kernel
A typical use case is initialization of our (filtered/processed) output routines.
AMReX provides pinned memory via the amrex::PinnedArenaAllocator, which is the last argument passed to constructors of ParticleContainer and MultiFab.
Read more on this here: How to Optimize Data Transfers in CUDA C/C++ (note that pinned memory is a host memory feature and works with all GPU vendors we support)
Bonus: under the hood, asynchronous MPI communications also pin and unpin memory. One of the benefits of GPU-aware MPI implementations is, besides the possibility to use direct device-device transfers, that MPI and GPU API calls are aware of each other’s pinning ambitions and do not create data races to unpin the same memory.
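As a sketch of what this looks like in code, the following allocates a MultiFab in pinned host memory. This assumes AMReX’s MFInfo().SetArena(...) pattern together with amrex::The_Pinned_Arena(); please check the AMReX documentation of your version for the exact API:
#include <AMReX.H>
#include <AMReX_MultiFab.H>

int main (int argc, char* argv[])
{
    amrex::Initialize(argc, argv);
    {
        // A small example domain, split into boxes and distributed over ranks.
        amrex::Box domain(amrex::IntVect(AMREX_D_DECL(0, 0, 0)),
                          amrex::IntVect(AMREX_D_DECL(31, 31, 31)));
        amrex::BoxArray ba(domain);
        ba.maxSize(16);
        amrex::DistributionMapping dm(ba);

        // Allocate the MultiFab in pinned (page-locked) host memory, e.g., as a
        // staging buffer for asynchronous device-host copies in output routines.
        amrex::MultiFab pinned_buffer(ba, dm, /*ncomp=*/1, /*ngrow=*/0,
                                      amrex::MFInfo().SetArena(amrex::The_Pinned_Arena()));
        pinned_buffer.setVal(0.0);
    }
    amrex::Finalize();
    return 0;
}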
Maintenance
Dependencies & Releases
Update WarpX’ Core Dependencies
WarpX has direct dependencies on AMReX and PICSAR, which we periodically update.
The following scripts automate this workflow, in case one needs a newer commit of AMReX or PICSAR between releases:
./Tools/Release/updateAMReX.py
./Tools/Release/updatepyAMReX.py
./Tools/Release/updatePICSAR.py
Create a new WarpX release
WarpX has one release per month.
The version number is set at the beginning of the month and follows the format YY.MM.
In order to create a GitHub release, you need to:
Create a new branch from development and update the version number in all source files. We usually wait for the AMReX release to be tagged first, then we also point to its tag. There is a script for updating the core dependencies of WarpX and the WarpX version:
./Tools/Release/updateAMReX.py
./Tools/Release/updatepyAMReX.py
./Tools/Release/updatePICSAR.py
./Tools/Release/newVersion.sh
For a WarpX release, ideally a git tag of AMReX & PICSAR shall be used instead of an unnamed commit. Then open a PR, wait for tests to pass and then merge it.
Local Commit (Optional): at the moment, @ax3l is managing releases and signs tags (naming: YY.MM) locally with his GPG key before uploading them to GitHub.
Publish: On the GitHub Release page, create a new release via Draft a new release. Either select the locally created tag or create one online (naming: YY.MM) on the merged commit of the PR from step 1. In the release description, please specify the compatible versions of dependencies (see previous releases) and provide info on the content of the release. In order to get a list of PRs merged since the last release, you may run
git log <last-release-tag>.. --format='- %s'
Optional/future: create a release-<version> branch, write a changelog, and backport bug-fixes for a few days.
Automated performance tests
WarpX has automated performance test scripts, which run weak scalings for various tests on a weekly basis. The results are stored in the perf_logs repo and plots of the performance history can be found on this page.
These performance tests run automatically, so they need to do git operations etc. For this reason, they need a separate clone of the source repos, so they don’t conflict with one’s usual operations. This is typically a sub-directory in $HOME, with the variable $AUTOMATED_PERF_TESTS pointing to it. Similarly, a directory is needed to run the simulations and store the results. By default, it is $SCRATCH/performance_warpx.
The test runs a weak scaling (1,2,8,64,256,512 nodes) for 6 different tests (Tools/PerformanceTests/automated_test_{1,2,3,4,5,6}_*), gathered in 1 batch job per number of nodes to avoid submitting too many jobs.
Setup on Summit @ OLCF
Here is an example setup for Summit:
# I put the next three lines in $HOME/my_bashrc.sh
export proj=aph114 # project for job submission
export AUTOMATED_PERF_TESTS=$HOME/AUTOMATED_PERF_TESTS/
export SCRATCH=/gpfs/alpine/scratch/$(whoami)/$proj/
mkdir $HOME/AUTOMATED_PERF_TESTS
cd $AUTOMATED_PERF_TESTS
git clone https://github.com/ECP-WarpX/WarpX.git warpx
git clone https://github.com/ECP-WarpX/picsar.git
git clone https://github.com/AMReX-Codes/amrex.git
git clone https://github.com/ECP-WarpX/perf_logs.git
Then, in $AUTOMATED_PERF_TESTS, create a file run_automated_performance_tests_512.sh with the following content:
#!/bin/bash -l
#BSUB -P APH114
#BSUB -W 00:15
#BSUB -nnodes 1
#BSUB -J PERFTEST
#BSUB -e err_automated_tests.txt
#BSUB -o out_automated_tests.txt
module load nano
module load cmake/3.20.2
module load gcc/9.3.0
module load cuda/11.0.3
module load blaspp/2021.04.01
module load lapackpp/2021.04.00
module load boost/1.76.0
module load adios2/2.7.1
module load hdf5/1.12.2
module unload darshan-runtime
export AMREX_CUDA_ARCH=7.0
export CC=$(which gcc)
export CXX=$(which g++)
export FC=$(which gfortran)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=$(which g++)
# Make sure all dependencies are installed and loaded
cd $HOME
module load python/3.8.10
module load freetype/2.10.4 # matplotlib
module load openblas/0.3.5-omp
export BLAS=$OLCF_OPENBLAS_ROOT/lib/libopenblas.so
export LAPACK=$OLCF_OPENBLAS_ROOT/lib/libopenblas.so
python3 -m pip install --user --upgrade pip
python3 -m pip install --user virtualenv
python3 -m venv $HOME/sw/venvs/warpx-perftest
source $HOME/sw/venvs/warpx-perftest/bin/activate
# While setting up the performance tests for the first time,
# execute the lines above this comment and then the commented
# lines below this comment once, before submission.
# The commented lines take too long for the job script.
#python3 -m pip install --upgrade pip
#python3 -m pip install --upgrade build packaging setuptools wheel
#python3 -m pip install --upgrade cython
#python3 -m pip install --upgrade numpy
#python3 -m pip install --upgrade markupsafe
#python3 -m pip install --upgrade pandas
#python3 -m pip install --upgrade matplotlib==3.2.2 # does not try to build freetype itself
#python3 -m pip install --upgrade bokeh
#python3 -m pip install --upgrade gitpython
#python3 -m pip install --upgrade tables
# Run the performance test suite
cd $AUTOMATED_PERF_TESTS/warpx/Tools/PerformanceTests/
python run_automated.py --n_node_list='1,2,8,64,256,512' --automated
# submit next week's job
cd $AUTOMATED_PERF_TESTS/
next_date=`date -d "+7 days" '+%Y:%m:%d:%H:%M'`
bsub -b $next_date ./run_automated_performance_tests_512.sh
Then, running bsub run_automated_performance_tests_512.sh will submit this job once, and all the following ones. It will:
Create the directory $SCRATCH/performance_warpx if it doesn’t exist.
Create 1 sub-directory per week per number of nodes (1,2,8,64,256,512).
Submit one job per number of nodes. It will run 6 different tests, each twice (to detect fluctuations).
Submit an analysis job that will read the results ONLY AFTER all runs are finished. This uses the dependency feature of the batch system. This job reads the Tiny Profiler output for each run and stores the results in a pandas file in HDF5 format.
Execute write_csv.py from the perf_logs repo to append a CSV and an HDF5 file with the new results.
Commit the results (but DO NOT PUSH YET).
Then, the user periodically has to
cd $AUTOMATED_PERF_TESTS/perf_logs
git pull # to get updates from someone else, or from another supercomputer
git push
This will update the database but not the online plots. For this, you need to periodically run something like
cd $AUTOMATED_PERF_TESTS/perf_logs
git pull
python generate_index_html.py
git add -u
git commit -m "upload new html page"
git push
Setup on Cori @ NERSC
Still to be written!
Epilogue
Glossary
In daily communication, we tend to abbreviate a lot of terms. It is important to us to make it easy to interact with the WarpX community and thus, this list shall help to clarify often used terms.
Abbreviations
ABLASTR: Accelerated BLAST Recipes, the library inside WarpX to share functionality with other BLAST codes
ALCF: Argonne Leadership Computing Facility, a supercomputing center located near Chicago, IL (USA)
ALS: Advanced Light Source, a U.S. Department of Energy scientific user facility at Lawrence Berkeley National Laboratory
AMR: adaptive mesh-refinement
BC: boundary condition (of a simulation)
BCK: Benkler-Chavannes-Kuster method, a stabilization technique for small cells in the electromagnetic solver
BTD: backtransformed diagnostics, a method to collect data for analysis from a boosted frame simulation
CEX: charge-exchange collisions
CFL: the Courant-Friedrichs-Lewy condition, a numerical parameter for the numerical convergence of PDE solvers
CI: continuous integration, automated tests that we perform before a proposed code-change is accepted; see PR
CPU: central processing unit; we usually mean a socket or generally the host side of a computer (compared to the accelerator, e.g. GPU)
DOE: The United States Department of Energy, the largest sponsor of national laboratory research in the United States of America
DSMC: Direct Simulation Monte Carlo, a method to capture collisions between kinetic particles
ECP: Exascale Computing Project, a U.S. DOE funding source that supports WarpX development
ECT: Enlarged Cell Technique, an electromagnetic solver with accurate resolution of perfectly conducting embedded boundaries
EB: embedded boundary, boundary conditions inside the simulation box, e.g. following material surfaces
EM: electromagnetic, e.g. EM PIC
ES: electrostatic, e.g. ES PIC
FDTD: Finite-difference time-domain or Yee’s method, a class of grid-based finite-difference field solvers
FRC: Field Reversed Configuration, an approach of magnetic confinement fusion
GPU: originally graphics processing unit, now used for fast general purpose computing (GPGPU); also called (hardware) accelerator
IO: input/output, usually files and/or data
IPO: interprocedural optimization, a collection of compiler optimization techniques that analyze the whole code to avoid duplicate calculations and optimize performance
ISI: Induced Spectral Incoherence (a laser pulse manipulation technique)
LDRD: Laboratory Directed Research and Development, a funding program in U.S. DOE laboratories that kick-started ABLASTR development
LPA: laser-plasma acceleration, historically used for laser-electron acceleration
LPI: laser-plasma interaction (often for laser-solid physics) or laser-plasma instability (often in fusion physics), depending on context
LTO: link-time optimization, program optimizations for file-by-file compilation that optimize object files before linking them together to an executable
LWFA: laser-wakefield acceleration (of electrons/leptons)
MCC: Monte-Carlo collisions wherein a kinetic species collides with a fluid species, for example used in glow discharge simulations
MR: mesh-refinement
MS: magnetostatic, e.g. MS PIC
MVA: magnetic-vortex acceleration (of protons/ions)
NERSC: National Energy Research Scientific Computing Center, a supercomputing center located in Berkeley, CA (USA)
NSF: the National Science Foundation, a large public agency in the United States of America, supporting research and education
OLCF: Oak Ridge Leadership Computing Facility, a supercomputing center located in Oak Ridge, TN (USA)
OTP: One-Time-Password; see 2FA
PDE: partial differential equation, an equation which imposes relations between the various partial derivatives of a multivariable function
PIC: particle-in-cell, the method implemented in WarpX
PICMI: Particle-In-Cell Modeling Interface, a standard proposing naming and structure conventions for particle-in-cell simulation input
PICSAR: Particle-In-Cell Scalable Application Resource, a high performance parallelization library intended to help scientists porting their Particle-In-Cell (PIC) codes to next generation of exascale computers
PR: github pull request, a proposed change to the WarpX code base
PSATD: pseudo-spectral analytical time-domain method, a spectral field solver with better numerical properties than FDTD solvers
PWFA: plasma-wakefield acceleration
RPA: radiation-pressure acceleration (of protons/ions), e.g. hole-boring (HB) or light-sail (LS) acceleration
RPP: Random Phase Plate (a laser pulse manipulation technique)
RZ: for the coordinate system r-z in cylindrical geometry; we use “RZ” when we refer to quasi-cylindrical geometry, decomposed in azimuthal modes (see details here)
SENSEI: Scalable in situ analysis and visualization, a lightweight framework for in situ data analysis offering access to multiple visualization and analysis backends
SEE: secondary electron emission
SSD: Smoothing by Spectral Dispersion (a laser pulse manipulation technique)
TNSA: target normal sheath acceleration (of protons/ions)
Terms
accelerator: depending on context, either a particle accelerator in physics or a hardware accelerator (e.g. GPU) in computing
AMReX: C++ library for block-structured adaptive mesh-refinement, a primary dependency of WarpX
Ascent: many-core capable flyweight in situ visualization and analysis infrastructure, a visualization backend usable with WarpX data
boosted frame: a Lorentz-boosted frame of reference for a simulation
evolve: this is a generic term to advance a quantity (same nomenclature in AMReX). For instance, WarpX::EvolveE(dt) advances the electric field for duration dt, PhysicalParticleContainer::Evolve(...) does field gather + particle push + current deposition for all particles in PhysicalParticleContainer, and WarpX::Evolve is the central WarpX function that performs 1 PIC iteration.
Frontier: an Exascale supercomputer at OLCF
hybrid-PIC: a plasma simulation scheme that combines fluid and kinetic approaches, with (usually) the electrons treated as a fluid and the ions as kinetic particles (see Kinetic-fluid Hybrid Model)
laser: most of the time, we mean a laser pulse
openPMD: Open Standard for Particle-Mesh Data Files, a community meta-data project for scientific data
Ohm’s law solver: the logic that solves for the electric-field when using the hybrid-PIC algorithm
Perlmutter: a pre-exascale supercomputer at NERSC, named after the Berkeley Lab Nobel laureate Saul Perlmutter
plotfiles: the internal binary format for data files in AMReX
Python: a popular scripted programming language
scraping: a term often used to refer to the process of removing, from the simulation, particles that have crossed into an embedded boundary or passed through an absorbing domain boundary
WarpX Governance
WarpX is led in an open governance model, described in this file.
Steering Committee
Current Roster
Jean-Luc Vay (chair)
Remi Lehe
Axel Huebl
See: GitHub team
Role
Members of the steering committee (SC) can change organizational settings, do administrative operations such as rename/move/archive repositories, change branch protection rules, etc. SC members can call votes for decisions (technical or governance).
The SC can veto decisions of the technical committee (TC) by voting in the SC. The TC can overwrite a veto with a 2/3rd majority vote in the TC. Decisions are documented in the weekly developer meeting notes and/or on the GitHub repository.
The SC can change the governance structure, but only in a unanimous vote.
Decision Process
Decisions of the SC usually happen in the weekly developer meetings, via e-mail or public chat.
Decisions are made in a non-confidential manner, by majority of the cast votes of SC members. Votes can be cast in an asynchronous manner, e.g., over the course of 1-2 weeks. In tie situations, the chair of the SC acts as the tie breaker.
Appointment Process
Appointed by current SC members in a unanimous vote. As an SC member, regularly attending and contributing to the weekly developer meetings is expected.
SC members can resign or be removed by majority vote, e.g., due to inactivity, bad acting or other reasons.
Technical Committee
Current Roster
Luca Fedeli
Roelof Groenewald
David Grote
Axel Huebl
Revathi Jambunathan
Remi Lehe
Andrew Myers
Maxence Thévenet
Jean-Luc Vay
Weiqun Zhang
Edoardo Zoni
See: GitHub team
Role
The technical committee (TC) is the core governance body, where under normal operations most ideas are discussed and decisions are made. Individual TC members can approve and merge code changes. Usually, they seek approval by another maintainer for their own changes, too. TC members lead - and weigh in on - technical discussions and, if needed, can call for a vote between TC members for a technical decision. TC members merge/close PRs and issues, and moderate (including block/mute) bad actors. The TC can propose governance changes to the SC.
Decision Process
Discussion in the TC usually happens in the weekly developer meetings.
If someone calls for a vote to make a decision: majority based on the cast votes; we need 50% of the committee participating to vote. In the absence of a quorum, the SC will decide according to its voting rules.
Votes are cast in a non-confidential manner. Decisions are documented in the weekly developer meeting notes and/or on the GitHub repository.
TC members can individually appoint new contributors, unless a vote is called on an individual.
Appointment Process
TC members are the maintainers of WarpX. As a TC member, regularly attending and contributing to the weekly developer meetings is expected.
One is appointed to the TC by the steering committee, in a unanimous vote, or by majority vote of the TC. The SC can veto appointments. Steering committee members can also be TC members.
TC members can resign or be removed by majority vote by either TC or SC, e.g., due to inactivity, bad acting or other reasons.
Contributors
Current Roster
See: GitHub team
Role
Contributors are valuable, vetted developers of WarpX. Contributions can be in many forms and not all need to be code contributions. Examples include code pull requests, support in issues & user discussions, writing and updating documentation, writing tutorials, visualizations, R&D on algorithms, testing and benchmarking, etc. Contributors can participate in developer meetings and weigh in on discussions. Contributors can “triage” (add labels) to pull requests, issues, and GitHub discussion pages. Contributors can comment and review PRs (but not merge).
Decision Process
Contributors can individually decide on classification (triage) of pull requests, issues, and GitHub discussion pages.
Appointment Process
Appointed after contributing to WarpX (see above) by any member of the TC.
The role can be lost by resigning or by decision of an individual TC or SC member, e.g., due to inactivity, bad acting or other.
Former Members
“Former members” are the giants on whose shoulders we stand. But, for the purpose of WarpX governance, they are not tracked as a governance role in WarpX. Instead, former (e.g., inactive) contributors are acknowledged separately in GitHub contributor tracking, the WarpX documentation, references, citable Zenodo archives of releases, etc. as appropriate.
Former members of SC, TC and Contributors are not kept in the roster, since committee role rosters shall reflect currently active members and the responsible governance body.
Funding and Acknowledgements
WarpX is supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering, and early testbed platforms, in support of the nation’s exascale computing imperative.
WarpX is supported by the CAMPA collaboration, a project of the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research and Office of High Energy Physics, Scientific Discovery through Advanced Computing (SciDAC) program.
ABLASTR seed development is supported by the Laboratory Directed Research and Development Program of Lawrence Berkeley National Laboratory under U.S. Department of Energy Contract No. DE-AC02-05CH11231.
CEA-LIDYL actively contributes to the co-development of WarpX. As part of this initiative, WarpX also receives funding from the French National Research Agency (ANR - Plasm-On-Chip), the Horizon H2020 program and CEA.
We acknowledge all the contributors and users of the WarpX community who contribute to the code quality with valuable code improvements and important feedback.