Checksums on Tests

When running an automated test, we often compare the data of the final time step of the test with expected values to catch accidental changes. Instead of storing full-size reference files, we calculate an aggregate checksum.

For this purpose, the checksum Python module computes one aggregated number per field (e.g., the sum of the absolute values of the array elements) and compares it to a reference value (benchmark). This check should be sensitive enough to make the test fail if your PR causes a significant difference; the module prints meaningful error messages and gives you a chance to fix a bug or reset the benchmark if needed.
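
As an illustration of the idea (a minimal sketch, not the actual implementation), such an aggregate checksum could be computed and compared like this:

import numpy as np

def field_checksum(field_array):
    # aggregate a full field array into a single number,
    # e.g. the sum of the absolute values of its elements
    return np.sum(np.abs(field_array))

# stand-in for a field array from the final time step of a test
field = np.random.default_rng(0).normal(size=(64, 64))

# the benchmark is the reference value stored for this field
benchmark_value = field_checksum(field)

# the test passes if the new checksum matches the benchmark within tolerances
assert np.isclose(field_checksum(field), benchmark_value, rtol=1e-9, atol=1e-40)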

The checksum module is located in Regression/Checksum/, and the benchmarks are stored as human-readable JSON files in Regression/Checksum/benchmarks_json/, with one file per benchmark (for example, the test test_2d_langmuir_multi has a corresponding benchmark Regression/Checksum/benchmarks_json/test_2d_langmuir_multi.json).
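
These benchmark files can be inspected like any other JSON file, for example:

import json

# load the stored benchmark of one test (path relative to the WarpX repository)
with open("Regression/Checksum/benchmarks_json/test_2d_langmuir_multi.json") as f:
    benchmark = json.load(f)

# each entry holds the aggregated reference values the checksum module compares against;
# the exact grouping of the keys depends on the test
for group, values in benchmark.items():
    print(group, values)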

For more details, please refer to the Python implementation in Regression/Checksum/.

From a user's point of view, you should only need checksumAPI.py. It contains Python functions that can be imported and used from an analysis Python script, and it can also be executed directly as a Python script.

How to compare checksums in your analysis script

This relies on the function evaluate_checksum:

checksumAPI.evaluate_checksum(test_name, output_file, output_format='plotfile', rtol=1e-09, atol=1e-40, do_fields=True, do_particles=True)

Compare output file checksum with benchmark. Read checksum from output file, read benchmark corresponding to test_name, and assert their equality.

If the environment variable CHECKSUM_RESET is set while this function is run, the evaluation will be replaced with a call to reset_benchmark (see below).

Parameters:
  • test_name (string) – Name of test, as found between [] in .ini file.

  • output_file (string) – Output file from which the checksum is computed.

  • output_format (string) – Format of the output file (plotfile, openpmd).

  • rtol (float, default=1.e-9) – Relative tolerance for the comparison.

  • atol (float, default=1.e-40) – Absolute tolerance for the comparison.

  • do_fields (bool, default=True) – Whether to compare fields in the checksum.

  • do_particles (bool, default=True) – Whether to compare particles in the checksum.

Here’s an example:

#!/usr/bin/env python3

import os
import sys

sys.path.insert(1, "../../../../warpx/Regression/Checksum/")
from checksumAPI import evaluate_checksum

# compare checksums
evaluate_checksum(
    test_name=os.path.split(os.getcwd())[1],
    output_file=sys.argv[1],
    rtol=1e-2,
)

This can also be included as part of an existing analysis script.
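
For instance, with evaluate_checksum imported as in the example above, a benchmark that contains only fields can be checked by turning off the particle comparison (the output file path here is just a placeholder):

# compare only the field checksums (the benchmark must not contain particles)
evaluate_checksum(
    test_name="test_2d_langmuir_multi",
    output_file="diags/diag1000080",
    do_particles=False,
)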

How to evaluate checksums from the command line

For that, execute checksumAPI.py as a Python script and pass the output file that you want to evaluate, as well as the test name (so the script knows which benchmark to compare it to).

./checksumAPI.py --evaluate --output-file <path/to/plotfile> --output-format <'openpmd' or 'plotfile'> --test-name <test name>

See the additional options:

  • --skip-fields if you don’t want the fields to be compared (in that case, the benchmark must not have fields)

  • --skip-particles same thing for particles

  • --rtol relative tolerance for the comparison

  • --atol absolute tolerance for the comparison (numpy.isclose() combines the two tolerances; see the illustration below)
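
numpy.isclose() deems two values close when |checksum - benchmark| <= atol + rtol * |benchmark|; the made-up numbers below illustrate this criterion:

import numpy as np

benchmark = 1.0e3          # reference value stored in the JSON benchmark file
checksum = 1.0e3 + 5.0e-6  # value computed from the new output

rtol, atol = 1e-9, 1e-40
print(np.isclose(checksum, benchmark, rtol=rtol, atol=atol))      # False
print(abs(checksum - benchmark) <= atol + rtol * abs(benchmark))  # False, same criterion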

How to create or reset checksums with local benchmark values

This also uses checksumAPI.py as a Python script:

./checksumAPI.py --reset-benchmark --output-file <path/to/plotfile> --output-format <'openpmd' or 'plotfile'> --test-name <test name>

See the additional options:

  • --skip-fields if you don’t want the benchmark to have fields

  • --skip-particles same thing for particles

Since this will automatically change the JSON file stored in the repository, make a separate commit just for this file and, if possible, commit it under the Tools name:

git add <test name>.json
git commit -m "reset benchmark for <test name> because ..." --author="Tools <warpx@lbl.gov>"

How to reset checksums for a list of tests with local benchmark values

If you set the environment variable CHECKSUM_RESET=ON (e.g., export CHECKSUM_RESET=ON) before running tests that are compared against existing benchmarks, the test analysis will reset the benchmarks to the new values and skip the comparison.
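
Schematically, the switch inside evaluate_checksum behaves as sketched below (this is only a sketch, not the code in Regression/Checksum/; the accepted values of CHECKSUM_RESET and the output path in the example call are assumptions):

import os

def evaluate_checksum_sketch(test_name, output_file, **kwargs):
    # sketch only; the exact set of accepted values is an assumption here
    if os.environ.get("CHECKSUM_RESET", "OFF").upper() in ("ON", "1", "TRUE"):
        print(f"resetting the benchmark for {test_name} from {output_file}")
    else:
        print(f"comparing {output_file} against the benchmark for {test_name}")

evaluate_checksum_sketch("test_2d_langmuir_multi", "diags/diag1000080")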

With CTest (coming soon), select the test(s) to reset by name or label.

# regex filter: matched names
CHECKSUM_RESET=ON ctest --test-dir build -R "Langmuir_multi|LaserAcceleration"

# ... check and commit changes ...

How to reset checksums for a list of tests with benchmark values from the Azure pipeline output

Alternatively, the benchmarks can be reset using the output of the Azure continuous integration (CI) tests on Github. The output can be accessed by following the steps below:

  • On the Github page of the Pull Request, find (one of) the pipeline(s) failing due to benchmarks that need to be updated and click on “Details”.

    Screen capture showing how to access Azure pipeline output on Github.
  • Click on “View more details on Azure pipelines”.

    Screen capture showing how to access Azure pipeline output on Github.
  • Click on “Build & test”.

    Screen capture showing how to access Azure pipeline output on Github.

From this output, there are two options to reset the benchmarks:

  1. For each of the tests failing due to benchmark changes, the output contains the content of the new benchmark file, as shown below. This content can be copied and pasted into the corresponding benchmark file. For instance, if the failing test is LaserAcceleration_BTD, this content can be pasted into the file Regression/Checksum/benchmarks_json/LaserAcceleration_BTD.json.

    Screen capture showing how to read new benchmark file from Azure pipeline output.
  2. If there are many tests failing in a single Azure pipeline, it might become more convenient to update the benchmarks automatically. WarpX provides a script for this, located in Tools/DevUtils/update_benchmarks_from_azure_output.py. This script can be used by following the steps below:

    • From the Azure output, click on “View raw log”.

      Screen capture showing how to download raw Azure pipeline output.
    • This should lead to a page that looks like the image below. Save it as a text file on your local computer.

      Screen capture showing how to download raw Azure pipeline output.
    • On your local computer, go to the WarpX folder and cd to the Tools/DevUtils folder.

    • Run the command python update_benchmarks_from_azure_output.py /path/to/azure_output.txt. The benchmarks included in that Azure output should now be updated.

    • Repeat this for every Azure pipeline (e.g. cartesian2d, cartesian3d, qed) that contains benchmarks that need to be updated.