CAMx Scalability with Parallelization
We present an example of CAMx runtimes on the US EPA's High Performance Computing (HPC) system (Atmos). The CAMx v6.40 configuration includes:
- Single domain over the eastern US, 225x225x25 grid at 12 km resolution
- CB6r2 gas-phase chemistry + CF aerosol chemistry
- PiG invoked for major point sources
- Source Apportionment: 9 regions x 1 sector, OSAT + PSAT (sulfur and nitrogen families), 220 total tracers
The plot below shows model speed for 1 simulation day using combinations of OpenMP (OMP) and MPI parallelization and combinations of standard disk and solid state (RAM) I/O. Speed improves up to 512 cores:
- Multiple points shown for each number of total cores result from different OMP/MPI combinations
- At 128 total cores, 128 MPI x 1 OMP is slowest, 32 MPI x 4 OMP is fastest
- Fast I/O (such as solid state drives) becomes important at large numbers of cores
- We recommend using OMP and MPI in combination
- Conduct tests to determine which OMP/MPI combinations work best for your model application and computer system
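
When planning such tests, it can help to list the OMP/MPI combinations available for a given total core count before timing each one. The Python sketch below is only an illustration and is not part of CAMx: the function names `mpi_omp_combinations` and `fastest_combination` and the `max_omp_threads` cap are hypothetical, and the sketch assumes that total cores equals MPI processes times OMP threads per process. It contains no timing data; any timings must come from your own runs.

```python
def mpi_omp_combinations(total_cores, max_omp_threads=None):
    """Return (mpi_processes, omp_threads) pairs whose product is total_cores.

    Assumption: total cores = MPI processes x OMP threads per process.
    max_omp_threads is a hypothetical cap, e.g. the cores available per node.
    """
    if total_cores < 1:
        raise ValueError("total_cores must be a positive integer")
    combos = []
    for omp in range(1, total_cores + 1):
        if max_omp_threads is not None and omp > max_omp_threads:
            break
        if total_cores % omp == 0:
            combos.append((total_cores // omp, omp))
    return combos


def fastest_combination(timings):
    """Pick the combination with the smallest wall-clock time.

    timings maps (mpi_processes, omp_threads) to seconds that you measured
    yourself; this function supplies no data of its own.
    """
    if not timings:
        raise ValueError("timings is empty; run your tests first")
    return min(timings, key=timings.get)


if __name__ == "__main__":
    # List candidate test configurations for 128 total cores,
    # allowing at most 16 OMP threads per MPI process.
    for mpi, omp in mpi_omp_combinations(128, max_omp_threads=16):
        print(f"{mpi:4d} MPI x {omp:2d} OMP")
```

For 128 total cores the candidate list includes both combinations named above, 128 MPI x 1 OMP and 32 MPI x 4 OMP. After timing each candidate on your own system, pass the measured wall-clock times to `fastest_combination` to select the best one.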

Acknowledgment: We thank US EPA Region 7 for providing these results.