The Co-design Center for Particle Applications (CoPA) works to make the “motif” of particle-based applications ready for exascale architectures. CoPA focuses on co-design of several “sub-motifs”: short-range particle-particle interactions (e.g., those that often dominate molecular dynamics (MD) and smoothed particle hydrodynamics methods), long-range particle-particle interactions (e.g., electrostatic MD and gravitational N-body), particle-in-cell (PIC) methods, and O(N) complexity electronic structure and quantum MD (QMD) algorithms. Relevant particle applications are represented within CoPA and help drive the co-design process. Exascale Computing Project (ECP) application projects such as EXAALT (LAMMPS-SNAP), WDMApp (XGC), ExaSky (HACC/SWFFT), and ExaAM (MPM) serve as application partners, as do non-ECP applications.
Particle-based simulation approaches are ubiquitous in computational science and engineering. The particles might represent, for example, atomic nuclei in quantum and classical MD methods, or gravitationally interacting bodies or tracer particles in N-body simulations. In each case, every particle interacts with its environment through the local electronic structure, through direct particle–particle interactions at shorter ranges, and/or through particle–mesh interactions between the particle and a local field set up by longer-range effects.
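To make the short-range case concrete, the following is a minimal sketch of a cell-list neighbor search, a common way to find all particle pairs within a cutoff without testing every pair. It is an illustration only, not code from any CoPA library; the function name `neighbor_pairs`, the non-periodic 3D domain, and the choice of a cell width equal to the cutoff are assumptions made for this example.

```python
import math
from collections import defaultdict
from itertools import product


def neighbor_pairs(positions, cutoff):
    """Return sorted (i, j) pairs, i < j, whose separation is <= cutoff.

    Hypothetical helper: bins particles into cubic cells of width `cutoff`,
    so any two particles within the cutoff sit in the same or adjacent cells.
    """
    cells = defaultdict(list)
    for idx, pos in enumerate(positions):
        cells[tuple(math.floor(c / cutoff) for c in pos)].append(idx)

    cutoff2 = cutoff * cutoff
    offsets = list(product((-1, 0, 1), repeat=3))  # the 27 surrounding cells
    pairs = []
    for cell, members in cells.items():
        for off in offsets:
            other = tuple(c + o for c, o in zip(cell, off))
            # .get avoids inserting empty cells while we iterate the dict
            for i in members:
                for j in cells.get(other, ()):
                    if i < j:  # count each pair exactly once
                        d2 = sum((a - b) ** 2
                                 for a, b in zip(positions[i], positions[j]))
                        if d2 <= cutoff2:
                            pairs.append((i, j))
    return sorted(pairs)


points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0),
          (2.4, 0.3, 0.0), (-0.4, -0.4, -0.4)]
print(neighbor_pairs(points, 1.0))  # prints [(0, 1), (0, 4), (2, 3)]
```

Because each particle is compared only against particles in nearby cells, the cost grows roughly linearly with particle count at fixed density, which is why short-range kernels are usually organized around such spatial binning.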
CoPA’s co-design process consists of using proxy applications, or apps, and libraries to aid the exascale readiness of application partners. Two main library directions have emerged: (1) the Cabana Particle Simulation Toolkit and (2) the Parallel, Rapid O(N), and Graph-Based Recursive Electronic Structure Solver (PROGRESS) and Basic Matrix Library (BML) QMD libraries. Each strives for performance portability, flexibility, and scalability across architectures with and without GPU acceleration by providing optimized data structures, data layouts, and data movement in the context of the sub-motifs it addresses. Cabana focuses on short-range and long-range particle interactions for MD, PIC, and N-body applications, whereas PROGRESS/BML focuses on O(N) complexity algorithms for electronic structure and QMD applications. QMD is computationally dominated by matrix operations, whereas the other sub-motifs share particle and particle-grid operations. Proxy apps are vehicles for evaluating the viability of various algorithms, data structures, and architecture-specific optimizations, along with the associated trade-offs; examples include ExaMiniMD, CabanaMD, CabanaPIC, and ExaSP2.
The Cabana toolkit provides particle algorithm implementations and user-configurable particle data structures. Through memory-wrapping interfaces, Cabana users can leverage the algorithms and computational kernels provided by the toolkit regardless of whether they also use the toolkit's native data structures. The algorithms cover the particle operations needed by each relevant application type across all sub-motifs. This includes intranode (i.e., local and threaded) operations on particles and internode (i.e., communication between nodes) operations, which together form a hybrid parallel capability. Cabana uses the ECP Kokkos programming model for on-node parallelism, providing performance and portability on pre-exascale and anticipated exascale systems across current and future US Department of Energy-deployed architectures, including multicore CPUs and GPUs. Within Cabana, Kokkos supplies abstractions for memory allocation, array-like data structures, and parallel loop concepts, which allow one code to be written for multiple architectures. Cabana is available at https://github.com/ECP-CoPA/Cabana.
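As a rough illustration of what a configurable particle data layout can look like, the sketch below stores particles in fixed-width tiles, each holding one short array per field (an array-of-structs-of-arrays, or AoSoA, arrangement, which is the layout Cabana's core container is named after). This is plain Python written for this overview, not the Cabana or Kokkos API; the class `AoSoA`, its methods, the `push` function, and the field names `x` and `v` are all hypothetical.

```python
class AoSoA:
    """Hypothetical tiled particle container (not the Cabana API).

    Particle i lives in tile i // vector_len at lane i % vector_len.
    Tuning vector_len trades off between array-of-structs (1) and
    struct-of-arrays (>= number of particles) layouts.
    """

    def __init__(self, fields, vector_len, num_particles):
        self.fields = tuple(fields)
        self.vector_len = vector_len
        self.size = num_particles
        num_tiles = -(-num_particles // vector_len)  # ceiling division
        self.tiles = [{name: [0.0] * vector_len for name in self.fields}
                      for _ in range(num_tiles)]

    def _locate(self, i):
        if not 0 <= i < self.size:
            raise IndexError(i)
        return divmod(i, self.vector_len)  # (tile index, lane index)

    def get(self, name, i):
        tile, lane = self._locate(i)
        return self.tiles[tile][name][lane]

    def set(self, name, i, value):
        tile, lane = self._locate(i)
        self.tiles[tile][name][lane] = value


def push(particles, dt):
    """Advance x by v * dt, looping tile by tile and then lane by lane."""
    for t, tile in enumerate(particles.tiles):
        # The last tile may be only partially filled.
        active = min(particles.vector_len,
                     particles.size - t * particles.vector_len)
        xs, vs = tile["x"], tile["v"]
        for lane in range(active):
            xs[lane] += vs[lane] * dt


particles = AoSoA(("x", "v"), vector_len=4, num_particles=10)
for i in range(particles.size):
    particles.set("x", i, float(i))
    particles.set("v", i, 1.0)
push(particles, 0.5)
```

In a performance-portable code, the outer loop over tiles would typically map to threads or GPU blocks and the inner loop over lanes to vector or warp lanes; here both are ordinary Python loops so that only the indexing scheme is on display.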
The PROGRESS/BML QMD libraries increase productivity in implementing and optimizing O(N) complexity and QMD algorithms by providing a framework in which the matrix operations are separate from the solver implementations. The framework relies on two main libraries: PROGRESS and BML. Electronic structure codes call the solvers in the PROGRESS library, which in turn rely on BML. The BML library provides basic matrix data structures and operations. These consist of linear algebra matrix operations that are optimized for the format of the matrix and the architecture on which the program will run. Applications can also implement specific algorithms directly on top of BML when those algorithms are not available as routines in PROGRESS. The overarching goal is to construct a flexible library ecosystem that helps to quickly adapt and optimize electronic structure applications on exascale architectures. PROGRESS is available at https://github.com/lanl/qmd-progress, and BML is available at https://github.com/lanl/bml.
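The separation between matrix operations and solvers can be sketched as follows. The solver below is a small trace-correcting second-order spectral projection (SP2) iteration, the kind of density-matrix algorithm the ExaSP2 proxy app is named for, written only in terms of a matrix backend object. None of this is the actual PROGRESS or BML interface: `DenseOps`, `sp2_density_matrix`, and their parameters are hypothetical names, the backend is a naive dense list-of-lists implementation, and the sketch assumes a symmetric Hamiltonian with a nonzero spectral width and a gap between occupied and unoccupied states.

```python
class DenseOps:
    """Hypothetical dense matrix backend; a sparse one could replace it."""

    @staticmethod
    def identity(n):
        return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    @staticmethod
    def multiply(a, b):
        n = len(a)
        return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    @staticmethod
    def axpby(alpha, a, beta, b):
        """Return alpha * a + beta * b."""
        return [[alpha * x + beta * y for x, y in zip(ra, rb)]
                for ra, rb in zip(a, b)]

    @staticmethod
    def trace(a):
        return sum(a[i][i] for i in range(len(a)))

    @staticmethod
    def gershgorin_bounds(a):
        """Cheap lower and upper bounds on the eigenvalues of a."""
        radii = [sum(abs(v) for j, v in enumerate(row) if j != i)
                 for i, row in enumerate(a)]
        lo = min(a[i][i] - radii[i] for i in range(len(a)))
        hi = max(a[i][i] + radii[i] for i in range(len(a)))
        return lo, hi


def sp2_density_matrix(h, n_occ, ops=DenseOps, tol=1e-9, max_iter=100):
    """Sketch of trace-correcting SP2; uses only the `ops` backend."""
    n = len(h)
    e_min, e_max = ops.gershgorin_bounds(h)
    scale = 1.0 / (e_max - e_min)
    # Map the spectrum into [0, 1], with low-energy states near 1.
    x = ops.axpby(e_max * scale, ops.identity(n), -scale, h)
    for _ in range(max_iter):
        x2 = ops.multiply(x, x)
        tr_x, tr_x2 = ops.trace(x), ops.trace(x2)
        if abs(tr_x - tr_x2) < tol:  # idempotent: eigenvalues are 0 or 1
            return x
        # Pick the projection whose trace lands closer to n_occ.
        if abs(tr_x2 - n_occ) < abs(2.0 * tr_x - tr_x2 - n_occ):
            x = x2
        else:
            x = ops.axpby(2.0, x, -1.0, x2)
    raise RuntimeError("SP2 did not converge")


h = [[0.0, 1.0], [1.0, 0.0]]
p = sp2_density_matrix(h, n_occ=1)  # approx [[0.5, -0.5], [-0.5, 0.5]]
```

Because the solver touches matrices only through `ops`, swapping in a backend with a different storage format or hardware target would leave the solver code unchanged, which is the design point the paragraph above describes.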
The FFTX library capability was recently added to CoPA. FFTX provides two approaches to FFTs: (1) “generic” heuristics for choosing a decomposition, possibly augmented by hand-coded implementations for a few specific sizes (e.g., powers of 2), and (2) analysis, code generation, and autotuning that depend on the details of the transform, including the specific radices. FFTX is working toward general-purpose FFTs on exascale architectures through code generation and is developing an integrated algorithms approach for the ECP projects ExaFEL, WarpX, and NWChemEx. FFTX is available at https://github.com/spiral-software/fftx.
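To show what a "decomposition" into radices means, here is a minimal mixed-radix Cooley-Tukey sketch. It is not FFTX code and does not reflect how FFTX selects or generates its transforms; the functions `smallest_factor` and `fft` are hypothetical, and always splitting off the smallest prime factor is just one simple heuristic assumed for this example.

```python
import cmath


def smallest_factor(n):
    """Smallest prime factor of n (returns n itself when n is prime)."""
    for p in range(2, int(n ** 0.5) + 1):
        if n % p == 0:
            return p
    return n


def fft(x):
    """Recursive mixed-radix decimation-in-time FFT (illustrative only)."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    r = smallest_factor(n)  # the radix chosen for this level
    if r == n:
        # Prime length: fall back to the direct O(n^2) DFT.
        return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n)
                    for j in range(n))
                for k in range(n)]
    m = n // r
    # r sub-transforms of length m, one per residue class of the index.
    subs = [fft(x[s::r]) for s in range(r)]
    # Combine the sub-transforms with twiddle factors exp(-2*pi*i*s*k/n).
    return [sum(cmath.exp(-2j * cmath.pi * s * k / n) * subs[s][k % m]
                for s in range(r))
            for k in range(n)]


spectrum = fft([1.0, 0.0, 0.0, 0.0])  # impulse: every output is approx 1
```

For a length such as 12, this heuristic yields radices 2, 2, and 3; a tuned or generated implementation might order or group the factors differently depending on the hardware, which is where analysis and autotuning come in.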
Library efforts, algorithm development, and interactions with the particle applications represented within CoPA all contribute to the co-design process and strategy. The computational kernels that require optimization for exascale computing are determined by the nature of the particle interactions. Applications with short-range, long-range, and particle-grid interactions are addressed within the Cabana library, whereas applications that require a quantum mechanical description of interactions are addressed within the PROGRESS and BML libraries. Including expertise and application partners that represent all the sub-motifs has allowed the CoPA team to understand the needs of short-range MD, long-range MD, PIC, and QMD applications and to create the corresponding libraries and proxy apps. Success is measured by the use of these products within ECP and non-ECP projects. Notable successes include the following.