Researchers supporting the development of NWChemEx, an open-source, exascale software platform for high-performance quantum chemistry simulations, have demonstrated a novel modular software design that can be extended to future architectures with minimal software engineering effort. Because its kernels can be plugged into many high-level algorithms, the design also provides a sustainable pathway for software development. Their work, funded by the Exascale Computing Project, addresses two challenges: finding programming models that perform well across the heterogeneous architectures found in the world’s supercomputers, which may use multicore CPUs and GPU accelerators from multiple vendors, and overcoming the limitations that single-source portability layers impose on optimizing scientific software workflows. The team demonstrated similar performance profiles on NVIDIA, AMD, and Intel GPUs for numerical integration of the exchange-correlation potential in Kohn-Sham density functional theory (KS-DFT), a quantum chemistry method critical to the simulation of molecules and materials. Their findings were published in the September 2021 issue of Parallel Computing.
The researchers’ modular, object-oriented software design separates the expression of scientific workflows from the implementation details of individual algorithmic kernels. A developer can express the overall algorithm in a high-level, single-source language while implementing a handful of performance-critical kernels on a per-architecture basis, which keeps software development efforts sustainable. On each architecture of interest, each kernel is implemented and optimized as a plugin that is loaded at run time or compile time, depending on the application. This makes the design extensible: the developer can target several architectures simultaneously without modifying the high-level algorithmic workflow. The modular design also allows rapid testing and prototyping of novel implementation strategies for individual architectures without interfering with implementations of other kernels or architectures.
Work is underway to unify the GPU and CPU implementations of the KS-DFT module in NWChemEx and to extend the GPU implementation to FPGAs and ASICs while maintaining the high-level algorithmic specification. If successful, the implementation would be the first of its kind in energy-efficient scientific computing, with high impact in the post-exascale computing era.
David B. Williams-Young, Abhishek Bagusetty, Wibe A. de Jong, Douglas Doerfler, Hubertus J.J. van Dam, Álvaro Vázquez-Mayagoitia, Theresa L. Windus, Chao Yang. “Achieving Performance Portability in Gaussian Basis Set Density Functional Theory on Accelerator Based Architectures in NWChemEx.” 2021. Parallel Computing (September).