NWChemEx

One main goal of the US Department of Energy’s (DOE’s) advanced biofuels program is to develop fuels that can be distributed by using the existing infrastructure and to replace existing fuels on a gallon-for-gallon basis. However, producing high-quality biofuels in a sustainable and economically competitive way is technically challenging, especially in a changing global climate. Designing feedstock to efficiently produce biomass and designing new catalysts to efficiently convert biomass-derived intermediates into biofuels are two significant science challenges involved in advanced biofuel development.

The NWChemEx project directly addresses a Priority Goal in DOE’s 2014–2018 Strategic Plan, namely by developing high-performance computational “models demonstrating that biomass can be a viable, sustainable feedstock” for the production of biofuels and other bioproducts. In addition to providing the means to resolve these biofuel challenge problems, NWChemEx will enable exascale computers to be applied toward many molecular-scale challenges, such as developing new materials for solar energy conversion and next-generation batteries, simulating chemical processes in combustion, CO2 capture, H2 production and storage, predicting the transport and sequestration of energy by-products in the environment, and designing new functional materials.

Project Details

The NWChemEx project is redesigning and reimplementing NWChem for pre-exascale and exascale computers. NWChemEx is based on NWChem, an open-source, high-performance parallel computational chemistry code funded by the DOE Biological and Environmental Research (BER) program that provides a broad range of capabilities for modeling molecular systems. NWChemEx will support a broad range of chemistry research important to DOE BER and DOE Basic Energy Sciences on computing systems that range from terascale workstations and petascale servers to exascale computers.

In particular, the NWChemEx project is developing high-performance, scalable implementations of three primary physical models:

  • Hartree-Fock (HF) and density functional theory (DFT) methods: These methods are the foundation for the physical models to be incorporated in the NWChemEx framework. Their implementation must be significantly revised to simulate the large molecular systems in the targeted science challenges on exascale computers. Both Gaussian and plane-wave basis set methods have been developed on the project.
  • Coupled cluster (CC) methods: A suite of canonical and reduced-scaling CC methods will be implemented in NWChemEx. These methods are the gold standard in electronic structure theory and provide the level of fidelity required to address the targeted science challenges.
  • Density functional embedding theory: Embedding techniques provide a natural and mathematically sound basis for seamlessly integrating subsystems with different electronic structure representations, enabling the active site of interest to be described with high-accuracy CC methods while using a lower fidelity method to describe the impact of the environment on the molecular processes in the active site.

To illustrate the performance of NWChemEx on biomolecular systems at the exascale, the ubiquitin molecule was selected as a performance benchmark. Ubiquitin is a protein typical of many biomolecules, and an abundance of experimental data is available for it and its fragments. Although it will be infeasible to run canonical coupled cluster calculations on ubiquitin, which is a 1,231-atom molecule, reduced-scaling CC calculations can be run on it. The availability of both implementations along with a sequence of ubiquitin fragments will allow any inaccuracies in the reduced-scaling method to be identified and corrected.
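A rough arithmetic sketch helps show why canonical CC is out of reach for the full protein. It relies on the standard formal scalings of canonical CCSD and CCSD(T), which grow as the sixth and seventh power of system size. The 100-atom fragment, the assumption that system size is proportional to atom count, and the idealized linear-scaling entry are illustrative choices, not figures from the project.

```python
# Formal cost exponents with respect to a system-size measure N.
# The exponents 6 and 7 are the textbook scalings of canonical CCSD and
# CCSD(T); the exponent 1 is an idealization of a reduced-scaling method
# and is an assumption made only for comparison.
SCALING_EXPONENTS = {
    "canonical CCSD": 6,
    "canonical CCSD(T)": 7,
    "idealized linear-scaling method": 1,
}


def relative_cost(size_ratio, exponent):
    """Cost multiplier when the system grows by size_ratio, assuming cost ~ N**exponent."""
    return size_ratio ** exponent


def cost_table(small_atoms, large_atoms):
    """Return cost multipliers per method, assuming N is proportional to atom count."""
    ratio = large_atoms / small_atoms
    return {method: relative_cost(ratio, p) for method, p in SCALING_EXPONENTS.items()}


# Hypothetical 100-atom fragment compared with 1,231-atom ubiquitin.
table = cost_table(small_atoms=100, large_atoms=1231)
for method, factor in table.items():
    print(f"{method}: about {factor:.1e} times the cost of the fragment")
```

Under these assumptions the canonical methods become millions of times more expensive than the fragment calculation, while the idealized linear-scaling cost grows only about twelvefold. This is why the fragment sequence is useful: canonical results remain affordable on small fragments and can be used to check the reduced-scaling method there.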

To illustrate the capability of NWChemEx for chemical reactions, the project will examine several elementary chemical transformations that have been postulated for the conversion of propanol to propene in the H-ZSM-5 zeolite (basic unit cell: Si96O192). Reduced-scaling CC calculations embedded in the water and zeolite environment will be used to refine the structures and energetics of the postulated elementary steps in the conversion of propanol to propene. Depending on the outcome of these calculations, additional work might be required to characterize the mechanism of this conversion more fully.

Principal Investigator(s):

Theresa Windus, Ames Laboratory

Collaborators:

Ames National Laboratory, Argonne National Laboratory, Brookhaven National Laboratory, Lawrence Berkeley National Laboratory, Oak Ridge National Laboratory, Pacific Northwest National Laboratory, Virginia Polytechnic Institute and State University

Progress to date

  • PluginPlay was designed and implemented as the framework for initiating and connecting modules. This component provides memoization and caching of results to reduce redundant computation, improved support for scripting, and application programming interfaces for I/O. A publication has just been accepted in the Journal of Chemical Physics.
  • The Chemist module that supports core data associated with quantum chemistry codes (e.g., molecular and basis set information) was implemented. Chemist provides a high-level domain-specific language (DSL) to facilitate writing computational chemistry software.
  • An initial beta version of TensorWrapper was designed and implemented. TensorWrapper is envisioned as a way of unifying existing disparate tensor libraries, with an initial focus on TiledArray and TAMM.
  • The simulation development environment, SimDE, was designed and implemented to provide a developer-friendly package for writing modular software for NWChemEx. SimDE maps the DSLs of Chemist and TensorWrapper to the generic infrastructure of PluginPlay. DOI: 10.1109/MCSE.2018.2884921.
  • Developed a set of distributed memory algorithms for the evaluation of the Coulomb and exact-exchange matrices for hybrid Kohn-Sham DFT with Gaussian basis sets via density-fitted (DF-J) and seminumerical (sn-K) methods.
  • Completed the implementation and most of the performance tuning on GPUs of the TAMM module for dense tensor operations. This has resulted in scalable and performant CCSD and CCSD(T) methods.
  • Completed the implementation of sparse tensors on GPUs within TAMM. Domain local pair natural orbital (DLPNO) CCSD(T) has been implemented and is currently being tuned to improve performance.
  • Explored several density embedding schemes with an updated implementation underway.
  • Full Gamma-point plane-wave DFT and HF, AIMD, QM/MM (stand-alone and interfaced to LAMMPS), QM-classical DFT, PAW (including lanthanide potentials), and PW-dielectric methods have been developed. In addition, the COVOS method was developed; it can be used to generate input for standard many-body methods, e.g., CCSD(T). A full band structure code is currently being developed.
  • Developed GPU support in GA that enables users to create global arrays that are hosted in GPU memory. Recent optimizations include an option for using GPU-aware MPI to eliminate some host-to-device data transfers and an option to replace standard memory copies with highly parallel kernels that copy strided data during transfers.
  • Added an option to use System V shared memory instead of POSIX shared memory in GA. This potentially overcomes a POSIX limitation that restricts users to using a maximum of half the available memory in global arrays.
  • Extensive user and developer documentation is in place and is under continual development.
  • Python interfaces are in place for C++ modules and functionality. More user-friendly Python interfaces are in development.
  • Developed CMinx, a tool for generating API documentation for CMake functions and modules. DOI: 10.21105/joss.04680.
  • Developed CMakeTest, a unit testing framework for CMake modules.
  • Developed the CMakePP language, an object-oriented version of CMake which is fully backwards compatible with traditional CMake. A publication is in preparation.
  • Used the CMakePP language to develop CMaize, a significantly streamlined build system for C++ projects. A publication is in preparation.
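
The memoization and caching mentioned for PluginPlay in the first item of this list can be illustrated with a minimal sketch. This is not PluginPlay's API: the `MemoizedModule` class, its `run` method, and the toy `nuclear_repulsion` function are hypothetical names chosen for illustration. The sketch only shows the general idea of keying a cache on a hash of a module's inputs so that a repeated request does not recompute the result.

```python
import hashlib
import json


class MemoizedModule:
    """Hypothetical wrapper that caches results keyed by a hash of the inputs."""

    def __init__(self, func):
        self._func = func
        self._cache = {}
        self.calls = 0  # how many times the wrapped function actually ran

    def _key(self, inputs):
        # Serialize deterministically so that equal inputs give equal keys.
        blob = json.dumps(inputs, sort_keys=True).encode("utf-8")
        return hashlib.sha256(blob).hexdigest()

    def run(self, **inputs):
        key = self._key(inputs)
        if key not in self._cache:
            self.calls += 1
            self._cache[key] = self._func(**inputs)
        return self._cache[key]


def nuclear_repulsion(charges, coords):
    """Toy stand-in for an expensive step: point-charge repulsion in atomic units."""
    energy = 0.0
    for i in range(len(charges)):
        for j in range(i + 1, len(charges)):
            dist = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j])) ** 0.5
            energy += charges[i] * charges[j] / dist
    return energy


module = MemoizedModule(nuclear_repulsion)
h2 = {"charges": [1, 1], "coords": [[0.0, 0.0, 0.0], [0.0, 0.0, 1.4]]}
first = module.run(**h2)   # computed
second = module.run(**h2)  # served from the cache
print(module.calls)
```

Hashing the serialized inputs keeps the cache key small even when the inputs are large, at the cost of requiring inputs that serialize deterministically. A production framework would also need to account for module configuration and submodule choices in the key, which this sketch omits.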
