CLOVER

Scientific applications must apply efficient and scalable implementations of numerical operations, such as matrix-vector products and Fourier transforms, to simulate their phenomena of interest. Software libraries are a powerful way to share verified, optimized numerical algorithms and their implementations. The CLOVER project is delivering scalable, portable numerical algorithms to facilitate efficient simulations. The team evolves implementations to run effectively on pre-exascale and exascale systems and adds new capabilities that applications might need.

Project Details

Mathematical libraries encapsulate the latest results from the mathematics and computer science communities, and many exascale applications rely on these numerical libraries to incorporate the most advanced technologies available in their simulations. Advances in mathematical libraries are necessary for enabling computational science on exascale systems because exascale architectures introduce new complexities that algorithms and their implementations must address to be scalable, efficient, and robust. The CLOVER project is ensuring the continued health and functionality of the mathematical libraries on which these applications depend. The libraries supported by the CLOVER project (SLATE, heFFTe, and Ginkgo) span the range from lightweight collections of subroutines with simple application programming interfaces (APIs) to more “end-to-end” integrated environments and provide access to a wide range of algorithms for complex problems.

SLATE provides dense linear algebra operations for large-scale machines with multiple GPU accelerators per node. The team focuses on adding support to SLATE for the most critical workloads required by exascale applications: BLAS, linear systems, least squares, matrix inverses, singular value problems, and eigenvalue problems.

HeFFTe delivers highly efficient fast Fourier transforms (FFTs) for exascale computing. Applications include molecular dynamics, spectrum estimation, fast convolution and correlation, signal modulation, and wireless multimedia applications. HeFFTe implements fast and robust multidimensional FFTs and FFT specializations that target large-scale heterogeneous systems with multicore processors and hardware accelerators.
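Fast convolution, one of the applications listed above, illustrates why FFTs matter: transforming both inputs, multiplying pointwise, and transforming back replaces an O(n²) sum with O(n log n) work. The following is a minimal single-process sketch in pure Python, not heFFTe's API; the function names `fft` and `fast_convolve` are hypothetical and chosen only for this illustration.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT (unnormalized).

    Illustrative only: len(values) must be a power of two.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def fast_convolve(a, b):
    """Linear convolution of two real sequences via zero-padded FFTs."""
    out_len = len(a) + len(b) - 1
    size = 1
    while size < out_len:
        size *= 2
    fa = fft([complex(x) for x in a] + [0j] * (size - len(a)))
    fb = fft([complex(x) for x in b] + [0j] * (size - len(b)))
    product = [x * y for x, y in zip(fa, fb)]
    # The inverse transform above is unnormalized, so scale by 1/size here.
    time_domain = fft(product, inverse=True)
    return [v.real / size for v in time_domain[:out_len]]
```

A production library distributes the transform across nodes and accelerators; this sketch only shows the underlying arithmetic on a single small input.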

Ginkgo is an accelerator-focused, production-ready, next-generation sparse linear algebra library that provides scalable preconditioned iterative solvers. To ease adoption and usage, the library employs a uniform interface to all functionality. Separating the algorithms from architecture-specific kernels provides a high level of platform portability and enables Ginkgo to run on all Exascale Computing Project (ECP) exascale systems.
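To make the term "preconditioned iterative solver" concrete, the sketch below runs a conjugate gradient iteration with a Jacobi (diagonal) preconditioner on a matrix in compressed sparse row (CSR) form. It is a pure-Python illustration under stated assumptions, not Ginkgo's interface: the helper names `laplacian_1d_csr`, `csr_matvec`, and `jacobi_pcg` are hypothetical, and the matrix is assumed to be symmetric positive definite with a nonzero diagonal.

```python
import math


def laplacian_1d_csr(n):
    """Build the n x n tridiagonal matrix (-1, 2, -1) in CSR form (test data)."""
    row_ptr, col_idx, vals = [0], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                col_idx.append(j)
                vals.append(v)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, vals


def csr_matvec(row_ptr, col_idx, vals, x):
    """Sparse matrix-vector product y = A x for a CSR matrix."""
    return [
        sum(vals[j] * x[col_idx[j]] for j in range(row_ptr[i], row_ptr[i + 1]))
        for i in range(len(row_ptr) - 1)
    ]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def jacobi_pcg(row_ptr, col_idx, vals, b, tol=1e-10, max_iters=1000):
    """Conjugate gradient with a Jacobi preconditioner.

    Returns (solution, iterations). Assumes A is symmetric positive definite.
    """
    n = len(b)
    inv_diag = [0.0] * n
    for i in range(n):
        for j in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[j] == i and vals[j] != 0.0:
                inv_diag[i] = 1.0 / vals[j]
        if inv_diag[i] == 0.0:
            raise ValueError("Jacobi preconditioner needs a nonzero diagonal")

    x = [0.0] * n
    r = list(b)                                   # residual for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b)) or 1.0

    for iteration in range(max_iters):
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, iteration
        ap = csr_matvec(row_ptr, col_idx, vals, p)
        alpha = rz / dot(p, ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iters
```

The solver loop touches the matrix only through `csr_matvec`, which loosely mirrors the separation of algorithms from architecture-specific kernels described above: swapping in a different kernel would leave the iteration itself unchanged.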

Principal Investigator(s):

Hartwig Anzt, University of Tennessee, Knoxville

Collaborators:

Karlsruhe Institute of Technology

Progress to date

  • The CLOVER team produced a version of SLATE that supports Level 3 BLAS, norms, linear solvers, mixed-precision linear solvers, least-squares solvers, and eigenvalue and singular value solvers. It includes compatibility APIs for LAPACK and ScaLAPACK users. The SLATE project developed the BLAS++ and LAPACK++ libraries as a portability layer, with CUDA, ROCm, and oneAPI backends.
  • The CLOVER team’s heFFTe 2.3 provides excellent scalability, with performance reaching more than 90% of roofline peak and GPU kernels that achieve more than 40× speedup with respect to local kernels from state-of-the-art CPU libraries. The software includes standard and specialized FFT capabilities for NVIDIA, AMD, and Intel GPUs and adds bindings for C, Fortran, and Python, making heFFTe portable and easy to integrate in application software. HeFFTe interfaces have been developed for ECP projects such as Cabana and EXAALT.
  • The CLOVER team prepared production-ready backends of the Ginkgo library for AMD GPUs (via HIP), Intel GPUs (via DPC++), and NVIDIA GPUs (via CUDA), thereby allowing the usage of preconditioned iterative solvers on all ECP exascale systems. Ginkgo now also features production-ready batched iterative solvers and mixed precision algebraic multigrid. GPU-resident sparse direct solver functionality is under development.
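The mixed-precision linear solvers mentioned in the SLATE item above are commonly built on iterative refinement: factor the matrix once in low precision, then correct the solution using residuals computed in high precision. The sketch below is a small pure-Python illustration of that general idea, not SLATE's implementation or API. The names `f32`, `lu_factor_f32`, `lu_solve_f32`, and `refine_solve` are hypothetical, and single precision is emulated by rounding through `struct`.

```python
import struct


def f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """LU factorization with partial pivoting, rounding every result to float32."""
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    perm = list(range(n))
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[pivot][k] == 0.0:
            raise ValueError("matrix is singular to working precision")
        lu[k], lu[pivot] = lu[pivot], lu[k]
        perm[k], perm[pivot] = perm[pivot], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve_f32(lu, perm, b):
    """Solve using the float32 factors: forward then backward substitution."""
    n = len(b)
    y = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):
        for j in range(i):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
        y[i] = f32(y[i] / lu[i][i])
    return y


def refine_solve(a, b, tol=1e-12, max_iters=20):
    """Low-precision factorization plus double-precision iterative refinement.

    Returns (solution, refinement_steps).
    """
    n = len(b)
    lu, perm = lu_factor_f32(a)
    x = [float(v) for v in lu_solve_f32(lu, perm, b)]
    b_scale = max(abs(v) for v in b) or 1.0
    for step in range(max_iters):
        # Residual is computed in double precision (native Python floats).
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) <= tol * b_scale:
            return x, step
        d = lu_solve_f32(lu, perm, r)  # cheap correction from the float32 factors
        x = [x[i] + d[i] for i in range(n)]
    return x, max_iters
```

The expensive factorization runs once in the cheaper precision, while each refinement step costs only a matrix-vector product and two triangular solves. This scheme assumes a reasonably well-conditioned matrix; for ill-conditioned problems the refinement may not converge.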
