xSDK4ECP

The large number of software technologies being delivered to application developers poses challenges, especially if the application needs to use more than one technology simultaneously, such as using a linear solver from the PETSc/TAO mathematics library in conjunction with a time integrator from the SUNDIALS library. The Extreme-Scale Scientific Software Development Kit (xSDK) project is an effort to create a value-added aggregation of mathematics and scientific libraries and increase the combined usability, standardization, and interoperability of these libraries.

Project Details

As architectures become more complex, applications increasingly rely on multiple libraries to supply performant capabilities on those architectures, so the ability to incorporate multiple libraries into one executable is necessary for achieving exascale performance and science goals. The xSDK project is an effort to provide turnkey installation and use of popular scientific packages needed for next-generation scientific applications. The project is working to (1) enable the seamless combined use of diverse, independently developed numerical libraries as needed by exascale applications; (2) develop interoperability layers among numerical libraries to improve code quality, access, usability, interoperability, and sustainability; and (3) provide an aggregate build and install capability for the numerical libraries that supports hierarchical, modular installation. It also focuses on integrated execution (control inversion and adaptive execution strategies).

The xSDK project emphasizes community development and a commitment to combined success via quality improvement policies, better build infrastructure, and the ability to use numerical libraries in combination to solve large-scale multiphysics and multiscale problems. The project represents a different approach to coordinating library development and deployment. Before the xSDK, scientific software packages were cohesive within a single team's effort but not across teams. The xSDK goes a step further by developing community policies followed by each independent library included in the xSDK. This policy-driven, coordinated approach enables independent development that still results in compatible and composable capabilities. Moreover, the xSDK provides a forum for collaborative numerical library development, helping independent teams accelerate the adoption of best practices, enabling the interoperability of independently developed libraries, and improving developer productivity and library sustainability.

The xSDK project also entails a coordinated effort to investigate and deploy multiprecision functionality in the Exascale Computing Project (ECP) software technology (ST) ecosystem to enable the use of low-precision hardware function units, reduce the pressure on memory and communication interfaces, and achieve improved performance. After conducting a comprehensive analysis of existing theory and multiprecision functionalities, the project focuses on developing multiprecision capabilities in xSDK member libraries, treating cross-library interoperability as a first-class design consideration. After a thorough investigation of the numerical robustness and performance potential, the project has started integrating multiprecision functionality into ECP application projects.

Other efforts of the project include the development of autotuning software for the parameter optimization of high-performance computing codes and the inclusion of batched sparse linear algebra in xSDK libraries for improved GPU performance. The team is collaborating with ECP industry partners to design interfaces for batched sparse linear algebra operations for various GPUs. This set of new APIs supports development of batched sparse linear solvers and preconditioners on GPUs as well as new interoperability layers for integration across applications, solvers and lower-level libraries.
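
The batching idea can be illustrated with a minimal Python sketch. It is not the interface the team designed: it assumes that every matrix in the batch shares one CSR sparsity pattern and differs only in its values, and the names BatchCsr, batch_spmv, and batch_jacobi are hypothetical. A plain Jacobi iteration stands in for a production solver or preconditioner, and the serial loop over batch entries only imitates what a GPU would process in parallel.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BatchCsr:
    """Hypothetical container: square sparse matrices sharing one CSR pattern."""
    n: int                     # rows (= columns) of every matrix
    row_ptr: List[int]         # shared CSR row pointers, length n + 1
    col_idx: List[int]         # shared CSR column indices
    values: List[List[float]]  # values[k] holds the nonzeros of matrix k


def batch_spmv(mat: BatchCsr, xs: List[List[float]]) -> List[List[float]]:
    """Compute y_k = A_k x_k for every batch entry k."""
    ys = []
    for vals, x in zip(mat.values, xs):
        y = [0.0] * mat.n
        for i in range(mat.n):
            for p in range(mat.row_ptr[i], mat.row_ptr[i + 1]):
                y[i] += vals[p] * x[mat.col_idx[p]]
        ys.append(y)
    return ys


def batch_jacobi(mat: BatchCsr, bs: List[List[float]], iters: int = 100):
    """Run a fixed number of Jacobi sweeps on every system A_k x_k = b_k."""
    # Diagonal positions depend only on the shared pattern, so find them once.
    diag = [next(p for p in range(mat.row_ptr[i], mat.row_ptr[i + 1])
                 if mat.col_idx[p] == i) for i in range(mat.n)]
    xs = [[0.0] * mat.n for _ in bs]
    for _ in range(iters):
        axs = batch_spmv(mat, xs)
        # Jacobi update: x <- x + D^{-1} (b - A x), applied per batch entry.
        xs = [[x[i] + (b[i] - ax[i]) / vals[diag[i]] for i in range(mat.n)]
              for vals, x, b, ax in zip(mat.values, xs, bs, axs)]
    return xs


if __name__ == "__main__":
    # Two 3x3 tridiagonal, diagonally dominant systems with the same pattern.
    batch = BatchCsr(
        n=3,
        row_ptr=[0, 2, 5, 7],
        col_idx=[0, 1, 0, 1, 2, 1, 2],
        values=[[4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0],
                [5.0, 1.0, 1.0, 5.0, 1.0, 1.0, 5.0]],
    )
    rhs = batch_spmv(batch, [[1.0, 2.0, 3.0], [-1.0, 0.5, 2.0]])
    for x in batch_jacobi(batch, rhs):
        print(x)
```

Storing the pattern once and only the values per entry is one plausible design choice for such a format: index data is read once per batch, and pattern-dependent setup (here, locating the diagonal) is shared by all systems.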

Principal Investigator(s):

Ulrike Yang, Lawrence Livermore National Laboratory

Collaborators:

Lawrence Livermore National Laboratory; Argonne National Laboratory; Sandia National Laboratories; Lawrence Berkeley National Laboratory; Oak Ridge National Laboratory; University of California, Berkeley; University of Tennessee, Knoxville; University of Oregon; Karlsruhe Institute of Technology; University of Manchester; Charles University at Prague

Progress to date

  • The xSDK team released xSDK version 0.8.0, which added two new xSDK members (ExaGO and HiOp) to the other xSDK libraries (AMReX, ArborX, ButterflyPACK, deal.II, DTK, Ginkgo, heFFTe, hypre, libEnsemble, MAGMA, MFEM, Omega_h, PETSc, PHIST, PLASMA, preCICE, PUMI, SLATE, SLEPc, STRUMPACK, SUNDIALS, SuperLU, Tasmanian, and Trilinos) and the two domain components Alquimia and PFLOTRAN. The team continued to develop community policies. It recently released version 1.0.0 of the policies, which makes two recommended policies mandatory and adds two recommended policies.
  • The team released xsdk-examples version 0.3.0, a suite of example codes that demonstrate interoperability between select xSDK libraries. The release adds five new example codes and an improved build system. The suite has also become an integral part of the improved xSDK testing strategy, which also includes testing subsets of the xSDK against development versions of xSDK libraries to improve the sustainability of the xSDK.
  • The machine-learning-based autotuning software GPTune is a tool that selects parameters for high-performance computing codes to maximize their performance. It has been used successfully to tune heFFTe, hypre, IMPACT-Z, M3D-C1, MFEM, NIMROD, PLASMA, ScaLAPACK, SLATE, STRUMPACK, SuperLU, CNN, GCN, and KRR. It outperformed two state-of-the-art tuners, OpenTuner and HpBandSter, with a speedup of up to 2.5x when tuning ScaLAPACK QR.
  • The xSDK-multiprecision effort has advanced mixed-precision algorithms that exploit low-precision hardware units to reduce the runtime needed to generate high-quality solutions. Promising algorithmic approaches were adapted to the hardware technology and deployed as production-ready mixed-precision functionality, including mixed-precision algebraic multigrid methods, iterative refinement solvers, Krylov solvers, and BLAS routines.
  • The xSDK-Batched Sparse Linear Algebra effort designs, develops, and integrates into ECP applications new solvers and accompanying operations that apply the concept of batching used in Batched BLAS and LAPACK libraries to sparse matrices and port it to hardware accelerators. The team also extracted a number of test matrices that are now used to evaluate both the performance and accuracy of the multiple implementations already available for the ECP hardware systems.
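
Of the mixed-precision techniques listed above, iterative refinement is the simplest to sketch: factor and solve in low precision, then compute residuals and accumulate corrections in high precision. The Python example below is an illustration under stated assumptions, not the implementation in any xSDK library. Single precision is only emulated by rounding each intermediate result to float32 with the standard struct module, the function names (f32, lu_factor_f32, lu_solve_f32, iterative_refinement) are hypothetical, and the dense LU factorization stands in for whatever low-precision solver a real library would use.

```python
import struct


def f32(x):
    """Round a Python float (double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """LU factorization with partial pivoting, rounding every result to float32."""
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    perm = list(range(n))  # perm[i] = original index of the row now at position i
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve_f32(lu, perm, b):
    """Solve with the low-precision factors; all arithmetic rounded to float32."""
    n = len(lu)
    y = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):                      # forward substitution (unit lower L)
        for j in range(i):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):            # backward substitution (upper U)
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
        y[i] = f32(y[i] / lu[i][i])
    return y


def iterative_refinement(a, b, tol=1e-12, max_iter=20):
    """Return (x, iterations): low-precision solves, double-precision residuals."""
    n = len(a)
    lu, perm = lu_factor_f32(a)             # expensive step, done once in float32
    x = lu_solve_f32(lu, perm, b)
    for it in range(max_iter):
        # Residual r = b - A x in double precision.
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) <= tol * max(abs(v) for v in b):
            return x, it
        d = lu_solve_f32(lu, perm, r)       # cheap correction solve in float32
        x = [x[i] + d[i] for i in range(n)]
    return x, max_iter


if __name__ == "__main__":
    a = [[4.0, 1.0, 0.3], [1.0, 3.0, 0.7], [0.3, 0.7, 5.0]]
    b = [2.15, -4.65, 1.4]
    x, iterations = iterative_refinement(a, b)
    print(x, iterations)
```

The design point is that the costly factorization runs entirely in the cheaper precision, while the residual and the solution update stay in double precision. For well-conditioned systems, a few inexpensive correction solves are then typically enough to recover double-precision accuracy.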
