Many application codes rely on high-performance mathematical libraries to solve the systems of equations generated during their simulations. Because these solvers often dominate the computation time of such simulations, the libraries must be efficient and scalable on the upcoming complex exascale hardware architectures for the application codes to perform well. The Portable, Extensible Toolkit for Scientific Computation/Toolkit for Advanced Optimization (PETSc/TAO) project delivers efficient mathematical libraries to application developers for sparse linear and nonlinear systems of equations, time integration, and parallel discretization. It also provides libEnsemble to manage the large collections of related simulations needed for numerical optimization, sensitivity analysis, and uncertainty quantification (the so-called “outer loop”).
Algebraic solvers (generally nonlinear solvers that use sparse linear solvers via Newton’s method) and integrators form the core computation of many scientific simulations. PETSc/TAO is a scalable mathematical library that runs portably on everything from laptops to the existing high-performance machines. The PETSc/TAO project is extending and enhancing the library to ensure that it will be performant on exascale architectures, is delivering the libEnsemble tool to manage collections of related simulations for outer-loop methods, and is working with exascale application developers to satisfy their solver needs.
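The pattern described above, a Newton iteration whose every step requires a sparse linear solve, can be illustrated with a minimal sketch. The code below is not PETSc/TAO code and uses no PETSc API. It is a pure-Python illustration under assumed choices: a hypothetical model problem (-u'' + u^3 = 1 on the unit interval with zero boundary values, discretized by finite differences), a tridiagonal Jacobian, and a direct tridiagonal (Thomas) solve standing in for the sparse linear solver. The function names `solve_tridiagonal`, `residual`, and `newton_solve` are illustrative only.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused);
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def residual(u, h):
    """Nonlinear residual F(u) for the assumed model problem -u'' + u^3 = 1."""
    n = len(u)
    f = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0       # zero boundary value
        right = u[i + 1] if i < n - 1 else 0.0  # zero boundary value
        f.append((2.0 * u[i] - left - right) / h**2 + u[i] ** 3 - 1.0)
    return f


def newton_solve(n=50, tol=1e-10, max_iter=20):
    """Newton's method: each iteration solves the sparse system J du = -F."""
    h = 1.0 / (n + 1)
    u = [0.0] * n
    for iteration in range(max_iter):
        f = residual(u, h)
        if max(abs(v) for v in f) < tol:
            return u, iteration
        off = -1.0 / h**2
        lower = [off] * n
        upper = [off] * n
        diag = [2.0 / h**2 + 3.0 * ui**2 for ui in u]  # Jacobian diagonal
        du = solve_tridiagonal(lower, diag, upper, [-v for v in f])
        u = [ui + di for ui, di in zip(u, du)]
    raise RuntimeError("Newton iteration did not converge")


if __name__ == "__main__":
    solution, iterations = newton_solve()
    print("converged after", iterations, "Newton iterations")
```

In a production setting the direct tridiagonal solve would typically be replaced by a preconditioned iterative solver, which is where most of the computation time, and hence the need for scalable libraries, arises.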
There are no scalable “black box” sparse solvers or integrators that work for all applications, nor are there single implementations that work well at every problem size. Hence, algebraic solver libraries provide a wide variety of algorithms and implementations that can be customized for the application and range of problem sizes at hand. The PETSc/TAO team is currently focusing on enhancing the PETSc/TAO library to include scalable solvers that efficiently use many-core and GPU-based systems. This work includes adding support for the range of GPUs that will be deployed and for the Kokkos performance portability layer, optimizing the team’s GPU-aware communications, implementing data structure optimizations to better use many-core and GPU-based systems, and developing algorithms that scale to higher levels of concurrency, up to the exascale.
The availability of systems with over 100 times the processing power of today’s machines compels the use of these systems not just for a single simulation but rather within a tight outer loop of numerical optimization, sensitivity analysis, and uncertainty quantification. This requires the implementation of a scalable library to manage a dynamic hierarchical collection of running, possibly interacting, scalable simulations. The libEnsemble library directs such multiple concurrent simulations. In this area, the team is focused on developing libEnsemble, integrating libEnsemble with the PETSc/TAO library, and extending the PETSc/TAO library to include new algorithms capable of using libEnsemble.
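The outer-loop idea can be sketched in a few lines: a manager repeatedly launches a batch of concurrent simulations, inspects the results, and chooses the next batch. The sketch below deliberately does not use the libEnsemble API. It relies only on Python's standard library, and `simulate`, `run_ensemble`, and `outer_loop` are hypothetical names, with `simulate` a cheap stand-in for an expensive simulation.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(x):
    """Hypothetical stand-in for an expensive simulation; returns an objective."""
    return (x - 2.0) ** 2 + 1.0


def run_ensemble(points, max_workers=4):
    """Run one batch of simulations concurrently and pair inputs with outputs."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        values = list(pool.map(simulate, points))
    return list(zip(points, values))


def outer_loop(lo=0.0, hi=5.0, batch_size=8, rounds=6):
    """Simple outer loop: sample an interval, then zoom in around the best point."""
    best = None
    for _ in range(rounds):
        step = (hi - lo) / (batch_size - 1)
        points = [lo + i * step for i in range(batch_size)]
        results = run_ensemble(points)
        best = min(results, key=lambda r: r[1])
        # Narrow the search interval around the best sample for the next batch.
        lo, hi = best[0] - step, best[0] + step
    return best


if __name__ == "__main__":
    x_best, f_best = outer_loop()
    print("best parameter:", x_best, "objective:", f_best)
```

A real ensemble manager must additionally handle simulations that are themselves parallel, that fail or run for very different lengths of time, and that may need to be launched or cancelled dynamically. Those are the concerns the paragraph above attributes to libEnsemble.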