CEED’s Impact on Exascale Computing Project (ECP) Efforts is Wide-Ranging

The Center for Efficient Exascale Discretizations (CEED) within the US Department of Energy’s ECP is helping applications leverage future architectures by developing state-of-the-art discretization algorithms that better exploit the hardware and deliver a significant performance gain over conventional methods. The focus is on high-order methods, which offer higher fidelity and better machine utilization, with a range of orders providing flexibility in uncertain hardware and software environments.

Working closely with application scientists, CEED is delivering software libraries and community standards for high-order unstructured meshing, operator evaluation, adaptive mesh refinement, linear solvers, and more. CEED researchers are also collaborating with hardware vendors and software technology projects to make use of, and help shape, the upcoming exascale hardware and its software stack through CEED-developed kernels, benchmarks, and mini-apps.

The center is a research partnership involving more than 30 computational scientists from two DOE labs (Lawrence Livermore [LLNL] and Argonne) and five universities (the University of Illinois at Urbana-Champaign, Virginia Tech, the University of Colorado Boulder, the University of Tennessee, and Rensselaer Polytechnic Institute).

Recent Progress

Among the examples of CEED’s recent progress are the public release of its second distribution, CEED-2.0, work on iterative linear and nonlinear solvers for high-order discretizations of nonsymmetric problems, and the 4.0 release of the MFEM finite element library that for the first time adds GPU support at a general discretization level.

CEED’s release of its second software distribution consisted of twelve integrated Spack packages for libCEED, MFEM, Nek5000, NekCEM, Laghos, NekBone, HPGMG, OCCA, MAGMA, gslib, PETSc, and PUMI, plus an updated CEED “meta-package.” As part of CEED 2.0, the team developed comprehensive documentation including Docker, Shifter and Singularity containers, and configuration packages for the Argonne Leadership Computing Facility, the Oak Ridge Leadership Computing Facility, the National Energy Research Scientific Computing Center, and the LLNL computing center.

The CEED team is developing iterative linear and nonlinear solvers for high-order discretizations of nonsymmetric problems, such as those arising in implicit and steady-state formulations of the advection-diffusion, Navier-Stokes, and Reynolds-averaged Navier-Stokes equations. Current semi-implicit formulations for direct numerical simulations and large-eddy simulations of turbulence already have near-optimal complexity.

However, a broad range of applications exists—including the ECP projects ExaWind and ExaSMR—in which disparate time scales drive a need for implicit treatment of the nonlinear/nonsymmetric terms in the governing equations. To this end, CEED is developing multilevel Schwarz preconditioners that call for fast subdomain solves coupled with global coarse-grid problems to effectively eliminate errors at each scale.

 

Performance tuning and new algorithmic improvements in CEED are enabling faster and more accurate simulations in multiphysics applications such as ExaSMR (left) and MARBL (right). The ExaSMR-relevant simulation on the left, by Alper Yildiz (Texas A&M University), shows a helical coil steam generator with 72 tubes for Re=27,000, using 1.4 billion grid points and 131,000 MPI ranks on Mira (at the Argonne Leadership Computing Facility, Argonne National Laboratory). The MARBL simulation on the right uses high-order finite elements to model a laser-driven radiating Kelvin-Helmholtz instability at Lawrence Livermore National Laboratory. Courtesy: CEED

The subdomain solves exploit the local tensor-product structure of the spectral element method. Standard tensor-product (i.e., separable) diagonalization approaches do not apply directly in these nonsymmetric cases. There are, however, recent advances in approximate tensor-product decompositions that are appropriate for preconditioning. The CEED team has made significant advances in these preconditioners, particularly for 3D problems, and has applied them to several challenging test problems.
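
For context, the standard separable case that the paragraph above contrasts against can be sketched in a few lines. This is only an illustration under stated assumptions, not CEED’s preconditioner: it uses a 2D finite-difference Laplacian on a uniform grid with Dirichlet boundaries (chosen because its 1D eigenvectors are known in closed form) instead of a spectral element subdomain, and all function names are hypothetical. The operator has the separable form A⊗I + I⊗A, so a solve reduces to small 1D transforms and a pointwise division.

```python
import math

def sine_basis(n):
    """Orthonormal eigenvectors S (columns) and eigenvalues lam of the
    n-by-n 1D Dirichlet Laplacian tridiag(-1, 2, -1)."""
    h = math.pi / (n + 1)
    scale = math.sqrt(2.0 / (n + 1))
    S = [[scale * math.sin((i + 1) * (k + 1) * h) for k in range(n)]
         for i in range(n)]
    lam = [2.0 - 2.0 * math.cos((k + 1) * h) for k in range(n)]
    return S, lam

def matmul(A, B):
    """Dense matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

def fdm_solve(F):
    """Solve A U + U A^T = F, i.e. (A (x) I + I (x) A) u = f, by
    diagonalizing the 1D operator A = S diag(lam) S^T."""
    n = len(F)
    S, lam = sine_basis(n)
    St = transpose(S)
    W = matmul(matmul(St, F), S)          # transform to the eigenbasis
    for i in range(n):
        for j in range(n):
            W[i][j] /= lam[i] + lam[j]    # the 2D operator is diagonal here
    return matmul(matmul(S, W), St)       # transform back

def apply_laplacian(U):
    """Apply the 5-point Laplacian stencil directly, for verification."""
    n = len(U)
    def at(i, j):
        return U[i][j] if 0 <= i < n and 0 <= j < n else 0.0
    return [[4.0 * U[i][j] - at(i - 1, j) - at(i + 1, j)
             - at(i, j - 1) - at(i, j + 1) for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    n = 8
    F = [[1.0 for _ in range(n)] for _ in range(n)]
    U = fdm_solve(F)
    R = apply_laplacian(U)
    print(max(abs(R[i][j] - F[i][j]) for i in range(n) for j in range(n)))
```

The solve costs a handful of n-by-n matrix products for n² unknowns, which is what makes tensor-product subdomain solves attractive. When the operator is not separable, as in the nonsymmetric problems discussed here, this exact diagonalization is unavailable, which motivates the approximate decompositions.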

As part of the research and development activities in CEED, the MFEM team at LLNL released version 4.0 of the MFEM finite element library. This is the first release of the library that adds GPU support at a general discretization level. The release makes GPU acceleration easily available to many finite element applications in ECP, including simulations of compressible flow, additive manufacturing, wind turbines, subsurface flow, and more.

MFEM-4.0

MFEM-4.0 offers initial support for both CPU and GPU acceleration of key linear algebra and finite element kernels based on flexible memory management and runtime-selectable execution backends. Courtesy: CEED

The GPU support is based on optimized CUDA, libCEED, OCCA, RAJA, and OpenMP device kernels and an internal device/host memory manager designed to work seamlessly with the new kernels. Initial results with MFEM’s example codes and the Laghos miniapp indicate a speedup of more than 10x when using a single GPU compared with a multi-core CPU on a Summit-type architecture. The CEED team at LLNL is working actively to bring these benefits to the MARBL code and other ECP applications.
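
The idea behind a device/host memory manager can be shown with a toy model. The sketch below is not MFEM’s implementation or API: the class `DualMemory`, its methods, and the `scale` kernel are hypothetical, and the "device" is simulated with an ordinary Python list. It only illustrates the general technique of tracking which copy of the data is valid, so that transfers happen lazily and only when a kernel runs on the other side.

```python
class DualMemory:
    """Hypothetical host/device buffer pair with validity flags.
    The device buffer is simulated by a second Python list."""

    def __init__(self, data):
        self.host = list(data)
        self.device = None          # allocated on first device access
        self.host_valid = True
        self.device_valid = False
        self.transfers = 0          # counts simulated host<->device copies

    def _sync(self, on_device):
        # Copy only if the requested side holds stale (or no) data.
        if on_device and not self.device_valid:
            self.device = list(self.host)
            self.device_valid = True
            self.transfers += 1
        elif not on_device and not self.host_valid:
            self.host = list(self.device)
            self.host_valid = True
            self.transfers += 1

    def read(self, on_device):
        """Return an up-to-date buffer for read-only access."""
        self._sync(on_device)
        return self.device if on_device else self.host

    def read_write(self, on_device):
        """Return an up-to-date buffer and mark the other copy stale."""
        self._sync(on_device)
        if on_device:
            self.host_valid = False
        else:
            self.device_valid = False
        return self.device if on_device else self.host

def scale(mem, alpha, use_device):
    """Toy kernel: the backend is chosen at run time via use_device."""
    buf = mem.read_write(use_device)
    for i in range(len(buf)):
        buf[i] *= alpha

if __name__ == "__main__":
    mem = DualMemory([1.0, 2.0, 3.0])
    scale(mem, 2.0, use_device=True)   # first device use triggers one copy
    scale(mem, 3.0, use_device=True)   # data already on device, no copy
    print(mem.read(on_device=False), mem.transfers)
```

In this model, consecutive kernels on the same backend never move data, which is the property that lets existing code be ported to a GPU one kernel at a time.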

Additional features in the MFEM-4.0 release include partially assembled finite element operators in the core library, support for wedge/prism elements and meshes with mixed element types, general “low-order refined” to “high-order” field transfer, and seven new examples and mini-apps.
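
To make the "partially assembled" operators mentioned above more concrete, here is a minimal sketch, assuming the simplest possible setting: a 1D mass operator with linear elements and two-point Gauss quadrature. It is not MFEM code, and every name in it is hypothetical. Instead of storing a global matrix, partial assembly keeps only data at quadrature points (D) and applies the operator as a chain often written G^T B^T D B G, where G gathers element-local values and B evaluates the basis at quadrature points.

```python
import math

# Two-point Gauss rule on the reference element [-1, 1].
QPTS = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
QWTS = (1.0, 1.0)
# B[q][i]: value of linear basis function i at quadrature point q.
B = [[0.5 * (1.0 - x), 0.5 * (1.0 + x)] for x in QPTS]

def setup_quadrature_data(nodes):
    """D: one number per (element, quadrature point), weight * Jacobian."""
    return [[w * 0.5 * (nodes[e + 1] - nodes[e]) for w in QWTS]
            for e in range(len(nodes) - 1)]

def apply_mass_pa(D, u):
    """Matrix-free action v = M u using only the quadrature data D."""
    v = [0.0] * len(u)
    for e, De in enumerate(D):
        ue = (u[e], u[e + 1])                                   # G: gather
        uq = [B[q][0] * ue[0] + B[q][1] * ue[1] for q in range(2)]  # B
        uq = [De[q] * uq[q] for q in range(2)]                  # D: pointwise
        for i in range(2):                                      # B^T, G^T
            v[e + i] += B[0][i] * uq[0] + B[1][i] * uq[1]
    return v

def assemble_mass(nodes):
    """Fully assembled mass matrix, for comparison only."""
    n = len(nodes)
    M = [[0.0] * n for _ in range(n)]
    for e in range(n - 1):
        h = nodes[e + 1] - nodes[e]
        for i in range(2):
            for j in range(2):
                M[e + i][e + j] += h / 6.0 * (2.0 if i == j else 1.0)
    return M

if __name__ == "__main__":
    nodes = [0.0, 0.1, 0.4, 0.5, 1.0]
    u = [1.0, -2.0, 0.5, 3.0, 4.0]
    D = setup_quadrature_data(nodes)
    v_pa = apply_mass_pa(D, u)
    M = assemble_mass(nodes)
    v_full = [sum(M[i][j] * u[j] for j in range(len(u))) for i in range(len(u))]
    print(max(abs(a - b) for a, b in zip(v_pa, v_full)))
```

In 1D the storage saving is negligible, but for high-order elements in 3D the quadrature data grows far more slowly than the assembled matrix, which is the motivation for this approach on GPUs.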

Upcoming Activities

With respect to near-future activities, the CEED team is planning to release a GPU-capable version of the Nek5000 code and work closely with its ECP target applications to help them port to the Summit and Sierra machines.

The project is also organizing its third annual meeting, which is open to everyone interested in high-order methods and applications. The meeting will take place August 6–8, 2019, at Virginia Tech.

In the longer term, CEED researchers plan to port and optimize the CEED software stack and applications to the Aurora, Frontier, and El Capitan machines. The team will also continue to perform research and development of high-order algorithms for unstructured meshing, matrix-free operator evaluation, adaptive mesh refinement, linear solvers, and other topics of interest to ECP applications.

Related Links

CEED project website
CEED milestone reports publicly available on Zenodo