Modern cosmological observations carried out with large-scale sky surveys are unique probes of fundamental physics. They have led to a remarkably successful model for the dynamics of the universe and several breakthrough discoveries. Three key ingredients—dark energy, dark matter, and inflation—are signposts to further breakthroughs because all reach beyond the known boundaries of the Standard Model of particle physics. Sophisticated large-scale simulations of cosmic structure formation are essential to this scientific enterprise. They not only shed light on some of the biggest challenges in physical science but also rank among the very largest and most scientifically rich simulations run on supercomputers today. The ExaSky project is extending existing cosmological simulation codes to work on exascale platforms to address this challenge.

Project Details

A new generation of sky surveys will provide key insights into questions raised by the current cosmological paradigm and deliver new classes of measurements, such as those of neutrino masses. They could lead to exciting new results, including the discovery of primordial gravitational waves and modifications of general relativity. Existing supercomputers do not have the performance or memory needed to run the next-generation simulations required to meet the challenge posed by future surveys, whose timelines parallel those of the Exascale Computing Project. The ExaSky project extends the capabilities of the HACC and Nyx cosmological simulation codes to efficiently use exascale resources as they become available. The Eulerian adaptive mesh refinement (AMR) code Nyx complements the Lagrangian nature of HACC. The two codes are being used to develop a joint program for the verification of gravitational evolution, gas dynamics, and subgrid models in cosmological simulations run at very high dynamic range.

To establish accuracy baselines, the project imposes statistical and systematic error requirements on many cosmological summary statistics. These requirements are typically scale-dependent: large spatial scales are subject to finite-size effects, while small scales face more serious problems, such as particle shot noise and code evolution errors, including subgrid modeling biases. Strict accuracy requirements are already set by the observational needs of US Department of Energy-supported surveys, such as the Cosmic Microwave Background-Stage 4 (CMB-S4) experiment, the Dark Energy Spectroscopic Instrument (DESI), and the Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST); these are typically sub-percent (statistical) over the range of well-observed spatial scales. Systematic errors must be characterized and controlled, where possible, to the percent level or better. The final challenge problem runs will be carried out with a new set of subgrid models, currently under active development, for gas cooling, UV heating, star formation, and supernova and active galactic nucleus (AGN) feedback.
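The small-scale shot-noise limit mentioned above arises from Poisson sampling of the density field: representing matter with N tracer particles in a volume V adds a constant term P_shot = V/N to the measured power spectrum. A minimal sketch, with assumed particle counts and box size (illustrative only, not project specifications):

```python
# Shot-noise floor of a particle-sampled power spectrum (assumed numbers).
# A Poisson-sampled density field adds a constant P_shot = V / N.
N_particles = 10_000**3        # assumed particle count (10,000^3)
box_mpc = 3000.0               # assumed box side in Mpc (a ~3 Gpc class run)
V = box_mpc**3                 # simulation volume, Mpc^3

P_shot = V / N_particles       # shot-noise power, Mpc^3
mean_spacing = box_mpc / 10_000  # mean interparticle spacing, Mpc

print(f"shot-noise power: {P_shot:.3f} Mpc^3")
print(f"mean interparticle spacing: {mean_spacing:.2f} Mpc")
```

For sub-percent statistical accuracy, this constant term must stay well below the physical power at the scales of interest, which is one driver of the very large particle counts needed for the challenge problem.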

The simulation sizes are set by the scales of the cosmological surveys. The challenge problem simulations must cover boxes of linear sizes up to several gigaparsecs in scale with galaxy formation-related physics modeled down to roughly 0.1 kiloparsecs—a dynamic range of one part in 10 million, improving the current state of the art by an order of magnitude. Multiple box sizes will be run to cover the range of scales that must be robustly predicted. The mass resolution of the simulations in the smaller boxes will go down to roughly 1 million solar masses for the baryon tracer particles and about five times this value for the dark matter particles. The final dynamic range achieved depends on the total memory available on the first-generation exascale systems.
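The dynamic-range and particle-mass figures above can be sanity-checked with simple arithmetic; the box size and density parameters below are illustrative assumptions, not project specifications:

```python
# Back-of-envelope check of the quoted dynamic range and mass ratio.
GPC_IN_KPC = 1.0e6          # 1 gigaparsec = 10^6 kiloparsecs

box_size_gpc = 1.0          # assumed linear box size of ~1 Gpc
force_resolution_kpc = 0.1  # ~0.1 kpc galaxy-physics scale from the text

dynamic_range = box_size_gpc * GPC_IN_KPC / force_resolution_kpc
print(f"dynamic range: 1 part in {dynamic_range:.0e}")  # 1 part in 10^7

# The ~5x dark-matter-to-baryon particle mass ratio follows from the
# cosmic density ratio Omega_cdm / Omega_b (Planck-like values assumed):
omega_cdm, omega_b = 0.26, 0.049
print(f"m_dm / m_baryon ~ {omega_cdm / omega_b:.1f}")
```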

The ExaSky science challenge problem will eventually comprise a small number of very large cosmological simulations run with HACC that simultaneously address many science problems of interest. Setting up the science challenge problem in turn requires multiple simulations—building subgrid models by matching against results from very high-resolution galaxy formation astrophysics codes via a nested-box simulation approach, having a medium-scale set for parameter exploration, and—based on these results—designing and implementing the final large-scale challenge problem runs on exascale platforms.

Project simulations fall into three categories: (1) large-volume gravity-only simulations with high mass and force resolution; (2) large-volume hydrodynamic simulations with high mass and force resolution, including detailed subgrid modeling; and (3) small-volume hydrodynamic simulations with very high mass resolution and medium/high force resolution, including subgrid modeling.

The first simulation set targets observations of luminous red galaxies, emission line galaxies, and quasars. The simulations are relevant to DESI, the NASA SPHEREx mission, end-to-end simulations for LSST, and modeling of the cosmic infrared background for CMB-S4. The second and main set of simulations will include hydrodynamics and detailed subgrid modeling, with the resolution and physics reach improving over time as more powerful systems arrive. The main probes targeted with these simulations are strong and weak gravitational lensing shear measurements, galaxy clustering, clusters of galaxies, and cross-correlations both internal to this set and with CMB probes, such as CMB lensing and thermal and kinematic Sunyaev-Zel’dovich effect observations. A set of smaller volume hydrodynamic simulations will be performed in support of the program for convergence testing and verification and to develop and test a new generation of subgrid models based on results from high-resolution, small-effective-volume galaxy formation studies performed by other groups.

Principal Investigator(s):

Salman Habib, Argonne National Laboratory


Collaborators: Argonne National Laboratory, Los Alamos National Laboratory, Lawrence Berkeley National Laboratory

Progress to Date

  • A high-performance hybrid N-body gravity solver for cosmological simulations was proven at scale, in full production mode, on manycore (e.g., Cori, Theta) and large-scale CPU/GPU (e.g., Cooley, Summit, Titan) systems. The algorithms required for the challenge problem were demonstrated.
  • A new, improved Lagrangian hydrodynamics method (CRK-SPH) with a complete suite of astrophysical subgrid models was integrated into HACC for manycore and GPU systems. Eulerian cosmological hydrodynamics capability with Nyx was run at scale on manycore systems (e.g., Cori II, Theta) and was recently run on Summit. This included significant work on performance optimization and improved deep AMR capabilities.
  • Key HACC and Nyx solvers were ported to pre-exascale prototype systems at Argonne and Oak Ridge National Laboratories, achieving projected levels of performance.
  • A data reduction capability (lossy compression) was developed for cosmological simulations that reduces storage and I/O requirements by factors ranging from ~5 to ~100.
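Error-bounded lossy compression of the kind described in the last bullet can be illustrated with a toy quantizer; the actual ExaSky pipeline uses dedicated compressors, and the error bound and data below are assumptions for the sketch:

```python
# Toy error-bounded quantization: round particle coordinates to a grid
# whose spacing guarantees a maximum absolute error of abs_err.
import numpy as np

rng = np.random.default_rng(0)
box = 1000.0
x = rng.uniform(0.0, box, size=100_000).astype(np.float64)  # mock coords

abs_err = 1e-3  # user-chosen error bound, same units as x
# Quantize to integer multiples of 2*abs_err: round-off error <= abs_err.
q = np.round(x / (2 * abs_err)).astype(np.uint32)
x_rec = q * (2 * abs_err)

assert np.max(np.abs(x_rec - x)) <= abs_err + 1e-12  # bound is respected
ratio = x.nbytes / q.nbytes  # 8-byte floats -> 4-byte ints: 2x so far
print(f"raw ratio {ratio:.0f}x before entropy coding of the integers")
```

The large compression factors quoted above come from additionally entropy-coding the quantized integers, which are highly compressible; this sketch shows only the error-bounded quantization step.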

ExaSky is enabling large-scale cosmological simulations that, when combined with exascale computing and next-generation sky surveys, will improve our understanding of the large-scale physical processes that drive the evolution of structure in the universe.
