Modern cosmological observations carried out with large-scale sky surveys are unique probes of fundamental physics. They have led to a remarkably successful model for the dynamics of the universe and several breakthrough discoveries. Three key ingredients—dark energy, dark matter, and inflation—are signposts to further breakthroughs because all reach beyond the known boundaries of the Standard Model of particle physics. Sophisticated large-scale simulations of cosmic structure formation are essential to this scientific enterprise. They not only shed light on some of the biggest challenges in physical science but also rank among the very largest and most scientifically rich simulations run on supercomputers today. The ExaSky project is extending existing cosmological simulation codes to work on exascale platforms to address this challenge.
A new generation of sky surveys will provide key insights into questions raised by the current cosmological paradigm and enable new classes of measurements, such as those of neutrino masses. They could lead to exciting new results, including the discovery of primordial gravitational waves and modifications of general relativity. Existing supercomputers lack the performance and memory needed to run the next-generation simulations required to meet the challenge posed by future surveys, whose timelines parallel those of the Exascale Computing Project. The ExaSky project extends the capabilities of the HACC and Nyx cosmological simulation codes to use exascale resources efficiently as they become available. The Eulerian adaptive mesh refinement (AMR) code Nyx complements the Lagrangian nature of HACC, and the two codes are being used to develop a joint program for the verification of gravitational evolution, gas dynamics, and subgrid models in cosmological simulations run at very high dynamic range.
To establish accuracy baselines, statistical and systematic error requirements are imposed on many cosmological summary statistics. These requirements are typically scale-dependent: large spatial scales are subject to finite-size effects, whereas small scales suffer from several more significant problems, such as particle shot noise and code evolution errors, including subgrid modeling biases. Strict accuracy requirements have already been set by the observational needs of US Department of Energy-supported surveys, such as Cosmic Microwave Background-Stage 4 (CMB-S4), the Dark Energy Spectroscopic Instrument (DESI), and the Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST); these requirements are typically sub-percent (statistical) over the range of well-observed spatial scales. Systematic errors must be characterized and controlled, where possible, to the percent level or better. The final challenge problem runs will use a new set of subgrid models, currently under active development, for gas cooling, UV heating, star formation, and supernova and active galactic nucleus feedback.
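To make the sub-percent criterion concrete, a code-verification check of this kind can be sketched as a comparison of matter power spectra from two codes over the well-observed wavenumber range. The data, function name, wavenumber window, and tolerance below are illustrative assumptions, not project code or project numbers.

```python
import numpy as np

def fractional_difference(pk_a, pk_b):
    """Fractional difference of two power spectra sampled on a shared k grid."""
    return np.abs(pk_a - pk_b) / pk_b

# Hypothetical inputs: wavenumbers (h/Mpc) and matter power spectra from two codes.
k = np.logspace(-2, 1, 50)
pk_ref = 1.0e4 * k / (1.0 + (k / 0.1) ** 2) ** 1.5   # toy stand-in spectrum
pk_test = pk_ref * (1.0 + 0.004 * np.sin(k))          # toy 0.4% code-to-code deviation

# Sub-percent (statistical) agreement demanded over well-observed scales,
# here taken, for illustration, as 0.02 < k < 1 h/Mpc.
mask = (k > 0.02) & (k < 1.0)
assert np.all(fractional_difference(pk_test, pk_ref)[mask] < 0.01)
```

In a real verification campaign the tolerance would itself be scale-dependent, loosening at the largest scales (finite-box effects) and smallest scales (shot noise, subgrid biases), as described above.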
The simulation sizes are set by the scales of the cosmological surveys. The challenge problem simulations must cover boxes of linear sizes up to several gigaparsecs with galaxy formation-related physics modeled down to roughly 0.1 kiloparsecs—a dynamic range of one part in 10 million, improving on the current state of the art by an order of magnitude. Multiple box sizes will be run to cover the range of scales that must be robustly predicted. The mass resolution of the simulations in the smaller boxes will go down to roughly 1 million solar masses for the baryon tracer particles and about five times this value for the dark matter particles. The final dynamic range achieved depends on the total memory available on the first-generation exascale systems.
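The quoted dynamic range follows from simple unit arithmetic. The sketch below works it through for an assumed 1-gigaparsec box; the specific box size and force resolution are illustrative values taken from the text, not fixed project parameters.

```python
# Back-of-the-envelope check of the quoted dynamic range (illustrative only).
KPC_PER_GPC = 1.0e6      # kiloparsecs per gigaparsec

box_size_gpc = 1.0       # assumed box linear size; challenge boxes reach several Gpc
force_res_kpc = 0.1      # smallest scale at which galaxy-formation physics is modeled

dynamic_range = box_size_gpc * KPC_PER_GPC / force_res_kpc
print(f"dynamic range: one part in {dynamic_range:.0e}")
# prints: dynamic range: one part in 1e+07
```

A several-gigaparsec box at the same 0.1 kpc resolution pushes the ratio to a few times 10 million, which is why the final reach depends on the memory of the first exascale systems.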
The ExaSky science challenge problem will eventually comprise a small number of very large cosmological simulations run with HACC that simultaneously address many science problems of interest. Setting up the challenge problem in turn requires multiple simulation campaigns: building subgrid models by matching against results from very high-resolution galaxy formation astrophysics codes via a nested-box simulation approach; running a medium-scale set for parameter exploration; and, based on these results, designing and implementing the final large-scale challenge problem runs on exascale platforms.
Project simulations are classified into three categories: (1) large-volume, gravity-only simulations with high mass and force resolution; (2) large-volume hydrodynamic simulations with high mass and force resolution, including detailed subgrid modeling; and (3) small-volume hydrodynamic simulations with very high mass resolution and medium/high force resolution, including subgrid modeling.
The first simulation set is targeted at observations of luminous red galaxies, emission line galaxies, and quasars. These simulations are relevant to DESI, the NASA SPHEREx mission, end-to-end simulations for LSST, and modeling the cosmic infrared background for CMB-S4. The second and main set of simulations will include hydrodynamics and detailed subgrid modeling, with the resolution and physics reach improving over time as more powerful systems arrive. The main probes targeted with these simulations are strong and weak gravitational lensing shear measurements, galaxy clustering, clusters of galaxies, and cross-correlations internal to this set and with CMB probes, such as CMB lensing and thermal and kinematic Sunyaev-Zel’dovich effect observations. A set of smaller-volume hydrodynamic simulations will be performed in support of the program, both for convergence testing and verification and to develop and test a new generation of subgrid models based on results from high-resolution, small-effective-volume galaxy formation studies performed by other groups.