A Conversation with Researcher David Daniel about the Ristra Project
You may associate the word “ristra” with something colorful or creative, such as a string of drying chile peppers or a decorative design. However, Ristra is also the name of a US Department of Energy (DOE) Exascale Computing Project (ECP) effort at Los Alamos National Laboratory (LANL). The Ristra project is creating a set of codes targeting national-security-relevant multiphysics problems: challenges that involve multiple simultaneous physical phenomena.
Ristra—together with sister projects at Lawrence Livermore National Laboratory and Sandia National Laboratories—is part of the national security applications portfolio of ECP. Primary funding is from the Advanced Technology Development and Mitigation subprogram of the National Nuclear Security Administration’s Advanced Simulation and Computing Program.
“The Ristra project has made solid progress both in terms of the physics needs and in demonstrating new computer science technologies.”
The “why” behind the initiation of Ristra was the need to rethink the approach to multiphysics simulations from both the physics and computer science perspectives.
From the physics side, led by Aimee Hungerford of LANL, the project is exploring more-direct methods of solution and rethinking the way physics components are coupled in computer modeling and simulation. The objective is to enable the components to evolve concurrently rather than in the sequential way that is common in today’s codes.
Relative to the computer science aspect, led by David Daniel of LANL, the researchers want to investigate the use of modern parallel programming systems that promise a simpler path to the efficient parallel execution of codes than in current message-passing models.
LANL has a broad mission related to national security applications that increasingly rely on predictive science from multiphysics applications across a wide range of scientific fields. Ristra focuses on applications in advanced materials research and inertial confinement fusion, in which a fuel target is heated and compressed to initiate nuclear fusion reactions. These activities challenge researchers across a broad range of physics methods and scales.
Ristra Project Execution
The Ristra team uses a system known as Legion that originated at Stanford University and is now developed jointly with Nvidia and LANL. Legion provides a conceptually sequential programming model built on a rich data model and task-based execution. This combination allows parallel computational work to be identified and efficiently scheduled onto available hardware using a custom mapping algorithm.
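To make the idea concrete, here is a minimal sketch of dependency-driven task execution. This is not Legion’s actual API; the `TaskGraph` class, region names, and physics quantities are invented for illustration. The point it demonstrates is the one above: the program reads as a sequence of task launches, while a scheduler extracts parallelism from the data each task reads and writes.

```python
# Illustrative sketch (not Legion's actual API): tasks declare the data
# "regions" they touch, and a scheduler runs independent tasks in
# parallel while preserving the program's sequential semantics.
from concurrent.futures import ThreadPoolExecutor, wait

class TaskGraph:
    def __init__(self):
        self._last_writer = {}  # region name -> future of last writing task
        self._pool = ThreadPoolExecutor()
        self._futures = []

    def launch(self, fn, reads=(), writes=()):
        # A task may run only after the tasks that wrote the regions it uses.
        # (Simplified: read-after-write and write-after-write only.)
        deps = [self._last_writer[r] for r in (*reads, *writes)
                if r in self._last_writer]
        def run():
            wait(deps)  # honor data dependencies before executing
            return fn()
        fut = self._pool.submit(run)
        for w in writes:
            self._last_writer[w] = fut
        self._futures.append(fut)
        return fut

    def execute(self):
        wait(self._futures)
        self._pool.shutdown()

graph = TaskGraph()
state = {"rho": 1.0, "T": 300.0}

# These two tasks touch disjoint regions, so they may run concurrently...
graph.launch(lambda: state.update(rho=state["rho"] * 1.1), writes=("rho",))
graph.launch(lambda: state.update(T=state["T"] + 10.0), writes=("T",))
# ...while this task reads both outputs and therefore waits for both.
graph.launch(lambda: state.update(e=state["rho"] * state["T"]),
             reads=("rho", "T"), writes=("e",))
graph.execute()
print(round(state["e"], 3))  # 1.1 * 310.0
```

A real system like Legion additionally partitions data into logical regions and uses a mapper to place tasks and data on specific processors and memories; the sketch keeps only the dependency-analysis idea.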
Ristra also employs what is called the Flexible Computational Science Infrastructure (FleCSI) as the key abstraction layer to separate the concerns of physics expression from data management and parallel execution. FleCSI enables multiple parallel backends such as Legion or message passing interface (MPI) to be swapped in and out without changing the physics code.
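The separation of concerns FleCSI provides can be sketched as follows. This is not FleCSI’s actual interface; the `Backend` class and the `total_energy` physics kernel are invented for illustration. What it shows is the pattern described above: physics code targets an abstract execution interface, so backends can be exchanged without touching the physics.

```python
# Illustrative sketch (not FleCSI's actual API): physics code is written
# against an abstract runtime, so the parallel backend can be swapped
# without changing the physics expression.
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor

class Backend(ABC):
    @abstractmethod
    def parallel_for(self, fn, items): ...
    @abstractmethod
    def reduce_sum(self, values): ...

class SerialBackend(Backend):
    def parallel_for(self, fn, items):
        return [fn(i) for i in items]
    def reduce_sum(self, values):
        return sum(values)

class ThreadBackend(Backend):
    # Stand-in for a distributed backend such as Legion or MPI.
    def parallel_for(self, fn, items):
        with ThreadPoolExecutor() as pool:
            return list(pool.map(fn, items))
    def reduce_sum(self, values):
        return sum(values)

def total_energy(backend, masses, velocities):
    # "Physics" code: knows nothing about how the work is distributed.
    ke = backend.parallel_for(lambda mv: 0.5 * mv[0] * mv[1] ** 2,
                              list(zip(masses, velocities)))
    return backend.reduce_sum(ke)

m, v = [2.0, 4.0], [1.0, 3.0]
results = [total_energy(b, m, v) for b in (SerialBackend(), ThreadBackend())]
print(results)  # same answer from either backend: [19.0, 19.0]
```

The design payoff is that a backend change is a one-line substitution at the call site rather than a rewrite of every physics kernel.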
FleCSI supplies the infrastructure for the implementation of mesh and mesh-free discretizations, together with associated data models for physical fields. A mesh is a network of cells and points used to solve partial differential equations; discretization refers to dividing a continuous problem into a finite set of elements so that it can be solved numerically. For multiphysics simulations, data must be mapped efficiently between these discretizations, a capability provided by Portage, Ristra’s remapping and linking library.
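The remapping idea can be illustrated in one dimension. The sketch below is not Portage’s API; it is a minimal first-order conservative remap, invented for illustration, that transfers a cell-averaged field from a source mesh to a differently spaced target mesh while preserving the field’s total integral.

```python
# Minimal 1D sketch of conservative remapping (the idea behind a library
# like Portage, not its actual API): each target cell receives the
# integral contributions of the source cells that overlap it.
def remap_1d(src_edges, src_vals, tgt_edges):
    tgt_vals = []
    for lo, hi in zip(tgt_edges[:-1], tgt_edges[1:]):
        acc = 0.0
        src_cells = zip(zip(src_edges[:-1], src_edges[1:]), src_vals)
        for (slo, shi), val in src_cells:
            overlap = max(0.0, min(hi, shi) - max(lo, slo))
            acc += val * overlap          # integral over the overlap
        tgt_vals.append(acc / (hi - lo))  # back to a cell average
    return tgt_vals

src_edges = [0.0, 1.0, 2.0]        # two source cells
src_vals  = [1.0, 3.0]             # cell-averaged field values
tgt_edges = [0.0, 0.5, 1.5, 2.0]   # three target cells

tgt_vals = remap_1d(src_edges, src_vals, tgt_edges)
src_total = sum(v * (b - a) for v, a, b in zip(src_vals, src_edges, src_edges[1:]))
tgt_total = sum(v * (b - a) for v, a, b in zip(tgt_vals, tgt_edges, tgt_edges[1:]))
print(tgt_vals, src_total, tgt_total)  # [1.0, 2.0, 3.0] 4.0 4.0
```

Conservation (the same total on both meshes) is the essential property for physically meaningful coupling; production libraries extend the same idea to higher-order reconstructions and unstructured meshes in two and three dimensions.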
Both FleCSI and Portage have been open-sourced by LANL and are available on GitHub.
Ristra has made solid progress both in terms of the physics and in demonstrating new technologies such as Legion in the project’s applications, Daniel said. An example pertains to a LANL multi-material radiation hydrodynamics code called Symphony, which was developed primarily with inertial confinement fusion in mind. Daniel explained that the Ristra team has incorporated into Symphony a novel multiscale radiation solver that couples high- and low-order differential equation solvers for radiation in a way that is consistent with the hydrodynamics and promises more-accurate simulations.
“Also, by using that multiscale approach, we expose some coarse-grain parallelism that should allow us to experiment with novel approaches to concurrency in future applications,” Daniel said. “We’re exploring a sort of fine-grained parallelism that one thinks of in terms of breaking up the work into threads and MPI ranks. We’re also thinking about breaking up the functional structure of an application into physics modules or submodules so that they can be evolved independently and concurrently and eliminate some of the synchronizations that are inherent between the coupling of physics codes in the way that most multiphysics codes are written today. And that could give us opportunity for pursuing greater scalability on machines of the exascale era and beyond.”
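The coarse-grained concurrency Daniel describes can be sketched simply. The module names (`advance_hydro`, `advance_radiation`), their update formulas, and the coupling below are invented for illustration; what the sketch shows is the structural idea of letting physics modules advance concurrently within a step and synchronizing only at step boundaries, rather than running each module in sequence.

```python
# Hedged sketch of concurrent physics-module coupling: both modules read
# the same beginning-of-step state, advance in parallel, and synchronize
# once per step when their results are merged. The physics here is a toy
# stand-in, not any actual Ristra model.
from concurrent.futures import ThreadPoolExecutor

def advance_hydro(state, dt):
    return {**state, "rho": state["rho"] * (1.0 - 0.01 * dt)}

def advance_radiation(state, dt):
    return {**state, "T": state["T"] * (1.0 - 0.02 * dt)}

def step_concurrent(state, dt):
    # One synchronization point per step: the merge at the end,
    # instead of a barrier after every module as in sequential coupling.
    with ThreadPoolExecutor() as pool:
        hydro = pool.submit(advance_hydro, state, dt)
        rad = pool.submit(advance_radiation, state, dt)
        return {**state,
                "rho": hydro.result()["rho"],
                "T": rad.result()["T"]}

state = {"rho": 1.0, "T": 300.0}
for _ in range(10):
    state = step_concurrent(state, dt=1.0)
print(state)
```

The trade-off is accuracy versus concurrency: modules see slightly stale partner data within a step, but the removed synchronizations are exactly the scalability opportunity the quote points to.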
The Collaborative Team
The Ristra project is composed of approximately 50 scientists from various disciplines who contribute at some level. Twenty-five core team members are distributed across computer science, materials science, hydrodynamics, radiation science, and astrophysics.
“There certainly are challenges in having a diverse team like that, so one of our approaches is to have frequent project-wide meetings, which we call tea times, where we all get together,” Daniel said. “We can discuss in smaller groups or in one large group if there are issues that pertain to the whole project.”
Ristra has a large external review scheduled for fiscal year 2020, which will lead the team to take stock of achievements and request advice from the outside concerning progress and where the effort needs to go next, Daniel explained.
“And then beyond that review,” he said, “we anticipate a greater focus on performance and scalability and preparing to demonstrate our codes on the exascale platforms when they appear.”
Researchers (partial list)