What is the Exascale Computing Project (ECP)?
The ECP is a collaborative effort of two US Department of Energy (DOE) organizations – the Office of Science (DOE-SC) and the National Nuclear Security Administration (NNSA). ECP was established to accelerate delivery of a capable exascale computing system that integrates hardware and software capability to deliver approximately 50 times more performance than the nation’s most powerful supercomputers in use today. ECP’s work encompasses applications, system software, hardware technologies and architectures. In addition to being a DOE multi-lab collaborative effort, ECP will work closely with other Federal government agencies in a ‘whole-of-Nation’ approach to establishing an enduring national HPC ecosystem along with HPC workforce development.
The goal of the ECP is to deliver breakthrough modeling and simulation solutions that analyze more data in less time, providing insights and answers to the most critical US challenges in scientific discovery, energy assurance, economic competitiveness, and national security.
DOE formalized this long-term strategic effort under the guidance of key leaders from six of the major DOE-SC and NNSA national laboratories: Argonne, Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge, and Sandia.
ECP will have broad impact and will play an important role in driving US technological competitiveness amid the convergence of HPC, big data analytics and machine learning. ECP-funded research and development efforts will affect these areas across the spectrum of science and engineering domains and disciplines.
Note that the ECP is not responsible for building a future generation of extremely fast, large-capacity supercomputers. Its role is to provide strategic leadership and prepare the foundation for capable exascale systems by identifying and supporting research efforts to accelerate the applications, software, hardware platforms and architectures critical to the development of a capable national exascale ecosystem.
ECP’s focus on “capable” exascale systems means that hardware, software, applications, platforms and facilities are co-designed and integrated to deliver sustained performance (not just benchmark achievement), supporting key DOE missions and contributing to US economic competitiveness.
What is a ‘capable’ exascale system?
A capable exascale system is defined as a supercomputer that:
- Delivers 50x the performance of today’s 20 PF systems, supporting applications that deliver high-fidelity solutions in less time and address problems of greater complexity
- Operates in a power envelope of 20-30 MW
- Is sufficiently resilient (perceived fault rate of no more than 1/week)
- Includes a software stack that supports a broad spectrum of applications and workloads
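The numeric targets in the list above can be turned into a few derived figures. The following is a minimal sketch of that arithmetic only; the function and constant names are hypothetical and not ECP terminology, and it assumes "PF" means 10^15 floating-point operations per second without distinguishing peak from sustained performance, which the definition does not specify.

```python
# Illustrative arithmetic only; all names here are assumptions for this sketch.
BASELINE_PFLOPS = 20       # "today's 20 PF systems"
SPEEDUP = 50               # "50x the performance"
POWER_ENVELOPE_MW = (20, 30)  # power envelope bounds, in megawatts
MAX_FAULTS_PER_WEEK = 1    # perceived fault rate ceiling


def target_flops(baseline_pflops=BASELINE_PFLOPS, speedup=SPEEDUP):
    """Target performance in floating-point operations per second."""
    return baseline_pflops * 1e15 * speedup


def gflops_per_watt(flops, power_mw):
    """Energy efficiency implied by a performance level and a power draw."""
    watts = power_mw * 1e6
    return flops / watts / 1e9


def min_hours_between_faults(max_faults_per_week=MAX_FAULTS_PER_WEEK):
    """Shortest acceptable average interval between perceived faults."""
    return 7 * 24 / max_faults_per_week


if __name__ == "__main__":
    flops = target_flops()
    print(f"Target performance: {flops / 1e18:.1f} exaflops")
    for mw in POWER_ENVELOPE_MW:
        print(f"At {mw} MW: {gflops_per_watt(flops, mw):.1f} GFLOPS per watt")
    print(f"Minimum interval between faults: {min_hours_between_faults():.0f} hours")
```

Under these assumptions, 50 times 20 PF is one exaflop, the 20-30 MW envelope implies roughly 33 to 50 GFLOPS per watt, and a fault rate of no more than one per week corresponds to at least 168 hours between perceived faults on average.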
The ECP’s plan of record is:
- A 7-year project that follows the holistic co-design approach and runs through 2023 (including 12 months of schedule contingency)
- Enable an initial exascale system based on advanced architecture delivered in 2021
- Enable capable exascale systems, based on ECP R&D, delivered in 2022 and deployed in 2023 as part of NNSA and SC facility upgrades
How does ECP add value to what the DOE laboratories already are doing in terms of using HPC to advance scientific discovery?
The ECP leads the formalized project management and integration processes that bridge and align the resources of the DOE and NNSA laboratories, allowing them to work more effectively with industry. This includes integration with technology and system vendors and with software and application developers that goes beyond the specific needs and charters of any one laboratory. The ECP leadership team, composed of some of the most senior technology leaders of the DOE and NNSA HPC communities, is chartered with managing this complex, multi-year project. Its job is to take full advantage of existing infrastructure when feasible and to maximize project efficiency by managing resources and investments while accelerating research and development.
“The Exascale Computing Project offers a rare opportunity to advance all elements of the HPC ecosystem in unison,” ECP Director Paul Messina said. “Co-design and integration of hardware, software, applications and platforms, a strategic imperative of the ECP, is essential to deploying exascale-class systems that will meet the future requirements of the scientific communities these systems will serve.”
Why is ECP needed?
American leadership in HPC is being challenged as never before, and the stakes are high. The new computing technologies required to achieve exascale will eventually make their way into consumer products and the services that enhance US global economic competitiveness and improve our quality of life. The ECP provides a leadership team with HPC technology and complex project management expertise to ensure a coordinated, collaborative approach to defining and developing necessary future exascale ecosystems, maximizing the return on the nation’s investment in the computing that underpins scientific advancement, national security, and economic well-being.
How is the ECP funded?
DOE has a long history of supporting high-end computing system acquisitions at its national laboratories through the DOE ASCR (Advanced Scientific Computing Research) and NNSA ASC (Advanced Simulation and Computing) programs. With ECP, the DOE Office of Science and the NNSA are jointly funding a coordinated multi-lab effort to avoid duplication, maximize efficiency and drive significant new efforts in terms of application readiness, hardware and software co-design, and workforce development.
How is the ECP structured?
The ECP is a 7-year project led by six DOE and NNSA laboratories and executed in collaboration with academia and industry. The ECP leadership team has staff from the six labs, but it is expected that additional staff from most of the 17 DOE national laboratories will participate in the project.
Will the ECP lead the procurement of the nation’s first exascale supercomputers?
The procurement of future exascale-class supercomputers for the DOE-SC and NNSA laboratories will be handled under the same base facility programs in place today, a process familiar to most HPC system and software suppliers. Prior to the procurement phase, the ECP team will help to establish the design, performance and implementation requirements of these future systems. ECP will play a key role in determining the requirements for hardware, software, applications, and facilities that will be reflected in the exascale Request for Proposal (RFP) documents.
The ECP will also play a key role in helping to drive new training programs throughout the US HPC ecosystem to prepare application developers, researchers and scientists to take full advantage of future generation exascale environments.
Three aspects of the ECP's contribution are critical to bringing the US to the forefront of the exascale computing era: the elements of co-design that affect hardware and software development, a major effort to enhance application readiness, and an expansive HPC user training effort.
For more information:
Exascale Computing Project