Oak Ridge National Laboratory
Awareness of ECP and its mission is growing and resonating—and for good reason. ECP is an incredible effort focused on advancing areas of key importance to our country: economic competitiveness, breakthrough science and technology, and national security. And, fortunately, ECP has a foundation that bodes extremely well for the prospects of its success, with the demonstrably strong commitment of the US Department of Energy (DOE) and the talent of some of America’s best and brightest researchers.
ECP is composed of about 100 small teams of domain, computer, and computational scientists, and mathematicians from DOE labs, universities, and industry. We are tasked with building applications that will execute well on exascale systems, enabled by a robust exascale software stack, and supporting necessary vendor R&D to ensure the compute nodes and hardware infrastructure are adept and able to do the science that needs to be done with the first exascale platforms.
Highlighting ECP’s Mission and Progress
I recently sat for a video interview with ECP Communications to revisit the big-picture perspective of why, how, and what we’re doing to pursue our mission. We talk, in general terms, about the teams, the projects, and the co-design centers; and the magnitude of what’s required relative to hardware, particularly considering the uptick during the last 4 or 5 years in artificial intelligence, machine learning, and deep learning.
As a reminder for some—and new information for those whose interest in ECP has only recently been piqued—we clarify what is within ECP’s scope and what isn’t.
Finally, our video chat will provide you with highlights of ECP’s progress since the first of the year and some of the key areas with which we’ll be concerning ourselves during the rest of 2018.
With respect to progress, marrying high-risk exploratory and high-return R&D with formal project management is a formidable challenge. In January, through what is called DOE’s Independent Project Review, or IPR, process, we learned that we can indeed meet that challenge in a way that allows us to drive hard with a sense of urgency and still deliver on the essential products and solutions.
In short, we passed the review with flying colors—and what’s especially encouraging is that the feedback we received tells us what we can do to improve. Moreover, we found that what the reviewers said was very consistent with our own thinking. Undoubtedly, the IPR experience represented a key step forward for us. We’ll be going through an IPR at least once a year, and that’s a good thing because we believe external scrutiny of what we’re doing and how we’re doing it is important and useful.
I also highlight successes we’ve had in our three research focus areas: Application Development (AD), Software Technology (ST), and Hardware and Integration (HI). But at this point, I also want to note that each of ECP’s focus area directors recently participated in audio interviews to share their up-close perspectives that you can listen to via this newsletter in the Focus Areas Update section; their associated discussion points are also posted for you to read.
We’ve made significant headway in identifying key AD and ST products. AD has demonstrated effectiveness by releasing a number of applications over the last several months while also developing deep-dive algorithms and models. The ST effort, with relatively new leadership, has been moving from R&D to product development and deployment. ST has a good plan for packaging our various-size components into bite-size chunks of software that the DOE laboratories will consume, integrate, and test.
The scope of Hardware and Integration (HI) includes support for US vendor research and development focused on innovative architectures for competitive exascale system designs (PathForward), hardware evaluation, an integrated and continuously tested exascale software ecosystem deployed at DOE facilities, accelerated application readiness on targeted exascale architectures, and training on key ECP technologies to accelerate the software development cycle and optimize the productivity of application and software developers. In the HI PathForward activity, which funds US vendor R&D for nodes that are tuned for our applications and system designs, the vendors have been hitting their milestones on schedule. We are very optimistic that the vendor R&D will appear in key products in the exascale systems when they’re procured.
Developing a Diverse Portfolio of Applications
ECP supports all of the key program offices in DOE (Office of Science, applied offices, NNSA Defense Programs), and so our incredible teams are engaged in several main categories of applications research. Examples of some of those categories are national security, energy, fundamental materials and chemistry, scientific discovery, and data analytics.
For national security, we’re developing next-generation applications in support of the NNSA’s stockpile stewardship program, namely reliability testing and maintenance of U.S. nuclear weapons without the use of nuclear testing.
In energy, our work is centered on fission and fusion reactors, wind plants, combustion in internal combustion engines and land-based gas turbines, advanced particle accelerator design, and chemical looping reactors for the clean combustion of fossil fuels. The chemistry and materials category covers everything from strongly correlated quantum materials to atomistic design of materials for extreme environments to advanced additive manufacturing process design. Our researchers in additive manufacturing are working to understand that process so that qualified metal alloys can be printed for defense and aerospace. On the chemical side, a great example of what we’re doing is catalyst design. We’re also addressing the very foundations of matter via the study of the strong nuclear force and the associated Standard Model, which is among the most fundamental focus areas of nuclear and high-energy physics.
Our earth and space science applications include astrophysics and cosmology (e.g., understanding the origin of elements in the universe, understanding the evolution of the universe, and trying to explain dark matter and dark energy). Other key applications include accurately modeling the geologic subsurface for fossil fuel extraction, waste disposal, and carbon capture technologies; developing a cloud-resolving Earth system model to enable regional climate change impact assessments; and addressing the risks and response of the nation’s infrastructure to large earthquakes.
Within the data analytics category, we have artificial intelligence and machine-learning applications focused on the Cancer Moonshot, which is basically precision medicine for oncology. We’re also investigating metagenomics data sets for new products and life forms. In addition, we are focused on optimization of the US power grid for the efficient use of new technologies in support of new consumer products and on a multiscale, multisector urban simulation framework that supports the design, planning, policies, and optimized operation of cities. Another facet of our data analytics work involves seeing how we can extract more knowledge from the experimental data coming from the DOE Science facilities. Our study is focused on SLAC’s Linac Coherent Light Source (LCLS) facility, but we are committed to helping myriad facilities across the DOE complex stream their data, determine what’s in it, and drive or computationally steer experiments to give us more insight.
ECP aims to be a thought leader and provide direction, whether the subject is programming next-generation hardware or designing models and algorithms to target certain physical phenomena, for example. We know we must interface with industry—from small businesses to large corporations—to avoid missing functional requirements that are important to them.
That’s why we stood up ECP’s Industry Council to work with us as an external advisory group. It is really helping to guide us concerning the challenge problems we’re addressing. The council gives us advice on whether the applications we’re tackling can be leveraged in their environments and, if not, how we can move in that direction. We meet with the council every couple of months to discuss the status of progress and where ECP is headed to ensure it will best fit the needs of US industry.
Understanding and Mitigating Risks
ECP must adhere to a very aggressive schedule, and I believe we are doing so with the proper sense of urgency. The schedule is not only extremely dynamic but also abounding with risks. We can, however, say without overstatement that we are on track because we rigorously monitor the work at a granular level. To help us perform the tracking, we use two metrics: the schedule performance index and the cost performance index.
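For readers unfamiliar with these two indices, the sketch below uses the standard earned value management definitions: the schedule performance index is earned value divided by planned value, and the cost performance index is earned value divided by actual cost. A value of 1.0 or higher means the work is on or ahead of plan. The function name and the dollar figures are hypothetical, chosen only for illustration; this is not a description of ECP's actual tracking tools or data.

```python
def performance_indices(earned_value, planned_value, actual_cost):
    """Return (SPI, CPI) using the standard earned value definitions.

    SPI = earned value / planned value  (>= 1.0: on or ahead of schedule)
    CPI = earned value / actual cost    (>= 1.0: on or under budget)
    """
    if planned_value <= 0 or actual_cost <= 0:
        raise ValueError("planned_value and actual_cost must be positive")
    spi = earned_value / planned_value
    cpi = earned_value / actual_cost
    return spi, cpi


# Hypothetical figures (in arbitrary dollars), not ECP data:
# work worth 90 is complete, 100 was planned by now, and 120 has been spent.
spi, cpi = performance_indices(earned_value=90.0,
                               planned_value=100.0,
                               actual_cost=120.0)
print(f"SPI = {spi:.2f}, CPI = {cpi:.2f}")
```

In this made-up case both indices fall below 1.0, which would flag work that is behind schedule and over budget.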
Some projects have higher risk and more technical challenges than others, and that’s understandable. We rely heavily on our project office and our leadership team to understand what the risks are—both the known unknowns and the unknown unknowns.
I believe that within the next year or so as we learn more about the first three exascale systems deployed in this country, a lot of our risks will either be retired or moved into the known unknowns category, which we can mitigate with our project’s use of contingency.
We execute according to a certain funding profile, and so we hope that our DOE sponsors will be able to deliver on what we believe is the funding profile necessary for success.
Another very important consideration for us is ensuring that the right programming models are available for the hardware, from both the software stack and the applications sides of ECP, so that the heterogeneity of the memory and the CPU hierarchy in the exascale systems can be optimally exploited.
Workforce development is a risk as well. We have been fortunate to be able to staff our project teams with some of the best and brightest in the world. Ensuring they’re working on problems that are fun and challenging so they’ll stay with us is very important. These scientists and engineers are arguably among the most marketable people anywhere, so they’re really in high demand outside of ECP.
One other especially notable risk is ensuring that the US vendors deliver the hardware and low-level software we need for our applications. The PathForward project allows us to inject resources for crucial vendor R&D. Through PathForward partnerships, we can pull in products sooner so we can extract the product quality and efficacy we need more quickly.
Looking toward the Horizon
Last year we executed the first of what will be an annual deep dive assessment of our AD and ST efforts, and so we’ll conduct our next one this year.
We will engage external subject matter experts in that process, and we expect to see applications have fairly well-defined quantitative metrics for the performance parameters. The AD teams have laid out challenge problems of national interest that they plan to address on the exascale systems. Quantifying those challenge problems involves determining exactly what they are in terms of speeds and feeds on the system, and we believe we’ll be able to better clarify those numbers.
On the ST side, we’ll examine what we call impact goals and metrics, which describe who is using a software component, whether the component is installed at a facility, and, if so, whether a line of sight to the facility or to an application is in place. Having that line of sight is crucial to proper integration.
Finally, we anticipate that dozens more milestones will be completed by the end of the year, and these will most definitely inform what we believe will be exciting responses to the exascale Request for Proposals. ECP has the job of ensuring the vendors’ R&D is in a good place to propose exciting products for the exascale platform, and we’re working very hard to make that happen.
Application Development
The status of testing on the Summit system and the success and impact of the co-design centers are subjects of an audio update from Andrew Siegel, director of ECP’s Application Development (AD) research focus area.
Software Technology
ECP’s Software Technology (ST) research focus area is working on several elements of its Software Development Kit initiative and preparing for the coordinated release of ST products for Q1FY19. ST also has started several new projects that will address functionality gaps identified at the beginning of the fiscal year. ST Director Michael Heroux discusses those topics in an audio update.
Hardware and Integration
The US Department of Energy high performance computing laboratories will acquire, install, and operate exascale-class systems. ECP’s Hardware and Integration (HI) focus area helps the labs and ECP applications and software teams prepare for exascale through mutually beneficial collaborations. HI Director Terri Quinn explains the activities and goals of the focus area in an audio segment.
New Master Projects List
We’ve had a number of people ask that we post a master list showing all the ECP research projects. Although you can find all our research project pages by going to the Focus Areas on the ECP homepage, you can now use the last item in the Focus Areas drop-down menu to access a comprehensive list of all active ECP research projects.
Recent Training Events
As a quick reminder, you can find the presentations and videos from past webinars on the external ECP website. Following are links to a couple of the most recent: On-Demand Learning for Better Scientific Software and Software Citation Today and Tomorrow.
If you have any questions, future training events that you would like to advertise, or suggestions for future ECP training events, please contact Ashley Barker, firstname.lastname@example.org.
New Exascale System for Earth Simulation
A recently unveiled Earth modeling system will have weather-scale resolution and use advanced computers to simulate aspects of Earth’s variability and anticipate decadal changes that will critically impact the U.S. energy sector in coming years.
ECP Appoints General Electric’s Kepczynski as Industry Council Chair
ECP has announced the selection of GE’s Brunon (Dave) Kepczynski as the new chair of ECP’s Industry Council.
LBNL’s David McCallen Presents ECP Application Research Project at HPC User Forum
Lawrence Berkeley National Laboratory’s David McCallen, a researcher from the Exascale Computing Project, spoke at the 69th HPC User Forum in Tucson, Arizona, about exascale simulations for regional-scale earthquake hazard and risk.
A Game Changer: Metagenomic Clustering Powered by HPC
A new Berkeley Lab algorithm allows biologists to harness the capabilities of massively parallel supercomputers to make sense of a genomic data deluge.
ECP’s SOLLVE Project Helps Users at Hackathon
The Scaling OpenMP with LLVM for Exascale Performance and Portability, or SOLLVE, project is centered on developing enhancements to OpenMP that meet critical needs of ECP applications. Members of the project assisted users from February 26 through March 2 at a hackathon hosted by Brookhaven National Laboratory. The hackathon was for research and computational scientists, code developers, and computing hardware experts to optimize scientific codes for high-performance computing.
Andrew Siegel Speaks about ECP at Ohio Supercomputer Center’s Spring Statewide Users Group Conference
Andrew Siegel, Exascale Computing Project Application Development focus area director, recently provided an overview of ECP and insight into its work in a keynote address at the Ohio Supercomputer Center Statewide Users Group spring conference in Columbus, Ohio.
Secretary of Energy Rick Perry Announces $1.8B Initiative for New Supercomputers
On April 9, US Secretary of Energy Rick Perry announced a Request for Proposals, potentially worth up to $1.8 billion, for the development of at least two new exascale supercomputers to be deployed at US Department of Energy National Laboratories in the 2021–2023 timeframe.
Co-Design is Key to ECP’s Holistic Approach
Creating a capable exascale ecosystem requires an interdisciplinary engineering approach. Within the Exascale Computing Project, developers of the software ecosystem, the hardware technology, and a new generation of computational science applications are collaborating in a participatory design process referred to as co-design.
Shaking Things up with Earthquake Simulations
Research done as part of ECP has used some of the world’s most powerful supercomputers to model ground shaking for a magnitude 7.0 earthquake on the Hayward fault and show more realistic motions than ever before.
ECP’s Podcast Series, Let’s Talk Exascale, Picks Up Momentum, Approaching Its 20th Episode
In the early part of 2018, a new avenue debuted for sharing information about the activities, challenges, milestone accomplishments, and science impact of ECP’s work in Application Development, Software Technology, and Hardware and Integration. This outlet is a very well-received podcast series called Let’s Talk Exascale. Visitors have the option of listening to the audio recordings and/or reading edited transcripts of the discussions.
Program Offers Developers a Training and Productivity Edge on Emerging Technologies
ECP’s Training and Productivity program delivers robust developer training for ECP project members and the broader HPC community, including industry’s computing staff members, so they will be able to take full advantage of exascale hardware and software.