By Scott Gibson
Hi. In this podcast we explore the efforts of the Department of Energy’s (DOE’s) Exascale Computing Project (ECP)—from the development challenges and achievements to the ultimate expected impact of exascale computing on society.
This is episode one in a series on Frontier, the nation’s first exascale supercomputer. Joining us is Justin Whitt of Oak Ridge National Laboratory (ORNL). Justin is program director of the Oak Ridge Leadership Computing Facility at ORNL. He’s responsible for providing leadership-class computers to researchers on behalf of DOE’s Advanced Scientific Computing Research program, which is part of DOE’s Office of Science. Within his role at the lab, Justin is also the director of the project to acquire and deploy ORNL’s next supercomputer called Frontier. I interviewed Justin on September 29, 2021.
Our topics: What Frontier will do, why Frontier is unique, what’s special about exascale computing and the journey to achieve it, getting the physical space ready for Frontier, and more.
Gibson: Let’s start with the elevator speech. What will the Frontier supercomputer do?
Whitt: Yes, well, Frontier will provide an unprecedented amount of computing power to researchers around the world. Core to this is its ability to perform over 1 quintillion calculations in a single second. That’s a 1 with 18 zeros after it. When those quintillion calculations per second involve very large or very precise numbers, that’s what a computer scientist would call an exaflop. When a computer has the capability to do those calculations, we call it an exascale computer, and Frontier will be the nation’s first exascale computer.
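To make that “1 with 18 zeros” concrete, here is a small back-of-the-envelope sketch in Python. The comparison point of a laptop sustaining roughly one teraflop is an assumed round number for illustration, not a figure from the interview.

```python
# 1 exaflop = 10**18 floating-point operations per second
# (a 1 with 18 zeros after it).
EXAFLOP = 10**18  # operations Frontier can perform each second

# Assumed comparison point: a laptop sustaining ~1 teraflop,
# i.e. 10**12 operations per second (a hypothetical round number).
LAPTOP_FLOPS = 10**12

# How long would that laptop take to match one second of Frontier?
seconds = EXAFLOP / LAPTOP_FLOPS  # 1,000,000 seconds
days = seconds / 86_400           # seconds per day

print(f"{seconds:,.0f} seconds, or about {days:.1f} days")
```

Under these assumptions, one second of exascale computing corresponds to roughly a million seconds, or about eleven and a half days, of work for the hypothetical laptop.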
Gibson: What’s unique about Frontier?
Whitt: In one way, exascale systems like Frontier are just another point on a continuum of ever more powerful computers. Yet, they’re very exciting because they represent the farthest that we as humans have reached along that continuum to this point. And so, Frontier for researchers is just very exciting in that it brings that computational power, that capability to their research.
Gibson: What does it take to get to exascale?
Whitt: It’s probably good to go back a few years. Just over 10 years ago, many experts wondered if we could ever get to exascale computing. First off, they thought that an exascale computer would take over 100 megawatts of electrical power to operate, and that made it impractical to run. Then the questions became, ‘Could the hardware and software on systems of this size ever be reliable and stable enough to use, and if they were stable, could we program them efficiently enough to harness that much computer hardware simultaneously?’ And so, over more than 10 years now, there have been strategic federal investments and public–private partnerships that have been truly exemplary in finding innovative answers to these questions. And that’s what’s allowed us to be where we’re at today, where exascale computers like Frontier are possible.
Gibson: Please describe for us the structural transformation that was required to prepare the room that Frontier will occupy.
Whitt: Yeah, again, let’s go back in time a little bit to the end of 2019. At that point, the Titan supercomputer was occupying very valuable data center space we knew we were going to need for Frontier. Researchers using Titan were transitioned over to a new supercomputer at the time called Summit, allowing us to decommission Titan. Titan had to be disassembled, and we ended up removing almost half a million pounds of Titan components from the data center.
For Frontier, we knew that we primarily needed three things: more power, more water for cooling, and a stronger raised floor in that data center. So, we began by removing the existing raised floor. And once we had that out, along with some of the old Titan infrastructure, we began removing, reprovisioning, and rerouting all the electrical and water conduits inside the building. At the same time, we were bringing additional power and cooling capability to the building. And that brings us to where we’re at today: the data center is again operational and is ready for Frontier.
Gibson: Is there an analogy you can give to convey the space requirements for Frontier in a way a broad audience can understand?
Whitt: Yeah, Frontier will take up an area that’s roughly about a quarter of a football field or about two-and-a-half regulation basketball courts.
Gibson: Will you describe the aggressive challenge the Frontier team set for itself in standing up the system?
Whitt: One thing that was clear from the beginning about DOE’s need for the system was that researchers needed it ASAP. In all, we reduced the original schedule by over a year. This took accelerating every aspect of the project, from construction in the data center to developing the computer codes that researchers would use on Frontier. Accelerating all these different pieces was quite a feat. But even that would have been for naught without what was really a hero effort by our partners at Hewlett Packard Enterprise and Advanced Micro Devices. These companies also had to accelerate their development of key technologies, making processors, software, and network technologies available much sooner.
Gibson: What engineering and science need for exascale computers led to the accelerated time frame to deliver Frontier?
Whitt: I think the thing to understand there is that the supercomputers that are operating today are often 4x to 5x oversubscribed. That means from a supply-and-demand perspective there is 4x or 5x more demand for the fastest supercomputers than we can currently supply.
You can think about this in two ways: it limits the number of researchers who can use the supercomputer, and it limits how much of the supercomputer a single researcher can consume. Much of the research on these computers requires a large portion of the computer, and therefore it really can’t be done anywhere else. It’s easy to see how this can have a direct effect on US leadership in innovation and technology. But even putting that aside, Frontier is going to open whole new realms of science that’ll allow researchers across science and engineering disciplines to probe more deeply into natural phenomena through modeling and simulation, to make almost inhuman inferences across massive amounts of data, and to answer questions and make discoveries that, really, have never been possible before. And that is really what makes this an exciting time for researchers.
Gibson: What’s your assessment of Frontier’s progress?
Whitt: Ha, well, we have certainly had some challenges, for instance the scarcity of materials and parts brought on by the pandemic. But when we started down this road several years ago, we planned to have Frontier in the hands of researchers next year, and we are 100 percent on track to do that. As we like to say, we are on scope, schedule, and budget. In fact, Frontier is being built inside our data center even as we speak.
Gibson: Is there anything else you’d like to cover?
Whitt: I think one thing that’s worth commenting on is that I’ve spent some time reflecting on why we are where we are today with Frontier. And when I think about this, I’m inevitably in awe of both the breadth and depth of the partnerships across the DOE labs, academia, and industry that have allowed us to overcome a lot of these technical challenges and allowed exascale computing to exist today. And I think it’s worth noting that the Exascale Computing Project is a microcosm of these partnerships, and the scientific and software applications that are being developed through that project and prepared for exascale systems are another example of the power of these partnerships.
Gibson: Thanks, Justin, for being on ECP’s podcast. We’ll get further perspectives on Frontier in other episodes.
Related Links:
- The Pioneering Frontier article series
- The Road to Exascale
- Exascale Computing’s Four Biggest Challenges and How They Were Overcome
Frontier Construction Features:
- 09/29/21—Stunning Specs: What’s Inside the Nation’s First Exascale Supercomputer Facility?
- 05/20/21—OLCF Announces Storage Specifications for Frontier Exascale System
- 12/11/20—Building an Exascale-Class Data Center
- 09/23/20—Powering Frontier and complementary Photo Story
- 01/26/20—Making Room for Frontier