A conversation with Tim Germann of Los Alamos National Laboratory on ECP’s Co-Design Center for Particle Applications
Tim Germann of Los Alamos National Laboratory leads ECP’s Co-Design Center for Particle Applications (COPA). He spoke with ECP Communications at SC17 about COPA. This is an edited transcript of the conversation.
Tim, can you tell us what your project is about?
Within ECP, there are essentially three main pillars: applications, which refers to developing the scientific application codes and the use cases and requirements for scientific discovery among other applications on future exascale supercomputers; software technologies, which are the techniques, the programming languages, and tools to use these computers; and then hardware technologies, which are, of course, the hardware components. To bring all of this together, ECP supports five co-design centers. Our center, COPA, is interfacing between these three areas, and each of the five co-design centers is looking at what are called computational motifs that multiple applications use.
So our center, from its name, is looking at particle-based applications. These particles can be atoms in molecular dynamics to look at the behavior of proteins or materials, or they can be all the way up to astrophysical objects. They can be measured pieces of matter formed from the Big Bang, followed as they evolve to form galaxies, clusters, stars, and planets.
And there are other examples such as particle-in-cell codes that are used for modeling fluids, solids, and plasmas, including several ECP plasma physics applications.
We’re looking, as are the other co-design centers, at whether there are common techniques that we can learn and exchange between these applications, or, ideally, components that multiple applications can use—for example, a library or a tool kit on which you can build any particle-based application.
It sounds like the implications of your research involve more than national security.
It’s much broader. A few years ago we took our molecular dynamics particle-based code and adapted the atoms to agents or people to model the spread of disease. It’s really much broader than just atoms or specific particles. It is a common motif.
Has working as part of the ECP opened up new collaborations for you?
It has. My background was mostly in materials and molecular dynamics. We previously had a co-design center under the US Department of Energy that was focused on materials modeling. In this case, by broadening out to other particle applications, we’ve teamed up with plasma physicists and cosmologists in these very different application areas. The common thread from the previous generation of co-design is working closely with the computer scientists developing the software stack and the hardware vendors developing the nodes, the systems, and the networking of these next-generation machines.
How long have you been involved with this co-design center?
COPA is about a year old this week [the week of November 12, 2017], actually. It’s our first anniversary.
What can you tell us about any of the milestones you’ve achieved?
We have a few. I mentioned the longer-term goal is to develop this software tool kit that you can use for multiple applications. That’s a longer-term effort that we’re just in the design stage for now.
For the shorter term, we have some specific libraries that we’ve extracted, including one for scalable fast Fourier transforms (FFTs). That’s a common algorithm that many application codes use, not just particle codes. So that was one of our milestones.
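For readers unfamiliar with the algorithm, here is a minimal single-process sketch of a radix-2 FFT. It is an illustration only and not the scalable library described above, which has to distribute the transform across many nodes. The function names `fft` and `naive_dft` are hypothetical, and the input length is assumed to be a power of two.

```python
import cmath


def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    if n == 1:
        return [complex(values[0])]
    # Transform the even- and odd-indexed samples separately.
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Combine the two half-size transforms with a "twiddle factor".
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def naive_dft(values):
    """Direct O(n^2) discrete Fourier transform, used only as a reference."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
```

The recursive version does O(n log n) work instead of O(n^2), which is why the algorithm shows up in so many codes; comparing `fft(signal)` with `naive_dft(signal)` on a short signal is a quick correctness check.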
Another was for the co-design process itself. When you work with vendors and other computer scientists, a particular challenge is that you often can’t deal with a complex million-line application code. You need smaller components you can rewrite and, say, run through a simulator or rewrite in a new programming model.
So we’ve developed and released a proxy application for forward-looking molecular dynamics algorithms and, particularly, the more complex interactions between atoms that we can model with the performance of the new machines.
You don’t yet know the architectural details of exascale platforms, so how do you plan for them and the scaling issues?
The one common theme is that if you’re going to efficiently utilize them, you have to be able to write your algorithm and application in a way that exposes as much concurrency, simultaneous work, as possible so it can be spread on the different processing elements, whether they’re graphics processors or processors that, say, are designed for cell phones, or processors for machine-learning applications. They all have, one way or another, a large number of individual computing units that have to be able to be utilized simultaneously. So that’s the first thing from the algorithm side.
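As a minimal sketch of what exposing concurrency can look like in a particle code (an illustration, not COPA code): the force on each particle below depends only on read-only positions, so every particle is an independent unit of work that can be handed to a different processing element. The toy 1D force law and the names `force_on`, `all_forces_serial`, and `all_forces_concurrent` are assumptions made for this example. Python threads are used only to show the structure; they would not deliver the speedup that real accelerators do.

```python
from concurrent.futures import ThreadPoolExecutor


def force_on(i, positions):
    """Toy 1D force on particle i: sum of linear attractions to all others."""
    xi = positions[i]
    return sum(xj - xi for j, xj in enumerate(positions) if j != i)


def all_forces_serial(positions):
    """Reference version: visit the particles one after another."""
    return [force_on(i, positions) for i in range(len(positions))]


def all_forces_concurrent(positions, workers=4):
    """Same result, but each particle is submitted as an independent task."""
    # Tasks only read the shared positions and each produces its own result,
    # so no task depends on another and they can run in any order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(
            pool.map(lambda i: force_on(i, positions), range(len(positions)))
        )
```

Because the per-particle tasks share no writable state, the concurrent version returns exactly the same list as the serial one, whatever order the tasks run in.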
The second piece is more on the software technology. Today those different types of processors—although they all have a large amount of concurrency—have different programming languages and different ways of using them. And so you essentially have to rewrite your code multiple times.
One of the goals of the ECP and the software technology in this area is to develop forward-looking models so that you could write your code once and then run efficiently on multiple types of architectures. So we’re looking at evaluating those.
About how many researchers are on the team working with you?
It’s about 20 people, I would say. And a large number of those are staff scientists at national labs who are part of other projects, either applications or software technology, or even projects outside of ECP. Usually the dedicated people working as part of the co-design center are the students and postdocs and maybe early-career staff who can dedicate their time.
What are the next significant milestones?
We are designing this common framework, and we would like to be able to use it to develop application codes in this wide range of areas. It’s kind of a high-risk attempt.
Grid-based codes use, say, a mesh representation. There have been libraries developed and used in those fields to develop different application codes. But in the particle simulation arena, everyone has kind of written their own code, their own algorithm and specialized code.
Getting buy-in from the applications and demonstrating the utility is the risk, and it will be exciting if we can achieve it.
Are these codes considered proprietary, or are they going to be open code?
No, they’ll be open. The application codes that we’re working with, I believe, are all open source, as well.
Does your project take advantage of any of the ECP’s allocations of computer time?
We haven’t yet, since, as I mentioned, the proxy apps are designed to just run on, or test, prototype hardware and simulators or emulators of future hardware. So those are mostly just looking at single-node or a few-node systems.
As we move toward the larger libraries and the framework, demonstrating scalability and testing scalable algorithms are going to be a big piece. So I think we’re going to ramp that part up over the next year and begin using more of the machines.
For the folks who haven’t worked with the term before, can you explain what a proxy app is?
It’s kind of a broad catchall term for some subset of a full-scale application code. It can be anything as small as an individual kernel, say, a matrix multiplication, a fast Fourier transform, or some other individual piece of the code, up to something that represents some of the workflow: how you go through the different steps in an algorithm, how the data needs to be transferred from one step to the next, and the dependencies, but without the full complexity of the application code. So you obviously have to throw some things away, and that may be the analysis or some of the more complex, esoteric routines that are used in some applications, but not all. It’s essentially the key components of the algorithm. Whereas application codes often range from 100,000 lines of computer code to even millions of lines, proxy apps are typically a few hundred to maybe a couple thousand lines of code. So as I mentioned before, it’s something simulators and emulators can get their hands around, something you can afford to completely rewrite in a new language or test out a new algorithm with, which you can’t do on the full-scale application code.
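To give a sense of scale, here is a minimal sketch of the kind of kernel a molecular dynamics proxy might be built around. It assumes a Lennard-Jones pair potential in reduced units and particles on a line. It is an illustration, not the proxy application COPA released, and the names `lj_forces`, `velocity_verlet`, and `total_energy` are hypothetical.

```python
def lj_forces(x, epsilon=1.0, sigma=1.0):
    """Lennard-Jones forces and potential energy for particles on a line."""
    n = len(x)
    forces = [0.0] * n
    energy = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = x[j] - x[i]
            d = abs(r)
            sr6 = (sigma / d) ** 6
            energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
            # Magnitude of -dU/dd; a positive value means the pair repels.
            fmag = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / d
            direction = r / d  # +1 or -1, pointing from particle i to j
            forces[i] -= fmag * direction
            forces[j] += fmag * direction
    return forces, energy


def velocity_verlet(x, v, dt, steps, mass=1.0):
    """Advance positions and velocities with the velocity Verlet integrator."""
    f, _ = lj_forces(x)
    for _ in range(steps):
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]  # half kick
        x = [xi + dt * vi for xi, vi in zip(x, v)]               # drift
        f, _ = lj_forces(x)                                      # new forces
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]  # half kick
    return x, v


def total_energy(x, v, mass=1.0):
    """Potential plus kinetic energy, useful as a conservation check."""
    _, potential = lj_forces(x)
    kinetic = 0.5 * mass * sum(vi * vi for vi in v)
    return potential + kinetic
```

A force loop, an integrator, and a conservation check fit in a few dozen lines, which is what makes a piece like this cheap to rewrite in a new programming model or to run through a hardware simulator. A real proxy would add the parts this sketch throws away, such as three-dimensional geometry, neighbor lists, and more complex interactions between atoms.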