Making Components of GAMESS Software More Scalable
A conversation with Mark Gordon of Ames Laboratory
Mark Gordon, Ames Laboratory Associate and Distinguished Professor at Iowa State University, spoke with Exascale Computing Project (ECP) Communications at SC17 in Denver. Gordon is the principal investigator for the ECP project called General Atomic and Molecular Electronic Structure System (GAMESS), which is developing methods for computational chemistry. We talked to him about various details of the project and what it has accomplished so far. This is an edited transcript of our conversation.
What is GAMESS all about?
GAMESS has been around quite a while, since the late 1970s. We started converting it to take advantage of parallel computers in the early 1990s, and, actually, Paul Messina [former ECP director] was involved in that from the very early days when he was at Caltech. Since then, we’ve been making components of GAMESS more and more scalable so they can take advantage of larger and larger parallel computers. It was a natural progression for us to get into the exascale computing model.
Who uses GAMESS?
GAMESS has on the order of 150,000 users throughout the world, in over 100 countries. The largest user base is in the United States, followed by Japan. It runs on almost every platform, from Apple computers and Windows machines all the way up to the current best computers provided by the US Department of Energy; we use it a lot on Mira at Argonne National Laboratory. GAMESS is available at every national supercomputing facility that we’re aware of. It’s also used in the pharmaceutical industry, and a number of corporations use GAMESS. Education is another area for the program: it is used widely in computational chemistry courses at universities.
What about GAMESS makes it an important element of the future exascale computing ecosystem?
It is probably one of the two most multifunctional computational chemistry programs available. Just about anything a user would want to do with computational chemistry can be done with GAMESS—everything from very high-level quantum chemistry to semi-empirical methods to classical force fields. That appeals to people who are developing new pharmaceuticals and are interested in things like catalysis. GAMESS is also easy to use. We have a very nice manual. GAMESS is distributed at no cost, so that’s appealing as well.
What do you highlight today as some of the important milestones with the research?
The first one that comes to mind relates to another program and ECP project called QMCPack that comes from a number of places, but mostly from Oak Ridge National Laboratory. That application does something called quantum Monte Carlo calculations, which are very high level, very sophisticated, and extremely computationally demanding. We now have an initial interface between GAMESS and QMCPack. One of the big features of GAMESS that makes it so scalable is that we have what are called fragmentation methods, which split a very large system into much smaller pieces, or fragments. There can be thousands or tens of thousands of fragments. Fragmentation methods would allow a user to do quantum Monte Carlo on much, much larger molecules and molecular systems than can be done now. So that’s a really big milestone. We have an initial interface that works. We’ve done it so far on water clusters. The next step is going to be to do it on heterogeneous catalysts. So that’s one big milestone we’ve reached.
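Editor's illustration: the fragmentation idea described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the GAMESS implementation. The function names `toy_energy` and `fragment_energy` are hypothetical, and the "energy" is a simple sum of pairwise 1/r terms standing in for an expensive quantum chemistry calculation. The sketch uses a generic two-body expansion: the total is approximated from single-fragment energies plus pairwise corrections.

```python
from itertools import combinations
import math


def toy_energy(atoms):
    """Hypothetical stand-in for an expensive electronic structure calculation.

    `atoms` is a list of (x, y, z) tuples; the "energy" is the sum of 1/r
    over all atom pairs.
    """
    return sum(1.0 / math.dist(a, b) for a, b in combinations(atoms, 2))


def fragment_energy(fragments, energy=toy_energy):
    """Two-body expansion: E ~ sum_I E_I + sum_{I<J} (E_IJ - E_I - E_J).

    Every call to `energy` touches only one or two fragments, so the calls
    are independent of each other and could be farmed out in parallel.
    """
    monomer = [energy(f) for f in fragments]
    total = sum(monomer)
    for i, j in combinations(range(len(fragments)), 2):
        dimer = energy(fragments[i] + fragments[j])
        total += dimer - monomer[i] - monomer[j]
    return total


# Three made-up fragments of two atoms each (coordinates are arbitrary).
fragments = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)],
    [(0.0, 0.0, 4.0), (1.0, 0.0, 4.0)],
]
all_atoms = [atom for frag in fragments for atom in frag]

print(toy_energy(all_atoms), fragment_energy(fragments))
```

Because the toy energy is strictly pairwise, the two-body expansion reproduces the full-system value here. With a real quantum mechanical energy the expansion is an approximation. The point of the sketch is only that the work decomposes into many small, independent calculations.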
A second big milestone is that we have a program associated with GAMESS called Libcchem, which uses a more modern computing language. We’ve developed several new functionalities for the program that have now been released to the public, which is very exciting. As of September 30th, 2017, we have a new release of both GAMESS and Libcchem that’s available to anybody who wants to download them. As of today [November 14, 2017]—this is new, and I haven’t talked about it—GAMESS is now the only electronic structure code available on containers with NVIDIA. So we’re the first ones that are into containers. We’re really excited about that.
What new collaborations have been created as a result of this effort?
Several have been created. QMCPack is not new but was part of our proposal and is a very important one. Making an interface between two very complex programs like GAMESS and QMCPack is a very nontrivial exercise. In terms of new collaborations, we’ve been talking very fruitfully with the people from the Data Transfer Kit (DTK), who are at Oak Ridge. We are also talking with the people from Swift, who are at Argonne. They are also part of the ECP, and they’ve been very helpful in the discussions about how to make this interface work really smoothly. Those two are the main new collaborations.
So you have new relationships that have been established because of working under the ECP umbrella?
Yes, it would have never happened without the ECP.
Has your research taken advantage of any of the ECP’s allocations of computer time?
Yes, absolutely. We have an allocation, and it has been used almost entirely for test runs of the GAMESS–QMCPack interface. We’ve used about 10 percent of our allocation. We’ll probably use it all well before the end of the year—it’s helpful.
What’s the next big thing that’s coming up for your project?
A major part of our portion of the ECP is to evaluate the ability of programs like GAMESS to take advantage of novel architectures. Consequently, we have a collaboration with Shirley Moore and Jeff Vetter at Oak Ridge in which we already have a prototype of GAMESS running on FPGAs. This is a first. As far as we know, there’s been no electronic structure code on FPGAs before this. We also have a collaboration with the people at EP Analytics in San Diego, who are very interested in novel architectures. With them we have been looking at ARM architectures. We already have a paper with them on running GAMESS on ARM, and we’re planning to look at the new ThunderX2, which is the latest ARM system. These collaborations are all part of the GAMESS ECP project with Masha Sosonkina at Old Dominion University, and this is a very important component. Most users don’t realize that one of the major bottlenecks of exascale computing right now is the power draw. For several years, we’ve been very interested in assessing the tradeoff between time to solution and power consumption, and that’s being done both with Masha at Old Dominion University and with Alistair Rendell, who’s at Australian National University. We have a couple of papers with both of them showing the tradeoffs. As every new architecture comes around, we immediately pounce and try to get GAMESS running on it to study the tradeoff between time to solution and power consumption. This is a really big deal when the cost of the power consumed per year is more than the cost of the exascale computer itself, so we’re very excited about our evaluations in this area.
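Editor's illustration: the tradeoff between time to solution and power consumption mentioned above comes down to simple arithmetic, since energy is average power multiplied by run time. The sketch below is not taken from the GAMESS papers. The function names are hypothetical and the platform numbers are invented purely to show the calculation. It also reports the energy-delay product, a common combined metric, as one possible way to weigh the two quantities.

```python
def energy_to_solution(avg_power_watts, time_seconds):
    """Energy in joules consumed by one run: average power times run time."""
    return avg_power_watts * time_seconds


def compare(runs):
    """Rank runs by energy to solution.

    `runs` maps a platform name to (average power in watts, time in seconds).
    Each result row is (name, time, energy, energy-delay product).
    """
    rows = []
    for name, (power, seconds) in runs.items():
        energy = energy_to_solution(power, seconds)
        rows.append((name, seconds, energy, energy * seconds))
    return sorted(rows, key=lambda row: row[2])


# Invented numbers, for illustration only: a fast, power-hungry platform
# and a slower, low-power one running the same hypothetical job.
runs = {
    "platform_a": (300.0, 100.0),
    "platform_b": (120.0, 200.0),
}

results = compare(runs)
for name, seconds, energy, edp in results:
    print(f"{name}: {seconds:.0f} s, {energy:.0f} J, EDP {edp:.0f} J*s")
```

With these made-up inputs the slower platform uses less energy per run, while the faster one has the lower energy-delay product, so the preferred platform depends on which metric matters more for a given facility.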
Is there anything about the research you would like to express that we haven’t covered here?
Heterogeneous catalysis is the motivating application for everything we’re doing. For people who are nonexperts: everybody who has a car has a catalytic converter, and that’s heterogeneous catalysis. Without research on how to do heterogeneous catalysis, your car would not have that catalytic converter. At Ames Lab, we have experimental groups that synthesize and analyze heterogeneous catalysts. Mesoporous silica nanoparticles are really big particles of silica, which simply means sand. We can think of silica as glass, but you can make it in a form that will allow the placement of different kinds of molecules inside the pores that will accelerate the reactions. The catalysis is heterogeneous because the molecules that are reacting are in the gas phase, while the catalyst is a solid, just like the catalytic converter in your car. We have a beautiful collaboration between the experimental groups at Ames Lab and our theoretical group to simulate what’s going on inside those nanoparticles. The goal is to be able to design new and better versions of the catalyst, which is a really big deal. You have to include so many atoms in these simulations that they will not happen without exascale. The Exascale Computing Project is a massive collaborative effort that has been and will continue to be instrumental in bringing all of this together.