Maintaining the Nation’s Power Grid with Exascale Computing

Code addresses new challenges and optimizes the grid to react to disruptions

By Lawrence Bernard

Christopher S. Oehmen of Pacific Northwest National Laboratory is principal investigator of ECP’s ExaSGD subproject. Image: Andrea Starr/Pacific Northwest National Laboratory

Christopher S. Oehmen is a biomedical engineer doing critical research on how to protect the nation’s power grid.

“It turns out that the power grid is like a biological system. They are both very irregular with control systems that are very sparse,” Oehmen says. “A molecular switch turns one thing on, and then other things happen. Understanding that is useful for many fields, including power.”

Oehmen is principal investigator of an Exascale Computing Project (ECP) effort called ExaSGD, in which researchers at five US Department of Energy (DOE) national laboratories develop algorithms and techniques to address these new challenges and optimize the grid’s response to potential disruptions under a variety of weather scenarios.

The layers of complexity are so great that proper modeling and analysis require the computational capability found only on exascale computing platforms.

Increasing Complexity Presents Challenges

The nation’s power grid is one of the most complicated systems ever built, and it continues to grow in both scope and demand. Protecting this system from failures is a critical mission of DOE. Oehmen, a data scientist and group leader in computational biology at DOE’s Pacific Northwest National Laboratory (PNNL), uses models and data analytics to understand grid failures and look for ways to prevent them. The only way to do that effectively, he says, is to use the latest generation of leadership-class computers, including Oak Ridge National Laboratory’s (ORNL’s) Frontier exascale system.

“More complexity in the system makes planning more challenging,” Oehmen said. “If you have a hurricane, you’re going to lose a bunch of pieces of the grid, or maybe a whole substation, in the path of the hurricane. It could break more than one part simultaneously. How do we protect the grid when that happens? The larger the next generation computer systems, the more contingencies we can look at, and we can do so more quickly.”
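The scale Oehmen describes can be made concrete: the number of multi-failure scenarios to screen grows combinatorially with the number of components considered. The sketch below uses a made-up component count for illustration; the figures are not from ExaSGD.

```python
from math import comb

# Hypothetical component count for illustration; real interconnection
# models are far larger and more detailed.
components = 10_000

# An N-k contingency screen asks: what if any k of these fail at once?
for k in (1, 2, 3):
    print(f"N-{k}: {comb(components, k):,} scenarios")
```

Even at this invented size, moving from single failures to simultaneous two- and three-component failures pushes the workload from thousands of cases into the hundreds of billions, which is where leadership-class machines come in.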

ORNL’s Frontier, which is the world’s fastest supercomputer and the first to reach exascale, can calculate more than 1 quintillion operations per second and is 10 times more powerful than Summit, ORNL’s previous top-ranked supercomputer. Frontier will enable scientists to develop critically needed technologies for the country’s energy, economic, and national security interests by helping researchers address problems of national importance that were previously intractable on smaller and less powerful machines.

Renewable Energy Sources Add Complexity

Renewable energy sources are becoming more common, which makes running the grid even more complicated. Similar to an ocean liner attempting an abrupt turn, the power grid is slow to adapt to new directions, or in this case, new sources of power generation. Traditionally, the grid has been powered by very large systems, such as hydroelectric dams and coal-burning power plants, whose output changes slowly. Renewable sources, such as wind and solar farms, can change their output across a large area in a matter of minutes. Legacy software used to predict grid behavior was not built with these rapid changes in mind.

“The ability to speed up or ramp down generation by traditional means is limited. It takes a really long time to turn off if you wanted to. With these additional technologies, you now have to think about sudden changes in how you operate the power grid,” Oehmen said.

Another factor that adds complexity is the increased use of electric vehicles, which are charged using the power grid. As these vehicles become more common, their interaction with the power grid becomes more complex because these batteries can either consume power (when charging) or provide it (when temporarily discharging).

New Methods Needed

Electrical engineer working on a laptop at a transmission tower

Credit: Getty Images

Grid operators currently use simplified models due to computational limitations, Oehmen explained. In normal conditions, a single workstation or laptop computer is enough to run a code that can verify that—based on years of past experience available in the model—a segment of the power grid won’t fail after losing a given component. However, emerging renewable technologies mean these traditional methods may no longer provide a complete picture.
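A toy version of the kind of check Oehmen describes, using a hypothetical five-line network invented for illustration, removes one line at a time and verifies that the remaining grid stays connected:

```python
from collections import deque

# Invented buses and transmission lines; not real grid data.
lines = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("B", "D")]
nodes = {n for line in lines for n in line}

def is_connected(edges):
    """Breadth-first search: can every bus still reach every other bus?"""
    adj = {n: [] for n in nodes}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    start = next(iter(nodes))
    seen, queue = {start}, deque([start])
    while queue:
        for nbr in adj[queue.popleft()]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen == nodes

# N-1 screen: drop each line in turn and check the rest of the grid.
for i, outage in enumerate(lines):
    remaining = lines[:i] + lines[i + 1:]
    status = "OK" if is_connected(remaining) else "ISLANDED"
    print(f"lose {outage}: {status}")
```

Real contingency analysis solves power-flow equations rather than mere connectivity, but the enumerate-and-check shape of the computation is the same.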

To run at exascale, the ExaSGD team had to overcome a major challenge that stems from the nature of the power grid itself. The grid is both irregular and sparse, and these features create challenges for graphics processing units (GPUs), which favor denser mathematics and are at the heart of leadership-class supercomputers. The irregularity comes from unevenly distributed components: some parts of the grid have many elements, and some have only a few. The sparseness means that even the most central components are connected to only a few other components. As a result, power grid models are most often uneven mathematical structures that contain many zeroes.
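A small sketch of that structure, using an invented eight-bus network, shows how few nonzeros a grid-style coupling matrix actually has:

```python
# Invented eight-bus network: mostly a chain, with one cross-connection.
n = 8
connections = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (1, 5)]

# Dense representation: n*n entries, almost all of them zero.
dense = [[0.0] * n for _ in range(n)]
for a, b in connections:
    dense[a][b] = dense[b][a] = 1.0  # placeholder coupling value

zeros = sum(row.count(0.0) for row in dense)
print(f"{zeros}/{n * n} entries are zero ({100 * zeros // (n * n)}% sparse)")

# A sparse (coordinate) representation stores only the nonzeros.
coo = [(a, b, 1.0) for a, b in connections] + \
      [(b, a, 1.0) for a, b in connections]
print(f"sparse storage: {len(coo)} entries instead of {n * n}")
```

Dense storage and dense arithmetic spend most of their effort on those zeros, and the imbalance only grows with grid size, since each new bus adds a whole row and column of mostly zero entries.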

Team Solves Linear Solver Issue

Understanding how power grids work at this scale requires a process called nonlinear optimization. In this process, researchers seek an optimal balance between power generation and load under complex constraints.

Think of it like trying to find the peak of a mountain using only the information at your feet. First, determine which way is uphill. Then take a measured step forward and repeat the process—correcting the direction along the way—until arriving at the peak. At the center of this process is a simplifying step that uses a linear solver that must execute efficiently on GPUs to properly leverage the computing power of the exascale era.
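The mountain analogy maps directly onto gradient ascent. A minimal sketch, using an invented concave surface f(x, y) = -(x - 3)^2 - (y + 1)^2 whose peak sits at (3, -1), looks like:

```python
# "Which way is uphill at our feet": the partial derivatives of f.
def grad(x, y):
    return -2 * (x - 3), -2 * (y + 1)

# Start somewhere on the mountainside and take measured steps uphill.
x, y, step = 0.0, 0.0, 0.1
for _ in range(200):
    gx, gy = grad(x, y)
    x, y = x + step * gx, y + step * gy

print(f"estimated peak: ({x:.3f}, {y:.3f})")  # converges toward (3, -1)
```

Grid optimization involves vastly more variables plus hard constraints, so production codes use interior-point methods rather than plain ascent, but each iteration still reduces to a linear solve of the kind the text describes.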

When the ExaSGD team began its effort, no suitable sparse linear solvers ran well on GPUs. The first approach was to use dense solvers, but the irregularity of the grid limited their usefulness. Instead, the team developed efficient sparse linear solver options, both in-house and in partnership with other ECP teams.
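To illustrate why structure-aware solvers pay off, the sketch below solves a tridiagonal system with the classic Thomas algorithm, which touches only the three nonzero diagonals and runs in O(n) time instead of the O(n^3) of dense elimination. The matrix and right-hand side are a textbook example, not grid data.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system (a = sub-, b = main, c = super-diagonal)."""
    n = len(b)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):  # forward sweep: eliminate the sub-diagonal
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Classic 4x4 test case: main diagonal 2, off-diagonals -1, rhs of ones.
n = 4
a = [0.0] + [-1.0] * (n - 1)   # sub-diagonal (a[0] unused)
b = [2.0] * n                  # main diagonal
c = [-1.0] * (n - 1) + [0.0]   # super-diagonal (c[-1] unused)
d = [1.0] * n                  # right-hand side
print(thomas(a, b, c, d))      # exact solution is [2, 3, 3, 2]
```

Exascale-grade sparse solvers are far more sophisticated, handling general sparsity patterns on GPUs, but the principle is the same: do arithmetic only where the matrix has nonzeros.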

The team demonstrated that it could fit larger grid models onto each GPU (e.g., up to the size of the Western Interconnection), and the models also ran much faster, which is essential for completing calculations within the time constraints of operating a power grid, so that problems could potentially be solved in real time.

Solutions Also Work at Small Scale

“You’re able to get larger problems on the GPUs. So, we overcame both those challenges,” Oehmen said. “We’re solving big problems at exascale, but as we do that, we are coming up with solutions that also work at smaller scales.”

Hence, a grid operator responsible for just a small piece of the overall grid can solve a more complicated model using commodity-sized systems.

“We get great benefit even at the small scale. Now they can see the whole picture. They can see the top of their mountain, even with all the complexity of new grid technologies,” Oehmen said. The result is increased reliability and safety of the grid at a lower cost.

In addition to PNNL and ORNL, the other DOE national laboratories contributing to this research are Lawrence Livermore National Laboratory, the National Renewable Energy Laboratory, and Argonne National Laboratory.

Oehmen has a bachelor’s degree in physics and mathematics from Saint Louis University and a master’s and a PhD in biomedical engineering from the Joint Graduate Program in Biomedical Engineering at the University of Memphis and the University of Tennessee Health Science Center.

Originally from Tennessee, Oehmen has been at PNNL for almost 20 years. He now loves the inland Pacific Northwest. He, his wife, and two children enjoy hiking, biking, camping, and assorted other outdoor activities. “I can get to mountains, forests, sand dunes, ocean, rivers—anything—within a short time. And I don’t have to go far to be nowhere with no one else around.”

This research is part of the DOE-led Exascale Computing Initiative, a partnership between DOE’s Office of Science and the National Nuclear Security Administration. The Exascale Computing Project, launched in 2016, brings together research, development, and deployment activities as part of a capable exascale computing ecosystem to ensure an enduring exascale computing capability for the nation.