I’m Kathryn Mohror from Lawrence Livermore National Laboratory, and I work in the area of I/O [input/output] for high-performance computing systems. And what I’m really passionate about is figuring out how to make I/O and data management easier and faster for application users.
I work on the ExaIO project and I lead the development of the UnifyFS file system. And these efforts are funded by the Exascale Computing Project [ECP]. In this role, what I think about a lot is what the next generation of applications that run on exascale platforms are going to need from data management software.
So, what’s really exciting to me is that we’re at a tipping point where applications won’t be able to manage their data in the way that they’ve done for decades.
And the reason we’re at a tipping point is that compute is getting so much faster with the addition of accelerators into HPC systems. At exascale, the applications will be generating more and more data, and if we continue to manage data in the traditional way, we’ll just be spending huge amounts of time in I/O, and, quite honestly, we’ll run out of storage space.
So, more and more, what we see is the community realizing that we need to find a new normal for managing and analyzing all this data. The trend is that people are moving towards workflows where traditional HPC simulations are coupled with analysis tasks in a single job.
For example, you might see a traditional simulation output data that is immediately consumed by an analysis task that post-processes the data. Perhaps it generates a movie, or the data could be input to train a machine learning model.
This new paradigm of coupling simulations and analysis tasks is very exciting to me because it means that there’s a great need for I/O middleware to help scientists manage their data efficiently and to avoid using the parallel file system for sharing the data, which can really slow things down.
This is where tools like UnifyFS come in. UnifyFS enables applications to use the burst buffers on systems like Summit today or the Frontier exascale platform just as easily as they would use the parallel file system, but it’s much, much faster because UnifyFS stores the data on the compute nodes where the job is running.
There are quite a few efforts in ECP and DOE [Department of Energy], in general, aimed at helping users with their I/O and data management, and it’s a pretty exciting time. Every day, HPC users are coming up with new ideas on how they will combine their simulation and analysis tasks. And that’s a challenge for us I/O researchers to come up with good solutions to help them.