Simplifying the Deployment of High-Performance Computing Tools and Libraries

By Scott Gibson

The Extreme-scale Scientific Software Stack (E4S) is a community effort to provide open-source software packages for developing, deploying, and running scientific applications on high-performance computing (HPC) platforms.

Sameer Shende, University of Oregon

Sameer Shende of the University of Oregon helps develop E4S under the US Department of Energy’s Exascale Computing Project (ECP) Software Technology (ST) Programming Models and Runtimes software development kit area. He also contributes to the TAU performance system.

Shende joined ECP’s podcast, Let’s Talk Exascale, for an interview at SC19, the International Conference for High Performance Computing, Networking, Storage, and Analysis, held in Denver in November 2019.

E4S consists of a set of ST products combined in a container distribution as well as what’s called a bare-metal installable software framework. It’s built on the Spack package manager, and users can download the container-based distribution of E4S and use it with a number of different HPC container runtimes.

The runtimes include Docker, Singularity, Shifter, and Charliecloud. The E4S team also provides build caches, which contain binary distributions of these tools that users can draw on to create their own containers efficiently.

“You can pretty much cherry-pick the packages that you want using base images provided by E4S,” Shende said. “And you can install the tools directly on HPC systems using Spack and our E4S build caches. This provides users with a versatile platform to easily deploy the software that’s being developed by ECP.”

E4S is the delivery platform for ECP-sponsored software technology products and other products. The Spack platform provides over 3,000 packages.

“We create a build cache, which contains binaries of these packages on specific platforms,” Shende said. “We are providing this for the x86_64 platform under Linux as well as ppc64le on IBM and the aarch64 or ARM64 platforms with support for GPUs. Scientific software, which has been traditionally very difficult to install, can be easily installed on their target HPC systems, and users can create their own containers and run these along with our full-stack image.”
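As a rough illustration of the platform coverage Shende describes, the following minimal Python sketch checks whether a machine's CPU architecture is one of the three named in the quote. The function names and the alias table are hypothetical and are not part of E4S or Spack; the sketch looks only at the architecture string and does not confirm the operating system, GPU support, or the contents of any actual build cache.

```python
import platform

# Architectures named in the article as covered by the E4S build cache.
SUPPORTED_ARCHES = {"x86_64", "ppc64le", "aarch64"}

# Alternative spellings some systems report for the same hardware
# (an assumption made for this sketch, not an E4S convention).
ALIASES = {"amd64": "x86_64", "arm64": "aarch64"}


def normalize_arch(machine: str) -> str:
    """Lower-case the architecture string and map known aliases."""
    name = machine.strip().lower()
    return ALIASES.get(name, name)


def has_prebuilt_binaries(machine: str) -> bool:
    """Return True if the architecture is one of the three named above."""
    return normalize_arch(machine) in SUPPORTED_ARCHES


if __name__ == "__main__":
    host = platform.machine()
    if has_prebuilt_binaries(host):
        print(f"{host}: architecture is among those named for the build cache")
    else:
        print(f"{host}: not among the named architectures; expect source builds")
```

On an architecture outside this list, the packages would presumably have to be built from source with Spack rather than installed from prebuilt binaries.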

Merging all of the packages that must work together, and doing so in a coordinated fashion, is a challenge for the E4S team. “It requires a lot of work in getting the packages to build correctly and to integrate well in this environment,” Shende said.

Another challenge is to ensure the packages run efficiently. “If we have an MPI runtime on which all the packages are built consistently with a single set of compilers and MPI packages, then that container should be deployed well on the HPC target platforms, and this includes substituting the MPI that is in the application with the MPI that is available on the system so that it can use the high-speed internode connections and the network efficiently with the system MPI by swapping the MPI,” Shende said. “So, you can pretty much develop the application using our container on your laptop and take the same binary to an HPC system where our E4S container is available and deploy it. The libraries and tools remain inside the container while the application and data are external to the container.”
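The split Shende describes, with libraries and tools inside the container and the application and data outside it, can be sketched as a launch command. The minimal Python example below only assembles an argument list and runs nothing. It assumes Singularity as the runtime and a host-side `mpirun` as the launcher; the image name, paths, and the `container_command` helper are hypothetical. It does not show the MPI substitution itself, which depends on how a given site exposes its system MPI to the container.

```python
import shlex
from pathlib import PurePosixPath


def container_command(image, app, data_dir, ranks=1, launcher="mpirun"):
    """Build an argv list that runs a host-side binary inside a container.

    The application and its data live on the host, so their directories
    are bind-mounted into the container at the same paths.
    """
    app_path = PurePosixPath(app)
    # De-duplicate and sort so the command is stable across runs.
    binds = sorted({str(app_path.parent), str(PurePosixPath(data_dir))})

    cmd = [launcher, "-n", str(ranks), "singularity", "exec"]
    for path in binds:
        cmd += ["--bind", path]
    cmd += [image, str(app_path)]
    return cmd


if __name__ == "__main__":
    # All names below are placeholders chosen for illustration.
    argv = container_command(
        image="e4s.sif",
        app="/home/user/build/app",
        data_dir="/scratch/run1",
        ranks=4,
    )
    print(shlex.join(argv))
```

Because the helper returns a list rather than a shell string, it can be passed to a process launcher without extra quoting, and the same call works unchanged on a laptop or an HPC login node as long as the image and paths exist there.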

A number of tutorials at SC19 used the latest E4S release (version 1.0). It consists of 50 different ST products that work well together and test suites that validate the software stack. “We have integrated this platform,” Shende said. “We have deployed it on a number of DOE HPC systems already, and we are expanding this portfolio as we target other platforms. We hope to see this contribute significantly to the problems of reproducibility of software artifacts. People could use our base containers and create their own containers.”

Container technology is promising because it enables the user to take an existing set of libraries and tools, account for the dependencies of a particular software product, and deploy the software efficiently. “And there’s only one kernel that’s running when a container is deployed, unlike other virtualization approaches. So it’s very efficient,” Shende said.

Complex dependencies can obstruct new users. E4S is poised to leave a legacy of lowering the barrier to using HPC software on exascale and extreme-scale systems. Shende said E4S will simplify the installation of scientific software in general, and people will realize its value whether they’re installing software on a workstation or on an HPC system. They can use a container-based deployment, either with the full E4S container or with their own customized version, applying recipes from the E4S project and E4S-developed Spack build caches to quickly create container distributions and deploy their software.

Related Link

Sameer Shende is also featured in episode 33 of the podcast speaking about the Tuning and Analysis Utilities (TAU) performance evaluation tool.