Software Deployment at the Facilities

Lead: Ryan Prout, Oak Ridge National Laboratory

Through close partnership with ECP code teams, DOE HPC Facilities, and vendors, the Software Deployment team deploys and integrates an exascale software stack and provides a software integration and testing capability at the Facilities. This capability supports continuous integration with site environments, including container technologies and software development kits.

Description: The deployment and integration of ECP software, vendor-provided software, and facility-based software environments to establish supported software stacks that meet application needs and provide for optimized facility operation.

The overarching goal of the Software Deployment (SD) effort is to deploy and integrate an exascale software stack and to deploy a software integration and testing capability at the Facilities to support continuous integration with site environments, including container technologies and software development kits (SDKs). This is possible only through close partnerships with the DOE HPC Facilities, Application Development (AD) and Software Technology (ST) projects, and vendors, with all parties sharing a common vision for an exascale software ecosystem. This project has the following major deliverables:

  • deploy and operate a continuous (automated) integration and testing capability,
  • develop and implement a software deployment and integration pipeline using SDKs, Spack, and a build infrastructure,
  • characterize and understand Facility software environments, and
  • develop a map of ECP applications and software products to Facility sites.

Success in this effort requires an understanding of the Facilities’ software environments and of the expected target applications, as well as the development of well-understood interfaces among vendors and software providers. It also requires coordination among all parties responsible for the hardware and software environment. An expansive testing infrastructure and integration process is paramount to a healthy exascale software ecosystem.

The scope of the SD continuous testing infrastructure is to develop and support the deployment of a process and framework for continuous testing and integration of software at the Facilities, in support of ECP software products, applications, and site environments. This includes implementing a continuous testing framework enhanced with the security and HPC features specified by the Facilities and ECP.

The Hardware and Integration (HI) and ST focus areas have a complementary arrangement in which ST delivers its products and HI deploys them.

  • ST delivers software to HI (and others) by designing and implementing it (with HI input) to run on the Facilities’ platforms and making it available as source code via GitHub, GitLab, or another accessible repository.
  • HI deploys ST (and other) software on Facilities’ platforms through enhanced build and deployment processes (with support from ST) that take advantage of the testing infrastructure and ST-based product packaging and build tools. Leveraging knowledge and best practices across sites is also a key focus.

Separating the concerns of delivery and deployment is essential because these activities require different skill sets, and because each Facility has unique needs, including those related to the HPC systems at its site.

Lead: Ryan Adamson, Oak Ridge National Laboratory

 
