Operating systems manage the resources on the nodes of an exascale system and provide essential functionality, such as memory allocation and process creation, to libraries and applications.
Many exascale applications have a complex runtime structure, ranging from in situ data analysis, through ensembles of largely independent sub-jobs, to arbitrarily complex workflow structures. To meet the emerging needs of exascale workloads while providing optimal performance and resilience, compute, memory, and interconnect resources must be managed in cooperation with applications, libraries, and runtime systems. Argo’s goal is to augment and optimize low-level system software components for use in production exascale systems, providing portable, open-source, integrated software that improves the performance and scalability of exascale applications, libraries, and runtime systems and offers them increased functionality. The project focuses on resource management, memory management, and power management.
The Argo team is delivering resource management infrastructure to coordinate the static allocation and dynamic management of node resources, such as processor cores, memory, and caches. The infrastructure supports multiple resource management policies suitable for a variety of application workloads. By taking care of system-specific aspects, such as topology mapping and the partitioning of massively parallel resources, it will improve the performance and portability of exascale applications, libraries, and their runtimes.
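As an illustration of the system-specific work this infrastructure takes off the application’s hands, the sketch below uses the hwloc library (a common topology-discovery tool, not part of Argo itself) to discover a node’s cores and bind the current process to one of them; Argo’s own interfaces are not shown here.

```c
/* Minimal sketch of topology-aware partitioning, the kind of
 * system-specific work Argo's resource management infrastructure
 * automates. Uses hwloc directly for illustration; this is not
 * Argo's API. Build: gcc example.c -lhwloc */
#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    /* Discover how many cores this node exposes. */
    int ncores = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE);
    printf("Detected %d cores\n", ncores);

    /* Bind the current process to the first core, a tiny example of
     * the partitioning decisions a node resource manager makes. */
    hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, 0);
    if (core != NULL)
        hwloc_set_cpubind(topology, core->cpuset, HWLOC_CPUBIND_PROCESS);

    hwloc_topology_destroy(topology);
    return 0;
}
```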
Memory management libraries are being developed to provide flexible and portable memory management mechanisms that make it easier to obtain high performance. One approach incorporates nonvolatile memory into complex memory hierarchies by using a memory map; another provides explicit, application-aware memory management for deep memory systems. These libraries will directly support new applications that analyze large, distributed datasets and make it easier to program heterogeneous hardware resources.
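The memory-map approach can be sketched with plain POSIX mmap: a file residing on an NVM-backed filesystem is mapped into the address space and accessed with ordinary loads and stores while the operating system pages data in and out on demand. The sketch below shows only this underlying mechanism, not the Argo libraries’ own APIs, and the file path is hypothetical.

```c
/* Minimal sketch of the memory-map approach to incorporating
 * nonvolatile memory: map a file on an NVM-backed filesystem into
 * the address space and access it with ordinary loads and stores.
 * Plain POSIX mmap for illustration; the Argo libraries add paging
 * policies and performance tuning on top of this basic mechanism. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const size_t len = 1 << 20;                     /* 1 MiB region */
    int fd = open("/mnt/nvm/dataset.bin", O_RDWR);  /* hypothetical NVM path */
    if (fd < 0) { perror("open"); return 1; }

    /* Map the file; the OS pages NVM contents in and out on demand. */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    buf[0] = 42;               /* ordinary store, persisted via the map */
    msync(buf, len, MS_SYNC);  /* flush dirty pages back to the NVM file */

    munmap(buf, len);
    close(fd);
    return 0;
}
```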
The Argo team is also delivering a fully integrated, end-to-end infrastructure for power and performance management, including power-aware plugins for resource managers, workflow managers, and job-level runtimes, along with a vendor-neutral power control library. This infrastructure directly addresses the challenge of managing the performance of exascale applications on highly power-constrained systems.
An HPC system’s power, performance, and scientific throughput are affected by configurations and requirements at multiple levels of the system, including the job scheduler, the runtime system, and the tuning of the application itself. Each level presents different challenges and opportunities for optimization.
The Exascale Computing Project (ECP) PowerStack effort was motivated by the needs of supercomputer users who are unfamiliar with low-level architectural details, which differ across vendors; moreover, some power and performance dials require elevated privileges to access. As a result, using these complex, vendor-specific power and performance optimization features can be chaotic, unwieldy, and error prone from the users’ perspective.
PowerStack developed hierarchical interfaces for power management at three specific levels: batch job schedulers, job-level runtime systems, and node-level managers. Each level in PowerStack provides options for adaptive and dynamic power management depending on the requirements of the supercomputing site under consideration.
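As a rough illustration of this hierarchy, the toy sketch below subdivides a system-wide power cap first among jobs and then among a job’s nodes. The even-split policy and all names are hypothetical; real PowerStack components negotiate budgets dynamically according to site requirements.

```c
/* Toy illustration of hierarchical power budgeting across
 * PowerStack's three levels. The even-split policy and all names
 * are hypothetical; real implementations negotiate dynamically. */
#include <stdio.h>

/* Scheduler level: divide a system-wide cap among concurrent jobs. */
static double job_budget(double system_cap_w, int njobs) {
    return system_cap_w / njobs;
}

/* Job-level runtime: divide a job's budget among its nodes. */
static double node_budget(double job_cap_w, int nnodes) {
    return job_cap_w / nnodes;
}

int main(void) {
    double system_cap = 4.0e6;  /* hypothetical 4 MW system-wide cap */
    double job_cap  = job_budget(system_cap, 40);
    double node_cap = node_budget(job_cap, 256);

    /* Node-level manager: the per-node cap would be enforced through
     * a library such as Variorum (see the example further below). */
    printf("job cap: %.0f W, node cap: %.0f W\n", job_cap, node_cap);
    return 0;
}
```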
PowerStack relies on Variorum—an extensible, open-source, vendor-neutral library for exposing power and performance capabilities of low-level hardware dials across diverse architectures in a user-friendly manner.
Variorum provides vendor-neutral APIs with which users can query or control hardware dials without needing to know the underlying vendor’s implementation (e.g., model-specific registers or low-level sensor and GPU interfaces). These APIs permit integration with higher-level system software such as schedulers and runtime systems; the HPC PowerStack Initiative uses them to manage the power, energy, and performance of diverse HPC architectures.
The APIs give application developers a better understanding of power, energy, and performance through various metrics. They also enable system software to control hardware dials to optimize for a particular goal when integrated with a software stack comprising Flux, PowerAPI, Caliper, Kokkos, LDMS, and other tools.
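A minimal sketch of this usage pattern is shown below, first querying node power and then requesting a best-effort node power cap through Variorum. The function names follow Variorum’s public documentation, but exact names and signatures may vary across versions.

```c
/* Sketch of querying and capping node power through Variorum's
 * vendor-neutral API; no knowledge of model-specific registers or
 * GPU interfaces is required. Function names follow Variorum's
 * public documentation; check your installed version.
 * Build (roughly): gcc example.c -lvariorum */
#include <stdio.h>
#include <variorum.h>

int main(void)
{
    /* Query: print current power measurements for this node. */
    if (variorum_print_power() != 0) {
        fprintf(stderr, "Variorum power query failed\n");
        return 1;
    }

    /* Control: request a best-effort node power cap of 150 W
     * (typically requires elevated privileges). */
    if (variorum_cap_best_effort_node_power_limit(150) != 0) {
        fprintf(stderr, "Variorum power cap failed\n");
        return 1;
    }

    return 0;
}
```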
The HPC PowerStack community meets regularly to maintain user engagement; meeting details can be found on its website. HPC PowerStack also collaborates closely with related community efforts.
To support long-term sustainability, Variorum has been integrated into the Extreme-scale Scientific Software Stack (E4S) and is available as a Spack package. These and other Variorum integrations help grow both adoption and the user community.