Operating systems provide libraries and applications with essential functionality, such as memory allocation and process spawning, and manage the resources on the nodes of an exascale system. The Argo project is augmenting and optimizing existing operating system and runtime components and building portable, open-source system software that improves the performance and scalability of exascale libraries, applications, and runtime systems and offers them increased functionality, with a focus on resource management, memory management, and power management.
Many exascale applications have a complex runtime structure, ranging from in situ data analysis through an ensemble of largely independent individual sub-jobs to arbitrarily complex workflow structures. To meet the emerging needs of exascale workloads while providing optimal performance and resilience, the compute, memory, and interconnect resources must be managed in cooperation with applications, libraries, and runtime systems. Argo’s goal is to augment and optimize low-level system software components for use in production exascale systems, providing portable, open-source, integrated software that improves the performance and scalability of—and offers increased functionality to—exascale applications, libraries, and runtime systems. The project focuses on resource management, memory management, and power management.
The Argo team is delivering resource management infrastructure that coordinates the static allocation and dynamic management of node resources, such as processor cores, memory, and caches. The infrastructure supports multiple resource management policies suited to a variety of application workloads. By taking care of system-specific aspects, such as topology mapping and the partitioning of massively parallel resources, it will improve the performance and portability of exascale applications, libraries, and their runtimes.
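To make the idea of partitioning node resources concrete, the following is a minimal sketch of one possible static policy: dividing a node's cores into disjoint, contiguous slices in proportion to integer shares. It is an illustration only, not Argo's interface; the function name `partition_cores` and the workload names `simulation` and `analysis` are hypothetical.

```python
from typing import Dict, List


def partition_cores(cores: List[int], shares: Dict[str, int]) -> Dict[str, List[int]]:
    """Split `cores` into disjoint contiguous slices proportional to `shares`.

    Hypothetical helper for illustration; not part of any Argo component.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must sum to a positive value")

    result: Dict[str, List[int]] = {}
    start = 0
    cumulative = 0
    for name, share in shares.items():
        cumulative += share
        # Cumulative rounding guarantees the last slice ends at len(cores),
        # so every core is assigned to exactly one partition.
        end = len(cores) * cumulative // total
        result[name] = cores[start:end]
        start = end
    return result


if __name__ == "__main__":
    # Assumed 8-core node; three quarters for a simulation, one quarter
    # for an in situ analysis component.
    layout = partition_cores(list(range(8)), {"simulation": 3, "analysis": 1})
    print(layout)
```

A real policy would also have to account for topology (for example, keeping a partition within one NUMA domain) and would then bind processes to the chosen cores; on Linux that binding could be done with `os.sched_setaffinity`, which this sketch deliberately leaves out.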
Memory management libraries are being developed to provide flexible and portable memory management mechanisms that make it easier to obtain high performance. One approach incorporates nonvolatile memory into complex memory hierarchies by using a memory map; another provides explicit, application-aware memory management for deep memory systems. These libraries will directly support new applications that analyze large, distributed datasets and make it easier to program heterogeneous hardware resources.
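As a rough illustration of the memory-map approach in general, the sketch below maps a file into the address space with Python's standard `mmap` module, writes to it with ordinary slice assignment, and reads the data back through a fresh mapping. It does not use the Argo libraries; the function name `write_and_read_mapped` is hypothetical, and the assumption that the backing file sits on nonvolatile storage is not something the code checks.

```python
import mmap
import os
import tempfile


def write_and_read_mapped(path: str, payload: bytes, size: int = 4096) -> bytes:
    """Store `payload` through a memory mapping of `path`, then read it back."""
    if len(payload) > size:
        raise ValueError("payload does not fit in the mapped region")

    # Create and size the backing file. On a real system this file might
    # live on a nonvolatile-memory-backed filesystem (an assumption here).
    with open(path, "wb") as f:
        f.truncate(size)

    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), size) as region:
            region[0:len(payload)] = payload  # memory-style store
            region.flush()                    # write changes to the backing file

    # Map the file again to show the data outlived the first mapping.
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), size) as region:
            return bytes(region[0:len(payload)])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = write_and_read_mapped(os.path.join(tmp, "region.bin"), b"exascale")
        print(data)
```

The point of the pattern is that the application addresses the data as if it were ordinary memory, while the mapping layer decides when pages move between fast memory and the slower backing store.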
The Argo team is also providing a fully integrated, end-to-end infrastructure for power and performance management, including power-aware plugins for resource managers, workflow managers, job-level runtimes, and a vendor-neutral power control library. This infrastructure directly addresses the challenge of managing the performance of exascale applications on highly power-constrained systems.