Understanding the performance characteristics of exascale applications is necessary for identifying and addressing the barriers to achieving performance goals. This task becomes more difficult as architectures become more complex. The Performance Application Programming Interface (PAPI) provides library and application developers with generic and portable access to low-level performance counters found across the exascale machine, enabling users to see the relationships between software performance and hardware events. Understanding these relationships is a critical step toward improving performance.
The Exascale PAPI (Exa-PAPI) project is developing a new C++ PAPI (PAPI++) software package from the ground up. PAPI++ offers a standard interface and methodology for using low-level performance counters in CPUs, GPUs, on/off-chip memory, interconnects, and the I/O system, including energy/power management. It builds on classic PAPI functionality and strengthens its path to exascale with a more efficient and flexible software design, one that takes advantage of C++’s object-oriented nature while preserving the low-overhead monitoring of performance counters and adding an extensive test suite.
In addition to providing hardware counter-based information, the project is incorporating a standardizing layer for monitoring software-defined events (SDEs). This layer exposes the internal behavior of runtime systems and libraries, such as communication and math libraries, to applications. As a result, the notion of performance events is broadened from strictly hardware-related events to include software-based information.
Enabling the monitoring of hardware and software events provides more flexibility to developers when capturing performance information.
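To make the idea of a software-defined event more concrete, the following is a minimal C++ sketch of the general pattern: a library registers one of its internal counters under a name, and an application samples that counter around a region of interest. Every name here (`SdeRegistry`, `ToyCommLibrary`, `messages_sent_during`, and the event name `toycomm::messages_sent`) is hypothetical and chosen for illustration only. This is not the PAPI or PAPI++ interface, and it makes no claim about how either is implemented.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical registry of software-defined events (NOT the PAPI/PAPI++ API).
// Each event is a name mapped to a callable that returns the current value.
class SdeRegistry {
public:
    using Reader = std::function<std::int64_t()>;

    // Called by a library to expose one of its internal counters by name.
    void register_counter(const std::string& name, Reader reader) {
        assert(reader && "reader must be callable");
        readers_[name] = std::move(reader);
    }

    // Called by a tool or application to sample the current value.
    std::int64_t read(const std::string& name) const {
        auto it = readers_.find(name);
        if (it == readers_.end()) {
            throw std::out_of_range("unknown software-defined event: " + name);
        }
        return it->second();
    }

private:
    std::map<std::string, Reader> readers_;
};

// Hypothetical communication library that counts the messages it sends.
class ToyCommLibrary {
public:
    explicit ToyCommLibrary(SdeRegistry& registry) {
        // The reader captures `this`, so this object must outlive its
        // registration in the registry.
        registry.register_counter("toycomm::messages_sent", [this] {
            return messages_sent_.load(std::memory_order_relaxed);
        });
    }

    void send(const std::string& /*payload*/) {
        // Real communication would happen here; the sketch only counts.
        messages_sent_.fetch_add(1, std::memory_order_relaxed);
    }

private:
    std::atomic<std::int64_t> messages_sent_{0};
};

// Counts the messages sent while `region` runs: read, run, read, subtract.
std::int64_t messages_sent_during(SdeRegistry& registry,
                                  const std::function<void()>& region) {
    const std::int64_t before = registry.read("toycomm::messages_sent");
    region();
    return registry.read("toycomm::messages_sent") - before;
}
```

In this sketch, an application would construct one `SdeRegistry`, pass it to the library, and wrap the code it wants to measure in `messages_sent_during`. The read-before, read-after, subtract pattern mirrors how hardware counters are typically sampled around a code region, which is why software and hardware events can plausibly share one monitoring interface.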
In summary, the Exa-PAPI team is preparing PAPI support to solve the challenges posed by exascale systems by (1) widening its applicability and providing robust support for exascale hardware resources; (2) supporting finer-grain measurement and control of power, thus offering software developers a basic building block for dynamic application optimization under power constraints; (3) extending PAPI to support SDEs; and (4) applying semantic analysis to hardware counters so that application developers can better make sense of the ever-growing list of raw hardware performance events that can be measured during execution.
The team will channel the monitoring capabilities of hardware counters, power usage, and SDEs into a robust PAPI++ software package. PAPI++ is meant to replace PAPI, offering a more flexible and sustainable software design.