Computational science applications generate massive amounts of data from which scientists must extract information and visualize the results. Performing the visualization and analysis tasks in situ, while the simulation is running, can improve the use of computational resources and reduce the time that scientists must wait for their results. The ALPINE/ZFP project is delivering in situ visualization and analysis infrastructure and algorithms, including a data compression capability for floating-point arrays to reduce memory, communication, I/O, and offline storage costs.
Many high-performance simulation codes write data to disk so that it can be visualized and analyzed after the simulation is completed. Given exascale I/O bandwidth constraints, visualization and analysis must instead be performed in situ to fully use exascale resources. In situ data analysis and visualization selects, analyzes, reduces, and generates extracts from a scientific simulation while the simulation is running, which overcomes the bandwidth and storage bottlenecks associated with writing the full simulation results to the file system. The ALPINE/ZFP project produces in situ visualization and analysis infrastructure that will be used by exascale applications, along with a lossy compression capability for floating-point arrays.
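The in situ pattern described above can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not ALPINE, Ascent, or Catalyst code: `simulate_step`, `extract_histogram`, and `run_in_situ` are hypothetical names, and a histogram stands in for whatever extract a real analysis would produce. The point is only that the full field exists in memory for one step at a time, while just the small extract is retained.

```python
import math


def simulate_step(step, n=10_000):
    """Hypothetical stand-in for one simulation time step: returns a 1D field."""
    return [math.sin(0.001 * i + 0.1 * step) for i in range(n)]


def extract_histogram(field, bins=8, lo=-1.0, hi=1.0):
    """Reduce a full field to a small histogram (the in situ extract)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in field:
        # Clamp the bin index so values at the range edges land in a valid bin.
        idx = max(0, min(int((v - lo) / width), bins - 1))
        counts[idx] += 1
    return counts


def run_in_situ(steps=5):
    """Analyze each step while the 'simulation' runs; keep only the extracts."""
    extracts = []
    for step in range(steps):
        field = simulate_step(step)                # full data lives only in memory
        extracts.append(extract_histogram(field))  # a few integers per step are kept
    return extracts
```

In this sketch each step holds 10,000 values in memory but retains only 8 integers, which is the kind of reduction that avoids writing full results to the file system.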
The ALPINE development effort focuses on delivering exascale visualization and analysis algorithms that will be critical for exascale applications; developing an exascale-capable infrastructure for in situ algorithms and deploying it into existing applications, libraries, and tools; and engaging with exascale application teams to integrate ALPINE with their software. This capability will leverage existing, successful software (ParaView/Catalyst and VisIt) and a new lightweight infrastructure, Ascent. ALPINE capabilities will be integrated into these infrastructures for deployment in exascale science codes to address exascale challenges.
Overcoming the performance cost of data movement is also critical. With deepening memory hierarchies and dwindling per-core memory bandwidth due to increasing parallelism, even on-node data motion causes significant performance bottlenecks and is a primary source of power consumption. The ZFP software is a floating-point array primitive that mitigates this problem by using very high-speed, lossy (but optionally error-bounded) compression to significantly reduce data volumes and I/O times. The ZFP development effort focuses on (1) extending ZFP to make it more readily usable in an exascale computing setting by parallelizing it on both CPU and GPU while ensuring thread safety, (2) providing bindings for multiple programming languages, (3) adding new functionality, (4) hardening the software and adopting best practices for software development, and (5) integrating ZFP with a variety of exascale applications, I/O libraries, and software tools.
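To make the phrase "lossy but error-bounded" concrete, the following Python sketch shows one simple way to guarantee an absolute error bound: uniform quantization followed by lossless packing. This is explicitly not ZFP's algorithm or API; ZFP uses its own block-based scheme. The function names `compress_with_tolerance` and `decompress_with_tolerance` are hypothetical, and the sketch assumes every quantized value fits in a 32-bit signed integer.

```python
import math
import struct
import zlib


def compress_with_tolerance(values, tolerance):
    """Quantize to multiples of 2*tolerance, then pack and deflate the integers.

    Illustrative only: assumes each quantized value fits in a 32-bit integer.
    """
    step = 2.0 * tolerance
    quantized = [round(v / step) for v in values]
    raw = struct.pack(f"<{len(quantized)}i", *quantized)
    return zlib.compress(raw)


def decompress_with_tolerance(blob, tolerance):
    """Invert the packing; each value is recovered to within about `tolerance`."""
    raw = zlib.decompress(blob)
    count = len(raw) // 4  # 4 bytes per packed 32-bit integer
    quantized = struct.unpack(f"<{count}i", raw)
    step = 2.0 * tolerance
    return [q * step for q in quantized]


# Usage: a smooth signal compressed with an absolute error tolerance of 1e-3.
signal = [math.sin(0.01 * i) for i in range(4096)]
blob = compress_with_tolerance(signal, 1e-3)
restored = decompress_with_tolerance(blob, 1e-3)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
```

Rounding to the nearest multiple of `2 * tolerance` moves each value by at most `tolerance` (up to floating-point rounding), so the user trades a known, controlled loss of accuracy for a smaller representation than the original 8 bytes per double.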