The Exascale Computing Project has concluded. This site is retained for historical reference.

Data and Visualization

Lead: James Ahrens, Los Alamos National Laboratory

The ECP’s software portfolio has a large collection of data management and visualization products that provides essential capabilities for compressing, analyzing, moving, and managing data. These tools are becoming even more important as the volume of simulation data that is produced grows faster than the ability to capture and interpret it.

ADIOS

Objective: Support efficient I/O and code coupling services

Exascale architectures will have complex, heterogeneous memory hierarchies, ranging from node-level caches and main memory all the way to persistent storage via the file system, that applications need to use effectively to achieve their science goals. At the same time, exascale applications are becoming more complex in their data flows, from multiscale and multiphysics simulations that need to exchange data between separate codes to simulations that invoke data analysis and visualization services.

Principal Investigators: Scott Klasky, Oak Ridge National Laboratory

DataLib

Objective: Support efficient I/O, I/O monitoring and data services

Exascale applications generate massive amounts of data that need to be analyzed and stored to achieve their science goals. The speed at which the data can be written to the storage system is a critical factor in achieving these goals. As exascale architectures become more complex, with multiple compute nodes and accelerators and heterogeneous memory systems, the storage technologies must evolve to support these architectural features.

Principal Investigators: Rob Ross, Argonne National Laboratory

VTK-m

Objective: Provide VTK-based scientific visualization software that supports shared-memory parallelism

As exascale simulations generate data, scientists need to extract information and understand their results. One of the primary mechanisms for understanding these results is to produce visualizations that can be viewed and manipulated. The VTK-m project is developing and deploying scientific visualization software capable of efficiently using exascale architectural features, such as the shared-memory parallelism available on many-core CPUs and GPUs.

Principal Investigators: Ken Moreland, Oak Ridge National Laboratory

VeloC/SZ

Objective: Develop two software products: VeloC checkpoint restart and SZ lossy compression with strict error bounds

Long-running large-scale simulations and high-resolution, high-frequency instrument detectors are generating extremely large volumes of data at a high rate. While reliable scientific computing is routinely achieved at small scale, it becomes remarkably difficult at exascale because of both the big data challenge and the increased number of disruptions, such as component failures, that occur as machines become larger and more complex.

Principal Investigators: Franck Cappello, Argonne National Laboratory

ExaIO

Objective: Develop efficient HDF5 and UnifyFS parallel I/O libraries that are aware of system topology and the storage hierarchy

In pursuit of more accurate modeling of real-world systems, scientific applications at exascale will generate and analyze massive amounts of data. To complete their science missions, these applications must be able to access and manage these data efficiently on exascale systems. Parallel I/O, the key technology behind moving data between compute nodes and storage, faces monumental challenges from new application workflows as well as the memory.

Principal Investigators: John Wu, Lawrence Berkeley National Laboratory; Suren Byna, Ohio State University

Alpine/ZFP

Objective: Deliver in situ visualization and analysis algorithms, infrastructure and data reduction of floating-point arrays

Computational science applications generate massive amounts of data from which scientists need to extract information and visualize the results. Performing the visualization and analysis tasks in situ, while the simulation is running, can lead to improved use of computational resources and reduce the time the scientists must wait for their results.

Principal Investigators: Jim Ahrens, Los Alamos National Laboratory
