Subfiling and Multiple dataset APIs: An introduction to two new features in HDF5 version 1.14

When:
September 30, 2022 @ 12:00 pm – 1:00 pm (UTC-04:00)
Contact:
Scot Breitenfeld

For parallel I/O, Subfiling aims for a middle ground between a single shared file and one file per process: it avoids the complexity of managing one file per process while minimizing the locking issues that a single shared file incurs on a parallel file system. The first part of the talk will cover Subfiling’s implementation, its usage, and the performance benefits observed compared to a single shared file.

The second part of the talk will introduce the new HDF5 multiple dataset APIs and highlight the performance benefits of using them. Until now, an HDF5 data access operation could access only one dataset at a time, so accessing multiple datasets required the user to issue a separate I/O call for each dataset. The new multiple dataset APIs allow users to access multiple datasets with a single I/O call. The new routines can also improve performance, especially when data is accessed across several datasets from all processes.
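As a rough picture of the Subfiling middle ground described above, the sketch below maps a byte offset in one logical file onto a small set of subfiles. It assumes a simple round-robin striping scheme purely for illustration. The names `subfile_location` and `locate_in_subfiles`, and the stripe size and subfile count parameters, are hypothetical; they are not taken from the talk or from the HDF5 API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration only: where a byte of the logical file would land
 * if fixed-size stripes were dealt round-robin across the subfiles. */
typedef struct {
    uint64_t subfile_index;  /* which subfile holds the byte */
    uint64_t subfile_offset; /* byte offset inside that subfile */
} subfile_location;

subfile_location
locate_in_subfiles(uint64_t logical_offset, uint64_t stripe_size, uint64_t num_subfiles)
{
    /* Both parameters are assumed to be positive. */
    assert(stripe_size > 0 && num_subfiles > 0);

    /* Index of the stripe that contains the requested byte. */
    uint64_t stripe_number = logical_offset / stripe_size;

    subfile_location loc;
    /* Stripes are assigned to subfiles in round-robin order. */
    loc.subfile_index = stripe_number % num_subfiles;
    /* Complete stripes already stored in this subfile, plus the position
     * of the byte inside its own stripe. */
    loc.subfile_offset = (stripe_number / num_subfiles) * stripe_size
                         + logical_offset % stripe_size;
    return loc;
}
```

With 4-byte stripes and 3 subfiles, logical byte 13 falls in stripe 3, which wraps around to subfile 0 at offset 5. Under this assumed scheme, applications still see one logical file while the bytes are spread over a few physical files, far fewer than one per process.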

Presenters: Neil Fortner and Jordan Henderson

The webinar will be held on September 30, 2022.