Subfiling and Multiple dataset APIs: An introduction to two new features in HDF5 version 1.14

When: September 30, 2022 @ 12:00 pm – 1:00 pm (UTC-04:00)

Contact: Scot Breitenfeld

Training Event

For parallel I/O, Subfiling aims for a middle ground between a single shared file and one file per process: it avoids the complexity of managing one file per process while minimizing the locking issues that a single shared file incurs on a parallel file system. The first part of the talk will cover Subfiling’s implementation, its usage, and the performance benefits observed compared to a single shared file.

The second part of the talk will introduce the new HDF5 multiple dataset APIs and highlight their performance benefits. Previously, an HDF5 data access operation could access only one dataset at a time, so accessing multiple datasets required the user to issue a separate I/O call for each dataset. The new multiple dataset APIs allow users to access multiple datasets with a single I/O call. The new routines can also improve performance, especially when data is accessed across several datasets from all processes.
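
To make the "middle ground" idea concrete, here is a minimal sketch that compares how many files each of the three layouts produces for a given number of processes. It is not HDF5 code and does not use the Subfiling API. The function names and the `ranks_per_subfile` parameter are hypothetical, and the model assumes each group of ranks simply shares one subfile; the actual Subfiling implementation may distribute data across subfiles differently.

```python
# Illustrative model only: not HDF5 code. It groups MPI-style ranks into files
# under three layouts to show where subfiling sits between the two extremes.
# `ranks_per_subfile` is a hypothetical parameter chosen for this example.

def file_layout(n_ranks, ranks_per_subfile):
    """Return, for each layout, the file index assigned to every rank."""
    if n_ranks <= 0 or ranks_per_subfile <= 0:
        raise ValueError("n_ranks and ranks_per_subfile must be positive")
    return {
        # Every rank targets the same file: one file, maximum sharing.
        "single_shared_file": [0 for _ in range(n_ranks)],
        # Assumption: consecutive ranks are grouped, one subfile per group.
        "subfiling": [rank // ranks_per_subfile for rank in range(n_ranks)],
        # Every rank has its own file: no sharing, many files to manage.
        "file_per_process": list(range(n_ranks)),
    }


def file_counts(layout):
    """Count the distinct files used by each layout."""
    return {name: len(set(files)) for name, files in layout.items()}


layout = file_layout(n_ranks=8, ranks_per_subfile=4)
counts = file_counts(layout)
print(counts)
# → {'single_shared_file': 1, 'subfiling': 2, 'file_per_process': 8}
```

With 8 ranks and 4 ranks per subfile, the model yields 2 files: fewer ranks contend for each file than in the single shared file case, and there are far fewer files to manage than with one file per process.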

Presenters: Neil Fortner and Jordan Henderson

The webinar will be held on September 30, 2022.