This project is developing UnifyCR, a user-level file system that lets applications use node-local burst buffers for shared files. UnifyCR will be designed to support bulk-synchronous I/O, the most common I/O workload in HPC, which includes checkpoint/restart and periodic output. With UnifyCR, applications will be able to write to fast, scalable, node-local burst buffers as easily as they write to the parallel file system.
The storage hierarchy of future HPC systems will include compute-node-local SSDs that serve as burst buffers. This distributed burst buffer design promises fast, scalable I/O performance because burst buffer bandwidth and capacity scale automatically with the compute resources used by jobs and workflows. However, a major concern with this distributed design is how to present the disjoint storage devices as a single storage location to applications that use shared files. To address this, we will develop UnifyCR, a user-level file system highly specialized for shared file access on HPC systems with distributed burst buffers. UnifyCR transparently intercepts I/O calls, so it integrates cleanly with other software, including I/O and checkpoint/restart libraries. Additionally, because UnifyCR is tailored to HPC systems and workloads, it can deliver high performance.
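To make the idea of transparent interception more concrete, the following is a minimal, single-process sketch in Python. It is not UnifyCR code, and the description above does not say how UnifyCR intercepts calls. The mount prefix `/burst`, the helper `make_intercepting_open`, and the `demo` function are all hypothetical names chosen for illustration. The sketch only shows how a wrapper around an open call can redirect paths under a chosen prefix to node-local storage while passing all other paths through unchanged; it does not attempt to unify storage across nodes.

```python
import builtins
import os
import tempfile

# Hypothetical mount prefix, chosen for this sketch only.
MOUNT_PREFIX = "/burst"


def make_intercepting_open(local_dir, real_open=builtins.open):
    """Return an open() replacement that redirects paths under
    MOUNT_PREFIX into local_dir and passes other paths through."""

    def intercepting_open(path, *args, **kwargs):
        # Only plain string paths under the prefix are redirected.
        if isinstance(path, str) and path.startswith(MOUNT_PREFIX + "/"):
            relative = os.path.relpath(path, MOUNT_PREFIX)
            target = os.path.join(local_dir, relative)
            # Create parent directories so the caller need not know
            # where the data is physically stored.
            os.makedirs(os.path.dirname(target), exist_ok=True)
            return real_open(target, *args, **kwargs)
        # Everything else goes to the real open() untouched.
        return real_open(path, *args, **kwargs)

    return intercepting_open


def demo():
    """Write through the wrapper, then read the redirected file directly."""
    # A temporary directory stands in for a node-local burst buffer.
    with tempfile.TemporaryDirectory() as local_dir:
        app_open = make_intercepting_open(local_dir)
        # The application uses an ordinary-looking path.
        with app_open("/burst/ckpt/rank0.dat", "w") as f:
            f.write("checkpoint data")
        # The data actually landed in the node-local directory.
        redirected = os.path.join(local_dir, "ckpt", "rank0.dat")
        with open(redirected) as f:
            return f.read()


if __name__ == "__main__":
    print(demo())
```

Because the application code only ever sees the wrapper, libraries layered on top of it need no changes, which is the property the paragraph above attributes to interception. A real user-level file system would additionally have to track which node holds each part of a shared file, which this sketch leaves out.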