Jupyter and HPC: Current State and Future Roadmap

February 28, 2018 @ 1:00 pm – 2:00 pm
Osni Marques

The IDEAS Productivity project, in partnership with the DOE Computing Facilities of the ALCF, OLCF, and NERSC and the DOE Exascale Computing Project (ECP), has resumed the webinar series on Best Practices for HPC Software Developers, which we began in 2016.

As part of this series, we offer one-hour webinars on topics in scientific software development and high-performance computing, approximately once a month. Participation is free and open to the public, but registration is required for each event. This webinar in the series, titled “Jupyter and HPC: Current State and Future Roadmap”, was presented by Matthias Bussonnier (UC Berkeley), Shreyas Cholia (LBNL), and Suhas Somnath (ORNL). It took place on Wednesday, February 28, 2018 at 1:00 pm ET.


During the last few years the Jupyter notebook has become one of the tools of choice for the data science and high-performance computing (HPC) communities. This webinar provided an overview of why Jupyter is gaining traction in education, data science, and HPC, with emphasis on how notebooks can be used as interactive documents for exploration and reporting. The presenters gave an overview of how Jupyter works and how its network protocol can be leveraged for work on both a single local machine and a remote cluster. They discussed the nuts and bolts of how Jupyter has been deployed at NERSC, as a case study in implementing Jupyter in an HPC environment. This work requires learning the Jupyter ecosystem well enough to use its powerful abstractions to build custom infrastructure that satisfies site policies and user needs. The webinar also showed, as a use case, how Jupyter notebooks have transformed data discovery, visualization, and interactive analysis for the scanning probe and electron microscopy communities at Oak Ridge National Laboratory, and how notebooks can seamlessly accommodate measurements from a wide variety of instruments through Pycroscopy, a framework for instrument-agnostic data storage and analysis.