Doug Kothe
Oak Ridge National Laboratory
ECP Director
Team,
First, let me wish each of you a very Merry Christmas!
For the ECP, 2017 has been a year of growth, accomplishments, and positioning for our next phase. As you learned in our last newsletter update, we recently made some ECP Leadership changes.
Andrew Siegel (ANL) and Erik Draeger (LLNL) have taken over the Application Development focus area and have hit the ground running, reorganizing the Application Development function into six categories, each with a new “level 3” lead in place. The six application development categories and associated leads are the following:
Chemistry and Materials Applications: Jack DeSlippe (LBNL)
Energy Applications: Tom Evans (ORNL)
Earth and Space Science Applications: Anshu Dubey (ANL)
Data Analytics and Optimization Applications: William Hart (SNL)
National Security Applications: Marianne Francois (LANL)
Co-Design: Phil Colella (LBNL)
Mike Heroux (SNL) and Jonathan Carter (LBNL) have been quite busy with the Software Technology focus area, which consists of five categories with level 3 leads, as follows:
Programming Models and Runtimes: Rajeev Thakur (ANL)
Development Tools: Jeff Vetter (ORNL)
Mathematical Libraries: Lois Curfman McInnes (ANL)
Data and Visualization: Jim Ahrens (LANL)
Software Ecosystem and Delivery: Rob Neely (LLNL)
And finally, perhaps the biggest changes are in the area of Hardware and Integration, led by Terri Quinn (LLNL) and Susan Coghlan (ANL). Terri and Susan have built a team infrastructure with six level 3 leads, as follows:
PathForward: Bronis de Supinski (LLNL)
Hardware Evaluation: Si Hammond (SNL)
Application Integration at Facilities: Judy Hill (ORNL)
Software Deployment at Facilities: Dave Montoya (LANL)
Facility Resource Utilization: Julia White (ORNL)
Training and Productivity: Ashley Barker (ORNL)
We have implemented some WBS* changes, which we believe are needed to increase our focus on product delivery and “steady state” execution post-startup. Application Development has changed from a programmatic to a domain-based structure to enable closer, more effective leadership and project management by domain science experts. The Software Technology structure is now consolidated and more streamlined, moving aggressively from its initial R&D stage into a critical product development stage and giving ST products a clear line of sight to applications. Hardware and Integration required an expanded scope to integrate more proactively and directly with DOE HPC facilities, including a formal handoff of ECP products and technologies. The size and complexity of ECP warrant a more empowered extended leadership team with explicit roles, responsibilities, authorities, and accountabilities. We are confident these changes best position ECP to meet its objectives moving forward.
*A Work Breakdown Structure (WBS) is a key project deliverable that organizes the team’s work into manageable sections.
Second Annual ECP Meeting
The second annual ECP meeting will be held February 5–9, 2018, in Knoxville, Tennessee. An exciting program is planned, with more than two dozen tutorials and 30 breakout sessions led by members of the ECP community. The breakout sessions will focus on key topics and areas of interest to the ECP community, and the tutorials will give participants an opportunity to engage directly with tool, algorithm, and developer experts from the HPC community.
Here is a sample of the diverse breakout session topics on the agenda: “Bridging of Facilities BOF,” “Core-Edge Coupling: An Integrated ECP Demonstration,” “Enhancing Productivity and Innovation in ECP with a Team of Teams Approach,” and “Fighting Application Amnesia: Tools for Memory Analysis and Characterization of Scientific Applications.” Tutorial topics include “Performance Tuning of Scientific Codes with the Roofline Model,” “Container Computing for HPC and Scientific Workflows,” “Using C++ for Scientific Programming,” and many others.
Last year, nearly 500 guests from ECP projects, DOE facilities, and other organizations attended the first annual meeting. This year we anticipate exceeding this number, with additional participation by PathForward vendors and facilities.
It was great to see so many of you at SC17 in November. The ECP was quite visible, with many of our researchers appearing in booth presentations, BOFs, and panel talks. Our ECP communications team interviewed researchers and PIs representing 13 of our thrust areas and is hard at work producing a new podcast section for the ECP website where these interviews and many more will be featured. Stay tuned for more details.
Finally, in early 2018 we will kick off a significant redesign of our ECP website to bring technical project descriptions and updates front and center. The goal is to start rolling out these changes in late January and complete them during the first quarter.
From everyone on the ECP Leadership Team, our best wishes for a safe and wonderful holiday season.
Raspberry Pi: ‘Cluster Development Breadboard’
“Exascale” refers to computing systems at least 50 times faster than the most powerful supercomputers in use today. The problem faced by Los Alamos National Laboratory (LANL) and similar labs building these systems is one of scale. To get the required performance, you need a lot of nodes, and to make it work, you need a lot of research and development.
There’s a catch-22, however: how do you write the operating systems, network stacks, launch, and boot systems for such large computers without having one on which to test it all? Use an existing supercomputer? No—the existing clusters are fully booked 24/7 doing science, they cost millions of dollars a year to run, and they may not have the architecture you need for your next-generation machine anyway. Older machines retired from science may be available, but at this scale, they cost far too much to use and are usually very hard to maintain.
The LANL solution? Build a “model supercomputer” with Raspberry Pi! Think of it as a “cluster development breadboard.”
From ORNL Review: Infographic on the Promise of Exascale Computing
ORNL Review, the research and development magazine of Oak Ridge National Laboratory, recently published an infographic showing some of the ways exascale will expand research and development, as compared with a single desktop computer and with the nation’s most powerful computer today, the Titan system at the Oak Ridge Leadership Computing Facility.
Video: Exascale for Free Electron Lasers Project
The Exascale for Free Electron Lasers project aims to stream data from the Linac Coherent Light Source beamlines to a supercomputer via ESnet (the US Department of Energy’s dedicated science network), perform data processing, and provide fast feedback to users in quasi real time (less than 10 minutes).
CANDLE Project Receives Award
The CANcer Distributed Learning Environment (CANDLE), an effort of the Exascale Computing Project, was honored in the HPCwire Readers’ & Editors’ Choice Awards, presented on November 13 in Denver at the 2017 International Conference for High Performance Computing, Networking, Storage and Analysis (SC17).
From HPCwire: Exascale Systems Will Help Cities with Long-Range Planning
Charlie Catlett of Argonne National Laboratory and the Exascale Computing Project moderated a panel of smart city practitioners at SC17 who shared the strategies, techniques, and technologies they use to understand their cities better and to improve the lives of their residents. He said the vision for exascale is to build “a framework for different computation models to be coupled together in multiple scales to look at long-range forecasting for cities.”
Drive to Exascale Is Subject of Nature News Feature
A recent article in the journal Nature looks at worldwide efforts to take high-performance computing to exascale.
Co-Design Center Develops Next-Generation Simulation Tools
The Center for Efficient Exascale Discretizations provides Exascale Computing Project applications with leading-edge simulation algorithms that can extract greater performance from exascale hardware than what is currently available.
Sandia National Laboratories: Kokkos Tutorial and Hackathon
Sandia National Laboratories held another in its series of Kokkos tutorials and hackathons on December 5–9. Join the Exascale Computing Project Kokkos team for the next one, January 16–18 in Santa Fe, New Mexico, where Kokkos experts will be available to help you with your applications.
Argonne Training Program on Extreme-Scale Computing
Computational scientists can now apply for the upcoming Argonne Training Program on Extreme-Scale Computing (ATPESC), which will take place from July 29 to August 10, 2018. The program provides intensive hands-on training in the key skills, approaches, and tools needed to design, implement, and execute computational science and engineering applications on current supercomputers and the HPC systems of the future.