Software Technology

Programming Models and Runtimes

Enhancing Qthreads for ECP Science and Energy Impact

Principal Investigators: Ron Brightwell, Sandia National Laboratories (SNL); Stephen Olivier, SNL

This project addresses the challenge of scalably coupling multithreaded parallelism on many-core nodes with a communication layer such as MPI. Most ECP applications use this combination of programming models, with multithreading provided by the Kokkos or RAJA performance portability libraries and/or the OpenMP API. The key challenge arises when multiple threads make communication calls concurrently, and those calls must be serviced by the MPI implementation and the network interface controller (NIC). Existing solutions, such as the MPI_THREAD_MULTIPLE thread support level, are often plagued by synchronization overheads. Even the best vendor MPI implementations incur high overheads when the number of threads exceeds the number of hardware contexts in the NIC.

Unlike previous approaches, we attack the problem not only from the communication side (MPI) but also with assistance from the multithreading runtime system. We use the Qthreads runtime, a scalable, event-driven library for node-level task parallelism, to implement our solution. Developed at Sandia National Laboratories since 2007, Qthreads serves as a back end for Kokkos and the Cray Chapel language, and it also provides a portable native C API.

In addition, the techniques developed in this project will be the subject of technology transfer efforts to OpenMP and MPI. The project technical lead chairs the OpenMP Subcommittee on Task Parallelism, and another technical expert on the project is a key contributor to both the MPI Forum and the OMPI-X ECP project, which is enhancing the open-source Open MPI implementation of MPI for exascale. In particular, we are leveraging the new “Fine Points” capability for threaded MPI execution developed in the OMPI-X ECP project, as well as synergies between Qthreads and the OpenMP and Kokkos tasking models.