qmcpack_figure4 - Exascale Computing Project

Figure 4. Performance of the new batched offload implementation in QMCPACK now rivals that of the legacy CUDA version. The plot shows throughput relative to the legacy CUDA version as a function of the number of atoms, for a broad range of problems. The different datasets show the progress made across successive QMCPACK versions and with improvements in LLVM’s OpenMP implementation.