Rendezvous methods reduce performance bottlenecks in particle and grid-based simulations

May 28, 2021

Scientists have demonstrated the value of so-called rendezvous methods in two particle simulators. These methods invoke a communication pattern that is useful when the processors sending and receiving information are unknown to each other. Datums from multiple sending processors effectively rendezvous on a single processor, so that the processor can perform a computation that requires all of them. On a large machine, millions or billions of such datums can be communicated simultaneously to create a load-balanced rendezvous decomposition of the data. This work showed that the approach can reduce performance bottlenecks and scale effectively to exascale machines. The scientists’ findings were published in the September 2020 issue of the Journal of Parallel and Distributed Computing.
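The rendezvous pattern described above can be illustrated with a small single-process sketch. It is not taken from LAMMPS or SPARTA and does not use MPI: the processors are simulated as Python lists, and the function names `rendezvous_rank` and `rendezvous`, along with the modulo rule for choosing the rendezvous processor, are assumptions made for illustration only.

```python
from collections import defaultdict


def rendezvous_rank(key, nprocs):
    """Pick the processor where all datums sharing `key` meet.

    Assumption: a plain modulo on an integer key. Any rule that every
    sender can evaluate on its own, without knowing the receiver, would do.
    """
    return key % nprocs


def rendezvous(outboxes, nprocs):
    """Simulate one all-to-all exchange.

    `outboxes[s]` is the list of (key, value) datums held by sender `s`.
    Returns one inbox per processor, mapping key -> [(sender, value), ...].
    """
    inboxes = [defaultdict(list) for _ in range(nprocs)]
    for sender, datums in enumerate(outboxes):
        for key, value in datums:
            dest = rendezvous_rank(key, nprocs)
            inboxes[dest][key].append((sender, value))
    return inboxes


# Three simulated processors, each holding datums tagged with a global ID.
outboxes = [
    [(7, "a"), (2, "b")],   # processor 0
    [(7, "c")],             # processor 1
    [(2, "d"), (5, "e")],   # processor 2
]
inboxes = rendezvous(outboxes, nprocs=3)

for rank, inbox in enumerate(inboxes):
    print(rank, dict(inbox))
```

In this toy run, processors 0 and 1 both hold a datum with key 7 but neither knows about the other. Because both apply the same rule to the key, the two datums arrive on the same processor, which can then carry out a computation that needs both.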

The team implemented rendezvous algorithms within two particle simulation codes: the molecular dynamics code LAMMPS and the Direct Simulation Monte Carlo code SPARTA. In both codes, some setup and other occasional operations were too slow with simpler brute-force algorithms when running large problems on large machines. The new rendezvous algorithms performed dramatically faster at scale. For example, LAMMPS can now enumerate bond topologies for molecular systems with billions of atoms, and SPARTA can compute grid/surface intersections in models with billions of grid cells and millions of surface elements, much more efficiently than before.

The researchers believe rendezvous methods could be useful for a variety of computational tasks in particle and grid-based codes where simpler algorithms do not scale well.

Plimpton, Steven J., and Christopher Knight. “Rendezvous Algorithms for Large-scale Modeling and Simulation.” Journal of Parallel and Distributed Computing 147 (September 2020): 184–195.

https://doi.org/10.1016/j.jpdc.2020.09.001
