Fig.10 cuda | Researchers bridge communications gap to enable exascale compute power

Timeline of CG (top) and CGAsync (bottom) on rank 2. Each ran 10 iterations. The blue csr… bars are csrMV (i.e., SpMV) kernels in cuSPARSE, and the red c… bars are cudaMemcpyAsync() calls copying data from device to host.