SpComm3D: A Framework for Enabling Sparse Communication in 3D Sparse Kernels
Main Authors | , |
---|---|
Format | Journal Article |
Language | English |
Published | 30.04.2024 |
Summary: | Existing 3D algorithms for distributed-memory sparse kernels suffer from
limited scalability due to reliance on bulk sparsity-agnostic communication.
While easier to use, sparsity-agnostic communication leads to unnecessary
bandwidth and memory consumption. We present SpComm3D, a framework for enabling
sparsity-aware communication and a minimal memory footprint, such that no
unnecessary data is communicated or stored in memory. SpComm3D performs sparse
communication efficiently with minimal or no communication buffers to further
reduce memory consumption. SpComm3D detaches the local computation at each
processor from the communication, allowing flexibility in choosing the best
accelerated version for computation. We build 3D algorithms with SpComm3D for
two important sparse ML kernels: Sampled Dense-Dense Matrix Multiplication
(SDDMM) and Sparse Matrix-Matrix Multiplication (SpMM). Experimental
evaluations on up to 1800 processors demonstrate that SpComm3D has superior
scalability and outperforms state-of-the-art sparsity-agnostic methods with up
to 20x improvement in terms of communication, memory, and runtime of SDDMM and
SpMM. The code is available at: https://github.com/nfabubaker/SpComm3D |
---|---|
DOI: | 10.48550/arxiv.2404.19638 |
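
For readers unfamiliar with the two kernels named in the summary, the following is a minimal single-process sketch in plain Python. It is not SpComm3D code and does not use its API. It assumes the usual definitions: SDDMM computes `S[i,j] * dot(A[i], B[j])` on the nonzeros of a sparse matrix `S` only, and SpMM multiplies `S` by a dense matrix `B`. The helper `needed_rows` is a hypothetical illustration of the sparsity-aware idea: a processor holding a block of `S` only touches the dense rows indexed by its local nonzeros, so only those rows would need to be communicated or stored.

```python
# Illustrative sketch only: names and data layout are assumptions, not SpComm3D's API.
# The sparse matrix S is stored in coordinate form as {(row, col): value}.

def sddmm(S, A, B):
    """SDDMM: for each nonzero S[i, j], compute S[i, j] * dot(A[i], B[j])."""
    return {
        (i, j): v * sum(a * b for a, b in zip(A[i], B[j]))
        for (i, j), v in S.items()
    }

def spmm(S, B, n_rows):
    """SpMM: out[i] = sum over nonzeros S[i, j] of S[i, j] * B[j]."""
    k = len(B[0])
    out = [[0.0] * k for _ in range(n_rows)]
    for (i, j), v in S.items():
        for c in range(k):
            out[i][c] += v * B[j][c]
    return out

def needed_rows(S_local):
    """Rows of A and B actually referenced by the local nonzeros (hypothetical helper)."""
    rows_a = sorted({i for i, _ in S_local})
    rows_b = sorted({j for _, j in S_local})
    return rows_a, rows_b

# Small example: S is 3x4 with three nonzeros, A is 3x2, B is 4x2.
S = {(0, 1): 2.0, (2, 1): 1.0, (2, 3): 3.0}
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

sddmm_result = sddmm(S, A, B)
spmm_result = spmm(S, B, n_rows=3)
rows_a, rows_b = needed_rows(S)
```

In this example only rows 1 and 3 of `B` are referenced, so a sparsity-agnostic scheme that ships all four rows of `B` would move twice the data the computation needs. How SpComm3D itself decides what to send is described in the paper and repository, not in this sketch.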