A Greedy Algorithm for Optimally Pipelining a Reduction
Main Authors | , |
---|---|
Format | Journal Article |
Language | English |
Published | 17.10.2013 |
Subjects | |
Summary: | Collective communications are ubiquitous in parallel applications. We present
two new algorithms for performing a reduction. The operation associated with
our reduction needs to be associative and commutative. The two algorithms are
developed under two different communication models (unidirectional and
bidirectional). Both algorithms use a greedy scheduling scheme. For a
unidirectional, fully connected network, we prove that our greedy algorithm is
optimal when some realistic assumptions are respected. Previous algorithms fit
the same assumptions and are only appropriate for some given configurations.
Our algorithm is optimal for all configurations. We note that there are some
configurations where our greedy algorithm significantly outperforms any existing
algorithm. This result represents a contribution to the state of the art. For
a bidirectional, fully connected network, we present a different greedy
algorithm. We verify by experimental simulations that our algorithm matches the
time complexity of an optimal broadcast (with addition of the computation).
Besides reversing an optimal broadcast algorithm, the greedy algorithm is the
first known reduction algorithm to experimentally attain this time complexity.
Simulations show that this greedy algorithm performs well in practice,
outperforming any state-of-the-art reduction algorithm. Positive experiments
on a parallel distributed machine are also presented. |
---|---|
DOI: | 10.48550/arxiv.1310.4645 |
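
The summary requires the reduction operation to be associative and commutative. A minimal sketch of why that matters is given below, assuming a plain pairwise tree reduction simulated in one Python process. This is a generic textbook scheme, not the greedy pipelined algorithm of the paper, and the name `tree_reduce` is hypothetical. Associativity allows partial results to be grouped into a tree; commutativity allows them to be combined in whatever order they arrive.

```python
from operator import add


def tree_reduce(values, op):
    """Combine one value per simulated process in pairwise rounds.

    Returns (result, rounds). Illustrative only: a generic tree
    reduction, not the greedy algorithm described in the summary.
    """
    partial = list(values)
    if not partial:
        raise ValueError("need at least one value")
    rounds = 0
    while len(partial) > 1:
        combined = []
        # Each pair models one message followed by one local application of op.
        for i in range(0, len(partial) - 1, 2):
            combined.append(op(partial[i], partial[i + 1]))
        if len(partial) % 2 == 1:
            # The unpaired value simply waits for the next round.
            combined.append(partial[-1])
        partial = combined
        rounds += 1
    return partial[0], rounds


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8]
    print(tree_reduce(data, add))                  # grouped as a tree
    print(tree_reduce(list(reversed(data)), add))  # same values, other order
```

Because integer addition is both associative and commutative, the two calls return the same result even though the values are grouped and ordered differently. An operation lacking either property (subtraction, for example) would not give an order-independent answer.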