Near-Optimal Massively Parallel Graph Connectivity

Identifying the connected components of a graph, apart from being a fundamental problem with countless applications, is a key primitive for many other algorithms. In this paper, we consider this problem in parallel settings. Particularly, we focus on the Massively Parallel Computations (MPC) model,...

Full description

Saved in:
Bibliographic Details
Published in2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS) pp. 1615 - 1636
Main Authors Behnezhad, Soheil, Dhulipala, Laxman, Esfandiari, Hossein, Lacki, Jakub, Mirrokni, Vahab
Format Conference Proceeding
LanguageEnglish
Published IEEE 01.11.2019
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Identifying the connected components of a graph, apart from being a fundamental problem with countless applications, is a key primitive for many other algorithms. In this paper, we consider this problem in parallel settings. Particularly, we focus on the Massively Parallel Computations (MPC) model, which is the standard theoretical model for modern parallel frameworks such as MapReduce, Hadoop, or Spark. We consider the truly sublinear regime of MPC for graph problems where the space per machine is n δ for some desirably small constant δ ϵ (0, 1). We present an algorithm that for graphs with diameter D in the wide range [log ε n, n], takes O(log D) rounds to identify the connected components and takes O(log log n) rounds for all other graphs. The algorithm is randomized, succeeds with high probability, does not require prior knowledge of D, and uses an optimal total space of O(m). We complement this by showing a conditional lower-bound based on the widely believed TwoCycle conjecture that Ω(log D) rounds are indeed necessary in this setting. Studying parallel connectivity algorithms received a resurgence of interest after the pioneering work of Andoni etal [FOCS 2018] who presented an algorithm with O(log D log log n) round-complexity. Our algorithm improves this result for the whole range of values of D and almost settles the problem due to the conditional lower-bound. Additionally, we show that with minimal adjustments, our algorithm can also be implemented in a variant of (CRCW) PRAM in asymptotically the same number of rounds.
ISSN:2575-8454
DOI:10.1109/FOCS.2019.00095