BTS: Load-Balanced Distributed Union-Find for Finding Connected Components with Balanced Tree Structures
Published in | 2024 IEEE 40th International Conference on Data Engineering (ICDE) pp. 1090 - 1102 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 13.05.2024 |
Summary: | How can we efficiently find connected components with Union-Find in a distributed system? Union-Find is the most efficient sequential algorithm for finding connected components with low memory usage and high speed. Several studies have adapted Union-Find to distributed memory systems to process large graphs quickly; however, they all suffer from load balancing problems. We notice that the leading cause of the load balancing problems is the nature of Union-Find, which gathers more and more edges to a small number of vertices as it proceeds. In this paper, we propose BTS, a new fast and scalable distributed Union-Find algorithm for finding connected components in large graphs. BTS resolves the load balancing problems by proposing Balanced Union-Find, which allocates vertices to each processor and makes edges link to vertices in the same processor as much as possible. We further optimize BTS with edge refinement to minimize network traffic and memory usage. Experimental results show that BTS efficiently resolves the load balancing problems, processing 16-1024 times larger graphs with 3.1-261.9 times faster speeds than existing algorithms. |
---|---|
ISSN: | 2375-026X |
DOI: | 10.1109/ICDE60146.2024.00089 |
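
For readers unfamiliar with the sequential baseline the summary refers to, the following is a minimal sketch of classic single-machine Union-Find for connected components. It is not the BTS algorithm or the Balanced Union-Find described in the paper. The function names `find` and `connected_components`, the integer vertex numbering, and the choice of path halving with union by size are assumptions made for illustration.

```python
from collections import defaultdict


def find(parent, x):
    """Return the root of x, shortening the path as we go (path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # point x at its grandparent
        x = parent[x]
    return x


def connected_components(num_vertices, edges):
    """Group vertices 0..num_vertices-1 into components (illustrative sketch)."""
    parent = list(range(num_vertices))  # every vertex starts as its own root
    size = [1] * num_vertices           # component size, valid at roots only

    for u, v in edges:
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u == root_v:
            continue  # already in the same component
        # Union by size: attach the smaller tree under the larger one.
        if size[root_u] < size[root_v]:
            root_u, root_v = root_v, root_u
        parent[root_v] = root_u
        size[root_u] += size[root_v]

    groups = defaultdict(list)
    for v in range(num_vertices):
        groups[find(parent, v)].append(v)
    return sorted(groups.values())


if __name__ == "__main__":
    print(connected_components(6, [(0, 1), (1, 2), (3, 4)]))
```

In this sketch every union redirects one root to another, so edges accumulate on a few root vertices as the run proceeds. The summary names this concentration as the leading cause of load imbalance once vertices are spread across processors.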