GreedyML: A Parallel Algorithm for Maximizing Constrained Submodular Functions
Main Authors | , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 15.03.2024 |
Summary: | We describe a parallel approximation algorithm for maximizing monotone
submodular functions subject to hereditary constraints on distributed memory
multiprocessors. Our work is motivated by the need to solve submodular
optimization problems on massive data sets, for practical contexts such as data
summarization, machine learning, and graph sparsification.
Our work builds on the randomized distributed RandGreedi algorithm, proposed
by Barbosa, Ene, Nguyen, and Ward (2015). This algorithm computes a distributed
solution by randomly partitioning the data among all the processors and then
employing a single accumulation step in which all processors send their
partial solutions to one processor. However, for large problems, the
accumulation step exceeds the memory available on a processor, and the
processor that performs the accumulation becomes a computational bottleneck.
Hence we propose a generalization of the RandGreedi algorithm that employs
multiple accumulation steps to reduce the memory required. We analyze the
approximation ratio and the time complexity of the algorithm (in the BSP
model). We evaluate the new GreedyML algorithm on three classes of problems,
and report results from large-scale data sets with millions of elements. The
results show that the GreedyML algorithm can solve problems where the
sequential Greedy and distributed RandGreedi algorithms fail due to memory
constraints. For certain computationally intensive problems, the GreedyML
algorithm is faster than the RandGreedi algorithm. The observed approximation
quality of the solutions computed by the GreedyML algorithm closely matches
that of the solutions obtained by the RandGreedi algorithm on these problems. |
---|---|
DOI: | 10.48550/arxiv.2403.10332 |
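
The difference between a single accumulation step and multiple accumulation steps, as described in the summary, can be illustrated with a small sequential simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the objective is a set-coverage function, the hereditary constraint is a cardinality bound `k`, and the names `coverage`, `greedy`, `rand_greedi`, `greedy_ml`, `parts`, and `branch` are hypothetical. In particular, the branching factor `branch` and the rule of keeping the better of the merged solution and the incoming partial solutions at every accumulation node are assumptions made for illustration; the summary does not specify them.

```python
import random


def coverage(sets, chosen):
    """Monotone submodular objective: number of items covered by the chosen sets."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)


def greedy(sets, candidates, k):
    """Sequential Greedy restricted to `candidates`, under the constraint |S| <= k."""
    chosen, covered = [], set()
    remaining = set(candidates)
    while remaining and len(chosen) < k:
        # Pick the candidate with the largest marginal gain (ties broken by index).
        best = max(sorted(remaining), key=lambda i: len(sets[i] - covered))
        if not sets[best] - covered:
            break  # no candidate adds anything new
        chosen.append(best)
        covered |= sets[best]
        remaining.remove(best)
    return chosen


def random_partition(n, parts, rng):
    """Assign each of the n elements to one of `parts` simulated processors."""
    buckets = [[] for _ in range(parts)]
    for i in range(n):
        buckets[rng.randrange(parts)].append(i)
    return buckets


def rand_greedi(sets, k, parts, rng):
    """Single accumulation step: every partial solution goes to one processor."""
    local = [greedy(sets, b, k) for b in random_partition(len(sets), parts, rng)]
    merged = greedy(sets, [i for sol in local for i in sol], k)
    return max(local + [merged], key=lambda s: coverage(sets, s))


def greedy_ml(sets, k, parts, branch, rng):
    """Multiple accumulation steps: merge `branch` partial solutions at a time."""
    if branch < 2:
        raise ValueError("branch must be at least 2")
    level = [greedy(sets, b, k) for b in random_partition(len(sets), parts, rng)]
    while len(level) > 1:
        next_level = []
        for j in range(0, len(level), branch):
            group = level[j:j + branch]
            # This node only ever holds the elements of `branch` partial solutions.
            merged = greedy(sets, [i for sol in group for i in sol], k)
            next_level.append(max(group + [merged], key=lambda s: coverage(sets, s)))
        level = next_level
    return level[0]


if __name__ == "__main__":
    data_rng = random.Random(0)
    sets = [set(data_rng.sample(range(200), 8)) for _ in range(300)]
    for name, sol in [
        ("Greedy", greedy(sets, range(len(sets)), 10)),
        ("RandGreedi", rand_greedi(sets, 10, 16, random.Random(1))),
        ("GreedyML", greedy_ml(sets, 10, 16, 2, random.Random(1))),
    ]:
        print(name, coverage(sets, sol))
```

In this sketch, the accumulating processor in `rand_greedi` receives up to `parts * k` elements, while each accumulation node in `greedy_ml` receives at most `branch * k`. Setting `branch >= parts` collapses the tree to a single level, so `greedy_ml` then returns the same solution as `rand_greedi` for the same random partition.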