A GPU-Accelerated Bi-linear ADMM Algorithm for Distributed Sparse Machine Learning
Format: Journal Article
Language: English
Published: 25.05.2024
DOI: 10.48550/arxiv.2405.16267
Summary: This paper introduces the Bi-linear consensus Alternating Direction Method of
Multipliers (Bi-cADMM), aimed at solving large-scale regularized Sparse Machine
Learning (SML) problems defined over a network of computational nodes.
Mathematically, these are stated as minimization problems with convex local
loss functions over a global decision vector, subject to an explicit $\ell_0$
norm constraint to enforce the desired sparsity. The considered SML problem
generalizes different sparse regression and classification models, such as
sparse linear and logistic regression, sparse softmax regression, and sparse
support vector machines. Bi-cADMM leverages a bi-linear consensus reformulation
of the original non-convex SML problem and a hierarchical decomposition
strategy that divides the problem into smaller sub-problems amenable to
parallel computing. In Bi-cADMM, this decomposition strategy is based on a
two-phase approach. Initially, it performs a sample decomposition of the data
and distributes local datasets across computational nodes. Subsequently, a
delayed feature decomposition of the data is conducted on Graphics Processing
Units (GPUs) available to each node. This methodology allows Bi-cADMM to
undertake computationally intensive data-centric computations on GPUs, while
CPUs handle more cost-effective computations. The proposed algorithm is
implemented within an open-source Python package called Parallel Sparse Fitting
Toolbox (PsFiT), which is publicly available. Finally, computational
experiments demonstrate the efficiency and scalability of our algorithm through
numerical benchmarks across various SML problems featuring distributed
datasets.
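As a reading aid, one plausible way to write the distributed SML problem the summary describes is the following consensus formulation; the symbols $N$, $f_i$, $x_i$, and $\kappa$ are notational assumptions for this sketch, not necessarily the paper's own notation.

```latex
% One plausible statement of the distributed SML problem described above.
% N, f_i, x_i, and \kappa are notational assumptions (amsmath syntax).
\begin{align*}
  \min_{x,\;x_1,\dots,x_N}\quad & \sum_{i=1}^{N} f_i(x_i)
      && \text{convex local loss on node $i$'s data}\\
  \text{s.t.}\quad & x_i = x, \quad i = 1,\dots,N,
      && \text{consensus on the global decision vector}\\
  & \lVert x \rVert_0 \le \kappa
      && \text{explicit $\ell_0$ sparsity constraint}
\end{align*}
```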
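The two-phase decomposition mentioned in the summary can be pictured with a small, self-contained sketch: samples are first split across computational nodes, then each node splits its local data matrix by features into blocks sized for GPU work, while cheap accumulation stays on the CPU. This is not the PsFiT interface and not the Bi-cADMM update itself; the function names, block counts, and the toy least-squares data below are assumptions made for illustration only.

```python
# Illustrative sketch (not the PsFiT API): the two-phase data decomposition
# described in the summary. All names and sizes are assumptions.
import numpy as np


def sample_decomposition(A, b, n_nodes):
    """Phase 1: split rows (samples) of (A, b) across computational nodes."""
    row_blocks = np.array_split(np.arange(A.shape[0]), n_nodes)
    return [(A[rows], b[rows]) for rows in row_blocks]


def feature_decomposition(A_local, n_blocks):
    """Phase 2: split columns (features) of a node-local matrix into blocks.

    In the setting the summary describes, these blocks are the units of work
    handed to the GPU available to each node (e.g. via a CuPy array); here we
    stay on the CPU with NumPy to keep the sketch self-contained.
    """
    col_blocks = np.array_split(np.arange(A_local.shape[1]), n_blocks)
    return [(cols, A_local[:, cols]) for cols in col_blocks]


def local_residual(A_local, b_local, x, feature_blocks):
    """Accumulate A_local @ x - b_local block by block.

    Each product A_block @ x[cols] is an independent, data-heavy computation
    of the kind the summary assigns to GPUs, while the inexpensive
    accumulation remains on the CPU.
    """
    r = -b_local.copy()
    for cols, A_block in feature_blocks:
        r += A_block @ x[cols]
    return r


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((1_000, 50))
    x_true = np.zeros(50)
    x_true[:5] = rng.standard_normal(5)   # sparse ground-truth decision vector
    b = A @ x_true

    shards = sample_decomposition(A, b, n_nodes=4)            # phase 1 (across nodes)
    x = np.zeros(50)
    for A_loc, b_loc in shards:
        blocks = feature_decomposition(A_loc, n_blocks=2)      # phase 2 (per-node, GPU-sized)
        print("node residual norm:",
              np.linalg.norm(local_residual(A_loc, b_loc, x, blocks)))
```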