Distributed (ATC) Gradient Descent for High Dimension Sparse Regression

We study linear regression from data distributed over a network of agents (with no master node) by means of LASSO estimation, in high-dimension , which allows the ambient dimension to grow faster than the sample size. While there is a vast literature of distributed algorithms applicable to the probl...

Full description

Saved in:
Bibliographic Details
Published inIEEE transactions on information theory Vol. 69; no. 8; p. 1
Main Authors Ji, Yao, Scutari, Gesualdo, Sun, Ying, Honnappa, Harsha
Format Journal Article
LanguageEnglish
Published New York IEEE 01.08.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text
ISSN0018-9448
1557-9654
DOI10.1109/TIT.2023.3267742

Cover

More Information
Summary:We study linear regression from data distributed over a network of agents (with no master node) by means of LASSO estimation, in high-dimension , which allows the ambient dimension to grow faster than the sample size. While there is a vast literature of distributed algorithms applicable to the problem, statistical and computational guarantees of most of them remain unclear in high dimension. This paper provides a first statistical study of the Distributed Gradient Descent (DGD) in the Adapt-Then-Combine (ATC) form. Our theory shows that, under standard notions of restricted strong convexity and smoothness of the loss functions-which hold with high probability for standard data generation models-suitable conditions on the network connectivity and algorithm tuning, DGD-ATC converges globally at a linear rate to an estimate that is within the centralized statistical precision of the model. In the worst-case scenario, the total number of communications to statistical optimality grows logarithmically with the ambient dimension, which improves on the communication complexity of DGD in the Combine-Then-Adapt (CTA) form, scaling linearly with the dimension. This reveals that mixing gradient information among agents, as DGD-ATC does, is critical in high-dimensions to obtain favorable rate scalings.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:0018-9448
1557-9654
DOI:10.1109/TIT.2023.3267742