Distributed (ATC) Gradient Descent for High Dimension Sparse Regression
We study linear regression from data distributed over a network of agents (with no master node) by means of LASSO estimation, in high-dimension , which allows the ambient dimension to grow faster than the sample size. While there is a vast literature of distributed algorithms applicable to the probl...
Saved in:
Published in | IEEE transactions on information theory Vol. 69; no. 8; p. 1 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published |
New York
IEEE
01.08.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
Online Access | Get full text |
ISSN | 0018-9448 1557-9654 |
DOI | 10.1109/TIT.2023.3267742 |
Cover
Summary: | We study linear regression from data distributed over a network of agents (with no master node) by means of LASSO estimation, in high-dimension , which allows the ambient dimension to grow faster than the sample size. While there is a vast literature of distributed algorithms applicable to the problem, statistical and computational guarantees of most of them remain unclear in high dimension. This paper provides a first statistical study of the Distributed Gradient Descent (DGD) in the Adapt-Then-Combine (ATC) form. Our theory shows that, under standard notions of restricted strong convexity and smoothness of the loss functions-which hold with high probability for standard data generation models-suitable conditions on the network connectivity and algorithm tuning, DGD-ATC converges globally at a linear rate to an estimate that is within the centralized statistical precision of the model. In the worst-case scenario, the total number of communications to statistical optimality grows logarithmically with the ambient dimension, which improves on the communication complexity of DGD in the Combine-Then-Adapt (CTA) form, scaling linearly with the dimension. This reveals that mixing gradient information among agents, as DGD-ATC does, is critical in high-dimensions to obtain favorable rate scalings. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
ISSN: | 0018-9448 1557-9654 |
DOI: | 10.1109/TIT.2023.3267742 |