Distributed (ATC) Gradient Descent for High Dimension Sparse Regression
We study linear regression from data distributed over a network of agents (with no master node) by means of LASSO estimation, in high dimension, which allows the ambient dimension to grow faster than the sample size. While there is a vast literature on distributed algorithms applicable to the problem, the statistical and computational guarantees of most of them remain unclear in high dimension. This paper provides a first statistical study of Distributed Gradient Descent (DGD) in the Adapt-Then-Combine (ATC) form. Our theory shows that, under standard notions of restricted strong convexity and smoothness of the loss functions (which hold with high probability for standard data generation models) and suitable conditions on the network connectivity and algorithm tuning, DGD-ATC converges globally at a linear rate to an estimate that is within the centralized statistical precision of the model. In the worst-case scenario, the total number of communications to statistical optimality grows logarithmically with the ambient dimension, which improves on the communication complexity of DGD in the Combine-Then-Adapt (CTA) form, which scales linearly with the dimension. This reveals that mixing gradient information among agents, as DGD-ATC does, is critical in high dimensions to obtain favorable rate scalings.
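To make the Adapt-Then-Combine versus Combine-Then-Adapt distinction in the abstract concrete, below is a minimal NumPy sketch of distributed LASSO in ATC form. It is an illustration only, not the algorithm analyzed in the paper: the proximal (soft-thresholding) handling of the l1 term, the mixing matrix, and all parameter choices are assumptions made for this example.

```python
# Illustrative sketch (not the paper's exact algorithm): Distributed Gradient
# Descent in Adapt-Then-Combine (ATC) form for LASSO over a network of agents.
# Assumptions made here, not taken from the paper: the l1 term is handled with a
# local soft-thresholding (proximal) step, W is a symmetric doubly stochastic
# mixing matrix, and the step size / penalty are arbitrary demo values.
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def dgd_atc_lasso(A_list, b_list, W, lam, step, iters=500):
    """Distributed LASSO via DGD in Adapt-Then-Combine form.

    A_list[i], b_list[i]: local design matrix and responses held by agent i.
    W: m x m mixing (gossip) matrix respecting the network topology.
    lam: l1 penalty; step: common step size.
    """
    m = len(A_list)                 # number of agents
    d = A_list[0].shape[1]          # ambient dimension
    X = np.zeros((m, d))            # row i = agent i's current estimate
    for _ in range(iters):
        Z = np.empty_like(X)
        for i in range(m):
            # "Adapt": local gradient step on the local least-squares loss ...
            grad = A_list[i].T @ (A_list[i] @ X[i] - b_list[i]) / len(b_list[i])
            # ... followed by a local proximal (soft-threshold) step.
            Z[i] = soft_threshold(X[i] - step * grad, step * lam)
        # "Combine": each agent averages its neighbors' *updated* iterates.
        # (CTA would instead mix the stale iterates X before the gradient step.)
        X = W @ Z
    return X


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n_i, d, s = 5, 40, 200, 5    # agents, local samples, dimension, sparsity
    x_true = np.zeros(d)
    x_true[:s] = 1.0
    A_list = [rng.standard_normal((n_i, d)) for _ in range(m)]
    b_list = [A @ x_true + 0.1 * rng.standard_normal(n_i) for A in A_list]
    W = np.full((m, m), 1.0 / m)    # complete graph: plain averaging
    X = dgd_atc_lasso(A_list, b_list, W, lam=0.1, step=1e-2)
    print("estimated support at agent 0:", np.nonzero(np.abs(X[0]) > 0.05)[0])
```

In the CTA variant each agent would mix the stale iterates (W @ X) before taking its local gradient step; the abstract's point is that mixing the freshly updated, gradient-corrected iterates, as ATC does, is what yields logarithmic rather than linear communication scaling in the ambient dimension.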
Published in | IEEE Transactions on Information Theory, Vol. 69, No. 8, p. 5253 |
Main Authors | Yao Ji, Gesualdo Scutari, Ying Sun, Harsha Honnappa |
Format | Journal Article |
Language | English |
Published | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.08.2023 |
Subjects | Algorithms; Convergence; Convexity; Distributed optimization; high-dimension statistics; linear convergence; Linear regression; Mesh networks; Optimization; Probability; Smoothness; sparse linear regression; Statistical analysis; Tuning |
ISSN | 0018-9448 (print); 1557-9654 (electronic) |
DOI | 10.1109/TIT.2023.3267742 |
Authors | Yao Ji (School of Industrial Engineering, Purdue University, West Lafayette, IN, USA); Gesualdo Scutari (School of Industrial Engineering, Purdue University, West Lafayette, IN, USA; ORCID 0000-0002-6453-6870); Ying Sun (School of Electrical Engineering and Computer Science, The Pennsylvania State University, State College, PA, USA); Harsha Honnappa (School of Industrial Engineering, Purdue University, West Lafayette, IN, USA) |
CODEN | IETTAW |
CitedBy | 10.1109/TIT.2024.3428325; 10.1109/TSP.2024.3460690 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023 |
Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023 |
DOI | 10.1109/TIT.2023.3267742 |
Discipline | Engineering Computer Science |
EISSN | 1557-9654 |
Genre | orig-research |
GrantInformation | Office of Naval Research Global, grant N00014-21-1-2673 (funder ID 10.13039/100007297) |
ISSN | 0018-9448 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 8 |
Language | English |
ORCID | 0000-0002-6453-6870 |
PublicationCentury | 2000 |
PublicationDate | 2023-08-01 |
PublicationDateYYYYMMDD | 2023-08-01 |
PublicationDate_xml | – month: 08 year: 2023 text: 2023-08-01 day: 01 |
PublicationDecade | 2020 |
PublicationPlace | New York |
PublicationPlace_xml | – name: New York |
PublicationTitle | IEEE transactions on information theory |
PublicationTitleAbbrev | TIT |
PublicationYear | 2023 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
SubjectTerms | Algorithms Convergence Convexity Distributed optimization high-dimension statistics linear convergence Linear regression Mesh networks Optimization Probability Smoothness sparse linear regression Standard data Statistical analysis Tuning |
Title | Distributed (ATC) Gradient Descent for High Dimension Sparse Regression |
URI | https://ieeexplore.ieee.org/document/10103556 https://www.proquest.com/docview/2837132692 |
Volume | 69 |