Faster Stochastic Optimization with Arbitrary Delays via Asynchronous Mini-Batching

Bibliographic Details
Main Authors Attia, Amit, Gaash, Ofir, Koren, Tomer
Format Journal Article
Language English
Published 19.06.2025
Subjects
Online Access Get full text
DOI 10.48550/arxiv.2408.07503


Abstract We consider the problem of asynchronous stochastic optimization, where an optimization algorithm makes updates based on stale stochastic gradients of the objective that are subject to an arbitrary (possibly adversarial) sequence of delays. We present a procedure which, for any given $q \in (0,1]$, transforms any standard stochastic first-order method to an asynchronous method with a convergence guarantee depending on the $q$-quantile delay of the sequence. This approach leads to convergence rates of the form $O(\tau_q/(qT)+\sigma/\sqrt{qT})$ for non-convex and $O(\tau_q^2/(qT)^2+\sigma/\sqrt{qT})$ for convex smooth problems, where $\tau_q$ is the $q$-quantile delay, generalizing and improving on existing results that depend on the average delay. We further show a method that automatically adapts to all quantiles simultaneously, without any prior knowledge of the delays, achieving convergence rates of the form $O(\inf_{q} \tau_q/(qT)+\sigma/\sqrt{qT})$ for non-convex and $O(\inf_{q} \tau_q^2/(qT)^2+\sigma/\sqrt{qT})$ for convex smooth problems. Our technique is based on asynchronous mini-batching with a careful batch-size selection and filtering of stale gradients.
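The abstract describes the technique only at a high level. The following is a minimal, hypothetical Python sketch (not the authors' actual algorithm) of the two ingredients it names: computing the empirical $q$-quantile delay $\tau_q$ of a delay sequence, and asynchronous mini-batching that filters out gradients staler than $\tau_q$. The names `quantile_delay` and `async_minibatch_sgd`, and the fixed batch size, are illustrative assumptions.

```python
import numpy as np

def quantile_delay(delays, q):
    """tau_q: the smallest delay d such that at least a q-fraction
    of the observed delays are <= d (an empirical q-quantile)."""
    delays = np.sort(np.asarray(delays))
    k = int(np.ceil(q * len(delays))) - 1
    return int(delays[k])

def async_minibatch_sgd(grad_fn, x0, delays, q, lr=0.1, batch_size=4):
    """Illustrative sketch: discard gradients whose delay exceeds tau_q,
    accumulate the rest into mini-batches of size batch_size, and take
    one SGD step per completed batch."""
    tau_q = quantile_delay(delays, q)
    x = np.array(x0, dtype=float)
    batch = []
    for d in delays:
        if d > tau_q:
            continue  # filter: gradient is too stale, drop it
        # the arriving gradient was computed at an older iterate; as a
        # stand-in, this sketch evaluates it at the current iterate
        batch.append(grad_fn(x))
        if len(batch) == batch_size:
            x -= lr * np.mean(batch, axis=0)  # one mini-batch SGD step
            batch = []
    return x
```

For example, on $f(x) = \|x\|^2/2$ (gradient $x$) with no delays and $q = 1$, every gradient passes the filter and each completed batch contracts the iterate by a factor of $1 - \eta$.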
Author Gaash, Ofir
Attia, Amit
Koren, Tomer
ContentType Journal Article
Copyright http://arxiv.org/licenses/nonexclusive-distrib/1.0
Copyright_xml – notice: http://arxiv.org/licenses/nonexclusive-distrib/1.0
DOI 10.48550/arxiv.2408.07503
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
OpenAccessLink https://arxiv.org/abs/2408.07503
PublicationDate 2025-06-19
SecondaryResourceType preprint
SourceID arxiv
SourceType Open Access Repository
SubjectTerms Computer Science - Learning
Mathematics - Optimization and Control
Title Faster Stochastic Optimization with Arbitrary Delays via Asynchronous Mini-Batching
URI https://arxiv.org/abs/2408.07503
linkProvider Cornell University