Faster Stochastic Optimization with Arbitrary Delays via Asynchronous Mini-Batching

Bibliographic Details
Main Authors Attia, Amit, Gaash, Ofir, Koren, Tomer
Format Journal Article
Language English
Published 19.06.2025
Subjects
Online Access Get full text
DOI 10.48550/arxiv.2408.07503


Abstract We consider the problem of asynchronous stochastic optimization, where an optimization algorithm makes updates based on stale stochastic gradients of the objective that are subject to an arbitrary (possibly adversarial) sequence of delays. We present a procedure which, for any given $q \in (0,1]$, transforms any standard stochastic first-order method to an asynchronous method with a convergence guarantee depending on the $q$-quantile delay of the sequence. This approach leads to convergence rates of the form $O(\tau_q/(qT)+\sigma/\sqrt{qT})$ for non-convex and $O(\tau_q^2/(qT)^2+\sigma/\sqrt{qT})$ for convex smooth problems, where $\tau_q$ is the $q$-quantile delay, generalizing and improving on existing results that depend on the average delay. We further show a method that automatically adapts to all quantiles simultaneously, without any prior knowledge of the delays, achieving convergence rates of the form $O(\inf_{q} \tau_q/(qT)+\sigma/\sqrt{qT})$ for non-convex and $O(\inf_{q} \tau_q^2/(qT)^2+\sigma/\sqrt{qT})$ for convex smooth problems. Our technique is based on asynchronous mini-batching with a careful batch-size selection and filtering of stale gradients.
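The abstract describes the technique only at a high level. The following is a minimal, hypothetical Python sketch (not the authors' actual algorithm) of the two ingredients it names: computing the empirical $q$-quantile delay $\tau_q$ of a delay sequence, and asynchronous mini-batching that filters out gradients staler than $\tau_q$. The names `quantile_delay` and `async_minibatch_sgd`, and the fixed batch size, are illustrative assumptions.

```python
import numpy as np

def quantile_delay(delays, q):
    """tau_q: the smallest delay d such that at least a q-fraction
    of the observed delays are <= d (an empirical q-quantile)."""
    delays = np.sort(np.asarray(delays))
    k = int(np.ceil(q * len(delays))) - 1
    return int(delays[k])

def async_minibatch_sgd(grad_fn, x0, delays, q, lr=0.1, batch_size=4):
    """Illustrative sketch: discard gradients whose delay exceeds tau_q,
    accumulate the rest into mini-batches of size batch_size, and take
    one SGD step per completed batch."""
    tau_q = quantile_delay(delays, q)
    x = np.array(x0, dtype=float)
    batch = []
    for d in delays:
        if d > tau_q:
            continue  # filter: gradient is too stale, drop it
        # the arriving gradient was computed at an older iterate; as a
        # stand-in, this sketch evaluates it at the current iterate
        batch.append(grad_fn(x))
        if len(batch) == batch_size:
            x -= lr * np.mean(batch, axis=0)  # one mini-batch SGD step
            batch = []
    return x
```

For example, on $f(x) = \|x\|^2/2$ (gradient $x$) with no delays and $q = 1$, every gradient passes the filter and each completed batch contracts the iterate by a factor of $1 - \eta$.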
Author Gaash, Ofir
Attia, Amit
Koren, Tomer
ContentType Journal Article
Copyright http://arxiv.org/licenses/nonexclusive-distrib/1.0
Copyright_xml – notice: http://arxiv.org/licenses/nonexclusive-distrib/1.0
DOI 10.48550/arxiv.2408.07503
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
OpenAccessLink https://arxiv.org/abs/2408.07503
PublicationDate 2025-06-19
SecondaryResourceType preprint
SourceID arxiv
SourceType Open Access Repository
SubjectTerms Computer Science - Learning
Mathematics - Optimization and Control
Title Faster Stochastic Optimization with Arbitrary Delays via Asynchronous Mini-Batching
URI https://arxiv.org/abs/2408.07503
linkProvider Cornell University