Faster Stochastic Optimization with Arbitrary Delays via Asynchronous Mini-Batching
We consider the problem of asynchronous stochastic optimization, where an optimization algorithm makes updates based on stale stochastic gradients of the objective that are subject to an arbitrary (possibly adversarial) sequence of delays. We present a procedure which, for any given $q \in (0,1]$, transforms any standard stochastic first-order method into an asynchronous method with a convergence guarantee depending on the $q$-quantile delay of the sequence. This approach leads to convergence rates of the form $O(\tau_q/(qT) + \sigma/\sqrt{qT})$ for non-convex and $O(\tau_q^2/(qT)^2 + \sigma/\sqrt{qT})$ for convex smooth problems, where $\tau_q$ is the $q$-quantile delay, generalizing and improving on existing results that depend on the average delay. We further show a method that automatically adapts to all quantiles simultaneously, without any prior knowledge of the delays, achieving convergence rates of the form $O(\inf_q \tau_q/(qT) + \sigma/\sqrt{qT})$ for non-convex and $O(\inf_q \tau_q^2/(qT)^2 + \sigma/\sqrt{qT})$ for convex smooth problems. Our technique is based on asynchronous mini-batching with a careful batch-size selection and filtering of stale gradients.
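The core idea named in the abstract (discard overly stale gradients, average the rest into mini-batches, and feed the batch average to a base first-order method) can be sketched briefly. The Python below is a minimal illustration, not the authors' implementation: `grad_oracle`, `delay_threshold`, `batch_size`, and the plain SGD base step are all hypothetical stand-ins introduced here, with `delay_threshold` playing the role of the $q$-quantile delay $\tau_q$. In the paper's analysis the batch size and threshold are tied to $q$ and $T$; here they are fixed constants purely for illustration.

```python
import numpy as np

def async_minibatch_sgd(grad_oracle, x0, steps, batch_size, delay_threshold, lr):
    """Sketch of asynchronous mini-batching with stale-gradient filtering.

    grad_oracle(x) is assumed to return (g, delay): a stochastic gradient
    together with the delay it arrived with. Gradients whose delay exceeds
    delay_threshold (a stand-in for the q-quantile delay) are discarded;
    the rest are averaged into mini-batches that drive a plain SGD step
    (the "standard stochastic first-order method" in this sketch).
    """
    x = np.asarray(x0, dtype=float).copy()
    batch = []
    for _ in range(steps):
        g, delay = grad_oracle(x)
        if delay > delay_threshold:
            continue  # filter: drop gradients that are too stale
        batch.append(g)
        if len(batch) == batch_size:
            x -= lr * np.mean(batch, axis=0)  # update on the batch average
            batch = []  # start accumulating the next mini-batch
    return x

# Toy usage: minimize f(x) = ||x||^2 / 2 under a random delay model.
rng = np.random.default_rng(0)

def noisy_delayed_grad(x):
    delay = rng.geometric(0.3)             # arbitrary illustrative delay model
    g = x + 0.1 * rng.standard_normal(2)   # gradient of f plus noise
    return g, delay                        # staleness is only simulated via the reported delay

x_final = async_minibatch_sgd(noisy_delayed_grad, np.ones(2),
                              steps=2000, batch_size=8,
                              delay_threshold=5, lr=0.1)
print(x_final)  # should land close to the minimizer at the origin
```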
Main Authors | Attia, Amit; Gaash, Ofir; Koren, Tomer |
---|---|
Format | Journal Article |
Language | English |
Published | 19.06.2025 |
Subjects | Computer Science - Learning; Mathematics - Optimization and Control |
Online Access | https://arxiv.org/abs/2408.07503 |
DOI | 10.48550/arxiv.2408.07503 |
License | http://arxiv.org/licenses/nonexclusive-distrib/1.0 |