Convergence analysis of AdaBound with relaxed bound functions for non-convex optimization

Bibliographic Details
Published in: Neural Networks, Vol. 145, pp. 300-307
Main Authors: Liu, Jinlan; Kong, Jun; Xu, Dongpo; Qi, Miao; Lu, Yinghua
Format: Journal Article
Language: English
Published: United States: Elsevier Ltd, 01.01.2022
Summary: Clipping the learning rates in Adam yields an effective stochastic algorithm, AdaBound. Despite its effectiveness in practice, the convergence analysis of AdaBound has not been fully explored, especially for non-convex optimization. To this end, we address the convergence of the last individual output of AdaBound for non-convex stochastic optimization problems, which is called individual convergence. We prove that, as AdaBound iterates, the cost function converges to a finite value and the corresponding gradient converges to zero. The novelty of this proof is that the convergence conditions on the bound functions and momentum factors are much more relaxed than in existing results: in particular, we remove the monotonicity and convergence requirements on the bound functions, keeping only their boundedness, and the momentum factors can be fixed constants rather than being required to decrease monotonically. This provides a new perspective on the roles of the bound functions and momentum factors in AdaBound. Finally, numerical experiments corroborate our theory and show that the convergence of AdaBound extends to more general bound functions.
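
Illustrative sketch (not the authors' code): AdaBound performs an Adam-style update in which the per-coordinate learning rate is clipped between a lower bound function eta_l(t) and an upper bound function eta_u(t). The Python/NumPy sketch below shows one simplified update step; the particular bound functions, hyperparameter values, and the omission of bias correction are assumptions for illustration only. The paper's point is that the result needs only the boundedness of eta_l and eta_u, and the momentum factor beta1 may be a fixed constant.

    import numpy as np

    def adabound_step(theta, grad, m, v, t, alpha=0.01,
                      beta1=0.9, beta2=0.999, eps=1e-8,
                      eta_l=lambda t: 0.1 * (1.0 - 1.0 / (1e-3 * t + 1.0)),  # illustrative lower bound
                      eta_u=lambda t: 0.1 * (1.0 + 1.0 / (1e-3 * t))):       # illustrative upper bound
        """One simplified AdaBound-style update step (t starts at 1)."""
        m = beta1 * m + (1 - beta1) * grad        # first-moment (momentum) estimate
        v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
        # Adam's per-coordinate step size, clipped into [eta_l(t), eta_u(t)].
        step = np.clip(alpha / (np.sqrt(v) + eps), eta_l(t), eta_u(t))
        return theta - step * m, m, v

A training loop would call this once per minibatch, initializing m and v to zero vectors and incrementing t from 1.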
ISSN: 0893-6080, 1879-2782
DOI: 10.1016/j.neunet.2021.10.026