A Scaling Transition Method from SGDM to SGD with 2ExpLR Strategy

Bibliographic Details
Published in: Applied Sciences, Vol. 12, No. 23, p. 12023
Main Authors: Kun Zeng, Jinlan Liu, Zhixia Jiang, Dongpo Xu
Format: Journal Article
Language: English
Published: Basel, MDPI AG, 01.12.2022
Summary: In deep learning, vanilla stochastic gradient descent (SGD) and SGD with heavy-ball momentum (SGDM) are widely used because of their simplicity and strong generalization. This paper uses an exponential scaling method to realize a smooth and stable transition from SGDM to SGD, combining the fast training speed of SGDM with the accurate convergence of SGD; the resulting method is named TSGD. We also provide theoretical results on the convergence of the algorithm. In addition, we exploit the stability of the learning rate warmup strategy and the high accuracy of the learning rate decay strategy, and propose a warmup-decay learning rate strategy built from double exponential functions, named 2ExpLR. Experimental results on different datasets show that the proposed algorithms improve accuracy significantly and make training faster and more stable.
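The abstract sketches two techniques that a short illustration may help clarify. First, a minimal sketch of the scaling-transition idea, assuming (as an illustration, not the paper's stated rule) that each update blends the SGDM momentum direction with the raw stochastic gradient using an exponentially decaying weight; the names lam and rho are hypothetical.

```python
import numpy as np

def tsgd_step(w, m, grad, t, lr=0.1, beta=0.9, lam=0.999):
    """One illustrative TSGD-style step (hypothetical formulation).

    rho = lam**t decays from 1 toward 0, so early iterations follow the
    SGDM momentum direction (fast progress) while late iterations follow
    the plain SGD gradient (accurate convergence). The blending rule and
    the names lam/rho are assumptions, not the paper's exact scheme.
    """
    m = beta * m + (1.0 - beta) * grad        # heavy-ball style momentum buffer
    rho = lam ** t                            # exponential scaling factor in [0, 1]
    direction = rho * m + (1.0 - rho) * grad  # SGDM -> SGD as rho -> 0
    return w - lr * direction, m

# Tiny demo: minimize f(w) = ||w||^2, whose gradient is 2w.
w = np.array([5.0, -3.0])
m = np.zeros_like(w)
for t in range(200):
    w, m = tsgd_step(w, m, 2.0 * w, t)
```

Second, 2ExpLR is described only as a warmup-decay strategy with double exponential functions. One plausible reading, sketched below under that assumption, ramps the learning rate exponentially from a floor lr_min to a peak lr_max during warmup and decays it exponentially afterwards; the functional form and constants are illustrative, not the paper's.

```python
def two_exp_lr(t, total_steps, warmup_steps, lr_max=0.1, lr_min=1e-4):
    """Hypothetical 2ExpLR-style schedule: exponential warmup, then exponential decay."""
    if t < warmup_steps:
        # exponential ramp from lr_min up to lr_max during warmup
        return lr_min * (lr_max / lr_min) ** (t / warmup_steps)
    # exponential decay from lr_max back toward lr_min after warmup
    frac = (t - warmup_steps) / max(1, total_steps - warmup_steps)
    return lr_max * (lr_min / lr_max) ** frac
```

Both sketches reproduce the qualitative behavior described in the abstract (momentum early, plain SGD late; warmup then decay) rather than the paper's exact formulas.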
ISSN: 2076-3417
DOI: 10.3390/app122312023