The WuC-Adam algorithm based on joint improvement of Warmup and cosine annealing algorithms

Bibliographic Details
Published in	Mathematical biosciences and engineering : MBE Vol. 21; no. 1; pp. 1270 - 1285
Main Authors	Zhang, Can; Shao, Yichuan; Sun, Haijing; Xing, Lei; Zhao, Qian; Zhang, Le
Format	Journal Article
Language	English
Published	United States AIMS Press 01.01.2024
Subjects	Adam algorithm; cosine annealing strategy; deep learning; local optimum; Warmup

More Information
Summary:	The Adam algorithm is a common choice for optimizing neural network models. However, its application often brings challenges, such as susceptibility to local optima, overfitting and convergence problems caused by unstable learning rate behavior. In this article, we introduce an enhanced Adam optimization algorithm that integrates Warmup and cosine annealing techniques to alleviate these challenges. By integrating a Warmup phase into the traditional Adam algorithm, we gradually raise the learning rate during the initial training phase, which avoids early instability. In addition, we adopt a dynamic cosine annealing strategy to adaptively adjust the learning rate, which reduces the tendency to become trapped in local optima and enhances the model's generalization ability. To validate the proposed method, we ran comparative experiments against traditional Adam and other optimization algorithms on several standard datasets. On the MNIST, CIFAR10 and CIFAR100 datasets, the improved algorithm achieved accuracies of 98.87%, 87.67% and 58.88%, respectively, with significant improvements compared to other algorithms. The experimental results indicate that our joint enhancement of the Adam algorithm yields significant improvements in model convergence speed and generalization performance. These results point to the potential of our enhanced Adam algorithm in a wide range of deep learning tasks.
ISSN:	1551-0018
DOI:	10.3934/mbe.2024054
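
The summary describes a learning rate that ramps up during a Warmup phase and is then adjusted by cosine annealing. This record does not give the paper's exact formulas, so the following is only a minimal sketch of the generic technique: a linear warmup followed by a standard cosine decay. The function name `warmup_cosine_lr` and the parameters `warmup_steps`, `base_lr` and `min_lr` are assumptions made for illustration, not the authors' notation.

```python
import math


def warmup_cosine_lr(step, total_steps, warmup_steps, base_lr, min_lr=0.0):
    """Learning rate for a 0-indexed training step.

    Hypothetical helper: linear warmup, then cosine annealing.
    This is the textbook schedule, not the paper's exact rule.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the annealing phase completed, clamped to [0, 1].
    anneal_steps = max(1, total_steps - warmup_steps)
    progress = min((step - warmup_steps) / anneal_steps, 1.0)
    # Cosine annealing: decay smoothly from base_lr towards min_lr.
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Example: 100 steps in total, of which the first 10 are warmup.
schedule = [warmup_cosine_lr(s, 100, 10, 1e-3) for s in range(100)]
```

In a training loop, the value for each step would be assigned to the Adam optimizer's learning rate before the parameter update. How WuC-Adam couples the schedule to Adam's moment estimates is not stated in this record.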