A survey of regularization strategies for deep models

Bibliographic Details
Published in: The Artificial Intelligence Review, Vol. 53, No. 6, pp. 3947–3986
Main Authors: Moradi, Reza; Berangi, Reza; Minaei, Behrouz
Format: Journal Article
Language: English
Published: Dordrecht: Springer Netherlands, 01.08.2020

Summary: The most critical concern in machine learning is how to build an algorithm that performs well both on training data and on new data. The no free lunch theorem implies that each specific task needs its own tailored machine learning algorithm. A set of strategies and preferences is built into learning machines to tune them for the problem at hand. These strategies and preferences, whose core concern is improving generalization, are collectively known as regularization. In deep learning, because of the considerable number of parameters, a great many regularization methods are available to the community. Developing more effective regularization strategies has been the subject of significant research effort in recent years. However, it is difficult for developers to choose the most suitable strategy for the problem at hand, because there is no comparative study of the performance of the different strategies. In this paper, as a first step, the most effective regularization methods and their variants are presented and analyzed in a systematic way. As a second step, a comparative study of regularization techniques is presented in which test errors and computational costs are evaluated for a convolutional neural network on the CIFAR-10 dataset ( https://www.cs.toronto.edu/~kriz/cifar.html ). Finally, the regularization methods are compared in terms of network accuracy, the number of epochs needed to train the network, and the number of operations per input sample, and the results are discussed and interpreted in light of the strategy each method employs. The experimental results show that weight decay and data augmentation incur little computational overhead, so they can be used in most applications. Given sufficient computational resources, Dropout-family methods are a rational choice; given abundant resources, batch-normalization-family and ensemble methods are also reasonable strategies.
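As a concrete illustration of the regularizer families the survey compares, the sketch below combines weight decay (an L2 penalty on convolution kernels), data augmentation, Dropout, and batch normalization in a small Keras CNN trained on CIFAR-10. This is a minimal example, not the architecture or hyperparameters used in the paper; the layer sizes, penalty strength, and dropout rate are placeholder assumptions.

import tensorflow as tf
from tensorflow.keras import layers, regularizers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    # Data augmentation: active during training only, a no-op at inference.
    layers.RandomFlip("horizontal"),
    layers.RandomTranslation(0.1, 0.1),
    # Weight decay: an L2 penalty on the convolution kernel
    # (the strength 1e-4 is an assumption, not the paper's value).
    layers.Conv2D(32, 3, padding="same",
                  kernel_regularizer=regularizers.l2(1e-4)),
    layers.BatchNormalization(),   # batch-normalization family
    layers.Activation("relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),           # Dropout family
    layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# CIFAR-10, the dataset used in the survey's experiments.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
model.fit(x_train / 255.0, y_train,
          validation_data=(x_test / 255.0, y_test),
          epochs=5, batch_size=128)

Note that the augmentation layers and Dropout add per-step cost only at training time, while the L2 penalty adds almost none, which is consistent with the abstract's observation that weight decay and data augmentation have little computational side effect.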
ISSN: 0269-2821, 1573-7462
DOI: 10.1007/s10462-019-09784-7