Training Behavior of Deep Neural Network in Frequency Domain

Bibliographic Details
Published in: Neural Information Processing, Vol. 11953, pp. 264-274
Main Authors: Xu, Zhi-Qin John; Zhang, Yaoyu; Xiao, Yanyang
Format: Book Chapter
Language: English
Published: Springer International Publishing AG, Switzerland, 2019
Series: Lecture Notes in Computer Science

Summary: Why deep neural networks (DNNs) capable of overfitting often generalize well in practice is a mystery [24]. To find a potential mechanism, we study the implicit biases underlying the training process of DNNs. In this work, for both real and synthetic datasets, we empirically find that a DNN with common settings first quickly captures the dominant low-frequency components of the target and then relatively slowly captures the high-frequency ones. We call this phenomenon the Frequency Principle (F-Principle). In our experiments, the F-Principle is observed across DNNs with various architectures, activation functions, and training algorithms. We also illustrate how the F-Principle helps explain the effect of early stopping and the generalization of DNNs. The F-Principle thus potentially provides insight into a general principle underlying DNN optimization and generalization.
ISBN: 303036707X; 9783030367077
ISSN: 0302-9743; 1611-3349
DOI: 10.1007/978-3-030-36708-4_22
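Illustrative sketch (not from the chapter): one way to observe the F-Principle described in the summary is to fit a small network to a target containing one low and one high frequency, and track the relative error of the corresponding Fourier coefficients of the network output during training. The PyTorch sketch below uses assumed settings (network width, learning rate, and target frequencies are arbitrary choices, not the authors'); under such settings the low-frequency error typically drops well before the high-frequency one.

# Minimal, hypothetical sketch of an F-Principle experiment (assumed settings,
# not the authors' code): fit an MLP to sin(pi*x) + 0.5*sin(10*pi*x) and track
# how fast the two Fourier components of the model output converge.
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

n = 256
x = torch.linspace(-1.0, 1.0, n).unsqueeze(1)
y = torch.sin(np.pi * x) + 0.5 * torch.sin(10 * np.pi * x)

model = nn.Sequential(nn.Linear(1, 200), nn.Tanh(), nn.Linear(200, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

fy = np.fft.rfft(y.numpy().ravel())  # target spectrum

def rel_err(pred, k):
    # Relative error of one DFT coefficient of the prediction vs. the target.
    # sin(k*pi*x) completes k cycles over [-1, 1], so it sits in DFT bin k.
    fp = np.fft.rfft(pred.ravel())
    return np.abs(fp[k] - fy[k]) / np.abs(fy[k])

for step in range(10001):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            pred = model(x).numpy()
        print(f"step {step:5d}  loss {loss.item():.2e}  "
              f"low-freq (k=1) err {rel_err(pred, 1):.2f}  "
              f"high-freq (k=10) err {rel_err(pred, 10):.2f}")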