Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training
Published in: Proceedings of the National Academy of Sciences (PNAS), Vol. 118, No. 43, pp. 1–12
Main Authors: , , ,
Format: Journal Article
Language: English
Published: United States: National Academy of Sciences, 26.10.2021
Summary: In this paper, we introduce the Layer-Peeled Model, a nonconvex, yet analytically tractable, optimization program, in a quest to better understand deep neural networks that are trained for a sufficiently long time. As the name suggests, this model is derived by isolating the topmost layer from the remainder of the neural network, followed by imposing certain constraints separately on the two parts of the network. We demonstrate that the Layer-Peeled Model, albeit simple, inherits many characteristics of well-trained neural networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep-learning training. First, when working on class-balanced datasets, we prove that any solution to this model forms a simplex equiangular tight frame, which, in part, explains the recently discovered phenomenon of neural collapse [V. Papyan, X. Y. Han, D. L. Donoho, Proc. Natl. Acad. Sci. U.S.A. 117, 24652–24663 (2020)]. More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto-unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep-learning models on the minority classes. In addition, we use the Layer-Peeled Model to gain insights into how to mitigate Minority Collapse. Interestingly, this phenomenon is first predicted by the Layer-Peeled Model before being confirmed by our computational experiments.
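The summary's balanced-case result states that any solution forms a simplex equiangular tight frame (ETF): K unit-norm class vectors in R^K whose pairwise inner products all equal −1/(K−1). As a quick numerical sanity check (a sketch using the standard ETF construction, not code from the paper; the helper name `simplex_etf` is ours):

```python
import numpy as np

def simplex_etf(K):
    """K unit-norm vectors in R^K with pairwise inner products -1/(K-1),
    i.e., a simplex equiangular tight frame (up to rotation)."""
    scale = np.sqrt(K / (K - 1))
    return scale * (np.eye(K) - np.ones((K, K)) / K)

M = simplex_etf(4)
gram = M.T @ M                        # Gram matrix of the frame vectors
diag = np.diag(gram)                  # each vector has unit norm
off_diag = gram[~np.eye(4, dtype=bool)]  # every pair: -1/(K-1) = -1/3
print(np.round(diag, 6))      # [1. 1. 1. 1.]
print(np.round(off_diag[0], 6))  # -0.333333
```

Maximal, equal-angle separation between class means is exactly the geometry that neural collapse describes; in the imbalanced case analyzed in the paper, this symmetry breaks down for minority classes.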
Bibliography: 1Present address: Key Laboratory of Machine Perception, Peking University, Beijing 100871, China. Edited by David L. Donoho, Stanford University, Stanford, CA, and approved August 30, 2021 (received for review February 15, 2021). Author contributions: C.F., H.H., Q.L., and W.J.S. designed research; C.F., H.H., and W.J.S. performed research; C.F., H.H., and W.J.S. contributed new reagents/analytic tools; C.F., H.H., and W.J.S. analyzed data; and C.F., H.H., Q.L., and W.J.S. wrote the paper.
ISSN: 0027-8424, 1091-6490
DOI: 10.1073/pnas.2103091118