A law of data separation in deep learning

Bibliographic Details
Published in Proceedings of the National Academy of Sciences - PNAS Vol. 120; no. 36; p. e2221704120
Main Authors He, Hangfeng; Su, Weijie J.
Format Journal Article
Language English
Published Washington: National Academy of Sciences, 05.09.2023
Summary: While deep learning has enabled significant advances in many areas of science, its black-box nature hinders architecture design for future artificial intelligence applications and interpretation for high-stakes decision-making. We addressed this issue by studying the fundamental question of how deep neural networks process data in the intermediate layers. Our finding is a simple and quantitative law that governs how deep neural networks separate data according to class membership throughout all layers for classification. This law shows that each layer improves data separation at a constant geometric rate, and its emergence is observed in a collection of network architectures and datasets during training. This law offers practical guidelines for designing architectures, improving model robustness and out-of-sample performance, as well as interpreting the predictions.
Bibliography: Edited by David Donoho, Stanford University, Stanford, CA; received December 21, 2022; accepted June 26, 2023
ISSN: 0027-8424, 1091-6490
DOI: 10.1073/pnas.2221704120
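
The Summary states that each layer improves data separation at a constant geometric rate. As a rough illustration only, the sketch below shows one way such a claim could be checked numerically. It rests on assumptions that this record does not confirm: separation is measured by a within-class to between-class scatter ratio (the record does not say which measure the authors use), and the function names `separation_fuzziness` and `fit_geometric_rate` are hypothetical. A geometric law means the per-layer values behave like D_l = rho^l * D_0, so their logarithms fall on a straight line in the layer index l.

```python
import numpy as np


def separation_fuzziness(features, labels):
    """Assumed separation measure: trace(SS_w @ pinv(SS_b)).

    SS_w is the within-class scatter and SS_b the between-class scatter,
    both normalized by the sample count. Smaller values mean the classes
    are better separated. This choice of measure is an assumption.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n, d = features.shape
    global_mean = features.mean(axis=0)
    ss_w = np.zeros((d, d))
    ss_b = np.zeros((d, d))
    for c in np.unique(labels):
        x_c = features[labels == c]
        mean_c = x_c.mean(axis=0)
        centered = x_c - mean_c
        ss_w += centered.T @ centered / n
        diff = (mean_c - global_mean)[:, None]
        ss_b += len(x_c) * (diff @ diff.T) / n
    # The pseudoinverse handles a rank-deficient between-class scatter.
    return float(np.trace(ss_w @ np.linalg.pinv(ss_b)))


def fit_geometric_rate(values):
    """Fit log(values[l]) = log(d0) + l * log(rho) by least squares.

    Returns (rho, d0). A rho below 1 means separation improves by a
    roughly constant factor per layer.
    """
    values = np.asarray(values, dtype=float)
    layers = np.arange(len(values))
    slope, intercept = np.polyfit(layers, np.log(values), 1)
    return float(np.exp(slope)), float(np.exp(intercept))


# Synthetic per-layer values that follow an exact geometric decay
# (illustrative numbers, not results from the article).
synthetic = [8.0 * 0.5 ** layer for layer in range(6)]
rho, d0 = fit_geometric_rate(synthetic)
print(f"estimated rate per layer: {rho:.3f}, initial value: {d0:.3f}")
```

In practice, one would call `separation_fuzziness` on the activations of each intermediate layer for a labeled dataset and pass the resulting sequence to `fit_geometric_rate`; how closely the log-values follow a straight line indicates how well a geometric law describes that network.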