Different scaling of linear models and deep learning in UKBiobank brain images versus machine-learning datasets

Bibliographic Details
Published in	Nature Communications, Vol. 11, no. 1, article no. 4238 (15 pp.)
Main Authors	Schulz, Marc-Andre; Yeo, B. T. Thomas; Vogelstein, Joshua T.; Mourao-Miranda, Janaina; Kather, Jakob N.; Kording, Konrad; Richards, Blake; Bzdok, Danilo
Format	Journal Article
Language	English
Published	London: Nature Publishing Group UK, 25.08.2020 (Nature Publishing Group; Nature Portfolio)
Subjects	49; 59/36; 59/57; 631/378/116/2394; 706/648/697/129/2043; Algorithms; Biological Specimen Banks; Brain; Brain - diagnostic imaging; Datasets; Deep Learning; Depth profiling; Humanities and Social Sciences; Humans; Kernels; Learning algorithms; Linear Models; Machine Learning; multidisciplinary; Neuroimaging - methods; Phenotype; Phenotypes; Predictions; Sample Size; Science; Science (multidisciplinary); Structure-function relationships; United Kingdom
Summary:	Recently, deep learning has unlocked unprecedented success in various domains, especially using images, text, and speech. However, deep learning is only beneficial if the data have nonlinear relationships and if they are exploitable at available sample sizes. We systematically profiled the performance of deep, kernel, and linear models as a function of sample size on UKBiobank brain images against established machine learning references. On MNIST and Zalando Fashion, prediction accuracy consistently improves when escalating from linear models to shallow-nonlinear models, and further improves with deep-nonlinear models. In contrast, using structural or functional brain scans, simple linear models perform on par with more complex, highly parameterized models in age/sex prediction across increasing sample sizes. In sum, linear models keep improving as the sample size approaches ~10,000 subjects. Yet, nonlinearities for predicting common phenotypes from typical brain scans remain largely inaccessible to the examined kernel and deep learning methods. Schulz et al. systematically benchmark performance scaling with increasingly sophisticated prediction algorithms and with increasing sample size in reference machine-learning and biomedical datasets. Complicated nonlinear intervariable relationships remain largely inaccessible for predicting key phenotypes from typical brain scans.
ISSN:	2041-1723
DOI:	10.1038/s41467-020-18037-z
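
Illustrative note: the summary describes profiling model accuracy as a function of sample size for linear versus nonlinear learners. The sketch below is only a toy illustration of that kind of comparison and is not the authors' pipeline. Everything in it is an assumption made for illustration: the two-feature synthetic data, the nearest-centroid classifier standing in for a "linear model", the k-nearest-neighbour classifier standing in for a "nonlinear model", and all function names and sample sizes.

```python
import random

def make_data(n, nonlinear, rng):
    """Toy data (hypothetical): label depends linearly (x0 + x1 > 0)
    or nonlinearly, XOR-like (x0 * x1 > 0), on two uniform features."""
    X = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    y = [int((a * b > 0) if nonlinear else (a + b > 0)) for a, b in X]
    return X, y

def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def fit_centroid(X, y):
    """Stand-in linear model: nearest class centroid (straight-line boundary)."""
    cents = {}
    for c in (0, 1):
        pts = [x for x, t in zip(X, y) if t == c]
        cents[c] = (sum(p[0] for p in pts) / len(pts),
                    sum(p[1] for p in pts) / len(pts))
    return lambda q: min(cents, key=lambda c: sq_dist(q, cents[c]))

def fit_knn(X, y, k=5):
    """Stand-in nonlinear model: majority vote among the k nearest training points."""
    def predict(q):
        nearest = sorted(range(len(X)), key=lambda i: sq_dist(q, X[i]))[:k]
        return int(2 * sum(y[i] for i in nearest) > k)
    return predict

def accuracy(model, X, y):
    return sum(model(q) == t for q, t in zip(X, y)) / len(y)

def learning_curve(nonlinear, sizes=(50, 200, 800), n_test=500, seed=0):
    """Return (n_train, linear_accuracy, nonlinear_accuracy) per training size."""
    rng = random.Random(seed)
    X_test, y_test = make_data(n_test, nonlinear, rng)
    rows = []
    for n in sizes:
        X, y = make_data(n, nonlinear, rng)
        rows.append((n,
                     accuracy(fit_centroid(X, y), X_test, y_test),
                     accuracy(fit_knn(X, y), X_test, y_test)))
    return rows

if __name__ == "__main__":
    for label, flag in (("linear signal", False), ("nonlinear signal", True)):
        for n, acc_lin, acc_knn in learning_curve(flag):
            print(f"{label:16s} n={n:4d} centroid={acc_lin:.2f} knn={acc_knn:.2f}")
```

In this toy setting the nonlinear learner should only pull ahead when the signal itself is nonlinear and the training set is large enough to expose it, which mirrors the contrast the summary draws between the machine-learning reference datasets and the brain-imaging data.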