Mind the gap: Performance metric evaluation in brain‐age prediction

Bibliographic Details
Published in Human brain mapping Vol. 43; no. 10; pp. 3113-3129
Main Authors Lange, Ann‐Marie G., Anatürk, Melis, Rokicki, Jaroslav, Han, Laura K. M., Franke, Katja, Alnæs, Dag, Ebmeier, Klaus P., Draganski, Bogdan, Kaufmann, Tobias, Westlye, Lars T., Hahn, Tim, Cole, James H.
Format Journal Article
Language English
Published Hoboken, USA John Wiley & Sons, Inc 01.07.2022
Summary: Estimating age based on neuroimaging‐derived data has become a popular approach to developing markers for brain integrity and health. While a variety of machine‐learning algorithms can provide accurate predictions of age based on brain characteristics, there is significant variation in model accuracy reported across studies. We predicted age in two population‐based datasets, and assessed the effects of age range, sample size and age‐bias correction on the model performance metrics Pearson's correlation coefficient (r), the coefficient of determination (R²), Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). The results showed that these metrics vary considerably depending on cohort age range; r and R² values are lower when measured in samples with a narrower age range. RMSE and MAE are also lower in samples with a narrower age range due to smaller errors/brain age delta values when predictions are closer to the mean age of the group. Across subsets with different age ranges, performance metrics improve with increasing sample size. Performance metrics further vary depending on prediction variance as well as mean age difference between training and test sets, and age‐bias corrected metrics indicate high accuracy, also for models showing poor initial performance. In conclusion, performance metrics used for evaluating age prediction models depend on cohort and study‐specific data characteristics, and cannot be directly compared across different studies. Since age‐bias corrected metrics generally indicate high accuracy, even for poorly performing models, inspection of uncorrected model results provides important information about underlying model attributes such as prediction variance.

While a variety of machine‐learning algorithms can provide accurate predictions of age based on brain characteristics, there is significant variation in model accuracy reported across studies. We predicted age based on neuroimaging data in two population‐based datasets, and assessed the effects of age range, sample size, and age‐bias correction on the model performance metrics r, R², Root Mean Squared Error, and Mean Absolute Error. The results showed that these metrics depend on cohort and study‐specific data characteristics including age range and sample size, and cannot be directly compared across different studies. Age‐bias corrected metrics indicate high accuracy, even for poorly performing models, and inspection of uncorrected model results thus provides important information about underlying model attributes such as prediction variance.
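
The dependence of the four metrics on age range and on age‐bias correction can be illustrated with a small simulation. The following is a minimal sketch and not the authors' code: the data are synthetic, the names `performance_metrics` and `age_bias_correct` are hypothetical, and the linear correction (fit predicted age on chronological age, then add back the fitted bias) is assumed here as one commonly used approach rather than taken from this record.

```python
import numpy as np

def performance_metrics(age, predicted):
    """Return Pearson's r, R2, RMSE and MAE for predicted vs. chronological age."""
    age = np.asarray(age, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    delta = predicted - age  # brain age delta (prediction error)
    ss_res = np.sum(delta ** 2)
    ss_tot = np.sum((age - age.mean()) ** 2)
    return {
        "r": np.corrcoef(age, predicted)[0, 1],
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": np.sqrt(np.mean(delta ** 2)),
        "MAE": np.mean(np.abs(delta)),
    }

def age_bias_correct(age, predicted):
    """Assumed linear correction: predicted + [age - (alpha * age + beta)]."""
    alpha, beta = np.polyfit(age, predicted, 1)  # slope, intercept
    return predicted + (age - (alpha * age + beta))

# Synthetic cohort (assumption): predictions are pulled toward the mean age
# and carry Gaussian noise, mimicking the usual age bias.
rng = np.random.default_rng(0)
age = rng.uniform(20, 80, size=5000)
predicted = age.mean() + 0.7 * (age - age.mean()) + rng.normal(0, 5, age.size)

# Same predictions, evaluated on the full range and on a narrow age band.
full = performance_metrics(age, predicted)
band = (age >= 45) & (age <= 55)
narrow = performance_metrics(age[band], predicted[band])

# A deliberately poor model: almost no age signal in the predictions.
predicted_poor = age.mean() + 0.1 * (age - age.mean()) + rng.normal(0, 5, age.size)
poor = performance_metrics(age, predicted_poor)
poor_corrected = performance_metrics(age, age_bias_correct(age, predicted_poor))

for label, m in [("full range", full), ("narrow range", narrow),
                 ("poor model", poor), ("poor model, corrected", poor_corrected)]:
    print(label, {k: round(float(v), 3) for k, v in m.items()})
```

In this sketch the narrow band yields lower r and R² but also lower RMSE and MAE than the full range, and the corrected metrics for the poor model look strong because the correction reuses chronological age. Both patterns are consistent with the summary above, although the simulated magnitudes carry no empirical meaning.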
Bibliography: Funding information
Collaboratory on Research Definitions for Reserve and Resilience in Cognitive Aging and Dementia, Grant/Award Number: 5R24AG061421‐03; Deutsche Forschungsgemeinschaft, Grant/Award Numbers: FR 3709/1‐2, HA7070/2‐2, HA7070/3, HA7070/4; ERA‐net Cofound, Grant/Award Number: ERA PerMed project "IMPLEMENT"; Fondation Leenaards; H2020 European Research Council, Grant/Award Number: 802998; HDH Wills 1965 Charitable Trust, Grant/Award Number: 1117747; Helse Sør‐Øst RHF, Grant/Award Numbers: 2015073, 2019107; Interdisciplinary Center for Clinical Research of the Jena University hospital, Grant/Award Number: AMSP 07; Interdisciplinary Center for Clinical Research of the Medical Faculty of Münster, Grant/Award Number: MzH 3/020/20; Medical Research Council, Grant/Award Numbers: G1001354, MR/R024790/2; Norges Forskningsråd, Grant/Award Numbers: 223273, 249795, 273345, 276082; Swiss National Science Foundation, Grant/Award Numbers: 32003B_135679, 32003B_159780, 324730_192755, CRSK‐3_190185, PZ00P3_193658
SIGMA2/NS9666S
ISSN: 1065-9471, 1097-0193
DOI: 10.1002/hbm.25837