Harmonization for Parkinson’s Disease Multi-Dataset T1 MRI Morphometry Classification
Classification of disease and healthy volunteer cohorts provides a useful clinical alternative to traditional group statistics due to individualized, personalized predictions. Classifiers for neurodegenerative disease can be trained on structural MRI morphometry, but require large multi-scanner data...
Saved in:
Published in | NeuroSci Vol. 5; no. 4; pp. 600 - 613 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published |
Switzerland
MDPI AG
29.11.2024
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Classification of disease and healthy volunteer cohorts provides a useful clinical alternative to traditional group statistics due to individualized, personalized predictions. Classifiers for neurodegenerative disease can be trained on structural MRI morphometry, but require large multi-scanner datasets, introducing confounding batch effects. We test ComBat, a common harmonization model, in an example application to classify subjects with Parkinson’s disease from healthy volunteers and identify common pitfalls, including data leakage. We used a multi-dataset cohort of 372 subjects (216 with Parkinson’s disease, 156 healthy volunteers) from 11 identified scanners. We extracted both FreeSurfer and the determinant of Jacobian morphometry to compare single-scanner and multi-scanner classification pipelines. We confirm the presence of batch effects by running single scanner classifiers which could achieve wildly divergent AUCs on scanner-specific datasets (mean:0.651 ± 0.144). Multi-scanner classifiers that considered neurobiological batch effects between sites could easily achieve a test AUC of 0.902, though pipelines that prevented data leakage could only achieve a test AUC of 0.550. We conclude that batch effects remain a major issue for classification problems, such that even impressive single-scanner classifiers are unlikely to generalize to multiple scanners, and that solving for batch effects in a classifier problem must avoid circularity and reporting overly optimistic results. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
ISSN: | 2673-4087 2673-4087 |
DOI: | 10.3390/neurosci5040042 |