Harmonization for Parkinson’s Disease Multi-Dataset T1 MRI Morphometry Classification

Bibliographic Details
Published in	NeuroSci Vol. 5; no. 4; pp. 600 - 613
Main Authors	Saqib, Mohammed; Horovitz, Silvina G.
Format	Journal Article
Language	English
Published	Switzerland MDPI AG 29.11.2024
Subjects	batch effect; brain morphometry; classifier; data harmonization; magnetic resonance imaging; Parkinson’s disease
Summary:	Classifying disease and healthy volunteer cohorts is a useful clinical alternative to traditional group statistics because it yields individualized predictions. Classifiers for neurodegenerative disease can be trained on structural MRI morphometry, but they require large multi-scanner datasets, which introduce confounding batch effects. We test ComBat, a common harmonization model, in an example application that classifies subjects with Parkinson’s disease versus healthy volunteers, and we identify common pitfalls, including data leakage. We used a multi-dataset cohort of 372 subjects (216 with Parkinson’s disease, 156 healthy volunteers) from 11 identified scanners. We extracted both FreeSurfer and Jacobian determinant morphometry to compare single-scanner and multi-scanner classification pipelines. We confirm the presence of batch effects by running single-scanner classifiers, whose AUCs diverged widely across scanner-specific datasets (mean: 0.651 ± 0.144). Multi-scanner classifiers that considered neurobiological batch effects between sites could easily achieve a test AUC of 0.902, whereas pipelines that prevented data leakage achieved a test AUC of only 0.550. We conclude that batch effects remain a major issue for classification problems, such that even impressive single-scanner classifiers are unlikely to generalize to multiple scanners, and that correcting for batch effects in a classification problem must avoid circularity and the reporting of overly optimistic results.
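The data-leakage pitfall named in the summary can be illustrated with a minimal sketch. This is not ComBat and not the authors' pipeline: it is a simplified per-scanner location and scale adjustment with no covariate preservation and no empirical Bayes step, chosen only to show where the train/test boundary has to sit. The function names and the toy values are hypothetical.

```python
from statistics import mean, pstdev


def fit_scanner_stats(values, scanners):
    """Estimate per-scanner mean and standard deviation from TRAINING data only."""
    grouped = {}
    for value, scanner in zip(values, scanners):
        grouped.setdefault(scanner, []).append(value)
    # Fall back to a scale of 1.0 if a scanner shows zero variance.
    return {s: (mean(vs), pstdev(vs) or 1.0) for s, vs in grouped.items()}


def apply_scanner_stats(values, scanners, stats):
    """Remove each scanner's location and scale using already-fitted statistics."""
    return [(v - stats[s][0]) / stats[s][1] for v, s in zip(values, scanners)]


# Hypothetical toy data: one morphometric feature measured on two scanners.
train_values = [2.0, 4.0, 10.0, 14.0]
train_scanners = ["A", "A", "B", "B"]
test_values = [3.0, 12.0]
test_scanners = ["A", "B"]

# Leakage-free order: estimate on the training split, then reuse on the test split.
stats = fit_scanner_stats(train_values, train_scanners)
train_harmonized = apply_scanner_stats(train_values, train_scanners, stats)
test_harmonized = apply_scanner_stats(test_values, test_scanners, stats)
```

Estimating the scanner statistics on the pooled train and test data before splitting would let test-set information shape the features the classifier is trained on, which is the circularity the summary warns about. In this sketch a scanner that never appears in the training split raises a `KeyError`, which makes the limitation explicit instead of hiding it.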
ISSN:	2673-4087
DOI:	10.3390/neurosci5040042