Comparative Analysis of Cross-Validation Methods on PPMI Dataset
Published in | Medical Measurement and Applications (MEMEA), IEEE International Workshop on pp. 1 - 5 |
---|---|
Main Authors | , , , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 26.06.2024 |
ISSN | 2837-5882 |
DOI | 10.1109/MeMeA60663.2024.10596885 |
Summary: | Artificial Intelligence (AI) is significantly impacting the management of neurodegenerative diseases in radiology and medical imaging. AI plays a pivotal role in identifying biomarkers in rare disorders for prediction and classification. However, limited dataset size remains a crucial issue, and the high dimensionality of the feature space combined with a restricted cohort exacerbates it. Here, we use a dataset sourced from the Parkinson's Progression Markers Initiative (PPMI). The study includes 100 Parkinson's disease (PD) patients and 73 healthy controls (HC), with 160 features spanning imaging and cognitive data. The dataset is partitioned into training (80%) and test (20%) sets. We compare two cross-validation (CV) methods: nested CV, employed for unbiased performance estimation on the training data, and shuffle CV, which is a drawback of nested CV. Moreover, a novel hybrid approach to feature selection is applied to the MRI data to improve the selection of relevant features. The methodology combines correlation analysis with SHAP (SHapley Additive exPlanations). Subsequently, XGBoost models are deployed to classify patients versus healthy subjects. Our findings show that nested CV outperforms shuffle CV when validated on an independent test set, indicating its robustness. Furthermore, our study underscores the significance of identifying informative features, such as differences in brain regions, to enhance the robustness and accuracy of the model. Overall, our research contributes to the advancement of AI applications in neurodegenerative disease management by addressing challenges associated with small dataset sizes and emphasizing the importance of effective feature selection. |
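The summary names an 80/20 train/test partition and two CV schemes but gives no fold counts, stratification rule, or implementation details. The sketch below is therefore only an illustration of how the two splitting schemes differ at the index level, not the authors' pipeline. It assumes that "shuffle CV" means repeated random shuffle-and-split, and the fold counts (5 outer, 3 inner), the 20% validation fraction, and all function names are hypothetical choices made for this example. No model, SHAP step, or XGBoost call is included, and the splits are not stratified by class.

```python
import random

def train_test_split_idx(n, test_frac, rng):
    """Shuffle indices 0..n-1 and hold out a fraction as the test set."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = round(n * test_frac)
    return idx[n_test:], idx[:n_test]

def k_fold(indices, k):
    """Yield (train, val) pairs; the k validation folds partition `indices`."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def shuffle_split(indices, n_splits, val_frac, rng):
    """Yield (train, val) pairs from independent random reshuffles.

    Unlike k-fold, validation sets from different splits may overlap,
    and some samples may never be used for validation.
    """
    n_val = round(len(indices) * val_frac)
    for _ in range(n_splits):
        perm = indices[:]
        rng.shuffle(perm)
        yield perm[n_val:], perm[:n_val]

def nested_cv(indices, outer_k, inner_k):
    """Yield (outer_train, outer_val, inner_splits) for each outer fold.

    Hyperparameters and feature selection would be tuned on inner_splits
    only; the tuned model is then scored once on outer_val.
    """
    for outer_train, outer_val in k_fold(indices, outer_k):
        yield outer_train, outer_val, list(k_fold(outer_train, inner_k))

rng = random.Random(0)            # fixed seed, an assumption for repeatability
n_subjects = 100 + 73             # PD patients + healthy controls (from the summary)
train_idx, test_idx = train_test_split_idx(n_subjects, 0.20, rng)
print(len(train_idx), len(test_idx))   # 138 training subjects, 35 test subjects

# Nested CV: every training subject is validated exactly once in the outer loop.
for outer_train, outer_val, inner_splits in nested_cv(train_idx, outer_k=5, inner_k=3):
    print("outer:", len(outer_train), len(outer_val), "inner folds:", len(inner_splits))

# Shuffle CV: five independent random 80/20 resplits of the same training set.
for tr, val in shuffle_split(train_idx, n_splits=5, val_frac=0.20, rng=rng):
    print("shuffle:", len(tr), len(val))
```

In both schemes the 35 held-out test subjects are never touched during validation. The structural difference is that nested CV keeps tuning (inner folds) separate from scoring (outer fold), whereas a single-level shuffle split reuses the same validation data for both.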