Identifying network state-based Parkinson's disease subtypes using clustering and support vector machine models

Bibliographic Details
Published in Frontiers in psychiatry Vol. 16; p. 1453852
Main Authors Nguchu, Benedictor Alexander, Han, Yifei, Wang, Yanming, Shaw, Peter
Format Journal Article
Language English
Published Switzerland Frontiers Media S.A 13.02.2025
Summary:Parkinson's disease (PD) heterogeneity poses challenges to current efforts to discover the best therapeutic targets. Here, we employ K-means and hierarchical clustering algorithms on data from the Parkinson's Progression Markers Initiative (PPMI) to identify network-specific patterns that describe PD subtypes using the optimal number of brain features. The features were specifically the gray matter volume and dopaminergic features of the neostriatum, i.e., the caudate, putamen, and anterior putamen. We use machine learning (ML) algorithms, including Random Forest, Logistic Regression, and Support Vector Machine, to evaluate the diagnostic power of the brain features and network patterns in differentiating the subtypes and distinguishing from . Finally, we assess whether subtypes described through network-specific patterns depend on the genotype. Using data from 2396 subjects, we show that is highly associated with APOE ϵ2/ϵ4. Our findings reveal a significant dopamine transporter (DAT) deficit in the left and right structures of the caudate, putamen, and anterior putamen in subjects with compared to subjects with , and that APOE ϵ2/ϵ4 may accelerate DAT deficits and brain alterations in both PD and SWEDD. Furthermore, clinical symptoms of PD in SWEDD subjects, which are hardly validated by DAT scan data, can be explained by variations in APOE genotypes and other brain features beyond DAT. We show the existence of three network states across the whole dataset: the first network state describes the healthy controls (HC), while the remaining two describe the two PD subtypes, one typified by a mildly sparsely connected network (patterns) and the other characterized by more intensified network sparsity. We also show that the two subtypes of PD are characterized by distinctly different levels of total gray matter volume and DAT deficit.
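The clustering step named in the summary can be illustrated with a minimal K-means sketch. It is not the authors' pipeline: the two feature axes, the three synthetic groups, the helper names, and the deterministic farthest-point initialization are all assumptions made for illustration, and no PPMI data is used.

```python
import random
from math import dist


def farthest_point_init(points, k):
    """Pick k starting centroids deterministically (an assumption made
    here for reproducibility; the source does not state its init)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from its nearest chosen centroid.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so we have converged
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Synthetic stand-in data: three well-separated groups of 20 subjects,
# each described by two hypothetical features in arbitrary units.
rng = random.Random(0)
centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
points = [
    (cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5))
    for cx, cy in centers
    for _ in range(20)
]

centroids, labels = kmeans(points, k=3)
print(sorted(set(labels)))
```

On real data the number of clusters would have to be chosen (for example by comparing cluster-quality scores across several values of k) rather than fixed in advance as it is here.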
ML models show that features extracted from brain structure and network patterns can serve as reliable biomarkers for PD and its subtypes, with the highest performance (100% AUC, 99.3% accuracy, 0.993 F1) demonstrated by the fine-tuned SVM model. Our findings suggest that, while PD is generally associated with a larger DAT deficit in specific brain structures of the neostriatum, it exhibits intrinsic heterogeneity across individuals, which may stem from genetic factors. Such heterogeneity can be characterized by ML models and optimally mapped into network states, providing new insights to consider when developing personalized drugs.
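The three metrics quoted for the SVM (AUC, accuracy, F1) measure different things, which is why an AUC of 100% can sit beside an accuracy slightly below 100%. The sketch below computes them from scratch on made-up decision scores; the labels, scores, and zero threshold are hypothetical and are not taken from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative
    (ties count as half), which equals the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier output: signed distances to a decision boundary.
y_true = [0, 0, 0, 1, 1, 1]
scores = [-1.2, -0.4, 0.3, 0.5, 0.8, 1.5]
y_pred = [1 if s > 0 else 0 for s in scores]  # threshold at zero

auc = roc_auc(y_true, scores)
acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(f"AUC={auc:.3f} accuracy={acc:.3f} F1={f1:.3f}")
```

In this toy case every positive outranks every negative, so the AUC is perfect, yet one negative falls on the wrong side of the fixed threshold and lowers accuracy and F1. This is one way such a pattern can arise, not a claim about what happened in the reported model.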
Edited by: Elena Monai, University of Wisconsin-Madison, United States
Reviewed by: Antonio Daniele, Catholic University of the Sacred Heart, Rome, Italy; Ji-An Li, University of California, San Diego, United States
These authors have contributed equally to this work
ISSN:1664-0640
DOI:10.3389/fpsyt.2025.1453852