Ischemic stroke prediction using machine learning in elderly Chinese population: The Rugao Longitudinal Ageing Study
Published in | Brain and behavior Vol. 13; no. 12; pp. e3307 - n/a |
---|---|
Main Authors | , , , , , , , |
Format | Journal Article |
Language | English |
Published | United States: John Wiley and Sons Inc, 01.12.2023; Wiley |
Summary: | Objective
To compare logistic regression (LR) with machine learning (ML) models for predicting the risk of ischemic stroke in an elderly population in China.
Methods
We used 2208 records from the Rugao Longitudinal Ageing Study (RLAS) to assess ischemic stroke risk prediction. Input variables included 103 phenotypes. For 3‐year ischemic stroke risk prediction, we compared the discrimination and calibration of the LR model with those of five ML methods: Random Forest (RF), Gaussian kernel Support Vector Machine (SVM), Multilayer Perceptron (MLP), K‐Nearest Neighbors (KNN), and Gradient Boosting Decision Tree (GBDT).
Results
Age, pulse, waist circumference, education level, β2‐microglobulin, homocysteine, cystatin C, folate, free triiodothyronine, platelet distribution width, QT interval, and QTc interval were significant predictors of ischemic stroke. For ischemic stroke prediction, the ML approach drew on more biochemical and ECG‐related multidimensional phenotypic indicators than the LR model, which placed more importance on general demographic indicators. Among the models, SVM provided the best discrimination and calibration, outperforming the LR model (C‐index: 0.79 vs. 0.71, an 11.27% improvement in model utility), and it performed best in both validation and test data.
Conclusion
In a comparison of LR with five ML models, combining ML with multiple phenotypes yielded more accurate ischemic stroke prediction. Together with other studies of elderly populations in China, these results indicate that ML techniques, especially SVM, show good long‐term predictive performance, suggesting potential value for ML in clinical practice.
The Gaussian kernel Support Vector Machine (SVM) is an effective ML strategy for ischemic stroke risk prediction in a large population with a multidimensional phenotypic dataset. |
---|---|
ISSN: | 2162-3279 |
DOI: | 10.1002/brb3.3307 |
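
The Results summary above reports discrimination as a C‐index (0.79 for SVM vs. 0.71 for LR) alongside an 11.27% improvement. As a rough illustration only, the sketch below shows how a C‐index can be computed for a binary outcome and how a relative gain over a baseline is calculated. The function names `c_index` and `relative_improvement` and the toy data are hypothetical and not taken from the article. The record does not state how the authors derived the 11.27% figure; the relative‐gain arithmetic here simply happens to reproduce it from the two reported C‐index values.

```python
from itertools import product


def c_index(risk_scores, outcomes):
    """Concordance index for a binary outcome (hypothetical helper).

    Fraction of (case, non-case) pairs in which the case received the
    higher predicted risk; tied scores count as half-concordant.
    """
    cases = [s for s, y in zip(risk_scores, outcomes) if y == 1]
    controls = [s for s, y in zip(risk_scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score, control_score in product(cases, controls):
        if case_score > control_score:
            concordant += 1.0
        elif case_score == control_score:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))


def relative_improvement(new, baseline):
    """Percentage gain of `new` over `baseline` (hypothetical helper)."""
    return (new - baseline) / baseline * 100.0


# Toy data, invented for illustration; not RLAS records.
toy_outcomes = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.5, 0.1]
print(f"toy C-index: {c_index(toy_scores, toy_outcomes):.3f}")

# C-index values as reported in the summary (SVM vs. LR).
print(f"relative gain: {relative_improvement(0.79, 0.71):.2f}%")
```

A pairwise loop like this is quadratic in the number of records, which is acceptable for a cohort of a few thousand; a rank‐based formulation would be the usual choice for larger datasets.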