A stacking ensemble model for predicting the occurrence of carotid atherosclerosis


Bibliographic Details
Published in Frontiers in endocrinology (Lausanne) Vol. 15; p. 1390352
Main Authors Zhang, Xiaoshuai, Tang, Chuanping, Wang, Shuohuan, Liu, Wei, Yang, Wangxuan, Wang, Di, Wang, Qinghuan, Tang, Fang
Format Journal Article
Language English
Published Switzerland Frontiers Media S.A 23.07.2024
Summary: Carotid atherosclerosis (CAS) is a significant risk factor for cardio-cerebrovascular events. This study uses stacking ensemble machine learning to improve the prediction of CAS occurrence, incorporating a wide range of predictors, including endocrine-related markers. Using data from a routine health check-up cohort, five individual prediction models for CAS were built with logistic regression (LR), random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost) and gradient boosting decision tree (GBDT) methods. A stacking ensemble algorithm was then used to integrate the base models, with the aim of improving predictive ability and reducing overfitting. Finally, SHAP values were used to analyze variable importance at both the overall and individual levels, with a focus on the impact of endocrine-related variables. Of the 1669 subjects in the cohort, 441 were diagnosed with CAS. Seventeen variables were selected as predictors. The ensemble model outperformed the individual models, with AUCs of 0.893 in the testing set and 0.861 in the validation set. It achieved the best accuracy, precision, recall and F1 score in the validation set and performed well in the testing set. Carotid stenosis and age emerged as the most significant predictors, alongside notable contributions from endocrine-related factors. The ensemble model shows improved accuracy and generalizability in predicting CAS risk, which supports its use in identifying individuals at high risk. The approach analyzes a comprehensive set of predictors, including endocrine markers, and its results support a role for endocrine dysfunction in CAS development. It is a promising tool for identifying high-risk individuals for the prevention of CAS and cardio-cerebrovascular diseases.
Edited by: Eliza Russu, George Emil Palade University of Medicine, Pharmacy, Sciences and Technology of Târgu Mureş, Romania
Reviewed by: Niranjana Sampathila, Manipal Academy of Higher Education, India
Stoian Adina, George Emil Palade University of Medicine, Pharmacy, Sciences and Technology of Târgu Mureş, Romania
ISSN: 1664-2392
DOI: 10.3389/fendo.2024.1390352