Machine learning approaches for predicting frailty base on multimorbidities in US adults using NHANES data (1999–2018)
Published in | Computer methods and programs in biomedicine update Vol. 6; p. 100164 |
---|---|
Main Authors | , , , , , , , , , , , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V, 2024 |
Subjects | |
Summary: | • Based on NHANES data from 46,187 participants, we demonstrated that 49 is a critical age for the onset of frailty, which begins to develop gradually after that age.
• The tuned XGBoost model we built has high accuracy, consistency and clinical practicality, and can quickly complete a frailty risk assessment from the patient's general information and medical history.
• This study is the first to construct a frailty prediction model for comorbid populations based on machine learning algorithms.
Global population aging has made age-related health challenges more common, particularly multimorbidity and frailty, but models for predicting frailty in people with multimorbidity remain a significant gap.
This cross-sectional study used data from the National Health and Nutrition Examination Survey (NHANES, 1999–2018). The association between age and frailty was assessed using a restricted cubic spline (RCS) model, while weighted, adjusted multivariable logistic regression evaluated the effect of diseases on frailty. In the machine learning process, feature selection for the frailty prediction model involved three algorithms. Model performance was optimized using nested cross-validation and tested with six algorithms: Decision Tree, Logistic Regression, k-Nearest Neighbor, Random Forest, Recursive Partitioning and Regression Trees, and eXtreme Gradient Boosting (XGBoost). We used the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AU-PRC) to evaluate the six algorithms, select the optimal model, and test the discrimination and consistency of the optimal model.
The study included 46,187 participants, of whom 6,009 had frailty. RCS analysis showed a non-linear association between age and frailty, with a turning point at 49 years. The key variables identified were anemia, arthritis, diabetes mellitus, coronary heart disease, and hypertension. In the machine learning process, feature selection yielded an optimal data set of 13 variables. Through nested cross-validation, a total of 31,900 models were built using the six algorithms. The XGBoost model showed the highest performance (AUC = 0.8828 and AU-PRC = 0.624) and performed well in both discrimination and calibration.
We found that the balance between physiological reserve and external aggression is maintained until about 49 years of age. In addition, chronic diseases are trigger factors for frailty, while acute diseases are contributing factors that exacerbate the body's rapid decline. Finally, the XGBoost frailty prediction model, with its simplicity, high performance and high clinical value, holds potential for clinical application. |
---|---|
ISSN: | 2666-9900 |
DOI: | 10.1016/j.cmpbup.2024.100164 |
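
The summary above reports two evaluation metrics, AUC and AU-PRC. As a rough illustration of what they measure, the following is a minimal pure-Python sketch, not the authors' code. The function names `roc_auc` and `average_precision` and the toy labels and scores are hypothetical, and average precision is used here as one common step-wise estimate of the area under the precision-recall curve; the record does not say which estimator the study used.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Average precision: mean of the precision at each true positive.

    Assumes no tied scores; ties would need to be grouped per threshold.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank cases from highest to lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            total += true_positives / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: 1 = frail, 0 = not frail.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]

auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
print(f"AUC = {auc:.4f}, AU-PRC (average precision) = {ap:.4f}")
```

One reason to report both metrics: an uninformative classifier scores about 0.5 on AUC regardless of class balance, whereas its expected AU-PRC is roughly the prevalence of the positive class. With 6,009 frailty cases among 46,187 participants, that baseline is about 0.13, which gives context for the reported AU-PRC of 0.624.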