Machine learning-based prediction of LDL cholesterol: performance evaluation and validation

Bibliographic Details
Published in: PeerJ (San Francisco, CA) Vol. 13; p. e19248
Main Authors: Meng, Jing-Bi; An, Zai-Jian; Jiang, Chun-Shan
Format: Journal Article
Language: English
Published: United States, PeerJ. Ltd, 09.04.2025
PeerJ, Inc
Summary: This study aimed to validate and optimize a machine learning algorithm for accurately predicting low-density lipoprotein cholesterol (LDL-C) levels, addressing limitations of traditional formulas, particularly in hypertriglyceridemia. Various machine learning models (linear regression, K-nearest neighbors [KNN], decision tree, random forest, eXtreme Gradient Boosting [XGB], and multilayer perceptron [MLP] regressor) were compared with conventional formulas (Friedewald, Martin, and Sampson) using lipid profiles from 120,174 subjects (2020-2023). Predictive performance was evaluated against measured LDL-C values using R-squared (R²), mean squared error (MSE), and Pearson correlation coefficient (PCC). Machine learning models outperformed traditional methods, with random forest and XGB achieving the highest accuracy (R² = 0.94, MSE = 89.25) on the internal dataset. Among the traditional formulas, the Sampson method performed best but showed reduced accuracy in the high triglyceride (TG) group (TG > 300 mg/dL). In contrast, the machine learning models maintained high predictive power across all TG levels. They therefore offer more accurate LDL-C estimates, especially in high-TG contexts where traditional formulas are less reliable, and could enhance cardiovascular risk assessment, potentially leading to more informed treatment decisions and improved patient outcomes.
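The comparison described in the summary can be illustrated with a minimal sketch, which is not the authors' code. It implements two of the conventional formulas as they are commonly published (Friedewald, and the Sampson equation with its usual coefficients, which should be checked against the original publication) together with the three metrics named above. The Martin formula is omitted because it depends on a lookup table of adjustable factors. All function names and the sample values are hypothetical, and all concentrations are assumed to be in mg/dL.

```python
import math


def friedewald_ldl(tc, hdl, tg):
    """Friedewald estimate (mg/dL): LDL-C = TC - HDL-C - TG/5."""
    return tc - hdl - tg / 5.0


def sampson_ldl(tc, hdl, tg):
    """Sampson estimate (mg/dL), coefficients as commonly cited (assumption)."""
    non_hdl = tc - hdl
    return (tc / 0.948 - hdl / 0.971
            - (tg / 8.56 + tg * non_hdl / 2140.0 - tg ** 2 / 16100.0)
            - 9.44)


def mse(y_true, y_pred):
    """Mean squared error between measured and estimated values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


if __name__ == "__main__":
    # Made-up lipid panels (TC, HDL-C, TG) and made-up measured LDL-C values,
    # used only to show how the pieces fit together.
    panels = [(200, 50, 150), (240, 45, 320), (180, 60, 90)]
    measured = [122.0, 140.0, 103.0]

    for name, formula in [("Friedewald", friedewald_ldl), ("Sampson", sampson_ldl)]:
        estimates = [formula(tc, hdl, tg) for tc, hdl, tg in panels]
        print(name,
              "R2=%.3f" % r_squared(measured, estimates),
              "MSE=%.2f" % mse(measured, estimates),
              "PCC=%.3f" % pearson(measured, estimates))
```

A trained regressor (for example a random forest) could be scored with the same three functions by passing its predictions in place of the formula estimates, which is one way to put models and formulas on a common footing.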
ISSN: 2167-8359, 2376-5992
DOI: 10.7717/peerj.19248