Machine Learning Framework for Early Detection of Chronic Kidney Disease Stages Using Optimized Estimated Glomerular Filtration Rate
Published in | IEEE access Vol. 13; pp. 78057 - 78072 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2025 |
Summary: | Chronic Kidney Disease (CKD) is a progressive condition that requires accurate diagnosis and staging for effective clinical management. Conventional CKD diagnosis relies on the estimated Glomerular Filtration Rate (eGFR), a measure of kidney function derived from serum biomarkers such as serum creatinine (SCr) and cystatin C (SCysC). However, eGFR calculations may be inaccurate when applied to diverse patient populations. This study proposes a machine learning (ML) system that integrates regression-based eGFR estimation, metaheuristic optimization using the Grey Wolf Optimizer (GWO), and multi-class classification with various ML models to enhance CKD staging and classification. The system estimates eGFR using three established CKD Epidemiology Collaboration (CKD-EPI) equations based on SCr, SCysC, and their combination. Two regression models, Linear Regression (LR) and Support Vector Regression (SVR), are assessed for predictive performance. SVR outperforms LR: for $\text{CKD-EPI}_{\text{SCr-SCysC}}$, it achieves a root mean squared error (RMSE) of 3.03, a mean absolute percentage error (MAPE) of 2.97%, and a coefficient of determination ($R^{2}$) of 0.97. Hyperparameter tuning with GWO yields a 37.3% reduction in RMSE, a 37.4% reduction in MAPE, and a 2.06% improvement in $R^{2}$. The refined eGFR estimates are then fed into several algorithms for CKD stage classification: Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and Extreme Gradient Boosting (XGBoost). Among these, XGBoost achieves the highest classification accuracy, 97.76%, with an F1-score of 97.45%, demonstrating its effectiveness in CKD staging. Shapley Additive Explanations (SHAP) provide global and local feature importance insights, supporting clinical decision-making and model transparency. Future research will validate the model on larger and more diverse datasets and will incorporate additional clinical parameters, including biomarkers and genetic data, to improve the precision of CKD risk prediction. This research advances AI-driven nephrology by providing a scalable, interpretable, and highly accurate solution for diagnosing and managing CKD. |
---|---|
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2025.3565549 |
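
The summary reports regression quality as RMSE, MAPE, and R² and then maps eGFR estimates to CKD stages. As a rough illustration only, the sketch below computes those three metrics with their standard definitions and assigns a stage from an eGFR value. The stage cut-offs are the widely used KDIGO GFR categories (G1 to G5, in mL/min/1.73 m²); this record does not state which staging scheme the paper uses, so that choice is an assumption. The function names and sample values are hypothetical and are not taken from the article.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted eGFR."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (reference values must be non-zero)."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kdigo_stage(egfr):
    """Map an eGFR value (mL/min/1.73 m^2) to a KDIGO GFR category.

    Assumption: KDIGO cut-offs; the paper's own staging rule may differ.
    """
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


if __name__ == "__main__":
    # Hypothetical reference and predicted eGFR values, for illustration only.
    reference = [100.0, 50.0]
    predicted = [90.0, 60.0]
    print(rmse(reference, predicted), mape(reference, predicted), r2(reference, predicted))
    print([kdigo_stage(v) for v in predicted])
```

In a pipeline like the one the summary describes, metrics of this kind would score the regression step, while the stage labels would serve as targets for the classifiers.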