Efficient prediction of early-stage diabetes using XGBoost classifier with random forest feature selection technique

Diabetes is one of the most common and serious diseases affecting human health. Early diagnosis and treatment are vital to prevent or delay complications related to diabetes. An automated diabetes detection system assists physicians in the early diagnosis of the disease and reduces complications by...

Full description

Saved in:
Bibliographic Details
Published inMultimedia tools and applications Vol. 82; no. 22; pp. 34163 - 34181
Main Author Gündoğdu, Serdar
Format Journal Article
LanguageEnglish
Published New York Springer US 01.09.2023
Springer Nature B.V
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Diabetes is one of the most common and serious diseases affecting human health. Early diagnosis and treatment are vital to prevent or delay complications related to diabetes. An automated diabetes detection system assists physicians in the early diagnosis of the disease and reduces complications by providing fast and precise results. This study aims to introduce a technique based on a combination of multiple linear regression (MLR), random forest (RF), and XGBoost (XG) to diagnose diabetes from questionnaire data. MLR-RF algorithm is used for feature selection, and XG is used for classification in the proposed system. The dataset is the diabetic hospital data in Sylhet, Bangladesh. It contains 520 instances, including 320 diabetics and 200 control instances. The performance of the classifiers is measured concerning accuracy (ACC), precision (PPV), recall (SEN, sensitivity), F1 score (F1), and the area under the receiver-operating-characteristic curve (AUC). The results show that the proposed system achieves an accuracy of 99.2%, an AUC of 99.3%, and a prediction time of 0.04825 seconds. The feature selection method improves the prediction time, although it does not affect the accuracy of the four compared classifiers. The results of this study are quite reasonable and successful when compared with other studies. The proposed method can be used as an auxiliary tool in diagnosing diabetes.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:1380-7501
1573-7721
DOI:10.1007/s11042-023-15165-8