A rule extraction approach from support vector machines for diagnosing hypertension among diabetics
Published in | Expert Systems with Applications, Vol. 130, pp. 188-205 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | New York; Elsevier Ltd; 15.09.2019; Elsevier BV |
Subjects | |
Summary: | • Classification of datasets on diabetes and its complications is considered. • Five feature selection algorithms are used to choose significant features. • A hybrid rule-extraction method that generates comprehensible rule sets is developed. • Experiments were performed on six datasets: one new and five public. • The proposed approach outperforms ten state-of-the-art classifiers.
Diabetes mellitus is a major non-communicable disease whose rising prevalence has made it an epidemic and a public health crisis worldwide. One of its several life-threatening complications is hypertension (high blood pressure), which mostly remains undiagnosed and untreated until symptoms become severe. Diabetic complications can be greatly reduced or prevented by early detection of individuals at risk. In the recent past, machine learning classification algorithms have been widely applied to diagnosing diabetes, but very few studies have addressed detecting hypertension among diabetic subjects. In particular, existing rule-based models fail to produce comprehensible rule sets. To address this limitation, this paper develops a hybrid approach for extracting rules from support vector machines (SVMs). A feature selection mechanism is introduced to select significantly associated features from the dataset. XGBoost is used to convert the black-box SVM model into a comprehensible decision-making tool. A new dataset was obtained from Pt. JNM Medical College, Raipur, India, comprising 300 diabetic subjects: 108 hypertensives and 192 normotensives. In addition, five public diabetes-related datasets are used to assess how well the results generalize. Experiments show that the proposed model outperforms ten benchmark classifiers. Friedman rank and post hoc Bonferroni-Dunn tests demonstrate the statistical significance of the proposed method's advantage over the others. |
---|---|
ISSN: | 0957-4174, 1873-6793 |
DOI: | 10.1016/j.eswa.2019.04.029 |
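
The summary names Friedman rank and post hoc Bonferroni-Dunn tests as the basis for the significance claim. As a rough illustration of how such a comparison is usually computed, the sketch below ranks classifiers per dataset, derives the Friedman chi-square statistic, and computes the Bonferroni-Dunn critical difference in mean rank. It follows the standard textbook formulation of both tests; the abstract does not give the paper's exact procedure, so this is an assumption. The accuracy table is hypothetical (three classifiers on six datasets, invented for illustration, not results from the paper), and the function names are likewise illustrative.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(scores):
    """Rank one dataset's scores (higher is better, rank 1 = best).
    Tied scores share the mean of the ranks they span."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(table):
    """table[d][c] is the score of classifier c on dataset d.
    Returns the Friedman chi-square statistic and the mean rank per classifier."""
    n, k = len(table), len(table[0])
    per_dataset = [average_ranks(row) for row in table]
    mean_ranks = [sum(r[c] for r in per_dataset) / n for c in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return chi2, mean_ranks


def bonferroni_dunn_cd(k, n, alpha=0.05):
    """Critical difference in mean rank when k classifiers are compared
    against one control on n datasets (two-tailed, Bonferroni-corrected)."""
    q = NormalDist().inv_cdf(1 - alpha / (2 * (k - 1)))
    return q * sqrt(k * (k + 1) / (6 * n))


# HYPOTHETICAL accuracies: rows are datasets, columns are classifiers.
# Column 0 plays the role of the proposed method (the control).
ACCURACIES = [
    [0.90, 0.85, 0.80],
    [0.88, 0.86, 0.81],
    [0.92, 0.87, 0.85],
    [0.79, 0.80, 0.75],
    [0.95, 0.90, 0.91],
    [0.84, 0.82, 0.78],
]

chi2, mean_ranks = friedman_statistic(ACCURACIES)
cd = bonferroni_dunn_cd(k=len(ACCURACIES[0]), n=len(ACCURACIES))

# A competitor differs significantly from the control when its mean rank
# exceeds the control's mean rank by more than the critical difference.
significant = [abs(r - mean_ranks[0]) > cd for r in mean_ranks[1:]]
print(chi2, mean_ranks, cd, significant)
```

With the paper's setup (the proposed method plus ten benchmark classifiers on six datasets), the same functions would be called with k = 11 and n = 6. The Friedman statistic is then compared against a chi-square distribution with k - 1 degrees of freedom before the post hoc test is applied.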