Comparison of Various Machine Learning Methods in Diagnosis of Hypertension in Diabetics with/without Consideration of Costs

Bibliographic Details
Published in	Iranian journal of epidemiology Vol. 11; no. 4; pp. 46 - 54
Main Authors	M Teimouri, E Ebrahimi, SM Alavinia
Format	Journal Article
Language	Persian
Published	Tehran University of Medical Sciences 01.12.2016
Subjects	classification cost sensitive models diabetes hypertension machine learning

Summary:	Background and Objectives: Diabetic patients are always at risk of hypertension. The main goal of this paper was to design a native cost-sensitive model for diagnosing hypertension among diabetics that takes prior probabilities into account. Methods: We designed a cost-sensitive model for the diagnosis of hypertension in diabetic patients, considering the distribution of the disease in the general population. The data mining algorithms used were Decision Tree, Artificial Neural Network, K-Nearest Neighbors, Support Vector Machine, and Logistic Regression. The data set came from Azarbayjan-e-Sharqi, Iran. Results: For people with diabetes, a systolic blood pressure above 130 mm Hg increased the risk of hypertension. In the non-cost-sensitive scenario, Youden's index was around 68%. In the cost-sensitive scenario, the highest Youden's index (47.11%) was obtained by the Neural Network. However, in the cost-sensitive scenario the value of the imposed cost was important, and Decision Tree and Logistic Regression showed better performance. Conclusion: When diagnosing a disease, misclassification costs and prior probabilities are the most important factors, rather than merely minimizing classification error on the data set.
ISSN:	1735-7489
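
The summary compares models by Youden's index and by a cost that depends on misclassification costs and prior probabilities. As a minimal sketch of how those two quantities are commonly computed from a confusion matrix (not the authors' implementation), the Python below uses the standard definition J = sensitivity + specificity - 1 and a prior-weighted expected cost. The function names, the confusion-matrix counts, the prior, and the cost values are all hypothetical and chosen only for illustration; none of them come from the paper.

```python
def youden_index(tp, fn, tn, fp):
    """Youden's index J = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity + specificity - 1.0


def expected_cost(tp, fn, tn, fp, prior_pos, cost_fn, cost_fp):
    """Expected misclassification cost per patient.

    Error rates are taken from the confusion matrix, then re-weighted
    by an assumed population prior (prior_pos) instead of the class
    balance of the data set. Correct decisions are assumed to cost 0.
    """
    false_negative_rate = fn / (tp + fn)
    false_positive_rate = fp / (tn + fp)
    return (prior_pos * false_negative_rate * cost_fn
            + (1.0 - prior_pos) * false_positive_rate * cost_fp)


# Hypothetical counts, prior, and costs (illustrative only).
j = youden_index(tp=80, fn=20, tn=90, fp=10)
cost = expected_cost(tp=80, fn=20, tn=90, fp=10,
                     prior_pos=0.3, cost_fn=5.0, cost_fp=1.0)
print(f"Youden's index: {j:.2f}")
print(f"Expected cost per patient: {cost:.2f}")
```

Under these assumptions, a model with the highest Youden's index is not necessarily the one with the lowest expected cost, because the cost weights false negatives and false positives differently and applies the population prior rather than the sample proportions.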