Effective Heart Disease Prediction Using Machine Learning Techniques

Bibliographic Details
Published in	Algorithms Vol. 16; no. 2; p. 88
Main Authors	Bhatt, Chintan M., Patel, Parth, Ghetia, Tarang, Mazzeo, Pier Luigi
Format	Journal Article
Language	English
Published	Basel MDPI AG 01.02.2023
Subjects	Accuracy; Algorithms; Analysis; Body fat; Cardiovascular disease; Care and treatment; Chronic illnesses; Classification; Clustering; Data mining; Datasets; Decision trees; Diabetes; Electrocardiography; Fatalities; Feature selection; heart disease; Heart diseases; Heart failure; k-modes; Low income groups; Machine learning; Medical imaging; Medical research; Medicine, Experimental; Methods; Model accuracy; model evaluation; Mortality; multilayer perceptron; Multilayer perceptrons; Pattern recognition; Prognosis; Regression analysis; Risk factors; Support vector machines; Technology application
Summary:	Diagnosis and prognosis of cardiovascular disease are crucial medical tasks: correct classification helps cardiologists provide proper treatment to the patient. Machine learning applications in the medical field have increased because such models can recognize patterns in data. Using machine learning to classify cardiovascular disease occurrence can help diagnosticians reduce misdiagnosis. This research develops a model that predicts cardiovascular disease, with the aim of reducing the fatalities it causes. The paper proposes k-modes clustering with Huang initialization as a method to improve classification accuracy. Four models are used: random forest (RF), decision tree classifier (DT), multilayer perceptron (MP), and XGBoost (XGB). GridSearchCV was used to tune the hyperparameters of the applied models. The proposed approach is applied to a real-world dataset of 70,000 instances from Kaggle. Models were trained on an 80:20 split of the data and achieved the following accuracies: decision tree, 86.37% (with cross-validation) and 86.53% (without cross-validation); XGBoost, 86.87% (with cross-validation) and 87.02% (without cross-validation); random forest, 87.05% (with cross-validation) and 86.92% (without cross-validation); multilayer perceptron, 87.28% (with cross-validation) and 86.94% (without cross-validation). The AUC (area under the curve) values of the models are: decision tree, 0.94; XGBoost, 0.95; random forest, 0.95; multilayer perceptron, 0.95. The study concludes that the multilayer perceptron with cross-validation outperformed all other algorithms in terms of accuracy, reaching the highest value of 87.28%.
ISSN:	1999-4893
DOI:	10.3390/a16020088
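
The summary names k-modes clustering as the preprocessing step. As a rough illustration of how k-modes groups categorical records, the sketch below assigns each record to the nearest mode by Hamming distance and then recomputes each mode as the per-column most frequent category. It is not the authors' code. The function names, the toy records, and the feature values are hypothetical. The initialization is also an assumption: it simply takes the first k distinct records, which is simpler than the Huang initialization the summary refers to.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category in each column of a non-empty list of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(records, k, max_iter=20):
    """Simplified k-modes: returns (labels, modes) for categorical records."""
    records = [tuple(r) for r in records]

    # Assumption: start from the first k distinct records. This stands in for
    # the Huang initialization named in the summary and is NOT equivalent to it.
    modes = []
    for r in records:
        if r not in modes:
            modes.append(r)
        if len(modes) == k:
            break
    if len(modes) < k:
        raise ValueError("fewer than k distinct records")

    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest mode under Hamming distance.
        new_labels = [
            min(range(k), key=lambda j: hamming(r, modes[j])) for r in records
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels

        # Update step: each mode becomes the column-wise most frequent category.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # keep the old mode if a cluster is empty
                modes[j] = column_modes(members)
    return labels, modes


# Hypothetical toy records: (cholesterol level, glucose level, smoker).
records = [
    ("normal", "normal", "no"),
    ("high", "high", "yes"),
    ("normal", "normal", "yes"),
    ("high", "high", "no"),
    ("normal", "high", "no"),
    ("high", "normal", "yes"),
]
labels, modes = k_modes(records, k=2)
print(labels)
print(modes)
```

In a pipeline like the one the summary describes, the resulting cluster labels could be added as an extra feature before training the classifiers. Whether the paper uses them in exactly that way is not stated in this record.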