An Imbalanced-Data Processing Algorithm for the Prediction of Heart Attack in Stroke Patients
Published in | IEEE Access Vol. 9; pp. 25394-25404 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2021 |
Summary: | Predicting heart attack early in stroke patients through data analysis is one approach to reducing the high mortality rate. Stroke-patient data from the Intensive Care Unit are imbalanced because stroke patients who have a heart attack are a minority of all stroke patients, which makes predicting heart attack from these data a challenge. To process the imbalanced data, this paper designs an algorithm that combines random undersampling, clustering, and oversampling techniques, called the undersampling-clustering-oversampling algorithm (UCO algorithm for short). The UCO algorithm generates nearly balanced data, which are used to train machine-learning models for predicting heart attack. Extensive experiments on the Medical Information Mart for Intensive Care III (MIMIC-III) database evaluate the UCO algorithm. With the undersampling number set to 120, denoted UCO(120), the algorithm performs well in helping machine-learning classifiers extract features. Five classifiers are trained separately to predict heart attack from the outputs of UCO(120). The results show that the random forest classifier achieves the best predictive performance, with an accuracy of 70.29% and a precision of 70.05%. Whether a stroke patient will have a heart attack can therefore be predicted well using UCO(120) with a random forest. |
---|---|
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2021.3057693 |
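
The summary names the three stages of the UCO algorithm but does not say how they are combined, so the following is only a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes that the majority class (stroke patients without heart attack) is randomly undersampled to `n_under` records, that the minority class is grouped with a small k-means, and that synthetic minority records are then created by interpolating between records of the same cluster. The function names, the cluster count `k`, the interpolation step, and the toy data are all hypothetical. The sketch also produces exactly balanced classes, whereas the summary describes the output as nearly balanced.

```python
import random


def undersample(rows, n, rng):
    """Randomly keep at most n rows, without replacement."""
    return rng.sample(rows, min(n, len(rows)))


def kmeans(rows, k, rng, iters=20):
    """Tiny k-means on tuples of floats; returns the non-empty clusters."""
    centers = rng.sample(rows, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for r in rows:
            dists = [sum((a - b) ** 2 for a, b in zip(r, c)) for c in centers]
            clusters[dists.index(min(dists))].append(r)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return [cl for cl in clusters if cl]


def oversample(clusters, target, rng):
    """Grow the clustered rows to `target` by interpolating within a cluster."""
    out = [r for cl in clusters for r in cl]
    while len(out) < target:
        cl = rng.choice(clusters)
        a, b = rng.choice(cl), rng.choice(cl)
        t = rng.random()
        out.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return out


def uco_sketch(majority, minority, n_under=120, k=3, seed=0):
    """Assumed order: undersample majority, cluster minority, oversample minority."""
    rng = random.Random(seed)
    maj = undersample(majority, n_under, rng)
    clusters = kmeans(minority, min(k, len(minority)), rng)
    mino = oversample(clusters, len(maj), rng)
    return maj, mino


# Toy data only: 1000 majority and 40 minority records with two features each.
data_rng = random.Random(1)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(1000)]
minority = [(data_rng.gauss(2, 1), data_rng.gauss(2, 1)) for _ in range(40)]

maj, mino = uco_sketch(majority, minority)
print(len(maj), len(mino))  # prints: 120 120
```

The default `n_under=120` mirrors the UCO(120) setting mentioned in the summary. The balanced output could then be passed to any classifier, such as the random forest the summary reports as performing best.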