Comparative analysis of machine learning and ensemble approaches for hepatitis B prediction using data mining with synthetic minority oversampling technique
Published in | Health and technology Vol. 14; no. 1; pp. 109 - 118 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Berlin/Heidelberg, Springer Berlin Heidelberg, 2024 |
Summary: | Purpose
Hepatitis B, caused by the Hepatitis B virus (HBV), can harm the liver without noticeable symptoms. Early detection is crucial to prevent transmission and improve recovery. The main goal is to predict Hepatitis B from cost-effective laboratory test data using machine learning. The study evaluates how effectively various algorithms predict the disease and how far they could improve early diagnosis.
Methods
Six algorithms (support vector machine, k-nearest neighbors, logistic regression, decision tree, extreme gradient boosting, and random forest) were evaluated alongside an ensemble model. The analysis was run in two rounds: one using all features and one using only key attributes. The Synthetic Minority Oversampling Technique (SMOTE) was applied to address class imbalance. Each predictive technique was assessed using the confusion matrix, precision, recall, F1 score, accuracy, receiver operating characteristic (ROC) curve, area under the curve (AUC), and mean absolute error (MAE). Data came from the National Health and Nutrition Examination Survey (NHANES).
Results
The experimental results show that the ensemble model attained the highest accuracy (97%) and AUC (0.997) compared with existing models. The analysis also showed that certain key features carry substantial predictive weight in this model.
Conclusion
The study underscores the potential of the ensemble model as a tool for medical practitioners, using cost-effective and readily obtainable laboratory test data to predict Hepatitis B with high accuracy. By enabling early diagnosis and intervention, this approach offers a promising way to improve patient outcomes in Hepatitis B. |
---|---|
ISSN: | 2190-7188 2190-7196 |
DOI: | 10.1007/s12553-023-00802-x |
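
The Methods summary names SMOTE and a set of confusion-matrix metrics but, as a catalog abstract, gives no implementation detail. The following is a minimal, standard-library sketch of what those two steps might look like. It is not the authors' code: the function names `smote` and `classification_metrics`, the parameter defaults, and the toy data are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples (hypothetical helper).

    Each synthetic point is a random interpolation between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def classification_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived scores for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion": {"tp": tp, "tn": tn, "fp": fp, "fn": fn},
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": (tp + tn) / len(pairs),
        # With 0/1 labels, MAE equals the misclassification rate.
        "mae": sum(abs(t - p) for t, p in pairs) / len(pairs),
    }


# Toy, made-up data: three minority samples with two features each.
toy_minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2]]
toy_synthetic = smote(toy_minority, n_new=3, k=2)
toy_scores = classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used to compute the metrics. Whether the study followed that practice is not stated in this record.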