Prediction models for postoperative recurrence of non-lactating mastitis based on machine learning

Bibliographic Details
Published in	BMC medical informatics and decision making Vol. 24; no. 1; p. 106
Main Authors	Sun, Jiaye, Shao, Shijun, Wan, Hua, Wu, Xueqing, Feng, Jiamei, Gao, Qingqian, Qu, Wenchao, Xie, Lu
Format	Journal Article
Language	English
Published	England BioMed Central Ltd 22.04.2024 BioMed Central BMC
Subjects	Accuracy; Adult; Albumins; Algorithms; Blood cell count; Blood cells; Body mass index; Body size; Breast diseases; Care and treatment; China; Datasets; Decision trees; Diagnosis; Female; Globulins; Health aspects; Herbal medicine; Hospital patients; Hospitals; Humans; Learning algorithms; Leukocytes (neutrophilic); Libraries; Lymphocytes; Machine Learning; Mastitis; Methods; Middle Aged; Nipples; Nomograms; Non-lactating mastitis; Patients; Performance evaluation; Performance prediction; Postoperative Complications; Prediction models; Prognosis; Recurrence; Risk factors; Shapley additive explanations; Statistical methods; Traditional Chinese medicine; Trends; Triglycerides; Women
Summary:	This study aims to build machine learning (ML) models that predict the probability of postoperative recurrence of non-lactating mastitis (NLM) using the Random Forest (RF) and XGBoost algorithms. Such models can help identify patients at risk of NLM recurrence and guide clinical treatment planning. The study was conducted on inpatients admitted to the Mammary Department of Shuguang Hospital, affiliated to Shanghai University of Traditional Chinese Medicine, between July 2019 and December 2021. Follow-up of the inpatient data continued until December 2022. Ten features were selected to build the ML models: age, body mass index (BMI), number of abortions, presence of inverted nipples, extent of breast mass, white blood cell count (WBC), neutrophil-to-lymphocyte ratio (NLR), albumin-globulin ratio (AGR), triglyceride (TG), and presence of intraoperative discharge. Two ML approaches (RF and XGBoost) were used to build models and predict the NLM recurrence risk of female patients. A total of 258 patients were randomly divided into a training set and a test set in a 75%/25% proportion. Model performance was evaluated by Accuracy, Precision, Recall, F1-score, and AUC. The Shapley Additive Explanations (SHAP) method was used to interpret the models. Of the 258 patients, 48 (18.6%) experienced recurrence during the follow-up period. BMI was the most important factor in the RF model, and intraoperative discharge was the most important in the XGBoost model. The results of tenfold cross-validation suggest that both models have good predictive performance, with the XGBoost model performing better than the RF model in this study. The trends of the SHAP values of all features are consistent with the trends of these features' clinical presentation.
The inclusion of these ten features is necessary to build practical prediction models for recurrence. The results of tenfold cross-validation and the SHAP values suggest that the models have predictive ability. The trends of the SHAP values provide auxiliary validation of the models and give them greater clinical significance.
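The summary names five evaluation measures (Accuracy, Precision, Recall, F1-score, and AUC) without defining them. The following is a minimal sketch of how such measures are commonly computed for a binary recurrence label; it is not the authors' code. The function name `evaluate`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float
    auc: float


def evaluate(y_true, y_score, threshold=0.5):
    """Compute binary classification metrics (hypothetical helper).

    y_true  : list of 0/1 labels (1 = recurrence)
    y_score : list of predicted recurrence probabilities
    threshold : assumed cut-off for turning a probability into a class
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # AUC as the probability that a randomly chosen positive case
    # receives a higher score than a randomly chosen negative case
    # (ties count as one half).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
                   for p in pos for n in neg)
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # undefined when only one class is present

    return Metrics(accuracy, precision, recall, f1, auc)


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the study.
    labels = [1, 1, 0, 0, 0, 1]
    scores = [0.9, 0.4, 0.6, 0.2, 0.1, 0.8]
    print(evaluate(labels, scores))
```

With a recurrence rate of 18.6%, accuracy alone can look favorable for a model that rarely predicts recurrence, which is one reason measures such as Recall and AUC are usually reported alongside it.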
ISSN:	1472-6947
DOI:	10.1186/s12911-024-02499-y