Optimizing photocatalytic dye degradation: A machine learning and metaheuristic approach for predicting methylene blue in contaminated water

Bibliographic Details
Published in Results in Engineering Vol. 25; p. 103538
Main Authors Ahmed, Yunus; Dutta, Keya Rani; Nepu, Sharmin Nahar Chowdhury; Prima, Meherunnesa; AlMohamadi, Hamad; Akhtar, Parul
Format Journal Article
Language English
Published Elsevier B.V 01.03.2025
Elsevier
Summary:
•Ten advanced machine-learning models were evaluated for predicting the degradation of methylene blue (MB) dye in contaminated water.
•The HistGradientBoosting (HGB) model outperformed the other models studied.
•Bayesian optimization was used to tune the hyperparameters and achieve the best results.
•The final HGB model achieved an R² of 0.9915, MedAE of 1.171, MSE of 5.634, MAE of 1.735, and RMSE of 2.374.
•The model predicted a maximum MB dye degradation of 98.99 % under optimized conditions.

Dye contamination of water sources poses severe environmental and public health problems and therefore calls for effective monitoring and remediation strategies. This study uses machine learning to develop predictive models for evaluating methylene blue (MB) dye degradation in contaminated water. Ten machine learning models (AdaBoost, Bagging, CatBoost, Decision Tree, Extra Trees, Gradient Boosting, HistGradientBoosting, LightGBM, Random Forest, and XGBoost) were evaluated, with CuWO₄@TiO₂ as the photocatalyst. Model performance was assessed using R², MSE, RMSE, MAE, and MedAE. Among all models, HistGradientBoosting showed the best-balanced performance: it reached an R² of 0.9998 on the training set and 0.9915 on the test set with low error metrics, indicating strong generalization. Gradient Boosting and CatBoost also exhibited impressive predictive performance, whereas the AdaBoost and Decision Tree models suffered from overfitting. The maximum MB dye degradation predicted by the integrated approach was 98.99 %. Experimental validation showed that degradation under the optimized conditions reached 98.5 %: an initial MB concentration of 10 mg/L, a CuWO₄@TiO₂ photocatalyst dosage of 200.33 mg/L, a light intensity of 150 mW/cm², and a contact time of 88.6 min, at room temperature and near-neutral pH (7.0). The final model metrics were an R² of 0.9915, MedAE of 1.171, MSE of 5.634, MAE of 1.735, and RMSE of 2.374. This work highlights the potential of combining advanced machine learning algorithms with metaheuristic optimization to improve photocatalytic processes, opening a promising avenue for real-world water treatment applications.
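The five evaluation metrics named in the summary can be illustrated with a minimal sketch. It uses only the Python standard library and is not the authors' code; the function name `regression_metrics` and the four observed/predicted degradation values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
from statistics import mean, median


def regression_metrics(y_true, y_pred):
    """Return R2, MSE, RMSE, MAE and MedAE for paired observed/predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(r) for r in residuals]

    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")

    mse = ss_res / len(residuals)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mean(abs_errors),
        "MedAE": median(abs_errors),
    }


# Hypothetical degradation percentages (NOT data from the study).
observed = [90.0, 80.0, 70.0, 60.0]
predicted = [91.0, 78.0, 70.0, 63.0]

metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because RMSE is by definition the square root of MSE, the reported test-set values can be cross-checked against each other: the square root of 5.634 agrees with the reported RMSE of 2.374 to three decimal places.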
ISSN:2590-1230
DOI:10.1016/j.rineng.2024.103538