An innovative approach for predicting groundwater TDS using optimized ensemble machine learning algorithms at two levels of modeling strategy

Bibliographic Details
Published in Journal of environmental management Vol. 351; p. 119896
Main Authors Elzain, Hussam Eldin, Abdalla, Osman, A. Ahmed, Hamdi, Kacimov, Anvar, Al-Maktoumi, Ali, Al-Higgi, Khalifa, Abdallah, Mohammed, Yassin, Mohamed A., Senapathi, Venkatramanan
Format Journal Article
Language English
Published England Elsevier Ltd 01.02.2024
Summary: Groundwater salinization in coastal aquifers is a major socioeconomic challenge in Oman and many other regions worldwide, driven by several anthropogenic activities and natural drivers. Assessing the salinization of groundwater resources is therefore crucial for protecting water resources and managing them sustainably. This study applies a novel approach that uses optimized ensemble tree-based (ETB) machine learning models, namely CatBoost regression (CBR), extra trees regression (ETR), and bagging regression (BA), in a two-level modeling strategy to predict groundwater TDS as an indicator of seawater intrusion in a coastal aquifer in Oman. At level 1, the ETR and CBR models served as base models whose outputs became the inputs to BA at level 2. The level 1 models (ETR and CBR) yielded satisfactory results using a limited number of inputs (Cl, K, and Sr) from a small dataset of 40 groundwater wells. The BA model at level 2 improved overall performance by extracting more information from the level 1 ETR and CBR models. On the testing dataset, the level 2 BA model achieved a significant improvement in accuracy (MSE = 0.0002, RSR = 0.062, R² = 0.995, and NSE = 0.996) over the individual level 1 models, ETR (MSE = 0.0007, RSR = 0.245, R² = 0.98, and NSE = 0.94) and CBR (MSE = 0.0035, RSR = 0.258, R² = 0.933, and NSE = 0.934). The level 2 BA model outperformed all other models in predictive accuracy, generalization to new data, and matching the locations of polluted and unpolluted wells. The approach predicts groundwater TDS with high accuracy and can thus provide early warning of water quality deterioration in coastal aquifers, which would support the sustainability of water resources.
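The record does not define the four reported metrics, so the sketch below assumes the definitions commonly used in hydrological modeling: MSE as the mean squared error, RSR as RMSE divided by the standard deviation of the observations, NSE as the Nash-Sutcliffe efficiency, and R² as the squared Pearson correlation. Under these assumed definitions NSE = 1 - RSR², which agrees with the reported pairs (for example, RSR = 0.062 with NSE = 0.996). The function names and sample values are hypothetical and are not taken from the study.

```python
import math


def _sse(obs, pred):
    """Sum of squared errors between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred))


def _sst(obs):
    """Total sum of squares of the observations about their mean."""
    mean_obs = sum(obs) / len(obs)
    return sum((o - mean_obs) ** 2 for o in obs)


def mse(obs, pred):
    """Mean squared error."""
    return _sse(obs, pred) / len(obs)


def rsr(obs, pred):
    """RMSE divided by the standard deviation of the observations."""
    return math.sqrt(_sse(obs, pred) / _sst(obs))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency (1.0 is a perfect fit)."""
    return 1.0 - _sse(obs, pred) / _sst(obs)


def r2(obs, pred):
    """Squared Pearson correlation (assumed meaning of R2 here)."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


# Hypothetical observed and predicted values, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(mse(obs, pred), rsr(obs, pred), r2(obs, pred), nse(obs, pred))
```

Lower MSE and RSR and higher R² and NSE indicate a better fit, which is the direction of improvement reported for the level 2 model.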
• A two-level ML modeling strategy for predicting groundwater TDS.
• The GS optimizer successfully tuned the hyper-parameters of the ML models.
• A random forest model was used as a feature selector.
• The BA level 2 model outperformed the CBR and ETR models in accuracy and in various visual representations.
• The BA level 2 model handled small datasets and showed high predictive performance.
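The two-level strategy can be sketched with scikit-learn as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the data are synthetic (40 samples and three features standing in for Cl, K, and Sr), "GS" is taken to mean grid search, `GradientBoostingRegressor` is swapped in for CatBoost to keep the example free of extra dependencies, the hyper-parameter grids are arbitrary, and out-of-fold predictions are used to build the level 2 training inputs, a detail this record does not specify. The random forest feature-selection step is omitted.

```python
import numpy as np
from sklearn.ensemble import (
    BaggingRegressor,
    ExtraTreesRegressor,
    GradientBoostingRegressor,
)
from sklearn.model_selection import GridSearchCV, cross_val_predict, train_test_split

# Synthetic stand-in data: 40 "wells", 3 scaled inputs (assumed Cl, K, Sr).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(40, 3))
y = 0.7 * X[:, 0] + 0.2 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0.0, 0.01, 40)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def tuned(estimator, grid):
    """Grid-search an estimator on the training split; return the refit best model."""
    search = GridSearchCV(estimator, grid, cv=3, scoring="neg_mean_squared_error")
    search.fit(X_train, y_train)
    return search.best_estimator_


# Level 1: two tuned base models (gradient boosting stands in for CatBoost).
etr = tuned(
    ExtraTreesRegressor(random_state=0),
    {"n_estimators": [50, 100], "max_depth": [None, 5]},
)
gbr = tuned(
    GradientBoostingRegressor(random_state=0),
    {"n_estimators": [50, 100], "learning_rate": [0.05, 0.1]},
)
base_models = (etr, gbr)

# Out-of-fold level 1 predictions become the level 2 training inputs,
# so the level 2 model never sees predictions made on data a base model was fit on.
Z_train = np.column_stack(
    [cross_val_predict(m, X_train, y_train, cv=5) for m in base_models]
)
# For the test split, use the base models already refit on the full training split.
Z_test = np.column_stack([m.predict(X_test) for m in base_models])

# Level 2: bagging regressor trained on the base-model predictions.
bagging = BaggingRegressor(n_estimators=50, random_state=0)
bagging.fit(Z_train, y_train)
y_pred = bagging.predict(Z_test)

test_mse = float(np.mean((y_test - y_pred) ** 2))
print(f"Level 2 test MSE on synthetic data: {test_mse:.5f}")
```

The printed error describes only the synthetic data and says nothing about the study's results. To follow the record more closely, `catboost.CatBoostRegressor` could replace the gradient-boosting stand-in.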
ISSN: 0301-4797, 1095-8630
DOI: 10.1016/j.jenvman.2023.119896