Differentiating estuarine dissolved organic matter composition by unsupervised and supervised machine learning
Published in | Water research (Oxford) Vol. 284; p. 123900 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | England; Elsevier Ltd; 15.09.2025; IWA Publishing/Elsevier |
Summary: |
• ML captures dominant DOM optical parameters in different zones and scenarios.
• Biogeochemical insights from explainable artificial intelligence.
• Identification of zones that require attention to guide watershed management.
• Establishment of a workflow to differentiate DOM composition in estuaries.
Differentiating the composition of dissolved organic matter (DOM) in estuaries is a major environmental concern, because DOM characteristics are closely linked to biogeochemical and ecological considerations (e.g., water properties and trophic cycling). However, tracing the spatiotemporal variations of estuarine DOM is challenging because of its multiple sources and complex transformation processes. Here, we investigate the dynamics of estuarine DOM by analyzing its optical properties through UV–Visible absorbance and fluorescence spectroscopy, and we capture its variability using machine learning algorithms and explainable artificial intelligence. To this end, we collected sub-surface water samples (n = 249) from a human-impacted estuary with intense industrialization and urbanization in France (Seine Estuary), across areas with distinct land-use characteristics and under contrasting hydrological conditions. We then applied unsupervised and supervised machine learning techniques to analyze the optical properties of DOM, which were determined by UV–Visible absorbance and Excitation-Emission Matrix (EEM) fluorescence spectroscopy combined with parallel factor analysis (PARAFAC). Our results show that unsupervised machine learning (K-means clustering) captures the spatial variability of DOM, identifying three distinct estuarine zones based on pronounced spatial variations in several DOM optical parameters. Supervised machine learning (Light Gradient Boosted Machine, LightGBM) further supports the validity of the defined zonation. Subsequently, explainable artificial intelligence based on SHapley Additive exPlanations (SHAP) analysis shows that DOM in each zone has specific characteristics. Our model indicates that DOM in the Seine Estuary is primarily influenced by high molecular weight materials and autochthonous contributions in the upper estuary (Zone I).
The dominant contribution to DOM in the mid-estuary (Zone II) comes from autochthonous and aromatic material as well as transformation and (photo)degradation products. The lower estuary (Zone III) is mainly characterized by aromatic DOM (subject to photodegradation), low molecular weight compounds, autochthonous DOM, and transformation and (photo)degradation products. Overall, this study presents a workflow for differentiating the composition of DOM, tracing the variability and dynamics of DOM along the land-to-sea continuum, and elucidating the processes involved. The approach developed in the Seine Estuary has significant implications for environmental management and can be adapted to other land-to-sea continuums. |
---|---|
ISSN: | 0043-1354, 1879-2448 |
DOI: | 10.1016/j.watres.2025.123900 |
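
The zonation step described in the summary (K-means clustering of samples into three zones) can be illustrated with a minimal sketch. This is not the authors' implementation: the two features, their values, the group names, and the deterministic farthest-point seeding are all assumptions made for illustration, and only the choice of three clusters comes from the summary. The supervised (LightGBM) and SHAP steps are not covered here.

```python
# Minimal K-means (Lloyd's algorithm) sketch using only the standard library.
# All data below are synthetic placeholders, NOT values from the study.

def squared_distance(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_point_seeds(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first sample, then repeatedly add the sample farthest from all seeds."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(
            max(points, key=lambda p: min(squared_distance(p, s) for s in seeds))
        )
    return [list(s) for s in seeds]

def kmeans(points, k, max_iter=100):
    """Return (labels, centroids) after alternating assignment and update."""
    centroids = farthest_point_seeds(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so we have converged
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical, already-standardized values of two optical indices per sample.
group_centers = {"upper": (0.0, 0.0), "mid": (5.0, 5.0), "lower": (10.0, 0.0)}
offsets = [(-0.2, -0.1), (0.0, 0.2), (0.1, -0.2), (0.2, 0.1)]
true_groups, samples = [], []
for name, (cx, cy) in group_centers.items():
    for dx, dy in offsets:
        true_groups.append(name)
        samples.append((cx + dx, cy + dy))

labels, centroids = kmeans(samples, k=3)  # k=3 mirrors the three zones
```

With real measurements, each sample would be a vector of optical parameters (for example absorbance indices and PARAFAC component intensities), typically standardized first so that no single parameter dominates the distance. Cluster indices are arbitrary, so they need to be mapped to zones by inspecting which samples fall in each cluster.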