Quantitative Provenance Analysis of the Yangtze and Yellow River Sediments Through Detrital Zircon U‐Pb Geochronology Using an XGBoost Machine Learning Algorithm
Published in | Journal of geophysical research. Machine learning and computation Vol. 2; no. 3 |
---|---|
Main Authors | , , , , , , |
Format | Journal Article |
Language | English |
Published | 01.09.2025 |
Summary: | Over the past two decades, a large number of zircon U‐Pb ages from the Yangtze and Yellow River Basins have been published, yet distinguishing the sources of sediment between these regions remains challenging. Issues related to sampling, analytical methods, and biases complicate the interpretation of detrital zircon geochronology. In this study, we leveraged machine learning techniques to analyze a data set of over 33,000 zircon U‐Pb ages, refining the data to 28,082 ages for our analysis. We employed two characterization strategies, tectonic classification and kernel density estimation, and optimized our models through hyperparameter tuning. Our results demonstrated that the machine learning algorithm, eXtreme Gradient Boosting (XGBoost), significantly improved the accuracy of predicting sediment sources when compared to conventional methods (e.g., multidimensional scaling diagrams). Additionally, we found that the most informative age populations were associated with orogenic events (e.g., Jinning, 800–1,000 Ma; Tianshan, 260–394 Ma; and Nanhua, 680–800 Ma) rather than with the Lvliang (1,800–2,500 Ma) and Wutai (2,500–2,800 Ma) movements, as suggested in previous studies. Finally, we tested the optimized models on several case studies, illustrating their effectiveness in identifying provenance signals for modern and Quaternary sediments in the East China Seas and the Yangtze Delta. While this machine learning approach shows great potential for improving sediment provenance analysis in these case studies, it is still limited by the availability and quality of detrital zircon age data for more detailed provenance analysis on sub‐basin scales.
In the last 20 years, many studies have looked at the ages of zircon minerals from the Yangtze and Yellow River Basins and nearby seas, but figuring out where these zircons come from has been tricky. Scientists often debate the best methods due to problems with sampling and analysis. To tackle this issue, we used recent advancements in machine learning to analyze a large data set of zircon ages, over 33,000 from past research. After cleaning the data, we focused on about 28,000 ages and developed two models to help classify the sources of these zircons: one based on tectonic history (T model) and another based on a statistical density estimate of the ages (K model). We trained these models and found they performed well in predicting the provenance of the zircons, showing better accuracy than traditional methods. Notably, the most telling age groups for distinguishing between the two river basins are linked to specific mountain‐building events, rather than the older tectonic movements previously thought to be significant. We further applied these models to real‐world examples, demonstrating that they can effectively differentiate sediment sources in various locations. This study highlights the promise of machine learning in geoscience for analyzing zircon provenance, though we still need more data to improve these methods.
We proposed two characterization strategies to assess the ability of machine learning to distinguish the sources of detrital zircon. The XGBoost models exhibited improved predictions of provenance compared to conventional U‐Pb comparisons in our case studies. While machine learning shows great promise for provenance analysis, it requires further attention and development. |
---|---|
ISSN: | 2993-5210 |
DOI: | 10.1029/2025JH000763 |
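
The two characterization strategies named in the summary (tectonic classification and kernel density estimation) can be pictured as two ways of turning a sample's zircon ages into a fixed-length feature vector that a classifier such as XGBoost could then consume. The following is a minimal sketch under stated assumptions, not the authors' implementation: the age windows are only the examples quoted in the summary (the study's full classification scheme is not given here), and the function names, the 30 Ma bandwidth, the 50 Ma grid, and the sample ages are all hypothetical.

```python
import math

# Age windows (Ma) for the tectonic events quoted in the summary.
# Assumption: the study's full tectonic classification may use more classes.
TECTONIC_BINS = {
    "Tianshan": (260.0, 394.0),
    "Nanhua": (680.0, 800.0),
    "Jinning": (800.0, 1000.0),
    "Lvliang": (1800.0, 2500.0),
    "Wutai": (2500.0, 2800.0),
}


def tectonic_features(ages):
    """Fraction of a sample's ages in each window (half-open: lo <= age < hi)."""
    n = len(ages)
    return {
        name: sum(lo <= a < hi for a in ages) / n
        for name, (lo, hi) in TECTONIC_BINS.items()
    }


def kde_features(ages, grid, bandwidth=30.0):
    """Gaussian kernel density estimate evaluated at each grid age (Ma).

    The bandwidth is a hypothetical fixed value chosen for illustration.
    """
    norm = 1.0 / (len(ages) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - a) / bandwidth) ** 2) for a in ages)
        for g in grid
    ]


# Hypothetical ages (Ma), invented purely for illustration.
sample = [270.0, 310.0, 750.0, 820.0, 905.0, 1850.0, 2100.0, 2550.0]

t_vec = tectonic_features(sample)   # "T"-style features: one value per event
grid = list(range(0, 3501, 50))     # hypothetical 50 Ma evaluation grid
k_vec = kde_features(sample, grid)  # "K"-style features: one value per grid age
```

Either vector, computed per sample and paired with a basin label, would form one training row for a gradient-boosted classifier. Half-open windows are used so that an age on a shared boundary (for example 800 Ma) is counted once.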