Soil Science-Informed Machine Learning

Bibliographic Details
Published in	Geoderma Vol. 452; p. 117094
Main Authors	Minasny, Budiman; Bandai, Toshiyuki; Ghezzehei, Teamrat A.; Huang, Yin-Chung; Ma, Yuxin; McBratney, Alex B.; Ng, Wartini; Norouzi, Sarem; Padarian, Jose; Rudiyanto; Sharififar, Amin; Styc, Quentin; Widyastuti, Marliana
Format	Journal Article
Language	English
Published	Elsevier B.V., 01.12.2024
Subjects	Artificial Intelligence; data collection; Informed Machine Learning; Mechanistic models; Pedology; Physics Informed Neural Networks; prediction; Process-based models; soil organic carbon; soil properties; spectroscopy

Summary:	•Soil Science-Informed Machine Learning (SoilML) models enhance prediction reliability and generalisability.
•Integrates soil science knowledge into ML models through observational priors, model structure design, and loss functions.
•Physics-informed neural networks improve ML predictions.
•Align ML predictions with soil science principles.

Machine learning (ML) applications in soil science have significantly increased over the past two decades, reflecting a growing trend towards data-driven research addressing soil security. This extensive application has mainly focused on enhancing predictions of soil properties, particularly soil organic carbon, and improving the accuracy of digital soil mapping (DSM). Despite these advancements, the application of ML in soil science faces challenges related to data scarcity and the interpretability of ML models. There is a need for a shift towards Soil Science-Informed ML (SoilML) models that use the power of ML but also incorporate soil science knowledge in the training process to make predictions more reliable and generalisable. This paper proposes methodologies for embedding ML models with soil science knowledge to overcome current limitations. Incorporating soil science knowledge into ML models involves using observational priors to enhance training datasets, designing model structures which reflect soil science principles, and supervising model training with soil science-informed loss functions. The informed loss functions include observational constraints, coherency rules such as regularisation to avoid overfitting, and prior or soil-knowledge constraints that incorporate existing information about the parameters or outputs. By way of illustration, we present examples from four fields: digital soil mapping, soil spectroscopy, pedotransfer functions, and dynamic soil property models. We discuss the potential to integrate process-based models for improved prediction, the use of physics-informed neural networks, limitations, and the issue of overparametrisation. These approaches improve the relevance of ML predictions in soil science and enhance the models’ ability to generalise across different scenarios while maintaining soil science principles, transparency and reliability.
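
The three kinds of informed loss term named in the summary (observational constraints, coherency rules such as regularisation, and soil-knowledge constraints) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `soil_informed_loss`, the weights `lam_reg` and `lam_prior`, and the choice of constraint (predicted sand, silt and clay percentages should be non-negative and sum to 100) are assumptions made here purely for illustration.

```python
import numpy as np


def soil_informed_loss(pred, obs, weights, lam_reg=1e-3, lam_prior=1.0):
    """Illustrative composite loss (hypothetical, not from the paper).

    pred, obs : arrays of shape (n_samples, 3) with sand, silt, clay in percent.
    weights   : 1-D array of model parameters to regularise.
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Observational constraint: mean squared error against measurements.
    data_term = np.mean((pred - obs) ** 2)

    # Coherency rule: L2 penalty on parameters to discourage overfitting.
    reg_term = lam_reg * np.sum(weights ** 2)

    # Soil-knowledge constraint (assumed example): particle-size fractions
    # must sum to 100 % and must not be negative.
    sum_violation = (pred.sum(axis=1) - 100.0) ** 2
    neg_violation = np.sum(np.minimum(pred, 0.0) ** 2, axis=1)
    prior_term = lam_prior * np.mean(sum_violation + neg_violation)

    return data_term + reg_term + prior_term


# Usage: a prediction that violates the sum-to-100 rule is penalised more.
obs = [[60.0, 30.0, 10.0]]
consistent = [[58.0, 31.0, 11.0]]     # sums to 100
inconsistent = [[58.0, 31.0, 21.0]]   # sums to 110
w = np.zeros(4)                       # placeholder parameters

loss_consistent = soil_informed_loss(consistent, obs, w)
loss_inconsistent = soil_informed_loss(inconsistent, obs, w)
```

In a real training loop the same terms would be written in an automatic-differentiation framework so that the penalty steers the gradient; the relative weights are tuning choices and would need to be set for each application.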
ISSN:	0016-7061 1872-6259
DOI:	10.1016/j.geoderma.2024.117094