Using deep learning to predict soil properties from regional spectral data

Bibliographic Details
Published in Geoderma Regional Vol. 16; p. e00198
Main Authors Padarian, J., Minasny, B., McBratney, A.B.
Format Journal Article
Language English
Published Elsevier B.V. 01.03.2019
Summary: Diffuse reflectance infrared spectroscopy allows the rapid acquisition of soil information in the field or the laboratory. Worldwide research interest in vis-NIR spectroscopy has produced regional to global soil spectral libraries. While machine learning methods have been used to process spectral data, such large regional datasets are better handled with big data analytics. Deep learning has already transformed the way data are analysed in many fields and could also change the way we model soil spectral data. This study developed and evaluated convolutional neural networks (CNNs), a type of deep learning algorithm, as a new way to predict soil properties from raw soil spectra. We demonstrated the effectiveness of CNNs on the LUCAS soil database, which consists of around 20,000 topsoil observations from Europe with physicochemical and biological properties. To fully utilise the capacity of the CNN model, we represented the soil spectral data as a 2-dimensional spectrogram, showing the reflectance as a function of wavelength and frequency. We showed that a CNN can be trained in a multi-task setting to predict six soil properties simultaneously in one model (OC, CEC, clay, sand, pH, total N). Compared with conventional methods such as PLS regression and the Cubist regression tree, the CNN performed significantly better, especially the multi-task model. For soil organic carbon prediction, the multi-task CNN decreased the error by 87% compared with PLS and 62% compared with Cubist. This approach proved effective when trained on a relatively large dataset. The high accuracy of the CNN makes it an ideal tool for modelling soil spectral data.
• The CNN model is able to predict soil properties from raw spectral data.
• The CNN is able to predict multiple properties simultaneously and synergistically.
• The multi-task CNN reduced the error by 62% compared with the Cubist model.
• This approach works better with large spectral datasets.
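
The summary mentions representing each spectrum as a 2-dimensional spectrogram. This record does not state how the authors computed it, so the following is only a minimal sketch of one common way to obtain such an image, a short-time Fourier transform over the wavelength axis. The function name, window size, hop length and the synthetic spectrum are all illustrative assumptions, not values from the article.

```python
import numpy as np


def spectrogram(spectrum, window_size=64, hop=16):
    """Magnitude of a short-time Fourier transform of a 1-D spectrum.

    Returns an array of shape (n_frequencies, n_windows): rows index local
    frequency, columns index position along the wavelength axis.
    window_size and hop are assumed values chosen for illustration.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    window = np.hanning(window_size)  # taper each segment to reduce leakage
    starts = range(0, len(spectrum) - window_size + 1, hop)
    frames = np.stack([spectrum[s:s + window_size] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Synthetic stand-in for a single reflectance spectrum (not LUCAS data).
wavelengths = np.linspace(400, 2500, 1024)
reflectance = 0.3 + 0.1 * np.sin(wavelengths / 50.0)

image = spectrogram(reflectance)
print(image.shape)
```

An array of this kind could then be passed to a 2-D convolutional network as a single-channel image. The architecture and preprocessing actually used are described in the article itself.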
ISSN: 2352-0094
DOI: 10.1016/j.geodrs.2018.e00198