Modelling Soil Temperature by Tree-Based Machine Learning Methods in Different Climatic Regions of China

Bibliographic Details
Published in Applied sciences Vol. 12; no. 10; p. 5088
Main Authors Dong, Jianhua; Huang, Guomin; Wu, Lifeng; Liu, Fa; Li, Sien; Cui, Yaokui; Wang, Yicheng; Leng, Menghui; Wu, Jie; Wu, Shaofei
Format Journal Article
Language English
Published Basel MDPI AG 01.05.2022
Summary: Accurate estimation of soil temperature (Ts) at a national scale under different climatic conditions is important for understanding soil–plant–atmosphere interactions. This study estimated daily Ts at the 0 cm depth for 689 meteorological stations in seven climate zones of China for the period 1966–2015 using the M5P model tree (M5P), random forests (RF), and extreme gradient boosting (XGBoost). The results showed that the XGBoost model (averaged coefficient of determination (R²) = 0.964 and root mean square error (RMSE) = 2.066 °C) performed better overall than the RF (averaged R² = 0.959 and RMSE = 2.130 °C) and M5P (averaged R² = 0.954 and RMSE = 2.280 °C) models for estimating Ts, and with higher computational efficiency. With the combination of mean air temperature (Tmean) and global solar radiation (Rs) as inputs, the estimation accuracy of the models was high (averaged R² = 0.96–0.97 and RMSE = 1.73–1.99 °C). With Tmean as the base input, adding Rs influenced estimation accuracy more than adding any other climatic factor. Principal component analysis indicated that soil organic matter, soil water content, Tmean, relative humidity (RH), Rs, and wind speed (U2) are the main factors that cause errors in estimating Ts, with a total error interpretation rate of 97.9%. Overall, XGBoost would be a suitable algorithm for estimating Ts in different climate zones of China, and the combination of Tmean and Rs as model inputs would be more practical than other input combinations.
ISSN: 2076-3417
DOI: 10.3390/app12105088
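
The summary compares models by R² and RMSE. As a minimal sketch of how these two scores can be computed from paired observed and estimated soil temperatures, the Python below assumes the common 1 − SS_res/SS_tot definition of R²; the article may instead use the squared correlation coefficient, so this is not necessarily the authors' formula. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, estimated):
    """Root mean square error, in the units of the inputs (e.g. °C)."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, estimated):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R² is undefined")
    return 1.0 - ss_res / ss_tot


# Made-up daily Ts values (°C) for illustration only, not data from the article.
observed_ts = [10.0, 12.0, 14.0, 16.0]
estimated_ts = [11.0, 12.0, 13.0, 16.0]

print(f"RMSE = {rmse(observed_ts, estimated_ts):.3f} °C")
print(f"R2   = {r_squared(observed_ts, estimated_ts):.3f}")
```

RMSE is in the same units as Ts, so it reads directly as a typical error in °C, while R² is unitless and describes how much of the observed variation the estimates capture.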