Predicting Cd accumulation in crops and identifying nonlinear effects of multiple environmental factors based on machine learning models

Bibliographic Details
Published in The Science of the total environment Vol. 951; p. 175787
Main Authors Lu, Xiaosong, Sun, Li, Zhang, Ya, Du, Junyang, Wang, Guoqing, Huang, Xinghua, Li, Xuzhi, Wang, Xiaozhi
Format Journal Article
Language English
Published Elsevier B.V. 15.11.2024
Summary: Traditional prediction of the Cd content in crop grains (Cdg) relies primarily on multiple linear regression models based on soil Cd content (Cds) and pH, neglecting interactions among factors and nonlinear causal links between external environmental factors and Cdg. In this study, a comprehensive index system of multi-type environmental factors, including soil properties, geology, climate, and anthropogenic activity, was constructed. Machine learning models (tree-based ensembles, support vector regression, and artificial neural networks) that predict Cdg of rice and wheat from these environmental factor indexes were significantly more accurate than traditional linear regression models based on soil properties. Among them, the tree-based ensemble models XGBoost and random forest achieved the highest accuracies for predicting Cdg of rice and wheat, with R² on the test dataset of 0.349 and 0.546, respectively. This study found that soil properties, including Cds, pH, and clay, have a strong impact on Cdg of rice and wheat, with combined contribution rates of 65.2 % and 29.7 %, respectively. Because wheat sampling areas are located in central and northern China, they are more constrained by precipitation and temperature than rice sampling areas in the south. Geologic and climate factors therefore have a greater impact on Cdg of wheat, with a combined contribution rate of 49.9 %, compared with 20.9 % for rice. Furthermore, the Cdg of rice and wheat did not exhibit a strictly linear relationship with Cds, and excessively high Cds can reduce the bioconcentration factor of Cd accumulation in crops. Meanwhile, other environmental factors such as temperature, precipitation, and elevation have marginal effects on the increase of Cdg in crops. This study provides a novel framework for optimizing traditional soil-plant transfer models and offers a step towards high-precision prediction of Cd content in crops.
•Multiple environmental factors were constructed to predict crop Cd accumulation.
•Machine learning improved the accuracy of crop Cd accumulation prediction.
•Nonlinear effects of environmental factors were revealed by Shapley value.
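The summary attributes Cdg to individual environmental factors through Shapley values and notes that the bioconcentration factor falls at high Cds. The sketch below illustrates both ideas on a toy model. Everything in it is an assumption made for illustration: the response function `toy_cdg`, the helper `shapley_values`, the baseline and sample values, and the three-factor setup are hypothetical and do not reproduce the paper's models, data, or results. It computes exact Shapley values by enumerating all factor subsets, which is only practical for a handful of factors.

```python
from itertools import combinations
from math import factorial


def toy_cdg(x):
    """Hypothetical grain-Cd response (NOT the paper's fitted model).

    Grain Cd saturates as soil Cd (cds) rises, falls as pH rises,
    and has a small additive temperature term.
    """
    return (x["cds"] / (1.0 + x["cds"])) * (9.0 - x["ph"]) * 0.1 + 0.002 * x["temp"]


def shapley_values(model, x, baseline):
    """Exact Shapley values of each factor in x relative to a baseline point."""
    names = list(x)
    n = len(names)

    def value(subset):
        # Factors in `subset` take the sample value; the rest stay at baseline.
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):  # subset sizes 0 .. n-1 drawn from the other factors
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = set(subset)
                total += weight * (value(without | {name}) - value(without))
        phi[name] = total
    return phi


# Assumed reference site and sample site (illustrative numbers only).
baseline = {"cds": 0.2, "ph": 6.5, "temp": 15.0}
sample = {"cds": 1.5, "ph": 5.5, "temp": 18.0}

phi = shapley_values(toy_cdg, sample, baseline)
for factor, contribution in phi.items():
    print(f"{factor}: {contribution:+.4f}")

# Bioconcentration factor (grain Cd / soil Cd) at low and high soil Cd.
bcf_low = toy_cdg({**sample, "cds": 0.5}) / 0.5
bcf_high = toy_cdg({**sample, "cds": 5.0}) / 5.0
print(f"BCF at Cds=0.5: {bcf_low:.3f}, at Cds=5.0: {bcf_high:.3f}")
```

The contributions sum to the difference between the model output at the sample and at the baseline, which is the property that makes Shapley values usable as contribution rates. Because the toy response saturates in Cds, the bioconcentration factor is lower at the higher soil Cd level, mirroring the qualitative pattern the summary describes.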
ISSN:0048-9697
1879-1026
DOI:10.1016/j.scitotenv.2024.175787