Comparison of influential input variables in the deep learning modeling of sunflower grain yields under normal and drought stress conditions

Bibliographic Details
Published in	Field Crops Research, Vol. 303, p. 109145
Main Authors	Khalifani, Sanaz; Darvishzadeh, Reza; Azad, Nasrin; Shayesteh, Mahrokh G.; Kalbkhani, Hashem; Akbari, Nasrin
Format	Journal Article
Language	English
Published	Elsevier B.V., 01.11.2023
Subjects	Convolutional neural networks; Stepwise regression; Sunflower; Yield prediction

More Information
Summary:	Crop yield prediction is a complex, nonlinear task because yield depends on many factors, including polygenic traits, environmental effects, and genotype-environment interactions. As a result, conventional statistical techniques cannot adequately explain the nonlinear and complex relationship between yield and its components. This research estimated sunflower seed yield using multiple linear regression (MLR) and a convolutional neural network (CNN). It also investigated the effect of different input variables on deep learning modeling of sunflower seed yield under normal and drought stress (DS) conditions. One hundred pure lines of oilseed sunflower were evaluated in the field over two crop years under normal and DS conditions for seed yield and morphological traits. The CNN model was implemented using different combinations of input variables. All studied parameters were first used as input variables, and then stepwise regression was performed for both conditions. In this second step, the input variables for the CNN model consisted of the parameters retained in the regression model and those common to normal and DS conditions. Under normal conditions, the CNN model with two input variables (head diameter [HD] and number of leaves [NL]), which were common to the regression models for both conditions, achieved higher accuracy in predicting sunflower yield (R² = 0.921, MAE = 5.425, RMSE = 6.462). Under DS conditions, the CNN model with seven input variables (leaf width [LW], NL, plant height [PH], days to flowering [DF], stem diameter [SD], petiole length [PL], and HD) achieved higher accuracy (R² = 0.915, MAE = 3.632, RMSE = 4.330). The CNN model outperformed the MLR model in both conditions.
Sensitivity analysis identified LW, NL, and leaf length (LL) as important and influential traits for yield prediction under normal and DS conditions. The CNN model was successful in reducing the number of variables needed to model sunflower seed yield. With the important parameters identified, sunflower yield can be predicted with higher accuracy, at lower cost, and in less time, even if other parameters are not available, in both normal and drought-prone conditions. The CNN model is a promising tool for predicting sunflower yield in yield improvement programs under different growing conditions.
• CNNs were more accurate than MLR in predicting sunflower yield.
• Relative influence of input variables in CNN varies depending on environmental status.
• CNN with HD and NL had high accuracy in predicting yield under normal conditions.
• CNN with LW, NL, PH, DF, SD, PL and HD had high accuracy in predicting yield in DS.
• LW and NL were the most effective traits on yield in normal and DS conditions.
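The summary compares models by R², MAE, and RMSE. As a reading aid, the following minimal sketch shows how these three metrics are commonly computed from observed and predicted yields. It assumes R² is the coefficient of determination (1 minus residual over total sum of squares); the record does not state which definition the authors used. The function name `regression_metrics` and the sample numbers are hypothetical and are not data from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE and RMSE for paired observed/predicted values.

    Assumption: R^2 is the coefficient of determination,
    1 - SS_res / SS_tot, which may differ from the paper's definition.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average size of the prediction error.
    mae = sum(abs(r) for r in residuals) / n

    # Root mean squared error: penalizes large errors more heavily.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)

    # Total sum of squares around the mean of the observed values.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"r2": r2, "mae": mae, "rmse": rmse}


# Made-up numbers for illustration only; not data from the study.
observed = [40.0, 50.0, 60.0, 70.0]
predicted = [42.0, 48.0, 61.0, 69.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

MAE and RMSE carry the units of the yield measurement, so values from the normal and DS conditions are only directly comparable if yield was measured on the same scale in both.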
ISSN:	0378-4290, 1872-6852
DOI:	10.1016/j.fcr.2023.109145