Ensemble and single algorithm models to handle multicollinearity of UAV vegetation indices for predicting rice biomass

Bibliographic Details
Published in	Computers and electronics in agriculture Vol. 205; p. 107621
Main Authors	Derraz, Radhwane; Melissa Muharam, Farrah; Nurulhuda, Khairudin; Ahmad Jaafar, Noraini; Keng Yap, Ng
Format	Journal Article
Language	English
Published	Elsevier B.V., 01.02.2023
Subjects	Algorithm; Multicollinearity; Rice; Unmanned aerial vehicle; Vegetation index
ISSN	0168-1699 1872-7107
DOI	10.1016/j.compag.2023.107621

More Information
Summary:
• Base and ensemble algorithms were compared for handling multicollinearity among VIs.
• The comparison covers model performance, variance, stability, and confidence.
• The MLs' model performance and under/overfitting were better in MCC than in NMCC.
• Multicollinearity does not affect the algorithms' model variance.
• Multicollinearity does not affect the algorithms' model confidence.

Rice biomass is a biofuel source and a yield indicator. Conventional sampling methods predict rice biomass accurately. However, these methods are destructive, time-consuming, expensive, and labour-intensive. Unmanned aerial vehicles (UAVs) overcome these shortcomings by providing vegetation indices (VIs) that are sensitive to rice attributes. Nevertheless, VIs are collinear, and their analysis requires machine learning algorithms (MLs). The analysis of collinear VIs using base (single) and ensemble MLs is yet to be investigated. Therefore, this study compares the base and ensemble MLs' model performance, variance, stability (under/overfitting), and confidence for rice biomass prediction in a multicollinearity context (MCC) and a non-multicollinearity context (NMCC). To that end, a randomised complete block design experiment was conducted in the IADA KETARA rice granary in Terengganu, Malaysia. The experiment yielded 360 samples of five biomass traits, five spectral bands, and ninety VIs. The MLs' model performance and under/overfitting were better in MCC than in NMCC for predicting all rice biomass traits. The ensemble MLs outperformed the base MLs for predicting all rice biomass traits in MCC and NMCC. All base and ensemble MLs showed inconsistent patterns of R² and RMSE variances in MCC and NMCC. Finally, neither multicollinearity nor the base-ensemble distinction affected the model confidence; rather, model confidence was subject to the cross-effects of the ML and dataset characteristics. The present study reveals how sensitive different base and ensemble MLs are to multicollinearity in terms of model performance, stability, variance, and confidence.
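
The summary does not state how the MCC and NMCC feature sets were built. As a minimal sketch only, the following Python example assumes one common approach: compute a few VIs from band reflectances, then greedily drop any VI whose absolute Pearson correlation with an already-kept VI reaches a threshold. The synthetic reflectance values, the choice of indices, the `NDVI_rescaled` feature, the 0.9 threshold, and the function names are all illustrative assumptions, not the authors' data or method.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def drop_collinear(features, threshold=0.9):
    """Greedy filter (assumed approach): keep a feature only if its absolute
    correlation with every already-kept feature is below the threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Synthetic reflectances standing in for UAV band data (hypothetical values).
rng = random.Random(0)
n_samples = 360
red = [rng.uniform(0.03, 0.15) for _ in range(n_samples)]
green = [rng.uniform(0.05, 0.20) for _ in range(n_samples)]
nir = [rng.uniform(0.30, 0.60) for _ in range(n_samples)]

ndvi = [(n - r) / (n + r) for n, r in zip(nir, red)]
vis = {
    "NDVI": ndvi,
    "SR": [n / r for n, r in zip(nir, red)],           # simple ratio
    "DVI": [n - r for n, r in zip(nir, red)],          # difference index
    "GNDVI": [(n - g) / (n + g) for n, g in zip(nir, green)],
    # Linear rescaling of NDVI, included to show a perfectly collinear feature.
    "NDVI_rescaled": [(v + 1) / 2 for v in ndvi],
}

mcc_features = list(vis)                   # all VIs, collinearity left in place
nmcc_features = drop_collinear(vis, 0.9)   # filtered subset

print("MCC features: ", mcc_features)
print("NMCC features:", nmcc_features)
```

Under these assumptions, the same base and ensemble learners could then be trained once on `mcc_features` and once on `nmcc_features`, with cross-validated R² and RMSE compared between the two contexts. A variance inflation factor filter would be an equally plausible alternative to the pairwise correlation threshold.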