Ensemble and single algorithm models to handle multicollinearity of UAV vegetation indices for predicting rice biomass

Bibliographic Details
Published in Computers and Electronics in Agriculture, Vol. 205; p. 107621
Main Authors Derraz, Radhwane; Melissa Muharam, Farrah; Nurulhuda, Khairudin; Ahmad Jaafar, Noraini; Keng Yap, Ng
Format Journal Article
Language English
Published Elsevier B.V. 01.02.2023
ISSN 0168-1699, 1872-7107
DOI 10.1016/j.compag.2023.107621

Summary:
• Base and ensemble algorithms were compared to handle VI multicollinearity.
• The comparison includes model performance, variance, stability, and confidence.
• The MLs' model performance and under/overfitting were better in MCC than in NMCC.
• Multicollinearity does not affect the algorithms' model variance.
• Multicollinearity does not affect the algorithms' model confidence.

Rice biomass is a biofuel source and a yield indicator. Conventional sampling methods predict rice biomass accurately. However, these methods are destructive, time-consuming, expensive, and labour-intensive. Unmanned aerial vehicles (UAVs) overcome these shortcomings by providing vegetation indices (VIs) that are sensitive to rice attributes. Nevertheless, VIs are collinear, and their analysis requires machine learning algorithms (MLs). The analysis of collinear VIs using base (single) and ensemble MLs is yet to be investigated. Therefore, this study aims to compare the base and ensemble MLs' model performance, variance, stability (under/overfitting), and confidence for rice biomass prediction in a multicollinearity context (MCC) and a non-multicollinearity context (NMCC). To that end, a randomised complete block design experiment was conducted in the IADA KETARA rice granary in Terengganu, Malaysia. The experiment resulted in 360 samples of five biomass traits, five spectral bands, and ninety VIs. The MLs' model performance and under/overfitting were better in MCC than in NMCC for predicting all rice biomass traits. The ensemble MLs outperformed the base MLs for predicting all rice biomass traits in MCC and NMCC. All base and ensemble MLs showed inconsistent patterns of R² and RMSE variances in MCC and NMCC. Finally, neither multicollinearity nor the base-versus-ensemble distinction affected the model confidence; rather, confidence was subject to the cross-effects of the ML and dataset characteristics. The present study reveals how sensitive different base and ensemble MLs are to multicollinearity in terms of model performance, stability, variance, and confidence.
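
For readers unfamiliar with the term, the multicollinearity the summary refers to can be illustrated with a variance inflation factor (VIF) check. The sketch below is only an illustration under stated assumptions: the data are synthetic, the names `vi_a`, `vi_b`, `vi_c` and the helper `variance_inflation_factors` are hypothetical, and the record does not say that the authors used VIF to separate MCC from NMCC.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X (n_samples x n_features).

    VIF_j = 1 / (1 - R2_j), where R2_j is the R-squared obtained by
    regressing column j on all other columns plus an intercept.
    Assumes every column is non-constant.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    vifs = np.empty(n_features)
    for j in range(n_features):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Ordinary least squares with an intercept column.
        design = np.column_stack([np.ones(n_samples), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residual = target - design @ coef
        ss_res = float(residual @ residual)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r_squared = 1.0 - ss_res / ss_tot
        vifs[j] = np.inf if r_squared >= 1.0 else 1.0 / (1.0 - r_squared)
    return vifs


# Synthetic stand-ins for three vegetation indices (not the study's data).
rng = np.random.default_rng(0)
vi_a = rng.normal(size=360)
vi_b = vi_a + 0.05 * rng.normal(size=360)  # nearly a copy of vi_a
vi_c = rng.normal(size=360)                # unrelated to the other two

X = np.column_stack([vi_a, vi_b, vi_c])
vifs = variance_inflation_factors(X)
print(np.round(vifs, 2))
```

In this toy setting the two nearly identical indices receive very large VIFs while the independent one stays close to 1. A cut-off of 5 or 10 is a common rule of thumb for flagging collinear predictors, but any threshold here is an assumption rather than something stated in the record.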