Multi-Temporal Unmanned Aerial Vehicle Remote Sensing for Vegetable Mapping Using an Attention-Based Recurrent Convolutional Neural Network

Vegetable mapping from remote sensing imagery is important for precision agricultural activities such as automated pesticide spraying. Multi-temporal unmanned aerial vehicle (UAV) data has the merits of both very high spatial resolution and useful phenological information, which shows great potentia...

Full description

Saved in:
Bibliographic Details
Published inRemote sensing (Basel, Switzerland) Vol. 12; no. 10; p. 1668
Main Authors Feng, Quanlong, Yang, Jianyu, Liu, Yiming, Ou, Cong, Zhu, Dehai, Niu, Bowen, Liu, Jiantao, Li, Baoguo
Format Journal Article
LanguageEnglish
Published Basel MDPI AG 01.05.2020
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Vegetable mapping from remote sensing imagery is important for precision agricultural activities such as automated pesticide spraying. Multi-temporal unmanned aerial vehicle (UAV) data has the merits of both very high spatial resolution and useful phenological information, which shows great potential for accurate vegetable classification, especially under complex and fragmented agricultural landscapes. In this study, an attention-based recurrent convolutional neural network (ARCNN) has been proposed for accurate vegetable mapping from multi-temporal UAV red-green-blue (RGB) imagery. The proposed model firstly utilizes a multi-scale deformable CNN to learn and extract rich spatial features from UAV data. Afterwards, the extracted features are fed into an attention-based recurrent neural network (RNN), from which the sequential dependency between multi-temporal features could be established. Finally, the aggregated spatial-temporal features are used to predict the vegetable category. Experimental results show that the proposed ARCNN yields a high performance with an overall accuracy of 92.80%. When compared with mono-temporal classification, the incorporation of multi-temporal UAV imagery could significantly boost the accuracy by 24.49% on average, which justifies the hypothesis that the low spectral resolution of RGB imagery could be compensated by the inclusion of multi-temporal observations. In addition, the attention-based RNN in this study outperforms other feature fusion methods such as feature-stacking. The deformable convolution operation also yields higher classification accuracy than that of a standard convolution unit. Results demonstrate that the ARCNN could provide an effective way for extracting and aggregating discriminative spatial-temporal features for vegetable mapping from multi-temporal UAV RGB imagery.
ISSN:2072-4292
2072-4292
DOI:10.3390/rs12101668