A CNN-Transformer Network With Multiscale Context Aggregation for Fine-Grained Cropland Change Detection

Nonagriculturalization incidents are serious threats to local agricultural ecosystem and global food security. Remote sensing change detection (CD) can provide an effective approach for in-time detection and prevention of such incidents. However, existing CD methods are difficult to deal with the la...

Full description

Saved in:

Bibliographic Details
Published in	IEEE journal of selected topics in applied earth observations and remote sensing Vol. 15; pp. 4297 - 4306
Main Authors	Liu, Mengxi, Chai, Zhuoqun, Deng, Haojun, Liu, Rong
Format	Journal Article
Language	English
Published	Piscataway IEEE 2022 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Agglomeration Aggregation Agricultural ecosystems Agricultural land Biological system modeling Change detection Change detection (CD) Computer applications Context cropland Data mining Datasets Decoding Deep layer deep learning (DL) Detection Feature extraction Food security Head High resolution Image resolution Remote sensing Resolution Spatial discrimination Spatial resolution Task analysis transformer Transformers
Online Access	Get full text

Cover

Loading…

More Information
Summary:	Nonagriculturalization incidents are serious threats to local agricultural ecosystem and global food security. Remote sensing change detection (CD) can provide an effective approach for in-time detection and prevention of such incidents. However, existing CD methods are difficult to deal with the large intraclass differences of cropland changes in high-resolution images. In addition, traditional CNN based models are plagued by the loss of long-range context information, and the high computational complexity brought by deep layers. Therefore, in this article, we propose a CNN-transformer network with multiscale context aggregation (MSCANet), which combines the merits of CNN and transformer to fulfill efficient and effective cropland CD. In the MSCANet, a CNN-based feature extractor is first utilized to capture hierarchical features, then a transformer-based MSCA is designed to encode and aggregate context information. Finally, a multibranch prediction head with three CNN classifiers is applied to obtain change maps, to enhance the supervision for deep layers. Besides, for the lack of CD dataset with fine-grained cropland change of interest, we also provide a new cropland change detection dataset, which contains 600 pairs of 512 × 512 bi-temporal images with the spatial resolution of 0.5-2m. Comparative experiments with several CD models prove the effectiveness of the MSCANet, with the highest F1 of 64.67% on the high-resolution semantic CD dataset, and of 71.29% on CLCD.
Bibliography:	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14
ISSN:	1939-1404 2151-1535
DOI:	10.1109/JSTARS.2022.3177235