Dual-range Context Aggregation for Efficient Semantic Segmentation in Remote Sensing Images

Bibliographic Details
Published in: IEEE Geoscience and Remote Sensing Letters, Vol. 20, p. 1
Main Authors: He, Guangjun; Dong, Zhe; Feng, Pengming; Muhtar, Dilxat; Zhang, Xueliang
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2023

Summary: Although introducing self-attention mechanisms is beneficial for establishing long-range dependencies and exploring global context information in remote sensing image semantic segmentation, it incurs expensive computation and a large memory cost. In this paper, we address this dilemma by proposing a lightweight dual-range context aggregation network (LDCANet) for efficient remote sensing image semantic segmentation. First, a dual-range context aggregation module (DCAM) is designed to aggregate the local features and the global semantic context acquired by convolutions and self-attention, respectively, where the self-attention is implemented simply with two cascaded linear layers to reduce computational complexity. Furthermore, a simple and lightweight decoder is employed to combine information from different levels, in which a multilayer perceptron (MLP)-based efficient linear block (ELB) is proposed to yield a strong and efficient representation. Experiments conducted on the ISPRS Vaihingen and GID datasets show that LDCANet achieves an excellent trade-off between segmentation accuracy and computational efficiency. In particular, our method achieves 74.12% mIoU on the ISPRS Vaihingen dataset and 61.42% mIoU on the GID dataset with only 4.98M parameters.
ISSN: 1545-598X; 1558-0571
DOI: 10.1109/LGRS.2023.3233979
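
The summary describes two ingredients: a dual-range context aggregation module (DCAM) that fuses convolutional local features with a global branch whose self-attention is realized by two cascaded linear layers, and an MLP-based efficient linear block (ELB) in the decoder. The snippet below is a minimal, hypothetical PyTorch sketch of the dual-range idea only; the paper's code is not reproduced here, so the module name, the learned-memory reading of the two linear layers (in the spirit of external-attention-style designs), and the simple additive fusion are assumptions made purely for illustration.

```python
# Hypothetical sketch (not the authors' released code): a dual-range block that
# pairs a 3x3 convolution (local range) with a linear-complexity attention
# built from two cascaded linear layers (global range), then sums the two.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DualRangeBlock(nn.Module):
    """Illustrative stand-in for a DCAM-style module; names are assumptions."""

    def __init__(self, channels: int, mem_dim: int = 64):
        super().__init__()
        # Local range: ordinary convolution over the feature map.
        self.local = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # Global range: two cascaded linear layers applied to flattened tokens,
        # costing O(HW * mem_dim) instead of the O((HW)^2) of full self-attention.
        self.to_mem = nn.Linear(channels, mem_dim, bias=False)
        self.from_mem = nn.Linear(mem_dim, channels, bias=False)
        self.norm = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local_feat = self.local(x)

        tokens = x.flatten(2).transpose(1, 2)               # (B, HW, C)
        attn = F.softmax(self.to_mem(tokens), dim=1)        # attention over positions
        attn = attn / (attn.sum(dim=-1, keepdim=True) + 1e-6)
        global_feat = self.from_mem(attn)                    # (B, HW, C)
        global_feat = global_feat.transpose(1, 2).reshape(b, c, h, w)

        # Simple additive fusion stands in for the paper's aggregation step.
        return self.norm(local_feat + global_feat)


if __name__ == "__main__":
    block = DualRangeBlock(channels=32)
    out = block(torch.randn(2, 32, 64, 64))
    print(out.shape)  # torch.Size([2, 32, 64, 64])
```

The point of the sketch is the cost argument: because the spatial tokens only interact with a small learned memory of size mem_dim, the global branch scales linearly with the number of pixels, which is how a "two cascaded linear layers" formulation can keep the parameter count and memory footprint low while still providing image-wide context.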