A Hybrid Siamese Network With Spatiotemporal Enhancement and Two-Level Feature Fusion for Remote Sensing Image Change Detection
Published in | IEEE transactions on geoscience and remote sensing Vol. 61; pp. 1 - 17 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | New York, IEEE, 2023; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
Summary: | With the spread and development of deep learning (DL) technology, remote sensing (RS) image change detection (CD) has achieved remarkable success. However, accurate CD remains challenging because efficient feature extraction and effective enhancement and refinement of difference features are hard to achieve. To address these limitations, this article proposes a hybrid Siamese network with spatiotemporal enhancement and two-level feature fusion (named HSSENet) for CD. First, an efficient hybrid Siamese backbone is designed that combines the transformer's ability to capture dense dependencies between features with the convolutional neural network (CNN)'s ability to provide local prior knowledge. Second, to reduce irrelevant pseudo-changes and high-frequency noise while keeping changed targets compact, a spatiotemporal enhancement module (STEM) is proposed for effective difference feature enhancement; it adopts the self-attention mechanism for context modeling in the spatiotemporal dimensions and processes low and high frequencies separately. Finally, three two-level feature fusion modules (TL-FFMs) replace standard decoders, aggregating low-level details and high-level semantics to refine boundary information. Experimental results show that HSSENet achieves a better tradeoff between accuracy and efficiency than state-of-the-art methods and significantly outperforms them, with F1-scores of 91.48, 91.55, and 91.17 points on the learning, vision, and RS (LEVIR), Wuhan University (WHU), and deeply supervised image fusion network (DSIFN) test sets, respectively. |
---|---|
ISSN: | 0196-2892 1558-0644 |
DOI: | 10.1109/TGRS.2023.3268294 |
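
The summary reports results as F1-scores on binary change maps. As a reading aid, the sketch below shows how an F1-score is commonly computed for the "changed" class from a predicted mask and a reference mask. It assumes per-pixel evaluation on 0/1 masks; the record does not state the article's exact evaluation protocol, and the function name `change_f1` and the toy masks are hypothetical.

```python
def change_f1(pred, target):
    """F1-score of the 'changed' class for two binary masks.

    Both masks are nested lists of 0/1 with the same shape (an assumption
    made for this sketch; 1 marks a changed pixel).
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1      # changed pixel correctly detected
            elif p and not t:
                fp += 1      # pseudo-change: predicted but not in reference
            elif t and not p:
                fn += 1      # missed change
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined here as 0.0 when no pixel is changed
    return 2 * tp / denom if denom else 0.0


# Toy 2x3 masks, purely illustrative
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = change_f1(pred, target)
print(f"F1 = {100 * score:.2f} points")
```

On these toy masks there are two true positives, one false positive, and one false negative, so the score is 4/6. Scores such as those quoted in the summary are this quantity expressed in percentage points.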