SCDNET: A novel convolutional network for semantic change detection in high resolution optical remote sensing imagery

Bibliographic Details
Published in International Journal of Applied Earth Observation and Geoinformation Vol. 103; p. 102465
Main Authors Peng, Daifeng; Bruzzone, Lorenzo; Zhang, Yongjun; Guan, Haiyan; He, Pengfei
Format Journal Article
Language English
Published Elsevier B.V. 01.12.2021
Elsevier
Summary:
• A novel end-to-end convolutional network for large-scale semantic change detection (SCDNet) is proposed.
• A multi-scale atrous convolution unit is proposed to capture multi-scale change information.
• A novel attention unit is employed for effective feature fusion.
• A deep supervision strategy is introduced to improve network performance.

With the continuing improvement of remote-sensing (RS) sensors, it is crucial to monitor Earth surface changes at fine scale and in great detail. Semantic change detection (SCD), which locates changes and identifies their "from-to" categories simultaneously, is therefore gaining growing attention in the RS community. However, because large-scale SCD datasets are of limited availability, most existing SCD methods focus on scene-level changes, where semantic change maps are generated with only coarse boundaries or scarce category information. To address this issue, we propose a novel convolutional network for large-scale SCD (SCDNet). It is based on a Siamese UNet architecture, which consists of two encoders and two decoders with shared weights. First, multi-temporal images are given as input to the encoders to extract multi-scale deep representations. A multi-scale atrous convolution (MAC) unit is inserted at the end of the encoders to enlarge the receptive field and capture multi-scale information. Then, difference feature maps are generated for each scale and combined with the feature maps from the encoders to serve as inputs for the decoders. An attention mechanism and a deep supervision strategy are further introduced to improve network performance. Finally, a softmax layer produces a semantic change map for each temporal image. Extensive experiments carried out on two large-scale high-resolution SCD datasets demonstrate the effectiveness and superiority of the proposed method.
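
The building blocks named in the summary (atrous convolution at several rates, difference feature maps, a softmax output) can be illustrated with a minimal one-dimensional sketch in plain Python. This is not the authors' implementation: the function names, the dilation rates (1, 2, 4), and the use of an absolute difference as the difference operator are all assumptions made for illustration, since the record does not specify them.

```python
import math


def receptive_field(kernel_size, rate):
    """Span (in input samples) covered by one atrous kernel at a given dilation rate."""
    return (kernel_size - 1) * rate + 1


def atrous_conv1d(signal, kernel, rate):
    """Zero-padded dilated cross-correlation; output length equals input length.

    Assumes an odd-length kernel so that the taps are centred on each sample.
    """
    half = (len(kernel) // 2) * rate
    padded = [0.0] * half + [float(x) for x in signal] + [0.0] * half
    return [
        sum(w * padded[i + k * rate] for k, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def multi_scale_atrous(signal, kernel, rates=(1, 2, 4)):
    """One response per dilation rate (rates are illustrative, not taken from the paper)."""
    return [atrous_conv1d(signal, kernel, r) for r in rates]


def abs_difference(feat_t1, feat_t2):
    """Element-wise absolute difference (an assumed choice of difference operator)."""
    return [abs(a - b) for a, b in zip(feat_t1, feat_t2)]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

With a 3-tap kernel, `receptive_field` grows from 3 samples at rate 1 to 9 samples at rate 4 without adding weights, which is the sense in which atrous convolution enlarges the receptive field. Calling `multi_scale_atrous` on the features of each date and passing the per-rate outputs to `abs_difference` gives a toy analogue of per-scale difference features.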
ISSN: 1569-8432, 1872-826X
DOI: 10.1016/j.jag.2021.102465