A Novel Building Extraction Network via Multi-Scale Foreground Modeling and Gated Boundary Refinement

Bibliographic Details
Published in Remote sensing (Basel, Switzerland) Vol. 15; no. 24; p. 5638
Main Authors Liu, Junlin; Xia, Ying; Feng, Jiangfan; Bai, Peng
Format Journal Article
Language English
Published Basel MDPI AG 01.12.2023
Summary: Deep learning-based methods for building extraction from remote sensing images have been widely applied in fields such as land management and urban planning. However, extracting buildings from remote sensing images commonly faces challenges due to specific shooting angles. First, there is a foreground–background imbalance: the model over-learns features unrelated to buildings, resulting in performance degradation and propagative interference. Second, buildings have complex boundary information, which conventional network architectures fail to capture at fine detail. In this paper, we designed a multi-task U-shaped network (BFL-Net) to solve these problems. The network enhances the expression of foreground and boundary features in the prediction results through foreground learning and boundary refinement, respectively. Specifically, the Foreground Mining Module (FMM) uses the relationship between buildings and multi-scale scene spaces to explicitly model, extract, and learn foreground features, which enhances both foreground features and related contextual features. The Dense Dilated Convolutional Residual Block (DDCResBlock) and the Dual Gate Boundary Refinement Module (DGBRM) process the diverted regular stream and boundary stream, respectively. The former effectively expands the receptive field, and the latter uses spatial and channel gates to activate boundary features in low-level feature maps, helping the network refine boundaries. The network's predictions for the building, foreground, and boundary are each supervised by the corresponding ground truth. Experimental results on the WHU Building Aerial Imagery and Massachusetts Buildings Datasets show that BFL-Net achieves IoU scores of 91.37% and 74.50%, respectively, surpassing state-of-the-art models.
ISSN:2072-4292
DOI:10.3390/rs15245638
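
The summary reports results as IoU (Intersection over Union) scores. For readers unfamiliar with the metric, the following is a minimal sketch of how IoU is commonly computed for a pair of binary building masks. It is not taken from the paper: the function name `binary_iou`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, and the paper's own evaluation protocol may differ.

```python
from typing import Sequence


def binary_iou(pred: Sequence[Sequence[int]],
               target: Sequence[Sequence[int]]) -> float:
    """Intersection over Union of two binary masks (nonzero = building pixel).

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of rows")

    intersection = 0  # pixels marked as building in both masks
    union = 0         # pixels marked as building in at least one mask
    for pred_row, target_row in zip(pred, target):
        if len(pred_row) != len(target_row):
            raise ValueError("masks must have the same number of columns")
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            intersection += int(p and t)
            union += int(p or t)

    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are covered by either mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(binary_iou(pred, target))  # 0.5
```

In this toy case the two masks share 2 building pixels out of 4 covered in total, giving an IoU of 0.5. Dataset-level scores such as those quoted in the summary are typically obtained by accumulating intersection and union over all test images, although the exact aggregation used by the authors is not stated in this record.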