Semantic segmentation of water bodies in very high-resolution satellite and aerial images


Bibliographic Details
Published in Remote Sensing of Environment Vol. 287; p. 113452
Main Authors Wieland, Marc; Martinis, Sandro; Kiefl, Ralph; Gstaiger, Veronika
Format Journal Article
Language English
Published Elsevier Inc 15.03.2023
Summary: This study evaluates the performance of convolutional neural networks for semantic segmentation of water bodies in very high-resolution satellite and aerial images from multiple sensors, with a particular focus on flood emergency response applications. Different model architectures (U-Net and DeepLab-V3+) are combined with encoder backbones (MobileNet-V3, ResNet-50 and EfficientNet-B4) and tested for their ability to delineate inundated areas under varying environmental conditions and data availability scenarios. An unprecedented reference dataset of 1120 globally sampled images with quality-checked binary water masks is introduced and used to train, validate and test the models for water body segmentation. Furthermore, independent test datasets are developed to test the generalization ability of the trained models across regions, sensors (IKONOS, GeoEye-1, WorldView-2, WorldView-3 and four different airborne camera systems) and tasks (normal water and flood water segmentation). Results indicate that across all tested scenarios a U-Net model with a MobileNet-V3 backbone pre-trained on ImageNet performs best. While using only the R-G-B image bands performs well, adding the near-infrared band (if available) slightly improves prediction results. Similarly, adding slope information from an independent digital elevation model increases accuracies. Train-time augmentation and contrast enhancement could improve transferability across sensors, in particular between satellite and aerial images. Moreover, adding noisy training data from freely available online resources could further improve performance with minimal annotation effort.
•Extensive global reference dataset for water segmentation in aerial and satellite images.
•Generalization ability on heterogeneous test datasets across 8 sensors, 3 platforms and 2 tasks.
•Performance evaluation of various convolutional neural network architectures.
•Best performance for U-Net with MobileNet-V3 encoder pre-trained on ImageNet.
•Application examples include amongst others recent floods in Germany.
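
The summary mentions feeding the network R-G-B bands, optionally extended by a near-infrared band and by slope derived from a digital elevation model, together with contrast enhancement. This record does not describe how the authors implement these steps, so the following is only a minimal sketch of how such an input stack might be assembled. The function names `stretch_contrast` and `stack_inputs`, the 2nd/98th percentile stretch, and the scaling of slope by 90 degrees are illustrative assumptions, not details taken from the article.

```python
import numpy as np


def stretch_contrast(band, low=2.0, high=98.0):
    """Percentile-based linear stretch to [0, 1].

    Assumption: the 2/98 percentile limits are illustrative only.
    """
    band = band.astype(np.float32)
    lo, hi = (float(v) for v in np.percentile(band, [low, high]))
    if hi <= lo:
        # Constant band: nothing to stretch, avoid division by zero.
        return np.zeros_like(band)
    return np.clip((band - lo) / (hi - lo), 0.0, 1.0)


def stack_inputs(rgb, nir=None, slope=None):
    """Stack R-G-B with optional NIR and slope into an (H, W, C) array."""
    channels = [stretch_contrast(rgb[..., i]) for i in range(3)]
    if nir is not None:
        channels.append(stretch_contrast(nir))
    if slope is not None:
        # Assumption: slope is given in degrees and scaled by 90.
        scaled = slope.astype(np.float32) / 90.0
        channels.append(np.clip(scaled, 0.0, 1.0))
    return np.stack(channels, axis=-1)


# Synthetic stand-in data (hypothetical 11-bit imagery, 64 x 64 pixels).
rng = np.random.default_rng(0)
rgb = rng.integers(0, 2048, size=(64, 64, 3))
nir = rng.integers(0, 2048, size=(64, 64))
slope = rng.uniform(0.0, 45.0, size=(64, 64))

x_rgb = stack_inputs(rgb)                # 3 channels
x_full = stack_inputs(rgb, nir, slope)   # 5 channels
```

Keeping the extra bands optional mirrors the data availability scenarios named in the summary: the same function yields a three-channel input when only R-G-B is available and a five-channel input when near-infrared and slope can be added.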
ISSN: 0034-4257; 1879-0704
DOI: 10.1016/j.rse.2023.113452