A New Multi-Channel Deep Convolutional Neural Network for Semantic Segmentation of Remote Sensing Image


Bibliographic Details
Published in: IEEE Access, Vol. 8, p. 1
Main Authors: Liu, Wenjie; Zhang, Yongjun; Fan, Haisheng; Zou, Yongjie; Cui, Zhongwei
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2020

Summary: The semantic segmentation of remote sensing (RS) images is an active research field. With the development of deep learning, semantic segmentation based on fully convolutional neural networks has greatly improved segmentation accuracy. The amount of information in an RS image is very large, but the sample distribution is extremely uneven. Therefore, although common networks can segment RS images to a certain extent, segmentation accuracy still leaves much room for improvement. Common neural networks improve classification accuracy by deepening the network, but this loses much of the target's spatial and scale information, and existing feature fusion methods solve only some of these problems. A segmentation network is built to address these problems. The network employs the InceptionV-4 network as its backbone and improves on it. We modify the network structure and introduce a modified Atrous Spatial Pyramid Pooling (ASPP) module to extract multi-scale features of the target from different training stages. Without sacrificing network depth, Inception blocks widen the network to obtain more abstract features. At the same time, the backbone network is used for contextual semantic fusion, which retains more spatial features, and an effective decoder network is designed. Finally, we evaluate our model on the ISPRS 2D Semantic Labeling Contest Potsdam dataset and the Inria Aerial Image Labeling Dataset. The results show that the network performs strongly, reaching an IoU score of 89.62% and an F1 score of 94.49% on the Potsdam dataset, while the IoU score on the Inria dataset is also greatly improved.
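The ASPP idea mentioned in the summary can be illustrated in miniature. The sketch below is hypothetical (not the authors' code): it shows a plain-Python 1-D atrous (dilated) convolution and how running the same kernel at several dilation rates in parallel, as ASPP does, enlarges the receptive field and captures context at multiple scales without adding parameters or reducing resolution.

```python
# Hypothetical sketch of the atrous-convolution principle behind ASPP.
# All names (atrous_conv1d, aspp_1d) are illustrative, not from the paper.

def atrous_conv1d(signal, kernel, rate):
    """Valid-mode 1-D convolution sampling the input `rate` steps apart."""
    k = len(kernel)
    span = (k - 1) * rate + 1          # effective receptive field size
    out = []
    for start in range(len(signal) - span + 1):
        taps = signal[start:start + span:rate]   # dilated sampling
        out.append(sum(w * x for w, x in zip(kernel, taps)))
    return out

def aspp_1d(signal, kernel, rates=(1, 2, 4)):
    """Apply the same kernel at several dilation rates in parallel,
    mimicking the multi-scale branches of an ASPP module."""
    return {r: atrous_conv1d(signal, kernel, r) for r in rates}

signal = [0, 0, 1, 0, 0, 0, 1, 0, 0]
kernel = [1, 1, 1]
features = aspp_1d(signal, kernel)
# With a 3-tap kernel, rate 1 covers a 3-sample window while rate 4
# covers a 9-sample window, so each branch sees context at a different scale.
```

In a 2-D segmentation network the same trick is applied to feature maps, letting the branches gather local and long-range context that is then fused before decoding.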
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3009976