Network adaptation for color image semantic segmentation

Bibliographic Details
Published in IET Image Processing Vol. 17; no. 10; pp. 2972-2983
Main Authors An, Taeg‐Hyun, Kang, Jungyu, Min, Kyoung‐Wook
Format Journal Article
Language English
Published Wiley 01.08.2023
Summary: Image analysis using deep learning has made significant progress in recent decades, and the importance of pre‐processing input images has become evident. However, adapting the network structure to suit the input images has not been considered. In this study, a simple network adaptation method for color image analysis is described. The method is illustrated using semantic segmentation, which mainly takes a color image as input. The method is inspired by chrominance subsampling, which is a practical method for image and video analysis. The human visual system is less sensitive to color differences than it is to brightness. Based on this phenomenon, existing networks can be improved by giving chrominance information less resolution than luminance information in the network encoder design, applying the group convolution concept. The proposed method helps to achieve improved results without changing the complexity of the baseline network model, and is especially helpful in applications with limited resources, such as autonomous driving and augmented reality. Experiments were performed on combinations of datasets (i.e. CamVid, Cityscapes and KITTI‐360) and networks (i.e. ENet, ERFNet, and DeepLabv3+ with MobileNetV2). The results show that the method improves the performance of existing network structures without increasing the number of parameters.
ISSN: 1751-9659, 1751-9667
DOI: 10.1049/ipr2.12846
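
Illustrative note: the summary refers to chrominance subsampling and to group convolution without giving details. The sketch below is not the authors' network. It only illustrates the two underlying ideas under stated assumptions: a full-range BT.601 (JFIF-style) RGB to YCbCr conversion, a 4:2:0-style layout in which the chroma planes are averaged over 2x2 blocks while luminance keeps full resolution, and the standard weight count of a grouped 2D convolution. All function names are hypothetical and chosen for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 YCbCr (JFIF convention, assumed here)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_2x2(plane):
    """Average each 2x2 block of a plane (list of rows); height and width must be even."""
    h, w = len(plane), len(plane[0])
    return [
        [
            (plane[i][j] + plane[i][j + 1] + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
            for j in range(0, w, 2)
        ]
        for i in range(0, h, 2)
    ]


def split_luma_chroma(image):
    """Return full-resolution Y and half-resolution Cb, Cr planes for an RGB image."""
    ycc = [[rgb_to_ycbcr(*px) for px in row] for row in image]
    y = [[p[0] for p in row] for row in ycc]
    cb = subsample_2x2([[p[1] for p in row] for row in ycc])
    cr = subsample_2x2([[p[2] for p in row] for row in ycc])
    return y, cb, cr


def conv_params(in_ch, out_ch, kernel, groups=1):
    """Weight count of a 2D convolution without bias; channel counts must divide by groups."""
    if in_ch % groups or out_ch % groups:
        raise ValueError("in_ch and out_ch must be divisible by groups")
    return out_ch * (in_ch // groups) * kernel * kernel


if __name__ == "__main__":
    # A 4x4 mid-gray test image: Y stays 4x4, Cb and Cr shrink to 2x2.
    gray = [[(128, 128, 128)] * 4 for _ in range(4)]
    y, cb, cr = split_luma_chroma(gray)
    print(len(y), len(y[0]), len(cb), len(cb[0]))
    # Splitting a 16-to-16 channel 3x3 convolution into two groups halves its weights.
    print(conv_params(16, 16, 3), conv_params(16, 16, 3, groups=2))
```

Because a grouped convolution processes each channel group independently, it is one plausible way to keep separate luminance and chrominance paths in an encoder; how the published method allocates channels and resolutions is described in the full text, not here.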