MIFNet: A lightweight multiscale information fusion network

Bibliographic Details
Published in International journal of intelligent systems Vol. 37; no. 9; pp. 5617 - 5642
Main Authors Cheng, Jieren, Peng, Xin, Tang, Xiangyan, Tu, Wenxuan, Xu, Wenhang
Format Journal Article
Language English
Published New York Hindawi Limited 01.09.2022
Summary: Semantic segmentation plays a crucial role in Internet of Things applications such as industrial robotics and self‐driving. Deep learning approaches have recently boosted semantic segmentation accuracy greatly. However, their overall performance in terms of both accuracy and efficiency is still far from satisfactory. We observe that (1) accuracy‐oriented methods rely on numerous convolution layers and sophisticated architectures, which result in heavy computational complexity and long inference times; (2) efficiency‐oriented methods fail to capture the multiscale context information needed for discriminative representations during feature fusion, leading to suboptimal performance. Previous semantic segmentation approaches fail to address these two challenges simultaneously. To resolve the tension between precise segmentation and efficient inference, we propose a novel lightweight Multiscale Information Fusion Network (MIFNet). Specifically, MIFNet mainly consists of two core components: the Pyramid Refinement Connection Module (PRCM) and the Lightweight Information Fusion Module (LIFM). The PRCM exploits skip learning to establish dependencies between different stages. Meanwhile, the pyramid attention mechanism (PAM) in the PRCM, which adjusts the weights of a hybrid pyramid attention vector to refine low‐level spatial features, is developed to alleviate the semantic gap. Moreover, the LIFM is designed to detect objects at multiple scales from a global‐local perspective. In the LIFM, the proposed multiscale dense concatenation (MDC) adopts various dilated convolutions to extract multiscale local context information. Extensive experimental results on benchmark data sets demonstrate that the proposed MIFNet performs significantly better than most existing state‐of‐the‐art methods.
ISSN: 0884-8173, 1098-111X
DOI: 10.1002/int.22804
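
The summary says the MDC uses dilated convolutions to gather local context at several scales. The record gives no implementation details, so the following is only a toy 1D sketch of that general idea in plain Python, not the authors' module: the function names, the kernel, the dilation rates (1, 2, 4), and the simple parallel-branch wiring are all assumptions made for illustration.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1D dilated cross-correlation with zero padding.

    Taps of the kernel are spaced `dilation` samples apart, so a small
    kernel covers a wider span of the input without extra weights.
    """
    half = (len(kernel) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - half) * dilation
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def multiscale_features(signal, kernel, dilations):
    """Run one branch per dilation rate and concatenate per position.

    Hypothetical wiring: the input is kept as the first channel and each
    branch output is appended after it. The real MDC may connect its
    branches differently.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return list(zip(signal, *branches))


impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
features = multiscale_features(impulse, [1, 1, 1], [1, 2, 4])
```

With a 3-tap kernel, the three branches span 3, 5, and 9 input positions respectively, which is why stacking them gives each position a view of its neighbourhood at several scales for the same number of weights per branch.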