Mask-Pyramid Network: A Novel Panoptic Segmentation Method

Bibliographic Details
Published in: Sensors (Basel, Switzerland), Vol. 24, No. 5, p. 1411
Main Authors: Xian, Peng-Fei; Po, Lai-Man; Xiong, Jing-Jing; Zhao, Yu-Zhi; Yu, Wing-Yin; Cheung, Kwok-Wai
Format: Journal Article
Language: English
Published: Switzerland, MDPI AG, 22.02.2024
Summary: In this paper, we introduce a novel panoptic segmentation method called the Mask-Pyramid Network. Existing Mask R-CNN-based methods first generate a large number of box proposals and then filter them at each feature level, which consumes substantial computational resources even though most of the proposals are suppressed and discarded during Non-Maximum Suppression. Additionally, for panoptic segmentation, it is difficult to properly fuse the semantic segmentation results with the instance segmentation results produced by Mask R-CNN. To address these issues, we propose a new mask pyramid mechanism that distinguishes objects and generates far fewer proposals by referring to already-segmented masks, thereby reducing computational cost. The Mask-Pyramid Network generates object proposals and predicts masks from larger to smaller sizes: it records the pixel area occupied by the larger object masks and then generates new proposals only in the unoccupied areas. Each object mask is represented as an H × W × 1 logit map, which matches the format of the semantic segmentation logits, so applying SoftMax to the concatenated semantic and instance segmentation logits fuses the two results easily and naturally. We empirically demonstrate that the proposed Mask-Pyramid Network achieves comparable accuracy on the Cityscapes and COCO datasets, and we further demonstrate its computational efficiency, obtaining competitive results.
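
The fusion and occupancy mechanisms in the summary above are concrete enough to sketch in code. The following minimal PyTorch sketch is our own illustration, not the authors' released implementation: the function names (fuse_panoptic_logits, unoccupied_region), the tensor layouts, and the zero logit threshold for "occupied" pixels are all assumptions made for clarity.

```python
import torch
import torch.nn.functional as F


def fuse_panoptic_logits(semantic_logits: torch.Tensor,
                         instance_logits: torch.Tensor) -> torch.Tensor:
    """Fuse semantic and instance predictions by concatenation + SoftMax.

    semantic_logits: (C_sem, H, W) per-class logit maps.
    instance_logits: (N_inst, H, W); each object contributes one
                     H x W x 1 logit map, stacked along the channel axis.
    Returns per-pixel probabilities over all C_sem + N_inst channels,
    so an argmax directly yields a panoptic label (class or instance id).
    """
    all_logits = torch.cat([semantic_logits, instance_logits], dim=0)
    return F.softmax(all_logits, dim=0)


def unoccupied_region(instance_logits: torch.Tensor,
                      threshold: float = 0.0) -> torch.Tensor:
    """Boolean map of pixels not yet claimed by any larger mask.

    In the mask pyramid, proposals for smaller objects would be
    generated only inside this region, which is what keeps the
    proposal count low. `.any` over an empty channel dimension is
    False, so with no masks yet the whole image is unoccupied.
    """
    occupied = (instance_logits > threshold).any(dim=0)
    return ~occupied


# Toy usage: 3 semantic classes and 2 instance masks on an 8 x 8 grid.
sem = torch.randn(3, 8, 8)
inst = torch.randn(2, 8, 8)
probs = fuse_panoptic_logits(sem, inst)   # (5, 8, 8), sums to 1 per pixel
panoptic = probs.argmax(dim=0)            # per-pixel panoptic label
free = unoccupied_region(inst)            # where smaller proposals may go
```

One design point worth noting: because each instance mask is a full-resolution logit map in the same format as a semantic class channel, the fusion needs no heuristic overlap resolution; the per-pixel SoftMax arbitrates between classes and instances in one step.
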
ISSN: 1424-8220
DOI: 10.3390/s24051411