IEMask R-CNN: Information-Enhanced Mask R-CNN
The instance segmentation task is relatively difficult in computer vision, which requires not only high-quality masks but also high-accuracy instance category classification. Mask R-CNN has been proven to be a feasible method. However, due to the Feature Pyramid Network (FPN) structure lack useful c...
Saved in:
Published in | IEEE transactions on big data Vol. 9; no. 2; pp. 688 - 700 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published |
Piscataway
IEEE
01.04.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | The instance segmentation task is relatively difficult in computer vision, which requires not only high-quality masks but also high-accuracy instance category classification. Mask R-CNN has been proven to be a feasible method. However, due to the Feature Pyramid Network (FPN) structure lack useful channel information, global information and low-level texture information, and mask branch cannot obtain useful local-global information, Mask R-CNN is prevented from obtaining high-quality masks and high-accuracy instance category classification. Therefore, we proposed the Information-enhanced Mask R-CNN, called IEMask R-CNN. In the FPN structure of IEMask R-CNN, the information-enhanced FPN will enhance the useful channel information and the global information of the feature maps to solve the issues that the high-level feature map loses useful channel information and inaccurate of instance category classification, meanwhile the bottom-up path enhancement with adaptive feature fusion will ultilize the precise positioning signal in the lower layer to enhance the feature pyramid. In the mask branch of IEMask R-CNN, an encoding-decoding mask head will strength local-global information to gain a high-quality mask. Without bells and whistles, IEMask R-CNN gains significant gains of about 2.60%, 4.00%, 3.17% over Mask R-CNN on MS COCO2017, Cityscapes and LVIS1.0 benchmarks respectively. |
---|---|
ISSN: | 2332-7790 2332-7790 2372-2096 |
DOI: | 10.1109/TBDATA.2022.3187413 |