Bi-directional attention based RGB-D fusion for category-level object pose and shape estimation
Published in | Multimedia Tools and Applications, Vol. 83, no. 17, pp. 53043-53063 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | New York, Springer US, 01.05.2024, Springer Nature B.V |
Subjects | |
Summary: | RGB-D images contain color and geometric information, which are complementary for object pose and shape estimation. For instance-level objects, a dense-fusion scheme is normally used to fuse the features extracted from the RGB and depth channels for pose estimation. For category-level objects, however, the effectiveness of the dense-fusion feature is degraded by significant intra-class variations in color and geometry. To address this problem, we propose AttentionFusion, a bi-directional attention-based RGB-D fusion framework for category-level object pose and shape estimation. In this framework, a bi-directional cross-attention mechanism explores the complex contextual relationship between the color and geometric features on a global scale for feature fusion. Based on the fused feature, the 6D pose of the category-level object instance is refined iteratively, and the object shape is also estimated precisely. Experimental results show that the proposed method achieves state-of-the-art performance for object pose and shape estimation on the REAL275 dataset. |
---|---|
ISSN: | 1380-7501, 1573-7721 |
DOI: | 10.1007/s11042-023-17626-6 |
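
The summary describes fusing color and geometric features with cross-attention applied in both directions. The following is a minimal sketch of that general idea, not the authors' AttentionFusion implementation: it assumes each of N object points carries one color feature and one geometry feature of equal width, omits the learned query/key/value projections a real network would use, and concatenates the results. The function names `cross_attention` and `bidirectional_fusion` are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax_rows(a):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in a:
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def cross_attention(queries, context):
    """Scaled dot-product attention: every query row attends over all context rows."""
    d = len(queries[0])
    scores = matmul(queries, transpose(context))
    scores = [[v / math.sqrt(d) for v in row] for row in scores]
    weights = softmax_rows(scores)
    return matmul(weights, context), weights


def bidirectional_fusion(color_feats, geo_feats):
    """Assumed fusion rule: concatenate both streams with what each gathered from the other."""
    color_from_geo, _ = cross_attention(color_feats, geo_feats)  # color queries geometry
    geo_from_color, _ = cross_attention(geo_feats, color_feats)  # geometry queries color
    return [
        c + g + cg + gc  # list concatenation, giving 4 * d values per point
        for c, g, cg, gc in zip(color_feats, geo_feats, color_from_geo, geo_from_color)
    ]


if __name__ == "__main__":
    random.seed(0)
    n_points, width = 5, 4  # toy sizes chosen for illustration only
    color = [[random.gauss(0, 1) for _ in range(width)] for _ in range(n_points)]
    geo = [[random.gauss(0, 1) for _ in range(width)] for _ in range(n_points)]
    fused = bidirectional_fusion(color, geo)
    print(len(fused), len(fused[0]))
```

Because every query attends over all context rows, each point's fused feature depends on the whole object rather than only on its own pixel-point pair, which is the "global scale" contrast with per-point dense fusion drawn in the summary.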