Towards Global Localization using Multi-Modal Object-Instance Re-Identification
Main Authors | , , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 18.09.2024 |
Summary: Re-identification (ReID) is a critical challenge in computer vision, predominantly studied in the context of pedestrians and vehicles. However, robust object-instance ReID, which has significant implications for tasks such as autonomous exploration, long-term perception, and scene understanding, remains underexplored. In this work, we address this gap by proposing a novel dual-path object-instance re-identification transformer architecture that integrates multimodal RGB and depth information. By leveraging depth data, we demonstrate improvements in ReID across scenes that are cluttered or have varying illumination conditions. Additionally, we develop a ReID-based localization framework that enables accurate camera localization and pose identification across different viewpoints. We validate our methods using two custom-built RGB-D datasets, as well as multiple sequences from the open-source TUM RGB-D datasets. Our approach demonstrates significant improvements in both object-instance ReID (mAP of 75.18) and localization accuracy (success rate of 83% on TUM-RGBD), highlighting the essential role of object ReID in advancing robotic perception. Our models, frameworks, and datasets have been made publicly available.
DOI: 10.48550/arxiv.2409.12002