Distilled representation using patch-based local-to-global similarity strategy for visual place recognition
Published in | Knowledge-based systems Vol. 280; p. 111015 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V., 25.11.2023 |
Summary: | Visual Place Recognition (VPR) is important for ensuring the accuracy and reliability of re-localization in a Visual Simultaneous Localization and Mapping (VSLAM) system, effectively reducing potential errors in mapping and navigation tasks. In VPR tasks, CNN-based techniques struggle to mitigate the impact of severe appearance changes caused by seasons and weather, as well as viewpoint changes arising from robot motion deviations. To cope with this problem, a local-to-global similarity strategy is proposed in this paper. Specifically, an Auto-Encoder (AE) block is designed to distill appearance-invariant local features from AlexNet, where each local feature represents a specific image patch. Then, three local similarity measures, namely paired similarity, additional similarity, and adjacent similarity, are used to measure the similarity between paired images. Finally, weight encoders are introduced to combine the three local measures into a global one that achieves viewpoint invariance. Extensive experiments show that the proposed method is robust to severe appearance and viewpoint changes while outperforming the current state-of-the-art methods on public visual place recognition datasets. Moreover, the proposed similarity strategy distinguishes the relationships between internal and external patches within images, effectively enhancing its recognition capability in real-world scenarios.
• Auto-Encoder enhances AlexNet’s local features, improving appearance invariance.
• Innovative local-to-global similarity for robust viewpoint invariance.
• Weight encoders fine-tune local measures for dynamic similarity. |
---|---|
ISSN: | 0950-7051 1872-7409 |
DOI: | 10.1016/j.knosys.2023.111015 |
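
The local-to-global idea described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not define the three measures, so the definitions below are assumptions made only for illustration. Here "paired" compares patches at the same grid position, "adjacent" takes the best match within a 3x3 neighbourhood, the "additional" measure is omitted because nothing in the record specifies it, and fixed scalar weights stand in for the learned weight encoders. All function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def paired_similarity(grid_a, grid_b):
    """ASSUMED definition: mean cosine similarity of patches at the same position."""
    sims = [
        cosine(pa, pb)
        for row_a, row_b in zip(grid_a, grid_b)
        for pa, pb in zip(row_a, row_b)
    ]
    return sum(sims) / len(sims)


def adjacent_similarity(grid_a, grid_b):
    """ASSUMED definition: for each patch in A, the best cosine match within
    the 3x3 neighbourhood (clipped at the border) of the same position in B."""
    rows, cols = len(grid_a), len(grid_a[0])
    sims = []
    for r in range(rows):
        for c in range(cols):
            best = max(
                cosine(grid_a[r][c], grid_b[rr][cc])
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            )
            sims.append(best)
    return sum(sims) / len(sims)


def global_similarity(grid_a, grid_b, weights=(0.5, 0.5)):
    """Weighted sum of the local measures. Fixed weights are a stand-in for
    the learned weight encoders mentioned in the summary."""
    w_paired, w_adjacent = weights
    return (w_paired * paired_similarity(grid_a, grid_b)
            + w_adjacent * adjacent_similarity(grid_a, grid_b))


# Toy 1x2 grids of 2-D patch descriptors; B is A shifted by one patch.
a = [[[1.0, 0.0], [0.0, 1.0]]]
b = [[[0.0, 1.0], [1.0, 0.0]]]
print(paired_similarity(a, b), adjacent_similarity(a, b), global_similarity(a, b))
```

In the toy example the same-position comparison finds no match, while the neighbourhood search recovers the shifted patches, which is the kind of tolerance to small viewpoint shifts that combining local measures is meant to provide.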