Distilled representation using patch-based local-to-global similarity strategy for visual place recognition

Bibliographic Details
Published in Knowledge-Based Systems, Vol. 280; p. 111015
Main Authors Zhang, Qieshi; Xu, Zhenyu; Kang, Yuhang; Hao, Fusheng; Ren, Ziliang; Cheng, Jun
Format Journal Article
Language English
Published Elsevier B.V. 25.11.2023
Summary: Visual Place Recognition (VPR) is important for ensuring the accuracy and reliability of re-localization in a Visual Simultaneous Localization and Mapping (VSLAM) system, reducing potential errors in mapping and navigation tasks. CNN-based VPR techniques struggle to mitigate the impact of severe appearance changes caused by seasons and weather, as well as viewpoint changes arising from robot motion deviations. To cope with this problem, a local-to-global similarity strategy is proposed in this paper. Specifically, an Auto-Encoder (AE) block is designed to distill appearance-invariant local features from AlexNet, where each local feature represents a specific image patch. Then, three local similarity measures, namely paired similarity, additional similarity, and adjacent similarity, are used to measure the similarity between paired images. Finally, weight encoders are introduced to combine the three local measures into a global one that achieves viewpoint invariance. Extensive experiments show that the proposed method is robust to severe appearance and viewpoint changes while outperforming the current state-of-the-art methods on public visual place recognition datasets. Moreover, the proposed similarity strategy distinguishes the relationships between internal and external patches within images, enhancing its recognition capability in real-world scenarios.
• Auto-Encoder enhances AlexNet’s local features, improving appearance invariance.
• Innovative local-to-global similarity for robust viewpoint invariance.
• Weight encoders fine-tune local measures for dynamic similarity.
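The local-to-global idea in the summary above can be illustrated with a small sketch. This is not the authors' implementation: the summary does not define the three local measures, so the definitions of `paired_similarity` (same-position patches) and `adjacent_similarity` (best match within a 3x3 neighbourhood) are assumptions made for illustration, the "additional similarity" is omitted because no detail is given, and the fixed weights in `global_similarity` are a hypothetical stand-in for the learned weight encoders. Patch features are plain lists here rather than AE-distilled AlexNet features.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def paired_similarity(grid_a, grid_b):
    """Mean cosine similarity of patches at the same grid position (assumed definition)."""
    rows, cols = len(grid_a), len(grid_a[0])
    total = sum(cosine(grid_a[r][c], grid_b[r][c])
                for r in range(rows) for c in range(cols))
    return total / (rows * cols)

def adjacent_similarity(grid_a, grid_b):
    """Mean over patches in A of the best match in B's 3x3 neighbourhood (assumed definition)."""
    rows, cols = len(grid_a), len(grid_a[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            total += max(
                cosine(grid_a[r][c], grid_b[rr][cc])
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            )
    return total / (rows * cols)

def global_similarity(local_scores, weights):
    """Weighted average of local scores; fixed weights stand in for learned weight encoders."""
    return sum(w * s for w, s in zip(weights, local_scores)) / sum(weights)

# Toy 2x2 patch grids with 2-D features; grid_b is grid_a with its columns swapped,
# mimicking a sideways viewpoint shift.
grid_a = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
grid_b = [[[0.0, 1.0], [1.0, 0.0]], [[0.0, 1.0], [1.0, 0.0]]]

paired = paired_similarity(grid_a, grid_b)
adjacent = adjacent_similarity(grid_a, grid_b)
score = global_similarity([paired, adjacent], weights=[0.5, 0.5])
```

In this toy case the same-position comparison finds no match, while the neighbourhood comparison recovers the shifted patches, which hints at why combining several local measures may help under viewpoint change.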
ISSN: 0950-7051, 1872-7409
DOI: 10.1016/j.knosys.2023.111015