Similarity Graph-correlation Reconstruction Network for unsupervised cross-modal hashing


Bibliographic Details
Published in: Expert Systems with Applications, Vol. 237, p. 121516
Main Authors: Yao, Dan; Li, Zhixin; Li, Bo; Zhang, Canlong; Ma, Huifang
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.03.2024

Summary: Existing cross-modal hash retrieval methods can simultaneously enhance retrieval speed and reduce storage space. However, these methods face a major challenge in determining the similarity metric between two modalities. Specifically, the accuracy of intra-modal and inter-modal similarity measurements is inadequate, and the large gap between modalities leads to semantic bias. In this paper, we propose a Similarity Graph-correlation Reconstruction Network (SGRN) for unsupervised cross-modal hashing. In particular, a local relation graph rebasing module filters out graph nodes with weak similarity and associates graph nodes with strong similarity, producing fine-grained intra-modal similarity relation graphs. A global relation graph reconstruction module further strengthens cross-modal correlation and implements fine-grained similarity alignment between modalities. In addition, to bridge the modal gap, we combine the similarity representations of real-valued and hash features to design intra-modal and inter-modal training strategies. Extensive experiments on two cross-modal retrieval datasets validate the superiority of the proposed method, which significantly improves retrieval performance.
• We construct relation graphs for the image modality and the text modality separately.
• We rebase intra-modal relation graphs through similarity correlation.
• We combine the rebased graphs of the two modalities to obtain joint relation graphs.
• We reconstruct the joint relation graphs to obtain fine-grained similarity alignment.
• We design a combined intra-modal and inter-modal training strategy.
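The rebasing and joint-graph steps in the summary can be sketched in a few lines. This is an illustrative sketch only, not the paper's actual implementation: the thresholds `low`/`high`, the hard prune/saturate rule, and the mixing weight `alpha` are assumptions chosen for clarity.

```python
import numpy as np

def rebase_relation_graph(features: np.ndarray, low: float = 0.3, high: float = 0.7) -> np.ndarray:
    """Build an intra-modal relation graph from feature vectors and rebase it:
    prune weak-similarity edges and reinforce strong ones.
    `low` and `high` are illustrative thresholds, not values from the paper."""
    # Cosine similarity matrix serves as the initial relation graph.
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    graph = sim.copy()
    graph[sim < low] = 0.0    # filter out weakly similar node pairs
    graph[sim > high] = 1.0   # saturate strongly similar node pairs
    return graph

def joint_relation_graph(g_img: np.ndarray, g_txt: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Combine the rebased graphs of the two modalities into a joint relation
    graph via a convex combination (alpha is a hypothetical weight)."""
    return alpha * g_img + (1.0 - alpha) * g_txt
```

In this sketch, the joint graph would then be the target that both modalities' hash-code similarities are aligned against during training.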
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2023.121516