Average Approximate Hashing-Based Double Projections Learning for Cross-Modal Retrieval

Bibliographic Details
Published in: IEEE Transactions on Cybernetics, Vol. 52, no. 11, pp. 11780-11793
Main Authors: Fang, Xiaozhao; Jiang, Kaihang; Han, Na; Teng, Shaohua; Zhou, Guoxu; Xie, Shengli
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.11.2022
Summary: Cross-modal retrieval has attracted considerable attention for searching large-scale multimedia databases because of its efficiency and effectiveness. Matrix factorization, a powerful data-analysis tool, is commonly used to learn hash codes for cross-modal retrieval, but existing methods have several shortcomings. First, most of them focus only on preserving the locality of the data and ignore other factors, such as preserving the reconstruction residual of the data during matrix factorization. Second, they do not consider the energy loss incurred when cross-modal data are projected into a common semantic space. Third, they project cross-modal data directly into a unified semantic space, which is unreasonable because data from different modalities have different properties. This article proposes a novel method called average approximate hashing (AAH) that addresses these problems by: 1) integrating locality and residual preservation into a graph embedding framework using label information; 2) projecting data from different modalities into different semantic spaces and then making the two spaces approximate each other so that a unified hash code can be obtained; and 3) introducing a principal component analysis (PCA)-like projection matrix into the graph embedding framework to guarantee that the projected data preserve the main energy of the data. AAH obtains the final hash codes with an average approximate strategy, that is, by using the mean of the projected data of the different modalities as the hash codes. Experiments on standard databases show that the proposed AAH outperforms several state-of-the-art cross-modal hashing methods.
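
The averaging step described in the summary can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`pca_projection`, `average_hash_codes`), the toy image and text feature matrices, and the sign-based binarization are assumptions made for illustration. The projections here are plain per-modality PCA, standing in for the PCA-like matrices that AAH learns jointly with label information, so the two projected spaces in this sketch are not actually aligned.

```python
import numpy as np

def pca_projection(X, n_bits):
    """Hypothetical helper: top n_bits principal directions of X (n x d) as a d x n_bits matrix."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:n_bits].T

def average_hash_codes(X_img, X_txt, P_img, P_txt):
    """Project each modality separately, average the projections, then binarize."""
    Z_img = (X_img - X_img.mean(axis=0)) @ P_img
    Z_txt = (X_txt - X_txt.mean(axis=0)) @ P_txt
    Z = 0.5 * (Z_img + Z_txt)  # mean of the projected data of the two modalities
    # Assumption: codes are obtained by sign thresholding, a common convention in hashing.
    return np.where(Z >= 0, 1, -1).astype(np.int8)

# Toy paired data: 6 samples, 10-dim "image" features and 8-dim "text" features.
rng = np.random.default_rng(0)
X_img = rng.normal(size=(6, 10))
X_txt = rng.normal(size=(6, 8))
n_bits = 4

P_img = pca_projection(X_img, n_bits)
P_txt = pca_projection(X_txt, n_bits)
B = average_hash_codes(X_img, X_txt, P_img, P_txt)
print(B.shape)
```

Each row of `B` is one unified code shared by the paired image and text sample. In the method as summarized, the two projections would additionally be trained so that the projected spaces approximate each other before averaging.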
ISSN: 2168-2267, 2168-2275
DOI: 10.1109/TCYB.2021.3081615