Image retrieval with geometry-preserving visual phrases

Bibliographic Details
Published in CVPR 2011, pp. 809-816
Main Authors Yimeng Zhang, Zhaoyin Jia, Tsuhan Chen
Format Conference Proceeding
Language English
Published IEEE 01.06.2011

Summary: The most popular approach to large-scale image retrieval is based on the bag-of-visual-word (BoV) representation of images. The spatial information is usually re-introduced as a post-processing step to re-rank the retrieved images, through a spatial verification step such as RANSAC. Since spatial verification techniques are computationally expensive, they can be applied only to the top images in the initial ranking. In this paper, we propose an approach that encodes more spatial information into the BoV representation and that is efficient enough to be applied to large-scale databases. Other works pursuing the same goal have proposed exploring word co-occurrences in neighborhood areas. Our approach encodes more spatial information through geometry-preserving visual phrases (GVP). In addition to co-occurrences, the GVP method also captures the local and long-range spatial layouts of the words. Our GVP-based search algorithm adds little memory usage or computation time compared to the BoV method. Moreover, we show that our approach can also be integrated into the min-hash method to improve its retrieval accuracy. Experimental results on the Oxford 5K and Flickr 1M datasets show that our approach outperforms the BoV method, even when the latter is followed by RANSAC verification.
ISBN: 1457703947, 9781457703942
ISSN: 1063-6919
DOI: 10.1109/CVPR.2011.5995528