Attribute-Aware Deep Hashing With Self-Consistency for Large-Scale Fine-Grained Image Retrieval

Bibliographic Details
Published in	IEEE transactions on pattern analysis and machine intelligence Vol. PP; no. 11; pp. 1 - 16
Main Authors	Wei, Xiu-Shen, Shen, Yang, Sun, Xuhao, Wang, Peng, Peng, Yuxin
Format	Journal Article
Language	English
Published	United States IEEE 01.11.2023 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Annotations; Atmospheric modeling; Attribute-Aware; Bias; Coders; Codes; Consistency; Datasets; Decorrelation; Encoders-Decoders; Image enhancement; Image reconstruction; Image retrieval; large-scale fine-grained image retrieval; learning-to-hash; Representations; self-consistency; simplicity bias; Task analysis; Visualization
Summary:	Our work treats large-scale fine-grained image retrieval as a ranking problem: images depicting the concept of interest (i.e., those sharing the query's sub-category label) should be ranked highest, based on the fine-grained details in the query. Such a practical task poses two challenges: the fine-grained nature of the data, with small inter-class variations and large intra-class variations, and the explosive growth of fine-grained data. In this paper, we propose attribute-aware hashing networks with self-consistency, which generate attribute-aware hash codes that not only make retrieval efficient but also establish explicit correspondences between hash codes and visual attributes. Specifically, starting from visual representations captured by attention, we develop an encoder-decoder network trained on a reconstruction task to distill high-level attribute-specific vectors from the appearance-specific visual representations in an unsupervised manner, without attribute annotations. A feature decorrelation constraint on these attribute vectors strengthens their representative ability. The hash codes are then generated from the attribute-specific vectors under an objective that preserves the similarity of the original entities, and thus become attribute-aware. Furthermore, to combat simplicity bias in deep hashing, we approach the model design from the perspective of the self-consistency principle and further enhance the models' self-consistency by adding an image reconstruction path. Comprehensive quantitative experiments under diverse empirical settings on six fine-grained retrieval datasets and two generic retrieval datasets show the superiority of our models over competing methods. Moreover, qualitative results demonstrate that the obtained hash codes correspond strongly to certain crucial properties of fine-grained objects, and that our self-consistency designs effectively overcome simplicity bias in fine-grained hashing.
ISSN:	0162-8828, 1939-3539, 2160-9292
DOI:	10.1109/TPAMI.2023.3299563
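
Illustrative sketch: the summary describes hash codes used for efficient retrieval and a feature decorrelation constraint on attribute vectors, but gives no formulas. The Python below is a minimal sketch of those two ideas under stated assumptions and is not the authors' implementation. It assumes sign binarization for producing codes, Hamming distance for ranking, and a squared-cosine penalty as one possible decorrelation measure. All function names (`sign_hash`, `hamming_distance`, `rank_by_hamming`, `decorrelation_penalty`) are hypothetical.

```python
import math


def sign_hash(vector):
    """Binarize a real-valued vector into a {-1, +1} code (assumed scheme)."""
    return [1 if v >= 0 else -1 for v in vector]


def hamming_distance(code_a, code_b):
    """Count the positions at which two equal-length codes differ."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b)


def rank_by_hamming(query_code, database_codes):
    """Return database indices ordered from nearest to farthest code."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


def decorrelation_penalty(attribute_vectors):
    """Sum of squared cosine similarities over distinct vector pairs.

    This is an assumed stand-in for a decorrelation constraint: it is zero
    when all vectors are mutually orthogonal and grows as they align.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0  # guard zero vectors
        return [x / norm for x in v]

    unit = [normalize(v) for v in attribute_vectors]
    penalty = 0.0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            cosine = sum(a * b for a, b in zip(unit[i], unit[j]))
            penalty += cosine * cosine
    return penalty


if __name__ == "__main__":
    # Toy real-valued outputs standing in for attribute-derived features.
    database = [sign_hash(v) for v in ([-0.4, -1.0, -0.2], [0.9, 0.1, -0.3], [0.5, 0.7, 0.2])]
    query = sign_hash([0.8, 0.3, 0.6])
    print(rank_by_hamming(query, database))
    print(decorrelation_penalty([[1.0, 0.0], [0.0, 1.0]]))
```

Ranking by Hamming distance is what makes binary codes attractive at scale: comparing two codes reduces to counting differing bits, which is far cheaper than comparing real-valued feature vectors.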