Image Retrieval Using Object Semantic Aggregation Histogram
Published in | Cognitive Computation, Vol. 15, no. 5, pp. 1736-1747 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | New York, Springer US, 01.09.2023; Springer Nature B.V |
Subjects | |
Summary: | Simulating primates’ ability to make fine visual discriminations when extracting visual features remains a challenge. To address this issue, a novel method, named the object semantic aggregation histogram, was proposed. Using this method, mid-level object features and high-level semantic features can be aggregated into a discriminative compact representation for image retrieval. The proposed method has three main highlights: (1) Two adaptive semantic kernels were proposed to bridge the object features and semantic features; they serve as a connecting link for detailed object representation. (2) Both object and semantic features were aggregated to provide a discriminative representation by depicting the target objects, semantic cognition, and spatial layouts. This approach is consistent with the coarse-to-fine nature of perception in the visual hierarchy. (3) A simple yet generic method was introduced to implement dimensionality reduction and whitening; it allows various regularizations to be used when selecting the optimal compact representation. Experiments on benchmark datasets confirmed that the proposed method can effectively improve retrieval performance in terms of mAP. With a 128-dimensional representation, the mAPs of the proposed method were significantly greater than those of the CroW, SBA, DSFH, and DTFH methods by 0.042, 0.035, 0.061, and 0.029 on the Oxford5k dataset and by 0.019, 0.017, 0.083, and 0.034 on the Holidays dataset. The proposed method is regarded as an effective and competitive way of aggregating features from multiple levels of the visual hierarchy, and it requires no complex handcrafting or training. |
---|---|
ISSN: | 1866-9956 1866-9964 |
DOI: | 10.1007/s12559-023-10143-6 |
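
Note: the summary mentions reducing the aggregated representation to 128 dimensions with whitening. The sketch below shows one common way such a step is done, standard PCA whitening followed by L2 normalization, and is not the authors' procedure. The function names, the `eps` regularizer, and the random input data are assumptions made for illustration; the record does not specify which regularizations the paper uses.

```python
import numpy as np

def fit_pca_whitening(X, dim, eps=1e-8):
    """Learn a PCA-whitening transform from rows of X (n_samples, n_features).

    Hypothetical helper: `eps` is a simple variance regularizer chosen here
    for illustration, not a setting taken from the paper.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data; rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    var = (s ** 2) / max(X.shape[0] - 1, 1)   # variance along each direction
    components = vt[:dim]                      # keep the top `dim` directions
    scale = 1.0 / np.sqrt(var[:dim] + eps)     # whitening: unit variance per axis
    return mean, components, scale

def apply_pca_whitening(X, mean, components, scale):
    """Project, whiten, and L2-normalize each row of X."""
    Z = (X - mean) @ components.T * scale
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z / np.maximum(norms, 1e-12)

# Usage with made-up data: 500 descriptors of 256 dimensions reduced to 128.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 256))
mean, components, scale = fit_pca_whitening(features, dim=128)
compact = apply_pca_whitening(features, mean, components, scale)
```

After this step, cosine similarity between two compact vectors reduces to a dot product, which is the usual way such representations are compared for retrieval.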