Affective image recognition with multi-attribute knowledge in deep neural networks

Bibliographic Details
Published in: Multimedia Tools and Applications, Vol. 83, no. 6, pp. 18353–18379
Main Authors: Zhang, Hao; Luo, Gaifang; Yue, Yingying; He, Kangjian; Xu, Dan
Format: Journal Article
Language: English
Published: New York, Springer US (Springer Nature B.V.), 01.02.2024
Summary: Incorporating visual attributes such as objects and scene features into deep models has proven valuable for affective image recognition. In general, existing works achieve this either by fine-tuning popular CNNs for emotion recognition or by connecting external attributes through additional well-designed modules. However, they neither account for the diversity of emotional representations across different styles of affective images nor exploit the inter-hierarchical correlations in deep models. In this paper, we propose a multi-attribute model that incorporates different visual concepts to address this problem. The model consists of two branch modules spanning a local-to-global view: one trains a gram encoder to capture local visual details, while the other simultaneously trains a semantic tokenizer to extract global semantics. Through a fusion layer, we represent image sentiments with the aggregated attributes. Unlike existing methods, our model is composed of stacked CNNs without additional backbones, and it learns hierarchical attributes from internal intermediate features. Furthermore, inspired by deep metric learning, we design an emotional contrast loss that accounts for the dynamic polarity embedded in affective images, and we optimize the model with a cross-entropy loss as well. A comprehensive evaluation on five datasets shows that our model outperforms existing methods.
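
To make the summarized design concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: a Gram-matrix encoder for local detail statistics, a simple semantic tokenizer for global features, a fusion layer over the two branches, and a joint objective combining cross-entropy with a contrastive-style emotional-polarity term. All names (GramEncoder, SemanticTokenizer), dimensions, and loss weights here are illustrative assumptions, not details taken from the paper.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GramEncoder(nn.Module):
        """Local branch (assumed form): Gram matrices of conv features capture style/detail statistics."""
        def __init__(self, in_ch=3, feat_ch=64, out_dim=128):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
            )
            self.proj = nn.Linear(feat_ch * feat_ch, out_dim)

        def forward(self, x):
            f = self.conv(x)                                   # (B, C, H, W)
            b, c, h, w = f.shape
            f = f.view(b, c, h * w)
            gram = torch.bmm(f, f.transpose(1, 2)) / (h * w)   # (B, C, C) Gram matrix
            return self.proj(gram.flatten(1))

    class SemanticTokenizer(nn.Module):
        """Global branch (assumed form): attention-pool conv features into a few 'semantic tokens'."""
        def __init__(self, in_ch=3, feat_ch=64, n_tokens=4, out_dim=128):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(in_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
            )
            self.attn = nn.Conv2d(feat_ch, n_tokens, 1)        # one attention map per token
            self.proj = nn.Linear(n_tokens * feat_ch, out_dim)

        def forward(self, x):
            f = self.conv(x)                                      # (B, C, H, W)
            a = self.attn(f).flatten(2).softmax(-1)               # (B, T, HW)
            tokens = torch.bmm(a, f.flatten(2).transpose(1, 2))   # (B, T, C)
            return self.proj(tokens.flatten(1))

    class TwoBranchAffectNet(nn.Module):
        def __init__(self, n_classes=8, dim=128):
            super().__init__()
            self.local_branch = GramEncoder(out_dim=dim)
            self.global_branch = SemanticTokenizer(out_dim=dim)
            self.fusion = nn.Linear(2 * dim, dim)              # fusion layer over both branches
            self.head = nn.Linear(dim, n_classes)

        def forward(self, x):
            z = self.fusion(torch.cat([self.local_branch(x), self.global_branch(x)], dim=1))
            return z, self.head(z)

    def emotional_contrast_loss(z, polarity, margin=1.0):
        """Contrastive-style polarity term: pull same-polarity embeddings together,
        push opposite-polarity embeddings at least `margin` apart."""
        d = torch.cdist(z, z)                                  # pairwise embedding distances
        same = (polarity.unsqueeze(0) == polarity.unsqueeze(1)).float()
        loss = same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)
        off_diag = 1 - torch.eye(len(z), device=z.device)      # ignore self-pairs
        return (loss * off_diag).sum() / off_diag.sum()

    # Joint objective, as the abstract suggests (the 0.1 weight is an assumption):
    model = TwoBranchAffectNet()
    x = torch.randn(8, 3, 64, 64)
    labels = torch.randint(0, 8, (8,))
    polarity = torch.randint(0, 2, (8,))                       # 0 = negative, 1 = positive
    z, logits = model(x)
    loss = F.cross_entropy(logits, labels) + 0.1 * emotional_contrast_loss(z, polarity)
    loss.backward()

The contrast term follows the classic contrastive-loss recipe from deep metric learning, which is one plausible reading of the "dynamic polarity" objective; the paper's actual formulation may differ.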
ISSN: 1380-7501, 1573-7721
DOI: 10.1007/s11042-023-16081-7