A Multimodal Sentiment Analysis Model for Graphic Texts Based on Deep Feature Interaction Networks


Bibliographic Details
Published in: International Journal of Ambient Computing and Intelligence, Vol. 15, No. 1, pp. 1-19
Main Authors: Chang, Wanjun; Zhang, Dongfang
Format: Journal Article
Language: English
Published: Hershey: IGI Global, 2024
ISSN: 1941-6237, 1941-6245
DOI: 10.4018/IJACI.355192

Summary: Due to the widespread adoption of social networks, image-text comments have become a prevalent mode of emotional expression compared with traditional text-only descriptions. Two major challenges remain: how to extract rich representations effectively from both text and images, and how to extract cross-modal shared emotion features. This study proposes a multimodal sentiment analysis method based on a deep feature interaction network (DFINet). It leverages word-to-word graphs and a deep attention interaction network (DAIN) to learn text representations from multiple subspaces, and it introduces a cross-modal attention interaction network to extract cross-modal shared emotion features efficiently. This approach alleviates the difficulties of acquiring image-text features and representing cross-modal shared emotion features. Experimental results on the Yelp dataset demonstrate the effectiveness of DFINet.
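The record does not include the authors' implementation, so the following is only a minimal sketch of what a cross-modal attention interaction between text and image features could look like, written in PyTorch. All names, dimensions, and the three-class output head are hypothetical illustrations, not details taken from the DFINet paper: text and image features are projected into a shared space, each modality attends to the other with multi-head attention, and the two attended views are fused into a shared emotion representation.

```python
# Hypothetical sketch of a cross-modal attention interaction module.
# Dimensions and layer choices are assumptions for illustration only.
import torch
import torch.nn as nn


class CrossModalAttentionInteraction(nn.Module):
    def __init__(self, text_dim=768, image_dim=2048, shared_dim=256, num_heads=4):
        super().__init__()
        # Project each modality into a shared subspace.
        self.text_proj = nn.Linear(text_dim, shared_dim)
        self.image_proj = nn.Linear(image_dim, shared_dim)
        # Text queries attend to image regions, and vice versa.
        self.text_to_image = nn.MultiheadAttention(shared_dim, num_heads, batch_first=True)
        self.image_to_text = nn.MultiheadAttention(shared_dim, num_heads, batch_first=True)
        # Fuse the two attended views into a shared emotion feature.
        self.fusion = nn.Sequential(
            nn.Linear(2 * shared_dim, shared_dim),
            nn.ReLU(),
        )
        self.classifier = nn.Linear(shared_dim, 3)  # e.g. negative / neutral / positive

    def forward(self, text_feats, image_feats):
        # text_feats:  (batch, num_tokens,  text_dim)
        # image_feats: (batch, num_regions, image_dim)
        t = self.text_proj(text_feats)
        v = self.image_proj(image_feats)
        # Cross-attention in both directions.
        t2v, _ = self.text_to_image(query=t, key=v, value=v)
        v2t, _ = self.image_to_text(query=v, key=t, value=t)
        # Pool each attended sequence and concatenate for fusion.
        shared = self.fusion(torch.cat([t2v.mean(dim=1), v2t.mean(dim=1)], dim=-1))
        return self.classifier(shared)


if __name__ == "__main__":
    model = CrossModalAttentionInteraction()
    text = torch.randn(2, 50, 768)    # e.g. token features from a text encoder
    image = torch.randn(2, 36, 2048)  # e.g. region features from a CNN detector
    print(model(text, image).shape)   # torch.Size([2, 3])
```

In this sketch the bidirectional cross-attention plays the role the abstract attributes to the cross-modal attention interaction network: each modality's representation is conditioned on the other before fusion, which is one common way to obtain shared emotion features across modalities.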