BAFN: Bi-direction Attention based Fusion Network for Multimodal Sentiment Analysis

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 33, No. 4, p. 1
Main Authors: Tang, Jiajia; Liu, Dongjun; Jin, Xuanyu; Peng, Yong; Zhao, Qibin; Ding, Yu; Kong, Wanzeng
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.04.2023
Summary: Attention-based networks have recently demonstrated their effectiveness in multimodal sentiment analysis. However, existing methods ignore the redundancy of auxiliary modalities. More importantly, they attend only to top-down attention (a static process) or down-top attention (an implicit process), leading to a coarse-grained multimodal sentiment context. In this paper, we first propose a multimodal dynamic enhanced block, applied during preprocessing, to capture the intra-modality sentiment context; this effectively decreases the intra-modality redundancy of the auxiliary modalities. Furthermore, a bi-direction attention block is proposed to capture a fine-grained multimodal sentiment context via a novel bi-direction multimodal dynamic routing mechanism. Specifically, the bi-direction attention block first highlights the explicit, low-level multimodal sentiment context. The low-level multimodal context is then passed to a carefully designed bi-direction multimodal dynamic routing procedure, which dynamically updates and refines it into a high-level, much more fine-grained multimodal sentiment context. Experiments demonstrate that our fusion network achieves state-of-the-art performance. Notably, our model outperforms the best baseline on the 'Acc-7' metric by 6.9%.
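The record contains only the abstract, so no implementation details of BAFN are available here. As a rough illustration of the general bi-direction (cross-modal) attention idea the summary describes, the sketch below shows a minimal PyTorch-style block in which the text modality attends over an auxiliary modality and vice versa before fusion. The class name, dimensions, mean pooling, and the use of torch.nn.MultiheadAttention are assumptions for illustration only, not the authors' BAFN design (which additionally uses a dynamic routing procedure not sketched here).

    # Minimal sketch (not the authors' code): bi-direction cross-modal attention.
    import torch
    import torch.nn as nn

    class BiDirectionAttentionSketch(nn.Module):
        def __init__(self, dim: int = 128, heads: int = 4):
            super().__init__()
            # one attention module per direction
            self.text_to_aux = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.aux_to_text = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.fuse = nn.Linear(2 * dim, dim)  # merge both attention directions

        def forward(self, text: torch.Tensor, aux: torch.Tensor) -> torch.Tensor:
            # text: (batch, seq_t, dim); aux: (batch, seq_a, dim)
            t2a, _ = self.text_to_aux(query=text, key=aux, value=aux)   # text attends to auxiliary modality
            a2t, _ = self.aux_to_text(query=aux, key=text, value=text)  # auxiliary modality attends to text
            pooled = torch.cat([t2a.mean(dim=1), a2t.mean(dim=1)], dim=-1)
            return self.fuse(pooled)  # (batch, dim) fused sentiment context

    if __name__ == "__main__":
        block = BiDirectionAttentionSketch()
        text = torch.randn(2, 20, 128)   # e.g. word-level text features
        audio = torch.randn(2, 50, 128)  # e.g. frame-level audio features
        print(block(text, audio).shape)  # torch.Size([2, 128])

In this hypothetical setup, each direction produces its own view of the cross-modal context, and the two views are pooled and concatenated before a linear fusion layer; BAFN's routing mechanism would replace that simple fusion step.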
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2022.3218018