MF-BERT: Multimodal Fusion in Pre-Trained BERT for Sentiment Analysis

Bibliographic Details
Published in: IEEE Signal Processing Letters, Vol. 29, pp. 454–458
Main Authors: He, Jiaxuan; Hu, Haifeng
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2022

More Information
Summary: Multimodal sentiment analysis mainly concentrates on language, acoustic, and visual information. Previous work based on BERT uses only the text (language) representation to fine-tune BERT, ignoring the importance of nonverbal information. Because features extracted from a single modality may contain uncertainty, it is challenging for BERT to perform well in real-world applications. In this paper, we propose a multimodal fusion BERT that can explore the time-dependent interactions among different modalities. Additionally, prior BERT-based methods tend to train the model with a single optimizer updating all parameters. However, we argue that because BERT has been pre-trained on large corpora, it needs only slight fine-tuning. Therefore, an internal updating mechanism is introduced to avoid overfitting during training. We set two optimizers with different learning rates, one for the multimodal fusion BERT and one for the other components of the model, which enables the model to attain better parameters. The results of experiments on public datasets demonstrate that our model is superior to the baselines and achieves state-of-the-art performance.
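The dual-optimizer scheme described in the abstract is concrete enough to sketch. Below is a minimal PyTorch illustration of the idea: a pre-trained BERT updated with a small learning rate alongside a newly initialized fusion head updated with a larger one. The FusionHead module, the cross-modal attention design, the feature dimensions (74-d acoustic, 35-d visual, as in common CMU-MOSI feature sets), and the learning rates are all illustrative assumptions, not the paper's actual MF-BERT architecture or hyperparameters.

import torch
import torch.nn as nn
from transformers import BertModel


class FusionHead(nn.Module):
    """Hypothetical fusion layer: projects the acoustic and visual
    streams into the text embedding space and lets each BERT token
    attend over the nonverbal sequence via cross-modal attention."""

    def __init__(self, text_dim=768, acoustic_dim=74, visual_dim=35):
        super().__init__()
        self.acoustic_proj = nn.Linear(acoustic_dim, text_dim)
        self.visual_proj = nn.Linear(visual_dim, text_dim)
        self.cross_attn = nn.MultiheadAttention(text_dim, num_heads=8,
                                                batch_first=True)
        self.regressor = nn.Linear(text_dim, 1)  # sentiment score

    def forward(self, text_states, acoustic, visual):
        # Concatenate the projected nonverbal sequences along time and
        # use them as keys/values for the text-token queries.
        nonverbal = torch.cat([self.acoustic_proj(acoustic),
                               self.visual_proj(visual)], dim=1)
        fused, _ = self.cross_attn(text_states, nonverbal, nonverbal)
        return self.regressor(fused.mean(dim=1))


bert = BertModel.from_pretrained("bert-base-uncased")
fusion = FusionHead()

# Two optimizers with different learning rates, in the spirit of the
# paper's internal updating mechanism: the pre-trained BERT weights
# are nudged gently, while the randomly initialized fusion layers are
# trained faster. (Learning rates here are illustrative only.)
bert_opt = torch.optim.AdamW(bert.parameters(), lr=1e-5)
fusion_opt = torch.optim.AdamW(fusion.parameters(), lr=1e-3)

In a training step, one would compute the loss on the fused prediction, call loss.backward() once, and then step both optimizers, so each parameter group is updated at its own rate.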
ISSN: 1070-9908; 1558-2361
DOI: 10.1109/LSP.2021.3139856