Multi-Modal Sarcasm Detection with Prompt-Tuning

Bibliographic Details
Published in: 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT), pp. 1-8
Main Authors: Ding, Daijun; Huang, Hu; Zhang, Bowen; Peng, Cheng; Li, Yangyang; Fu, Xianghua; Jing, Liwen
Format: Conference Proceeding
Language: English
Published: IEEE, 09.12.2022
DOI: 10.1109/ACAIT56212.2022.10137937

Summary: Sarcasm is a meaningful and effective form of expression that people often use to convey sentiments contrary to the literal meaning of their words. Such expressions are common on social media platforms. Compared with the traditional approach of text-only sarcasm detection, multi-modal sarcasm detection has proved more effective on social network content that combines several forms of communication. In this work, a prompt-tuning method is proposed for multi-modal sarcasm detection (Pmt-MmSD). Specifically, to model incongruity within the text modality, we first build a prompt-PLM network. Second, to model text-image incongruity, an inter-modality attention network (ImAN) is designed based on the self-attention mechanism. In addition, we use the pre-trained Vision Transformer (ViT) network to process the image modality. Extensive experiments demonstrate the effectiveness of the proposed Pmt-MmSD model for multi-modal sarcasm detection, which significantly outperforms state-of-the-art results.
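
The summary does not spell out how ImAN is built, so the following is only a minimal sketch of the general idea it names: attention across modalities, where each text token attends over image patch features. It assumes plain scaled dot-product attention with no learned projections, and the names `cross_attention`, `text_feats`, and `image_feats` are hypothetical, not taken from the paper. In a real system the text features would come from the PLM and the patch features from ViT; here they are tiny hand-written vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(text_feats, image_feats):
    """Scaled dot-product attention: text tokens are queries,
    image patches serve as both keys and values.

    Assumption: no learned query/key/value projections, which a
    trained model would normally include.
    """
    d = len(text_feats[0])
    attended, weights = [], []
    for q in text_feats:
        # Similarity of this text token to every image patch.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in image_feats
        ]
        w = softmax(scores)
        # Weighted sum of patch features for this token.
        out = [sum(wj * v[i] for wj, v in zip(w, image_feats)) for i in range(d)]
        weights.append(w)
        attended.append(out)
    return attended, weights


# Toy inputs: 2 text tokens and 3 image patches, feature size 2.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, weights = cross_attention(text_feats, image_feats)
```

Each row of `weights` is a distribution over image patches for one text token. A detector could compare the attended image representation with the token's own representation to look for text-image mismatch, although how Pmt-MmSD does this is not stated in this record.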