Text Augmentation Based on Integrated Gradients Attribute Score for Aspect-based Sentiment Analysis

Bibliographic Details
Published in: 2023 IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 227-234
Main Authors: Santoso, Noviyanti; Mendonca, Israel; Aritsugi, Masayoshi
Format: Conference Proceeding
Language: English
Published: IEEE, 01.02.2023

Summary: One of the factors that determine the effectiveness of deep learning models for sentiment analysis is the availability of high-quality training data. Data augmentation is a strategy to increase the amount of training data by applying semantically specified adjustments to existing training data. Such techniques have been widely adopted in computer vision tasks; however, they have not been adequately addressed for aspect-based sentiment analysis (ABSA) tasks. ABSA is a text analysis method that identifies the aspects mentioned in sentences as well as their polarity. In this paper, we investigate the effect of data augmentation on the hybrid approach for aspect-based sentiment analysis (HAABSA) model. We propose an extension of easy data augmentation (EDA) that combines part-of-speech tagging, word sense disambiguation, and feature importance selection. We apply our proposed technique to the SemEval 2015 and SemEval 2016 datasets and compare it to existing approaches. Experimental results demonstrate that, compared to a model trained without data augmentation, our method improves accuracy by between 0.6 and 3.4 percentage points. Furthermore, we show that an augmentation method that makes an informed selection is more effective than randomized ones. Moreover, we show that the combination of our techniques improves the accuracy and the quality of the generated sentences.
ISSN: 2375-9356
DOI: 10.1109/BigComp57234.2023.00044
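
The summary contrasts informed selection with randomized augmentation, and the title names integrated gradients as the attribution score. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a toy logistic model over token-presence features, hand-picked weights (`toy_weights`), a hand-written synonym table (`toy_synonyms`), and a policy of replacing the least important tokens; all of these are hypothetical, and the record does not state which tokens the paper selects or how synonyms are chosen.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def integrated_gradients(weights, bias, x, baseline=None, steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients
    for the toy model f(x) = sigmoid(w . x + b)."""
    if baseline is None:
        baseline = [0.0] * len(x)
    diffs = [xi - bi for xi, bi in zip(x, baseline)]
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight path from the baseline to the input.
        point = [bi + alpha * d for bi, d in zip(baseline, diffs)]
        s = sigmoid(sum(w * p for w, p in zip(weights, point)) + bias)
        grad_scale = s * (1.0 - s)  # derivative of sigmoid at this point
        for i, w in enumerate(weights):
            totals[i] += grad_scale * w
    # Attribution = (input - baseline) * average gradient along the path.
    return [d * t / steps for d, t in zip(diffs, totals)]


def augment(tokens, attributions, synonyms, n_replace=1):
    """Replace the n_replace lowest-|attribution| tokens that have a synonym.
    Assumption: low-importance tokens are the ones to replace."""
    order = sorted(range(len(tokens)), key=lambda i: abs(attributions[i]))
    out = list(tokens)
    replaced = 0
    for i in order:
        if replaced == n_replace:
            break
        if tokens[i] in synonyms:
            out[i] = synonyms[tokens[i]]
            replaced += 1
    return out


# Hypothetical toy data: one weight per token, 1.0 means "token present".
tokens = ["the", "pizza", "was", "delicious"]
toy_weights = [0.0, 0.3, 0.0, 2.0]
x = [1.0, 1.0, 1.0, 1.0]
toy_synonyms = {"pizza": "pie", "delicious": "tasty"}

attr = integrated_gradients(toy_weights, 0.0, x)
augmented = augment(tokens, attr, toy_synonyms)
print(augmented)
```

In this toy setup the sentiment-bearing token receives the largest attribution and is left untouched, while a lower-scoring token is swapped. A randomized baseline in the style of EDA would instead pick the replacement position uniformly at random.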