Self-augmented sequentiality-aware encoding for aspect term extraction

Bibliographic Details
Published in: Information Processing & Management, Vol. 61, no. 3, p. 103656
Main Authors: Xu, Qingting; Hong, Yu; Chen, Jiaxiang; Yao, Jianming; Zhou, Guodong
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.05.2024
Summary: Aspect Term Extraction (ATE) is a natural language processing task that recognizes and extracts aspect terms from sentences. Recent studies in this field successfully leverage Pretrained Language Models (PLMs) and data augmentation, which contributes to the construction of a knowledgeable encoder for ATE. In this paper, we propose a novel method to strengthen the ATE encoder, using a self-augmented mechanism and a sequentiality-aware network. Specifically, self-augmentation expands the training data with paraphrases and fuses the correlated information using a PLM during encoding, where back translation is used to obtain pragmatically diverse paraphrases. On this basis, a BiGRU is utilized as a supplementary neural layer for extra encoding, so as to incorporate sequence features into the output hidden states of the PLM. We refer to our method as Self-augmented Sequentiality-sensitive Encoding (SSE for short). We carry out experiments on the benchmark SemEval datasets, including L-14, R-14, R-15 and R-16. Experimental results show that SSE yields substantial improvements over BERT-based baselines. In particular, SSE is able to collaborate with other data augmentation approaches to produce even larger gains, with the resultant ATE performance reaching F1-scores of 86.74%, 88.91%, 77.43% and 82.42% (on L-14, R-14, R-15 and R-16, respectively).
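
The abstract describes an encoder in which a supplementary BiGRU re-encodes the PLM's output hidden states before token-level aspect tagging. The following is a minimal sketch of that kind of architecture in PyTorch, assuming BERT-base as the PLM, a 3-label BIO tagging scheme, and illustrative hyperparameters; it is not the authors' released implementation and omits the back-translation augmentation step.

```python
# Minimal sketch of a PLM + BiGRU token-tagging encoder for ATE.
# Assumptions (not taken from the paper): BERT-base as the PLM, a 3-label
# BIO scheme (O, B-ASP, I-ASP), and an illustrative GRU hidden size.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class SequentialityAwareATE(nn.Module):
    def __init__(self, plm_name="bert-base-uncased", num_labels=3, gru_hidden=256):
        super().__init__()
        self.plm = AutoModel.from_pretrained(plm_name)
        hidden = self.plm.config.hidden_size
        # Supplementary BiGRU layer re-encodes the PLM hidden states so that
        # sequence-order features are injected on top of the PLM output.
        self.bigru = nn.GRU(hidden, gru_hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * gru_hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        plm_states = self.plm(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        gru_states, _ = self.bigru(plm_states)
        return self.classifier(gru_states)  # per-token BIO logits

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = SequentialityAwareATE()
    batch = tokenizer(
        ["The battery life is great but the screen is dim."],
        return_tensors="pt",
    )
    logits = model(batch["input_ids"], batch["attention_mask"])
    print(logits.shape)  # (1, sequence_length, 3)
```

In this sketch the back-translated paraphrases would simply be added to the training set as extra tagged sentences; how the paper fuses the correlated information during encoding is not specified in the abstract and is left out here.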
ISSN: 0306-4573, 1873-5371
DOI: 10.1016/j.ipm.2024.103656