Self-augmented sequentiality-aware encoding for aspect term extraction
| Published in | Information Processing & Management, Vol. 61, No. 3, p. 103656 |
| --- | --- |
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published | Elsevier Ltd, 01.05.2024 |
Summary: Aspect Term Extraction (ATE) is a natural language processing task that recognizes and extracts aspect terms from sentences. Recent studies in this field successfully leverage Pretrained Language Models (PLMs) and data augmentation, which contribute to the construction of a knowledgeable encoder for ATE. In this paper, we propose a novel method to strengthen the ATE encoder using a self-augmentation mechanism and a sequentiality-aware network. Specifically, self-augmentation expands the training data with paraphrases and fuses the correlated information using a PLM during encoding. Back translation is used to obtain pragmatically diverse paraphrases. On this basis, a BiGRU is utilized as a supplementary neural layer for extra encoding, so as to involve sequence features in the output hidden states of the PLM. We refer to our method as Self-augmented Sequentiality-sensitive Encoding (SSE for short). We carry out experiments on the benchmark SemEval datasets, including L-14, R-14, R-15 and R-16. Experimental results show that SSE yields substantial improvements compared to BERT-based baselines. In particular, SSE is able to collaborate with other data augmentation approaches to produce further improvements, where the resultant ATE performance reaches F1-scores of up to 86.74%, 88.91%, 77.43% and 82.42% (for L-14, R-14, R-15 and R-16, respectively).
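The abstract describes two components: paraphrase generation via back translation for self-augmentation, and a BiGRU layer that re-encodes the PLM's output hidden states so that explicit sequence features are available for aspect-term tagging. The following is a minimal sketch of how such a pipeline could be assembled, not the authors' implementation; it assumes a BERT backbone, BIO-style token labels, and Helsinki-NLP MarianMT checkpoints for back translation, all of which are illustrative choices rather than details taken from the paper.

```python
# Illustrative sketch only: PLM + BiGRU token-tagging encoder and a toy
# back-translation routine for paraphrase-based data augmentation.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class SequentialityAwareEncoder(nn.Module):
    """BERT-style encoder followed by a BiGRU and a per-token classifier."""

    def __init__(self, plm_name="bert-base-uncased", hidden=256, num_labels=3):
        super().__init__()
        self.plm = AutoModel.from_pretrained(plm_name)
        d = self.plm.config.hidden_size
        # The bidirectional GRU re-encodes the PLM hidden states, injecting
        # left-to-right and right-to-left sequential information.
        self.bigru = nn.GRU(d, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_labels)  # B / I / O tags

    def forward(self, input_ids, attention_mask):
        h = self.plm(input_ids=input_ids,
                     attention_mask=attention_mask).last_hidden_state
        h, _ = self.bigru(h)          # supplementary sequential encoding
        return self.classifier(h)     # per-token logits


def back_translate(sentences, src="en", pivot="de"):
    """Toy back translation (en -> pivot -> en) to obtain paraphrases.

    Uses public MarianMT checkpoints purely for illustration; the paper does
    not prescribe these specific translation models.
    """
    from transformers import MarianMTModel, MarianTokenizer

    def translate(texts, model_name):
        tok = MarianTokenizer.from_pretrained(model_name)
        model = MarianMTModel.from_pretrained(model_name)
        batch = tok(texts, return_tensors="pt", padding=True, truncation=True)
        out = model.generate(**batch, max_new_tokens=128)
        return [tok.decode(t, skip_special_tokens=True) for t in out]

    pivoted = translate(sentences, f"Helsinki-NLP/opus-mt-{src}-{pivot}")
    return translate(pivoted, f"Helsinki-NLP/opus-mt-{pivot}-{src}")


if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = SequentialityAwareEncoder()
    enc = tokenizer(["The battery life is great but the screen is dim."],
                    return_tensors="pt")
    logits = model(enc["input_ids"], enc["attention_mask"])
    print(logits.shape)  # (batch, seq_len, num_labels)
    # Paraphrased copies of training sentences could be produced with
    # back_translate([...]) and added to the training set.
```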
ISSN: 0306-4573, 1873-5371
DOI: 10.1016/j.ipm.2024.103656