An R-Transformer_BiLSTM Model Based on Attention for Multi-label Text Classification

Bibliographic Details
Published in: Neural Processing Letters, Vol. 55, No. 2, pp. 1293–1316
Main Authors: Yan, Yaoyao; Liu, Fang’ai; Zhuang, Xuqiang; Ju, Jie
Format: Journal Article
Language: English
Published: New York: Springer US, 01.04.2023 (Springer Nature B.V.)
Summary: Multi-label text classification is one of the research hotspots in natural language processing. However, most existing multi-label text classification models are suitable only for scenarios with a small number of coarse-grained labels. To address the difficulty of capturing sequence information, and the pronounced loss of semantic information as text sequences grow longer, this paper proposes an R-Transformer_BiLSTM model based on label embedding and an attention mechanism for multi-label text classification. First, the R-Transformer model, combined with part-of-speech embedding, captures the global and local information of the text sequence. In parallel, BiLSTM+CRF extracts entity information from the text, a self-attention mechanism identifies the keywords of that entity information, and bidirectional attention together with label embedding then generates the text representation and the label representation. Finally, the classifier performs classification from the label representation and the text representation. To evaluate the model, we conducted extensive experiments on the RCV1-V2 and AAPD datasets. The results show that the model effectively improves both the efficiency and the accuracy of multi-label text classification.
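The summary describes scoring each label via attention between label embeddings and token representations, with an independent sigmoid per label. The paper's exact formulation is not given here; the following is a minimal numpy sketch of that label-embedding attention step under assumed shapes — `H`, `L`, `W`, and all dimensions are illustrative, not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def label_attention_classify(H, L, W):
    """Sketch of label-embedding attention for multi-label classification.

    H: (T, d) token representations (e.g. R-Transformer/BiLSTM outputs)
    L: (k, d) label embeddings, one row per label
    W: (d,)   shared scoring vector (illustrative, not from the paper)

    Each label attends over the tokens, pools a label-specific text
    vector, and is scored independently with a sigmoid.
    """
    A = softmax(L @ H.T, axis=-1)         # (k, T) label-to-token attention
    C = A @ H                             # (k, d) label-specific text vectors
    logits = C @ W                        # (k,)   one score per label
    return 1.0 / (1.0 + np.exp(-logits))  # independent sigmoid per label

rng = np.random.default_rng(0)
T, d, k = 6, 8, 4                         # tokens, hidden size, labels
probs = label_attention_classify(rng.normal(size=(T, d)),
                                 rng.normal(size=(k, d)),
                                 rng.normal(size=d))
preds = (probs > 0.5).astype(int)         # threshold for multi-label output
```

The per-label sigmoid (rather than a softmax over labels) is what makes the output multi-label: any subset of labels can be active at once.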
ISSN: 1370-4621, 1573-773X
DOI: 10.1007/s11063-022-10938-y