A Word-Concept Heterogeneous Graph Convolutional Network for Short Text Classification

Bibliographic Details
Published in	Neural Processing Letters, Vol. 55, no. 1, pp. 735-750
Main Authors	Yang, Shigang; Liu, Yongguo; Zhang, Yun; Zhu, Jiajing
Format	Journal Article
Language	English
Published	New York: Springer US, 01.02.2023; Springer Nature B.V.
Subjects	Artificial Intelligence; Artificial neural networks; Classification; Complex Systems; Computational Intelligence; Computer Science; Convolution; Deep learning; Knowledge; Natural language processing; Neural networks; Noise; Representations; Semantics; Sparsity; Support vector machines; Text categorization; Texts; Words (language); Concepts; Short text classification; Graph convolution network; Words
Summary:	Text classification is an important task in natural language processing. However, most existing models focus on long texts, and their performance on short texts is unsatisfactory because of data sparsity. To address this problem, recent studies have introduced the concepts of words to enrich the representation of short texts. However, these methods ignore the interactive information between words and concepts, so the introduced concepts can act as noise that is unsuitable for semantic understanding. In this paper, we propose a new model, the word-concept heterogeneous graph convolution network (WC-HGCN), which introduces interactive information between words and concepts for short text classification. WC-HGCN combines words with their relevant concepts and adopts graph convolution networks to learn representations that carry this interactive information. Furthermore, we design an innovative learning strategy that makes full use of the introduced concept information. Experimental results on seven real short text datasets show that our model outperforms the latest baseline methods.
ISSN:	1370-4621, 1573-773X
DOI:	10.1007/s11063-022-10906-6
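
The summary mentions applying graph convolution to words and their concepts. As a rough illustration of that general idea only, the sketch below runs one generic graph convolution step (symmetric normalization, linear map, ReLU) over a toy word-concept graph. It is not the WC-HGCN model from the article: the example words, concepts, edges, feature choice, and the function names `normalize_adjacency` and `gcn_layer` are all assumptions made for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, a common GCN normalization."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # node degrees (all >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, features, weights):
    """One graph convolution step: ReLU(A_norm @ X @ W)."""
    return np.maximum(a_norm @ features @ weights, 0.0)


# Hypothetical toy data: 3 word nodes and 2 concept nodes.
words = ["apple", "iphone", "banana"]
concepts = ["company", "fruit"]

# Assumed word-concept links (rows: words, columns: concepts).
word_concept = np.array([[1, 1],
                         [1, 0],
                         [0, 1]], dtype=float)

# Build one adjacency matrix over both node types (words first).
n_w, n_c = word_concept.shape
n = n_w + n_c
adj = np.zeros((n, n))
adj[:n_w, n_w:] = word_concept
adj[n_w:, :n_w] = word_concept.T

rng = np.random.default_rng(0)
features = np.eye(n)                 # one-hot node features (assumption)
weights = rng.normal(size=(n, 4))    # random, untrained weights

a_norm = normalize_adjacency(adj)
hidden = gcn_layer(a_norm, features, weights)
print(hidden.shape)  # (5, 4)
```

After this step, each word row of `hidden` mixes in features from its linked concepts and vice versa, which is the kind of word-concept interaction the summary refers to. A real classifier would stack such layers, train the weights, and pool node representations per text.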