A visual attention-based keyword extraction for document classification

Bibliographic Details
Published in	Multimedia Tools and Applications, Vol. 77, no. 19, pp. 25355-25367
Main Authors	Wu, Xing; Du, Zhikang; Guo, Yike
Format	Journal Article
Language	English
Published	New York: Springer US, 01.10.2018; Springer Nature B.V.
Subjects	Algorithms; Classification; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Information retrieval; Keywords; Multimedia Information Systems; Natural language processing; Neural networks; Semantics; Special Purpose and Application-Based Systems; Training; Visual task performance; Long short-term memory; Keyword extraction; Semantic context; Visual attention; Document classification

More Information
Summary:	Document classification plays an important role in natural language processing, and keyword extraction shows great potential for summarizing an entire document. Attention is the process of selectively concentrating on a discrete aspect of information while ignoring other perceivable information. Inspired by the visual attention mechanism, a new probabilistic keyword extraction algorithm is proposed. An unsupervised, neural-network-based pre-training method is proposed for training the semantic attention-based keyword extraction algorithm, which helps extract keywords with rich contextual information from the document. A bidirectional long short-term memory (LSTM) network combined with the proposed semantic keyword extraction algorithm is designed for both topic and sentiment classification tasks. Experiments on four large-scale datasets show that the proposed visual attention-based keyword extraction algorithm outperforms the baseline methods. The semantic attention-based keyword extraction method is effective at summarizing the content of a document, which is very useful for large-scale document classification.
ISSN:	1380-7501 1573-7721
DOI:	10.1007/s11042-018-5788-9
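
Illustration:	The summary describes attention as concentrating on some tokens while ignoring others, and a probabilistic way of picking keywords. The sketch below shows only that general idea and is not the authors' algorithm: it assumes each token already has a relevance score (the hypothetical `relevance` dictionary stands in for whatever a trained network would produce), converts the scores to attention weights with a softmax, and returns the tokens that receive the most attention mass. The function names `softmax` and `extract_keywords` are made up for this example.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Turn raw relevance scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def extract_keywords(
    tokens: List[str],
    relevance: Dict[str, float],  # assumption: stand-in for learned scores
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k distinct tokens with the highest attention weight."""
    if not tokens:
        return []
    # Tokens without a known score get a neutral score of 0.0.
    raw = [relevance.get(tok, 0.0) for tok in tokens]
    weights = softmax(raw)
    # A token that occurs several times accumulates the weight of each occurrence.
    mass: Dict[str, float] = {}
    for tok, w in zip(tokens, weights):
        mass[tok] = mass.get(tok, 0.0) + w
    ranked = sorted(mass.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    doc = ["the", "movie", "was", "surprisingly", "good"]
    scores = {"movie": 2.0, "good": 3.0, "surprisingly": 1.0}
    for word, weight in extract_keywords(doc, scores, k=2):
        print(f"{word}\t{weight:.3f}")
```

In a classifier of the kind the summary mentions, the selected keywords (or the weights themselves) would feed a downstream model such as a bidirectional LSTM; that step is omitted here.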