A visual attention-based keyword extraction for document classification

Bibliographic Details
Published in Multimedia Tools and Applications, Vol. 77, no. 19, pp. 25355-25367
Main Authors Wu, Xing; Du, Zhikang; Guo, Yike
Format Journal Article
Language English
Published New York Springer US 01.10.2018
Springer Nature B.V.
Summary: Document classification plays an important role in natural language processing, and keyword extraction shows great potential for summarizing an entire document. Attention is the process of selectively concentrating on a discrete aspect of information while ignoring other perceivable information. Inspired by the visual attention mechanism, a new probabilistic keyword extraction algorithm is proposed. An unsupervised, neural-network-based pre-training method is proposed for training the semantic-attention-based keyword extraction algorithm, which helps extract keywords with rich contextual information from the document. A bidirectional long short-term memory (LSTM) network combined with the proposed semantic keyword extraction algorithm is designed for both topic and sentiment classification tasks. Experiments on four large-scale datasets show that the proposed visual-attention-based keyword extraction algorithm performs better than the baseline methods. The semantic-attention-based keyword extraction method is effective at summarizing the content of a document, which is useful for large-scale document classification.
ISSN: 1380-7501, 1573-7721
DOI: 10.1007/s11042-018-5788-9
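
Note: The summary describes attention as selectively concentrating on some information while ignoring the rest, and calls the proposed keyword extraction probabilistic. This record does not give the paper's formulation, so the following is only a generic sketch of that idea and not the authors' method: each token gets a relevance score, a softmax turns the scores into a probability distribution, and the highest-weighted tokens are taken as keywords. The function names, the dot-product scoring, the query vector, and the hand-made embeddings are all assumptions introduced for illustration; in a trained model the embeddings and query would be learned.

```python
import math


def softmax(scores):
    """Turn raw relevance scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def extract_keywords(tokens, embeddings, query, top_k=3):
    """Rank tokens by attention weight against a query vector.

    All inputs are hypothetical: `embeddings` holds one vector per token
    and `query` stands in for a learned context vector.
    """
    # Dot product of each token embedding with the query (assumed scoring).
    scores = [sum(e * q for e, q in zip(emb, query)) for emb in embeddings]
    weights = softmax(scores)
    ranked = sorted(zip(tokens, weights), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Toy example with hand-made 2-dimensional embeddings (illustrative only).
tokens = ["the", "network", "classifies", "documents"]
embeddings = [[0.1, 0.0], [0.9, 0.8], [0.7, 0.6], [1.0, 0.9]]
query = [1.0, 1.0]
keywords = extract_keywords(tokens, embeddings, query, top_k=2)
```

In this toy setup the function word "the" receives the lowest weight, so it is ignored, while the two content words most aligned with the query are returned along with their attention weights.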