A visual attention-based keyword extraction for document classification
Published in | Multimedia tools and applications Vol. 77; no. 19; pp. 25355 - 25367 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | New York, Springer US, 01.10.2018; Springer Nature B.V |
Summary: | Document classification plays an important role in natural language processing, and keyword extraction shows great potential for summarizing an entire document. Attention is the process of selectively concentrating on a discrete aspect of information while ignoring other perceivable information. A new probabilistic keyword extraction algorithm, inspired by the visual attention mechanism, is proposed. An unsupervised, neural-network-based pre-training method is proposed for training the semantic attention-based keyword extraction algorithm, which helps extract keywords with rich contextual information from the document. A bidirectional long short-term memory (LSTM) network combined with the proposed semantic keyword extraction algorithm is designed for both topic and sentiment classification tasks. Experiments on four large-scale datasets show that the proposed visual attention-based keyword extraction algorithm outperforms the baseline methods. The semantic attention-based keyword extraction method is effective at summarizing the content of a document, which is very useful for large-scale document classification. |
---|---|
ISSN: | 1380-7501 1573-7721 |
DOI: | 10.1007/s11042-018-5788-9 |