Deep Learning-Based Article-Level and Paragraph-Level Classification (딥러닝 기반 기사단위 및 문단 단위별 분류)

Bibliographic Details
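Published in 한국컴퓨터정보학회논문지 (Journal of the Korea Society of Computer and Information) Vol. 23; no. 11; pp. 31-41
Main Author 김유희 (Euhee Kim)
Format Journal Article
Language Korean
Published 한국컴퓨터정보학회 (Korea Society of Computer and Information) 01.11.2018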
Summary: Text classification has long been studied in natural language processing. In this paper, we propose an article- and paragraph-level genre classification system for large-scale English corpora, using Word2Vec-based LSTM, GRU, and CNN models. At both the article level and the paragraph level, LSTM achieved the highest accuracy, followed by GRU and then CNN. These results suggest that, when pre-trained Word2Vec word embeddings are used in both classification tasks, the word-sequence information available for articles contributes more to the performance of LSTM, GRU, and CNN than the word-feature extraction applied to paragraphs.
KCI Citation Count: 1
ISSN: 1598-849X, 2383-9945
DOI: 10.9708/jksci.2018.23.11.031