Japanese Short Text Classification Based on CNN-BiLSTM-Attention
Published in: Procedia Computer Science, Vol. 262, pp. 320–329
Main Authors:
Format: Journal Article
Language: English
Published: Elsevier B.V., 2025
Summary: Because short texts carry limited contextual information, traditional statistical feature-based methods struggle to model semantic relationships in Japanese short-text classification, which limits classification performance. To address this, the paper introduces a CNN-BiLSTM-Attention fusion model that extracts both local and global features from short texts to improve classification accuracy. First, Convolutional Neural Networks (CNNs) extract local n-gram features and identify phrase patterns. Then, a Bidirectional Long Short-Term Memory (BiLSTM) network models the global context of the text, capturing the influence of structures such as auxiliary words and honorifics. Finally, a self-attention mechanism assigns weights to individual words, so that the model focuses on the information most relevant to classification and reduces interference from grammatical vocabulary. In addition, Dropout regularization and a Softmax classification layer are introduced to strengthen the model's robustness and adaptability. Experimental results show that the CNN-BiLSTM-Attention model achieves the best performance across all sentence structures, with a higher overall Word Order Sensitivity Score (WOSS) than the other models. On the SVO (Subject-Verb-Object) structure, the model reaches 0.94, which is 20.5% higher than CNN's 0.78, indicating a more accurate understanding of standard-word-order sentences.
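The pipeline the summary describes ends with a self-attention layer that weights word-level features before a Softmax classifier makes the final prediction. Below is a minimal NumPy sketch of that pooling-and-classification step, assuming an additive-style attention over BiLSTM-like per-word outputs; the matrices `W_a`, `v`, `W_c` and all dimensions are illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool_classify(H, W_a, v, W_c, b_c):
    """Score each word, pool the sequence, then classify.

    H   : (T, d)  per-word features (e.g. BiLSTM outputs)
    W_a : (d, a)  attention projection (illustrative)
    v   : (a,)    attention scoring vector (illustrative)
    W_c : (d, n_classes), b_c : (n_classes,)  classifier weights
    """
    scores = np.tanh(H @ W_a) @ v      # (T,)  one relevance score per word
    alpha = softmax(scores)            # attention weights, sum to 1
    context = alpha @ H                # (d,)  attention-weighted sentence vector
    probs = softmax(context @ W_c + b_c)
    return probs, alpha

# Toy run with random features standing in for BiLSTM outputs.
rng = np.random.default_rng(0)
T, d, a, n_classes = 6, 8, 4, 3
H = rng.normal(size=(T, d))
probs, alpha = attention_pool_classify(
    H,
    rng.normal(size=(d, a)), rng.normal(size=a),
    rng.normal(size=(d, n_classes)), np.zeros(n_classes))
```

In this scheme, `alpha` is the per-word weight vector the summary refers to: words with low scores (e.g. purely grammatical vocabulary, if the model learns to down-weight them) contribute less to the pooled sentence representation fed to the classifier.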
ISSN: 1877-0509
DOI: 10.1016/j.procs.2025.05.059