Indian News Headlines Classification using Word Embedding Techniques and LSTM Model
Published in | Procedia computer science Vol. 218; pp. 899 - 907 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V., 2023 |
Subjects | |
Summary: | Newspapers introduce us to the latest happenings around the world. Going paperless creates more opportunities for newspapers, such as broadcasting news coverage and presenting breaking news conveniently. News headlines fall under the short-text category and are an active subject of research. Creating a dense vector from short texts has become a challenging and essential task in many applications, such as recommender systems, context analysis, decision making, and text classification. This work not only creates a classification model for short text but also categorizes the headlines labeled with the ‘unknown’ category. Our work uses Bidirectional Encoder Representations from Transformers (BERT), the cosine similarity index, word embedding, and a Long Short-Term Memory (LSTM) network to classify news headlines into multiple categories. Our proposed method performs well at labeling the unlabeled data with the help of a BERT sentence encoder. The system uses the LSTM to learn from the headlines as input vectors and to classify the headline text. At the end of this experiment, the designed pipeline achieves remarkable precision at the class level. |
---|---|
ISSN: | 1877-0509 |
DOI: | 10.1016/j.procs.2023.01.070 |
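
The summary mentions using sentence embeddings and a cosine similarity index to assign categories to headlines labeled ‘unknown’. The record does not give the details, so the following is only a minimal sketch of one way such a step could work: each known category is represented by the mean of its headline vectors, and each unknown headline takes the category whose mean vector is most similar. The function names, the centroid approach, and the tiny 3-dimensional vectors (standing in for real BERT sentence embeddings) are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


def label_unknown(unknown_vecs, labeled_vecs, labeled_cats):
    """Assign each unknown vector the category with the most similar centroid.

    Hypothetical helper: the centroid-based rule is an assumption,
    not something stated in the catalog record.
    """
    cats_array = np.array(labeled_cats)
    categories = sorted(set(labeled_cats))
    # One mean vector (centroid) per known category.
    centroids = np.stack(
        [labeled_vecs[cats_array == cat].mean(axis=0) for cat in categories]
    )
    sims = cosine_similarity_matrix(unknown_vecs, centroids)
    return [categories[i] for i in sims.argmax(axis=1)]


# Toy vectors standing in for sentence embeddings (assumed data).
labeled_vecs = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.8, 0.2],
])
labeled_cats = ["sports", "sports", "business", "business"]
unknown_vecs = np.array([
    [0.8, 0.2, 0.0],
    [0.1, 0.9, 0.1],
])

print(label_unknown(unknown_vecs, labeled_vecs, labeled_cats))
```

In a fuller pipeline, the newly labeled headlines would then be tokenized, embedded, and passed to an LSTM classifier; that stage is omitted here because the record gives no architectural details.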