Emotion Detection in Hinglish (Hindi+English) Code-Mixed Social Media Text

Bibliographic Details
Published in Procedia Computer Science Vol. 171; pp. 1346-1352
Main Authors Sasidhar, T Tulasi; B, Premjith; P, Soman K
Format Journal Article
Language English
Published Elsevier B.V. 2020
Summary: Human communication often carries emotion, which can be expressed through different media such as speech, text, and non-verbal cues like facial expressions and gestures. Textual communication is already a common way of interacting, and the rapid growth of social media use has taken it to another level. Social media give people an easy way to express their emotions, and people around the world use these platforms to express themselves in text on a wide range of subjects: what they like or dislike, how they felt about a situation, their reaction to a government decision, and so on. Understanding the emotion expressed in such texts therefore has many applications, which underlines the need to detect it. The human brain readily senses the emotion associated with a text, but it is difficult for a machine to gain such perception. In Natural Language Processing, emotion recognition and classification is a commonly researched task in which a model detects these kinds of emotions. The task is especially challenging for Indian languages, both because data are scarce and because people in a multilingual society tend to use code-mixed patterns on social media. The lack of an annotated corpus in the Hindi-English code-mixed domain and the unavailability of a standard classification model have left this area of research largely unexplored. In this paper, to analyze such data, we created a dataset of 12000 Hindi-English code-mixed texts collected from various sources and annotated them with the emotions Happy, Sad and Anger. A pretrained bilingual model is used to generate feature vectors, and deep neural networks are employed as classification models. With the selected features, CNN-BiLSTM performed better than the other models tested, with 83.21% classification accuracy.
ISSN: 1877-0509
DOI: 10.1016/j.procs.2020.04.144