Hope speech detection in YouTube comments

Bibliographic Details
Published in	Social network analysis and mining Vol. 12; no. 1; p. 75
Main Author	Chakravarthi, Bharathi Raja
Format	Journal Article
Language	English
Published	Vienna: Springer Vienna, 01.12.2022; Springer Nature B.V.
Subjects	Applications of Graph Theory and Complex Networks; Artificial intelligence; Bullying; Computer Science; COVID-19; Data Mining and Knowledge Discovery; Datasets; Decision making; Decision trees; Digital media; Disease transmission; Economics; Game Theory; Hate speech; Humanities; Inclusion; Internet; Language; Law; Machine learning; Mental health; Methodology of the Social Sciences; Minority & ethnic groups; Multiculturalism & pluralism; Original Article; People with disabilities; Regression analysis; Social and Behav. Sciences; Social exclusion; Social integration; Social media; Social networks; Statistics for Social Sciences; STEM education; Technology; Unpleasant; Virtual communities; Dravidian languages; Equality; Multilingual; Hope speech; Diversity
Summary:	Recent work on language technology has tried to recognize abusive language, such as hate speech and cyberbullying, and to enhance offensive language identification in order to moderate social media platforms. Most of these systems depend on machine learning models using a tagged dataset. Such models have been successful in detecting and eradicating negativity. However, additional research has recently been conducted on the enhancement of free expression through social media. Instead of eliminating ostensibly unpleasant words, we created a multilingual dataset to recognize and encourage positivity in the comments, and we propose a novel custom deep network architecture that uses a concatenation of embeddings from T5-Sentence. We experimented with multiple machine learning models, including SVM, logistic regression, K-nearest neighbour, decision tree, and logistic neighbours, and we propose a new CNN-based model. Our proposed model outperformed all others with a macro F1-score of 0.75 for English, 0.62 for Tamil, and 0.67 for Malayalam.
ISSN:	1869-5450 1869-5469
DOI:	10.1007/s13278-022-00901-z
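
Note on the metric: the macro F1-score cited in the Summary is the unweighted mean of the per-class F1 scores, so a rare class counts as much as a frequent one. The following is a minimal sketch of that calculation in plain Python, not the article's evaluation code; the function name `macro_f1` and the labels `hope` and `not_hope` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one label is required")

    # Count true positives, false positives and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only.
y_true = ["hope", "hope", "not_hope", "not_hope", "not_hope", "not_hope"]
y_pred = ["hope", "not_hope", "not_hope", "not_hope", "not_hope", "hope"]
print(macro_f1(y_true, y_pred))
```

In this toy example the minority class `hope` scores an F1 of 0.5 and `not_hope` scores 0.75, and the macro average weights the two equally regardless of how many comments fall in each class.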