Social Media Hate Speech Detection Using Explainable Artificial Intelligence (XAI)

Bibliographic Details
Published in	Algorithms Vol. 15; no. 8; p. 291
Main Authors	Mehta, Harshkumar; Passi, Kalpdrum
Format	Journal Article
Language	English
Published	Basel: MDPI AG, 01.08.2022
Subjects	Artificial intelligence; Artificial neural networks; BERT; Coders; Computational linguistics; Data analysis; Datasets; Decision making; Decision trees; Explainable artificial intelligence; Hate speech; hate speech detection; Information management; Language processing; LIME; Multilayer perceptrons; Natural language interfaces; neural networks; offensive languages; Social media
Summary:	Explainable artificial intelligence (XAI) offers flexible, multifaceted potential for hate speech detection with deep learning models. The aim of this research was to interpret and explain the decisions made by complex artificial intelligence (AI) models in order to understand their decision-making process. Two datasets were used to demonstrate hate speech detection using XAI. Data preprocessing removed inconsistencies, cleaned the tweet text, and tokenized and lemmatized it, among other steps. Categorical variables were also simplified to produce a clean dataset for training. Exploratory data analysis was performed on the datasets to uncover patterns and insights. Several existing models were applied to the Google Jigsaw dataset: decision trees, k-nearest neighbors, multinomial naïve Bayes, random forest, logistic regression, and long short-term memory (LSTM); of these, LSTM achieved an accuracy of 97.6%. Explainable methods such as LIME (local interpretable model-agnostic explanations) were applied to the HateXplain dataset. Variants of the BERT (bidirectional encoder representations from transformers) model, namely BERT + ANN (artificial neural network) with an accuracy of 93.55% and BERT + MLP (multilayer perceptron) with an accuracy of 93.67%, were created to achieve good explainability performance on the ERASER (evaluating rationales and simple English reasoning) benchmark.
ISSN:	1999-4893
DOI:	10.3390/a15080291
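
Illustrative sketch: the summary describes a pipeline of text cleaning, a classical classifier such as multinomial naïve Bayes, and a word-level explanation of each prediction. The Python below is a minimal, standard-library-only sketch of that idea and is not the authors' implementation. Everything in it is an assumption made for illustration: the six toy sentences and their "toxic"/"neutral" labels are invented (they are not drawn from the Google Jigsaw or HateXplain datasets), lemmatization is omitted, and the explanation step uses simple leave-one-word-out occlusion as a rough stand-in for LIME, which instead fits a local surrogate model on many random perturbations.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Lowercase, drop URLs and @mentions, and split into word tokens."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    return re.findall(r"[a-z']+", text)


class MultinomialNB:
    """Tiny multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def predict_proba(self, tokens):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            n_c = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)  # class prior
            for w in tokens:
                if w in self.vocab:  # words never seen in training are skipped
                    count = self.word_counts[c][w]
                    score += math.log((count + 1) / (n_c + len(self.vocab)))
            scores[c] = score
        # Normalize log scores into probabilities (softmax over classes).
        top = max(scores.values())
        exp = {c: math.exp(s - top) for c, s in scores.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}


def occlusion_explanation(model, tokens, target):
    """Score each token by how much removing it lowers P(target)."""
    base = model.predict_proba(tokens)[target]
    drops = []
    for i, word in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]
        drops.append((word, base - model.predict_proba(reduced)[target]))
    return sorted(drops, key=lambda pair: pair[1], reverse=True)


# Invented toy data, for illustration only.
TRAIN = [
    ("you are a stupid idiot", "toxic"),
    ("what an idiot, go away", "toxic"),
    ("I hate you, stupid troll", "toxic"),
    ("have a nice day everyone", "neutral"),
    ("thank you for the nice post", "neutral"),
    ("what a great day for a walk", "neutral"),
]

model = MultinomialNB().fit(
    [preprocess(text) for text, _ in TRAIN],
    [label for _, label in TRAIN],
)

tokens = preprocess("You are an IDIOT @user http://x.co")
proba = model.predict_proba(tokens)
explanation = occlusion_explanation(model, tokens, "toxic")

print(proba)
print(explanation)
```

The ranked list printed last plays the role of an explanation: tokens whose removal most reduces the predicted probability of the target class are the ones the classifier leaned on. A real study would replace the toy data with a labeled corpus and the occlusion step with an established explainer.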