Detecting Offensive Language on Arabic Social Media Using Deep Learning

Bibliographic Details
Published in 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS), pp. 466-471
Main Authors Mohaouchane, Hanane; Mourhir, Asmaa; Nikolov, Nikola S.
Format Conference Proceeding
Language English
Published IEEE 01.10.2019
DOI 10.1109/SNAMS.2019.8931839

Abstract Offensive content on social media, such as verbal attacks, demeaning comments, or hate speech, has many negative effects on its users. The automatic detection of offensive language on Arabic social media is an important step towards regulating such content for Arabic-speaking users. This paper evaluates four neural network architectures for this task: a Convolutional Neural Network (CNN), a Bidirectional Long Short-Term Memory network (Bi-LSTM), a Bi-LSTM with an attention mechanism, and a combined CNN-LSTM architecture. The networks are trained and tested on a labeled dataset of Arabic YouTube comments. We run this dataset through a series of pre-processing steps and use Arabic word embeddings to represent the comments. We also apply Bayesian optimization to tune the hyperparameters of the models. We train and test each network using 5-fold cross-validation. The CNN-LSTM achieves the highest recall (83.46%), followed by the CNN (82.24%), the Bi-LSTM with attention (81.51%), and the Bi-LSTM (80.97%).
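Illustration (not from the paper): the abstract reports recall under 5-fold cross-validation. The sketch below shows one way such a figure could be computed, using only the Python standard library. It makes several assumptions the abstract does not confirm: folds are contiguous and unshuffled, recall is measured for the offensive class only (label 1), and the per-fold scores are averaged. The function names (`k_fold_indices`, `recall`, `cross_validated_recall`, `fit_keyword_baseline`) are hypothetical, and the keyword baseline is a toy stand-in for the neural networks, not the authors' models.

```python
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of (train, test) indices."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds get one additional sample.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def recall(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Recall = TP / (TP + FN) for the positive class; 0.0 if it never occurs."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def cross_validated_recall(
    texts: Sequence[str],
    labels: Sequence[int],
    fit: Callable[[List[str], List[int]], Callable[[str], int]],
    k: int = 5,
) -> float:
    """Train on k-1 folds, score recall on the held-out fold, average over folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(texts), k):
        predict = fit([texts[i] for i in train_idx], [labels[i] for i in train_idx])
        y_true = [labels[i] for i in test_idx]
        y_pred = [predict(texts[i]) for i in test_idx]
        scores.append(recall(y_true, y_pred))
    return sum(scores) / len(scores)


def fit_keyword_baseline(train_texts: List[str], train_labels: List[int]) -> Callable[[str], int]:
    """Toy placeholder model: flag a text if it contains a token that appeared
    only in offensive (label 1) training texts."""
    offensive, clean = set(), set()
    for text, label in zip(train_texts, train_labels):
        (offensive if label == 1 else clean).update(text.split())
    markers = offensive - clean
    return lambda text: 1 if markers & set(text.split()) else 0
```

In practice the `fit` argument would wrap whichever model is being evaluated, and a shuffled or stratified split would usually be preferable when class labels are imbalanced or ordered in the dataset.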
Author Nikolov, Nikola S.
Mohaouchane, Hanane
Mourhir, Asmaa
Author_xml – sequence: 1
  givenname: Hanane
  surname: Mohaouchane
  fullname: Mohaouchane, Hanane
  organization: School of Science and Engineering, Al Akhawayn University, Ifrane, Morocco
– sequence: 2
  givenname: Asmaa
  surname: Mourhir
  fullname: Mourhir, Asmaa
  organization: School of Science and Engineering, Al Akhawayn University, Ifrane, Morocco
– sequence: 3
  givenname: Nikola S.
  surname: Nikolov
  fullname: Nikolov, Nikola S.
  organization: University of Limerick, Department of Computer Science and Information Systems, Limerick, Ireland
EISBN 172812946X
9781728129464
EndPage 471
ExternalDocumentID 8931839
Genre orig-research
IsPeerReviewed false
IsScholarly false
Language English
PageCount 6
PublicationCentury 2000
PublicationDate 2019-Oct.
PublicationDateYYYYMMDD 2019-10-01
PublicationDate_xml – month: 10
  year: 2019
  text: 2019-Oct.
PublicationDecade 2010
PublicationTitle 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS)
PublicationTitleAbbrev SNAMS
PublicationYear 2019
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 466
SubjectTerms Arabic language
attention model
Computer architecture
convolutional neural network
deep learning
Feature extraction
long short-term memory
offensive language detection
Recurrent neural networks
social media
Social networking (online)
Task analysis
Training
Title Detecting Offensive Language on Arabic Social Media Using Deep Learning
URI https://ieeexplore.ieee.org/document/8931839