Detecting Offensive Language on Arabic Social Media Using Deep Learning

Bibliographic Details
Published in	2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS), pp. 466-471
Main Authors	Mohaouchane, Hanane; Mourhir, Asmaa; Nikolov, Nikola S.
Format	Conference Proceeding
Language	English
Published	IEEE 01.10.2019
Subjects	Arabic language; attention model; Computer architecture; convolutional neural network; deep learning; Feature extraction; long short-term memory; offensive language detection; Recurrent neural networks; social media; Social networking (online); Task analysis; Training
DOI	10.1109/SNAMS.2019.8931839

Abstract	Offensive content on social media such as verbal attacks, demeaning comments or hate speech has many negative effects on its users. The automatic detection of offensive language on Arabic social media is an important step towards the regulation of such content for Arabic speaking users of social media. This paper presents the results of evaluating the performance of four different neural network architectures for this task: Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (Bi-LSTM), Bi-LSTM with attention mechanism, and a combined CNN-LSTM architecture. These networks are trained and tested on a labeled dataset of Arabic YouTube comments. We run this dataset through a series of pre-processing steps and use Arabic word embeddings to represent the comments. We also apply Bayesian optimization techniques to tune the hyperparameters of the neural network models. We train and test each network using 5-fold cross validation. The CNN-LSTM achieves the highest recall (83.46%), followed by the CNN (82.24%), the Bi-LSTM with attention (81.51%) and the Bi-LSTM (80.97%).
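
The abstract reports recall averaged over 5-fold cross validation. The following is a minimal sketch of that evaluation loop only, using the Python standard library. It is not the authors' code: the function names, the toy data in the usage note, and the `keyword_baseline` stand-in model are assumptions made for illustration, and the paper's CNN, Bi-LSTM, and CNN-LSTM networks are not reproduced here.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def recall(y_true, y_pred, positive=1):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def cross_validated_recall(texts, labels, train_fn, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average recall."""
    scores = []
    for held_out in k_fold_indices(len(texts), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(texts)) if i not in held]
        # train_fn is expected to return a callable that labels one text.
        predict = train_fn([texts[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        preds = [predict(texts[i]) for i in held_out]
        scores.append(recall([labels[i] for i in held_out], preds))
    return sum(scores) / len(scores)


def keyword_baseline(train_texts, train_labels):
    """Hypothetical stand-in model (not from the paper): flag a comment
    if it contains a token seen only in offensive training comments."""
    offensive, clean = set(), set()
    for text, label in zip(train_texts, train_labels):
        (offensive if label == 1 else clean).update(text.split())
    markers = offensive - clean
    return lambda text: 1 if markers & set(text.split()) else 0
```

A call such as `cross_validated_recall(texts, labels, keyword_baseline)` takes a list of comment strings and a parallel list of 0/1 labels. In this sketch a fold that happens to contain no offensive comments scores 0.0, which lowers the average; a stratified split would avoid that, but whether the authors stratified their folds is not stated in the abstract.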
Author affiliations	Hanane Mohaouchane (School of Science and Engineering, Al Akhawayn University, Ifrane, Morocco); Asmaa Mourhir (School of Science and Engineering, Al Akhawayn University, Ifrane, Morocco); Nikola S. Nikolov (Department of Computer Science and Information Systems, University of Limerick, Limerick, Ireland)
EISBN	172812946X 9781728129464
URI	https://ieeexplore.ieee.org/document/8931839