Detecting Offensive Language on Arabic Social Media Using Deep Learning
Offensive content on social media such as verbal attacks, demeaning comments or hate speech has many negative effects on its users. The automatic detection of offensive language on Arabic social media is an important step towards the regulation of such content for Arabic speaking users of social med...
Published in | 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS) pp. 466 - 471 |
---|---|
Main Authors | Mohaouchane, Hanane; Mourhir, Asmaa; Nikolov, Nikola S. |
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.10.2019 |
Subjects | |
DOI | 10.1109/SNAMS.2019.8931839 |
Abstract | Offensive content on social media, such as verbal attacks, demeaning comments, or hate speech, has many negative effects on its users. The automatic detection of offensive language on Arabic social media is an important step towards the regulation of such content for Arabic-speaking users of social media. This paper presents the results of evaluating the performance of four different neural network architectures for this task: Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (Bi-LSTM), Bi-LSTM with an attention mechanism, and a combined CNN-LSTM architecture. These networks are trained and tested on a labeled dataset of Arabic YouTube comments. We run this dataset through a series of pre-processing steps and use Arabic word embeddings to represent the comments. We also apply Bayesian optimization to tune the hyperparameters of the neural network models. We train and test each network using 5-fold cross-validation. The CNN-LSTM achieves the highest recall (83.46%), followed by the CNN (82.24%), the Bi-LSTM with attention (81.51%), and the Bi-LSTM (80.97%). |
---|---|
Author | Nikolov, Nikola S.; Mohaouchane, Hanane; Mourhir, Asmaa |
Author_xml | – sequence: 1 givenname: Hanane surname: Mohaouchane fullname: Mohaouchane, Hanane organization: School of Science and Engineering, Al Akhawayn University, Ifrane, Morocco – sequence: 2 givenname: Asmaa surname: Mourhir fullname: Mourhir, Asmaa organization: School of Science and Engineering, Al Akhawayn University, Ifrane, Morocco – sequence: 3 givenname: Nikola S. surname: Nikolov fullname: Nikolov, Nikola S. organization: University of Limerick, Department of Computer Science and Information Systems, Limerick, Ireland |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/SNAMS.2019.8931839 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
EISBN | 172812946X 9781728129464 |
EndPage | 471 |
ExternalDocumentID | 8931839 |
Genre | orig-research |
GroupedDBID | 6IE 6IL CBEJK RIE RIL |
ID | FETCH-LOGICAL-i118t-a8821413c9fe96d3dd525f124467a76b16c8d851e282dd8f06e0af46dfa1a63c3 |
IEDL.DBID | RIE |
IngestDate | Thu Jun 29 18:38:00 EDT 2023 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i118t-a8821413c9fe96d3dd525f124467a76b16c8d851e282dd8f06e0af46dfa1a63c3 |
PageCount | 6 |
ParticipantIDs | ieee_primary_8931839 |
PublicationCentury | 2000 |
PublicationDate | 2019-Oct. |
PublicationDateYYYYMMDD | 2019-10-01 |
PublicationDate_xml | – month: 10 year: 2019 text: 2019-Oct. |
PublicationDecade | 2010 |
PublicationTitle | 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS) |
PublicationTitleAbbrev | SNAMS |
PublicationYear | 2019 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
Score | 1.907613 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 466 |
SubjectTerms | Arabic language; attention model; Computer architecture; convolutional neural network; deep learning; Feature extraction; long short-term memory; offensive language detection; Recurrent neural networks; social media; Social networking (online); Task analysis; Training |
Title | Detecting Offensive Language on Arabic Social Media Using Deep Learning |
URI | https://ieeexplore.ieee.org/document/8931839 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
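
The evaluation protocol named in the abstract (5-fold cross-validation scored by recall) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the shuffling seed, and the choice of binary recall on the "offensive" class (label 1) are all assumptions made for illustration, and `train_fn` is a hypothetical stand-in for fitting any of the four networks.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the class given by `positive`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def cross_validated_recall(features, labels, train_fn, k=5, seed=0):
    """Average held-out recall over k folds.

    `train_fn(train_features, train_labels)` is a hypothetical callback that
    returns a predictor mapping one feature item to a label.
    """
    scores = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        predict = train_fn([features[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        preds = [predict(features[i]) for i in held_out]
        scores.append(recall([labels[i] for i in held_out], preds))
    return sum(scores) / k


def toy_train_fn(train_features, train_labels):
    """Toy stand-in for a neural model: flag comments containing any word
    seen only in offensive training comments."""
    offensive = set()
    clean = set()
    for text, label in zip(train_features, train_labels):
        (offensive if label == 1 else clean).update(text.split())
    flagged = offensive - clean
    return lambda text: 1 if flagged & set(text.split()) else 0
```

Calling `cross_validated_recall(comments, labels, toy_train_fn)` on a list of comment strings and 0/1 labels returns the mean recall across the five held-out folds. The abstract does not say whether the reported recall is per-class or averaged over classes, so the binary definition used here is only one possible reading.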