Automatic cyberbullying detection: A systematic review

Bibliographic Details
Published in	Computers in human behavior Vol. 93; pp. 333-345
Main Authors	Rosa, H., Pereira, N., Ribeiro, R., Ferreira, P.C., Carvalho, J.P., Oliveira, S., Coheur, L., Paulino, P., Veiga Simão, A.M., Trancoso, I.
Format	Journal Article
Language	English
Published	Elmsford Elsevier Ltd 01.04.2019 Elsevier Science Ltd
Subjects	Abusive language; Adolescents; Automatic cyberbullying detection; Bullying; Cyberbullying; Datasets; Machine learning; Natural language processing; Social networks; Systematic review; Systems analysis
Summary:	Automatic cyberbullying detection is a task of growing interest, particularly in the Natural Language Processing and Machine Learning communities. Not only is it challenging, but it is also a relevant need given how social networks have become a vital part of individuals' lives and how dire the consequences of cyberbullying can be, especially among adolescents. In this work, we conduct an in-depth analysis of 22 studies on automatic cyberbullying detection, complemented by an experiment to validate current practices through the analysis of two datasets. Results indicated that cyberbullying is often misrepresented in the literature, leading to inaccurate systems that would have little real-world application. Criteria concerning cyberbullying definitions and other methodological concerns seem to be often dismissed. Additionally, there is no uniformity regarding the methodology to evaluate said systems and the natural imbalance of datasets remains an issue. This paper aims to direct future research on the subject towards a viewpoint that is more coherent with the definition and representation of the phenomenon, so that future systems can have a practical and impactful application. Recommendations on future works are also made.
• Cyberbullying is often misrepresented in automatic detection state-of-the-art.
• Available systems do not capture all four key criteria of cyberbullying.
• Datasets used to train detection systems are incomplete.
• Feature engineering performance improvement is marginal.
• Cyberbullying detection systems seem not applicable to real world situations.
ISSN:	0747-5632, 1873-7692
DOI:	10.1016/j.chb.2018.12.021