Spam profile detection in social networks based on public features

Bibliographic Details
Published in	2017 8th International Conference on Information and Communication Systems (ICICS) pp. 130 - 135
Main Authors	Al-Zoubi, Ala' M.; Alqatawna, Ja'far; Faris, Hossam
Format	Conference Proceeding
Language	English
Published	IEEE 01.04.2017
Subjects	Classification; Classification algorithms; Context; Decision trees; Electronic mail; Feature extraction; Feature selection; Social Networks; Spam; Twitter
Summary:	In the context of Online Social Networks, spam profiles are not just a source of unwanted ads, but a serious security threat used by online criminals and terrorists for various malicious purposes. Recently, such criminals were able to steal a number of accounts belonging to NatWest bank's customers. Their attack vector was based on spam tweets posted by a Twitter account that closely resembled the NatWest customer support account and led users to a phishing site. In this study, we investigate the nature of spam profiles on Twitter with the goal of improving social spam detection. Based on a set of publicly available features, we develop spam profile detection models. At this stage, a dataset of 82 Twitter profiles is collected and analyzed. With feature engineering, we investigate ten binary and simple features that can be used to classify spam profiles. Moreover, a feature selection process is used to identify the most influential features for detecting spam profiles. For feature selection, two methods are used: ReliefF and Information Gain. For classification, four algorithms are applied and compared: Decision Trees, Multilayer Perceptron, k-Nearest Neighbors, and Naive Bayes. Preliminary experiments show that promising detection rates can be obtained using such features regardless of the language of the tweets.
DOI:	10.1109/IACS.2017.7921959
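
The summary names Information Gain as one of the two feature selection methods. As a rough illustration of how that ranking works on binary profile features, the sketch below computes the entropy reduction each feature gives over the spam/legitimate label. The three feature names and the eight-profile toy dataset are invented for illustration; they are not the paper's ten features or its 82-profile dataset, and this is not the authors' implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


# Toy labels: 1 = spam profile, 0 = legitimate profile (made-up data).
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical binary features, one value per profile (assumed names).
features = {
    "default_profile_image": [1, 1, 1, 1, 0, 0, 0, 0],
    "bio_contains_url":      [1, 1, 0, 0, 1, 1, 0, 0],
    "few_followers":         [1, 1, 1, 0, 0, 0, 0, 0],
}

# Rank features from most to least informative about the label.
ranking = sorted(
    ((name, information_gain(values, labels))
     for name, values in features.items()),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, gain in ranking:
    print(f"{name}: {gain:.3f} bits")
```

A feature that splits the profiles exactly along the label scores the full label entropy, while a feature that is independent of the label scores zero. A selection step would keep the top-ranked features before training the classifiers.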