Exploring demographic information in online social networks for improving content classification

Bibliographic Details
Published in: Journal of King Saud University. Computer and information sciences Vol. 32; no. 9; pp. 1034-1044
Main Authors: Benkhelifa, Randa; Laallam, Fatima Zohra
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.11.2020
Summary: The daily interactions among users of online social networks (OSNs) can be analyzed and interpreted in real time to capture those users' interests, preferences, and concerns. These interactions offer a unique information source for applications in many fields, such as trendsetting, future prediction, recommendation systems, community detection, and marketing. Most existing studies on text classification in OSNs rely on a content-based approach, which captures users' interests by categorizing the unstructured textual content they share according to its topic. However, users' public profiles on OSNs often also reveal demographic attributes such as age, gender, education, and marital status, which can play an essential role in identifying their interests and preferences. Demographic attributes can indicate a preference for certain topics: people with different demographic attributes may be interested in different topics, while people with similar attributes may share the same interests. For example, young people are usually more interested in technology than older people, who in turn are more interested in political news. In this paper, we propose a demographic-content-based approach that uses both users' demographic attributes and the textual content to classify OSN posts with six classifiers: ANN, k-NN, Naïve Bayes, Decision Tree, Decision Rules, and SVM. The experiments are conducted on a large Facebook dataset to analyze the effect of these demographic attributes on the performance of categorizing the textual content shared in OSNs.
ISSN: 1319-1578
DOI: 10.1016/j.jksuci.2018.10.012
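
The demographic-content-based idea in the summary, combining a post's words with its author's profile attributes before classification, can be illustrated with a minimal sketch. This is not the authors' implementation: the feature encoding, the cosine similarity, the age brackets ("18-25", "50+"), the toy posts, and the function names are all assumptions made for illustration, and only k-NN (one of the six classifiers named above) is shown.

```python
from collections import Counter
import math


def featurize(text, demographics):
    """Merge bag-of-words counts with one-hot demographic indicators.

    `demographics` is a hypothetical dict such as {"age": "18-25"}.
    """
    features = Counter(f"word={w}" for w in text.lower().split())
    for name, value in demographics.items():
        features[f"{name}={value}"] = 1
    return features


def cosine(a, b):
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, query, k=3):
    """Majority vote over the k training items most similar to `query`."""
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up posts (not taken from the paper's Facebook dataset).
TRAIN = [
    (featurize("new phone released today", {"age": "18-25"}), "technology"),
    (featurize("great laptop update", {"age": "18-25"}), "technology"),
    (featurize("election results announced", {"age": "50+"}), "politics"),
    (featurize("parliament votes on new law", {"age": "50+"}), "politics"),
]


def classify(text, demographics, k=3):
    """Classify a post from its text plus the author's demographic attributes."""
    return knn_predict(TRAIN, featurize(text, demographics), k)
```

On this toy data, `classify("big news today", {"age": "50+"})` returns `"politics"`, while the same text with an empty attribute dict returns `"technology"`. The example only shows how a demographic feature can shift a prediction; it says nothing about the results reported in the paper.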