Exploring demographic information in online social networks for improving content classification
Published in | Journal of King Saud University. Computer and information sciences Vol. 32; no. 9; pp. 1034 - 1044 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V, 01.11.2020 |
Subjects | |
Summary: | The daily interaction between users within online social networks (OSNs) offers an effective way to analyze and interpret context in real time and to capture the interests, preferences, and concerns of OSN users. These interactions offer a unique information source for applications in several fields, such as trendsetting, future prediction, recommendation systems, community detection, and marketing. Most existing studies on text classification in OSNs rely on a content-based approach, which captures users' interests by categorizing the unstructured textual content those users share according to its topics. Moreover, users' public profiles on OSNs often reveal demographic attributes such as age, gender, education, and marital status, which can play an essential role in identifying users' interests and preferences. Demographic attributes can indicate a preference for certain topics of interest. People with different demographic attributes may be interested in different topics, while people with similar demographic attributes may share the same interests. For example, young people are usually more interested in technology than older people, who in turn are more interested in political news. In this paper, we propose a demographic-content-based approach that uses both users' demographic attributes and the textual content to classify OSN posts using six classifiers: ANN, k-NN, Naïve Bayes, Decision Tree, Decision Rules, and SVM. The experiments are run on a large Facebook dataset to analyze the effect of these demographic attributes on the performance of categorizing the textual content shared in OSNs. |
---|---|
ISSN: | 1319-1578 |
DOI: | 10.1016/j.jksuci.2018.10.012 |
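
The summary describes combining a post's text with its author's demographic attributes before classification. The sketch below illustrates that general idea with a small multinomial Naive Bayes classifier (one of the six classifier families the summary names). It is not the authors' implementation: the `featurize` function, the `NaiveBayes` class, the `age=young` style of encoding attributes as tokens, and the four toy posts are all assumptions made for illustration, and the data is hypothetical rather than taken from the paper's Facebook dataset.

```python
import math
from collections import Counter, defaultdict


def featurize(text, demographics=None):
    """Turn a post into a bag of tokens; demographics become extra tokens."""
    tokens = text.lower().split()
    if demographics:
        # Assumed encoding: each attribute/value pair is one token, e.g. "age=young".
        tokens += [f"{key}={value}" for key, value in sorted(demographics.items())]
    return tokens


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(samples, labels):
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            # Log prior plus smoothed log likelihood of every token.
            score = math.log(self.class_counts[label] / total)
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy posts: (text, demographic attributes, topic label).
posts = [
    ("new phone released today", {"age": "young"}, "technology"),
    ("great update for the app", {"age": "young"}, "technology"),
    ("election results announced today", {"age": "old"}, "politics"),
    ("parliament debates the new law", {"age": "old"}, "politics"),
]

samples = [featurize(text, demo) for text, demo, _ in posts]
labels = [label for _, _, label in posts]
model = NaiveBayes().fit(samples, labels)

# The text alone is ambiguous here; the demographic token breaks the tie.
print(model.predict(featurize("big news today", {"age": "young"})))
```

In this toy setup, the words of "big news today" are equally likely under both topics, so the demographic token is the only feature that separates them. Real data would need a proper tokenizer, more attributes, and a held-out evaluation before drawing any conclusion about the effect of demographics.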