Network Sampling Based on Centrality Measures for Relational Classification

Many real-world networks, such as the Internet, social networks, biological networks, and others, are massive in size, which impairs their processing and analysis. To cope with this, the network size could be reduced without losing relevant information. In this paper, we extend a work that proposed...

Full description

Saved in:

Bibliographic Details
Published in	Information Management and Big Data Vol. 656; pp. 43 - 56
Main Authors	Berton, Lilian, Vega-Oliveros, Didier A., Valverde-Rebaza, Jorge, da Silva, Andre Tavares, Lopes, Alneu de Andrade
Format	Book Chapter
Language	English
Published	Switzerland Springer International Publishing AG 2017 Springer International Publishing
Series	Communications in Computer and Information Science
Subjects	Centrality measures Complex networks Missing data Network sampling Relational classification
Online Access	Get full text

Cover

Loading…

More Information
Summary:	Many real-world networks, such as the Internet, social networks, biological networks, and others, are massive in size, which impairs their processing and analysis. To cope with this, the network size could be reduced without losing relevant information. In this paper, we extend a work that proposed a sampling method based on the following centrality measures: degree, k-core, clustering, eccentricity and structural holes. For our experiments, we remove \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$30\%$$\end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$50\%$$\end{document} of the vertices and their edges from the original network. After, we evaluate our proposal on six real-world networks on relational classification task using eight different classifiers. Classification results achieved on sampled graphs generated from our proposal are similar to those obtained on the entire graphs. The execution time for learning step of the classifier is shorter on the sampled graph compared to the entire graph and random sampling. In most cases, the original graph was reduced by up to \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$50\%$$\end{document} of its initial number of edges without losing topological properties.
ISBN:	9783319552088 3319552082
ISSN:	1865-0929 1865-0937
DOI:	10.1007/978-3-319-55209-5_4