Cost-based analyses of random neighbor and derived sampling methods

Bibliographic Details
Published in Applied Network Science Vol. 7; no. 1; pp. 1-23
Main Authors Novick, Yitzchak; Bar-Noy, Amotz
Format Journal Article
Language English
Published Cham Springer International Publishing 01.06.2022
Springer Nature B.V
SpringerOpen
Summary: Random neighbor sampling, or RN, is a method for sampling vertices with a mean degree greater than that of the graph. Instead of naïvely sampling a vertex from a graph and retaining it (‘random vertex’ or RV), a neighbor of the vertex is selected instead. While considerable research has analyzed various aspects of RN, the extra cost of sampling a second vertex is typically not addressed. This paper explores RN sampling from the perspective of cost. We break down the cost of sampling into two distinct costs, that of sampling a vertex and that of sampling a neighbor of an already sampled vertex, and we also include the cost of actually selecting a vertex/neighbor and retaining it for use rather than discarding it. With these three costs as our cost model, we explore RN and compare it to RV more fairly than previous research has done. As we delve into costs, a number of variants of RN are introduced. These variants improve on the cost-effectiveness of RN with regard to particular costs and priorities. Our full cost-benefit analysis highlights strengths and weaknesses of the methods. We particularly focus on how our methods perform for sampling high-degree and low-degree vertices, which further enriches the understanding of the methods and how they can be practically applied. We also suggest ‘two-phase’ methods that specifically seek to cover both high-degree and low-degree vertices in separate sampling phases.
ISSN: 2364-8228
DOI: 10.1007/s41109-022-00475-x
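
Illustrative sketch (not taken from the article): the summary contrasts RV and RN under a three-part cost model, and the short Python example below shows one way that comparison could be set up. The names `Costs`, `sample_rv`, `sample_rn`, `expected_degree_rv`, and `expected_degree_rn`, the unit cost values, and the adjacency-list graph representation are all assumptions made for illustration. The sketch also assumes an undirected graph with no isolated vertices, and it does not cover the RN variants or the two-phase methods the summary mentions.

```python
import random
from dataclasses import dataclass


@dataclass
class Costs:
    """Hypothetical cost model with the three costs named in the summary."""
    vertex: float = 1.0    # cost of sampling a vertex
    neighbor: float = 1.0  # cost of sampling a neighbor of a sampled vertex
    select: float = 1.0    # cost of retaining the chosen vertex for use


def sample_rv(adj, rng):
    """RV: sample a vertex uniformly at random and retain it."""
    return rng.choice(list(adj))


def sample_rn(adj, rng):
    """RN: sample a vertex uniformly, then retain a uniform random neighbor.

    Assumes every vertex has at least one neighbor.
    """
    v = rng.choice(list(adj))
    return rng.choice(adj[v])


def cost_rv(costs):
    """Cost of one RV sample: one vertex draw plus one selection."""
    return costs.vertex + costs.select


def cost_rn(costs):
    """Cost of one RN sample: vertex draw, neighbor draw, one selection."""
    return costs.vertex + costs.neighbor + costs.select


def expected_degree_rv(adj):
    """Exact expected degree of an RV sample (the mean degree of the graph)."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def expected_degree_rn(adj):
    """Exact expected degree of an RN sample.

    Averages, over all starting vertices, the mean degree of their neighbors.
    """
    total = 0.0
    for nbrs in adj.values():
        total += sum(len(adj[u]) for u in nbrs) / len(nbrs)
    return total / len(adj)


# Star graph: vertex 0 is the hub, vertices 1..4 are leaves.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
costs = Costs()

# Expected degree obtained per unit of cost under the assumed unit costs.
rv_per_cost = expected_degree_rv(star) / cost_rv(costs)
rn_per_cost = expected_degree_rn(star) / cost_rn(costs)
```

On this star graph the expected degree is 1.6 for RV and 3.4 for RN, so RN still yields more degree per unit cost when all three costs are set to 1. Raising the hypothetical `neighbor` cost narrows that advantage and eventually reverses it, which is the kind of trade-off a cost-based comparison makes visible.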