Cost-based analyses of random neighbor and derived sampling methods
Published in | Applied Network Science, Vol. 7, no. 1, pp. 1-23 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Cham: Springer International Publishing, 01.06.2022 (Springer Nature B.V; SpringerOpen) |
Summary: | Random neighbor sampling, or RN, is a method for sampling vertices with a mean degree greater than that of the graph. Rather than naïvely sampling a vertex from the graph and retaining it (‘random vertex’, or RV), RN selects a neighbor of the sampled vertex. While considerable research has analyzed various aspects of RN, the extra cost of sampling a second vertex is typically not addressed. This paper explores RN sampling from the perspective of cost. We break the cost of sampling down into two distinct costs, that of sampling a vertex and that of sampling a neighbor of an already sampled vertex, and we also include the cost of actually selecting a vertex or neighbor and retaining it for use rather than discarding it. With these three costs as our cost model, we explore RN and compare it to RV more fairly than previous research has done. As we delve into costs, we introduce a number of variants of RN that improve on its cost-effectiveness with regard to particular costs and priorities. Our full cost-benefit analysis highlights the strengths and weaknesses of the methods. We focus in particular on how the methods perform when sampling high-degree and low-degree vertices, which further enriches the understanding of the methods and of how they can be applied in practice. We also suggest ‘two-phase’ methods that specifically seek to cover both high-degree and low-degree vertices in separate sampling phases. |
---|---|
ISSN: | 2364-8228 |
DOI: | 10.1007/s41109-022-00475-x |
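
The RV/RN distinction and the three-cost model described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `COSTS` dictionary, the unit cost values, and the way the three costs are added per sample are all assumptions made for illustration, and the graph is assumed to have no isolated vertices.

```python
import random

# Hypothetical unit costs; the names and values are illustrative only.
COSTS = {"vertex": 1.0, "neighbor": 1.0, "select": 1.0}

def sample_rv(adj, rng):
    """RV: sample a vertex uniformly at random and retain it."""
    return rng.choice(sorted(adj))

def sample_rn(adj, rng):
    """RN: sample a vertex uniformly, then retain a uniformly chosen neighbor."""
    v = rng.choice(sorted(adj))
    return rng.choice(adj[v])  # assumes every vertex has at least one neighbor

def cost_per_sample(method, costs=COSTS):
    """Assumed cost of one retained vertex (one reading of the cost model)."""
    if method == "RV":
        return costs["vertex"] + costs["select"]
    if method == "RN":
        return costs["vertex"] + costs["neighbor"] + costs["select"]
    raise ValueError(f"unknown method: {method}")

def expected_degree_rv(adj):
    """Exact mean degree of a vertex retained by RV (the graph's mean degree)."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def expected_degree_rn(adj):
    """Exact mean degree of a vertex retained by RN."""
    total = 0.0
    for nbrs in adj.values():
        # Average degree over the neighbors of each possible first vertex.
        total += sum(len(adj[u]) for u in nbrs) / len(nbrs)
    return total / len(adj)

# A star graph: vertex 0 is the hub, vertices 1-4 are leaves.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}

if __name__ == "__main__":
    for name, degree in (("RV", expected_degree_rv(star)),
                         ("RN", expected_degree_rn(star))):
        cost = cost_per_sample(name)
        print(name, "mean degree:", degree, "cost:", cost,
              "degree per unit cost:", degree / cost)
```

On this star graph the retained vertex has mean degree 1.6 under RV and 3.4 under RN, while the assumed per-sample cost rises from 2 to 3 units. Changing the values in `COSTS` shows how the comparison depends on the relative prices of the three operations.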