One-step Gibbs sampling for the generation of synthetic households
Published in | Transportation research. Part C, Emerging technologies Vol. 166; p. 104770 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 01.09.2024 |
Subjects | |
Summary: | The generation of synthetic households is challenging due to the necessity of maintaining consistency between the two layers of interest: the household itself, and the individuals composing it. Hence, the problem is typically tackled in two steps, first focusing on the individual layer and then on the household layer. The existing two-step simulation method proposes generating the households based on their roles, which diminishes the generality of the approach and makes it difficult to reproduce despite its beneficial properties. In this paper, we propose an alternative extension of Gibbs sampling for generating hierarchical datasets such as synthetic households, in order to make simulation more general and reusable. We demonstrate the performance of our method in a case study based on the 2015 Swiss micro-census data and compare it against state-of-the-art approaches. We show the influence of modeling decisions on different performance metrics and how the analyst can easily enforce consistency while avoiding generating illogical households. We show that the algorithm maintains the conditional distributions while satisfying the marginals of all variables simultaneously, all while generating consistent synthetic households.
• We propose an alternative practical extension of Gibbs sampling for generating hierarchical datasets such as synthetic households.
• By grouping all relevant variables at all levels into a single random vector and sorting individuals by decreasing age, we can generate realistically detailed synthetic populations.
• Implementing a separate Gibbs sampler for each household size accelerates the generation process and can be viewed as a variance reduction technique.
• We demonstrate how modeling decisions impact various performance metrics and how analysts can ensure consistency while avoiding the creation of illogical households.
• The results indicate that model-based methods are superior to data-driven approaches in controlling the generation process, thereby preventing the creation of illogical households. |
---|---|
ISSN: | 0968-090X |
DOI: | 10.1016/j.trc.2024.104770 |
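
The summary describes treating all household and individual variables as a single random vector and sampling it with a Gibbs sampler, with individuals sorted by decreasing age. A minimal sketch of that general idea is given below. It is a plain single-site Gibbs sampler on a toy two-person household and is not the authors' implementation; the age bands, the `JOINT` table and its weights, and the function names `conditional` and `gibbs` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy joint distribution for a 2-person household.
# State = (age_band_person1, age_band_person2), persons sorted by decreasing age.
# Age bands: 0 = child, 1 = adult, 2 = senior. All weights are invented.
# Combinations missing from the table (e.g. two children, or person 2 older
# than person 1) have probability zero, i.e. they are "illogical" households.
JOINT = {
    (1, 0): 0.30,
    (1, 1): 0.35,
    (2, 0): 0.05,
    (2, 1): 0.10,
    (2, 2): 0.20,
}
N_BANDS = 3


def conditional(state, idx):
    """Return P(x_idx = v | all other coordinates) for every band v."""
    weights = []
    for v in range(N_BANDS):
        candidate = list(state)
        candidate[idx] = v
        weights.append(JOINT.get(tuple(candidate), 0.0))
    total = sum(weights)  # positive as long as the current state is feasible
    return [w / total for w in weights]


def gibbs(n_samples, burn_in=500, seed=0):
    """Draw households by resampling one coordinate at a time."""
    rng = random.Random(seed)
    state = [1, 1]  # the starting state must have positive probability
    samples = []
    for it in range(burn_in + n_samples):
        for idx in range(len(state)):
            probs = conditional(state, idx)
            state[idx] = rng.choices(range(N_BANDS), weights=probs)[0]
        if it >= burn_in:
            samples.append(tuple(state))
    return samples


if __name__ == "__main__":
    draws = gibbs(20000)
    for household in sorted(JOINT):
        share = draws.count(household) / len(draws)
        print(household, round(share, 3), "target", JOINT[household])
```

Because every update draws from a conditional of the same joint table, a zero-probability combination can never be produced once the chain starts from a feasible state. In this toy setting, that is how consistency constraints are enforced. Running one such sampler per household size, as the highlights suggest, would here amount to keeping a separate table for each size.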