Parallel Resampling in the Particle Filter

Bibliographic Details
Published in: Journal of Computational and Graphical Statistics, Vol. 25, No. 3, pp. 789–805
Main Authors: Murray, Lawrence M.; Lee, Anthony; Jacob, Pierre E.
Format: Journal Article
Language: English
Published: Alexandria: Taylor & Francis for the American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America, 02.07.2016
ISSN: 1061-8600, 1537-2715
DOI: 10.1080/10618600.2015.1062015

Summary: Modern parallel computing devices, such as the graphics processing unit (GPU), have gained significant traction in scientific and statistical computing. They are particularly well-suited to data-parallel algorithms such as the particle filter, or more generally sequential Monte Carlo (SMC), which are increasingly used in statistical inference. SMC methods carry a set of weighted particles through repeated propagation, weighting, and resampling steps. The propagation and weighting steps are straightforward to parallelize, as they require only independent operations on each particle. The resampling step is more difficult, as standard schemes require a collective operation, such as a sum, across particle weights. Focusing on this resampling step, we analyze two alternative schemes that do not involve a collective operation (Metropolis and rejection resamplers), and compare them to standard schemes (multinomial, stratified, and systematic resamplers). We find that, in certain circumstances, the alternative resamplers can perform significantly faster on a GPU, and to a lesser extent on a CPU, than the standard approaches. Moreover, in single precision, the standard approaches are numerically biased for upward of hundreds of thousands of particles, while the alternatives are not. This is particularly important given greater single- than double-precision throughput on modern devices, and the consequent temptation to use single precision with a greater number of particles. Finally, we provide auxiliary functions useful for implementation, such as for the permutation of ancestry vectors to enable in-place propagation. Supplementary materials are available online.
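The contrast between the two families of schemes described in the summary can be sketched in a few lines. The code below is illustrative only, not the authors' implementation: a systematic resampler, whose cumulative sum over the weights is the collective operation in question, next to a Metropolis resampler, which compares only pairwise weight ratios and so parallelizes over particles with no collective step. The function names and the chain length B are placeholders; the paper treats the choice of B as a tuning problem.

    import numpy as np

    def systematic_resample(w, rng):
        """Systematic resampler: a single uniform offset, but it needs a
        collective operation (a cumulative sum) across all weights."""
        n = len(w)
        cumw = np.cumsum(w / w.sum())          # collective prefix sum
        cumw[-1] = 1.0                         # guard against round-off at the top
        u = (rng.random() + np.arange(n)) / n  # evenly spaced positions, one per particle
        return np.searchsorted(cumw, u)        # ancestor indices

    def metropolis_resample(w, B, rng):
        """Metropolis resampler: each particle runs an independent Metropolis
        chain over particle indices, using only weight ratios, so no collective
        operation over the weights is required."""
        n = len(w)
        a = np.arange(n)                       # current chain state per particle
        for _ in range(B):
            j = rng.integers(0, n, size=n)     # uniform index proposals
            u = rng.random(n)
            accept = u * w[a] <= w[j]          # accept with prob min(1, w[j] / w[a])
            a = np.where(accept, j, a)
        return a

    rng = np.random.default_rng(1)
    w = rng.random(1_000)                      # unnormalized particle weights
    print(systematic_resample(w, rng)[:10])
    print(metropolis_resample(w, B=50, rng=rng)[:10])

The single-precision bias mentioned in the summary enters through exactly this kind of collective sum: accumulating hundreds of thousands of float32 weights rounds the cumulative sums that the systematic and multinomial schemes search over, whereas the Metropolis scheme touches only ratios of individual weights and is unaffected by the length of the weight vector.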