A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks
An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data. Recent work in population genetics has centered on designing inference methods for relatively simple model classes, and few scalable general-purpose...
Saved in:
Main Authors | , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published |
16.02.2018
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | An explosion of high-throughput DNA sequencing in the past decade has led to
a surge of interest in population-scale inference with whole-genome data.
Recent work in population genetics has centered on designing inference methods
for relatively simple model classes, and few scalable general-purpose inference
techniques exist for more realistic, complex models. To achieve this, two
inferential challenges need to be addressed: (1) population data are
exchangeable, calling for methods that efficiently exploit the symmetries of
the data, and (2) computing likelihoods is intractable as it requires
integrating over a set of correlated, extremely high-dimensional latent
variables. These challenges are traditionally tackled by likelihood-free
methods that use scientific simulators to generate datasets and reduce them to
hand-designed, permutation-invariant summary statistics, often leading to
inaccurate inference. In this work, we develop an exchangeable neural network
that performs summary statistic-free, likelihood-free inference. Our framework
can be applied in a black-box fashion across a variety of simulation-based
tasks, both within and outside biology. We demonstrate the power of our
approach on the recombination hotspot testing problem, outperforming the
state-of-the-art. |
---|---|
DOI: | 10.48550/arxiv.1802.06153 |