A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks


Bibliographic Details
Published in	bioRxiv
Main Authors	Chan, Jeffrey; Perrone, Valerio; Spence, Jeffrey P; Jenkins, Paul A; Mathieson, Sara; Song, Yun S
Format	Paper
Language	English
Published	Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 05.11.2018
Subjects	DNA sequencing; Gene loci; Genomes; Neural networks; Population genetics; Recombination; Recombination hot spots
Summary:	An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data. Recent work in population genetics has centered on designing inference methods for relatively simple model classes, and few scalable general-purpose inference techniques exist for more realistic, complex models. Developing such techniques requires addressing two inferential challenges: (1) population data are exchangeable, calling for methods that efficiently exploit the symmetries of the data, and (2) computing likelihoods is intractable, as it requires integrating over a set of correlated, extremely high-dimensional latent variables. These challenges are traditionally tackled by likelihood-free methods that use scientific simulators to generate datasets and reduce them to hand-designed, permutation-invariant summary statistics, often leading to inaccurate inference. In this work, we develop an exchangeable neural network that performs summary statistic-free, likelihood-free inference. Our framework can be applied in a black-box fashion across a variety of simulation-based tasks, both within and outside biology. We demonstrate the power of our approach on the recombination hotspot testing problem, outperforming the state-of-the-art.
DOI:	10.1101/267211
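
Note:	The summary describes population data as exchangeable, meaning the order of the sampled individuals carries no information. The sketch below illustrates one generic way to build a permutation-invariant function: apply the same weights to every individual, then pool with a symmetric operation (here, the mean). It is not the authors' architecture, which the summary does not specify. The function name `exchangeable_score`, the layer sizes, and the random, untrained weights are all assumptions made for illustration.

```python
import math
import random


def exchangeable_score(haplotypes, w_row, w_out):
    """Permutation-invariant score for a sample of haplotypes (illustrative only).

    haplotypes: list of rows, one per individual, each a list of 0/1 alleles.
    w_row:      list of hidden-unit weight vectors, each as long as a row.
    w_out:      one output weight per hidden unit.
    """
    # Apply the SAME weights to every individual, followed by a ReLU.
    per_row = [
        [max(0.0, sum(a * w for a, w in zip(row, unit))) for unit in w_row]
        for row in haplotypes
    ]
    # Symmetric pooling: the mean does not depend on the order of the rows.
    n = len(per_row)
    pooled = [sum(feats[j] for feats in per_row) / n for j in range(len(w_row))]
    # Map the pooled features to a single value in (0, 1).
    logit = sum(p * w for p, w in zip(pooled, w_out))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical toy data: 8 individuals, 20 sites, 8 hidden units.
rng = random.Random(0)
haplotypes = [[rng.randint(0, 1) for _ in range(20)] for _ in range(8)]
w_row = [[rng.gauss(0.0, 0.1) for _ in range(20)] for _ in range(8)]
w_out = [rng.gauss(0.0, 0.1) for _ in range(8)]

score = exchangeable_score(haplotypes, w_row, w_out)

# Reordering the individuals should leave the score unchanged
# (up to floating-point rounding).
shuffled = haplotypes[:]
rng.shuffle(shuffled)
score_shuffled = exchangeable_score(shuffled, w_row, w_out)
```

Because the pooling step is symmetric, `score` and `score_shuffled` agree up to rounding. In a likelihood-free setting, weights like these would be trained on simulator output rather than drawn at random.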