Probabilistic inference of viral quasispecies subject to recombination


Bibliographic Details
Published in Journal of computational biology Vol. 20; no. 2; p. 113
Main Authors Töpfer, Armin; Zagordi, Osvaldo; Prabhakaran, Sandhya; Roth, Volker; Halperin, Eran; Beerenwinkel, Niko
Format Journal Article
Language English
Published United States 01.02.2013
Summary: RNA viruses exist in their hosts as populations of different but related strains. The virus population, often called quasispecies, is shaped by a combination of genetic change and natural selection. Genetic change is due to both point mutations and recombination events. We present a jumping hidden Markov model that describes the generation of viral quasispecies and a method to infer its parameters from next-generation sequencing data. The model introduces position-specific probability tables over the sequence alphabet to explain the diversity that can be found in the population at each site. Recombination events are indicated by a change of state, allowing a single observed read to originate from multiple sequences. We present a specific implementation of the expectation maximization (EM) algorithm to find maximum a posteriori estimates of the model parameters and a method to estimate the distribution of viral strains in the quasispecies. The model is validated on simulated data, showing the advantage of explicitly taking the recombination process into account, and applied to reads obtained from a clinical HIV sample.
ISSN: 1557-8666
DOI: 10.1089/cmb.2012.0232
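
Illustrative sketch (not part of the catalog record or the authors' software): the summary describes a jumping hidden Markov model in which each generating sequence carries position-specific probability tables over the alphabet, and a change of state within a read marks a recombination event. The Python below is a minimal toy version of that idea under stated assumptions. The function names, the uniform start distribution, the single `jump_prob` parameter, and the `error` parameter used to build the tables are all hypothetical choices made for illustration. It samples reads and scores them with the standard forward algorithm; the EM parameter estimation mentioned in the summary is not implemented.

```python
import math
import random

ALPHABET = "ACGT"


def make_tables(haplotypes, error=0.01):
    """Build one probability table per generator and per site.

    Assumption for illustration: each table puts mass 1 - error on the
    generator's own base and spreads `error` evenly over the other three.
    """
    return [
        [{b: (1.0 - error) if b == base else error / 3.0 for b in ALPHABET}
         for base in hap]
        for hap in haplotypes
    ]


def transition_probs(k, jump_prob):
    """Return (stay, move): probability of keeping the current generator,
    and of jumping to one specific other generator."""
    if k == 1:
        return 1.0, 0.0
    return 1.0 - jump_prob, jump_prob / (k - 1)


def sample_read(tables, start, length, jump_prob, rng):
    """Sample a read and its hidden path; a state change is a recombination."""
    k = len(tables)
    state = rng.randrange(k)  # uniform start distribution (assumption)
    read, path = [], []
    for pos in range(start, start + length):
        if pos > start and k > 1 and rng.random() < jump_prob:
            state = rng.choice([s for s in range(k) if s != state])
        probs = tables[state][pos]
        weights = [probs[b] for b in ALPHABET]
        read.append(rng.choices(ALPHABET, weights=weights)[0])
        path.append(state)
    return "".join(read), path


def read_log_likelihood(read, tables, start, jump_prob):
    """Forward algorithm: log-probability of `read`, summed over all paths.

    Assumes every table entry is strictly positive (error > 0), so the
    running total never reaches zero.
    """
    k = len(tables)
    stay, move = transition_probs(k, jump_prob)
    forward = [tables[s][start][read[0]] / k for s in range(k)]
    log_lik = 0.0
    for offset in range(1, len(read)):
        # Rescale to avoid underflow on long reads.
        total = sum(forward)
        log_lik += math.log(total)
        forward = [f / total for f in forward]
        pos = start + offset
        forward = [
            sum(forward[r] * (stay if r == s else move) for r in range(k))
            * tables[s][pos][read[offset]]
            for s in range(k)
        ]
    return log_lik + math.log(sum(forward))
```

As a usage example, with `tables = make_tables(["AAAA", "CCCC"])` the mosaic read `"AACC"` scores a higher log-likelihood under `jump_prob=0.1` than under `jump_prob=1e-6`, which is the toy analogue of a single read originating from more than one sequence.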