Efficient computation of the phylogenetic likelihood function on multi-gene alignments and multi-core architectures
Published in | Philosophical transactions of the Royal Society of London. Series B. Biological sciences Vol. 363; no. 1512; pp. 3977 - 3984 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | London: The Royal Society, 27 December 2008 |
Summary: | The continuous accumulation of sequence data, for example, due to novel wet-laboratory techniques such as pyrosequencing, coupled with the increasing popularity of multi-gene phylogenies and emerging multi-core processor architectures that face problems of cache congestion, poses new challenges with respect to the efficient computation of the phylogenetic maximum-likelihood (ML) function. Here, we propose two approaches that can significantly speed up likelihood computations that typically represent over 95 per cent of the computational effort conducted by current ML or Bayesian inference programs. Initially, we present a method and an appropriate data structure to efficiently compute the likelihood score on 'gappy' multi-gene alignments. By 'gappy' we denote sampling-induced gaps owing to missing sequences in individual genes (partitions), i.e. not real alignment gaps. A first proof-of-concept implementation in RAxML indicates that this approach can accelerate inferences on large and gappy alignments by approximately one order of magnitude. Moreover, we present insights and initial performance results on multi-core architectures obtained during the transition from an OpenMP-based to a Pthreads-based fine-grained parallelization of the ML function. |
---|---|
Bibliography: | Discussion Meeting Issue 'Statistical and computational challenges in molecular phylogenetics and evolution' organized by Ziheng Yang and Nick Goldman. ArticleID: rstb20080163; istex: A218241A4FAB42899867EA54ACF64B65664C5F71; ark:/67375/V84-KNM8N9X3-8 |
ISSN: | 0962-8436, 1471-2970 |
DOI: | 10.1098/rstb.2008.0163 |
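
The first approach in the summary relies on the fact that, in a 'gappy' multi-gene alignment, many taxa contribute no sequence data to a given partition. The summary does not describe the data structure used in RAxML, so the Python sketch below is only one plausible reading of the idea and not the authors' implementation. It assumes a rooted binary tree written as nested tuples and a hypothetical `present` mapping from each partition to the taxa sampled for it. For each partition it counts an inner node as needing a likelihood-vector update only when both of its child subtrees contain at least one sampled taxon, and compares that count with the naive scheme that updates every inner node for every partition. All names, the tree, and the sampling pattern are invented for illustration.

```python
def count_updates(tree, present):
    """Count inner-node likelihood-vector updates over all partitions.

    tree    -- rooted binary tree as nested 2-tuples of taxon names (assumed format)
    present -- dict: partition name -> set of taxa that have data in that partition
    Returns (naive, pruned): updates when every inner node is evaluated for
    every partition, versus only nodes whose two child subtrees both hold data.
    """
    def visit(node, taxa_with_data):
        # Returns (subtree_has_data, inner_nodes_total, inner_nodes_needed).
        if isinstance(node, str):
            return node in taxa_with_data, 0, 0
        left, right = node
        l_has, l_total, l_needed = visit(left, taxa_with_data)
        r_has, r_total, r_needed = visit(right, taxa_with_data)
        # A node joins two data-bearing subtrees only if both sides have data;
        # otherwise it is skipped for this partition.
        needed_here = 1 if (l_has and r_has) else 0
        return (l_has or r_has,
                l_total + r_total + 1,
                l_needed + r_needed + needed_here)

    naive = 0
    pruned = 0
    for taxa_with_data in present.values():
        _, total, needed = visit(tree, taxa_with_data)
        naive += total
        pruned += needed
    return naive, pruned


# Hypothetical six-taxon tree and a gappy sampling pattern over three genes.
tree = ((("A", "B"), "C"), ("D", ("E", "F")))
present = {
    "gene1": {"A", "B", "C", "D", "E", "F"},  # fully sampled
    "gene2": {"A", "B"},                      # four taxa missing
    "gene3": {"D", "F"},                      # four taxa missing
}
naive, pruned = count_updates(tree, present)
print(f"naive updates: {naive}, updates after skipping empty subtrees: {pruned}")
```

In this toy case a partition with k sampled taxa needs at most k - 1 inner-node updates, so the saving grows with the proportion of missing sequences. This matches the direction of the speed-up reported in the summary, although the sketch makes no claim about its size.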