Identifying equally scoring trees in phylogenomics with incomplete data using Gentrius


Bibliographic Details
Published in: bioRxiv
Main Authors: Chernomor, Olga; Elgert, Christiane; von Haeseler, Arndt
Format: Paper
Language: English
Published: Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 20.01.2023
Summary: Phylogenetic trees are routinely built from huge yet incomplete multi-locus datasets, which often leads to multiple equally scoring trees under many common criteria. Because typical tree inference software outputs only a single tree, identifying all trees with an identical score is a challenge in phylogenomics. Here, we introduce Gentrius, an efficient algorithm that tackles this problem. We showed on simulated and biological datasets that Gentrius generates millions of trees within seconds. Depending on the distribution of missing data across species and loci and on the inferred phylogeny, the number of equally good trees varies tremendously. The strict consensus tree computed from them displays all the branches unaffected by the pattern of missing data. Thus, Gentrius provides an important systematic assessment of phylogenetic trees inferred from incomplete data.
Competing Interest Statement: The authors have declared no competing interest.
DOI: 10.1101/2023.01.19.524678
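
The summary states that the strict consensus of the equally scoring trees displays the branches unaffected by missing data. As a rough illustration of what a strict consensus retains, the sketch below intersects the non-trivial splits (bipartitions) of a small set of trees. It is not Gentrius and does not reproduce its algorithm. The nested-tuple tree encoding, the function names, and the toy trees are assumptions made for this example, and all trees are assumed to share the same taxon set.

```python
from functools import reduce


def leaf_sets(tree, out):
    """Return the leaves below `tree`; record each internal node's leaf set in `out`."""
    if isinstance(tree, str):  # a leaf is a taxon name
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.add(leaves)
    return leaves


def nontrivial_splits(tree):
    """Return the non-trivial splits of a tree, treating it as unrooted."""
    clusters = set()
    taxa = leaf_sets(tree, clusters)
    ref = min(taxa)  # reference taxon used to pick one side of every split
    splits = set()
    for cluster in clusters:
        # Always store the side that does NOT contain the reference taxon.
        side = taxa - cluster if ref in cluster else cluster
        if 2 <= len(side) <= len(taxa) - 2:  # skip trivial splits
            splits.add(side)
    return splits


def strict_consensus_splits(trees):
    """Splits present in every input tree (assumes a non-empty list of trees)."""
    return reduce(set.intersection, (nontrivial_splits(t) for t in trees))


# Two hypothetical trees that disagree on the placement of A, B and C
# but agree that D and E group together.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
shared = strict_consensus_splits([tree_1, tree_2])
```

In this toy case only the split separating D and E from the remaining taxa appears in both trees, so it is the only internal branch the strict consensus would keep.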