wQFM-TREE: highly accurate and scalable quartet-based species tree inference from gene trees

Bibliographic Details
Published in Bioinformatics Advances Vol. 5; no. 1; p. vbaf053
Main Authors Rafi, Abdur; Rumi, Ahmed Mahir Sultan; Hakim, Sheikh Azizul; Sohaib; Tahmid, Md Toki; Momin, Rabib Jahin Ibn; Zaman, Tanjeem Azwad; Reaz, Rezwana; Bayzid, Md Shamsuzzoha
Format Journal Article
Language English
Published England Oxford University Press 01.01.2025
ISSN 2635-0041
DOI 10.1093/bioadv/vbaf053

Summary: Summary methods are becoming increasingly popular for species tree estimation from multi-locus data in the presence of gene tree discordance. Accurate Species TRee Algorithm (ASTRAL), a leading method in this class, solves the Maximum Quartet Support Species Tree problem within a constrained solution space, while heuristics like Weighted Quartet Fiduccia-Mattheyses (wQFM) and Weighted Quartet MaxCut (wQMC) use weighted quartets and a divide-and-conquer strategy. Recent studies showed wQFM to be more accurate than ASTRAL and wQMC, though its scalability is hindered by the computational demands of explicitly generating and weighting quartets. Here, we introduce wQFM-TREE, a novel summary method that enhances wQFM by avoiding explicit quartet generation and weighting, enabling its application to large datasets. Extensive simulations under diverse and challenging model conditions, with hundreds or thousands of taxa and genes, consistently demonstrate that wQFM-TREE matches or improves upon the accuracy of ASTRAL. It outperformed ASTRAL in 25 of 27 model conditions (statistically significant in 20) involving 200-1000 taxa. Moreover, applying wQFM-TREE to re-analyze the green plant dataset from the One Thousand Plant Transcriptomes Initiative produced a tree highly congruent with established evolutionary relationships of plants. wQFM-TREE's remarkable accuracy and scalability make it a strong competitor to leading methods. Its algorithmic and combinatorial innovations also enhance quartet-based computations, advancing phylogenetic estimation. wQFM-TREE is freely available in open source form at https://github.com/abdur-rafi/wQFM-TREE.
Abdur Rafi and Ahmed Mahir Sultan Rumi contributed equally.