Virus classification for viral genomic fragments using PhaGCN2

Bibliographic Details
Published in Briefings in bioinformatics Vol. 24; no. 1
Main Authors Jiang, Jing-Zhe, Yuan, Wen-Guang, Shang, Jiayu, Shi, Ying-Hui, Yang, Li-Ling, Liu, Min, Zhu, Peng, Jin, Tao, Sun, Yanni, Yuan, Li-Hong
Format Journal Article
Language English
Published England Oxford University Press 19.01.2023
Oxford Publishing Limited (England)
Summary: Viruses are the most ubiquitous and diverse entities in the biome. Due to the rapid growth of newly identified viruses, there is an urgent need for accurate and comprehensive virus classification, particularly for novel viruses. Here, we present PhaGCN2, which can rapidly classify the taxonomy of viral sequences at the family level and supports the visualization of the associations of all families. We evaluate the performance of PhaGCN2 and compare it with the state-of-the-art virus classification tools, such as vConTACT2, CAT and VPF-Class, using the widely accepted metrics. The results show that PhaGCN2 largely improves the precision and recall of virus classification, increases the number of classifiable virus sequences in the Global Ocean Virome dataset (v2.0) by four times and classifies more than 90% of the Gut Phage Database. PhaGCN2 makes it possible to conduct high-throughput and automatic expansion of the database of the International Committee on Taxonomy of Viruses. The source code is freely available at https://github.com/KennthShang/PhaGCN2.0.
ISSN: 1467-5463
1477-4054
DOI: 10.1093/bib/bbac505