TransIntegrator: capture nearly full protein-coding transcript variants via integrating Illumina and PacBio transcriptomes
Abstract Genes have the ability to produce transcript variants that perform specific cellular functions. However, accurately detecting all transcript variants remains a long-standing challenge, especially when working with poorly annotated genomes or without a known genome. To address this issue, we...
Saved in:
Published in | Briefings in bioinformatics Vol. 24; no. 6 |
---|---|
Main Authors | , , , , , , , , , , |
Format | Journal Article |
Language | English |
Published |
Oxford
Oxford University Press
22.09.2023
Oxford Publishing Limited (England) |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Abstract
Genes have the ability to produce transcript variants that perform specific cellular functions. However, accurately detecting all transcript variants remains a long-standing challenge, especially when working with poorly annotated genomes or without a known genome. To address this issue, we have developed a new computational method, TransIntegrator, which enables transcriptome-wide detection of novel transcript variants. For this, we determined 10 Illumina sequencing transcriptomes and a PacBio full-length transcriptome for consecutive embryo development stages of amphioxus, a species of great evolutionary importance. Based on the transcriptomes, we employed TransIntegrator to create a comprehensive transcript variant library, namely iTranscriptome. The resulting iTrancriptome contained 91 915 distinct transcript variants, with an average of 2.4 variants per gene. This substantially improved current amphioxus genome annotation by expanding the number of genes from 21 954 to 38 777. Further analysis manifested that the gene expansion was largely ascribed to integration of multiple Illumina datasets instead of involving the PacBio data. Moreover, we demonstrated an example application of TransIntegrator, via generating iTrancriptome, in aiding accurate transcriptome assembly, which significantly outperformed other hybrid methods such as IDP-denovo and Trinity. For user convenience, we have deposited the source codes of TransIntegrator on GitHub as well as a conda package in Anaconda. In summary, this study proposes an affordable but efficient method for reliable transcriptomic research in most species. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
ISSN: | 1467-5463 1477-4054 |
DOI: | 10.1093/bib/bbad334 |