The Impact of cDNA Normalization on Long-Read Sequencing of a Complex Transcriptome

Normalization of cDNA is widely used to improve the coverage of rare transcripts in analysis of transcriptomes employing next-generation sequencing. Recently, long-read technology has been emerging as a powerful tool for sequencing and construction of transcriptomes, especially for complex genomes c...

Full description

Saved in:
Bibliographic Details
Published inFrontiers in genetics Vol. 10; p. 654
Main Authors Hoang, Nam V, Furtado, Agnelo, Perlo, Virginie, Botha, Frederik C, Henry, Robert J
Format Journal Article
LanguageEnglish
Published Switzerland Frontiers Media S.A 23.07.2019
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Normalization of cDNA is widely used to improve the coverage of rare transcripts in analysis of transcriptomes employing next-generation sequencing. Recently, long-read technology has been emerging as a powerful tool for sequencing and construction of transcriptomes, especially for complex genomes containing highly similar transcripts and transcript-spliced isoforms. Here, we analyzed the transcriptome of sugarcane, a highly polyploidy plant genome, by PacBio isoform sequencing (Iso-Seq) of two different cDNA library preparations, with and without a normalization step. The results demonstrated that, while the two libraries included many of the same transcripts, many longer transcripts were removed, and many new generally shorter transcripts were detected by normalization. For the same input cDNA and data yield, the normalized library recovered more total transcript isoforms and number of predicted gene families and orthologous groups, resulting in a higher representation for the sugarcane transcriptome, compared to the non-normalized library. The non-normalized library, on the other hand, included a wider transcript length range with more longer transcripts above ∼1.25 kb and more transcript isoforms per gene family and gene ontology terms per transcript. A large proportion of the unique transcripts comprising ∼52% of the normalized library were expressed at a lower level than the unique transcripts from the non-normalized library, across three tissue types tested including leaf, stalk, and root. About 83% of the total 5,348 predicted long noncoding transcripts was derived from the normalized library, of which ∼80% was derived from the lowly expressed fraction. Functional annotation of the unique transcripts suggested that each library enriched different functional transcript fractions. This demonstrated the complementation of the two approaches in obtaining a complete transcriptome of a complex genome at the sequencing depth used in this study.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
Edited by: Ishaan Gupta, Cornell University, United States
This article was submitted to Genomic Assay Technology, a section of the journal Frontiers in Genetics
Reviewed by: Prashanth N. Suravajhala, Birla Institute of Scientific Research, India; Milind Ratnaparkhe, ICAR Indian Institute of Soybean Research, India
ISSN:1664-8021
1664-8021
DOI:10.3389/fgene.2019.00654