Optimized sequencing depth and de novo assembler for deeply reconstructing the transcriptome of the tea plant, an economically important plant species

Bibliographic Details
Published in	BMC Bioinformatics Vol. 20; no. 1; p. 553
Main Authors	Li, Fang-Dong; Tong, Wei; Xia, En-Hua; Wei, Chao-Ling
Format	Journal Article
Language	English
Published	England: BioMed Central Ltd, 06.11.2019
Subjects	Alcoholic beverages; Beverages; Camellia sinensis; Camellia sinensis - genetics; Catechin; de novo assembly; Gene Expression Profiling; Gene Expression Regulation, Plant; Genome, Plant; Genomes; Genomics; Health; High-Throughput Nucleotide Sequencing - methods; Metabolites; Plant genetics; Plant Leaves - genetics; Plant metabolites; RNA; RNA, Messenger - genetics; RNA, Messenger - metabolism; Sequencing depth; Tea plant; Transcriptome; Transcriptome - genetics
Summary:	Tea is the oldest of the world's non-alcoholic beverages and among the most popular, with important economic, health and cultural value. It is commonly produced from the leaves of the tea plant (Camellia sinensis), which belongs to the genus Camellia of the family Theaceae. In the last decade, many studies have generated transcriptomes of tea plants at different developmental stages or under abiotic and/or biotic stresses to investigate the genetic basis of the secondary metabolites that determine tea quality. However, these results differed widely, particularly in the total number of reconstructed transcripts and the quality of the assembled transcriptomes. These differences largely result from limited knowledge of the optimal sequencing depth and assembler for transcriptome assembly of plant species with structurally complex genomes. We used different amounts of RNA-sequencing data, ranging from 4 to 84 Gb, to assemble the tea plant transcriptome with five well-known and representative transcript assemblers. Although the total number of assembled transcripts increased with the amount of sequencing data, the proportion of unassembled transcripts became saturated, as revealed by the plant BUSCO datasets. Among the five assemblers, the Bridger package showed the best performance in both assembly completeness and accuracy, as evaluated by the BUSCO datasets and genome alignment. In addition, Bridger and BinPacker had the shortest runtimes, followed by SOAPdenovo and Trans-ABySS. The present study compares the performance of five representative transcript assemblers and investigates the key factors that affect the assembly quality of the tea plant transcriptome. It should help the tea research community obtain better sequencing and assembly of tea plant transcriptomes under conditions of interest and may thus help to answer major biological questions currently facing the tea industry.
ISSN:	1471-2105
DOI:	10.1186/s12859-019-3166-x