5′ Long Serial Analysis of Gene Expression (LongSAGE) and 3′ LongSAGE for Transcriptome Characterization and Genome Annotation


Bibliographic Details
Published in Proceedings of the National Academy of Sciences - PNAS Vol. 101; no. 32; pp. 11701-11706
Main Authors Wei, Chia-Lin; Ng, Patrick; Chiu, Kuo Ping; Wong, Chee Hong; Ang; Lipovich, Leonard; Liu, Edison T.; Ruan, Yijun; White, Raymond L.
Format Journal Article
Language English
Published United States National Academy of Sciences 10.08.2004
National Acad Sciences
Summary: Complete genome annotation relies on precise identification of transcription units bounded by a transcription initiation site (TIS) and a polyadenylation site (PAS). To facilitate this process, we developed a set of two complementary methods, 5′ Long serial analysis of gene expression (LS) and 3′LS. These analyses are based on the original SAGE and LS methods coupled with full-length cDNA cloning, and enable the high-throughput extraction of the first and the last 20 bp of each transcript. We demonstrate that the mapping of 5′LS and 3′LS tags to the genome allows the localization of TIS and PAS. By using 537 tag pairs mapping to the region of known genes, we confirmed that >90% of the tag pairs were appropriately assigned to the first and last exons. Moreover, by using tag sequences as primers for RT-PCRs, we were able to recover putative full-length transcripts in 81% of the attempts. This large-scale generation of transcript terminal tags is at least 20-40 times more efficient than full-length cDNA cloning and sequencing in the identification of complete transcription units. The apparent precision and deep coverage make 5′LS and 3′LS an advanced approach for genome annotation through whole-transcriptome characterization.
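The tag-pair idea in the summary can be illustrated with a small in silico sketch. This is only a computational analogy under stated assumptions, not the laboratory protocol and not the authors' mapping pipeline: the sequences are invented toy data, the function names (`terminal_tags`, `map_unique`, `locate_transcription_unit`) are hypothetical, and mapping is reduced to exact, unique, plus-strand string matching. Only the 20 bp tag length comes from the summary.

```python
from typing import Optional, Tuple

TAG_LEN = 20  # tag length stated in the summary


def terminal_tags(transcript: str, tag_len: int = TAG_LEN) -> Tuple[str, str]:
    """Return the first and last tag_len bases of a transcript (5' tag, 3' tag)."""
    if len(transcript) < tag_len:
        raise ValueError("transcript is shorter than the tag length")
    return transcript[:tag_len], transcript[-tag_len:]


def map_unique(genome: str, tag: str) -> Optional[int]:
    """Return the 0-based start of the tag if it matches exactly once, else None."""
    first = genome.find(tag)
    if first == -1 or genome.find(tag, first + 1) != -1:
        return None  # unmapped or ambiguous
    return first


def locate_transcription_unit(
    genome: str, five_tag: str, three_tag: str
) -> Optional[Tuple[int, int]]:
    """Return (start, end) with end exclusive: putative TIS to putative PAS."""
    start = map_unique(genome, five_tag)
    last = map_unique(genome, three_tag)
    if start is None or last is None or last < start:
        return None
    return start, last + len(three_tag)


# Toy data (invented): a two-exon gene with one intron and flanking sequence.
upstream = "C" * 10
exon1 = "ATGGCGTACGTTAGCCTAGGATCCGA"
intron = "GTAAG" + "T" * 10 + "CAG"
exon2 = "TTGACCGGTAAACTGCATCGGATTACA"
downstream = "G" * 10

genome = upstream + exon1 + intron + exon2 + downstream
transcript = exon1 + exon2  # spliced transcript: the intron is absent

five_tag, three_tag = terminal_tags(transcript)
unit = locate_transcription_unit(genome, five_tag, three_tag)
print("5' tag:", five_tag)
print("3' tag:", three_tag)
print("putative transcription unit (start, end):", unit)
```

Because each tag lies within a single terminal exon, the pair still brackets the whole unit on the genome even though the spliced transcript itself does not align contiguously. Real tag mapping would also need to handle both strands, sequencing errors, and tags that match multiple loci.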
Note. While this manuscript was in preparation, Shiraki et al. (32) published their work on cap analysis of gene expression (CAGE), which is very similar to the 5′LS method described here. It should be noted that whereas both CAGE and 5′LS are useful for identifying TISs and possibly promoter regions, the combined 5′LS and 3′LS mapping strategy outlined here has an obvious advantage: it identifies new transcription units with greater confidence than single-tag mapping.
To whom correspondence should be addressed at: Cloning and Sequencing Group, Genome Institute of Singapore, 60 Biopolis Street, Genome 02-01, Singapore 138672. E-mail: ruanyj@gis.a-star.edu.sg.
C.-L.W. and P.N. contributed equally to this work.
Communicated by Raymond L. White, University of California, San Francisco, CA, May 20, 2004
Abbreviations: SAGE, serial analysis of gene expression; LS, LongSAGE; MPSS, massively parallel signature sequencing; TIS, transcription initiation site; PAS, polyadenylation site; UCSC, University of California, Santa Cruz; tp, tag position.
ISSN: 0027-8424, 1091-6490
DOI: 10.1073/pnas.0403514101