Ranked choice voting for representative transcripts with TRaCE


Bibliographic Details
Published in	Bioinformatics Vol. 38; no. 1; pp. 261-264
Main Authors	Olson, Andrew J; Ware, Doreen
Format	Journal Article
Language	English
Published	England Oxford University Press 22.12.2021
Subjects	Applications Notes; bioinformatics; genes; loci; Politics; Protein Isoforms; RNA-Seq; sequence analysis

Summary:	Genome sequencing projects annotate protein-coding gene models with multiple transcripts, aiming to represent all of the available transcript evidence. However, downstream analyses often operate on only one representative transcript per gene locus, sometimes known as the canonical transcript. To choose canonical transcripts, Transcript Ranking and Canonical Election (TRaCE) holds an ‘election’ in which a set of RNA-seq samples rank transcripts by annotation edit distance. These sample-specific votes are tallied along with other criteria such as protein length and InterPro domain coverage. The winner is selected as the canonical transcript, but the election proceeds through multiple rounds of voting to order all the transcripts by relevance. Based on the set of expression data provided, TRaCE can identify the most common isoforms from a broad expression atlas or prioritize alternative transcripts expressed in specific contexts. Availability and implementation: Transcript ranking code can be found on GitHub at https://github.com/warelab/TRaCE. Supplementary information: Supplementary data are available at Bioinformatics online.
ISSN:	1367-4803, 1367-4811, 1460-2059
DOI:	10.1093/bioinformatics/btab542
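
The election described in the summary can be illustrated with a small sketch. This is not the TRaCE implementation: the function name `rank_transcripts`, the input layout, the plurality rule, and the tie-break order (domain coverage, then protein length) are all assumptions made for illustration, and the sample names and numbers are invented. Each sample votes for the remaining transcript with the lowest annotation edit distance (AED); the round's winner is removed and the vote repeats until every transcript is ordered.

```python
from collections import Counter


def rank_transcripts(aed_by_sample, protein_length, domain_coverage):
    """Order transcripts by repeated voting rounds (hypothetical simplification).

    aed_by_sample: {sample: {transcript: AED}}, lower AED = better supported.
    protein_length, domain_coverage: {transcript: value}, used only as tie-breaks.
    """
    remaining = set(protein_length)
    ranking = []
    while remaining:
        votes = Counter()
        for aed in aed_by_sample.values():
            scored = [t for t in remaining if t in aed]
            if scored:
                # Each sample casts one vote for its best remaining transcript.
                votes[min(scored, key=lambda t: (aed[t], t))] += 1
        # Most votes wins; ties fall back to domain coverage, then length, then ID.
        winner = min(
            remaining,
            key=lambda t: (-votes[t], -domain_coverage.get(t, 0), -protein_length[t], t),
        )
        ranking.append(winner)
        remaining.remove(winner)
    return ranking


# Invented example data for one gene locus with three transcripts.
aed_by_sample = {
    "leaf": {"T1": 0.10, "T2": 0.30, "T3": 0.60},
    "root": {"T1": 0.20, "T2": 0.15, "T3": 0.70},
    "seed": {"T1": 0.05, "T2": 0.40, "T3": 0.50},
}
protein_length = {"T1": 350, "T2": 420, "T3": 120}
domain_coverage = {"T1": 300, "T2": 310, "T3": 80}

ranking = rank_transcripts(aed_by_sample, protein_length, domain_coverage)
print(ranking)  # ['T1', 'T2', 'T3']; ranking[0] plays the role of the canonical transcript
```

Changing the set of samples passed in changes the outcome, which mirrors the summary's point that a broad expression atlas favors the most common isoform while a context-specific set can promote an alternative transcript.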