OMGene: mutual improvement of gene models through optimisation of evolutionary conservation

Bibliographic Details
Published in BMC Genomics Vol. 19; no. 1; p. 307
Main Authors Dunne, Michael P; Kelly, Steven
Format Journal Article
Language English
Published England BioMed Central Ltd 27.04.2018
BioMed Central
BMC
Summary: The accurate determination of the genomic coordinates for a given gene - its gene model - is of vital importance to the utility of its annotation, and the accuracy of bioinformatic analyses derived from it. Currently available methods of computational gene prediction, while on the whole successful, frequently disagree on the model for a given predicted gene, with some or all of the variant gene models often failing to match the biologically observed structure. Many prediction methods can be bolstered by using experimental data such as RNA-seq. However, these resources are not always available, and rarely give a comprehensive portrait of an organism's transcriptome due to temporal and tissue-specific expression profiles. Orthology between genes provides evolutionary evidence to guide the construction of gene models. OMGene (Optimise My Gene) aims to improve gene model accuracy in the absence of experimental data by optimising the consistency of multiple sequence alignments of orthologous genes from multiple species. Using RNA-seq data sets from plants, mammals, and fungi, considering intron/exon junction representation and exon coverage, and assessing the intra-orthogroup consistency of subcellular localisation predictions, we demonstrate the utility of OMGene for improving gene models in annotated genomes. We show that significant improvements in the accuracy of gene model annotations can be made, both in established and in de novo annotated genomes, by leveraging information from multiple species.
ISSN:1471-2164
DOI:10.1186/s12864-018-4704-z