The Proteome Folding Project: proteome-scale prediction of structure and function

Bibliographic Details
Published in	Genome Research, Vol. 21, no. 11, pp. 1981-1994
Main Authors	Drew, Kevin; Winters, Patrick; Butterfoss, Glenn L; Berstis, Viktors; Uplinger, Keith; Armstrong, Jonathan; Riffle, Michael; Schweighofer, Erik; Bovermann, Bill; Goodlett, David R; Davis, Trisha N; Shasha, Dennis; Malmström, Lars; Bonneau, Richard
Format	Journal Article
Language	English
Published	United States: Cold Spring Harbor Laboratory Press, 01.11.2011
Subjects	Animals; Arabidopsis; Chorismate Mutase - chemistry; Deinococcus - metabolism; Deinococcus - radiation effects; Drosophila Proteins - chemistry; Escherichia coli; Genome; Glucosyltransferases - chemistry; Humans; Mice; Molecular Sequence Annotation; Nuclear Proteins - chemistry; Nuclear Proteins - classification; Oryza sativa; Plasmodium vivax - metabolism; Protein Conformation; Protein Folding; Protein Structure, Tertiary; Proteome - chemistry; Protozoan Proteins - chemistry; Quality Control; Reproducibility of Results; Resource; Transglutaminases - chemistry; User-Computer Interface
Summary:	The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition, and grid-computing-enabled de novo structure prediction. We predict protein domain boundaries and three-dimensional (3D) structures for protein domains from 94 genomes (including human, Arabidopsis, rice, mouse, fly, yeast, Escherichia coli, and worm). De novo structure predictions were distributed on a grid of more than 1.5 million CPUs worldwide (World Community Grid). We generated significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions.
ISSN:	1088-9051, 1549-5469
DOI:	10.1101/gr.121475.111