Statistically rigorous automated protein annotation
Published in | Bioinformatics Vol. 20; no. 7; pp. 1066 - 1073 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Oxford, Oxford University Press, 01.05.2004; Oxford Publishing Limited (England) |
Summary: | Motivation: Molecular biologists routinely assign putative functional annotations to proteins by comparative analysis against pre-defined experimental annotations. The number and statistical significance of these assignments remain a challenge in this era of high-throughput proteomics. A combined statistical method is described that enables robust, automated protein annotation by reliably expanding existing annotation sets. The method requires an existing clustering scheme based on relevant experimental information (e.g. sequence identity, keywords or gene expression data). It assigns new proteins to these clusters with a measure of reliability, and it can also provide human reviewers with a reliability score for both new and previously classified proteins. Results: A dataset of 27 000 annotated Protein Data Bank (PDB) polypeptide chains (of 36 000 chains currently in the PDB) was generated from 23 000 chains classified a priori. Availability: PDB annotations and a sample software implementation are freely accessible on the Web at http://pmr.sdsc.edu/go |
---|---|
Bibliography: | Contact: bourne@sdsc.edu; local:bth039; istex:86110E6BA80CD8B923E8C44FE3697E0344DA9E6A; ark:/67375/HXZ-M1DQHWRT-5 |
ISSN: | 1367-4803 1460-2059 1367-4811 |
DOI: | 10.1093/bioinformatics/bth039 |