Statistically rigorous automated protein annotation

Bibliographic Details
Published in Bioinformatics Vol. 20; no. 7; pp. 1066-1073
Main Authors Krebs, Werner G.; Bourne, Philip E.
Format Journal Article
Language English
Published Oxford Oxford University Press 01.05.2004
Oxford Publishing Limited (England)
Summary: Motivation: Assignment of putative protein functional annotation by comparative analysis using pre-defined experimental annotations is performed routinely by molecular biologists. The number and statistical significance of these assignments remains a challenge in this era of high-throughput proteomics. A combined statistical method that enables robust, automated protein annotation by reliably expanding existing annotation sets is described. An existing clustering scheme, based on relevant experimental information (e.g. sequence identity, keywords or gene expression data) is required. The method assigns new proteins to these clusters with a measure of reliability. It can also provide human reviewers with a reliability score for both new and previously classified proteins. Results: A dataset of 27 000 annotated Protein Data Bank (PDB) polypeptide chains (of 36 000 chains currently in the PDB) was generated from 23 000 chains classified a priori. Availability: PDB annotations and sample software implementation are freely accessible on the Web at http://pmr.sdsc.edu/go
Bibliography:local:bth039
istex:86110E6BA80CD8B923E8C44FE3697E0344DA9E6A
Contact: bourne@sdsc.edu
ark:/67375/HXZ-M1DQHWRT-5
ISSN: 1367-4803, 1460-2059, 1367-4811
DOI: 10.1093/bioinformatics/bth039