Probabilistic Interaction Network of Evidence Algorithm and its Application to Complete Labeling of Peak Lists from Protein NMR Spectroscopy

Bibliographic Details
Published in	PLoS computational biology Vol. 5; no. 3; p. e1000307
Main Authors	Bahrami, Arash, Assadi, Amir H., Markley, John L., Eghbalnia, Hamid R.
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 01.03.2009
Subjects	Algorithms; Binding Sites; Biophysics/Experimental Biophysical Methods; Biotechnology/Protein Chemistry and Proteomics; Chemical Biology/Protein Chemistry and Proteomics; Computational Biology; Data Interpretation, Statistical; Isotope Labeling; Isotope Labeling - methods; Magnetic Resonance Spectroscopy; Magnetic Resonance Spectroscopy - methods; Mathematics; NMR; Nuclear magnetic resonance; Protein Binding; Protein Interaction Mapping; Protein Interaction Mapping - methods; Proteins; Proteins - analysis; Proteins - chemistry; Spectrum analysis; Studies

More Information
Summary:	The process of assigning a finite set of tags or labels to a collection of observations, subject to side conditions, is notable for its computational complexity. This labeling paradigm is of theoretical and practical relevance to a wide range of biological applications, including the analysis of data from DNA microarrays, metabolomics experiments, and biomolecular nuclear magnetic resonance (NMR) spectroscopy. We present a novel algorithm, called Probabilistic Interaction Network of Evidence (PINE), that achieves robust, unsupervised probabilistic labeling of data. The computational core of PINE uses estimates of evidence derived from empirical distributions of previously observed data, along with consistency measures, to drive a fictitious system M with Hamiltonian H to a quasi-stationary state that produces probabilistic label assignments for relevant subsets of the data. We demonstrate the successful application of PINE to a key task in protein NMR spectroscopy: that of converting peak lists extracted from various NMR experiments into assignments associated with probabilities for their correctness. This application, called PINE-NMR, is available from a freely accessible computer server (http://pine.nmrfam.wisc.edu). The PINE-NMR server accepts as input the sequence of the protein plus user-specified combinations of data corresponding to an extensive list of NMR experiments; it provides as output a probabilistic assignment of NMR signals (chemical shifts) to sequence-specific backbone and aliphatic side chain atoms plus a probabilistic determination of the protein secondary structure. PINE-NMR can accommodate prior information about assignments or stable isotope labeling schemes. As part of the analysis, PINE-NMR identifies, verifies, and rectifies problems related to chemical shift referencing or erroneous input data. PINE-NMR achieves robust and consistent results that have been shown to be effective in subsequent steps of NMR structure determination.
Bibliography:	Current address: Department of Molecular and Cellular Physiology, University of Cincinnati, Cincinnati, Ohio, United States of America. Conceived and designed the experiments: AB JLM HRE. Performed the experiments: AB. Wrote the paper: AB JLM HRE. Conceived the mathematical approaches used: AHA HRE. Conceived the PINE approach: AB JLM HRE. Developed, tested, and evaluated the software and PINE-NMR website: AB.
ISSN:	1553-734X 1553-7358
DOI:	10.1371/journal.pcbi.1000307
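
Illustrative sketch (not part of the catalog record): the Summary describes combining per-observation evidence with consistency measures and relaxing a system with an energy function until the label probabilities stop changing. The toy below shows that general idea with a simple mean-field relaxation. It is not the PINE algorithm. The function name `mean_field_labeling`, the `penalty` and `temperature` parameters, and the choice of energy (negative log evidence plus a penalty for two observations sharing a label) are all assumptions made for illustration only.

```python
import math


def mean_field_labeling(evidence, penalty=2.0, temperature=1.0,
                        max_iter=200, tol=1e-9):
    """Toy probabilistic labeling; NOT the published PINE method.

    evidence[i][k] is a positive score that observation i carries label k
    (a hypothetical input standing in for empirically derived evidence).
    Returns probs[i][k], one probability distribution per observation.
    """
    n, m = len(evidence), len(evidence[0])
    # Start from the evidence alone, normalized per observation.
    probs = [[e / sum(row) for e in row] for row in evidence]
    for _ in range(max_iter):
        change = 0.0
        for i in range(n):
            energies = []
            for k in range(m):
                # Local term: strong evidence gives low energy.
                local = -math.log(evidence[i][k])
                # Consistency term (assumed side condition): discourage
                # giving label k to more than one observation.
                clash = sum(probs[j][k] for j in range(n) if j != i)
                energies.append(local + penalty * clash)
            # Boltzmann weights turn energies into probabilities.
            weights = [math.exp(-e / temperature) for e in energies]
            z = sum(weights)
            updated = [w / z for w in weights]
            change = max(change,
                         max(abs(a - b) for a, b in zip(updated, probs[i])))
            probs[i] = updated  # sequential update: later rows see it
        if change < tol:  # probabilities have effectively stopped moving
            break
    return probs


if __name__ == "__main__":
    # Two observations that both slightly prefer label 0.
    for row in mean_field_labeling([[0.6, 0.4], [0.55, 0.45]]):
        print([round(p, 3) for p in row])
```

In this toy input both observations favor label 0 on evidence alone; the assumed consistency penalty pushes the second observation toward label 1, and the output is a probability per label rather than a single hard assignment.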