Combined Statistical Analyses of Peptide Intensities and Peptide Occurrences Improves Identification of Significant Peptides from MS-Based Proteomics Data

Bibliographic Details
Published in Journal of Proteome Research Vol. 9; no. 11; pp. 5748-5756
Main Authors Webb-Robertson, Bobbie-Jo M, McCue, Lee Ann, Waters, Katrina M, Matzke, Melissa M, Jacobs, Jon M, Metz, Thomas O, Varnum, Susan M, Pounds, Joel G
Format Journal Article
Language English
Published United States: American Chemical Society, 05.11.2010
Summary: Liquid chromatography−mass spectrometry-based (LC−MS) proteomics uses peak intensities of proteolytic peptides to infer the differential abundance of peptides/proteins. However, substantial run-to-run variability in intensities and observations (presence/absence) of peptides makes data analysis quite challenging. The missing observations in LC−MS proteomics data are difficult to address with traditional imputation-based approaches because the mechanisms by which data are missing are unknown a priori. Data can be missing due to random mechanisms such as experimental error or nonrandom mechanisms such as a true biological effect. We present a statistical approach that uses a test of independence known as a G-test to test the null hypothesis of independence between the number of missing values across experimental groups. We pair the G-test results, which evaluate independence of missing data (IMD), with an analysis of variance (ANOVA) that uses only means and variances computed from the observed data. Each peptide is therefore represented by two statistical confidence metrics, one for qualitative differential observation and one for quantitative differential intensity. We use three LC−MS data sets to demonstrate the robustness and sensitivity of the IMD−ANOVA approach.
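The two per-peptide metrics described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no formulas, so the code below assumes the textbook G statistic (G = 2 Σ O ln(O/E)) on a groups-by-presence/absence table and a textbook one-way ANOVA F statistic computed only from observed values. The function names (`g_test_missing`, `g_pvalue_two_groups`, `anova_observed`) and the use of `None` to mark a missing intensity are hypothetical choices made for illustration.

```python
import math


def g_test_missing(observed_counts, group_sizes):
    """G statistic for independence of presence/absence and group.

    observed_counts[i]: runs in group i where the peptide was observed.
    group_sizes[i]:     total runs in group i.
    Returns (G, degrees_of_freedom). Assumed textbook form, not the
    paper's exact procedure.
    """
    total_runs = sum(group_sizes)
    total_observed = sum(observed_counts)
    total_missing = total_runs - total_observed
    g = 0.0
    for obs, n in zip(observed_counts, group_sizes):
        cells = ((obs, total_observed), (n - obs, total_missing))
        for count, column_total in cells:
            if count > 0:  # a zero cell contributes nothing (0 * ln 0 := 0)
                expected = n * column_total / total_runs
                g += 2.0 * count * math.log(count / expected)
    # (groups - 1) * (2 columns - 1) degrees of freedom
    return g, len(group_sizes) - 1


def g_pvalue_two_groups(g):
    """Chi-square upper tail for 1 degree of freedom (two groups only)."""
    return math.erfc(math.sqrt(g / 2.0))


def anova_observed(groups):
    """One-way ANOVA F statistic using only observed (non-None) values.

    Returns (F, df_between, df_within), or None when F is undefined.
    """
    observed = [[x for x in grp if x is not None] for grp in groups]
    observed = [grp for grp in observed if grp]  # drop all-missing groups
    k = len(observed)
    n_total = sum(len(grp) for grp in observed)
    if k < 2 or n_total - k < 1:
        return None
    grand_mean = sum(sum(grp) for grp in observed) / n_total
    means = [sum(grp) / len(grp) for grp in observed]
    ss_between = sum(
        len(grp) * (m - grand_mean) ** 2 for grp, m in zip(observed, means)
    )
    ss_within = sum(
        (x - m) ** 2 for grp, m in zip(observed, means) for x in grp
    )
    if ss_within == 0:
        return None
    f = (ss_between / (k - 1)) / (ss_within / (n_total - k))
    return f, k - 1, n_total - k
```

In this sketch, a peptide seen in 3 of 4 control runs and 1 of 4 treated runs would be scored with `g_test_missing([3, 1], [4, 4])`, while its intensities would go through `anova_observed` with the missing runs passed as `None`. A p-value for the F statistic needs the F distribution, which the standard library does not provide; a statistics package would be the usual choice there.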
Bibliography: PNNL-SA-72886
USDOE
AC05-76RL01830
ISSN: 1535-3893
1535-3907
DOI: 10.1021/pr1005247