Basophile: Accurate Fragment Charge State Prediction Improves Peptide Identification Rates

In shotgun proteomics, database search algorithms rely on fragmentation models to predict fragment ions that should be observed for a given peptide sequence. The most widely used strategy (Naive model) is oversimplified, cleaving all peptide bonds with equal probability to produce fragments of a...

Bibliographic Details
Published in Genomics, Proteomics & Bioinformatics, Vol. 11, no. 2, pp. 86-95
Main Authors Wang, Dong; Dasari, Surendra; Chambers, Matthew C.; Holman, Jerry D.; Chen, Kan; Liebler, Daniel C.; Orton, Daniel J.; Purvine, Samuel O.; Monroe, Matthew E.; Chung, Chang Y.; Rose, Kristie L.; Tabb, David L.
Format Journal Article
Language English
Published China Elsevier Ltd 01.04.2013
Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN 55905, USA
Department of Biochemistry, Vanderbilt University Medical Center, Nashville, TN 37232, USA
Biological Sciences Division and Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA 99354, USA
Elsevier
Summary: In shotgun proteomics, database search algorithms rely on fragmentation models to predict fragment ions that should be observed for a given peptide sequence. The most widely used strategy (Naive model) is oversimplified, cleaving all peptide bonds with equal probability to produce fragments of all charges below that of the precursor ion. More accurate models, based on fragmentation simulation, are too computationally intensive for on-the-fly use in database search algorithms. We have created an ordinal-regression-based model called Basophile that takes fragment size and basic residue distribution into account when determining the charge retention during CID/higher-energy collision-induced dissociation (HCD) of charged peptides. This model improves the accuracy of predictions by reducing the number of unnecessary fragments that are routinely predicted for highly charged precursors. Basophile increased the identification rates by 26% (on average) over the Naive model when analyzing triply charged precursors from ion trap data. Basophile achieves simplicity and speed by solving the prediction problem with an ordinal regression equation, which can be incorporated into any database search software for shotgun proteomic identification.
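The contrast drawn in the summary can be illustrated with a minimal sketch. It is not the published Basophile model: the two features (relative fragment size and share of basic residues), the weights, the threshold, the probability cutoff, and all function names are hypothetical and chosen only for illustration. It compares the Naive enumeration of fragment charges with a generic proportional-odds (ordinal logistic) predictor for the N-terminal fragment.

```python
import math

BASIC = set("RKH")  # residues treated as basic here (an assumption)

def naive_fragment_charges(precursor_charge):
    """Naive model: every fragment is predicted at every charge below the precursor's."""
    return list(range(1, max(precursor_charge, 2)))

def features(peptide, cleavage_index):
    """Hypothetical features for the N-terminal fragment peptide[:cleavage_index]."""
    fragment = peptide[:cleavage_index]
    total_basic = sum(r in BASIC for r in peptide)
    frag_basic = sum(r in BASIC for r in fragment)
    size_frac = len(fragment) / len(peptide)
    basic_frac = frag_basic / total_basic if total_basic else 0.5
    return [size_frac, basic_frac]

def ordinal_probs(x, weights, thresholds):
    """Proportional-odds model: P(Y <= k) = sigmoid(threshold_k - w.x)."""
    eta = sum(w * v for w, v in zip(weights, x))
    cum = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

def predicted_charges(peptide, cleavage_index, precursor_charge,
                      weights, thresholds, cutoff=0.3):
    """Keep only fragment charges whose predicted probability reaches the cutoff."""
    if len(thresholds) != precursor_charge - 2:
        raise ValueError("need one threshold per boundary between charge classes")
    probs = ordinal_probs(features(peptide, cleavage_index), weights, thresholds)
    return [k + 1 for k, p in enumerate(probs) if p >= cutoff]

# Illustrative (made-up) parameters for a triply charged precursor.
WEIGHTS = [2.0, 3.0]
THRESHOLDS = [2.5]

if __name__ == "__main__":
    peptide = "HAAKGGGGLR"
    for idx in (2, 8):
        print(idx, naive_fragment_charges(3),
              predicted_charges(peptide, idx, 3, WEIGHTS, THRESHOLDS))
```

With these made-up parameters, a short N-terminal fragment holding few of the basic residues is kept only at the lower charge, while a long fragment holding most of them is kept only at the higher charge. The Naive enumeration keeps both charges at every cleavage site, which is the source of the unnecessary fragments the summary mentions.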
11-4926/Q
Fragmentation; Basicity; Fragment size; Ordinal regression
USDOE
AC05-76RL01830
PNNL-SA-98549
Equal contribution.
ISSN: 1672-0229, 2210-3244
DOI:10.1016/j.gpb.2012.11.004