Learning Concave Conditional Likelihood Models for Improved Analysis of Tandem Mass Spectra
Main Authors | , |
---|---|
Format | Journal Article |
Language | English |
Published | 04.09.2019 |
Subjects | |
Summary: | The most widely used technology to identify the proteins present in a complex
biological sample is tandem mass spectrometry, which quickly produces a large
collection of spectra representative of the peptides (i.e., protein
subsequences) present in the original sample. In this work, we greatly expand
the parameter learning capabilities of a dynamic Bayesian network (DBN)
peptide-scoring algorithm, Didea, by deriving emission distributions for which
its conditional log-likelihood scoring function remains concave. We show that
this class of emission distributions, called Convex Virtual Emissions (CVEs),
naturally generalizes the log-sum-exp function while rendering both maximum
likelihood estimation and conditional maximum likelihood estimation concave for
a wide range of Bayesian networks. Utilizing CVEs in Didea allows efficient
learning of a large number of parameters while ensuring global convergence, in
stark contrast to Didea's previous parameter learning framework (which could
only learn a single parameter using a costly grid search) and other trainable
models (which only ensure convergence to local optima). The newly trained
scoring function substantially outperforms the state-of-the-art in both scoring
function accuracy and downstream Fisher kernel analysis. Furthermore, we
significantly improve Didea's runtime performance through successive
optimizations to its message passing schedule and derive explicit connections
between Didea's new concave score and related MS/MS scoring functions. |
---|---|
DOI: | 10.48550/arxiv.1909.02136 |
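
The summary's claim that log-sum-exp-style scoring keeps conditional log-likelihood concave can be illustrated with the classical softmax case. The sketch below is not the paper's CVE construction or Didea's scoring function; it only assumes a score vector `theta` whose conditional log-likelihood is `theta[label] - log_sum_exp(theta)`, a linear term minus a convex one. The names `log_sum_exp`, `conditional_log_likelihood`, and `midpoint_gap` are hypothetical helpers introduced for illustration.

```python
import math
import random


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def conditional_log_likelihood(theta, label):
    """Softmax log-probability of `label`: a linear term minus log-sum-exp."""
    return theta[label] - log_sum_exp(theta)


def midpoint_gap(f, a, b):
    """f((a + b) / 2) - (f(a) + f(b)) / 2, which is non-negative when f is concave."""
    mid = [(x + y) / 2 for x, y in zip(a, b)]
    return f(mid) - (f(a) + f(b)) / 2


# Spot-check midpoint concavity on random pairs of score vectors (illustrative only).
rng = random.Random(0)
gaps = []
for _ in range(1000):
    a = [rng.uniform(-5, 5) for _ in range(4)]
    b = [rng.uniform(-5, 5) for _ in range(4)]
    gaps.append(midpoint_gap(lambda t: conditional_log_likelihood(t, 0), a, b))

min_gap = min(gaps)
print("smallest midpoint gap:", min_gap)
```

A non-negative smallest gap (up to floating-point error) is consistent with concavity. It is a numerical spot check, not a proof, and it says nothing about the broader class of emission distributions the paper derives.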