Biomarker discovery in mass spectral profiles by means of selectivity ratio plot

Bibliographic Details
Published in Chemometrics and intelligent laboratory systems Vol. 95; no. 1; pp. 35-48
Main Authors Rajalahti, Tarja; Arneberg, Reidar; Berven, Frode S.; Myhr, Kjell-Morten; Ulvik, Rune J.; Kvalheim, Olav M.
Format Journal Article
Language English
Published Elsevier B.V. 01.01.2009
Summary: This work presents a new method for variable selection in complex spectral profiles. The method is validated by comparing samples of cerebrospinal fluid (CSF) with the same samples spiked with peptide and protein standards at different concentration levels. Partial least squares discriminant analysis (PLS-DA) attempts to separate two groups of samples by regressing on a y-vector consisting of zeros and ones in the PLS decomposition. In most cases, several PLS components are needed to optimize the discrimination between groups, which makes the model difficult to interpret. By using the y-vector as a target, it is possible to transform the PLS components to obtain a single predictive target-projected component, analogous to the predictive component in orthogonal partial least squares discriminant analysis (OPLS-DA). By calculating the ratio between explained and residual variance of the spectral variables on the target-projected component, a selectivity ratio plot is obtained that can be used for variable selection. Used on whole mass spectral profiles of pure and spiked CSF, we can detect peptides in the low molecular mass range (740–9000 Da) down to at least the 400 pM level without severe problems with false biomarker candidates. Similarly, we detect added proteins down to at least the 2 nM level in the medium mass range (6000–17,500 Da). Target projection represents the optimal way to fit a latent variable decomposition to a known target, but the selectivity ratio plot can be used for OPLS as well as other methods that produce a single predictive component. Comparison with some commonly used tools for variable selection shows that the selectivity ratio plot has the best performance. This observation is attributed to the fact that target projection utilizes both the predictive ability (regression coefficients) and the explanatory ability (spectral variance/covariance matrix) for the calculation of the selectivity ratio.
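The calculation described in the summary can be illustrated with a minimal NumPy sketch. This is not the authors' code. The function names (`pls1_coefficients`, `selectivity_ratio`) are hypothetical, and a plain NIPALS PLS1 routine stands in for whatever PLS implementation was used in the paper. The sketch also assumes the usual formulation of target projection, in which the normalized regression vector serves as the weight vector of the single predictive component. The selectivity ratio of each variable is then its explained sum of squares on that component divided by its residual sum of squares.

```python
import numpy as np

def pls1_coefficients(X, y, n_components):
    """Regression vector b of a PLS1 model fitted by NIPALS.

    X (samples x variables) and y (samples,) must already be mean-centred.
    """
    Xk, yk = X.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # weights: covariance of each variable with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = yk @ t / tt               # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    # b = W (P^T W)^{-1} q
    return W @ np.linalg.solve(P.T @ W, q)

def selectivity_ratio(X, y, n_components=2):
    """Selectivity ratio per variable (assumed target-projection formulation)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    b = pls1_coefficients(Xc, yc, n_components)
    w_tp = b / np.linalg.norm(b)              # target-projected weights
    t_tp = Xc @ w_tp                          # target-projected scores
    p_tp = Xc.T @ t_tp / (t_tp @ t_tp)        # target-projected loadings
    X_expl = np.outer(t_tp, p_tp)             # part of X explained by the component
    X_res = Xc - X_expl                       # residual part
    return (X_expl ** 2).sum(axis=0) / (X_res ** 2).sum(axis=0)

# Synthetic illustration (made-up data, not from the paper):
# two groups of 30 "spectra", with variables 10 and 30 raised in group 1.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 40))
y_demo = np.repeat([0.0, 1.0], 30)
X_demo[y_demo == 1, 10] += 5.0
X_demo[y_demo == 1, 30] += 5.0
sr_demo = selectivity_ratio(X_demo, y_demo, n_components=2)
top_two = np.argsort(sr_demo)[-2:]            # indices with the largest ratios
```

Plotting `sr_demo` against the variable index (here, a stand-in for m/z) gives the kind of selectivity ratio plot the summary refers to: variables that discriminate between the groups stand out as peaks, while variables dominated by unrelated variation stay near zero.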
ISSN: 0169-7439, 1873-3239
DOI: 10.1016/j.chemolab.2008.08.004