Variational cross-validation of slow dynamical modes in molecular kinetics

Bibliographic Details
Published in: The Journal of Chemical Physics, Vol. 142; no. 12; p. 124105
Main Authors: McGibbon, Robert T.; Pande, Vijay S.
Format: Journal Article
Language: English
Published: United States, American Institute of Physics, 28.03.2015
AIP Publishing LLC
Summary: Markov state models are a widely used method for approximating the eigenspectrum of the molecular dynamics propagator, yielding insight into the long-timescale statistical kinetics and slow dynamical modes of biomolecular systems. However, the lack of a unified theoretical framework for choosing between alternative models has hampered progress, especially for non-experts applying these methods to novel biological systems. Here, we consider cross-validation with a new objective function for estimators of these slow dynamical modes, a generalized matrix Rayleigh quotient (GMRQ), which measures the ability of a rank-m projection operator to capture the slow subspace of the system. It is shown that a variational theorem bounds the GMRQ from above by the sum of the first m eigenvalues of the system’s propagator, but that this bound can be violated when the requisite matrix elements are estimated subject to statistical uncertainty. This overfitting can be detected and avoided through cross-validation. These results make it possible to construct Markov state models for protein dynamics in a way that appropriately captures the tradeoff between systematic and statistical errors.
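The summary does not state the GMRQ formula, so the following is only a minimal sketch under stated assumptions. It assumes the trace form Tr[(AᵀCA)(AᵀSA)⁻¹], where C is a time-lagged correlation matrix, S is an overlap matrix, and the columns of A are the m slow modes estimated on training data. The function names (`simulate_chain`, `correlation_matrices`, `fit_slow_modes`, `gmrq`), the three-state toy chain, and the symmetrized count estimator with a small pseudocount are all hypothetical choices made for illustration, not the authors' estimator.

```python
import numpy as np


def simulate_chain(T, n_steps, rng):
    """Sample a discrete trajectory from transition matrix T (toy data)."""
    traj = np.empty(n_steps, dtype=int)
    traj[0] = 0
    for t in range(1, n_steps):
        traj[t] = rng.choice(T.shape[0], p=T[traj[t - 1]])
    return traj


def correlation_matrices(traj, n_states, lag=1, pseudocount=1e-3):
    """Estimate C (time-lagged) and S (overlap) from transition counts.

    Assumption: counts are symmetrized to impose reversibility, and a small
    pseudocount keeps S positive definite when a state is rarely visited.
    """
    counts = np.full((n_states, n_states), pseudocount)
    np.add.at(counts, (traj[:-lag], traj[lag:]), 1.0)
    counts = 0.5 * (counts + counts.T)
    C = counts / counts.sum()
    S = np.diag(C.sum(axis=1))
    return C, S


def fit_slow_modes(C, S, m):
    """Top-m solutions of the generalized eigenproblem C v = lambda S v."""
    L = np.linalg.cholesky(S)          # S = L L^T
    L_inv = np.linalg.inv(L)
    evals, U = np.linalg.eigh(L_inv @ C @ L_inv.T)
    order = np.argsort(evals)[::-1][:m]
    return evals[order], L_inv.T @ U[:, order]


def gmrq(A, C, S):
    """Assumed trace form of the generalized matrix Rayleigh quotient."""
    return np.trace(A.T @ C @ A @ np.linalg.inv(A.T @ S @ A))


rng = np.random.default_rng(0)
T_true = np.array([[0.9, 0.1, 0.0],
                   [0.1, 0.8, 0.1],
                   [0.0, 0.1, 0.9]])
traj = simulate_chain(T_true, 4000, rng)
train, test = traj[:2000], traj[2000:]   # one train/test split
m = 2

C_train, S_train = correlation_matrices(train, 3)
C_test, S_test = correlation_matrices(test, 3)

# Fit the slow modes on the training half only.
evals_train, A = fit_slow_modes(C_train, S_train, m)

# Score the same modes against training and held-out matrix elements.
train_score = gmrq(A, C_train, S_train)
test_score = gmrq(A, C_test, S_test)

# Best score any rank-m model could reach on the held-out matrices.
test_bound = fit_slow_modes(C_test, S_test, m)[0].sum()

print("train GMRQ:", train_score)
print("test GMRQ:", test_score)
print("upper bound on test GMRQ:", test_bound)
```

On the training half the score equals the sum of the m fitted eigenvalues by construction, so it cannot reveal overfitting. The held-out score is the quantity one would average over several splits when comparing candidate models.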
ISSN: 0021-9606, 1089-7690
DOI: 10.1063/1.4916292