Variational cross-validation of slow dynamical modes in molecular kinetics

Markov state models (MSMs) are a widely used method for approximating the eigenspectrum of the molecular dynamics propagator, yielding insight into the long-timescale statistical kinetics and slow dynamical modes of biomolecular systems. However, the lack of a unified theoretical framework for choos...

Full description

Saved in:
Bibliographic Details
Published inarXiv.org
Main Authors McGibbon, Robert T, Pande, Vijay S
Format Paper Journal Article
LanguageEnglish
Published Ithaca Cornell University Library, arXiv.org 27.03.2015
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Markov state models (MSMs) are a widely used method for approximating the eigenspectrum of the molecular dynamics propagator, yielding insight into the long-timescale statistical kinetics and slow dynamical modes of biomolecular systems. However, the lack of a unified theoretical framework for choosing between alternative models has hampered progress, especially for non-experts applying these methods to novel biological systems. Here, we consider cross-validation with a new objective function for estimators of these slow dynamical modes, a generalized matrix Rayleigh quotient (GMRQ), which measures the ability of a rank-\(m\) projection operator to capture the slow subspace of the system. It is shown that a variational theorem bounds the GMRQ from above by the sum of the first \(m\) eigenvalues of the system's propagator, but that this bound can be violated when the requisite matrix elements are estimated subject to statistical uncertainty. This overfitting can be detected and avoided through cross-validation. These result make it possible to construct Markov state models for protein dynamics in a way that appropriately captures the tradeoff between systematic and statistical errors.
Bibliography:SourceType-Working Papers-1
ObjectType-Working Paper/Pre-Print-1
content type line 50
ISSN:2331-8422
DOI:10.48550/arxiv.1407.8083