Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics

Bibliographic Details
Published in: arXiv.org
Main Authors: Negrea, Jeffrey; Yang, Jun; Feng, Haoyue; Roy, Daniel M.; Huggins, Jonathan H.
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 20.07.2023

Summary: The tuning of stochastic gradient algorithms (SGAs) for optimization and sampling is often based on heuristics and trial-and-error rather than generalizable theory. We address this theory–practice gap by characterizing the large-sample statistical asymptotics of SGAs via a joint step-size–sample-size scaling limit. We show that iterate averaging with a large fixed step size is robust to the choice of tuning parameters and asymptotically has covariance proportional to that of the MLE sampling distribution. We also prove a Bernstein–von Mises-like theorem to guide tuning, including for generalized posteriors that are robust to model misspecification. Numerical experiments validate our results and recommendations in realistic finite-sample regimes. Our work lays the foundation for a systematic analysis of other stochastic gradient Markov chain Monte Carlo algorithms for a wide range of models.
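
To illustrate the technique the summary refers to, the following sketch runs constant-step-size SGD with Polyak–Ruppert iterate averaging on a toy Gaussian-mean model and compares the averaged iterate to the MLE. This is not code from the paper; the toy model, step size, and iteration count are assumptions chosen for illustration only.

# Illustrative sketch (assumed toy Gaussian-mean model, NumPy only):
# constant-step-size SGD with Polyak–Ruppert iterate averaging.
import numpy as np

rng = np.random.default_rng(0)

n = 2000                                         # sample size
data = rng.normal(loc=1.0, scale=2.0, size=n)    # y_i ~ N(theta*, sigma^2)

step = 0.5        # large *fixed* step size (no decay schedule)
theta = 0.0       # SGD iterate
running_sum = 0.0 # accumulator for the iterate average
n_iters = 10000

for t in range(n_iters):
    y = data[rng.integers(n)]   # single-sample stochastic gradient
    grad = theta - y            # gradient of the loss 0.5 * (theta - y)^2
    theta -= step * grad        # constant-step SGD update
    running_sum += theta

theta_avg = running_sum / n_iters   # Polyak–Ruppert averaged iterate
theta_mle = data.mean()             # MLE of the Gaussian mean

print(f"averaged iterate: {theta_avg:.4f}")
print(f"MLE:              {theta_mle:.4f}")

In this toy setting the averaged iterate tracks the MLE closely even though the step size is never decayed, which is the qualitative behavior the summary describes; the paper's actual results concern the joint step-size–sample-size scaling limit rather than this single simulation.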
ISSN: 2331-8422