Statistical learning based on Markovian data: maximal deviation inequalities and learning rates
Published in: Annals of Mathematics and Artificial Intelligence, Vol. 88, no. 7, pp. 735-757
Main Authors: , ,
Format: Journal Article
Language: English
Published: Cham: Springer International Publishing, 01.07.2020
Summary: In statistical learning theory, numerous works have established non-asymptotic bounds assessing the generalization capacity of empirical risk minimizers under a large variety of complexity assumptions on the class of decision rules over which optimization is performed, by means of a sharp control of the uniform deviation of i.i.d. averages from their expectation, while generally ignoring possible dependence across the training data. The purpose of this paper is to show that similar results can be obtained when statistical learning is based on a data sequence drawn from a (Harris positive) Markov chain X, through the running example of the estimation of minimum volume sets (MV-sets) related to X's stationary distribution, an unsupervised statistical learning approach to anomaly/novelty detection. Based on novel maximal deviation inequalities established by means of the regenerative method, learning rate bounds are derived that depend not only on the complexity of the class of candidate sets but also on the ergodicity rate of the chain X, expressed in terms of tail conditions on the length of its regenerative cycles. In particular, this approach, fully tailored to Markovian data, allows the rate bounds obtained to be interpreted in frequentist terms, in contrast to alternative coupling techniques based on mixing conditions: the larger the expected number of regenerative cycles over a trajectory of finite length, the more accurate the MV-set estimates. Beyond the theoretical analysis, this phenomenon is supported by illustrative numerical experiments.
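As a purely illustrative complement to the summary (not code from the article itself), the sketch below shows, under simplifying assumptions, the two ingredients the abstract combines: splitting a single Markov trajectory into regenerative cycles and forming a plug-in MV-set estimate from empirical frequencies. It assumes a hypothetical finite-state ergodic chain whose state 0 acts as a regeneration atom; the transition matrix and the helper names simulate_chain, regeneration_blocks, and empirical_mv_set are invented here for illustration only.

```python
import numpy as np

def simulate_chain(P, n, rng, x0=0):
    """Simulate n steps of a finite-state Markov chain with transition matrix P."""
    states = np.arange(P.shape[0])
    traj = np.empty(n, dtype=int)
    x = x0
    for t in range(n):
        x = rng.choice(states, p=P[x])
        traj[t] = x
    return traj

def regeneration_blocks(traj, atom=0):
    """Split the trajectory into regenerative cycles delimited by visits to the atom."""
    hits = np.flatnonzero(traj == atom)
    return [traj[hits[i]:hits[i + 1]] for i in range(len(hits) - 1)]

def empirical_mv_set(traj, alpha=0.95):
    """Plug-in MV-set: smallest collection of states whose empirical mass reaches alpha."""
    values, counts = np.unique(traj, return_counts=True)
    freq = counts / counts.sum()
    order = np.argsort(freq)[::-1]          # add the most frequent states first
    mass, mv_set = 0.0, []
    for idx in order:
        mv_set.append(int(values[idx]))
        mass += freq[idx]
        if mass >= alpha:
            break
    return sorted(mv_set)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # hypothetical 5-state ergodic chain; state 0 plays the role of the regeneration atom
    P = np.array([[0.2, 0.4, 0.2, 0.1, 0.1],
                  [0.3, 0.3, 0.2, 0.1, 0.1],
                  [0.3, 0.2, 0.3, 0.1, 0.1],
                  [0.4, 0.2, 0.2, 0.1, 0.1],
                  [0.5, 0.2, 0.1, 0.1, 0.1]])
    traj = simulate_chain(P, n=5000, rng=rng)
    blocks = regeneration_blocks(traj, atom=0)
    print("number of regenerative cycles:", len(blocks))
    print("estimated MV-set at level 0.95:", empirical_mv_set(traj, alpha=0.95))
```

The cycle count printed here is the quantity the summary ties to accuracy: the more regenerative cycles observed along a trajectory of a given length, the closer the empirical MV-set can be expected to sit to the one defined by the stationary distribution.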
ISSN: 1012-2443; 1573-7470
DOI: 10.1007/s10472-019-09670-6