Statistical learning based on Markovian data: maximal deviation inequalities and learning rates
Published in: Annals of Mathematics and Artificial Intelligence, Vol. 88, no. 7, pp. 735-757
Main Authors: , ,
Format: Journal Article
Language: English
Published: Cham: Springer International Publishing, 01.07.2020
Summary: In statistical learning theory, numerous works have established non-asymptotic bounds assessing the generalization capacity of empirical risk minimizers under a large variety of complexity assumptions on the class of decision rules over which optimization is performed, by means of a sharp control of the uniform deviation of i.i.d. averages from their expectation, while generally ignoring possible dependence across the training data. The purpose of this paper is to show that similar results can be obtained when statistical learning is based on a data sequence drawn from a (Harris positive) Markov chain X, through the running example of the estimation of minimum volume sets (MV-sets) related to X's stationary distribution, an unsupervised statistical learning approach to anomaly/novelty detection. Based on novel maximal deviation inequalities established by means of the regenerative method, learning rate bounds are derived that depend not only on the complexity of the class of candidate sets but also on the ergodicity rate of the chain X, expressed in terms of tail conditions on the length of its regenerative cycles. In particular, this approach, fully tailored to Markovian data, allows the rate bounds obtained to be interpreted in frequentist terms, in contrast to alternative coupling techniques based on mixing conditions: the larger the expected number of regenerative cycles over a trajectory of finite length, the more accurate the MV-set estimates. Beyond the theoretical analysis, this phenomenon is supported by illustrative numerical experiments.
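As a purely illustrative complement to the summary (not code from the article itself), the sketch below shows, under simplifying assumptions, the two ingredients the abstract combines: splitting a single Markov trajectory into regenerative cycles and forming a plug-in MV-set estimate from empirical frequencies. It assumes a hypothetical finite-state ergodic chain whose state 0 acts as a regeneration atom; the transition matrix and the helper names simulate_chain, regeneration_blocks, and empirical_mv_set are invented here for illustration only.

```python
import numpy as np

def simulate_chain(P, n, rng, x0=0):
    """Simulate n steps of a finite-state Markov chain with transition matrix P."""
    states = np.arange(P.shape[0])
    traj = np.empty(n, dtype=int)
    x = x0
    for t in range(n):
        x = rng.choice(states, p=P[x])
        traj[t] = x
    return traj

def regeneration_blocks(traj, atom=0):
    """Split the trajectory into regenerative cycles delimited by visits to the atom."""
    hits = np.flatnonzero(traj == atom)
    return [traj[hits[i]:hits[i + 1]] for i in range(len(hits) - 1)]

def empirical_mv_set(traj, alpha=0.95):
    """Plug-in MV-set: smallest collection of states whose empirical mass reaches alpha."""
    values, counts = np.unique(traj, return_counts=True)
    freq = counts / counts.sum()
    order = np.argsort(freq)[::-1]          # add the most frequent states first
    mass, mv_set = 0.0, []
    for idx in order:
        mv_set.append(int(values[idx]))
        mass += freq[idx]
        if mass >= alpha:
            break
    return sorted(mv_set)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # hypothetical 5-state ergodic chain; state 0 plays the role of the regeneration atom
    P = np.array([[0.2, 0.4, 0.2, 0.1, 0.1],
                  [0.3, 0.3, 0.2, 0.1, 0.1],
                  [0.3, 0.2, 0.3, 0.1, 0.1],
                  [0.4, 0.2, 0.2, 0.1, 0.1],
                  [0.5, 0.2, 0.1, 0.1, 0.1]])
    traj = simulate_chain(P, n=5000, rng=rng)
    blocks = regeneration_blocks(traj, atom=0)
    print("number of regenerative cycles:", len(blocks))
    print("estimated MV-set at level 0.95:", empirical_mv_set(traj, alpha=0.95))
```

The cycle count printed here is the quantity the summary ties to accuracy: the more regenerative cycles observed along a trajectory of a given length, the closer the empirical MV-set can be expected to sit to the one defined by the stationary distribution.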
ISSN: 1012-2443; 1573-7470
DOI: 10.1007/s10472-019-09670-6