Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models

Bibliographic Details
Published in: Sequential Analysis, Vol. 40, no. 1, pp. 61-96
Main Authors: Garivier, Aurélien; Kaufmann, Emilie
Format: Journal Article
Language: English
Published: Philadelphia, Taylor & Francis (Taylor & Francis Ltd), 15.01.2021
Summary: In this article, we study sequential testing problems with overlapping hypotheses. We first focus on the simple problem of assessing if the mean μ of a Gaussian distribution is larger than −ε or smaller than ε for a fixed ε > 0; if μ ∈ (−ε, ε), both answers are considered to be correct. Then, we consider probably approximately correct best arm identification in a bandit model: given K probability distributions on ℝ with means μ_1, …, μ_K, we derive the asymptotic complexity of identifying, with risk at most δ, an index I ∈ {1, …, K} such that μ_I ≥ max_i μ_i − ε. We provide nonasymptotic bounds on the error of a parallel general likelihood ratio test, which can also be used for more general testing problems. We further propose lower bounds on the number of observations needed to identify a correct hypothesis. Those lower bounds rely on information-theoretic arguments, and specifically on two versions of a change of measure lemma (a high-level form and a low-level form) whose relative merits are discussed.
ISSN: 0747-4946, 1532-4176
DOI: 10.1080/07474946.2021.1847965
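
Illustration: The two problems named in the Summary can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure or the thresholds analyzed in the article. The function names (`threshold`, `parallel_glrt`, `eps_optimal_arms`), the known standard deviation `sigma`, and the simple union-bound threshold log(π² t² / (6δ)) are all hypothetical choices made for illustration. The first function runs two one-sided Gaussian likelihood ratio tests in parallel and stops as soon as one of them rejects; the second lists the arms whose mean is within ε of the best mean.

```python
import math
from typing import Iterable, List, Optional, Tuple


def threshold(t: int, delta: float) -> float:
    """Hypothetical stopping threshold (an assumption, not the article's choice).

    With beta_t = log(pi^2 t^2 / (6 delta)), the sum over t >= 1 of
    exp(-beta_t) equals delta, so a union bound over time controls the risk.
    """
    return math.log(math.pi ** 2 * t ** 2 / (6.0 * delta))


def parallel_glrt(
    samples: Iterable[float], eps: float, delta: float, sigma: float = 1.0
) -> Tuple[Optional[str], int]:
    """Sequentially decide between "mu >= -eps" and "mu <= eps".

    Assumes i.i.d. Gaussian samples with known standard deviation sigma.
    Returns the answer (or None if the samples run out) and the number
    of samples consumed.
    """
    total = 0.0
    t = 0
    for t, x in enumerate(samples, start=1):
        total += x
        mean = total / t
        beta = threshold(t, delta)
        # Likelihood ratio statistic for rejecting {mu <= -eps}.
        if mean > -eps and t * (mean + eps) ** 2 / (2 * sigma ** 2) > beta:
            return "mu >= -eps", t
        # Likelihood ratio statistic for rejecting {mu >= eps}.
        if mean < eps and t * (mean - eps) ** 2 / (2 * sigma ** 2) > beta:
            return "mu <= eps", t
    return None, t


def eps_optimal_arms(means: List[float], eps: float) -> List[int]:
    """Indices i such that means[i] >= max(means) - eps."""
    best = max(means)
    return [i for i, m in enumerate(means) if m >= best - eps]
```

When μ lies in (−ε, ε), either answer returned by `parallel_glrt` counts as correct, which is what makes the hypotheses overlapping. For example, `eps_optimal_arms([0.5, 0.45, 0.2], 0.1)` keeps the first two arms, since both means are at least 0.5 − 0.1.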