Best-arm identification algorithms for multi-armed bandits in the fixed confidence setting

Bibliographic Details
Published in: 2014 48th Annual Conference on Information Sciences and Systems (CISS), pp. 1-6
Main Authors: Jamieson, Kevin; Nowak, Robert
Format: Conference Proceeding
Language: English
Published: IEEE, 01.03.2014
Summary: This paper is concerned with identifying the arm with the highest mean in a multi-armed bandit problem using as few independent samples from the arms as possible. While the so-called "best arm problem" dates back to the 1950s, only recently were two qualitatively different algorithms proposed that achieve the optimal sample complexity for the problem. This paper reviews these recent advances and shows that most best-arm algorithms can be described as variants of the two recent optimal algorithms. For each algorithm type we consider a specific instance to analyze both theoretically and empirically, thereby exposing the core components of the theoretical analysis of these algorithms and providing intuition about how the algorithms work in practice. The derived sample complexity bounds are novel, and in certain cases improve upon previous bounds. In addition, we compare a variety of state-of-the-art algorithms for the best-arm problem empirically through simulations.
DOI: 10.1109/CISS.2014.6814096