Near-Optimal Regret Bounds for Thompson Sampling

Thompson Sampling (TS) is one of the oldest heuristics for multiarmed bandit problems. It is a randomized algorithm based on Bayesian ideas and has recently generated significant interest after several studies demonstrated that it has favorable empirical performance compared to the state-of-the-art...

Bibliographic Details
Published in	Journal of the ACM, Vol. 64, no. 5, pp. 1-24
Main Authors	Agrawal, Shipra; Goyal, Navin
Format	Journal Article
Language	English
Published	New York: Association for Computing Machinery, 01.10.2017
Subjects	Bayesian analysis; Empirical analysis; Heuristic; Lower bounds; Martingales; Optimization; Probability distribution functions; Randomized algorithms; Sampling; Studies
ISSN	0004-5411; 1557-735X
DOI	10.1145/3088510
