Learning in games with continuous action sets and unknown payoff functions

Bibliographic Details
Published in	Mathematical programming, Vol. 173, no. 1-2, pp. 465-507
Main Authors	Mertikopoulos, Panayotis; Zhou, Zhengyuan
Format	Journal Article
Language	English
Published	Berlin/Heidelberg: Springer Berlin Heidelberg, 01.01.2019; Springer Nature B.V.
Subjects	Algorithms; Calculus of Variations and Optimal Control; Optimization; Combinatorics; Convergence; Distance learning; Equilibrium; Error analysis; Feedback; Full Length Paper; Game theory; Games; Learning; Mathematical and Computational Physics; Mathematical Methods in Physics; Mathematics; Mathematics and Statistics; Mathematics of Computing; Numerical Analysis; Optimization; Theoretical; Variational stability; Continuous games; Dual averaging; Primary 91A26; Fenchel coupling; Secondary 90C33; Nash equilibrium; 90C15; 68Q32

More Information
Summary:	This paper examines the convergence of no-regret learning in games with continuous action sets. For concreteness, we focus on learning via “dual averaging”, a widely used class of no-regret learning schemes where players take small steps along their individual payoff gradients and then “mirror” the output back to their action sets. In terms of feedback, we assume that players can only estimate their payoff gradients up to a zero-mean error with bounded variance. To study the convergence of the induced sequence of play, we introduce the notion of variational stability, and we show that stable equilibria are locally attracting with high probability whereas globally stable equilibria are globally attracting with probability 1. We also discuss some applications to mixed-strategy learning in finite games, and we provide explicit estimates of the method’s convergence speed.
ISSN:	0025-5610, 1436-4646
DOI:	10.1007/s10107-018-1254-8
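
Illustration:	The dual averaging scheme described in the summary can be sketched in a few lines. The example below is not taken from the paper: it assumes a hypothetical two-player Cournot game with payoffs u_i(x) = x_i (a - b (x_1 + x_2)) - c x_i on the action set [0, 2], uses Euclidean projection as the “mirror” step (one possible choice among many), adds zero-mean Gaussian noise to each payoff gradient, and uses the step size 1/n. The function names and all parameter values are illustrative assumptions. For a = 4, b = 1, c = 1 the game has a unique Nash equilibrium at x_1 = x_2 = 1, so the final iterate should land near that point.

```python
import random


def payoff_gradients(x, a=4.0, b=1.0, c=1.0):
    """Individual payoff gradients of the assumed Cournot game.

    For u_i(x) = x_i * (a - b * sum(x)) - c * x_i, the derivative with
    respect to x_i is a - c - b * (sum(x) + x_i).
    """
    total = sum(x)
    return [a - c - b * (total + xi) for xi in x]


def project(y, lo=0.0, hi=2.0):
    """Euclidean 'mirror' step: clip each coordinate to the action set [lo, hi]."""
    return [min(max(yi, lo), hi) for yi in y]


def dual_averaging(steps=20000, noise_std=0.5, seed=0):
    """Run dual averaging with noisy gradient feedback and return the last action profile."""
    rng = random.Random(seed)
    y = [0.0, 0.0]          # aggregated gradient steps (dual variables)
    x = project(y)          # actions actually played
    for n in range(1, steps + 1):
        # Each player observes their gradient up to a zero-mean error.
        noisy = [g + rng.gauss(0.0, noise_std) for g in payoff_gradients(x)]
        gamma = 1.0 / n     # assumed step size: divergent sum, summable squares
        # Take a small step along the noisy gradient, then mirror back.
        y = [yi + gamma * gi for yi, gi in zip(y, noisy)]
        x = project(y)
    return x


if __name__ == "__main__":
    print(dual_averaging())
```

The gradient steps are accumulated in the unconstrained variable y, and only the played action x is projected. This is what separates dual averaging from plain projected gradient ascent, where the projected point itself is updated.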