Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded Rationality
The interplay between exploration and exploitation in competitive multi-agent learning is still far from being well understood. Motivated by this, we study smooth Q-learning, a prototypical learning model that explicitly captures the balance between game rewards and exploration costs. We show that Q...
Published in | IDEAS Working Paper Series from RePEc |
---|---|
Main Authors | , , |
Format | Paper |
Language | English |
Published | St. Louis, Federal Reserve Bank of St. Louis, 01.01.2021 |