Emergence of cooperation under punishment: A reinforcement learning perspective
Main Authors | , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 29.01.2024 |
Subjects | |
Summary: | Punishment is a common tactic for sustaining cooperation and has long been
studied extensively. While most previous game-theoretic work adopts imitation
learning, in which players imitate the strategies of those who are better off, the
learning logic in the real world is often much more complex. In this work, we
turn to the reinforcement learning paradigm, in which individuals make their
decisions based on their past experience and long-term returns. Specifically,
we investigate the prisoner's dilemma game with the Q-learning algorithm, in which
cooperators probabilistically impose punishment on defectors in their
neighborhood. Interestingly, we find that punishment can lead to either
continuous or discontinuous cooperation phase transitions, and that the nucleation
process of cooperation clusters is reminiscent of the liquid-gas transition.
The uncovered first-order phase transition indicates that punishment must be
implemented with greater care in the discontinuous case than in the continuous one. |
---|---|
DOI: | 10.48550/arxiv.2401.16073 |
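
The summary describes Q-learning agents playing the prisoner's dilemma, with cooperators punishing defecting neighbors probabilistically. The sketch below illustrates that setup in its simplest pairwise form. It is not the authors' implementation: the payoff values, the learning parameters, the `fine` and `cost` arguments, and the choice of state (for instance, an agent's own previous action) are all assumptions made for illustration. Only the update rule in `q_update` is the standard tabular Q-learning formula.

```python
import random

# All numeric values are illustrative assumptions, not taken from the paper.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.01  # learning rate, discount, exploration
ACTIONS = ("C", "D")                    # cooperate, defect

B = 1.2  # assumed temptation to defect
# Assumed payoff table: payoff to the first player for (own action, other's action).
PAYOFF = {("C", "C"): 1.0, ("C", "D"): 0.0, ("D", "C"): B, ("D", "D"): 0.0}


def pair_payoffs(a, b, punish_prob, fine, cost, rng):
    """Return both players' payoffs for one pairwise game.

    A cooperator punishes a defecting partner with probability punish_prob:
    the defector loses `fine` and the punisher pays `cost` (hypothetical names).
    """
    pa, pb = PAYOFF[(a, b)], PAYOFF[(b, a)]
    if a == "C" and b == "D" and rng.random() < punish_prob:
        pa -= cost
        pb -= fine
    if b == "C" and a == "D" and rng.random() < punish_prob:
        pb -= cost
        pa -= fine
    return pa, pb


def q_update(q, state, action, reward, next_state):
    """Standard tabular Q-learning update, applied in place."""
    best_next = max(q[next_state].values())
    q[state][action] += ALPHA * (reward + GAMMA * best_next - q[state][action])


def choose(q, state, rng):
    """Epsilon-greedy action selection over the Q-table row for `state`."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda act: q[state][act])
```

In a lattice simulation, each agent would presumably sum `pair_payoffs` over its neighbors to obtain the reward passed to `q_update`. The neighborhood structure and update schedule are left out here because the summary does not specify them.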