Deconstructing Cooperation and Ostracism via Multi-Agent Reinforcement Learning
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 06.10.2023 |
Subjects | |
Summary: | Cooperation is challenging in biological systems, human societies, and
multi-agent systems in general. While a group can benefit when everyone
cooperates, it is tempting for each agent to act selfishly instead. Prior human
studies show that people can overcome such social dilemmas while choosing
interaction partners, i.e., strategic network rewiring. However, little is
known about how agents, including humans, can learn about cooperation from
strategic rewiring and vice versa. Here, we perform multi-agent reinforcement
learning simulations in which two agents play the Prisoner's Dilemma game
iteratively. Each agent has two policies: one controls whether to cooperate or
defect; the other controls whether to rewire connections with another agent.
This setting enables us to disentangle complex causal dynamics between
cooperation and network rewiring. We find that network rewiring facilitates
mutual cooperation even when one agent always offers cooperation, which is
vulnerable to free-riding. We then confirm that the network-rewiring effect is
exerted through agents' learning of ostracism, that is, connecting to
cooperators and disconnecting from defectors. However, we also find that
ostracism alone is not sufficient to make cooperation emerge. Instead,
ostracism emerges from the learning of cooperation, and existing cooperation is
subsequently reinforced due to the presence of ostracism. Our findings provide
insights into the conditions and mechanisms necessary for the emergence of
cooperation with network rewiring. |
---|---|
DOI: | 10.48550/arxiv.2310.04623 |
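
The setting described in the summary (two agents, each choosing both a game action and whether to stay connected) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the payoff values, the rule that a round is played only when both agents keep the link, the zero payoff for disconnected agents, and the names `play_round` and `ostracism_rule` are all assumptions made for illustration. The ostracism rule here is hand-coded, whereas in the paper it is learned.

```python
# Illustrative sketch only. Payoff values, the mutual-consent link rule, and
# the disconnected payoff are assumptions, not taken from the paper.
from typing import Tuple

C, D = "C", "D"  # cooperate, defect

# Prisoner's Dilemma payoffs with the usual ordering T > R > P > S
# (assumed values: T=5, R=3, P=1, S=0).
PAYOFF = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}
NO_LINK_PAYOFF = (0, 0)  # assumed payoff when the agents are not connected


def play_round(act_a: str, act_b: str, link_a: bool, link_b: bool) -> Tuple[int, int]:
    """Return (payoff_a, payoff_b); the game is played only if both keep the link."""
    if not (link_a and link_b):
        return NO_LINK_PAYOFF
    return PAYOFF[(act_a, act_b)]


def ostracism_rule(partner_last_action: str) -> bool:
    """Stay connected to a cooperator, disconnect from a defector."""
    return partner_last_action == C


# An unconditional cooperator is exploited while linked to a defector...
first = play_round(C, D, True, True)
# ...but ostracism removes the defector's gain in the next round.
second = play_round(C, D, ostracism_rule(D), True)
print(first, second)
```

Under these assumed payoffs, the defector earns the temptation payoff only while the link exists, which is one way to read the summary's claim that ostracism reinforces existing cooperation.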