Recursive Reasoning With Reduced Complexity and Intermittency for Nonequilibrium Learning in Stochastic Games
Published in | IEEE Transactions on Neural Networks and Learning Systems Vol. 34; no. 11; pp. 8467 - 8481 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | United States: IEEE, 01.11.2023. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
Summary: | In this article, we propose a computationally and communicationally efficient approach for decision-making in nonequilibrium stochastic games. In particular, due to the inherent complexity of computing Nash equilibria, as well as the innate tendency of agents to choose nonequilibrium strategies, we construct two models of bounded rationality based on recursive reasoning. In the first model, named level-$k$ thinking, each agent assumes that everyone else has a cognitive level immediately lower than theirs and, given such an assumption, chooses their policy to be a best response to them. In the second model, named cognitive hierarchy, each agent conjectures that the rest of the agents have a cognitive level that is lower than theirs but that follows a distribution instead of being deterministic. To explicitly compute the boundedly rational policies, a level-recursive algorithm and a level-paralleled algorithm are constructed, where the latter can have an overall reduced computational complexity. To further reduce the complexity in the communication layer, modifications of the proposed nonequilibrium strategies are presented, which do not require the action of a boundedly rational agent to be updated at each step of the stochastic game. Simulations are performed that demonstrate our results. |
---|---|
ISSN: | 2162-237X 2162-2388 |
DOI: | 10.1109/TNNLS.2022.3151250 |
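
The two recursive-reasoning models described in the summary can be illustrated with a minimal sketch. It is not the article's algorithm: it assumes a one-shot, two-player matrix game rather than a stochastic game, a uniformly random level-0 agent, and, for the cognitive hierarchy, a truncated Poisson belief over lower levels with a hypothetical rate `tau`. The payoff matrix and all function names are made up for illustration.

```python
import math
import numpy as np


def best_response(payoff, opponent_mix):
    """One-hot best response; payoff[i, j] = own payoff for own action i vs opponent action j."""
    expected = payoff @ opponent_mix
    response = np.zeros(payoff.shape[0])
    response[int(np.argmax(expected))] = 1.0
    return response


def level_k_policies(payoff_a, payoff_b, max_level):
    """Level-k thinking: level 0 is uniform (assumption); level k best-responds to level k-1.

    payoff_a[i, j] and payoff_b[i, j] are the payoffs of A and B when A plays i and B plays j.
    """
    n_a, n_b = payoff_a.shape
    pol_a = [np.full(n_a, 1.0 / n_a)]
    pol_b = [np.full(n_b, 1.0 / n_b)]
    for k in range(1, max_level + 1):
        pol_a.append(best_response(payoff_a, pol_b[k - 1]))
        pol_b.append(best_response(payoff_b.T, pol_a[k - 1]))
    return pol_a, pol_b


def cognitive_hierarchy_policies(payoff_a, payoff_b, max_level, tau=1.5):
    """Cognitive hierarchy: level k best-responds to a mixture of levels 0..k-1.

    The mixture weights are Poisson(tau) probabilities renormalized over the
    lower levels; this choice of distribution is an assumption of the sketch.
    """
    n_a, n_b = payoff_a.shape
    pol_a = [np.full(n_a, 1.0 / n_a)]
    pol_b = [np.full(n_b, 1.0 / n_b)]
    for k in range(1, max_level + 1):
        weights = np.array([tau**h / math.factorial(h) for h in range(k)])
        weights /= weights.sum()
        belief_about_b = sum(w * p for w, p in zip(weights, pol_b))
        belief_about_a = sum(w * p for w, p in zip(weights, pol_a))
        pol_a.append(best_response(payoff_a, belief_about_b))
        pol_b.append(best_response(payoff_b.T, belief_about_a))
    return pol_a, pol_b


def actions(policies):
    """Action index chosen at each level >= 1 (level 0 is mixed, so it is skipped)."""
    return [int(np.argmax(p)) for p in policies[1:]]


# Hypothetical symmetric game: a base payoff of 3, 2, 1 for actions 0, 1, 2,
# plus a bonus of 4 when the own action is exactly one above the opponent's.
PAYOFF_A = np.array([[3.0, 3.0, 3.0],
                     [6.0, 2.0, 2.0],
                     [1.0, 5.0, 1.0]])
PAYOFF_B = PAYOFF_A.T  # symmetric game, indexed as (A action, B action)

lk_a, lk_b = level_k_policies(PAYOFF_A, PAYOFF_B, max_level=4)
ch_a, ch_b = cognitive_hierarchy_policies(PAYOFF_A, PAYOFF_B, max_level=3)
print("level-k actions of agent A:", actions(lk_a))
print("cognitive-hierarchy actions of agent A:", actions(ch_a))
```

In this toy game the level-$k$ policies cycle through the actions as the level grows, while the cognitive-hierarchy policy at level 3 still hedges against the level-1 opponents in its belief and therefore differs from the level-3 policy of the level-$k$ model.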