Recursive Reasoning With Reduced Complexity and Intermittency for Nonequilibrium Learning in Stochastic Games

In this article, we propose a computationally and communicationally efficient approach for decision-making in nonequilibrium stochastic games. In particular, due to the inherent complexity of computing Nash equilibria, as well as the innate tendency of agents to choose nonequilibrium strategies, we...

Full description

Saved in:
Bibliographic Details
Published inIEEE transaction on neural networks and learning systems Vol. 34; no. 11; pp. 8467 - 8481
Main Authors Fotiadis, Filippos, Vamvoudakis, Kyriakos G.
Format Journal Article
LanguageEnglish
Published United States IEEE 01.11.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:In this article, we propose a computationally and communicationally efficient approach for decision-making in nonequilibrium stochastic games. In particular, due to the inherent complexity of computing Nash equilibria, as well as the innate tendency of agents to choose nonequilibrium strategies, we construct two models of bounded rationality based on recursive reasoning. In the first model, named level-<inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> thinking, each agent assumes that everyone else has a cognitive level immediately lower than theirs and-given such an assumption-chooses their policy to be a best response to them. In the second model, named cognitive hierarchy, each agent conjectures that the rest of the agents have a cognitive level that is lower than theirs, but follows a distribution instead of being deterministic. To explicitly compute the boundedly rational policies, a level-recursive algorithm and a level-paralleled algorithm are constructed, where the latter one can have an overall reduced computational complexity. To further reduce the complexity in the communication layer, modifications of the proposed nonequilibrium strategies are presented, which do not require the action of a boundedly rational agent to be updated at each step of the stochastic game. Simulations are performed that demonstrate our results.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:2162-237X
2162-2388
DOI:10.1109/TNNLS.2022.3151250