Learning structured communication for multi-agent reinforcement learning

Bibliographic Details
Published in	Autonomous agents and multi-agent systems Vol. 36; no. 2
Main Authors	Sheng, Junjie; Wang, Xiangfeng; Jin, Bo; Yan, Junchi; Li, Wenhao; Chang, Tsung-Hui; Wang, Jun; Zha, Hongyuan
Format	Journal Article
Language	English
Published	New York: Springer US, 01.10.2022; Springer Nature B.V.
Subjects	Artificial Intelligence; Computer Science; Computer Systems Organization and Communication Networks; Graph neural networks; Learning; Modules; Multiagent systems; Software Engineering/Programming and Operating Systems; Structural hierarchy; Topology; User Interfaces and Human Computer Interaction; Hierarchical Structure; Multi-agent Reinforcement Learning; Learning Communication Structures; Graph Neural Networks
Summary:	This work explores communication mechanisms for large-scale multi-agent reinforcement learning (MARL). We summarize the general topology categories of communication structures, which are often manually specified in the MARL literature. We propose a novel framework, Learning Structured Communication (LSC), which learns a flexible and efficient communication topology in the form of a hierarchical structure. It contains two modules: a structured communication module and a communication-based policy module. The structured communication module learns to form a hierarchical structure by maximizing the cumulative reward of the agents under the current communication-based policy. The communication-based policy module adopts hierarchical graph neural networks to generate messages, propagate information over the learned communication structure, and select actions. In contrast to existing communication mechanisms, our method has a learnable, hierarchical communication structure. Experiments on large-scale battle scenarios show that LSC achieves high communication efficiency and global cooperation capability.
ISSN:	1387-2532 1573-7454
DOI:	10.1007/s10458-022-09580-8
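
Illustrative note: the summary describes information being propagated over a hierarchical communication structure. The sketch below is only a rough illustration of that idea and is not the authors' implementation. It assumes a two-level hierarchy that is given as input rather than learned, uses scalar features in place of learned messages, and substitutes plain mean aggregation for the hierarchical graph neural network layers. The names `hierarchical_message_passing`, `count_links`, `features`, and `leader_of` are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def hierarchical_message_passing(features, leader_of):
    """Propagate scalar features over a two-level hierarchy.

    features:  dict agent_id -> float (stand-in for a learned message).
    leader_of: dict agent_id -> leader agent_id; leaders map to themselves.
               Assumed input standing in for the learned structure.
    """
    # Stage 1 (intra-group): each leader averages the features of its group.
    groups = defaultdict(list)
    for agent, leader in leader_of.items():
        groups[leader].append(agent)
    group_summary = {
        leader: fmean(features[a] for a in members)
        for leader, members in groups.items()
    }

    # Stage 2 (inter-group): leaders exchange summaries; the mean of the
    # summaries serves as a shared global context.
    global_context = fmean(group_summary.values())

    # Stage 3 (top-down): each agent receives its own feature, its group's
    # summary, and the global context.
    return {
        agent: (features[agent], group_summary[leader], global_context)
        for agent, leader in leader_of.items()
    }


def count_links(leader_of):
    """Count links used: member-to-leader plus all leader-to-leader pairs."""
    leaders = set(leader_of.values())
    member_links = sum(1 for agent, leader in leader_of.items() if agent != leader)
    leader_links = len(leaders) * (len(leaders) - 1) // 2
    return member_links + leader_links


if __name__ == "__main__":
    features = {0: 1.0, 1: 3.0, 2: 5.0, 3: 7.0}
    leader_of = {0: 0, 1: 0, 2: 2, 3: 2}  # agents 0 and 2 act as leaders
    print(hierarchical_message_passing(features, leader_of))
    print(count_links(leader_of))
```

With four agents in two groups, this toy hierarchy uses 3 links, compared with 6 for fully connected communication among the same agents, which gives a rough sense of why a hierarchical topology can reduce communication cost.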