SYSTEM AND METHOD FOR CONTROLLING INTER-AGENT COMMUNICATION IN A MULTI-AGENT SYSTEM
Format | Patent |
---|---|
Language | Chinese, English |
Published | 09.06.2023 |
Summary: | A policy model is provided for an agent in a multi-agent system; the model controls the agent's communication with the other agents in the system. The policy model is trained using multi-agent reinforcement learning (MARL). The policy model receives one or more messages from one or more other agents in the multi-agent system and generates a reward score based at least on a hidden state of the agent and the received messages. The reward score represents an aggregation of the value of sending a message for the task and the cost of sending it. The policy model determines whether to send a message based on the reward score. Upon determining to send, the policy model generates the message based on the hidden state of the agent and the one or more received messages, and sends it to one or more other agents in the multi-agent system. |
---|---|
Bibliography: | Application Number: CN202211311121 |
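
The send-or-stay-silent decision described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. The class name `GatedCommPolicy`, the linear weights, mean pooling of incoming messages, the "value minus cost" aggregation, the fixed `send_cost`, and the `threshold` are all assumptions made for illustration; the record only says that the reward score aggregates a value and a cost, and that the model is trained with MARL (training is not shown here).

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence

Vector = List[float]


def mean_pool(messages: Sequence[Vector], dim: int) -> Vector:
    """Average the incoming messages; return zeros if none arrived."""
    if not messages:
        return [0.0] * dim
    return [sum(m[i] for m in messages) / len(messages) for i in range(dim)]


@dataclass
class GatedCommPolicy:
    # Hypothetical stand-ins for parameters that MARL training would learn.
    value_weights: Vector          # weights over [hidden ; pooled messages]
    message_weights: List[Vector]  # one row per component of the outgoing message
    send_cost: float               # assumed fixed cost of sending one message
    threshold: float = 0.0         # assumed decision threshold

    def _features(self, hidden: Vector, messages: Sequence[Vector]) -> Vector:
        # Assumes each message has the same dimension as the hidden state.
        return list(hidden) + mean_pool(messages, len(hidden))

    def reward_score(self, hidden: Vector, messages: Sequence[Vector]) -> float:
        """Aggregate estimated value and cost (here, assumed: value - cost)."""
        features = self._features(hidden, messages)
        value = sum(w * x for w, x in zip(self.value_weights, features))
        return value - self.send_cost

    def step(self, hidden: Vector, messages: Sequence[Vector]) -> Optional[Vector]:
        """Return an outgoing message, or None if the agent stays silent."""
        if self.reward_score(hidden, messages) <= self.threshold:
            return None
        features = self._features(hidden, messages)
        return [
            math.tanh(sum(w * x for w, x in zip(row, features)))
            for row in self.message_weights
        ]


policy = GatedCommPolicy(
    value_weights=[1.0, 0.0, 0.5, 0.0],
    message_weights=[[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    send_cost=0.5,
)
hidden_state = [1.0, 2.0]
incoming = [[1.0, 0.0], [3.0, 0.0]]
outgoing = policy.step(hidden_state, incoming)  # a 2-component message, since the score exceeds the threshold
```

Raising `send_cost` in this sketch makes the agent stay silent more often, which is the trade-off the reward score is meant to capture.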