MARLIN: Multi-Agent Reinforcement Learning Guided by Language-Based Inter-Robot Negotiation
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 18.10.2024 |
Summary: Multi-agent reinforcement learning is a key method for training multi-robot systems over a series of episodes in which robots are rewarded or punished according to their performance; only once the system is trained to a suitable standard is it deployed in the real world. If the system is not trained enough, the task will likely not be completed and could pose a risk to the surrounding environment. Reaching high performance in a shorter training period can therefore significantly reduce time and resource consumption. We introduce Multi-Agent Reinforcement Learning guided by Language-based Inter-Robot Negotiation (MARLIN), which makes the training process both faster and more transparent. We equip robots with large language models that negotiate and debate the task, producing a plan that guides the policy during training. We dynamically switch between reinforcement learning and the negotiation-based approach throughout training, which increases training speed compared to standard multi-agent reinforcement learning and allows the system to be deployed to physical hardware earlier. Because the robots negotiate in natural language, we can better understand their behaviour both individually and as a collective. We compare the performance of our approach to that of multi-agent reinforcement learning and a large language model, showing that our hybrid method trains faster at little cost to performance.
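The summary describes dynamically switching, during training, between actions chosen by the learned multi-agent RL policies and actions drawn from a plan produced by LLM-based inter-robot negotiation. The Python sketch below illustrates only that switching loop under stated assumptions; `NegotiationPlanner`, `MARLPolicy`, the `env` interface, and the probabilistic switching gate are hypothetical placeholders, not the authors' implementation.

```python
# Illustrative sketch of switching between MARL actions and an
# LLM-negotiated plan during training. All names and interfaces here
# are assumptions made for the sketch, not taken from the paper.

import random


class NegotiationPlanner:
    """Placeholder for LLM-based inter-robot negotiation.

    A real planner would prompt each robot's language model, run a
    negotiation dialogue, and parse the agreed joint plan; here we
    return a fixed joint action so the sketch stays runnable.
    """

    def plan(self, observations):
        return {agent: 0 for agent in observations}


class MARLPolicy:
    """Placeholder multi-agent RL policy (e.g. independent learners)."""

    def act(self, observations):
        # Random actions stand in for the learned policy output.
        return {agent: random.randint(0, 3) for agent in observations}

    def update(self, observations, actions, rewards):
        # A real implementation would apply a Q-learning or
        # policy-gradient update here.
        pass


def train(env, policy, planner, episodes=100, negotiation_prob=0.5):
    """Train while switching between RL actions and negotiated plans.

    `negotiation_prob` is a stand-in for whatever dynamic switching
    criterion is used; the abstract only states that the switch is
    made dynamically throughout training.
    """
    for _ in range(episodes):
        observations = env.reset()
        done = False
        while not done:
            if random.random() < negotiation_prob:
                actions = planner.plan(observations)   # follow the negotiated plan
            else:
                actions = policy.act(observations)     # follow the learned policy
            next_obs, rewards, done = env.step(actions)
            policy.update(observations, actions, rewards)  # learn from either source
            observations = next_obs
```

The fixed probability gate is only a marker for where a dynamic switching criterion would plug in; the key point of the sketch is that the policy is updated on experience regardless of which source chose the actions.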
DOI: 10.48550/arxiv.2410.14383