Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models
Format: Journal Article
Language: English
Published: 04.11.2023
Summary: Reinforcement Learning (RL) plays an important role in the robotic manipulation domain since it allows self-learning from trial-and-error interactions with the environment. Still, sample efficiency and reward specification seriously limit its potential. One possible solution involves learning from expert guidance. However, obtaining a human expert is impractical due to the high cost of supervising an RL agent, and developing an automatic supervisor is a challenging endeavor. Large Language Models (LLMs) demonstrate remarkable abilities to provide human-like feedback on user inputs in natural language. Nevertheless, they are not designed to directly control low-level robotic motions, as their pretraining is based on vast internet data rather than specific robotics data. In this paper, we introduce the Lafite-RL (Language agent feedback interactive Reinforcement Learning) framework, which enables RL agents to learn robotic tasks efficiently by taking advantage of LLMs' timely feedback. Our experiments conducted on RLBench tasks illustrate that, with simple prompt design in natural language, the Lafite-RL agent exhibits improved learning capabilities when guided by an LLM. It outperforms the baseline in terms of both learning efficiency and success rate, underscoring the efficacy of the rewards provided by an LLM.
DOI: 10.48550/arxiv.2311.02379
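
Below is a minimal, self-contained Python sketch of the core idea described in the summary: an LLM's evaluative feedback on each transition is added to the environment's sparse reward while the agent learns. The toy chain environment, the tabular Q-learning agent, the `llm_feedback` placeholder, and the `llm_weight` coefficient are all illustrative assumptions for this sketch, not the paper's Lafite-RL implementation or prompt design.

```python
import random


def llm_feedback(task: str, transition: str) -> int:
    """Placeholder for prompting an LLM to judge the last transition.
    A real system would describe `task` and `transition` to an LLM in
    natural language and parse its verdict; here a random score in
    {-1, 0, 1} stands in for that judgement."""
    return random.choice([-1, 0, 1])


def train(episodes: int = 50, n_states: int = 6, alpha: float = 0.1,
          gamma: float = 0.9, epsilon: float = 0.1, llm_weight: float = 0.5):
    """Tabular Q-learning on a toy chain world. The LLM's score is added
    to the sparse environment reward (the weighting is an assumption)."""
    actions = (0, 1)  # 0: step left, 1: step right
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    for _ in range(episodes):
        state = 0
        while state != n_states - 1:  # rightmost state is the goal
            # epsilon-greedy action selection
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            next_state = min(max(state + (1 if action == 1 else -1), 0),
                             n_states - 1)
            env_reward = 1.0 if next_state == n_states - 1 else 0.0  # sparse task reward
            # ask the (placeholder) LLM to evaluate the transition
            score = llm_feedback("reach the rightmost state",
                                 f"moved from state {state} to {next_state}")
            reward = env_reward + llm_weight * score  # LLM feedback shapes the reward
            best_next = max(q[(next_state, a)] for a in actions)
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = next_state
    return q


if __name__ == "__main__":
    learned_q = train()
    print("greedy action in start state:",
          max((0, 1), key=lambda a: learned_q[(0, a)]))
```

In a real setup the placeholder judge would be replaced by a call to an actual LLM with a natural-language prompt describing the task and the observed transition, so that its verdict acts as an auxiliary reward signal alongside the environment's own.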