EnvBridge: Bridging Diverse Environments with Cross-Environment Knowledge Transfer for Embodied AI

Bibliographic Details
Main Authors: Kagaya, Tomoyuki, Lou, Yuxuan, Yuan, Thong Jing, Lakshmi, Subramanian, Karlekar, Jayashree, Pranata, Sugiri, Murakami, Natsuki, Kinose, Akira, Oguri, Koki, Wick, Felix, You, Yang
Format: Journal Article
Language: English
Published: 22.10.2024

Summary: In recent years, Large Language Models (LLMs) have demonstrated strong reasoning capabilities, drawing attention to their applications as agents in various decision-making processes. One notably promising application of LLM agents is robotic manipulation. Recent research has shown that LLMs can generate text plans or control code for robots, providing substantial flexibility and interaction capabilities. However, these methods still face challenges in flexibility and applicability across different environments, limiting their ability to adapt autonomously. Current approaches typically fall into two categories: those relying on environment-specific policy training, which restricts their transferability, and those generating code actions from fixed prompts, which leads to diminished performance in new environments. These limitations significantly constrain the generalizability of agents in robotic manipulation. To address them, we propose a novel method called EnvBridge, which retains robot control code that succeeded in source environments and transfers it to target environments. By leveraging insights from multiple environments, EnvBridge enhances the agent's adaptability and performance across diverse settings. Notably, our approach alleviates environmental constraints, offering a more flexible and generalizable solution for robotic manipulation tasks. We validate the effectiveness of our method on three robotic manipulation benchmarks: RLBench, MetaWorld, and CALVIN. Our experiments demonstrate that LLM agents can successfully leverage knowledge from diverse sources to solve complex tasks. Consequently, our approach significantly enhances the adaptability and robustness of robotic manipulation agents when planning across diverse environments.
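
The core mechanism the abstract describes, retaining control code that succeeded in source environments and prompting an LLM to adapt it to a target environment, can be sketched roughly as follows. This is a minimal illustration under assumed interfaces, not the paper's implementation; all names here (KnowledgeBase, llm_generate, execute) are hypothetical stand-ins.

    # Sketch of a cross-environment knowledge-transfer loop. All interfaces
    # (llm_generate, execute) are assumed placeholders, not the paper's API.
    from dataclasses import dataclass, field

    @dataclass
    class KnowledgeBase:
        """Stores robot control code that succeeded in source environments."""
        entries: list = field(default_factory=list)  # (env_name, task, code)

        def add(self, env_name: str, task: str, code: str) -> None:
            self.entries.append((env_name, task, code))

        def retrieve(self, task: str) -> list:
            # Naive retrieval: return code whose source task shares words with
            # the target task. A real system would use a smarter selector.
            words = set(task.lower().split())
            return [code for _, t, code in self.entries
                    if words & set(t.lower().split())]

    def transfer(kb: KnowledgeBase, target_env: str, task: str,
                 llm_generate, execute) -> bool:
        """Adapt stored source-environment code to a target environment.

        llm_generate(prompt) -> str and execute(env, code) -> bool are
        assumed hooks into an LLM and the target simulator, respectively.
        """
        examples = kb.retrieve(task)
        prompt = (
            f"Target environment: {target_env}\n"
            f"Task: {task}\n"
            "Control code that succeeded in other environments:\n"
            + "\n---\n".join(examples)
            + "\nRewrite this code to solve the task in the target environment."
        )
        candidate = llm_generate(prompt)
        success = execute(target_env, candidate)
        if success:
            # Successful transfers are retained, so the target environment
            # becomes a knowledge source for future environments.
            kb.add(target_env, task, candidate)
        return success

Under this reading, the knowledge base grows as the agent moves between benchmarks (e.g. from RLBench to MetaWorld), which is what lets a single agent reuse experience across otherwise incompatible environments.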
DOI:10.48550/arxiv.2410.16919