Solving Minimum-Cost Reach Avoid using Reinforcement Learning
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 29.10.2024 |
Subjects | |
Online Access | Get full text |
Summary: | Current reinforcement-learning methods are unable to directly learn
policies that solve the minimum-cost reach-avoid problem, which minimizes
cumulative costs subject to the constraints of reaching the goal and avoiding
unsafe states, as the structure of this optimization problem is incompatible
with current methods. Instead, a surrogate problem is solved in which all
objectives are combined with a weighted sum. However, this surrogate objective
results in suboptimal policies that do not directly minimize the cumulative
cost. In this work, we propose RC-PPO, a reinforcement-learning-based method
for solving the minimum-cost reach-avoid problem by using connections to
Hamilton-Jacobi reachability. Empirical results demonstrate that RC-PPO learns
policies with comparable goal-reaching rates while achieving up to 57% lower
cumulative costs compared to existing methods on a suite of minimum-cost
reach-avoid benchmarks on the MuJoCo simulator. The project page can be found
at https://oswinso.xyz/rcppo. |
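For context, the minimum-cost reach-avoid problem referred to in the summary can be sketched as a constrained objective; the notation below (stage cost $c$, state $x_t$, policy $\pi$, goal set $\mathcal{G}$, avoid set $\mathcal{F}$, horizon $T$) is illustrative and is not taken from the record itself:

$$
\min_{\pi}\ \mathbb{E}\!\left[\sum_{t=0}^{T} c\big(x_t, \pi(x_t)\big)\right]
\quad \text{s.t.}\quad \exists\, \tau \le T:\ x_\tau \in \mathcal{G}
\ \text{ and }\ x_t \notin \mathcal{F}\ \ \forall\, t \le \tau.
$$

The weighted-sum surrogate criticized in the summary instead optimizes an unconstrained combination such as

$$
\min_{\pi}\ \mathbb{E}\!\left[\sum_{t=0}^{T} c\big(x_t, \pi(x_t)\big)
+ \lambda_1\, d_{\mathcal{G}}(x_t)
+ \lambda_2\, \mathbf{1}\!\left[x_t \in \mathcal{F}\right]\right],
$$

with hand-tuned weights $\lambda_1, \lambda_2 \ge 0$ (here $d_{\mathcal{G}}$ is an assumed distance-to-goal penalty); such a combination generally does not recover the minimum-cost solution of the constrained problem.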
DOI: | 10.48550/arxiv.2410.22600 |