Continuous control actions learning and adaptation for robotic manipulation through reinforcement learning
Published in | Autonomous Robots, Vol. 46, No. 3, pp. 483-498 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | New York: Springer US, 01.03.2022 (Springer Nature B.V.) |
Subjects | |
Summary: | This paper presents a learning-based method that uses simulation data to learn an object manipulation task with two model-free reinforcement learning (RL) algorithms. Learning performance is compared between an on-policy and an off-policy algorithm: Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC). To accelerate the learning process, a fine-tuning procedure is proposed that demonstrates the continuous adaptation of on-policy RL to new environments, allowing the learned policy to adapt to and execute the (partially) modified task. A dense reward function is designed for the task to enable efficient learning by the agent. A grasping task involving a Franka Emika Panda manipulator is taken as the reference task to be learned. The learned control policy is shown to generalize across multiple object geometries and initial robot/part configurations. The approach is finally tested on a real Franka Emika Panda robot, showing that the behavior learned in simulation can be transferred to the real system. Experimental results show a 100% success rate on the grasping task, making the proposed approach applicable to real applications. |
---|---|
ISSN: | 0929-5593; 1573-7527 |
DOI: | 10.1007/s10514-022-10034-z |
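
The summary names two model-free algorithms (on-policy PPO and off-policy SAC), a dense reward shaped for the grasping task, and a fine-tuning step that adapts the learned policy to a modified environment. The sketch below illustrates those three ingredients with Stable-Baselines3 on a stand-in continuous-control environment; the environment id, reward terms, weights, and timestep counts are illustrative assumptions, not the authors' implementation.

```python
"""Illustrative sketch only: PPO vs. SAC training and policy fine-tuning with
Stable-Baselines3. The environment, dense-reward terms and all hyperparameters
here are assumptions, not the paper's code."""
import numpy as np
import gymnasium as gym
from stable_baselines3 import PPO, SAC


def dense_grasp_reward(ee_pos, obj_pos, gripper_closed, obj_height,
                       target_height=0.10, w_reach=1.0, w_grasp=0.5, w_lift=2.0):
    """One possible dense reward for grasp-and-lift: reward approaching the
    object, closing the gripper on it, and raising it toward a target height.
    (Illustrative shaping terms; the paper's exact reward is not reproduced.)"""
    reach_dist = np.linalg.norm(np.asarray(ee_pos) - np.asarray(obj_pos))
    r_reach = -w_reach * reach_dist                      # denser as the gripper approaches
    grasped = gripper_closed and reach_dist < 0.02       # crude "object in gripper" test
    r_grasp = w_grasp if grasped else 0.0
    r_lift = w_lift * min(obj_height / target_height, 1.0) if grasped else 0.0
    return r_reach + r_grasp + r_lift


# Stand-in continuous-action task; the paper trains a simulated Franka Emika Panda.
env = gym.make("Pendulum-v1")

# Train the on-policy (PPO) and off-policy (SAC) agents on the same task.
for Algo in (PPO, SAC):
    model = Algo("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=10_000)                  # short run for illustration
    model.save(f"{Algo.__name__.lower()}_grasp_policy")

# Fine-tuning idea from the summary: reload the on-policy agent and continue
# training it in a (partially) modified environment so the policy adapts.
modified_env = gym.make("Pendulum-v1")                   # would be the modified task
model = PPO.load("ppo_grasp_policy", env=modified_env)
model.learn(total_timesteps=5_000)
```

In the paper's setting, the simulated manipulator environment would compute the dense reward internally from gripper and object states; the standalone function above only sketches what such shaping terms can look like.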