Continuous control actions learning and adaptation for robotic manipulation through reinforcement learning
Published in | Autonomous Robots, Vol. 46, No. 3, pp. 483-498 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | New York: Springer US, 01.03.2022 (Springer Nature B.V.) |
Subjects | |
Summary: | This paper presents a learning-based method that uses simulation data to learn an object manipulation task with two model-free reinforcement learning (RL) algorithms. Learning performance is compared between an on-policy and an off-policy algorithm: Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC). To accelerate the learning process, a fine-tuning procedure is proposed that demonstrates the continuous adaptation of on-policy RL to new environments, allowing the learned policy to adapt to and execute the (partially) modified task. A dense reward function is designed for the task to enable efficient learning by the agent. A grasping task involving a Franka Emika Panda manipulator is taken as the reference task to be learned. The learned control policy is shown to generalize across multiple object geometries and initial robot/part configurations. The approach is finally tested on a real Franka Emika Panda robot, showing that the behavior learned in simulation can be transferred to the real system. Experimental results show a 100% success rate on the grasping task, making the proposed approach applicable to real applications. |
---|---|
ISSN: | 0929-5593; 1573-7527 |
DOI: | 10.1007/s10514-022-10034-z |
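
The summary names two model-free algorithms (on-policy PPO and off-policy SAC), a dense reward shaped for the grasping task, and a fine-tuning step that adapts the learned policy to a modified environment. The sketch below illustrates those three ingredients with Stable-Baselines3 on a stand-in continuous-control environment; the environment id, reward terms, weights, and timestep counts are illustrative assumptions, not the authors' implementation.

```python
"""Illustrative sketch only: PPO vs. SAC training and policy fine-tuning with
Stable-Baselines3. The environment, dense-reward terms and all hyperparameters
here are assumptions, not the paper's code."""
import numpy as np
import gymnasium as gym
from stable_baselines3 import PPO, SAC


def dense_grasp_reward(ee_pos, obj_pos, gripper_closed, obj_height,
                       target_height=0.10, w_reach=1.0, w_grasp=0.5, w_lift=2.0):
    """One possible dense reward for grasp-and-lift: reward approaching the
    object, closing the gripper on it, and raising it toward a target height.
    (Illustrative shaping terms; the paper's exact reward is not reproduced.)"""
    reach_dist = np.linalg.norm(np.asarray(ee_pos) - np.asarray(obj_pos))
    r_reach = -w_reach * reach_dist                      # denser as the gripper approaches
    grasped = gripper_closed and reach_dist < 0.02       # crude "object in gripper" test
    r_grasp = w_grasp if grasped else 0.0
    r_lift = w_lift * min(obj_height / target_height, 1.0) if grasped else 0.0
    return r_reach + r_grasp + r_lift


# Stand-in continuous-action task; the paper trains a simulated Franka Emika Panda.
env = gym.make("Pendulum-v1")

# Train the on-policy (PPO) and off-policy (SAC) agents on the same task.
for Algo in (PPO, SAC):
    model = Algo("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=10_000)                  # short run for illustration
    model.save(f"{Algo.__name__.lower()}_grasp_policy")

# Fine-tuning idea from the summary: reload the on-policy agent and continue
# training it in a (partially) modified environment so the policy adapts.
modified_env = gym.make("Pendulum-v1")                   # would be the modified task
model = PPO.load("ppo_grasp_policy", env=modified_env)
model.learn(total_timesteps=5_000)
```

In the paper's setting, the simulated manipulator environment would compute the dense reward internally from gripper and object states; the standalone function above only sketches what such shaping terms can look like.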