An Inverse Kinematics Solution for a Series-Parallel Hybrid Banana-Harvesting Robot Based on Deep Reinforcement Learning

Bibliographic Details
Published in	Agronomy (Basel) Vol. 12; no. 9; p. 2157
Main Authors	Lin, Guichao, Huang, Peichen, Wang, Minglong, Xu, Yao, Zhang, Rihong, Zhu, Lixue
Format	Journal Article
Language	English
Published	Basel MDPI AG 01.09.2022
Subjects	Algorithms; banana-harvesting robot; Bananas; Data mining; Deep learning; deep reinforcement learning; Field tests; Fruits; Harvesting; Inverse kinematics; Kinematics; Learning algorithms; Machine learning; Optimization techniques; Reinforcement; Robotics; Robots; series-parallel hybrid robot; twin-delayed deep deterministic policy gradient; Workspace; China

Summary:	A series-parallel hybrid banana-harvesting robot was previously developed to pick bananas, but its inverse kinematics problem is intractable. This paper investigates a deep reinforcement learning-based inverse kinematics solution to guide the banana-harvesting robot toward a specified target. Because deep reinforcement learning algorithms always struggle to explore huge robot workspaces, a practical technique called automatic goal generation is first developed. It draws random targets from a dynamic uniform distribution with increasing randomness, which helps deep reinforcement learning algorithms explore the entire robot workspace. Automatic goal generation is then applied to a state-of-the-art deep reinforcement learning algorithm, the twin-delayed deep deterministic policy gradient, to learn an effective inverse kinematics solution. Simulation experiments show that with automatic goal generation, the twin-delayed deep deterministic policy gradient solved the inverse kinematics problem with a success rate of 96.1% and an average running time of 23.8 milliseconds; without automatic goal generation, the success rate was just 81.2%. Field experiments show that the proposed method successfully guided the robot to approach all targets. These results demonstrate that automatic goal generation enables deep reinforcement learning to effectively explore the robot workspace and to learn a robust and efficient inverse kinematics policy, which can therefore be applied to the developed series-parallel hybrid banana-harvesting robot.
ISSN:	2073-4395
DOI:	10.3390/agronomy12092157
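
The summary describes automatic goal generation as drawing targets from a uniform distribution whose randomness increases during training. The sketch below illustrates that general idea only. It is not the authors' implementation: the class name `GrowingUniformGoalSampler`, the linear growth schedule, the `min_scale` floor, and the workspace dimensions are all assumptions made for illustration, and the paper may use a different schedule or parameterization.

```python
import numpy as np


class GrowingUniformGoalSampler:
    """Hypothetical sampler: uniform targets in a box that widens over training."""

    def __init__(self, center, half_extent, grow_steps, min_scale=0.1, seed=0):
        self.center = np.asarray(center, dtype=float)            # assumed box center
        self.half_extent = np.asarray(half_extent, dtype=float)  # assumed full half-widths
        self.grow_steps = int(grow_steps)   # samples needed to reach the full box
        self.min_scale = float(min_scale)   # initial fraction of the full box
        self.rng = np.random.default_rng(seed)
        self.step = 0

    def scale(self):
        # Linear ramp (an assumption) from min_scale up to 1.0.
        ramp = self.step / self.grow_steps
        return min(1.0, max(self.min_scale, ramp))

    def sample(self):
        # Draw one target uniformly from the current, possibly shrunken, box.
        s = self.scale()
        low = self.center - s * self.half_extent
        high = self.center + s * self.half_extent
        goal = self.rng.uniform(low, high)
        self.step += 1
        return goal


# Usage: early targets stay near the center, later ones cover the whole box.
sampler = GrowingUniformGoalSampler(
    center=[0.0, 0.0, 1.0],       # hypothetical coordinates
    half_extent=[1.0, 1.0, 0.5],  # hypothetical workspace half-widths
    grow_steps=100,
)
early_goal = sampler.sample()
for _ in range(200):
    late_goal = sampler.sample()
```

In a training loop, each episode would request one target from `sample()`, so a policy such as the twin-delayed deep deterministic policy gradient first practices on nearby targets before facing the whole workspace.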