Taming an Autonomous Surface Vehicle for Path Following and Collision Avoidance Using Deep Reinforcement Learning

Bibliographic Details
Published in	IEEE Access, Vol. 8, pp. 41466-41481
Main Authors	Meyer, Eivind; Robinson, Haakon; Rasheed, Adil; San, Omer
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2020
Subjects	Algorithms; autonomous surface vehicle; Collision avoidance; Control tasks; Cybernetics; Deep learning; Deep reinforcement learning; Earth; Machine learning; machine learning controller; Moving obstacles; Obstacle avoidance; Optimization; path following; Range finders; Sensors; Surface vehicles; Toolkits; Trajectory planning; Vehicle dynamics

More Information
Summary:	In this article, we explore the feasibility of applying proximal policy optimization, a state-of-the-art deep reinforcement learning algorithm for continuous control tasks, to the dual-objective problem of controlling an underactuated autonomous surface vehicle so that it follows an a priori known path while avoiding collisions with non-moving obstacles along the way. The AI agent, which is equipped with multiple rangefinder sensors for obstacle detection, is trained and evaluated in a challenging, stochastically generated simulation environment based on the OpenAI Gym Python toolkit. Notably, the agent is provided with real-time insight into its own reward function, allowing it to dynamically adapt its guidance strategy. Depending on its strategy, which ranges from radical path adherence to radical obstacle avoidance, the trained agent achieves an episodic success rate close to 100%.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2020.2976586
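
The summary mentions that the agent sees its own reward function in real time and can shift between path adherence and obstacle avoidance. The record does not give the reward formulation, so the following is only a minimal sketch of how such a trade-off could look, not the authors' method. The parameter `lam`, the decay constant `gamma_e`, the sensor range `max_range`, and all function names are hypothetical.

```python
import math


def path_reward(cross_track_error, heading_error, gamma_e=0.05):
    """Hypothetical path-following term in [0, 1].

    Equals 1 when the vessel is on the path and aligned with it,
    and decays with distance from the path and with heading error.
    """
    distance_term = math.exp(-gamma_e * abs(cross_track_error))
    heading_term = (1.0 + math.cos(heading_error)) / 2.0
    return distance_term * heading_term


def obstacle_penalty(ranges, max_range=150.0):
    """Hypothetical penalty in [0, 1] driven by the closest rangefinder reading.

    Equals 0 when nothing is detected within max_range and 1 at contact.
    """
    closest = min(min(r, max_range) for r in ranges)
    closeness = 1.0 - closest / max_range
    return closeness ** 2


def blended_reward(cross_track_error, heading_error, ranges, lam):
    """Blend both objectives with a trade-off parameter lam in [0, 1].

    lam = 1 rewards path adherence only; lam = 0 penalizes obstacle
    proximity only. This blending rule is an assumption for illustration.
    """
    return (lam * path_reward(cross_track_error, heading_error)
            - (1.0 - lam) * obstacle_penalty(ranges))


def observation(cross_track_error, heading_error, ranges, lam, max_range=150.0):
    """Build an observation vector that exposes lam to the policy.

    Appending lam is one possible way to give the agent insight into
    the reward it is currently being scored with.
    """
    scaled = [min(r, max_range) / max_range for r in ranges]
    return [cross_track_error, heading_error, *scaled, lam]
```

With a vector like this, a policy trained over randomly sampled values of `lam` could in principle be steered at run time by changing only that last observation entry.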