Evaluation of Q-learning for search and inspect missions using underwater vehicles
Published in | 2014 Oceans - St. John's, pp. 1 - 6 |
---|---|
Main Authors | , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.09.2014 |
Summary: | An application of offline Reinforcement Learning in the underwater domain is proposed. We present and evaluate the integration of the Q-learning algorithm into an Autonomous Underwater Vehicle (AUV) for learning the action-value function in simulation. Three separate experiments are presented. The first compares two search policies, ε-least-visited and random action, with respect to convergence time. The second examines the effect of the learning discount factor, γ, on the convergence time of the ε-least-visited search policy. The final experiment validates the use of a policy learnt offline on a real AUV. This learning phase occurs offline within the continuous simulation environment, which was discretized into a grid-world learning problem. The presented results show the system's convergence to a globally optimal solution whilst following both sub-optimal policies during simulation. Future work is introduced, after discussion of our results, to enable the system to be used in a real-world application. The results presented therefore form the basis for future comparative analysis of necessary improvements, such as function approximation of the state space. |
---|---|
ISSN: | 0197-7385 |
DOI: | 10.1109/OCEANS.2014.7003088 |
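The abstract describes tabular Q-learning on a grid-world discretization of the simulation environment, with an "ε-least-visited" exploration policy and a discount factor γ. A minimal sketch of that setup is below; the grid size, reward shaping, and the exact form of the least-visited policy (with probability ε, take the action tried fewest times in the current state, otherwise act greedily) are all illustrative assumptions, not details taken from the paper.

```python
import random

# Toy grid world standing in for the discretized AUV environment (assumed).
SIZE = 4
GOAL = (SIZE - 1, SIZE - 1)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # E, W, S, N

def step(state, a):
    """Deterministic transition; moves into walls are clamped."""
    x, y = state
    dx, dy = ACTIONS[a]
    nxt = (min(max(x + dx, 0), SIZE - 1), min(max(y + dy, 0), SIZE - 1))
    reward = 1.0 if nxt == GOAL else -0.01  # assumed reward shaping
    return nxt, reward, nxt == GOAL

def train(episodes=3000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    Q = {}       # (state, action) -> value, default 0.0
    visits = {}  # (state, action) -> count, drives least-visited exploration
    for _ in range(episodes):
        s, done, steps = (0, 0), False, 0
        while not done and steps < 100:
            if rng.random() < epsilon:
                # "least visited": the action tried fewest times in this state
                a = min(range(4), key=lambda b: visits.get((s, b), 0))
            else:
                a = max(range(4), key=lambda b: Q.get((s, b), 0.0))
            visits[(s, a)] = visits.get((s, a), 0) + 1
            s2, r, done = step(s, a)
            best_next = max(Q.get((s2, b), 0.0) for b in range(4))
            # Standard Q-learning update: Q += alpha * (TD error)
            Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (
                r + gamma * best_next - Q.get((s, a), 0.0))
            s = s2
            steps += 1
    return Q

def greedy_path(Q, max_len=20):
    """Follow the learnt policy from the start state."""
    s, path = (0, 0), [(0, 0)]
    while s != GOAL and len(path) < max_len:
        a = max(range(4), key=lambda b: Q.get((s, b), 0.0))
        s, _, _ = step(s, a)
        path.append(s)
    return path

if __name__ == "__main__":
    Q = train()
    print(greedy_path(Q))
```

With the small step penalty, the greedy policy extracted after training traces a shortest path to the goal corner, illustrating the convergence-to-optimal behaviour the abstract reports even under sub-optimal exploration policies.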