On the assessment of reinforcement learning techniques for path planning of skid-steer mobile robots subject to terrain constraints

Bibliographic Details
Published in: IECON 2024 - 50th Annual Conference of the IEEE Industrial Electronics Society, pp. 1 - 7
Main Authors: Dawson, Kevin; Menendez, Oswaldo; Camacho, Christian; Torres-Torriti, Miguel; Prado, Alvaro
Format: Conference Proceeding
Language: English
Published: IEEE, 03.11.2024
DOI: 10.1109/IECON55916.2024.10905509

Summary: This paper studies the application of Reinforcement Learning (RL) techniques for path planning of Skid-Steer Mobile Robots (SSMRs) operating in complex environments with obstacles and terrain constraints. Building on the well-known characteristics of traditional RL techniques, the path-planning strategy is designed using Q-Learning (QL), Deep Q-Networks (DQN), and the Deep Deterministic Policy Gradient (DDPG) technique. The proposed strategy addresses challenges such as large navigation maps, high-dimensional state spaces, static and dynamic obstacles, and wheel-terrain interactions under slip conditions. By integrating information from navigation maps, range-sensor data, and robot kinematics, the strategy controls the traction and turning actions of the SSMR. It was first trained with an SSMR dynamic model designed to characterize real-world conditions, simulated in environments resembling open-pit mines, and then field-tested on a Cat® 262C loader. Across several trials, the tested RL algorithms were shown to plan reachable paths and achieve robust performance. In particular, the QL and DQN approaches proved effective when maneuvering in structured regions with predictable workspaces, whereas DDPG excelled at adapting to changing slippery scenarios, achieving a success rate of 98.3%, an average path efficiency of 83.8%, and consistently higher cumulative rewards than QL and DQN. These findings are expected to contribute to energy savings for robots operating along optimized paths in mining environments.
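
As a hedged illustration of the tabular Q-Learning variant named in the summary, the sketch below trains a grid-world path planner with an epsilon-greedy policy and the standard temporal-difference update. The map size, obstacle layout, reward values, and hyperparameters are hypothetical placeholders, not the configuration used by the authors.

# Minimal tabular Q-Learning sketch for grid-based path planning.
# Map size, obstacles, rewards, and hyperparameters are illustrative
# placeholders, not the configuration used in the paper.
import numpy as np

rng = np.random.default_rng(0)

ROWS, COLS = 10, 10                            # hypothetical navigation map
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
GOAL = (ROWS - 1, COLS - 1)
OBSTACLES = {(4, 4), (4, 5), (5, 4)}           # static obstacles

alpha, gamma, eps = 0.1, 0.95, 0.1             # learning rate, discount, exploration
Q = np.zeros((ROWS, COLS, len(ACTIONS)))       # tabular action-value function

def step(state, a):
    """Apply action a; penalize collisions, reward reaching the goal."""
    r, c = state
    dr, dc = ACTIONS[a]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < ROWS and 0 <= nc < COLS) or (nr, nc) in OBSTACLES:
        return state, -5.0, False              # blocked: stay put, collision penalty
    if (nr, nc) == GOAL:
        return (nr, nc), 100.0, True           # goal reached
    return (nr, nc), -1.0, False               # step cost favors short paths

for episode in range(2000):
    state = (0, 0)
    for t in range(500):                       # cap episode length
        # epsilon-greedy action selection
        if rng.random() < eps:
            a = int(rng.integers(len(ACTIONS)))
        else:
            a = int(np.argmax(Q[state[0], state[1]]))
        nxt, reward, done = step(state, a)
        # standard Q-Learning temporal-difference update
        td_target = reward + gamma * np.max(Q[nxt[0], nxt[1]]) * (not done)
        Q[state[0], state[1], a] += alpha * (td_target - Q[state[0], state[1], a])
        state = nxt
        if done:
            break

A greedy rollout over the learned table then traces the planned path. DQN replaces the table with a neural-network approximator to handle the large state spaces mentioned in the summary, and DDPG extends the same idea to continuous actions such as the traction and turning commands of the SSMR.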