A controlled investigation of behaviorally-cloned deep neural network behaviors in an autonomous steering task
Imitation learning (IL) is a popular method used to train machine learning models that are capable of acting on their environment based on expert examples. Two types of IL models are inverse reinforcement learning (IRL) and behavioral cloning (BC). Models trained under IRL traditionally perform bett...
Saved in:
Published in | Robotics and autonomous systems Vol. 142; p. 103780 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published |
Elsevier B.V
01.08.2021
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Imitation learning (IL) is a popular method used to train machine learning models that are capable of acting on their environment based on expert examples. Two types of IL models are inverse reinforcement learning (IRL) and behavioral cloning (BC). Models trained under IRL traditionally perform better than those trained under BC due to compounding covariate shift associated with the latter, which typically requires algorithms such as DAGGer to help compensate for this. More recently, however, deep learning architectures with increased generalization performance have been developed, which may help to alleviate the problem of compounding covariate shift and allow researchers to take advantage of the simplicity of BC. Despite these developments, recent studies on BC in sub-scale autonomous robots employ relatively primitive convolutional networks without such tools as batch normalization and skip connections, and it is difficult to judge their networks’ performance relative to others due to drastically different training and testing conditions. Here, we examine how an array of artificial neural networks, chosen to reflect more recent architectural choices available, behave in a highly controlled IL task – navigating around a small, indoor racetrack – upon being embedded in a sub-scale RC vehicle as an end-to-end steering system. For our main findings, we report the lap completion rate and path smoothness of each network under the exact same conditions as it controls the vehicle on the track. To supplement these findings, we also measure each network’s bias toward the distribution of the training actions and develop a method to highlight regions of a given input image that are deemed ‘important’ to a given network. We observe that most of the more recent neural networks perform reasonably well during testing, as opposed to the more primitive networks which did not perform as well. For these reasons and others, we identify VGG-16 and AlexNet – out of the networks tested here – as attractive candidate architectures for such tasks. |
---|---|
ISSN: | 0921-8890 1872-793X |
DOI: | 10.1016/j.robot.2021.103780 |