Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees

Bibliographic Details
Published in	Machine Learning and Knowledge Discovery in Databases Vol. 11052; pp. 414-429
Main Authors	Liu, Guiliang; Schulte, Oliver; Zhu, Wang; Li, Qingcan
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2019
Series	Lecture Notes in Computer Science

More Information
Summary:	Deep Reinforcement Learning (DRL) has achieved impressive success in many applications. A key component of many DRL models is a neural network representing a Q function, to estimate the expected cumulative reward following a state-action pair. The Q function neural network contains a lot of implicit knowledge about the RL problems, but often remains unexamined and uninterpreted. To our knowledge, this work develops the first mimic learning framework for Q functions in DRL. We introduce Linear Model U-trees (LMUTs) to approximate neural network predictions. An LMUT is learned using a novel on-line algorithm that is well-suited for an active play setting, where the mimic learner observes an ongoing interaction between the neural net and the environment. Empirical evaluation shows that an LMUT mimics a Q function substantially better than five baseline methods. The transparent tree structure of an LMUT facilitates understanding the network’s learned strategic knowledge by analyzing feature influence, extracting rules, and highlighting the super-pixels in image inputs. Code related to this paper is available at: https://github.com/Guiliang/uTree_mimic_mountain_car.
Bibliography:	O. Schulte: Supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada. This research utilized Titan X GPUs donated by the NVIDIA Corporation.
ISBN:	3030109275 9783030109271
ISSN:	0302-9743 1611-3349
DOI:	10.1007/978-3-030-10928-8_25