Off-Policy Differentiable Logic Reinforcement Learning
Published in | Machine Learning and Knowledge Discovery in Databases. Research Track, Vol. 12976, pp. 617-632 |
Main Authors | , , , |
Format | Book Chapter |
Language | English |
Published | Switzerland: Springer International Publishing AG, 2021 |
Series | Lecture Notes in Computer Science |
Summary: | In this paper, we propose an Off-Policy Differentiable Logic Reinforcement Learning (OPDLRL) framework that inherits the interpretability and generalization ability of Differentiable Inductive Logic Programming (DILP) while resolving its weaknesses in execution efficiency, stability, and scalability. The key contributions include the use of approximate inference to significantly reduce the number of logic rules in the deduction process, an off-policy training method to enable approximate inference, and a distributed and hierarchical training framework. Extensive experiments, in particular playing real-time video games in Rabbids against human players, show that OPDLRL achieves better or comparable performance to other DILP-based methods while being far more practical in terms of sample efficiency and execution efficiency, making it applicable to complex and (near) real-time domains. |
ISBN: | 3030865193 9783030865191 |
ISSN: | 0302-9743 1611-3349 |
DOI: | 10.1007/978-3-030-86520-7_38 |
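The summary above mentions using approximate inference to reduce the number of logic rules evaluated during deduction. The paper's exact formulation is not given in this record, so the following is only a minimal, hypothetical sketch of what such an approximation might look like in a DILP-style setting: a differentiable deduction step that keeps only the top-k highest-weighted candidate clauses. All names (`deduction_step`, `clause_valuations`, `clause_logits`, `k`) and shapes are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch: approximate inference in a DILP-style deduction step.
# Instead of combining every candidate clause, keep only the top-k clauses by
# learned weight. Not the paper's actual formulation; names/shapes are assumed.
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def deduction_step(valuation, clause_valuations, clause_logits, k=3):
    """One approximate deduction step.

    valuation:         current truth values of ground atoms, shape (A,)
    clause_valuations: per-clause candidate conclusions, shape (C, A)
    clause_logits:     learnable clause weights, shape (C,)
    k:                 number of clauses kept (the approximation)
    """
    weights = softmax(clause_logits)
    # Approximate inference: zero out all but the k highest-weighted clauses,
    # then renormalize the surviving weights.
    keep = np.argsort(weights)[-k:]
    mask = np.zeros_like(weights)
    mask[keep] = weights[keep]
    mask /= mask.sum()
    # Weighted combination of the surviving clauses' conclusions, merged with
    # the previous valuation via probabilistic sum (a common fuzzy-or choice).
    update = mask @ clause_valuations
    return valuation + update - valuation * update

# Toy usage: 5 ground atoms, 8 candidate clauses.
rng = np.random.default_rng(0)
v = rng.uniform(size=5)
cv = rng.uniform(size=(8, 5))
logits = rng.normal(size=8)
print(deduction_step(v, cv, logits, k=3))
```

Pruning to k clauses makes each deduction step cost O(k·A) instead of O(C·A), which is one plausible reading of how the abstract's "approximate inference" could improve execution efficiency.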