Preliminary Results Towards Reinforcement Learning with Mixed-Signal Memristive Neuromorphic Circuits

Bibliographic Details
Published in: 2019 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1-5
Main Authors: Wu, Nan; Vincent, Adrien F.; Strukov, Dmitri
Format: Conference Proceeding
Language: English
Published: IEEE, 01.05.2019
Summary: As the end of Moore's law seems to be imminent, emerging technologies that enable high-performance neuromorphic hardware systems are attracting increasing attention. A very promising approach is to utilize memristors, programmable nonvolatile memory devices, as synaptic weights in neuromorphic circuits. One of the challenges for memristive hardware with integrated learning capabilities is the prohibitively large number of write cycles that might be required during the learning process. In this work, we propose a memristive neuromorphic hardware implementation for reinforcement learning based on the temporal-difference actor-critic algorithm. As a case study, we consider the task of balancing an inverted pendulum, a classical problem in both reinforcement learning and control theory. We introduce training techniques that significantly reduce the number of weight updates and are suitable for efficient in-situ learning hardware implementations. We believe that this study shows the promise of using memristor-based hardware neural networks for handling complex tasks through in-situ reinforcement learning.
ISBN: 9781728103976, 1728103975
ISSN: 2158-1525
DOI: 10.1109/ISCAS.2019.8702229