Training a Probabilistic Graphical Model With Resistive Switching Electronic Synapses

Bibliographic Details
Published in: IEEE Transactions on Electron Devices, Vol. 63, No. 12, pp. 5004-5011
Main Authors: Eryilmaz, Sukru Burc; Neftci, Emre; Joshi, Siddharth; Kim, SangBum; BrightSky, Matthew; Lung, Hsiang-Lan; Lam, Chung; Cauwenberghs, Gert; Wong, Hon-Sum Philip
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.12.2016

Summary: Current large-scale implementations of deep learning and data mining require thousands of processors, massive amounts of off-chip memory, and consume gigajoules of energy. New memory technologies, such as nanoscale two-terminal resistive switching memory devices, offer a compact, scalable, and low-power alternative that permits on-chip colocated processing and memory in a fine-grained, distributed parallel architecture. Here, we report the first use of resistive memory devices for implementing and training a restricted Boltzmann machine (RBM), a generative probabilistic graphical model that is a key component of unsupervised learning in deep networks. We experimentally demonstrate a 45-synapse RBM realized with 90 resistive phase change memory (PCM) elements and trained with a bioinspired variant of the contrastive divergence algorithm that implements Hebbian and anti-Hebbian weight updates. Over 30 training epochs, the PCM devices show a twofold to tenfold reduction in error rate on a missing-pixel pattern-completion task compared with the untrained case. Measured programming energy consumption is 6.1 nJ per epoch with the PCM devices, ~150 times lower than in conventional processor-memory systems. We analyze and discuss the dependence of learning performance on cycle-to-cycle variations and on the number of gradual levels in the PCM analog memory devices.
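
As a rough illustration of the training scheme the summary describes, the Python sketch below runs one contrastive-divergence (CD-1) step per epoch for a small RBM, applying sign-based, fixed-size programming pulses to a differential pair of conductances per synapse (one plausible reading of 45 synapses realized with 90 devices). Everything not stated in the summary is an assumption for illustration: the 9x5 layer split, the number of gradual levels, the pulse size, the multiplicative noise model for cycle-to-cycle variation, and the omission of bias units. It is a minimal sketch, not the paper's exact pulse scheme or circuit.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    # Bernoulli sample of binary unit states from activation probabilities
    return (rng.uniform(size=p.shape) < p).astype(float)

# Hypothetical sizes: a 9-visible x 5-hidden RBM gives 45 synapses.
n_visible, n_hidden = 9, 5
n_levels = 100            # assumed number of gradual conductance levels
delta = 2.0 / n_levels    # assumed fixed conductance increment per pulse
sigma = 0.1               # assumed cycle-to-cycle variation of pulse size

# Differential encoding: weight w = g_pos - g_neg, so 45 synapses use
# 90 devices, matching the device count in the summary.
g_pos = rng.uniform(0.0, 1.0, size=(n_visible, n_hidden))
g_neg = rng.uniform(0.0, 1.0, size=(n_visible, n_hidden))

def cd1_step(v0):
    """One CD-1 update with Hebbian/anti-Hebbian pulse programming."""
    global g_pos, g_neg
    w = g_pos - g_neg
    h0 = sample(sigmoid(v0 @ w))          # data-driven (Hebbian) phase
    v1 = sample(sigmoid(h0 @ w.T))        # one-step reconstruction
    h1 = sigmoid(v1 @ w)                  # model-driven (anti-Hebbian) phase
    grad = np.outer(v0, h0) - np.outer(v1, h1)
    # One fixed-size pulse per device in the direction of the gradient sign,
    # with multiplicative noise standing in for cycle-to-cycle variation.
    noise = 1.0 + sigma * rng.standard_normal(grad.shape)
    g_pos += delta * noise * (grad > 0)
    g_neg += delta * noise * (grad < 0)
    np.clip(g_pos, 0.0, 1.0, out=g_pos)   # conductances stay in device range
    np.clip(g_neg, 0.0, 1.0, out=g_neg)

# Toy run: 30 epochs on one binary pattern, mirroring the summary's setup.
pattern = rng.integers(0, 2, size=n_visible).astype(float)
for _ in range(30):
    cd1_step(pattern)

Shrinking n_levels in this sketch coarsens the achievable weight values, while raising sigma injects more update noise, which gives a rough feel for the two device nonidealities whose effect on learning performance the paper analyzes.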
ISSN: 0018-9383; 1557-9646
DOI: 10.1109/TED.2016.2616483