Handling Stuck-at-Fault Defects Using Matrix Transformation for Robust Inference of DNNs


Bibliographic Details
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 39, No. 10, pp. 2448-2460
Main Authors: Zhang, Baogang; Uysal, Necati; Fan, Deliang; Ewetz, Rickard
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.10.2020
Summary: Matrix-vector multiplication is the dominating computational workload in the inference phase of deep neural networks (DNNs). Memristor crossbar arrays (MCAs) can efficiently perform matrix-vector multiplication in the analog domain. A key challenge is that memristor devices may suffer from stuck-at-fault defects, which can severely degrade classification accuracy. Earlier studies have shown that the accuracy loss can be recovered by utilizing additional hardware or hardware-aware training. In this article, we propose a framework that handles stuck-at-faults using matrix transformations, called the MT framework. The framework introduces a cost metric that captures the negative impact of the stuck-at-fault defects. The cost metric is then minimized by applying matrix transformations T, where a transformation T changes a weight matrix W into a new weight matrix W̃ = T(W). In particular, a row flipping transformation, a permutation transformation, and a value range transformation are proposed. The row flipping transformation translates stuck-off (stuck-on) faults into stuck-on (stuck-off) faults. The permutation transformation maps small (large) weights to memristors stuck-off (stuck-on). The value range transformation reduces the magnitude of the smallest and largest elements in the weight matrices, so that the stuck-at-faults introduce smaller errors. The experimental results demonstrate that the MT framework is capable of recovering 99% of the accuracy loss on both the MNIST and CIFAR-10 datasets without utilizing hardware-aware training.
The accuracy improvements come at the expense of an 8.19x and 9.23x overhead in power and area, respectively. Nevertheless, the overhead can be reduced by up to 50% by leveraging hardware-aware training.
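The row flipping transformation described in the summary can be sketched in NumPy as follows. This is a hypothetical illustration, not the paper's exact formulation: the fault model (cells stuck at weight 0 or 1), the absolute-error cost metric, and all function names are assumptions. The key idea it demonstrates is that negating a row is lossless for inference (the output is re-negated), yet may move intended weights closer to the values the defective cells are stuck at.

```python
import numpy as np

# Hypothetical cost metric: total deviation between the intended weights
# and the values the defective cells are stuck at (an assumption; the
# paper defines its own cost metric).
def fault_cost(W, fault_mask, fault_vals):
    return np.abs(W - fault_vals)[fault_mask].sum()

def flip_rows(W, fault_mask, fault_vals):
    """Negate each row whose flipped version incurs a lower fault cost.
    The per-row signs are recorded so the crossbar output can be
    corrected after the analog matrix-vector multiplication."""
    signs = np.ones(W.shape[0])
    Wt = W.copy()
    for i in range(W.shape[0]):
        keep = np.abs(W[i] - fault_vals[i])[fault_mask[i]].sum()
        flip = np.abs(-W[i] - fault_vals[i])[fault_mask[i]].sum()
        if flip < keep:
            Wt[i] = -W[i]
            signs[i] = -1.0
    return Wt, signs

rng = np.random.default_rng(0)
W = rng.uniform(-1.0, 1.0, (4, 4))
fault_mask = rng.random((4, 4)) < 0.2                       # ~20% defective cells
fault_vals = np.where(rng.random((4, 4)) < 0.5, 1.0, 0.0)   # stuck-on / stuck-off

Wt, signs = flip_rows(W, fault_mask, fault_vals)

x = rng.uniform(0.0, 1.0, 4)
# Inference uses the faulty array; flipped rows are re-negated at the output:
y = signs * (np.where(fault_mask, fault_vals, Wt) @ x)
```

Because Wt is just W with some rows negated, `signs * (Wt @ x)` reproduces `W @ x` exactly on a fault-free array; the transformation only changes which stuck values the defective cells end up approximating.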
ISSN: 0278-0070, 1937-4151
DOI: 10.1109/TCAD.2019.2944582