Evaluating the Visualization of What a Deep Neural Network Has Learned

Bibliographic Details
Published in	IEEE Transactions on Neural Networks and Learning Systems, Vol. 28, no. 11, pp. 2660-2673
Main Authors	Samek, Wojciech; Binder, Alexander; Montavon, Gregoire; Lapuschkin, Sebastian; Muller, Klaus-Robert
Format	Journal Article
Language	English
Published	United States: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.11.2017
Subjects	Algorithm design and analysis; Artificial neural networks; Biological neural networks; Classification; Convolutional neural networks; Data collection; Decision theory; Deconvolution; explaining classification; Heating; Image classification; interpretable machine learning; Learning algorithms; Learning systems; Machine learning; Neural networks; Neurons; Object recognition; Pixels; relevance models; Sensitivity; Speech recognition; Task complexity; Test procedures; Visualization
Summary:	Deep neural networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multilayer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision, given a new unseen data sample. Recently, several approaches have been proposed enabling one to understand and interpret the reasoning embodied in a DNN for a single test image. These methods quantify the "importance" of individual pixels with respect to the classification decision and allow a visualization in terms of a heatmap in pixel/input space. While the usefulness of heatmaps can be judged subjectively by a human, an objective quality measure is missing. In this paper, we present a general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps. We compare heatmaps computed by three different methods on the SUN397, ILSVRC2012, and MIT Places data sets. Our main result is that the recently proposed layer-wise relevance propagation algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method. We provide theoretical arguments to explain this result and discuss its practical implications. Finally, we investigate the use of heatmaps for unsupervised assessment of the neural network performance.
ISSN:	2162-237X, 2162-2388
DOI:	10.1109/TNNLS.2016.2599820