Freudian Slips: Analysing the Internal Representations of a Neural Network from Its Mistakes
| Published in | Advances in Intelligent Data Analysis XVI, Vol. 10584, pp. 138–148 |
| --- | --- |
| Main Authors | , , |
| Format | Book Chapter |
| Language | English |
| Published | Switzerland: Springer International Publishing AG, 01.01.2017 |
| Series | Lecture Notes in Computer Science |
| ISBN | 9783319687643; 3319687646 |
| ISSN | 0302-9743; 1611-3349 |
| DOI | 10.1007/978-3-319-68765-0_12 |
Summary: The use of deep networks has improved the state of the art in various domains of AI, making practical applications possible. At the same time, there are increasing calls to make learning systems more transparent and explainable, due to concerns that they might develop biases in their internal representations that could lead to unintended discrimination when applied to sensitive personal decisions. The use of vast subsymbolic distributed representations has made this task very difficult. We suggest that we can learn a lot about the biases and internal representations of a deep network without having to unravel its connections, by adopting the old psychological approach of analysing its “slips of the tongue”. We demonstrate in a practical example that an analysis of the confusion matrix can reveal that a CNN has represented a biological task in a way that reflects our understanding of taxonomy, inferring more structure than the training algorithm required of it. In particular, we show how a CNN trained to recognise animal families also contains higher-order information about taxa such as the superfamily, parvorder, suborder and order. We speculate that various forms of psychometric testing for neural networks might provide insight into their inner workings.
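The confusion-matrix analysis the summary describes can be sketched as follows. The matrix values, the four-family setup, and the grouping into two hypothetical orders are illustrative assumptions, not data from the chapter; the idea is simply that if a network encodes a higher taxon, it should confuse families within that taxon more often than families across taxa.

```python
import numpy as np

# Hypothetical confusion matrix for four animal families (rows: true class,
# columns: predicted class). Families 0-1 are assumed to share one order,
# families 2-3 another; the counts are made up for illustration.
conf = np.array([
    [80, 15,  3,  2],
    [12, 82,  4,  2],
    [ 2,  3, 85, 10],
    [ 3,  2, 14, 81],
], dtype=float)

# Normalise rows to error rates and zero the diagonal, so only the
# network's mistakes ("slips of the tongue") remain.
rates = conf / conf.sum(axis=1, keepdims=True)
np.fill_diagonal(rates, 0.0)

# Symmetrise into a similarity: two families the network confuses
# often in either direction count as "close".
similarity = (rates + rates.T) / 2

# Average confusion within the assumed orders vs. across them.
within_avg = (similarity[0, 1] + similarity[2, 3]) / 2
between_avg = (similarity[0, 2] + similarity[0, 3]
               + similarity[1, 2] + similarity[1, 3]) / 4

# If within-order confusion dominates, the mistakes alone reveal the
# higher-order taxon, even though training only labelled families.
print(within_avg > between_avg)
```

Applied to a real CNN's confusion matrix, the same similarity could feed a hierarchical clustering, whose dendrogram can then be compared against the accepted taxonomy.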