Explaining black-box classifiers using post-hoc explanations-by-example: The effect of explanations and error-rates in XAI user studies
Published in | Artificial Intelligence, Vol. 294, p. 103459
---|---
Main Authors | , , ,
Format | Journal Article
Language | English
Published | Amsterdam: Elsevier B.V., 01.05.2021 (Elsevier Science Ltd)
Summary:
•Series of controlled user studies examining post-hoc example-based explanations for black-box deep learners doing classification (XAI).
•Black-box AI models can be explained by “twinning” them with white-box models.
•Explanations were only found to impact people’s perception of errors.
•Explanations lead people to view errors as being “less incorrect”, but they do not improve trust.
•Trust in an AI model is undermined by increases in error-rates (from 3% error-levels onwards).
In this paper, we describe a post-hoc explanation-by-example approach to eXplainable AI (XAI), where a black-box, deep learning system is explained by reference to a more transparent, proxy model (in this situation a case-based reasoner), based on a feature-weighting analysis of the former that is used to find explanatory cases from the latter (as one instance of the so-called Twin Systems approach). A novel method (COLE-HP) for extracting the feature-weights from black-box models is demonstrated for a convolutional neural network (CNN) applied to the MNIST dataset, in which the extracted feature-weights are used to find explanatory nearest neighbours for test instances. Three user studies are reported examining people's judgements of right and wrong classifications made by this XAI twin-system, in the presence/absence of explanations-by-example and at different error-rates (from 3% to 60%). The judgements gathered include item-level evaluations of both correctness and reasonableness, and system-level evaluations of trust, satisfaction, correctness, and reasonableness. Several proposals are made about the user's mental model in these tasks and how it is impacted by explanations at an item- and system-level. The wider lessons from this work for XAI and its user studies are reviewed.
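The abstract only outlines the explanation-by-example step at a high level, so the following is a minimal sketch of the general idea, not the authors' implementation: weight a test item's penultimate-layer activations by the final-layer weights for its predicted class (a Hadamard product), and retrieve the training cases nearest to it in that contribution space as explanatory examples. The function and array names (`cole_hp_neighbours`, `train_acts`, `final_w`, and so on), the use of Euclidean distance, and the toy data are all illustrative assumptions.

```python
# Sketch of a Hadamard-product, contributions-based nearest-neighbour retrieval
# in the spirit of the COLE-HP / twin-systems idea described in the abstract.
import numpy as np

def cole_hp_neighbours(train_acts, train_labels, test_act, final_w, final_b, k=3):
    """Return (predicted class, indices, labels) of the k nearest training cases
    in contribution space.

    train_acts   : (N, D) penultimate-layer activations for the training set
    train_labels : (N,)   training labels (used only to report neighbour classes)
    test_act     : (D,)   penultimate-layer activation of the test item
    final_w      : (D, C) weights of the final (softmax) layer
    final_b      : (C,)   biases of the final layer
    """
    # Predicted class of the test item from the linear output layer.
    pred = int(np.argmax(test_act @ final_w + final_b))

    # Hadamard-product contributions towards the predicted class.
    test_contrib = test_act * final_w[:, pred]       # (D,)
    train_contrib = train_acts * final_w[:, pred]    # (N, D)

    # Explanatory cases: nearest neighbours in contribution space.
    dists = np.linalg.norm(train_contrib - test_contrib, axis=1)
    nn = np.argsort(dists)[:k]
    return pred, nn, train_labels[nn]

# Toy usage with random stand-in activations (a real run would use a trained CNN).
rng = np.random.default_rng(0)
train_acts = rng.random((100, 16))
train_labels = rng.integers(0, 10, size=100)
test_act = rng.random(16)
final_w, final_b = rng.normal(size=(16, 10)), np.zeros(10)
print(cole_hp_neighbours(train_acts, train_labels, test_act, final_w, final_b))
```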
ISSN: 0004-3702, 1872-7921
DOI: 10.1016/j.artint.2021.103459