Explaining black-box classifiers using post-hoc explanations-by-example: The effect of explanations and error-rates in XAI user studies
Published in | Artificial Intelligence, Vol. 294, p. 103459
---|---
Main Authors | , , ,
Format | Journal Article
Language | English
Published | Amsterdam: Elsevier B.V., 01.05.2021 (Elsevier Science Ltd)
Summary:
•Series of controlled user studies examining post-hoc example-based explanations for black-box deep learners doing classification (XAI).
•Black-box AI models can be explained by “twinning” them with white-box models.
•Explanations were only found to impact people’s perception of errors.
•Explanations lead people to view errors as being “less incorrect”, but they do not improve trust.
•Trust in an AI model is undermined by increases in error-rates (from 3% error-levels onwards).
In this paper, we describe a post-hoc explanation-by-example approach to eXplainable AI (XAI), where a black-box, deep learning system is explained by reference to a more transparent, proxy model (in this situation a case-based reasoner), based on a feature-weighting analysis of the former that is used to find explanatory cases from the latter (as one instance of the so-called Twin Systems approach). A novel method (COLE-HP) for extracting the feature-weights from black-box models is demonstrated for a convolutional neural network (CNN) applied to the MNIST dataset, in which the extracted feature-weights are used to find explanatory nearest neighbours for test instances. Three user studies are reported examining people's judgements of right and wrong classifications made by this XAI twin-system, in the presence/absence of explanations-by-example and at different error-rates (from 3% to 60%). The judgements gathered include item-level evaluations of both correctness and reasonableness, and system-level evaluations of trust, satisfaction, correctness, and reasonableness. Several proposals are made about the user's mental model in these tasks and how it is impacted by explanations at an item- and system-level. The wider lessons from this work for XAI and its user studies are reviewed.
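The abstract only outlines the explanation-by-example step at a high level, so the following is a minimal sketch of the general idea, not the authors' implementation: weight a test item's penultimate-layer activations by the final-layer weights for its predicted class (a Hadamard product), and retrieve the training cases nearest to it in that contribution space as explanatory examples. The function and array names (`cole_hp_neighbours`, `train_acts`, `final_w`, and so on), the use of Euclidean distance, and the toy data are all illustrative assumptions.

```python
# Sketch of a Hadamard-product, contributions-based nearest-neighbour retrieval
# in the spirit of the COLE-HP / twin-systems idea described in the abstract.
import numpy as np

def cole_hp_neighbours(train_acts, train_labels, test_act, final_w, final_b, k=3):
    """Return (predicted class, indices, labels) of the k nearest training cases
    in contribution space.

    train_acts   : (N, D) penultimate-layer activations for the training set
    train_labels : (N,)   training labels (used only to report neighbour classes)
    test_act     : (D,)   penultimate-layer activation of the test item
    final_w      : (D, C) weights of the final (softmax) layer
    final_b      : (C,)   biases of the final layer
    """
    # Predicted class of the test item from the linear output layer.
    pred = int(np.argmax(test_act @ final_w + final_b))

    # Hadamard-product contributions towards the predicted class.
    test_contrib = test_act * final_w[:, pred]       # (D,)
    train_contrib = train_acts * final_w[:, pred]    # (N, D)

    # Explanatory cases: nearest neighbours in contribution space.
    dists = np.linalg.norm(train_contrib - test_contrib, axis=1)
    nn = np.argsort(dists)[:k]
    return pred, nn, train_labels[nn]

# Toy usage with random stand-in activations (a real run would use a trained CNN).
rng = np.random.default_rng(0)
train_acts = rng.random((100, 16))
train_labels = rng.integers(0, 10, size=100)
test_act = rng.random(16)
final_w, final_b = rng.normal(size=(16, 10)), np.zeros(10)
print(cole_hp_neighbours(train_acts, train_labels, test_act, final_w, final_b))
```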
ISSN: 0004-3702, 1872-7921
DOI: 10.1016/j.artint.2021.103459