Explainable image classification with evidence counterfactual

Bibliographic Details
Published in	Pattern analysis and applications : PAA, Vol. 25, no. 2, pp. 315-335
Main Authors	Vermeire, Tom; Brughmans, Dieter; Goethals, Sofie; de Oliveira, Raphael Mazzine Barbossa; Martens, David
Format	Journal Article
Language	English
Published	London: Springer London, 01.05.2022; Springer Nature B.V.
Subjects	Classification; Computer Science; Image classification; Literature reviews; Pattern Recognition; Prediction models; Theoretical Advances; Training; Explainable artificial intelligence; Counterfactual explanation; Search algorithms
Summary:	The complexity of state-of-the-art modeling techniques for image classification impedes the ability to explain model predictions in an interpretable way. A counterfactual explanation highlights the parts of an image which, when removed, would change the predicted class. Both legal scholars and data scientists are increasingly turning to counterfactual explanations, as these provide a high degree of human interpretability, reveal what minimal information needs to be changed to obtain a different prediction, and do not require the prediction model to be disclosed. Our literature review shows that existing counterfactual methods for image classification have strong requirements regarding access to the training data and the model internals, which are often unrealistic. Therefore, SEDC is introduced as a model-agnostic, instance-level explanation method for image classification that does not need access to the training data. As image classification tasks are typically multiclass problems, an additional contribution is the SEDC-T method, which allows a target counterfactual class to be specified. These methods are experimentally tested on ImageNet data, and concrete examples illustrate how the resulting explanations can give insight into model decisions. Moreover, SEDC is benchmarked against existing model-agnostic explanation methods, demonstrating stability of results, computational efficiency and the counterfactual nature of the explanations.
ISSN:	1433-7541 1433-755X
DOI:	10.1007/s10044-021-01055-y
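
Illustration: the summary describes a counterfactual explanation as the set of image parts whose removal changes the predicted class, found without access to model internals or training data. The sketch below shows one simple way such a search could look. It is a simplified greedy search written for this record, not the authors' SEDC algorithm. The function name greedy_counterfactual, the predict callback and the toy model are all hypothetical; a real setup would segment the image and replace removed segments with some fill (blur, mean colour, and so on) before querying the classifier.

```python
from typing import Callable, FrozenSet, Optional, Sequence


def argmax(scores: Sequence[float]) -> int:
    """Index of the highest score (first one wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def greedy_counterfactual(
    n_segments: int,
    predict: Callable[[FrozenSet[int]], Sequence[float]],
    max_removed: Optional[int] = None,
) -> Optional[FrozenSet[int]]:
    """Greedily remove segments until the predicted class changes.

    `predict(removed)` is a hypothetical black-box callback: it returns the
    class scores for the image with the given segment ids removed. Only
    model outputs are used, so the search is model-agnostic.
    Returns the removed segment ids, or None if no class change was found.
    """
    original = argmax(predict(frozenset()))
    removed: FrozenSet[int] = frozenset()
    limit = n_segments if max_removed is None else min(max_removed, n_segments)

    while len(removed) < limit:
        best_seg, best_score = None, None
        for seg in range(n_segments):
            if seg in removed:
                continue
            candidate = removed | {seg}
            scores = predict(candidate)
            if argmax(scores) != original:
                return candidate  # removing these segments flips the class
            # Otherwise remember the removal that hurts the original class most.
            if best_score is None or scores[original] < best_score:
                best_seg, best_score = seg, scores[original]
        removed = removed | {best_seg}
    return None


# Toy stand-in for a classifier (an assumption, for demonstration only):
# each segment contributes fixed evidence to class 0; class 1 has a constant score.
EVIDENCE = [0.1, 0.4, 0.3, 0.05]


def toy_predict(removed: FrozenSet[int]) -> Sequence[float]:
    class0 = sum(w for i, w in enumerate(EVIDENCE) if i not in removed)
    return [class0, 0.3]


explanation = greedy_counterfactual(len(EVIDENCE), toy_predict)
print(sorted(explanation) if explanation is not None else None)
```

In this toy setting no single segment flips the prediction, so the search first commits to the segment that lowers the original class score the most and then finds a second segment that completes the class change. A targeted variant in the spirit of SEDC-T could stop only when a chosen target class becomes the prediction; that variant is not shown here.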