On quantitative aspects of model interpretability


Bibliographic Details
Published in: arXiv.org
Main Authors: An-phi Nguyen, María Rodríguez Martínez
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 15.07.2020
Summary: Despite the growing body of work in interpretable machine learning, it remains unclear how to evaluate different explainability methods without resorting to qualitative assessment and user studies. While interpretability is an inherently subjective matter, previous works in cognitive science and epistemology have shown that good explanations do possess aspects that can be objectively judged apart from fidelity, such as simplicity and broadness. In this paper we propose a set of metrics to programmatically evaluate interpretability methods along these dimensions. In particular, we argue that the performance of methods along these dimensions can be orthogonally imputed to two conceptual parts, namely the feature extractor and the actual explainability method. We experimentally validate our metrics on different benchmark tasks and show how they can be used to guide a practitioner in the selection of the most appropriate method for the task at hand.
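The abstract names simplicity and broadness as aspects of an explanation that can be judged programmatically. As an illustration only, not the paper's actual metrics, one could score feature-attribution explanations with simple proxies: sparsity of the attribution vector as a stand-in for simplicity, and agreement of attributions across inputs as a stand-in for broadness. The function names and formulas below are hypothetical.

```python
import numpy as np

def simplicity(attribution, tol=1e-6):
    """Hypothetical simplicity proxy: fraction of features with
    (near-)zero attribution -- sparser explanations score higher."""
    attribution = np.asarray(attribution, dtype=float)
    return float(np.mean(np.abs(attribution) <= tol))

def broadness(attributions):
    """Hypothetical broadness proxy: mean pairwise cosine similarity of
    attribution vectors across inputs -- explanations that generalise
    across samples score higher."""
    A = np.asarray(attributions, dtype=float)
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    U = A / np.clip(norms, 1e-12, None)  # unit-normalise each attribution
    sims = U @ U.T                       # pairwise cosine similarities
    off_diag = sims[~np.eye(len(A), dtype=bool)]
    return float(off_diag.mean())

# A sparse attribution scores high on the simplicity proxy:
print(simplicity([0.0, 0.0, 1.0, 0.0]))          # 0.75
# Identical attributions across inputs score 1.0 on the broadness proxy:
print(broadness([[1.0, 0.0], [1.0, 0.0]]))
```

These proxies separate the two roles the abstract identifies: the feature extractor determines which features attributions are expressed over, while the explainability method determines the attribution values themselves.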
ISSN: 2331-8422