Clarity in complexity: how aggregating explanations resolves the disagreement problem

Bibliographic Details
Published in: Artificial Intelligence Review, Vol. 57, No. 12, p. 338
Main Authors: Mitruț, Oana; Moise, Gabriela; Moldoveanu, Alin; Moldoveanu, Florica; Leordeanu, Marius; Petrescu, Livia
Format: Journal Article
Language: English
Published: Dordrecht: Springer Netherlands (Springer Nature B.V.), 19.10.2024

Summary: The Rashômon Effect, applied in Explainable Machine Learning, refers to the disagreement between the explanations provided by various attribution explainers and to the dissimilarity across multiple explanations generated by a particular explainer for a single instance from the dataset (differences between feature importances and their associated signs and ranks), an undesirable outcome especially in sensitive domains such as healthcare or finance. We propose a method inspired by textual case-based reasoning for aligning explanations from various explainers in order to resolve the disagreement and dissimilarity problems. We iteratively generated 100 explanations for each instance from six popular datasets, using three prevalent feature attribution explainers: LIME, Anchors and SHAP (with the variations Tree SHAP and Kernel SHAP), and then applied a global cluster-based aggregation strategy that quantifies alignment and reveals similarities and associations between explanations. We evaluated our method by weighting the k-NN algorithm with agreed feature overlap explanation weights and compared it to a non-weighted k-NN predictor on a binary classification task. We also compared the results of the weighted k-NN algorithm using aggregated feature overlap explanation weights to the weighted k-NN algorithm using weights produced by a single explanation method (either LIME, SHAP or Anchors). Our global alignment method benefited most from hybridization with feature importance scores (information gain), which was essential for obtaining a more accurate estimate of disagreement, for enabling explainers to reach a consensus across multiple explanations, and for supporting effective model learning through improved classification performance.
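
The evaluation setup described in the summary (a k-NN classifier weighted by explanation-derived per-feature weights, compared against a plain k-NN baseline) can be sketched roughly as below, assuming scikit-learn. This is not the authors' code: the dataset is a stand-in binary classification set, and the weight vector is a hypothetical placeholder for the aggregated feature overlap explanation weights, which in the paper would come from aggregating many LIME, SHAP and Anchors explanations.

```python
# Minimal sketch of explanation-weighted k-NN vs. unweighted k-NN.
# Assumptions (not from the paper): scikit-learn, the breast-cancer dataset as a
# stand-in binary task, and a random vector standing in for the aggregated
# explanation weights.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Hypothetical aggregated explanation weights, one per feature (e.g. mean absolute
# attribution across many explainer runs, normalized to sum to 1).
rng = np.random.default_rng(0)
w = rng.random(X.shape[1])
w = w / w.sum()

def knn_accuracy(weights=None, k=5):
    """Fit k-NN on (optionally) weight-rescaled features and return test accuracy."""
    if weights is None:
        Xtr, Xte = X_train_s, X_test_s                # unweighted baseline
    else:
        scale = np.sqrt(weights)                      # sqrt so squared distances are scaled by `weights`
        Xtr, Xte = X_train_s * scale, X_test_s * scale
    clf = KNeighborsClassifier(n_neighbors=k).fit(Xtr, y_train)
    return accuracy_score(y_test, clf.predict(Xte))

print("unweighted k-NN accuracy:          ", knn_accuracy())
print("explanation-weighted k-NN accuracy:", knn_accuracy(w))
```

Rescaling each feature by the square root of its weight makes the squared Euclidean distance used by k-NN a weighted sum of squared feature differences, so features with larger aggregated attributions dominate the neighbourhood search; with informative weights this is the mechanism through which better explanations can translate into better classification performance.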
ISSN: 0269-2821
eISSN: 1573-7462
DOI: 10.1007/s10462-024-10952-7