Explaining word embeddings with perfect fidelity: Case study in research impact prediction

Bibliographic Details
Published in: arXiv.org
Main Authors: Dvorackova, Lucie; Joachimiak, Marcin P.; Cerny, Michal; Kubecova, Adriana; Sklenak, Vilem; Kliegr, Tomas
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 24.09.2024

Summary: Best-performing approaches for scholarly document quality prediction are based on embedding models, which do not allow direct explanation of classifiers because distinct words no longer correspond to the input features used for model training. Although model-agnostic explanation methods such as Local Interpretable Model-agnostic Explanations (LIME) can be applied, these produce results whose correspondence to the ML model is questionable. We introduce a new feature importance method, Self-model Rated Entities (SMER), for logistic regression-based classification models trained on word embeddings. We show that SMER has theoretically perfect fidelity with the explained model, as its prediction corresponds exactly to the average of the predictions for the individual words in the text. SMER therefore allows us to reliably determine which words or entities contribute positively to predicting impactful articles. Quantitative and qualitative evaluation is performed through five diverse experiments conducted on 50,000 research papers from the CORD-19 corpus. Through an AOPC (area over the perturbation curve) analysis, we experimentally demonstrate that SMER produces better explanations than LIME for logistic regression.
ISSN: 2331-8422
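
The "perfect fidelity" claim in the summary has a simple algebraic core: when a document is represented as the mean of its word embeddings, the logit of a logistic regression model is an affine function of that mean, so the document's logit equals the average of the logits the same model assigns to each word on its own. The sketch below illustrates this under assumed details; the toy random vocabulary, the random labels, and helper names such as doc_vector and word_score are all illustrative, not code from the paper.

```python
# Minimal sketch of the fidelity identity behind SMER, under the setup the
# summary describes: logistic regression trained on documents represented
# as the average of their word embeddings. All names and data are toy
# assumptions for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
dim = 50

# Stand-in for a trained embedding model: one random vector per word.
vocab = {w: rng.normal(size=dim)
         for w in ["novel", "coronavirus", "method", "the", "impact"]}

def doc_vector(words):
    """Represent a document as the average of its word embeddings."""
    return np.mean([vocab[w] for w in words], axis=0)

# Toy training data: random five-word documents with random impact labels.
docs = [list(rng.choice(list(vocab), size=5)) for _ in range(200)]
labels = rng.integers(0, 2, size=200)
X = np.stack([doc_vector(d) for d in docs])
clf = LogisticRegression(max_iter=1000).fit(X, labels)

def word_score(word):
    """SMER-style importance: the model's own logit for the word alone."""
    return clf.decision_function(vocab[word].reshape(1, -1))[0]

# Perfect fidelity on the logit scale: because the document vector is the
# mean of its word vectors and the logit w.x + b is affine, the document's
# logit equals the average of the per-word logits.
doc = ["novel", "method", "the"]
doc_logit = clf.decision_function(doc_vector(doc).reshape(1, -1))[0]
mean_word_logit = np.mean([word_score(w) for w in doc])
assert np.isclose(doc_logit, mean_word_logit)
print(f"document logit {doc_logit:.6f} == mean word logit {mean_word_logit:.6f}")
```

Ranking words by word_score then yields the explanation; the AOPC evaluation mentioned in the summary would delete words in that ranked order and track how quickly the model's prediction drops, with a larger area over the perturbation curve indicating a more faithful ranking.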