GAProtoNet: A Multi-head Graph Attention-based Prototypical Network for Interpretable Text Classification
Format: Journal Article
Language: English
Published: 20.09.2024
Summary: Pretrained transformer-based Language Models (LMs) are well known for the significant improvements they achieve on text classification tasks through their powerful word embeddings, but their black-box nature, which leads to a lack of interpretability, has been a major concern. In this work, we introduce GAProtoNet, a novel white-box Multi-head Graph Attention-based Prototypical Network designed to explain the decisions of text classification models built with LM encoders. In our approach, the input vector and the prototypes are regarded as nodes within a graph, and multi-head graph attention selectively constructs edges between the input node and the prototype nodes to learn an interpretable prototypical representation. During inference, the model makes decisions based on a linear combination of activated prototypes weighted by the attention score assigned to each prototype, so its choices can be transparently explained by the attention weights and by the prototypes projected onto their closest matching training examples. Experiments on multiple public datasets show that our approach achieves superior results without sacrificing the accuracy of the original black-box LMs. We also compare against four alternative prototypical-network variants, and our approach achieves the best accuracy and F1 among all of them. Our case study and visualization of prototype clusters further demonstrate its efficiency in explaining the decisions of black-box models built with LMs.
DOI: 10.48550/arxiv.2409.13312
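
The abstract describes a star-shaped graph in which the encoded input and learnable prototypes are nodes, multi-head attention scores the edges from the input node to each prototype node, and the class logits are a linear combination of activated prototypes weighted by those attention scores. Below is a minimal PyTorch sketch of such a layer, written from the abstract alone: the class and parameter names (GAProtoLayer, num_prototypes, etc.) and the use of cosine similarity as the prototype activation are illustrative assumptions, not the paper's released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GAProtoLayer(nn.Module):
    """Sketch of a multi-head graph-attention prototypical layer.

    The LM sentence embedding and K learnable prototypes are treated as
    nodes of a star graph; each attention head scores the edge from the
    input node to every prototype node, and class logits are a linear
    combination of prototype activations weighted by those scores.
    """

    def __init__(self, embed_dim: int, num_prototypes: int,
                 num_heads: int, num_classes: int):
        super().__init__()
        assert embed_dim % num_heads == 0
        # Learnable prototype vectors (the prototype nodes of the graph).
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, embed_dim))
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        # Per-head projections for the input (query) and prototypes (keys).
        self.q_proj = nn.Linear(embed_dim, embed_dim)
        self.k_proj = nn.Linear(embed_dim, embed_dim)
        # Final linear layer: prototype activations -> class logits.
        self.classifier = nn.Linear(num_prototypes, num_classes)

    def forward(self, x: torch.Tensor):
        # x: (batch, embed_dim) sentence embedding from the LM encoder.
        B, K = x.size(0), self.prototypes.size(0)
        q = self.q_proj(x).view(B, self.num_heads, self.head_dim)
        k = self.k_proj(self.prototypes).view(K, self.num_heads, self.head_dim)
        # Scaled dot-product edge scores between the input node and each
        # prototype node, computed independently per attention head.
        scores = torch.einsum('bhd,khd->bhk', q, k) / self.head_dim ** 0.5
        attn = scores.softmax(dim=-1).mean(dim=1)   # (batch, K), heads averaged
        # Assumed activation: attention-weighted similarity to each prototype.
        sim = F.cosine_similarity(x.unsqueeze(1),
                                  self.prototypes.unsqueeze(0), dim=-1)
        activations = attn * sim                    # interpretable weights
        logits = self.classifier(activations)       # linear combination
        return logits, activations

# Hypothetical usage with a BERT-sized embedding:
layer = GAProtoLayer(embed_dim=768, num_prototypes=16, num_heads=4, num_classes=2)
logits, weights = layer(torch.randn(8, 768))
```

Returning the per-prototype weights alongside the logits mirrors the interpretability claim: each prediction can be traced to the few prototypes with the largest weights, which the paper projects onto their closest matching training examples.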