Extracting Bigram Knowledge from GPT-2 (GPT-2からのbigram知識の取り出し)


Bibliographic Details
Published in: 人工知能学会論文誌, Vol. 40, No. 3, pp. A-O65_1-23
Main Authors: 松本 和幸; 吉田 稔
Format: Journal Article
Language: Japanese
Published: 一般社団法人 人工知能学会, 01.05.2025
ISSN: 1346-0714, 1346-8030
DOI: 10.1527/tjsai.40-3_A-O65

Summary: We propose a method to extract bigram knowledge from GPT-2 models. Based on the observation that the first layer of GPT-2 is useful for predicting the token that follows a given input token, we propose an algorithm that uses self-attention heads from the first layer only to predict next tokens. We also propose an algorithm that finds contextual words highly related to a given bigram by applying backpropagation to the GPT-2 parameters used for next-token prediction. Experimental results showed that both proposed algorithms, for next-word prediction and for context-word induction, achieved higher average precision than the baseline methods.
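The first-layer probing idea can be illustrated with a minimal sketch using the Hugging Face transformers library (the library choice, the gpt2 checkpoint, and the helper name first_layer_next_tokens are assumptions for illustration, not the authors' implementation): take the hidden state after GPT-2's first transformer block for a single input token, project it through the final layer norm and the tied LM head, and read off the top-ranked next tokens as candidate bigram continuations. Note that the paper restricts the prediction to the first layer's self-attention heads, whereas this simplified sketch reuses the whole first block.

```python
# Minimal sketch, not the authors' algorithm: probe the output of GPT-2's
# first transformer block to rank likely next tokens for a single input token.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def first_layer_next_tokens(word, top_k=5):  # hypothetical helper name
    input_ids = tokenizer(word, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(input_ids, output_hidden_states=True)
        # hidden_states[0] is the embedding output; hidden_states[1] is the
        # hidden state after the first transformer block.
        h = out.hidden_states[1][:, -1, :]
        # Project through the final layer norm and the tied LM head to get
        # a next-token distribution based on first-layer information only.
        logits = model.lm_head(model.transformer.ln_f(h))
    top_ids = torch.topk(logits, top_k, dim=-1).indices[0]
    return [tokenizer.decode(int(i)) for i in top_ids]

print(first_layer_next_tokens(" New"))  # bigram-like continuations, e.g. " York"
```

A gradient-based variant of the same setup, backpropagating a chosen next-token score toward the input side to score candidate context words, would correspond in spirit to the second algorithm; the actual procedure is described in the paper and is not reproduced in this sketch.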