Implicit discourse relation classification with contextualized word representation

Bibliographic Details
Main Authors: Gaussier, Eric Jacques Guy; Perez, Julien; Popa, Diana Nicoleta; Henderson, James
Format: Patent
Language: English
Published: 13.12.2022
Summary: A method includes: initializing a list of token embeddings, each of the token embeddings corresponding to a tokenized word from text in a corpus of text; generating a graph for a group of consecutive words s from the text, said graph including binary relations between pairs of tokenized words of the group of consecutive words; selecting the token embeddings representing the words of the group of consecutive words from the list of token embeddings; computing a tensor of binary relations as the product between a matrix of the selected token embeddings and a tensor representing discourse relations, the computed tensor representing the binary relations between the pairs of tokenized words; computing a loss using the computed tensor; and optimizing the list of token embeddings using the computed loss. The above steps may be repeated until the computed loss is within a predetermined range.
Bibliography: Application Number: US202016818335
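
Illustrative sketch (not part of the catalog record or the patent text): the steps in the summary can be sketched roughly as below. The summary does not give the exact form of the product, the loss, or the optimizer, so this sketch assumes a bilinear score X R_k X^T for each relation k, a squared-error loss against a 0/1 graph tensor, and plain gradient descent on the embeddings with the relation tensor held fixed. All function and parameter names (build_relation_tensor, predict_relations, loss_and_grad, fit_window, lr, tol, max_steps) are hypothetical.

```python
import numpy as np


def build_relation_tensor(n_words, n_relations, triples):
    """Encode a graph over a word window as a 0/1 tensor of shape (k, n, n).

    `triples` is a list of (relation, i, j): relation holds from word i to word j.
    """
    T = np.zeros((n_relations, n_words, n_words))
    for rel, i, j in triples:
        T[rel, i, j] = 1.0
    return T


def predict_relations(X, R):
    """Assumed bilinear product: pred[k] = X @ R[k] @ X.T, shape (k, n, n)."""
    return np.einsum("id,kde,je->kij", X, R, X)


def loss_and_grad(X, R, T):
    """Squared-error loss (an assumption) and its gradient w.r.t. X."""
    D = predict_relations(X, R) - T
    loss = 0.5 * np.sum(D ** 2)
    # d/dX of 0.5 * ||X R X^T - T||^2 = sum_k (D_k X R_k^T + D_k^T X R_k)
    grad = (np.einsum("kij,je,kde->id", D, X, R)
            + np.einsum("kji,jd,kde->ie", D, X, R))
    return loss, grad


def fit_window(E, token_ids, R, T, lr=0.01, tol=1e-3, max_steps=500):
    """Update the rows of E (in place) used by one window of consecutive words.

    Stops once the loss is at most `tol` or after `max_steps` iterations.
    Returns the list of loss values seen before each update.
    """
    losses = []
    for _ in range(max_steps):
        X = E[token_ids]                      # select the window's embeddings
        loss, grad = loss_and_grad(X, R, T)
        losses.append(loss)
        if loss <= tol:
            break
        # subtract.at accumulates correctly if a token repeats in the window
        np.subtract.at(E, token_ids, lr * grad)
    return losses


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vocab_size, dim, n_rel = 6, 4, 2
    E = 0.1 * rng.standard_normal((vocab_size, dim))   # token embedding list
    R = 0.5 * rng.standard_normal((n_rel, dim, dim))   # fixed relation tensor
    ids = np.array([0, 1, 2])                          # one 3-word window
    T = build_relation_tensor(3, n_rel, [(0, 0, 1), (1, 1, 2)])
    history = fit_window(E, ids, R, T)
    print(f"first loss {history[0]:.4f}, last loss {history[-1]:.4f}")
```

In practice the loop would run over many windows drawn from the corpus, and a real system might also learn the relation tensor; both are left out here to keep the sketch close to the steps listed in the summary.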