How Does a Pre-Trained Transformer Integrate Contextual Keywords? Application to Humanitarian Computing

Bibliographic Details
Published in: arXiv.org
Main Authors: Valentin Barriere, Guillaume Jacquet
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 07.11.2021
Summary: In a classification task, handling text snippets together with metadata usually calls for multimodal approaches. When the metadata are textual, it is tempting to feed them directly into a pre-trained transformer in order to leverage the semantic information encoded inside the model. This paper describes how to improve a humanitarian classification task by adding the crisis event type to each tweet to be classified. Based on additional experiments on the model's weights and behavior, it identifies how the proposed neural network approach partially over-fits particularities of the Crisis Benchmark, while highlighting that the model is still undoubtedly learning to use and take advantage of the metadata's textual semantics.
ISSN: 2331-8422
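
The summary describes injecting the crisis event type as a contextual keyword alongside each tweet so the pre-trained transformer can exploit the metadata's textual semantics. The sketch below is a minimal illustration of that idea, not the authors' released code: the model name, label set, and sentence-pair encoding are assumptions made for the example.

# Minimal sketch (assumptions: model name, labels, sentence-pair encoding of the
# event-type keyword and the tweet; not the paper's actual implementation).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # assumption: any pre-trained encoder could be used
LABELS = ["not_humanitarian", "infrastructure_damage", "rescue_volunteering"]  # illustrative labels

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=len(LABELS))

def classify_with_event_type(tweet: str, event_type: str) -> str:
    # Encode the event-type keyword and the tweet as a sentence pair, so both the
    # metadata and the text pass through the same pre-trained encoder.
    inputs = tokenizer(event_type, tweet, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

# Example call (the classification head is untrained here, so the output is arbitrary):
print(classify_with_event_type("Roads blocked near the city center, send help", "earthquake"))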