Overview of ADoBo 2021: Automatic Detection of Unassimilated Borrowings in the Spanish Press

Bibliographic Details
Published in	Procesamiento del Lenguaje Natural Vol. 67; p. 277
Main Authors	Álvarez Mellado, Elena; Espinosa Anke, Luis; Gonzalo Arroyo, Julio; Lignos, Constantine; Porta Zamorano, Jordi
Format	Journal Article
Language	English
Published	Jaén: Sociedad Española para el Procesamiento del Lenguaje Natural, 01.09.2021
Subjects	Borrowing; Computer science; Language; Lexicography; Linguistics; Natural language processing; Spanish language

More Information
Summary:	This paper summarizes the main findings of the ADoBo 2021 shared task, proposed in the context of IberLEF 2021. In this task, we invited participants to detect lexical borrowings (coming mostly from English) in Spanish newswire texts. The task was framed as a sequence classification problem using BIO encoding. We provided participants with an annotated corpus of lexical borrowings, divided into training, development, and test splits. We received submissions from 4 teams, with 9 different system runs overall. The results, which range from F1 scores of 37 to 85, suggest that this is a challenging task, especially when out-of-domain or OOV words are considered, and that traditional methods informed with lexicographic information would benefit from taking advantage of current NLP trends.
ISSN:	1135-5948; 1989-7553
DOI:	10.26342/2021-67-24
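
Note on BIO encoding: the summary states that borrowing detection was framed as a sequence classification problem using BIO encoding. As a rough illustration only, the Python sketch below shows how token-level BIO tags could be derived from annotated spans. The function name `to_bio`, the label `ENG`, and the example sentence are assumptions made for this sketch; they are not taken from the ADoBo corpus or its annotation guidelines.

```python
from typing import List, Tuple


def to_bio(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end, label) token spans into BIO tags.

    `start` is inclusive and `end` is exclusive, both counted in tokens.
    Tokens outside every span are tagged "O".
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"span ({start}, {end}) is out of range")
        tags[start] = f"B-{label}"        # first token of the span
        for i in range(start + 1, end):   # remaining tokens of the span
            tags[i] = f"I-{label}"
    return tags


# Made-up example sentence (not from the shared-task data).
tokens = ["El", "programa", "lidera", "el", "prime", "time", "televisivo"]
# Hypothetical annotation: "prime time" is a two-token borrowing labeled ENG.
tags = to_bio(tokens, [(4, 6, "ENG")])

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Under this scheme a multiword borrowing is one span: its first token takes a `B-` tag, the following tokens take `I-` tags, and everything else is `O`.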