AfriVEC: Word Embedding Models for African Languages. Case Study of Fon and Nobiin
Main Authors | , |
---|---|
Format | Journal Article |
Language | English |
Published | 08.03.2021 |
Summary: | Africa NLP, EACL 2021. From Word2Vec to GloVe, word embedding models have played key roles in the
current state-of-the-art results achieved in Natural Language Processing.
Designed to give significant and unique vectorized representations of words and
entities, those models have proven to efficiently extract similarities and
establish relationships reflecting semantic and contextual meaning among words
and entities. African languages, which represent more than 31% of the world's
spoken languages, have recently been the subject of much research. However, to
the best of our knowledge, there are currently few if any word embedding
models for the words and entities of those languages, and none for the languages
under study in this paper. After describing how GloVe, Word2Vec, and Poincaré
embeddings work, we build Word2Vec and Poincaré word embedding
models for Fon and Nobiin, which show promising results. We test the
applicability of transfer learning between these models as a milestone for
African languages to jointly help mitigate the scarcity of their
resources, and attempt to provide linguistic and social interpretations of our
results. Our main contribution is to arouse more interest in creating word
embedding models specific to African languages that are ready for use and can
significantly improve the performance of Natural Language Processing
downstream tasks on them. The official repository and implementation are at
https://github.com/bonaventuredossou/afrivec |
---|---|
DOI: | 10.48550/arxiv.2103.05132 |
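
The summary names Poincaré embeddings without stating how they compare words. As a minimal sketch, assuming the standard Poincaré ball distance (not taken from the authors' repository), the function below computes the hyperbolic distance between two points inside the unit ball. The function name `poincare_distance` and the sample points are hypothetical and chosen only for illustration.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit ball.

    Uses d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2))).
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Hypothetical 2-D points: the origin and two points on the same ray.
root = [0.0, 0.0]
near = [0.5, 0.0]
far = [0.9, 0.0]

d_near = poincare_distance(root, near)
d_far = poincare_distance(root, far)
print(d_near, d_far)
```

Distances grow quickly as points approach the boundary of the ball, which is the property that makes this geometry a common choice for embedding hierarchies: general items can sit near the origin and specific ones near the edge.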