Translating the Unseen? Yorùbá-English MT in Low-Resource, Morphologically-Unmarked Settings
Main Authors |  |
---|---|
Format | Journal Article |
Language | English |
Published | 06.03.2021 |
Summary | Translating between languages where certain features are marked morphologically in one but absent or marked contextually in the other is an important test case for machine translation. When translating from Yorùbá, which uses bare nouns and marks these features contextually, into English, which marks (in)definiteness morphologically, ambiguities arise. In this work, we perform a fine-grained analysis of how an SMT system compares with two NMT systems (BiLSTM and Transformer) when translating bare nouns (BNs) from Yorùbá into English. We investigate to what extent the systems identify BNs, correctly translate them, and compare with human translation patterns. We also analyze the types of errors each model makes and provide a linguistic description of these errors. We glean insights for evaluating model performance in low-resource settings. Our results show that, in translating bare nouns, the Transformer model outperforms the SMT and BiLSTM models for 4 categories, the BiLSTM model outperforms the SMT model for 3 categories, and the SMT model outperforms the NMT models for 1 category. |
DOI | 10.48550/arxiv.2103.04225 |
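
The category-level evaluation the summary describes could be approximated as follows; this is a minimal, hypothetical sketch, not the authors' code. The function names, category labels, heuristics, and data format are all illustrative assumptions: it classifies how an aligned English noun phrase realizes a Yorùbá bare noun ("definite", "indefinite", "bare plural", or "bare singular") and scores a system's output against a reference per category.

```python
# Hypothetical sketch of a category-level bare-noun (BN) evaluation.
# Assumes each example pairs a Yoruba BN with its aligned English noun
# phrase in a reference and in a system output; the names and category
# labels below are assumptions for illustration, not from the paper.

import re
from collections import Counter

def bn_category(english_np: str) -> str:
    """Classify how an English noun phrase marks (in)definiteness."""
    tokens = english_np.lower().split()
    if not tokens:
        return "empty"
    if tokens[0] in {"the", "this", "that", "these", "those"}:
        return "definite"
    if tokens[0] in {"a", "an"}:
        return "indefinite"
    if re.search(r"s$", tokens[-1]):  # crude bare-plural heuristic
        return "bare_plural"
    return "bare_singular"

def compare(system_nps, reference_nps):
    """Per-category accuracy of system BN renderings vs. the reference."""
    per_category = Counter()
    correct = Counter()
    for sys_np, ref_np in zip(system_nps, reference_nps):
        ref_cat = bn_category(ref_np)
        per_category[ref_cat] += 1
        if bn_category(sys_np) == ref_cat:
            correct[ref_cat] += 1
    return {cat: correct[cat] / n for cat, n in per_category.items()}

# Example: a Yoruba bare noun such as "ile" may surface in English as
# "a house", "the house", or "houses".
refs = ["the house", "a dog", "houses"]
hyps = ["a house", "a dog", "houses"]
print(compare(hyps, refs))  # {'definite': 0.0, 'indefinite': 1.0, 'bare_plural': 1.0}
```

A per-category breakdown like this is what makes the summary's "4 / 3 / 1 category" comparison possible, since aggregate scores such as BLEU would hide which (in)definiteness distinctions each model gets right.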