Template-Based Model for Mongolian-Chinese Machine Translation


Bibliographic Details
Published in Journal of advanced computational intelligence and intelligent informatics Vol. 20; no. 6; pp. 893-901
Main Authors Wu, Jing; Hou, Hongxu; Bao, Feilong; Jiang, Yupeng
Format Journal Article
Language English
Published 20.11.2016

Summary: Statistical machine translation (SMT) between Mongolian and Chinese is limited by the complex morphology of Mongolian, the scarcity of parallel corpora, and the significant syntactic differences between the two languages. To address these problems, we propose a template-based machine translation (TBMT) system and combine it with an SMT system to achieve better translation performance. The proposed TBMT model consists of a template extraction model and a template translation model. In the template extraction model, we present a novel method that aligns and abstracts static words from a bilingual parallel corpus to extract templates automatically. In the template translation model, a specially designed method for filtering out low-quality matches improves translation performance. Moreover, we apply lemmatization and Latinization to reduce data sparsity and to support fuzzy matching. Experimentally, the coverage of the TBMT system is over 50%; the combined SMT system translates the remaining uncovered source sentences. The TBMT system outperforms the phrase-based and hierarchical phrase-based SMT baselines by 3.08 and 1.40 BLEU points, respectively. The combined TBMT and SMT system also outperforms these baselines, by 2.49 and 0.81 BLEU points, respectively.
ISSN: 1343-0130, 1883-8014
DOI: 10.20965/jaciii.2016.p0893
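
The summary describes translating by matching a source sentence against extracted templates and falling back to SMT for uncovered sentences. The following is a minimal sketch of that general idea only, not the authors' method: the slot notation (`X1`), the functions `is_slot`, `match_template`, and `translate`, and the toy data `TEMPLATES` and `LEXICON` are all hypothetical, and abstract tokens (`s1`, `t1`, ...) stand in for Mongolian and Chinese words. Template extraction, lemmatization, Latinization, fuzzy matching, and low-quality match filtering are not modeled here.

```python
from typing import Optional

# Hypothetical toy data: (source template, target template) pairs sharing slot names.
TEMPLATES = [("s1 X1 s2", "t2 X1 t1")]
# Hypothetical word-level lexicon used to translate the words captured by a slot.
LEXICON = {"a": "A", "b": "B"}


def is_slot(token: str) -> bool:
    """A slot is written X followed by digits, e.g. X1."""
    return token.startswith("X") and token[1:].isdigit()


def match_template(
    template: list[str],
    sentence: list[str],
    bindings: Optional[dict[str, list[str]]] = None,
) -> Optional[dict[str, list[str]]]:
    """Match a token list against a template; each slot captures one or more tokens.

    Returns the slot bindings, or None if the sentence does not fit the template.
    """
    bindings = dict(bindings or {})
    if not template:
        return bindings if not sentence else None
    head, rest = template[0], template[1:]
    if is_slot(head):
        # Try every non-empty prefix as the slot content (simple backtracking).
        for end in range(1, len(sentence) + 1):
            result = match_template(
                rest, sentence[end:], {**bindings, head: sentence[:end]}
            )
            if result is not None:
                return result
        return None
    if sentence and sentence[0] == head:
        return match_template(rest, sentence[1:], bindings)
    return None


def translate(
    sentence: str, templates: list[tuple[str, str]], lexicon: dict[str, str]
) -> Optional[str]:
    """Translate with the first matching template; None means 'not covered'."""
    tokens = sentence.split()
    for src_tpl, tgt_tpl in templates:
        bindings = match_template(src_tpl.split(), tokens)
        if bindings is None:
            continue
        out: list[str] = []
        for tok in tgt_tpl.split():
            if is_slot(tok):
                # Translate slot content word by word; unknown words pass through.
                out.extend(lexicon.get(w, w) for w in bindings[tok])
            else:
                out.append(tok)
        return " ".join(out)
    return None  # uncovered sentence: a combined system would hand this to SMT


print(translate("s1 a b s2", TEMPLATES, LEXICON))
print(translate("s9 a", TEMPLATES, LEXICON))
```

In this sketch a `None` result marks a sentence no template covers, which corresponds to the case the summary says is passed to the SMT system.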