Multi-Hierarchy Metamorphic Testing for Hyphenated Words in Machine Translation

Bibliographic Details
Published in Proceedings / Asia Pacific Software Engineering Conference, pp. 261-270
Main Authors Zhu, Rui; Tao, Chuanqi; Gao, Jerry
Format Conference Proceeding
Language English
Published IEEE 03.12.2024
Summary: With the advancement of deep neural networks, machine translation has progressed rapidly in recent years, and individuals often rely on machine translation software to facilitate various tasks. However, the intricacies of neural networks can lead to translation errors, resulting in misunderstandings or conflicts. The most common method for testing machine translation is metamorphic testing, but metamorphic testing at the phrase or sentence hierarchy may incorrectly identify some test cases as failures. To mitigate this issue, we add the word hierarchy and propose a multi-hierarchy metamorphic testing method, MHT, to test machine translation. Hyphenated words, a specific format prone to translation errors, are chosen as our research object. Based on the common notion that translations of words within the same sentence should be similar, we extract contents from different hierarchies within sentences containing hyphenated words and compare the similarity of their corresponding translations for these specific words. We conducted experiments on 881 sentences using Google Translate, Microsoft Bing Translator, and Baidu Translate, detecting 111, 91, and 111 suspicious errors, respectively, with high precision (78.4%, 82.4%, and 81.1%). The translation errors mainly include mistranslation, under-translation, over-translation, and non-translation.
ISSN:2640-0715
DOI:10.1109/APSEC65559.2024.00037
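
Illustrative sketch (not from the paper): the summary describes extracting word, phrase, and sentence hierarchies around a hyphenated word and comparing the similarity of their translations. The Python below is one possible reading of that idea, offered only as a minimal sketch. The context window size, the character-level similarity measure, the 0.6 threshold, and all function names are assumptions, and toy_translate is a hypothetical stand-in for a real translation service that simulates an under-translation.

```python
import re
from difflib import SequenceMatcher
from typing import Callable, List, Tuple

# Assumption: a hyphenated word is two or more alphabetic parts joined by "-".
HYPHENATED = re.compile(r"\b[A-Za-z]+(?:-[A-Za-z]+)+\b")


def hierarchies(sentence: str, window: int = 2) -> List[Tuple[str, str, str]]:
    """Return (word, phrase, sentence) for each hyphenated word.

    The phrase is approximated by a fixed token window around the word;
    the paper may define the phrase hierarchy differently.
    """
    tokens = sentence.split()
    found = []
    for i, token in enumerate(tokens):
        match = HYPHENATED.search(token)
        if match:
            phrase = " ".join(tokens[max(0, i - window): i + window + 1])
            found.append((match.group(0), phrase, sentence))
    return found


def best_overlap(needle: str, haystack: str) -> float:
    """Highest similarity between needle and any same-length slice of haystack."""
    if not needle:
        return 0.0
    n = len(needle)
    if len(haystack) <= n:
        return SequenceMatcher(None, needle, haystack).ratio()
    return max(
        SequenceMatcher(None, needle, haystack[i:i + n]).ratio()
        for i in range(len(haystack) - n + 1)
    )


def suspicious(sentence: str,
               translate: Callable[[str], str],
               threshold: float = 0.6) -> List[str]:
    """Flag hyphenated words whose word-level translation is not reflected
    in the phrase-level or sentence-level translation."""
    flagged = []
    for word, phrase, full in hierarchies(sentence):
        word_translation = translate(word)
        scores = [best_overlap(word_translation, translate(unit))
                  for unit in (phrase, full)]
        if min(scores) < threshold:
            flagged.append(word)
    return flagged


def toy_translate(text: str) -> str:
    """Hypothetical translator: uppercases the text and, for multi-word
    input, drops one hyphenated word to mimic under-translation."""
    out = text.upper()
    if len(text.split()) > 1:
        out = out.replace("STATE-OF-THE-ART", "")
    return out


if __name__ == "__main__":
    demo = "We use a state-of-the-art model and a well-known dataset ."
    print(suspicious(demo, toy_translate))
```

In practice translate would wrap a real translation service, and flagged words would still need manual inspection, since the summary reports suspicious errors together with a precision figure rather than confirmed errors.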