English-Vietnamese Cross-Lingual Paraphrase Identification Using MT-DNN

Bibliographic Details
Published in Engineering, Technology & Applied Science Research Vol. 11; no. 5; pp. 7598-7604
Main Authors Chi, H. V. T., Anh, D. L., Thanh, N. L., Dinh, D.
Format Journal Article
Language English
Published 01.10.2021
Summary: Paraphrase identification is a crucial task in natural language understanding, especially in cross-language information retrieval. The Multi-Task Deep Neural Network (MT-DNN) has become a state-of-the-art method that achieves outstanding results in paraphrase identification [1]. In this paper, we propose a method based on MT-DNN [2] to detect similarities between English and Vietnamese sentences. We replaced the shared layers of the original MT-DNN, which use BERT [3], with pre-trained multilingual models such as M-BERT [3] or XLM-R [4] so that the model can work on cross-language (in our case, English and Vietnamese) information retrieval. We also added some tasks as improvements to obtain better results. As a result, we gained increases of 2.3% and 2.5% in evaluated accuracy and F1, respectively. The proposed method was also applied to other language pairs, namely English-German and English-French. With those implementations, we obtained a 1.0%/0.7% improvement for English-German and a 0.7%/0.5% improvement for English-French.
ISSN: 2241-4487, 1792-8036
DOI: 10.48084/etasr.4300
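
The summary reports its gains in terms of accuracy and F1 on a binary paraphrase decision. As a reminder of how those two metrics are usually computed, here is a minimal sketch in plain Python. The function name `accuracy_and_f1` and the example labels are hypothetical and are not taken from the paper; the paper's own evaluation code may differ (for example in how F1 is averaged).

```python
from typing import Sequence, Tuple


def accuracy_and_f1(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels, where 1 = paraphrase and 0 = not."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("at least one example is required")

    # Count true positives, false positives, false negatives, and correct predictions.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    correct = sum(1 for g, p in zip(gold, pred) if g == p)

    accuracy = correct / len(gold)
    # F1 = 2*TP / (2*TP + FP + FN); returned as 0.0 when the denominator is zero.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical labels for six sentence pairs (illustrative only, not from the paper).
gold_labels = [1, 0, 1, 1, 0, 0]
pred_labels = [1, 0, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(gold_labels, pred_labels)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

F1 is computed here for the positive (paraphrase) class only, which is the common convention for paraphrase benchmarks; that choice is an assumption about the paper's setup.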