English-Vietnamese Cross-Lingual Paraphrase Identification Using MT-DNN
Published in | Engineering, technology & applied science research Vol. 11; no. 5; pp. 7598 - 7604 |
---|---|
Format | Journal Article |
Language | English |
Published | 01.10.2021 |
Summary: | Paraphrase identification is a crucial task in natural language understanding, especially in cross-language information retrieval. The Multi-Task Deep Neural Network (MT-DNN) has become a state-of-the-art method that achieves strong results in paraphrase identification [1]. In this paper, we propose a method based on MT-DNN [2] to detect similarity between English and Vietnamese sentences. We replaced the shared layers of the original MT-DNN, which use BERT [3], with pre-trained multilingual models such as M-BERT [3] or XLM-R [4] so that the model can work on cross-language (in our case, English and Vietnamese) information retrieval. We also added some tasks to further improve the results. As a result, accuracy and F1 increased by 2.3% and 2.5%, respectively. The proposed method was also applied to other language pairs, namely English – German and English – French, yielding a 1.0%/0.7% improvement for English – German and a 0.7%/0.5% improvement for English – French. |
---|---|
ISSN: | 2241-4487 1792-8036 |
DOI: | 10.48084/etasr.4300 |
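
The summary reports its gains in accuracy and F1. As a minimal sketch of how these two metrics are commonly computed for binary paraphrase identification: the record does not say how the paper scored its models, so the standard definitions are assumed here. The function name `accuracy_and_f1` and the sample labels are hypothetical and are not taken from the paper.

```python
def accuracy_and_f1(gold, pred):
    """Return (accuracy, F1) for binary labels, where 1 marks a paraphrase pair.

    Assumption: F1 is computed on the positive (paraphrase) class only.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")

    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    correct = sum(1 for g, p in pairs if g == p)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical gold labels and predictions for six sentence pairs.
gold = [1, 0, 1, 1, 0, 0]
pred = [1, 0, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(gold, pred)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

Under these definitions, an improvement such as the reported 2.3% would be the difference between the scores of two models evaluated on the same test set.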