Integrating Event Elements for Chinese-Vietnamese Cross-Lingual Event Retrieval

Bibliographic Details
Published in	IEICE Transactions on Information and Systems, Vol. E107.D, No. 10, pp. 1353-1361
Main Authors	HUANG, Yuxin; YANG, Yuanlin; ZHU, Enchang; LIANG, Yin; XIAN, Yantuan
Format	Journal Article
Language	English
Published	The Institute of Electronics, Information and Communication Engineers 01.10.2024
Subjects	attention mechanism; Chinese-Vietnamese cross-lingual event retrieval; event elements; pre-trained language model
Summary:	Chinese-Vietnamese cross-lingual event retrieval aims to retrieve, from a set of Vietnamese sentences, the one that describes the same event as a given Chinese query sentence. Existing mainstream cross-lingual event retrieval methods extract a textual representation of the query and compute its similarity with the textual representations of candidates in the other language. However, these methods ignore differences in the event elements of the two sentences. Consequently, sentences with similar meanings but different event elements may be incorrectly judged to describe the same event. To address this problem, we propose a cross-lingual retrieval method that integrates event elements. We introduce event elements as an additional supervisory signal: an attention mechanism computes the semantic similarity between the event elements of two sentences and yields attention scores for those elements, which allows us to establish a one-to-one correspondence between the event elements of the two texts. Additionally, we use a multilingual pre-trained language model fine-tuned with contrastive learning to obtain cross-lingual sentence representations, from which we calculate the semantic similarity of the sentences. Combining the two similarities gives the final text similarity score. Experimental results demonstrate that the proposed method achieves higher retrieval accuracy than the baseline model.
ISSN:	0916-8532, 1745-1361
DOI:	10.1587/transinf.2024EDP7055
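
The summary describes combining a sentence-level similarity with an attention-based similarity over event elements, but this record gives no formulas. The following is only a minimal sketch of that general idea, not the authors' implementation. It assumes sentence and event-element embeddings are already available as plain lists of floats (in the paper they would come from a fine-tuned multilingual pre-trained model); the names `element_score`, `combined_score`, `retrieve`, the softmax temperature, and the mixing weight `alpha` are all hypothetical.

```python
import math

TEMPERATURE = 0.1  # assumed softmax temperature; not taken from the paper


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def element_score(query_elems, cand_elems):
    """Attention-weighted similarity between two sets of element vectors.

    Each query element attends over the candidate elements; a low
    temperature pushes the weights toward a one-to-one alignment.
    """
    if not query_elems or not cand_elems:
        return 0.0
    per_element = []
    for q in query_elems:
        sims = [cosine(q, c) for c in cand_elems]
        weights = softmax([s / TEMPERATURE for s in sims])
        per_element.append(sum(w * s for w, s in zip(weights, sims)))
    return sum(per_element) / len(per_element)


def combined_score(query, cand, alpha=0.5):
    """Mix sentence similarity and element similarity (alpha is assumed)."""
    sent = cosine(query["sent"], cand["sent"])
    elem = element_score(query["elems"], cand["elems"])
    return alpha * sent + (1.0 - alpha) * elem


def retrieve(query, candidates, alpha=0.5):
    """Return the index of the highest-scoring candidate."""
    return max(range(len(candidates)),
               key=lambda i: combined_score(query, candidates[i], alpha))
```

With `alpha=1.0` the sketch reduces to plain sentence-embedding retrieval, which makes it easy to see the case the summary warns about: a candidate whose sentence vector is close to the query but whose event elements differ can rank first until the element term is included.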