Improving Low-resource Dependency Parsing Using Multi-strategy Data Augmentation

Bibliographic Details
Published in Ji suan ji ke xue Vol. 49; no. 1; pp. 73-79
Main Authors Xian, Yan-tuan, Gao, Fan-ya, Xiang, Yan, Yu, Zheng-tao, Wang, Jian
Format Journal Article
Language Chinese
Published Chongqing Guojia Kexue Jishu Bu 01.01.2022
Editorial office of Computer Science
Summary: Dependency parsing aims to identify the syntactic dependencies between words in a sentence. It can provide syntactic features for tasks such as information extraction, question answering and machine translation, and thereby improve model performance. A lack of training data causes serious unknown-word and model-overfitting problems. This paper proposes several data augmentation strategies for low-resource dependency parsing. The proposed method expands the training data through synonym replacement, which alleviates the unknown-word problem. Several Mixup-based data augmentation strategies alleviate model overfitting and improve the model's generalization ability. Experimental results on the Universal Dependencies (UD) treebanks show that the proposed method effectively improves the performance of Thai, Vietnamese and English dependency parsing under the ...
ISSN: 1002-137X
DOI: 10.11896/jsjkx.210900036