ConDA: state-based data augmentation for context-dependent text-to-SQL

Bibliographic Details
Published in	International journal of machine learning and cybernetics, Vol. 15, no. 8, pp. 3157-3168
Main Authors	Wang, Dingzirui; Dou, Longxu; Che, Wanxiang; Wang, Jiaqi; Liu, Jinbo; Li, Lixin; Shang, Jingan; Tao, Lei; Zhang, Jie; Fu, Cong; Song, Xuri
Format	Journal Article
Language	English
Published	Berlin/Heidelberg: Springer Berlin Heidelberg, 01.08.2024; Springer Nature B.V.
Subjects	Annotations; Artificial Intelligence; Complex Systems; Computational Intelligence; Context; Control; Data augmentation; Datasets; Effectiveness; Engineering; Mechatronics; Methods; Original Article; Pattern Recognition; Performance evaluation; Query languages; Robotics; Semantics; Systems Biology; Context-dependent text-to-SQL; Semantic parsing; Natural language processing
Summary:	The context-dependent text-to-SQL task has significant real-world value: it helps users extract knowledge from large databases by letting them request information interactively, which improves accuracy. Unfortunately, current models struggle with this task because training data are scarce, a consequence of the high annotation overhead. The most straightforward remedy is data augmentation, which aims to scale up the parsing corpus. However, naive augmentation methods suffer from low diversity in the augmented data. To address this limitation, we propose state-based CONtext-dependent text-to-SQL Data Augmentation (ConDA), which generates and filters augmented data based on the dialogue state, yielding data with higher diversity. Experimental results show that ConDA improves performance on all experimental datasets, with an average gain of 1.6%, demonstrating the effectiveness of our method.
ISSN:	1868-8071 1868-808X
DOI:	10.1007/s13042-023-02086-z