Zero and Few-Shot Localization of Task-Oriented Dialogue Agents with a Distilled Representation
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 18.02.2023 |
Subjects | |
Summary: | Task-oriented Dialogue (ToD) agents are mostly limited to a few widely-spoken
languages, mainly due to the high cost of acquiring training data for each
language. Existing low-cost approaches that rely on cross-lingual embeddings or
naive machine translation sacrifice substantial accuracy for data efficiency and
largely fail to create a usable dialogue agent. We propose automatic methods
that use ToD training data in a source language to build a high-quality
functioning dialogue agent in another target language that has no training data
(i.e., zero-shot) or a small training set (i.e., few-shot). Unlike most prior
work in cross-lingual ToD, which focuses only on Dialogue State Tracking (DST),
we build an end-to-end agent.
We show that our approach closes the accuracy gap between few-shot and
existing full-shot methods for ToD agents. We achieve this by (1) improving the
dialogue data representation, (2) improving entity-aware machine translation,
and (3) automatically filtering noisy translations.
We evaluate our approach on the recent bilingual dialogue dataset BiToD. In
Chinese-to-English transfer in the zero-shot setting, our method achieves
46.7% and 22.0% in Task Success Rate (TSR) and Dialogue Success Rate (DSR)
respectively. In the few-shot setting where 10% of the data in the target
language is used, we improve on the state of the art by 15.2% and 14.0%, coming
within 5% of full-shot training. |
---|---|
DOI: | 10.48550/arxiv.2302.09424 |