Associated Lattice-BERT for Spoken Language Understanding
| Published in | Neural Information Processing, Vol. 1517, pp. 579–586 |
|---|---|
| Main Authors | |
| Format | Book Chapter |
| Language | English |
| Published | Switzerland: Springer International Publishing AG, 2021 |
| Series | Communications in Computer and Information Science |
| Subjects | |
| ISBN | 9783030923099; 3030923096 |
| ISSN | 1865-0929; 1865-0937 |
| DOI | 10.1007/978-3-030-92310-5_67 |
Summary: Lattices are compact representations that can encode multiple speech recognition hypotheses in spoken language understanding tasks. Previous work has extended pre-trained transformers to model lattice inputs and achieved significant improvements on natural language processing tasks. However, these models consider neither the global probability distribution over lattice paths nor the correlation among multiple speech recognition hypotheses. In this paper, we propose Associated Lattice-BERT, an extension of BERT tailored for spoken language understanding (SLU). Associated Lattice-BERT augments self-attention with positional relation representations and lattice scores to incorporate lattice structure. We further design a lattice confusion-aware attention mechanism in the prediction layer that pushes the model to learn from the association information between lattice confusion paths, mitigating the impact of automatic speech recognition (ASR) errors. We apply the proposed model to a spoken language understanding task; experiments on intent detection datasets show that our method outperforms strong baselines when evaluated on spoken inputs.
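The abstract describes augmenting self-attention with positional relation representations and lattice scores. Below is a minimal PyTorch sketch of one plausible reading of that mechanism: attention logits biased additively by (a) learned embeddings of lattice-derived pairwise positional relations and (b) per-token lattice (ASR posterior) scores. The class name, tensor shapes, relation inventory, and the exact additive form of both biases are assumptions for illustration, not the authors' implementation.

```python
import math
import torch
import torch.nn as nn

class LatticeBiasedSelfAttention(nn.Module):
    """Sketch of self-attention with lattice-relation and lattice-score biases.

    Hypothetical module; illustrates the idea described in the abstract,
    not the paper's actual code.
    """

    def __init__(self, d_model: int, n_heads: int, n_relations: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # One learned scalar bias per (relation type, head). Relation types
        # would encode lattice positional relations between token pairs
        # (e.g. precedes / follows / on competing arcs) -- assumed inventory.
        self.rel_bias = nn.Embedding(n_relations, n_heads)

    def forward(self, x, relations, lattice_scores):
        # x: (B, T, d_model) token representations from the lattice
        # relations: (B, T, T) integer relation ids for each token pair
        # lattice_scores: (B, T) log-probability of each token's lattice arc
        B, T, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        logits = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)  # (B,H,T,T)
        # Positional-relation bias: per-head scalar looked up for each pair.
        logits = logits + self.rel_bias(relations).permute(0, 3, 1, 2)
        # Lattice-score bias: keys on low-probability arcs draw less attention.
        logits = logits + lattice_scores[:, None, None, :]
        attn = logits.softmax(dim=-1)
        ctx = (attn @ v).transpose(1, 2).reshape(B, T, -1)
        return self.out(ctx)

if __name__ == "__main__":
    B, T, D, H, R = 2, 5, 64, 4, 8  # toy sizes for a shape check
    layer = LatticeBiasedSelfAttention(D, H, R)
    x = torch.randn(B, T, D)
    rel = torch.randint(0, R, (B, T, T))
    scores = torch.log_softmax(torch.randn(B, T), dim=-1)
    print(layer(x, rel, scores).shape)  # torch.Size([2, 5, 64])
```

Biasing attention logits with per-pair relation embeddings follows the relative-position style of Shaw et al. (2018); whether the paper injects lattice scores additively as above, or in some other form, is not specified in this record, and the confusion-aware prediction-layer attention is not sketched here.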