Speaker-informed time-and-content-aware attention for spoken language understanding

Bibliographic Details
Published in: Computer Speech & Language, Vol. 60, p. 101022
Main Authors: Kim, Jonggu; Jeong, Yewon; Lee, Jong-Hyeok
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.03.2020
Summary: To mitigate the ambiguity of spoken language understanding (SLU) for an utterance, we propose contextual models that effectively exploit temporal and content-related information from the relevant context. We organize the models along two axes: ‘Awareness’ and ‘Attention Level’. Awareness comprises three methods that consider the timing or content similarity of the context, while Attention Level comprises three methods that use speaker roles to weight the importance of each historic utterance. Combining one method from each axis yields a family of contextual models that learn from data how important previous utterances are in terms of time and content. We also propose several kinds of speaker information that help improve SLU accuracy. The proposed models achieved state-of-the-art F1 scores on the Dialog State Tracking Challenge (DSTC) 4 and Loqui benchmark datasets, and in-depth analysis confirmed that the proposed methods are effective in improving SLU accuracy.
ISSN: 0885-2308, 1095-8363
DOI: 10.1016/j.csl.2019.101022
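The summary's core idea — weighting each previous utterance by both its content similarity to the current utterance and its recency — can be sketched roughly as follows. This is a minimal illustration only, not the paper's model: the function name, the dot-product similarity, and the fixed exponential time penalty (`time_decay`) are all assumptions made here for clarity, whereas the paper learns the time- and content-importance of previous utterances from data.

```python
import numpy as np

def contextual_attention(current, history, time_decay=0.5):
    """Toy content-and-time-aware attention over previous utterances.

    current:    (d,) embedding of the current utterance
    history:    (n, d) embeddings of previous utterances, oldest first
    time_decay: fixed per-step penalty favoring recent utterances
                (an assumption; the paper learns this from data)

    Returns an attended context vector (d,) and the weights (n,).
    """
    n = history.shape[0]
    # Content awareness: dot-product similarity with the current utterance.
    scores = history @ current
    # Time awareness: penalize utterances by their distance from the present.
    ages = np.arange(n - 1, -1, -1)  # most recent utterance has age 0
    scores = scores - time_decay * ages
    # Softmax over the history to obtain attention weights.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: importance-weighted sum of previous utterances.
    context = weights @ history
    return context, weights
```

With `time_decay=0` the weights depend only on content similarity; larger values shift attention toward recent utterances, mimicking the time-awareness axis.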