She adapts to her student: An expert pragmatic speaker tailoring her referring expressions to the Layman listener

Bibliographic Details
Published in Frontiers in artificial intelligence Vol. 6; p. 1017204
Main Authors Greco, Claudio; Bagade, Diksha; Le, Dieu-Thu; Bernardi, Raffaella
Format Journal Article
Language English
Published Switzerland Frontiers Media S.A 09.03.2023
Summary: Communication is a dynamic process through which interlocutors adapt to each other. In the development of conversational agents, this core aspect has been put aside for several years, since the main challenge was to obtain conversational neural models able to produce utterances and dialogues that are human-like at least at the surface level. Now that this milestone has been achieved, several position papers have advocated paying attention to the dynamic and adaptive interactive aspects of language. In this paper, we focus on how a Speaker adapts to an interlocutor with different background knowledge. Our models undergo a pre-training phase, through which they acquire grounded knowledge by learning to describe an image, and an adaptive phase, through which a Speaker and a Listener play a repeated reference game. Using a similar setting, previous studies focus on how conversational models create new conventions; we are interested, instead, in studying whether the Speaker learns from the Listener's mistakes to adapt to his background knowledge. We evaluate models based on Rational Speech Act (RSA), a likelihood loss, and a combination of the two. We show that RSA could indeed work as a backbone to drive the Speaker toward the Listener: in the combined model, in addition to the improved Listener accuracy, the language generated by the Speaker shows the changes that signal adaptation to the Listener's background knowledge. Specifically, captions for unknown object categories contain more adjectives and fewer direct references to the unknown objects.
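For readers unfamiliar with the Rational Speech Act framework mentioned in the summary, the following is a minimal sketch of the textbook RSA recursion on a toy reference game. It is not the authors' neural model or their training procedure: the two images, the three candidate utterances, the lexicons, and the function names are all hypothetical, chosen only to illustrate why a pragmatic speaker might shift from a category noun to an adjective when the listener does not know the category.

```python
# Toy RSA sketch (illustrative assumption, not the paper's implementation).
# A lexicon maps each utterance to the set of objects the listener
# believes it can describe.

def literal_listener(lexicon, utterance, objects):
    """L0(object | utterance): uniform over objects the utterance is true of."""
    scores = [1.0 if obj in lexicon[utterance] else 0.0 for obj in objects]
    total = sum(scores)
    if total == 0.0:
        # Utterance is true of nothing: fall back to a uniform guess.
        return [1.0 / len(objects)] * len(objects)
    return [s / total for s in scores]


def pragmatic_speaker(lexicon, target, objects, utterances, alpha=1.0):
    """S1(utterance | target), proportional to L0(target | utterance) ** alpha."""
    idx = objects.index(target)
    weights = [
        literal_listener(lexicon, u, objects)[idx] ** alpha for u in utterances
    ]
    total = sum(weights)
    return {u: w / total for u, w in zip(utterances, weights)}


# Hypothetical scene: a target image and one distractor.
objects = ["macaw_img", "sparrow_img"]
utterances = ["macaw", "red", "bird"]

# An expert listener knows that "macaw" picks out only the macaw.
expert = {
    "macaw": {"macaw_img"},
    "red": {"macaw_img"},
    "bird": {"macaw_img", "sparrow_img"},
}

# A layman listener does not know the word "macaw", so for this listener
# the word does not discriminate between the two images.
layman = {
    "macaw": {"macaw_img", "sparrow_img"},
    "red": {"macaw_img"},
    "bird": {"macaw_img", "sparrow_img"},
}

to_expert = pragmatic_speaker(expert, "macaw_img", objects, utterances)
to_layman = pragmatic_speaker(layman, "macaw_img", objects, utterances)

for u in utterances:
    print(f"{u:>6}: expert={to_expert[u]:.2f}  layman={to_layman[u]:.2f}")
```

In this toy setting, the speaker reasoning about the layman listener puts less probability on the unknown noun and more on the adjective, which is the same direction as the adaptation pattern the summary reports. The real models operate on images and generated captions, so this should be read only as an intuition aid.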
Bibliography:
Reviewed by: Constantin Orasan, University of Surrey, United Kingdom; Albert Gatt, Utrecht University, Netherlands
Edited by: Huma Shah, Coventry University, United Kingdom
This article was submitted to Language and Computation, a section of the journal Frontiers in Artificial Intelligence
ISSN: 2624-8212
DOI: 10.3389/frai.2023.1017204