She adapts to her student: An expert pragmatic speaker tailoring her referring expressions to the Layman listener

Bibliographic Details
Published in	Frontiers in artificial intelligence Vol. 6; p. 1017204
Main Authors	Greco, Claudio; Bagade, Diksha; Le, Dieu-Thu; Bernardi, Raffaella
Format	Journal Article
Language	English
Published	Switzerland: Frontiers Media S.A., 09.03.2023
Subjects	adaptive model; Artificial Intelligence; grounded reference game; image captioning generation; knowledge disparity; neural conversational models
Summary:	Communication is a dynamic process through which interlocutors adapt to each other. In the development of conversational agents, this core aspect has been put aside for several years, since the main challenge was to obtain conversational neural models able to produce utterances and dialogues that are human-like at least at the surface level. Now that this milestone has been achieved, several position papers have advocated paying attention to the dynamic and adaptive interactive aspects of language. In this paper, we focus on how a Speaker adapts to an interlocutor with different background knowledge. Our models undergo a pre-training phase, through which they acquire grounded knowledge by learning to describe an image, and an adaptive phase, through which a Speaker and a Listener play a repeated reference game. Using a similar setting, previous studies focus on how conversational models create new conventions; we are interested, instead, in studying whether the Speaker learns from the Listener's mistakes to adapt to his background knowledge. We evaluate models based on Rational Speech Act (RSA), a likelihood loss, and a combination of the two. We show that RSA could indeed work as a backbone to drive the Speaker toward the Listener: in the combined model, apart from the Listener's improved accuracy, the language generated by the Speaker features changes that signal adaptation to the Listener's background knowledge. Specifically, captions for unknown object categories contain more adjectives and fewer direct references to the unknown objects.
Bibliography:	Reviewed by: Constantin Orasan, University of Surrey, United Kingdom; Albert Gatt, Utrecht University, Netherlands. Edited by: Huma Shah, Coventry University, United Kingdom. This article was submitted to Language and Computation, a section of the journal Frontiers in Artificial Intelligence.
ISSN:	2624-8212
DOI:	10.3389/frai.2023.1017204