Using reinforcement learning with external rewards for open-domain natural language generation

Bibliographic Details
Published in	Journal of intelligent information systems, Vol. 56, no. 1, pp. 189-206
Main Authors	Srinivasan, Vidhushini; Santhanam, Sashank; Shaikh, Samira
Format	Journal Article
Language	English
Published	New York: Springer US, 01.02.2021; Springer Nature B.V.
Subjects	Arousal; Artificial Intelligence; Computer Science; Data Structures and Information Theory; Information Storage and Retrieval; IT in Business; Learning; Natural language; Natural Language Processing (NLP); Deep learning; Conversational agent; Emotional intelligence; Natural language generation; Reinforcement learning; Human feedback; Seq2seq learning
Summary:	We propose a new approach to emotional natural language generation using a bidirectional seq2seq model. Our goal is to generate emotionally relevant language that accommodates the emotional tone of the prior context. To incorporate emotional information, we train our own embeddings, appended with emotion values given by valence, arousal and dominance scores. We use a reinforcement-learning framework, which is tuned using a policy gradient method. Two of the internal rewards in our reinforcement learning framework, viz. Ease of Answering and Semantic Coherence, are based on prior state-of-the-art work. We propose a new internal reward, Emotional Intelligence, computed by minimizing the affective dissonance between the source and generated text. We also train a separate external reward analyzer to predict the rewards as well as to maximize the expected rewards (both internal and external). We evaluate the system on two common corpora used for Natural Language Generation tasks: the Cornell Movie Dialog and Yelp Restaurant Review corpora. We report standard evaluation metrics, including BLEU, ROUGE-L and perplexity, as well as human evaluation to validate our approach. We demonstrate the ability of the proposed model to generate emotionally appropriate responses on both corpora.
ISSN:	0925-9902, 1573-7675
DOI:	10.1007/s10844-020-00626-5