Position-based focal loss for diverse and relevant response generation

Bibliographic Details
Published in	Applied Soft Computing, Vol. 165, p. 112037
Main Authors	Kim, So-Eon; Park, Seong-Bae
Format	Journal Article
Language	English
Published	Elsevier B.V. 01.11.2024
Subjects	Focal loss; NLP; Position; Response generation

More Information
Summary:	Response generation models trained with cross-entropy loss tend to produce over-general responses because they favor high-frequency tokens. Focal loss and anti-focal loss are candidate remedies, but each is limited in that it emphasizes only one of relevancy or diversity of responses. This paper therefore proposes two novel losses, positional focal loss and adaptive positional focal loss, which emphasize relevancy or diversity flexibly according to the position of a target token. The positional focal loss applies a position function as a weight on the token position, but it tends to underestimate relevancy for low-confidence predictions. To tackle this problem, the adaptive positional focal loss balances relevancy and diversity by limiting the effect of over-confident predictions.
• This is the first attempt to show that, for relevant and diverse response generation, a model needs to be trained with a different emphasis on relevancy or diversity for each token.
• This paper keeps a balance between the relevancy and diversity of a response by proposing two novel losses.
• This paper proposes various position functions and validates their efficiency through experiments.
ISSN:	1568-4946
DOI:	10.1016/j.asoc.2024.112037
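
The summary above describes losses that re-weight each target token by its position. The sketch below is only an illustration of that general idea, not the paper's formulation, which this record does not give. The focal loss follows its standard definition; the anti-focal form is written as it is commonly stated and should be checked against its original source. The names `linear_position` and `position_mixed_loss`, the linear position function, and the choice of which end of the response receives which factor are all hypothetical assumptions.

```python
import math


def cross_entropy(p):
    """Token-level cross entropy for a target-token probability p in (0, 1]."""
    return -math.log(p)


def focal_loss(p, gamma=2.0):
    """Standard focal loss: down-weights tokens predicted with high confidence."""
    return (1.0 - p) ** gamma * -math.log(p)


def anti_focal_loss(p, gamma=2.0, alpha=1.0):
    """Anti-focal loss as commonly written: up-weights confident tokens."""
    return (1.0 + alpha * p) ** gamma * -math.log(p)


def linear_position(t, length):
    """Hypothetical position function: 0.0 at the first token, 1.0 at the last."""
    if length <= 1:
        return 0.0
    return t / (length - 1)


def position_mixed_loss(probs, gamma=2.0, position_fn=linear_position):
    """Hypothetical position-dependent loss, averaged over the target tokens.

    probs[t] is the model probability of the gold token at position t.
    The position weight w interpolates between the anti-focal factor (w = 0)
    and the focal factor (w = 1). This direction is an arbitrary assumption.
    """
    n = len(probs)
    total = 0.0
    for t, p in enumerate(probs):
        w = position_fn(t, n)
        total += (1.0 - w) * anti_focal_loss(p, gamma) + w * focal_loss(p, gamma)
    return total / n


if __name__ == "__main__":
    # Made-up gold-token probabilities for a four-token response.
    example = [0.9, 0.6, 0.3, 0.8]
    print(position_mixed_loss(example))
```

Any function mapping a token index to a weight in [0, 1] can be passed as `position_fn`, which makes it easy to compare candidate position functions on the same probabilities.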