Generating Facial Expressions for Speech

Bibliographic Details
Published in: Cognitive Science, Vol. 20, No. 1, pp. 1-46
Main Authors: Pelachaud, Catherine; Badler, Norman I.; Steedman, Mark
Format: Journal Article
Language: English
Published: Lawrence Erlbaum Associates, Inc., 10 Industrial Avenue, Mahwah, NJ 07430-2262, USA, 01.01.1996
Additional publishers: Taylor & Francis; Ablex Pub. Corp.; Wiley Subscription Services, Inc.

Summary: This article reports results from a program that produces high-quality animation of facial expressions and head movements as automatically as possible in conjunction with meaning-based speech synthesis, including spoken intonation. The goal of the research is as much to test and define our theories of the formal semantics for such gestures as to produce convincing animation. Towards this end, we have produced a high-level programming language for three-dimensional (3-D) animation of facial expressions. We have been concerned primarily with expressions conveying information correlated with the intonation of the voice: these include the differences of timing, pitch, and emphasis that are related to such semantic distinctions of discourse as “focus,” “topic” and “comment,” “theme” and “rheme,” or “given” and “new” information. We are also interested in the relation of affect or emotion to facial expression. Until now, systems have not embodied such rule-governed translation from spoken utterance meaning to facial expressions. Our system embodies rules that describe and coordinate these relations: intonation/information, intonation/affect, and facial expressions/affect. A meaning representation includes discourse information: what is contrastive or background information in the given context, and what is the “topic” or “theme” of the discourse? The system maps the meaning representation into how accents and their placement are chosen, how they are conveyed over facial expression, and how speech and facial expressions are coordinated. This determines a sequence of functional groups: lip shapes, conversational signals, punctuators, regulators, and manipulators. Our algorithms then impose synchrony, create coarticulation effects, and determine affectual signals as well as eye and head movements. The lowest-level representation is the Facial Action Coding System (FACS), which makes the generation system portable to other facial models.
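The summary describes a pipeline from a discourse-level meaning representation, through accent placement, to functional groups of facial signals grounded in FACS action units. As a rough illustration of that flow, and not the authors' implementation, here is a minimal Python sketch: the theme/rheme accent rule, the Pierrehumbert-style accent labels, and every class and function name are assumptions introduced for illustration (AU numbers 1, 2, and 45 are the standard FACS codes for inner brow raise, outer brow raise, and blink).

# Minimal sketch (not the paper's implementation) of the mapping the
# summary describes: discourse meaning -> pitch accents -> functional
# facial signals -> FACS action units. All names are hypothetical.
from dataclasses import dataclass, field
from typing import Optional

AU_INNER_BROW_RAISE = 1   # standard FACS action unit numbers
AU_OUTER_BROW_RAISE = 2
AU_BLINK = 45

@dataclass
class Word:
    text: str
    theme: bool = False   # part of the discourse "theme" (vs. "rheme")
    new: bool = False     # "new" information (vs. "given")

@dataclass
class AnimatedWord:
    text: str
    pitch_accent: Optional[str] = None        # e.g., H*, L+H*
    action_units: list[int] = field(default_factory=list)

def assign_accent(w: Word) -> Optional[str]:
    """Accent placement: new information in the rheme gets a plain H*
    accent; new information in the theme gets a contrastive L+H* accent
    (a common theme/rheme treatment, used here only as an example)."""
    if not w.new:
        return None
    return "L+H*" if w.theme else "H*"

def conversational_signals(accent: Optional[str]) -> list[int]:
    """Conversational signal synchronized with the accent: an eyebrow
    flash on accented words, one of the functional groups listed above."""
    if accent is None:
        return []
    return [AU_INNER_BROW_RAISE, AU_OUTER_BROW_RAISE]

def add_punctuators(words: list[AnimatedWord]) -> None:
    """Punctuator: a blink at the utterance boundary, an illustrative
    stand-in for pause- and boundary-linked signals."""
    if words:
        words[-1].action_units.append(AU_BLINK)

def animate(utterance: list[Word]) -> list[AnimatedWord]:
    out = []
    for w in utterance:
        accent = assign_accent(w)
        out.append(AnimatedWord(w.text, accent, conversational_signals(accent)))
    add_punctuators(out)
    return out

if __name__ == "__main__":
    # "The CAT chased the DOG": "cat" is new in the theme, "dog" new in the rheme.
    sentence = [
        Word("the", theme=True), Word("cat", theme=True, new=True),
        Word("chased"), Word("the"), Word("dog", new=True),
    ]
    for aw in animate(sentence):
        print(f"{aw.text:8s} accent={(aw.pitch_accent or '-'):5s} AUs={aw.action_units}")

Because the sketch bottoms out in numeric action units rather than a specific facial model, it mirrors the portability point the summary makes about FACS as the lowest-level representation.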
ISSN: 0364-0213
EISSN: 1551-6709
DOI: 10.1207/s15516709cog2001_1