SATFace: Subject Agnostic Talking Face Generation with Natural Head Movement

Bibliographic Details
Published in: Neural Processing Letters, Vol. 55, No. 6, pp. 7529-7542
Main Authors: Yang, Shuai; Qiao, Kai; Shi, Shuhao; Yang, Jie; Ma, Dekui; Hu, Guoen; Yan, Bin; Chen, Jian
Format: Journal Article
Language: English
Published: New York: Springer US (Springer Nature B.V.), 01.12.2023

Summary: Talking face generation is widely used in education, entertainment, shopping, and other social settings. Existing methods focus on matching the speaker's mouth shape to the speech content, but there is little research on automatically extracting latent head-motion features from speech, so the generated videos lack naturalness. This paper proposes SATFace, a subject-agnostic talking face generation method with natural head movement. To model the complicated and critical features of a talking face (identity, background, mouth shape, head posture, etc.), SATFace is built on an encoder-decoder as the primary network architecture. A long short-time feature learning network is then designed to better exploit the global and local information in the audio for generating plausible head movement. In addition, a modular training process is proposed to improve the effectiveness and efficiency of learning both explicit and implicit features. Experimental comparisons show that SATFace improves on mainstream methods by at least about 9.8% in cumulative probability of blur detection and 8.2% in synchronization confidence. Mean opinion scores show that SATFace has advantages in lip-sync quality, naturalness of head movement, and video realness.
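
To make the "long short-time" idea in the summary concrete, below is a minimal sketch (not the authors' code) of an audio encoder with a short-time branch for local, frame-level cues and a long-time branch for a global utterance summary, fused to predict a per-frame head-pose sequence. The module names, layer sizes, pose parameterization, and fusion strategy are all assumptions for illustration only.

```python
# Hypothetical sketch of a long/short-time audio feature module for head-pose
# prediction, loosely following the abstract's description. Not the SATFace code.
import torch
import torch.nn as nn

class LongShortTimeAudioEncoder(nn.Module):
    def __init__(self, n_mels=80, hidden=256, pose_dim=6):
        super().__init__()
        # Short-time branch: small-kernel 1-D convolutions over mel frames
        # capture local (phoneme-level) cues.
        self.short_branch = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Long-time branch: a recurrent layer summarizes the whole clip,
        # providing global context for head movement.
        self.long_branch = nn.GRU(n_mels, hidden, batch_first=True)
        # Fusion head: per-frame local features + broadcast global summary
        # -> per-frame head-pose parameters (e.g. rotation + translation).
        self.pose_head = nn.Linear(hidden * 2, pose_dim)

    def forward(self, mel):                       # mel: (B, T, n_mels)
        local = self.short_branch(mel.transpose(1, 2)).transpose(1, 2)   # (B, T, H)
        _, global_h = self.long_branch(mel)       # global_h: (1, B, H)
        global_feat = global_h[-1].unsqueeze(1).expand(-1, mel.size(1), -1)
        fused = torch.cat([local, global_feat], dim=-1)                  # (B, T, 2H)
        return self.pose_head(fused)              # (B, T, pose_dim)

# Usage: predict a head-pose sequence from a batch of mel spectrograms.
poses = LongShortTimeAudioEncoder()(torch.randn(2, 100, 80))
print(poses.shape)  # torch.Size([2, 100, 6])
```

In such a design, the predicted pose sequence would condition the encoder-decoder generator alongside identity and mouth-shape features; the abstract does not specify that interface, so it is omitted here.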
ISSN: 1370-4621; 1573-773X
DOI: 10.1007/s11063-023-11272-7