PuppeteerGAN: Arbitrary Portrait Animation With Semantic-Aware Appearance Transformation

Bibliographic Details
Published in: Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 13515-13524
Main Authors: Chen, Zhuo; Wang, Chaoyue; Yuan, Bo; Tao, Dacheng
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2020
ISSN: 1063-6919
DOI: 10.1109/CVPR42600.2020.01353

Summary: Portrait animation, which aims to bring a still portrait to life using poses extracted from target frames, is an important technique for many real-world entertainment applications. Although recent works have achieved highly realistic results in synthesizing or controlling human head images, the puppeteering of arbitrary portraits still faces the following challenges: 1) identity/personality mismatch; 2) training data/domain limitations; and 3) low efficiency in training/fine-tuning. In this paper, we devise a novel two-stage framework called PuppeteerGAN to address these challenges. Specifically, we first learn an identity-preserving semantic segmentation animation that performs pose retargeting between arbitrary portraits. As a general representation, the semantic segmentation results can be adapted to different datasets, environmental conditions, or appearance domains. The synthesized semantic segmentation is then filled with the appearance of the source portrait. To this end, an appearance transformation network is presented to produce high-fidelity output by jointly considering the warping of semantic features and conditional generation. After training, the two networks can directly perform end-to-end inference on unseen subjects without any retraining or fine-tuning. Extensive experiments in cross-identity/domain/resolution settings demonstrate the superiority of the proposed PuppeteerGAN over existing portrait animation methods in both generation quality and inference speed.
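
To make the two-stage pipeline described in the summary concrete, below is a minimal PyTorch sketch of the inference flow: a first network retargets the source portrait's semantic segmentation to a target pose, and a second network fills that segmentation with the source appearance by combining warped appearance features with conditional generation. All class names, layer choices, tensor shapes, and the landmark-based pose encoding are illustrative assumptions; the paper's actual architectures and training losses are not reproduced here.

```python
import torch
import torch.nn as nn

class PoseRetargetingNet(nn.Module):
    """Stage 1 (sketch): animate the source portrait's semantic
    segmentation to follow the pose extracted from a target frame.
    Hypothetical encoder-decoder; not the paper's exact architecture."""
    def __init__(self, num_classes=19, pose_dim=68 * 2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(num_classes, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.pose_fc = nn.Linear(pose_dim, 128)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, num_classes, 4, stride=2, padding=1),
        )

    def forward(self, source_seg, target_pose):
        feat = self.encoder(source_seg)
        # Inject the target pose into the identity-preserving
        # segmentation features via a broadcast addition.
        pose = self.pose_fc(target_pose)[..., None, None]
        return self.decoder(feat + pose)

class AppearanceTransformNet(nn.Module):
    """Stage 2 (sketch): fill the retargeted segmentation with the source
    portrait's appearance by jointly using feature warping and
    conditional generation."""
    def __init__(self, num_classes=19):
        super().__init__()
        self.app_enc = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU())
        self.seg_enc = nn.Sequential(
            nn.Conv2d(num_classes, 64, 3, stride=2, padding=1), nn.ReLU())
        # Predicts a dense flow field used to warp source appearance features.
        self.flow_head = nn.Conv2d(128, 2, 3, padding=1)
        self.decoder = nn.Sequential(
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, source_img, retargeted_seg):
        app = self.app_enc(source_img)       # source appearance features
        seg = self.seg_enc(retargeted_seg)   # target-pose layout features
        flow = self.flow_head(torch.cat([app, seg], dim=1))
        b, _, h, w = flow.shape
        # Sampling grid = identity grid + predicted offsets, then warp.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
        base = torch.stack([xs, ys], dim=-1).to(flow).expand(b, h, w, 2)
        warped = nn.functional.grid_sample(
            app, base + flow.permute(0, 2, 3, 1), align_corners=True)
        # Conditional generation from warped appearance + semantic layout.
        return self.decoder(torch.cat([warped, seg], dim=1))

# End-to-end inference on an unseen subject (no retraining or fine-tuning),
# mirroring the summary's description; dummy tensors stand in for real data.
seg = torch.randn(1, 19, 64, 64)   # source semantic segmentation
pose = torch.randn(1, 136)         # target pose, e.g., 68 2-D landmarks
img = torch.randn(1, 3, 64, 64)    # source portrait image
retargeted = PoseRetargetingNet()(seg, pose)
frame = AppearanceTransformNet()(img, retargeted)  # animated frame (1, 3, 64, 64)
```

Routing pose retargeting through the semantic segmentation, as in this sketch, is what lets the first stage remain dataset- and domain-agnostic: the segmentation abstracts away appearance, so only the second stage needs to reason about texture and color.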