Learning Motion Refinement for Unsupervised Face Animation
Unsupervised face animation aims to generate a human face video based on the appearance of a source image, mimicking the motion from a driving video. Existing methods typically adopted a prior-based motion model (e.g., the local affine motion model or the local thin-plate-spline motion model). While...
Saved in:
Main Authors | , , , |
---|---|
Format | Journal Article |
Language | English |
Published |
21.10.2023
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Unsupervised face animation aims to generate a human face video based on the
appearance of a source image, mimicking the motion from a driving video.
Existing methods typically adopted a prior-based motion model (e.g., the local
affine motion model or the local thin-plate-spline motion model). While it is
able to capture the coarse facial motion, artifacts can often be observed
around the tiny motion in local areas (e.g., lips and eyes), due to the limited
ability of these methods to model the finer facial motions. In this work, we
design a new unsupervised face animation approach to learn simultaneously the
coarse and finer motions. In particular, while exploiting the local affine
motion model to learn the global coarse facial motion, we design a novel motion
refinement module to compensate for the local affine motion model for modeling
finer face motions in local areas. The motion refinement is learned from the
dense correlation between the source and driving images. Specifically, we first
construct a structure correlation volume based on the keypoint features of the
source and driving images. Then, we train a model to generate the tiny facial
motions iteratively from low to high resolution. The learned motion refinements
are combined with the coarse motion to generate the new image. Extensive
experiments on widely used benchmarks demonstrate that our method achieves the
best results among state-of-the-art baselines. |
---|---|
DOI: | 10.48550/arxiv.2310.13912 |