LIA: Latent Image Animator

Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PP, pp. 1-16
Main Authors: Wang, Yaohui; Yang, Di; Bremond, Francois; Dantcheva, Antitza
Format: Journal Article
Language: English
Published: IEEE, 23.08.2024

Summary: Previous animation techniques mainly leverage explicit structure representations (e.g., meshes or keypoints) to transfer motion from driving videos to source images. However, such methods struggle with large appearance variations between source and driving data, and they require complex additional modules to model appearance and motion separately. To address these issues, we introduce the Latent Image Animator (LIA), a streamlined model for animating high-resolution images. LIA is designed as a simple autoencoder that does not rely on explicit representations. Motion transfer in pixel space is modeled as linear navigation of motion codes in the latent space. Specifically, such navigation is represented by an orthogonal motion dictionary learned in a self-supervised manner via the proposed Linear Motion Decomposition (LMD). Extensive experimental results demonstrate that LIA outperforms the state of the art on the VoxCeleb, TaichiHD, and TED-talks datasets with respect to video quality and spatio-temporal consistency. In addition, LIA is well equipped for zero-shot high-resolution image animation.
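The core mechanism named in the summary, Linear Motion Decomposition, lends itself to a short illustration: the target motion code is reached by moving the source code along a set of orthogonal directions. Below is a minimal sketch, assuming PyTorch; the dimensions, the names motion_dict and magnitudes, and the QR-based orthonormalization are illustrative stand-ins for the learned components described in the paper, not the authors' implementation.

import torch

# Hypothetical sizes; the paper's actual latent dimension and number of
# motion directions may differ.
latent_dim, num_directions = 512, 20

# Orthogonal motion dictionary D = {d_1, ..., d_m}. LIA learns this basis
# in a self-supervised manner; here a random matrix is orthonormalized with
# a QR decomposition purely to give the columns the orthogonality property.
motion_dict, _ = torch.linalg.qr(torch.randn(latent_dim, num_directions))

# Source latent code (encoded from the source image) and per-direction
# magnitudes (in LIA, predicted from the driving frame).
z_source = torch.randn(1, latent_dim)
magnitudes = torch.randn(1, num_directions)

# Linear navigation in latent space:
#   z_target = z_source + sum_i a_i * d_i
z_target = z_source + magnitudes @ motion_dict.T

print(z_target.shape)  # torch.Size([1, 512])

Because the directions are orthogonal, each magnitude a_i controls an independent component of the motion, which is what makes the navigation "linear" in the sense the summary describes.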
ISSN: 0162-8828
EISSN: 1939-3539
DOI: 10.1109/TPAMI.2024.3449075