Attentional pixel-wise deformation for pose-based human image generation

Bibliographic Details
Published in	Expert systems with applications Vol. 246; p. 123073
Main Authors	Liao, Fangjian, Zou, Xingxing, Wong, Wai Keung
Format	Journal Article
Language	English
Published	Elsevier Ltd 15.07.2024
Subjects	Deep learning; Generative adversarial network; Person image synthesis; Pose transfer; Spatial transformation

Summary:	Human pose transfer aims to synthesize images of a reference person in a target pose, a task with substantial economic potential for e-commerce and virtual reality. In this paper, we propose a novel method, the Attentional Pixel-wise Deformation Network (APD-Net), for synthesizing human images from a guiding pose and a reference image. Specifically, we leverage attention-based spatial transformation modules and affine transformation modules to generate accurate appearance and to extract pixel-wise details in local regions, producing intermediate results. Additionally, we introduce a confidence map to refine spatial information during the final image synthesis. Domain alignment loss, cycle loss, perceptual and feature matching losses, and contextual loss constrain the synthesized images, while attention loss and fusion loss benefit warped-image generation. We verify the efficacy of the model on the Market-1501 and DeepFashion datasets using quantitative and qualitative measures. Our approach surpasses all previously published state-of-the-art results on most evaluation metrics, e.g., achieving an SSIM score of 0.780, a Sliced Wasserstein Distance score of 9.55, and a Semantic Consistency score of 0.963 on DeepFashion, and an SSIM score of 0.303, a Sliced Wasserstein Distance score of 16.971, and a Semantic Consistency score of 0.729 on Market-1501. Code and pretrained models are available at: https://github.com/LiaoFJ/APD-Net/.
• A model with attention and deformation is proposed for human pose transfer.
• The novel attention-based operation can assist the final image synthesis.
• The model can generate accurate images while preserving the details.
ISSN:	0957-4174 1873-6793
DOI:	10.1016/j.eswa.2023.123073