Learning Spatially-Adaptive Style-Modulation Networks for Single Image Synthesis

Bibliographic Details
Published in: 2023 IEEE International Conference on Image Processing (ICIP), pp. 1455-1459
Main Authors: Shen, Jianghao; Wu, Tianfu
Format: Conference Proceeding
Language: English
Published: IEEE, 08.10.2023
Summary: Recently there has been growing interest in learning generative models from a single image. This task is important because in many real-world applications collecting a large dataset is not feasible. Existing work such as SinGAN can synthesize novel images that resemble the patch distribution of the training image. However, SinGAN cannot learn high-level semantics of the image, so its synthesized samples tend to have unrealistic spatial layouts. To address this issue, this paper proposes a spatially adaptive style-modulation (SASM) module that learns to preserve the realistic spatial configuration of images. Specifically, it separately extracts a style vector (in the form of channel-wise attention) and a latent spatial mask (in the form of spatial attention) from a coarse-level feature. The style vector and spatial mask are then aggregated to modulate features of deeper layers. This disentangled modulation of spatial and style attributes enables the model to preserve the spatial structure of the image without overfitting. Experimental results show that the proposed module generates samples with better fidelity than prior work.
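To make the mechanism described in the summary concrete, below is a minimal, hypothetical PyTorch sketch of a SASM-style block: a coarse-level feature produces a channel-wise style vector and a single-channel spatial mask, which are then combined to modulate a deeper feature map. The layer choices, the SASM class name, and the multiplicative aggregation rule are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SASM(nn.Module):
    """Hypothetical sketch of a spatially adaptive style-modulation block."""

    def __init__(self, coarse_channels: int, deep_channels: int):
        super().__init__()
        # Style branch: global pooling + 1x1 conv -> per-channel weights (channel-wise attention).
        self.style = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(coarse_channels, deep_channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial branch: 1x1 conv -> single-channel mask (spatial attention).
        self.mask = nn.Sequential(
            nn.Conv2d(coarse_channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, coarse: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        style = self.style(coarse)                      # (B, C_deep, 1, 1)
        mask = self.mask(coarse)                        # (B, 1, H_c, W_c)
        # Resize the mask to the deeper feature's resolution before modulation.
        mask = F.interpolate(mask, size=deep.shape[-2:],
                             mode="bilinear", align_corners=False)
        # Aggregate: channel-wise style scaling gated by the spatial mask.
        return deep * style * mask


# Usage example with toy tensor shapes (assumed, for illustration only).
if __name__ == "__main__":
    block = SASM(coarse_channels=64, deep_channels=128)
    coarse = torch.randn(1, 64, 16, 16)
    deep = torch.randn(1, 128, 64, 64)
    out = block(coarse, deep)
    print(out.shape)  # torch.Size([1, 128, 64, 64])
```

Under these assumptions, the style vector rescales each channel of the deeper feature while the upsampled mask gates where in the image that modulation applies, which is one plausible way to keep the two attributes disentangled as the summary describes.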
DOI:10.1109/ICIP49359.2023.10222483