One-shot Ultra-high-Resolution Generative Adversarial Network That Synthesizes 16K Images On A Single GPU
Published in | Image and vision computing Vol. 139; p. 104815 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V., 01.11.2023 |
Summary: |
• One-shot ultra-high-resolution image synthesizer trainable on a single consumer GPU.
• Can synthesize high-quality 16K images with 12.5 GB and 4K images with 8 GB of memory.
• Seamless subregion-wise super-resolution with minimal additional memory overhead.
• Vertical coordinate convolution that improves visual coherence while preserving diversity.
• Can synthesize large shapes with fine details and long-range coherence.
We propose a one-shot ultra-high-resolution generative adversarial network (OUR-GAN) framework that generates non-repetitive 16K (16,384×8,640) images from a single training image and is trainable on a single consumer GPU. OUR-GAN first generates a low-resolution initial image that is visually plausible and varied in shape, and then gradually increases the resolution by adding detail through super-resolution. Because OUR-GAN learns from a real ultra-high-resolution (UHR) image, it can synthesize large shapes with fine details and long-range coherence, which is difficult to achieve with conventional generative models that rely on the patch distribution learned from relatively small images. OUR-GAN can synthesize high-quality 16K images with 12.5 GB of GPU memory and 4K images with only 4.29 GB, because it synthesizes a UHR image part by part through seamless subregion-wise super-resolution. Additionally, OUR-GAN improves visual coherence while maintaining diversity by applying vertical positional convolution. In experiments on the ST4K and RAISE datasets, OUR-GAN exhibited improved fidelity, visual coherence, and diversity compared with the baseline one-shot synthesis models. To the best of our knowledge, OUR-GAN is the first one-shot image synthesizer that generates non-repetitive UHR images on a single consumer GPU. The synthesized image samples are presented at https://our-gan.github.io. |
---|---|
ISSN: | 0262-8856, 1872-8138 |
DOI: | 10.1016/j.imavis.2023.104815 |
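
The summary attributes the low memory use to processing the image part by part, and the coherence gain to a vertical positional signal. The following is a minimal NumPy sketch of both ideas, not the OUR-GAN implementation. It assumes a toy stand-in for the learned super-resolution network (`toy_sr`: a 3×3 mean filter followed by 2× nearest-neighbour upsampling), and every function name here is hypothetical. It illustrates why giving each subregion a small halo of neighbouring pixels can make the stitched result match a full-image pass, and how a vertical coordinate channel might be appended to a feature map before a convolution.

```python
import numpy as np


def smooth_valid(x):
    """3x3 mean filter without padding: (h, w) -> (h - 2, w - 2)."""
    h, w = x.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += x[dy:dy + h - 2, dx:dx + w - 2]
    return out / 9.0


def toy_sr(padded):
    """Assumed stand-in for a learned SR network: smooth, then 2x nearest upsampling."""
    y = smooth_valid(padded)
    return np.repeat(np.repeat(y, 2, axis=0), 2, axis=1)


def tiled_sr(img, tile):
    """Apply toy_sr tile by tile; each tile carries a 1-pixel halo of context."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")  # pad once so border tiles also get a halo
    out = np.empty((2 * h, 2 * w), dtype=np.float64)
    for y0 in range(0, h, tile):
        for x0 in range(0, w, tile):
            y1, x1 = min(y0 + tile, h), min(x0 + tile, w)
            patch = padded[y0:y1 + 2, x0:x1 + 2]  # tile plus its halo
            out[2 * y0:2 * y1, 2 * x0:2 * x1] = toy_sr(patch)
    return out


def add_vertical_coord(features):
    """Append one channel holding each row's vertical position in [-1, 1]."""
    h, w, _ = features.shape
    rows = np.linspace(-1.0, 1.0, h)
    coord = np.broadcast_to(rows[:, None, None], (h, w, 1))
    return np.concatenate([features, coord], axis=2)


# Usage: compare a full-image pass with a tile-by-tile pass.
img = np.random.default_rng(0).random((10, 13))
full = toy_sr(np.pad(img, 1, mode="edge"))
tiled = tiled_sr(img, tile=4)
print(np.allclose(full, tiled))
```

In this sketch the halo width equals the filter radius (one pixel), so each tile sees exactly the context the full pass would use. A real network has a much larger receptive field, so the overlap it needs would be correspondingly larger; the size of the working patch, rather than the size of the whole image, then bounds the memory used by the operator.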