One-shot Ultra-high-Resolution Generative Adversarial Network That Synthesizes 16K Images On A Single GPU

Bibliographic Details
Published in: Image and Vision Computing, Vol. 139, p. 104815
Main Authors: Oh, Junseok; Yoon, Donghwee; Kim, Injung
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.11.2023
Summary:
• One-shot Ultra-high-Resolution image synthesizer trainable on a single consumer GPU.
• Can synthesize high-quality 16K images with 12.5 GB and 4K images with 8 GB memory.
• Seamless subregion-wise super-resolution with minimal additional memory overhead.
• Vertical coordinate convolution that improves visual coherence while preserving diversity.
• Can synthesize large shapes with fine details and long-range coherence.

We propose a one-shot ultra-high-resolution generative adversarial network (OUR-GAN) framework that generates non-repetitive 16K (16,384×8,640) images from a single training image and is trainable on a single consumer GPU. OUR-GAN generates an initial image that is visually plausible and varied in shape at low resolution, and then gradually increases the resolution by adding detail through super-resolution. Since OUR-GAN learns from a real ultra-high-resolution (UHR) image, it can synthesize large shapes with fine details and long-range coherence, which is difficult to achieve with conventional generative models that rely on the patch distribution learned from relatively small images. OUR-GAN can synthesize high-quality 16K images with 12.5 GB of GPU memory and 4K images with only 4.29 GB as it synthesizes a UHR image part by part through seamless subregion-wise super-resolution. Additionally, OUR-GAN improves visual coherence while maintaining diversity by applying vertical positional convolution. In experiments on the ST4K and RAISE datasets, OUR-GAN exhibited improved fidelity, visual coherency, and diversity compared with the baseline one-shot synthesis models. To the best of our knowledge, OUR-GAN is the first one-shot image synthesizer that generates non-repetitive UHR images on a single consumer GPU. The synthesized image samples are presented at https://our-gan.github.io.
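The part-by-part idea in the summary can be illustrated with a small sketch. This is not OUR-GAN's implementation: the upscaler below is a stand-in (nearest-neighbour 2x upsampling plus a 3x3 box blur) chosen only because it has a small, known receptive field, and the names `upscale_2x`, `upscale_by_tiles`, `tile`, and `halo` are hypothetical. The sketch shows one common way to make subregion-wise processing seamless: each tile is upscaled together with a margin (halo) of neighbouring pixels, and only the tile's centre is written to the output, so peak memory depends on the tile size rather than on the full image.

```python
import numpy as np


def upscale_2x(lowres):
    """Stand-in for a learned super-resolution model (an assumption, not
    OUR-GAN's network): nearest-neighbour 2x upsampling, then a 3x3 box blur."""
    up = lowres.repeat(2, axis=0).repeat(2, axis=1)
    padded = np.pad(up, 1, mode="edge")
    h, w = up.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Sum the nine shifted copies of the padded image, then average.
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0


def upscale_by_tiles(lowres, tile=8, halo=1):
    """Upscale one tile at a time, using a halo so tile borders are seamless."""
    h, w = lowres.shape
    # Pad once so that border tiles also get a full halo.
    padded = np.pad(lowres, halo, mode="edge")
    out = np.empty((2 * h, 2 * w), dtype=np.float64)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            th = min(tile, h - y)  # the last row/column of tiles may be smaller
            tw = min(tile, w - x)
            # Crop the tile plus its halo from the padded low-res image.
            crop = padded[y:y + th + 2 * halo, x:x + tw + 2 * halo]
            up = upscale_2x(crop)
            # Keep only the centre; the halo region is discarded.
            out[2 * y:2 * (y + th), 2 * x:2 * (x + tw)] = up[
                2 * halo:2 * halo + 2 * th,
                2 * halo:2 * halo + 2 * tw,
            ]
    return out


rng = np.random.default_rng(0)
image = rng.random((20, 28))
tiled = upscale_by_tiles(image, tile=8, halo=1)
full = upscale_2x(image)
```

With this stand-in operator, a halo of one low-resolution pixel covers the blur's receptive field, so `tiled` matches `full` with no visible seams. A real super-resolution network has a much larger receptive field, and the margin it needs would have to be chosen accordingly.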
ISSN: 0262-8856, 1872-8138
DOI: 10.1016/j.imavis.2023.104815