UVCGAN: UNet Vision Transformer cycle-consistent GAN for unpaired image-to-image translation

Bibliographic Details
Published in Proceedings / IEEE Workshop on Applications of Computer Vision, pp. 702-712
Main Authors Torbunov, Dmitrii; Huang, Yi; Yu, Haiwang; Huang, Jin; Yoo, Shinjae; Lin, Meifeng; Viren, Brett; Ren, Yihui
Format Conference Proceeding
Language English
Published IEEE 01.01.2023
Summary: Unpaired image-to-image translation has broad applications in art, design, and scientific simulations. One early breakthrough was CycleGAN, which emphasizes one-to-one mappings between two unpaired image domains via generative adversarial networks (GANs) coupled with a cycle-consistency constraint, while more recent works promote one-to-many mappings to boost the diversity of the translated images. Motivated by scientific simulation and one-to-one needs, this work revisits the classic CycleGAN framework and boosts its performance to outperform more contemporary models without relaxing the cycle-consistency constraint. To achieve this, we equip the generator with a Vision Transformer (ViT) and employ necessary training and regularization techniques. Compared to previous best-performing models, our model performs better and retains a strong correlation between the original and translated images. An accompanying ablation study shows that both the gradient penalty and self-supervised pre-training are crucial to the improvement. To promote reproducibility and open science, the source code, hyperparameter configurations, and pre-trained model are available at https://github.com/LS4GAN/uvcgan.
ISSN: 2642-9381
DOI: 10.1109/WACV56688.2023.00077
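
The cycle-consistency constraint mentioned in the summary can be illustrated with a minimal sketch. It assumes the L1 form used in the original CycleGAN formulation: an image translated from domain A to domain B and back should match the original, and likewise for B. This record does not state the exact loss or weighting used by UVCGAN, so the code is not the authors' implementation. All names here (`l1_distance`, `cycle_consistency_loss`, `brighten`, `darken`, `darken_too_much`) are hypothetical, and images are stood in for by flat lists of floats.

```python
from typing import Callable, List

Image = List[float]  # toy stand-in for a flattened image (assumption)
Generator = Callable[[Image], Image]


def l1_distance(x: Image, y: Image) -> float:
    """Mean absolute per-pixel difference between two equally sized images."""
    if len(x) != len(y):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def cycle_consistency_loss(
    gen_ab: Generator,
    gen_ba: Generator,
    batch_a: List[Image],
    batch_b: List[Image],
) -> float:
    """Sum of the A -> B -> A and B -> A -> B reconstruction errors."""
    # Forward cycle: translate A to B, then back to A, and compare with the input.
    loss_a = sum(l1_distance(gen_ba(gen_ab(a)), a) for a in batch_a) / len(batch_a)
    # Backward cycle: translate B to A, then back to B, and compare with the input.
    loss_b = sum(l1_distance(gen_ab(gen_ba(b)), b) for b in batch_b) / len(batch_b)
    return loss_a + loss_b


# Hypothetical toy "generators": simple brightness shifts instead of networks.
def brighten(img: Image) -> Image:
    return [p + 0.5 for p in img]


def darken(img: Image) -> Image:
    return [p - 0.5 for p in img]


def darken_too_much(img: Image) -> Image:
    return [p - 0.75 for p in img]


if __name__ == "__main__":
    batch_a = [[0.0, 1.0]]
    batch_b = [[0.5, 1.5]]
    # Exact inverses reconstruct every image, so the loss vanishes.
    print(cycle_consistency_loss(brighten, darken, batch_a, batch_b))
    # A mismatched pair leaves a residual error in both cycles.
    print(cycle_consistency_loss(brighten, darken_too_much, batch_a, batch_b))
```

In a real model the two generators would be trained networks, and this term would be added to the adversarial losses. A pair of generators that are exact inverses drives the term to zero, which is what ties the mapping to a one-to-one correspondence.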