There and Back Again: On the relation between noises, images, and their inversions in diffusion models
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 30.10.2024 |
Summary: | Denoising Diffusion Probabilistic Models (DDPMs) achieve state-of-the-art performance in synthesizing new images from random noise, but they lack a meaningful latent space that encodes data into features. Recent DDPM-based editing techniques try to mitigate this issue by inverting images back to an approximation of their starting noise. In this work, we study the relation between the initial Gaussian noise, the samples generated from it, and their corresponding latent encodings obtained through the inversion procedure. First, we analyze their spatial distance relations to show the inaccuracy of the DDIM inversion technique, localizing the manifold of latent representations between the initial noise and the generated samples. Then, we examine the relation between the initial Gaussian noise and its corresponding generations during diffusion training, showing that the high-level features of generated images stabilize rapidly and that the spatial distance relationship between noises and generations remains consistent throughout training. |
DOI: | 10.48550/arxiv.2410.23530 |
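For context, the DDIM inversion referred to in the summary amounts to running the deterministic DDIM sampler backwards, reusing the noise prediction at the current, less noisy point as a stand-in for the prediction at the next, noisier one; that substitution is the source of the approximation error the paper studies. The sketch below is not the authors' code: the `eps_model` callable, the linear beta schedule, and the timestep spacing are illustrative assumptions.

```python
import torch

# Minimal sketch of deterministic DDIM sampling and naive DDIM inversion.
# Assumptions (not from the paper): a noise-prediction network eps_model(x, t),
# a linear beta schedule with T = 1000 steps, and uniform timestep skipping.

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)  # cumulative products of (1 - beta_t)


@torch.no_grad()
def ddim_sample(eps_model, x_T, ts):
    """Map initial Gaussian noise x_T to an image along ascending timesteps ts."""
    x = x_T
    for i in reversed(range(len(ts))):
        t = ts[i]
        ab_t = alpha_bars[t]
        ab_prev = alpha_bars[ts[i - 1]] if i > 0 else torch.tensor(1.0)
        eps = eps_model(x, t)
        x0 = (x - (1.0 - ab_t).sqrt() * eps) / ab_t.sqrt()      # predicted clean image
        x = ab_prev.sqrt() * x0 + (1.0 - ab_prev).sqrt() * eps  # deterministic (eta = 0) step
    return x


@torch.no_grad()
def ddim_invert(eps_model, image, ts):
    """Approximate the latent encoding of an image by running the sampler in reverse.

    The noise prediction is evaluated at the current, less noisy iterate, although
    the exact reverse step would need it at the not-yet-known noisier one; this is
    the approximation whose inaccuracy the paper localizes between noise and image.
    """
    x = image
    for i in range(len(ts)):
        t = ts[i]
        ab_t = alpha_bars[t]
        ab_prev = alpha_bars[ts[i - 1]] if i > 0 else torch.tensor(1.0)
        eps = eps_model(x, t)
        x0 = (x - (1.0 - ab_prev).sqrt() * eps) / ab_prev.sqrt()
        x = ab_t.sqrt() * x0 + (1.0 - ab_t).sqrt() * eps
    return x


# Example round trip (eps_model is a placeholder for a trained DDPM's denoiser):
# ts = list(range(0, T, 20))
# latent = ddim_invert(eps_model, image, ts)
# recon = ddim_sample(eps_model, latent, ts)
```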