There and Back Again: On the relation between noises, images, and their inversions in diffusion models
Main Authors | Staniszewski, Łukasz; Kuciński, Łukasz; Deja, Kamil |
---|---|
Format | Journal Article |
Language | English |
Published | 30.10.2024 |
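Subjects | Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition |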
Online Access | https://arxiv.org/abs/2410.23530 |
Abstract | Denoising Diffusion Probabilistic Models (DDPMs) achieve state-of-the-art
performance in synthesizing new images from random noise, but they lack a
meaningful latent space that encodes data into features. Recent DDPM-based
editing techniques try to mitigate this issue by inverting images back to their
approximated starting noise. In this work, we study the relation between the
initial Gaussian noise, the samples generated from it, and their corresponding
latent encodings obtained through the inversion procedure. First, we interpret
their spatial distance relations to show the inaccuracy of the DDIM inversion
technique by localizing the manifold of latent representations between the
initial noise and the generated samples. Then, we demonstrate the peculiar
relation between
initial Gaussian noise and its corresponding generations during diffusion
training, showing that the high-level features of generated images stabilize
rapidly, keeping the spatial distance relationship between noises and
generations consistent throughout the training. |
---|---|
Author | Deja, Kamil; Staniszewski, Łukasz; Kuciński, Łukasz |
BackLink | https://doi.org/10.48550/arXiv.2410.23530 (View paper in arXiv) |
ContentType | Journal Article |
Copyright | http://creativecommons.org/licenses/by/4.0 |
DOI | 10.48550/arxiv.2410.23530 |
DatabaseName | arXiv Computer Science; arXiv.org |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
OpenAccessLink | https://arxiv.org/abs/2410.23530 |
PublicationCentury | 2000 |
PublicationDate | 2024-10-30 |
PublicationDecade | 2020 |
PublicationYear | 2024 |
SecondaryResourceType | preprint |
SourceID | arxiv |
SourceType | Open Access Repository |
SubjectTerms | Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition |
Title | There and Back Again: On the relation between noises, images, and their inversions in diffusion models |
URI | https://arxiv.org/abs/2410.23530 |
linkProvider | Cornell University |
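
Illustration (not part of the catalog record or the paper): the abstract compares an initial Gaussian noise, the image generated from it, and the latent obtained by DDIM inversion. The sketch below is a minimal, hypothetical toy version of that round trip in plain Python. It assumes zero-mean Gaussian "data" with standard deviation `DATA_STD`, for which the noise prediction has a closed form (`toy_eps`), and a linear schedule of cumulative alpha values. The names `toy_eps`, `ddim_step`, `run`, `l2`, and `round_trip` are made up for this example. It is not the authors' code or experimental setup; a real study would use a trained network as the noise predictor.

```python
import math
import random

DATA_STD = 0.5  # assumption: toy data are N(0, DATA_STD^2) per coordinate


def toy_eps(x, alpha_bar):
    """Closed-form noise prediction E[eps | x_t] for the zero-mean Gaussian toy data."""
    scale = math.sqrt(1.0 - alpha_bar) / (alpha_bar * DATA_STD ** 2 + 1.0 - alpha_bar)
    return [scale * v for v in x]


def ddim_step(x, a_from, a_to):
    """Deterministic DDIM update (eta = 0) from alpha_bar = a_from to a_to."""
    eps = toy_eps(x, a_from)
    x0_pred = [(v - math.sqrt(1.0 - a_from) * e) / math.sqrt(a_from)
               for v, e in zip(x, eps)]
    return [math.sqrt(a_to) * p + math.sqrt(1.0 - a_to) * e
            for p, e in zip(x0_pred, eps)]


def run(x, alpha_bars):
    """Apply ddim_step along consecutive pairs of the given schedule."""
    for a_from, a_to in zip(alpha_bars, alpha_bars[1:]):
        x = ddim_step(x, a_from, a_to)
    return x


def l2(u, v):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def round_trip(dim=16, steps=10, seed=0):
    rng = random.Random(seed)
    # alpha_bar rises from 0.01 (almost pure noise) to 0.99 (almost clean).
    schedule = [0.01 + 0.98 * i / steps for i in range(steps + 1)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    image = run(noise, schedule)         # sampling: noise -> image
    latent = run(image, schedule[::-1])  # inversion: image -> approximate noise
    return noise, image, latent


noise, image, latent = round_trip()
print("noise  -> latent:", l2(noise, latent))
print("noise  -> image :", l2(noise, image))
print("latent -> image :", l2(latent, image))
```

In the inversion pass each step reuses the noise prediction from the previous, cleaner state, which is the usual DDIM inversion approximation, so the recovered latent does not coincide with the original noise. In this toy setting the gap shrinks as `steps` grows. Whether and how this carries over to trained image models is the subject of the paper, not something this sketch establishes.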