Diffusion Model Patching via Mixture-of-Prompts


Bibliographic Details
Main Authors: Ham, Seokil; Woo, Sangmin; Kim, Jin-Young; Go, Hyojun; Park, Byeongjun; Kim, Changick
Format: Journal Article
Language: English
Published: 2024-05-28
Online Access: https://arxiv.org/abs/2405.17825

Abstract: We present Diffusion Model Patching (DMP), a simple method to boost the performance of pre-trained diffusion models that have already reached convergence, with a negligible increase in parameters. DMP inserts a small, learnable set of prompts into the model's input space while keeping the original model frozen. The effectiveness of DMP is not merely due to the addition of parameters but stems from its dynamic gating mechanism, which selects and combines a subset of learnable prompts at every step of the generative process (e.g., reverse denoising steps). This strategy, which we term "mixture-of-prompts", enables the model to draw on the distinct expertise of each prompt, essentially "patching" the model's functionality at every step with minimal yet specialized parameters. Uniquely, DMP enhances the model by further training on the same dataset on which it was originally trained, even in a scenario where significant improvements are typically not expected due to model convergence. Experiments show that DMP significantly enhances the converged FID of DiT-L/2 on FFHQ 256x256 by 10.38%, achieved with only a 1.43% parameter increase and 50K additional training iterations.
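The mixture-of-prompts mechanism described in the abstract — a small pool of learnable prompts combined by a step-dependent gate and prepended to the frozen model's input — can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the class and parameter names are hypothetical, and the gate here is a deliberately simplified softmax over a linear function of the normalized timestep, where DMP's actual gating network, prompt count, and dimensions may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

class MixtureOfPrompts:
    """Hypothetical sketch: 'patch' a frozen model by prepending a gated
    combination of learnable prompts that depends on the denoising step."""

    def __init__(self, num_prompts=5, prompt_len=4, dim=16, num_steps=1000):
        # The only trainable parameters: the prompt pool and the gate weights.
        self.prompts = rng.standard_normal((num_prompts, prompt_len, dim)) * 0.02
        self.gate_w = rng.standard_normal((1, num_prompts))  # timestep embedding -> logits
        self.num_steps = num_steps

    def gate(self, t):
        # Normalized timestep as a 1-d "embedding"; a real model would use a
        # richer timestep embedding here.
        emb = np.array([t / self.num_steps])
        logits = emb @ self.gate_w            # shape: (num_prompts,)
        exp = np.exp(logits - logits.max())
        return exp / exp.sum()                # softmax weights over the prompt pool

    def patched_input(self, x_tokens, t):
        # Mix the prompts with step-dependent weights, then prepend to the input.
        w = self.gate(t)                                    # (num_prompts,)
        mixed = np.tensordot(w, self.prompts, axes=(0, 0))  # (prompt_len, dim)
        return np.concatenate([mixed, x_tokens], axis=0)    # (prompt_len + seq, dim)

mop = MixtureOfPrompts()
x = rng.standard_normal((8, 16))        # 8 input tokens of width 16
early = mop.patched_input(x, t=990)     # near the start of reverse denoising
late = mop.patched_input(x, t=10)       # near the end
assert early.shape == (12, 16)
assert not np.allclose(early[:4], late[:4])  # the prompt mixture changes with the step
```

Because only the prompt pool and gate are trainable while the backbone stays frozen, the added parameter count stays small, which matches the abstract's reported 1.43% parameter increase.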
Copyright: http://creativecommons.org/licenses/by/4.0
DOI: 10.48550/arxiv.2405.17825
Subject Terms: Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition