Diffusion Model Patching via Mixture-of-Prompts


Bibliographic Details
Main Authors: Ham, Seokil; Woo, Sangmin; Kim, Jin-Young; Go, Hyojun; Park, Byeongjun; Kim, Changick
Format: Journal Article
Language: English
Published: 2024-05-28
Online Access: https://arxiv.org/abs/2405.17825

Abstract: We present Diffusion Model Patching (DMP), a simple method to boost the performance of pre-trained diffusion models that have already reached convergence, with a negligible increase in parameters. DMP inserts a small, learnable set of prompts into the model's input space while keeping the original model frozen. The effectiveness of DMP is not merely due to the addition of parameters but stems from its dynamic gating mechanism, which selects and combines a subset of learnable prompts at every step of the generative process (e.g., reverse denoising steps). This strategy, which we term "mixture-of-prompts", enables the model to draw on the distinct expertise of each prompt, essentially "patching" the model's functionality at every step with minimal yet specialized parameters. Uniquely, DMP enhances the model by further training on the same dataset on which it was originally trained, even in a scenario where significant improvements are typically not expected due to model convergence. Experiments show that DMP significantly enhances the converged FID of DiT-L/2 on FFHQ 256x256 by 10.38%, achieved with only a 1.43% parameter increase and 50K additional training iterations.
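The mixture-of-prompts mechanism described in the abstract — a small pool of learnable prompts combined by a step-dependent gate and prepended to the frozen model's input — can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the class and parameter names are hypothetical, and the gate here is a deliberately simplified softmax over a linear function of the normalized timestep, where DMP's actual gating network, prompt count, and dimensions may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

class MixtureOfPrompts:
    """Hypothetical sketch: 'patch' a frozen model by prepending a gated
    combination of learnable prompts that depends on the denoising step."""

    def __init__(self, num_prompts=5, prompt_len=4, dim=16, num_steps=1000):
        # The only trainable parameters: the prompt pool and the gate weights.
        self.prompts = rng.standard_normal((num_prompts, prompt_len, dim)) * 0.02
        self.gate_w = rng.standard_normal((1, num_prompts))  # timestep embedding -> logits
        self.num_steps = num_steps

    def gate(self, t):
        # Normalized timestep as a 1-d "embedding"; a real model would use a
        # richer timestep embedding here.
        emb = np.array([t / self.num_steps])
        logits = emb @ self.gate_w            # shape: (num_prompts,)
        exp = np.exp(logits - logits.max())
        return exp / exp.sum()                # softmax weights over the prompt pool

    def patched_input(self, x_tokens, t):
        # Mix the prompts with step-dependent weights, then prepend to the input.
        w = self.gate(t)                                    # (num_prompts,)
        mixed = np.tensordot(w, self.prompts, axes=(0, 0))  # (prompt_len, dim)
        return np.concatenate([mixed, x_tokens], axis=0)    # (prompt_len + seq, dim)

mop = MixtureOfPrompts()
x = rng.standard_normal((8, 16))        # 8 input tokens of width 16
early = mop.patched_input(x, t=990)     # near the start of reverse denoising
late = mop.patched_input(x, t=10)       # near the end
assert early.shape == (12, 16)
assert not np.allclose(early[:4], late[:4])  # the prompt mixture changes with the step
```

Because only the prompt pool and gate are trainable while the backbone stays frozen, the added parameter count stays small, which matches the abstract's reported 1.43% parameter increase.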
Copyright: http://creativecommons.org/licenses/by/4.0
DOI: 10.48550/arxiv.2405.17825
Subject Terms: Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition