Salient Object-Aware Background Generation using Text-Guided Diffusion Models
Published in | IEEE Computer Society Conference on Computer Vision and Pattern Recognition workshops pp. 7489 - 7499 |
---|---|
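Main Authors | Eshratifar, Amir Erfan; Soares, Joao V. B.; Thadani, Kapil; Mishra, Shaunak; Kuznetsov, Mikhail; Ku, Yueh-Ning; De Juan, Paloma |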
Format | Conference Proceeding |
Language | English |
Published | IEEE, 17.06.2024 |
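Subjects | Adaptation models; background-generation; Conferences; controlnet; Degradation; Diffusion models; image-generation; Measurement; Pattern recognition; stable-diffusion; Visualization |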
Abstract | Generating background scenes for salient objects plays a crucial role across various domains including creative design and e-commerce, as it enhances the presentation and context of subjects by integrating them into tailored environments. Background generation can be framed as a task of text-conditioned outpainting, where the goal is to extend image content beyond a salient object's boundaries on a blank background. Although popular diffusion models for text-guided inpainting can also be used for outpainting by mask inversion, they are trained to fill in missing parts of an image rather than to place an object into a scene. Consequently, when used for background creation, inpainting models frequently extend the salient object's boundaries and thereby change the object's identity, which is a phenomenon we call "object expansion." This paper introduces a model for adapting inpainting diffusion models to the salient object outpainting task using Stable Diffusion and ControlNet architectures. We present a series of qualitative and quantitative results across models and datasets, including a newly proposed metric to measure object expansion that does not require any human labeling. Compared to Stable Diffusion 2.0 Inpainting, our proposed approach reduces object expansion by 3.6× on average with no degradation in standard visual metrics across multiple datasets. |
---|---|
Author | Eshratifar, Amir Erfan; Soares, Joao V. B.; Ku, Yueh-Ning; Mishra, Shaunak; Thadani, Kapil; De Juan, Paloma; Kuznetsov, Mikhail |
Author_xml | – sequence: 1 givenname: Amir Erfan surname: Eshratifar fullname: Eshratifar, Amir Erfan organization: Yahoo Research – sequence: 2 givenname: Joao V. B. surname: Soares fullname: Soares, Joao V. B. organization: Yahoo Research – sequence: 3 givenname: Kapil surname: Thadani fullname: Thadani, Kapil organization: Yahoo Research – sequence: 4 givenname: Shaunak surname: Mishra fullname: Mishra, Shaunak organization: Amazon – sequence: 5 givenname: Mikhail surname: Kuznetsov fullname: Kuznetsov, Mikhail organization: Amazon – sequence: 6 givenname: Yueh-Ning surname: Ku fullname: Ku, Yueh-Ning organization: ByteDance – sequence: 7 givenname: Paloma surname: De Juan fullname: De Juan, Paloma organization: Yahoo Research |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/CVPRW63382.2024.00744 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Xplore Digital Library; IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Xplore Digital Library url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Applied Sciences |
EISBN | 9798350365474 |
EISSN | 2160-7516 |
EndPage | 7499 |
ExternalDocumentID | 10678425 |
Genre | orig-research |
GroupedDBID | 6IE 6IF 6IL 6IN AAJGR AAWTH ABLEC ACGFS ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK M43 OCL RIE RIL |
ID | FETCH-LOGICAL-i176t-516dcdc7dcfe041d59b5b4200c13b7d3a96b41610ad6fbc09b0dba3a29d94b423 |
IEDL.DBID | RIE |
IngestDate | Tue May 06 03:32:56 EDT 2025 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i176t-516dcdc7dcfe041d59b5b4200c13b7d3a96b41610ad6fbc09b0dba3a29d94b423 |
PageCount | 11 |
ParticipantIDs | ieee_primary_10678425 |
PublicationCentury | 2000 |
PublicationDate | 2024-June-17 |
PublicationDateYYYYMMDD | 2024-06-17 |
PublicationDate_xml | – month: 06 year: 2024 text: 2024-June-17 day: 17 |
PublicationDecade | 2020 |
PublicationTitle | IEEE Computer Society Conference on Computer Vision and Pattern Recognition workshops |
PublicationTitleAbbrev | CVPRW |
PublicationYear | 2024 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssj0001085593 |
Score | 1.8760662 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 7489 |
SubjectTerms | Adaptation models; background-generation; Conferences; controlnet; Degradation; Diffusion models; image-generation; Measurement; Pattern recognition; stable-diffusion; Visualization |
Title | Salient Object-Aware Background Generation using Text-Guided Diffusion Models |
URI | https://ieeexplore.ieee.org/document/10678425 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | IEEE |