Salient Object-Aware Background Generation using Text-Guided Diffusion Models

Bibliographic Details
Published in	IEEE Computer Society Conference on Computer Vision and Pattern Recognition workshops pp. 7489 - 7499
Main Authors	Eshratifar, Amir Erfan; Soares, Joao V. B.; Thadani, Kapil; Mishra, Shaunak; Kuznetsov, Mikhail; Ku, Yueh-Ning; De Juan, Paloma
Format	Conference Proceeding
Language	English
Published	IEEE 17.06.2024
Subjects	Adaptation models; background-generation; Conferences; controlnet; Degradation; Diffusion models; image-generation; Measurement; Pattern recognition; stable-diffusion; Visualization
Abstract	Generating background scenes for salient objects plays a crucial role across various domains including creative design and e-commerce, as it enhances the presentation and context of subjects by integrating them into tailored environments. Background generation can be framed as a task of text-conditioned outpainting, where the goal is to extend image content beyond a salient object's boundaries on a blank background. Although popular diffusion models for text-guided inpainting can also be used for outpainting by mask inversion, they are trained to fill in missing parts of an image rather than to place an object into a scene. Consequently, when used for background creation, inpainting models frequently extend the salient object's boundaries and thereby change the object's identity, which is a phenomenon we call "object expansion." This paper introduces a model for adapting inpainting diffusion models to the salient object outpainting task using Stable Diffusion and ControlNet architectures. We present a series of qualitative and quantitative results across models and datasets, including a newly proposed metric to measure object expansion that does not require any human labeling. Compared to Stable Diffusion 2.0 Inpainting, our proposed approach reduces object expansion by 3.6× on average with no degradation in standard visual metrics across multiple datasets.
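The two ideas in the abstract, outpainting by mask inversion and a label-free measure of object expansion, can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the authors' implementation: the function names `invert_mask` and `expansion_ratio` are hypothetical, masks are plain nested lists of 0/1 values, and the score shown (object pixels gained outside the original mask, divided by the original object area) is one plausible formulation that may differ from the metric defined in the paper. In practice the second mask would come from running a salient object segmentation model on the generated image, which is assumed here rather than implemented.

```python
def invert_mask(object_mask):
    """Turn a salient-object mask (1 = object) into an outpainting mask
    (1 = background pixels the inpainting model is asked to generate)."""
    return [[0 if px else 1 for px in row] for row in object_mask]


def expansion_ratio(original_mask, generated_mask):
    """Hypothetical expansion score: number of object pixels that appear
    outside the original object mask, relative to the original object area."""
    area = sum(1 for row in original_mask for px in row if px)
    if area == 0:
        raise ValueError("original mask contains no object pixels")
    added = sum(
        1
        for o_row, g_row in zip(original_mask, generated_mask)
        for o, g in zip(o_row, g_row)
        if g and not o
    )
    return added / area


# Toy 4x4 example: a 2x2 object that grows by one column after generation.
original = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
generated = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

fill_region = invert_mask(original)            # 12 background pixels to generate
score = expansion_ratio(original, generated)   # 2 added pixels / 4 original pixels
```

A score of zero under this sketch would mean the object's silhouette did not grow; larger values indicate that the generated background extended the object beyond its original boundary.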
Author Affiliations	Eshratifar, Amir Erfan (Yahoo Research); Soares, Joao V. B. (Yahoo Research); Thadani, Kapil (Yahoo Research); Mishra, Shaunak (Amazon); Kuznetsov, Mikhail (Amazon); Ku, Yueh-Ning (ByteDance); De Juan, Paloma (Yahoo Research)
CODEN	IEEPAD
DOI	10.1109/CVPRW63382.2024.00744
Discipline	Applied Sciences
EISBN	9798350365474
EISSN	2160-7516
URI	https://ieeexplore.ieee.org/document/10678425