Salient Object-Aware Background Generation using Text-Guided Diffusion Models

Bibliographic Details
Published in IEEE Computer Society Conference on Computer Vision and Pattern Recognition workshops, pp. 7489-7499
Main Authors Eshratifar, Amir Erfan, Soares, Joao V. B., Thadani, Kapil, Mishra, Shaunak, Kuznetsov, Mikhail, Ku, Yueh-Ning, De Juan, Paloma
Format Conference Proceeding
Language English
Published IEEE 17.06.2024
Abstract Generating background scenes for salient objects plays a crucial role across various domains including creative design and e-commerce, as it enhances the presentation and context of subjects by integrating them into tailored environments. Background generation can be framed as a task of text-conditioned outpainting, where the goal is to extend image content beyond a salient object's boundaries on a blank background. Although popular diffusion models for text-guided inpainting can also be used for outpainting by mask inversion, they are trained to fill in missing parts of an image rather than to place an object into a scene. Consequently, when used for background creation, inpainting models frequently extend the salient object's boundaries and thereby change the object's identity, which is a phenomenon we call "object expansion." This paper introduces a model for adapting inpainting diffusion models to the salient object outpainting task using Stable Diffusion and ControlNet architectures. We present a series of qualitative and quantitative results across models and datasets, including a newly proposed metric to measure object expansion that does not require any human labeling. Compared to Stable Diffusion 2.0 Inpainting, our proposed approach reduces object expansion by 3.6× on average with no degradation in standard visual metrics across multiple datasets.
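The two ideas the abstract names, outpainting by mask inversion and a measure of object expansion, can be illustrated with a minimal sketch. This is not the authors' implementation or their proposed metric, whose definition the abstract does not give. The function names `invert_mask` and `area_expansion` are hypothetical, and the "generated" object mask is assumed to come from some salient object segmentation model run on the output image.

```python
import numpy as np


def invert_mask(object_mask: np.ndarray) -> np.ndarray:
    """Turn an object mask (1 = salient object) into an outpainting mask
    (1 = background region the inpainting model is asked to fill)."""
    return 1 - object_mask


def area_expansion(object_mask: np.ndarray, generated_object_mask: np.ndarray) -> float:
    """Hypothetical expansion score (an assumption, not the paper's metric):
    object pixels found outside the original object, relative to its area."""
    original = object_mask.astype(bool)
    generated = generated_object_mask.astype(bool)
    if original.sum() == 0:
        raise ValueError("object_mask contains no object pixels")
    extra_pixels = np.logical_and(generated, ~original).sum()
    return float(extra_pixels) / float(original.sum())


# Toy 8x8 image with a 4x4 salient object (16 pixels).
object_mask = np.zeros((8, 8), dtype=np.uint8)
object_mask[2:6, 2:6] = 1

# Mask inversion: the 48 background pixels become the region to generate.
outpaint_mask = invert_mask(object_mask)

# Pretend a segmentation model found the object one column wider after generation.
generated_object_mask = np.zeros((8, 8), dtype=np.uint8)
generated_object_mask[2:6, 2:7] = 1

score = area_expansion(object_mask, generated_object_mask)
```

In this toy case the object gains 4 pixels on top of its original 16, so the score is 0.25; a score of 0 would mean the generated background left the object's outline untouched.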
Author affiliations
1. Eshratifar, Amir Erfan (Yahoo Research)
2. Soares, Joao V. B. (Yahoo Research)
3. Thadani, Kapil (Yahoo Research)
4. Mishra, Shaunak (Amazon)
5. Kuznetsov, Mikhail (Amazon)
6. Ku, Yueh-Ning (ByteDance)
7. De Juan, Paloma (Yahoo Research)
CODEN IEEPAD
DOI 10.1109/CVPRW63382.2024.00744
Discipline Applied Sciences
EISBN 9798350365474
EISSN 2160-7516
Subject Terms Adaptation models; background-generation; Conferences; controlnet; Degradation; Diffusion models; image-generation; Measurement; Pattern recognition; stable-diffusion; Visualization
URI https://ieeexplore.ieee.org/document/10678425