Zero-Shot Co-Salient Object Detection Framework

Bibliographic Details
Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4010-4014
Main Authors Xiao, Haoke; Tang, Lv; Li, Bo; Luo, Zhiming; Li, Shaozi
Format Conference Proceeding
Language English
Published IEEE 14.04.2024
Abstract Co-salient Object Detection (CoSOD) endeavors to replicate the human visual system's capacity to recognize common and salient objects within a collection of images. Despite recent advances, deep learning models still rely on training with well-annotated CoSOD datasets, and training-free zero-shot CoSOD frameworks remain little explored. In this paper, taking inspiration from the zero-shot transfer capabilities of foundational computer vision models, we introduce the first zero-shot CoSOD framework that harnesses these models without any training process. To achieve this, we introduce two novel components in our proposed framework: the group prompt generation (GPG) module and the co-saliency map generation (CMP) module. We evaluate the framework's performance on widely used datasets and observe impressive results. Our approach surpasses existing unsupervised methods and even outperforms fully supervised methods developed before 2020, while remaining competitive with some fully supervised methods developed before 2022.
Author Affiliations
1. Xiao, Haoke (hk.xiao.me@gmail.com), Institute of Artificial Intelligence, Xiamen University, Xiamen, China
2. Tang, Lv (lvtang@vivo.com), Vivo Mobile Communication Co., Ltd, Shanghai, China
3. Li, Bo (libra@vivo.com), Vivo Mobile Communication Co., Ltd, Shanghai, China
4. Luo, Zhiming (zhiming.luo@xmu.edu.cn), Institute of Artificial Intelligence, Xiamen University, Xiamen, China
5. Li, Shaozi (szlig@xmu.edu.cn), Institute of Artificial Intelligence, Xiamen University, Xiamen, China
DOI 10.1109/ICASSP48485.2024.10448084
Discipline Engineering
EISBN 9798350344851
EISSN 2379-190X
Funding National Natural Science Foundation of China (funder ID 10.13039/501100001809)
Subject Terms Computational modeling; Computer vision; Foundational Computer Vision Model; Image recognition; Object detection; Signal processing; Training; Visualization; Zero-shot Co-saliency Detection
URI https://ieeexplore.ieee.org/document/10448084