Tagging Like Humans: Diverse and Distinct Image Annotation
Published in | 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition pp. 7967 - 7975 |
---|---|
Main Authors | Wu, Baoyuan; Chen, Weidong; Sun, Peng; Liu, Wei; Ghanem, Bernard; Lyu, Siwei |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.06.2018 |
Subjects | Gallium nitride; Generators; Image annotation; Redundancy; Semantics; Task analysis; Training |
ISSN | 1063-6919 |
DOI | 10.1109/CVPR.2018.00831 |
Abstract | In this work, we propose a new automatic image annotation model, dubbed diverse and distinct image annotation (D2IA). The generative model D2IA is inspired by the ensemble of human annotations, which produces semantically relevant, yet distinct and diverse tags. In D2IA, we generate a relevant and distinct tag subset, in which the tags are relevant to the image contents and semantically distinct from each other, using sequential sampling from a determinantal point process (DPP) model. Multiple such tag subsets, covering diverse semantic aspects or diverse semantic levels of the image contents, are generated by randomly perturbing the DPP sampling process. We leverage a generative adversarial network (GAN) model to train D2IA. Extensive experiments on two benchmark datasets, including quantitative and qualitative comparisons as well as human subject studies, demonstrate that the proposed model can produce more diverse and distinct tags than state-of-the-art methods. |
---|---|
Author | Chen, Weidong; Sun, Peng; Liu, Wei; Lyu, Siwei; Wu, Baoyuan; Ghanem, Bernard |
Author_xml | – sequence: 1 givenname: Baoyuan surname: Wu fullname: Wu, Baoyuan – sequence: 2 givenname: Weidong surname: Chen fullname: Chen, Weidong – sequence: 3 givenname: Peng surname: Sun fullname: Sun, Peng – sequence: 4 givenname: Wei surname: Liu fullname: Liu, Wei – sequence: 5 givenname: Bernard surname: Ghanem fullname: Ghanem, Bernard – sequence: 6 givenname: Siwei surname: Lyu fullname: Lyu, Siwei |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
DBID | 6IE 6IH CBEJK RIE RIO |
DOI | 10.1109/CVPR.2018.00831 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP) 1998-present |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Applied Sciences |
EISBN | 9781538664209 1538664208 |
EISSN | 1063-6919 |
EndPage | 7975 |
ExternalDocumentID | 8578929 |
Genre | orig-research |
GroupedDBID | 6IE 6IH 6IL 6IN AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IJVOP OCL RIE RIL RIO |
IEDL.DBID | RIE |
IngestDate | Wed Aug 27 02:52:16 EDT 2025 |
IsPeerReviewed | false |
IsScholarly | true |
Language | English |
LinkModel | DirectLink |
PageCount | 9 |
ParticipantIDs | ieee_primary_8578929 |
PublicationCentury | 2000 |
PublicationDate | 2018-Jun |
PublicationDateYYYYMMDD | 2018-06-01 |
PublicationDate_xml | – month: 06 year: 2018 text: 2018-Jun |
PublicationDecade | 2010 |
PublicationTitle | 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition |
PublicationTitleAbbrev | CVPR |
PublicationYear | 2018 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SourceID | ieee |
SourceType | Publisher |
StartPage | 7967 |
SubjectTerms | Gallium nitride Generators Image annotation Redundancy Semantics Task analysis Training |
Title | Tagging Like Humans: Diverse and Distinct Image Annotation |
URI | https://ieeexplore.ieee.org/document/8578929 |