Region-Aware Image Captioning via Interaction Learning

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 32, No. 6, pp. 3685–3696
Main Authors: Liu, An-An; Zhai, Yingchen; Xu, Ning; Nie, Weizhi; Li, Wenhui; Zhang, Yongdong
Format: Journal Article
Language: English
Published: New York, IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.06.2022
Abstract Image captioning is one of the primary goals of computer vision: it aims to automatically generate natural descriptions for images. Intuitively, the human visual system notices certain stimulating regions at first glance and then volitionally focuses on interesting objects within those regions. For example, to generate a free-form sentence about "boy-catch-baseball", the visual region involving "boy" and "baseball" could be attended first and then guide salient-object discovery for word-by-word generation. To date, captioning works have relied mainly on object-wise modeling and have ignored rich regional patterns. To mitigate this drawback, this paper proposes a region-aware interaction learning method that explicitly captures the semantic correlations in the region and object dimensions for word inference. First, given an image, we extract a set of regions that contain diverse objects and their relations. Second, we present a spatial-GCN interaction refining structure that establishes connections between regions and objects to effectively capture contextual information. Third, we design a dual-attention interaction inference procedure that computes attention jointly in the region and object dimensions for word generation. Specifically, a guidance mechanism is proposed to selectively emphasize semantic inter-dependencies from region attention to object attention. Extensive experiments on the MSCOCO dataset demonstrate the superiority of the proposed method; additional ablation studies and visualizations further validate its effectiveness.
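The abstract names two concrete mechanisms: a spatial-GCN structure that refines region and object features against each other, and a dual-attention inference step in which region attention guides object attention. As a reading aid, here is a minimal PyTorch-style sketch of one plausible form of that dual-attention guidance; the paper's code is not part of this record, so every name, dimension, and the exact conditioning form below are illustrative assumptions rather than the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttention(nn.Module):
    """Attend over regions, then let the region context guide object attention."""
    def __init__(self, feat_dim: int, hidden_dim: int):
        super().__init__()
        self.region_score = nn.Linear(feat_dim + hidden_dim, 1)
        # Object scores additionally see the attended region context (the guidance).
        self.object_score = nn.Linear(feat_dim + hidden_dim + feat_dim, 1)

    def forward(self, regions, objects, h):
        # regions: (B, R, D) region features; objects: (B, O, D) object features;
        # h: (B, H) decoder hidden state at the current word step.
        B, R, _ = regions.shape
        O = objects.shape[1]
        h_r = h.unsqueeze(1).expand(B, R, h.shape[-1])
        alpha = F.softmax(
            self.region_score(torch.cat([regions, h_r], dim=-1)).squeeze(-1), dim=-1)
        region_ctx = (alpha.unsqueeze(-1) * regions).sum(dim=1)          # (B, D)

        # Guidance: objects consistent with the salient region context are emphasized.
        g = region_ctx.unsqueeze(1).expand(B, O, region_ctx.shape[-1])
        h_o = h.unsqueeze(1).expand(B, O, h.shape[-1])
        beta = F.softmax(
            self.object_score(torch.cat([objects, h_o, g], dim=-1)).squeeze(-1), dim=-1)
        object_ctx = (beta.unsqueeze(-1) * objects).sum(dim=1)           # (B, D)
        return region_ctx, object_ctx

The spatial-GCN refining step can likewise be sketched as message passing over a bipartite region-object graph; again, the containment adjacency and the residual update are assumptions made only for illustration.

def spatial_gcn_refine(regions, objects, contains, W):
    # regions: (B, R, D); objects: (B, O, D);
    # contains: (B, R, O) binary adjacency, 1 where region r spatially contains object o;
    # W: a shared nn.Linear(D, D) transform.
    deg_r = contains.sum(dim=-1, keepdim=True).clamp(min=1.0)            # (B, R, 1)
    region_msg = contains @ W(objects) / deg_r                           # objects -> regions
    deg_o = contains.sum(dim=1).unsqueeze(-1).clamp(min=1.0)             # (B, O, 1)
    object_msg = contains.transpose(1, 2) @ W(regions) / deg_o           # regions -> objects
    return F.relu(regions + region_msg), F.relu(objects + object_msg)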
Author Details
1. Liu, An-An (ORCID: 0000-0001-5755-9145), School of Electrical and Information Engineering, Tianjin University, Tianjin, China
2. Zhai, Yingchen (ORCID: 0000-0002-7980-7901), School of Electrical and Information Engineering, Tianjin University, Tianjin, China
3. Xu, Ning (ORCID: 0000-0002-7526-4356; email: ningxu@tju.edu.cn), School of Electrical and Information Engineering, Tianjin University, Tianjin, China
4. Nie, Weizhi (ORCID: 0000-0002-0578-8138), School of Electrical and Information Engineering, Tianjin University, Tianjin, China
5. Li, Wenhui (ORCID: 0000-0001-9609-6120; email: liwenhui@tju.edu.cn), School of Electrical and Information Engineering, Tianjin University, Tianjin, China
6. Zhang, Yongdong (ORCID: 0000-0002-1151-1792), School of Information Science and Technology, University of Science and Technology of China, Hefei, China
CODEN ITCTEM
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
DOI 10.1109/TCSVT.2021.3107035
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE/IET Electronic Library
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Discipline Engineering
EISSN 1558-2205
EndPage 3696
ExternalDocumentID 10_1109_TCSVT_2021_3107035
9521159
Genre orig-research
GrantInformation
– National Natural Science Foundation of China, grants 61772359 and 62002257 (funder ID: 10.13039/501100001809)
– Grant of Tianjin New Generation Artificial Intelligence Major Program, grants 19ZXZNGX00110 and 18ZXZNGX00150
– China Postdoctoral Science Foundation, grant 2021M692395 (funder ID: 10.13039/501100002858)
ISSN 1051-8215
IsPeerReviewed true
IsScholarly true
Issue 6
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
PageCount 12
PublicationDate 2022-06-01
PublicationPlace New York
PublicationTitle IEEE Transactions on Circuits and Systems for Video Technology
PublicationTitleAbbrev TCSVT
PublicationYear 2022
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 3685
SubjectTerms Ablation
Baseball
Computer vision
Feature extraction
Free form
image captioning
Inference
interaction learning
Learning
Learning systems
Object recognition
Proposals
Region modeling
Salience
Semantics
Sports
Task analysis
Visualization
Title Region-Aware Image Captioning via Interaction Learning
URI https://ieeexplore.ieee.org/document/9521159
https://www.proquest.com/docview/2672805953
Volume 32