Region-Aware Image Captioning via Interaction Learning

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 32, No. 6, pp. 3685–3696
Main Authors: Liu, An-An; Zhai, Yingchen; Xu, Ning; Nie, Weizhi; Li, Wenhui; Zhang, Yongdong
Format: Journal Article
Language: English
Published: New York, IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.06.2022
Abstract Image captioning is one of the primary goals of computer vision: it aims to automatically generate natural descriptions for images. Intuitively, the human visual system notices certain stimulating regions at first glance and then volitionally focuses on interesting objects within those regions. For example, to generate a free-form sentence about "boy-catch-baseball", the visual region involving "boy" and "baseball" could be attended first and then guide salient-object discovery for word-by-word generation. To date, captioning works have relied mainly on object-wise modeling and have ignored rich regional patterns. To mitigate this drawback, this paper proposes a region-aware interaction learning method that explicitly captures the semantic correlations in the region and object dimensions for word inference. First, given an image, we extract a set of regions that contain diverse objects and their relations. Second, we present a spatial-GCN interaction refining structure that establishes connections between regions and objects to effectively capture contextual information. Third, we design a dual-attention interaction inference procedure that computes attention jointly in the region and object dimensions for word generation. Specifically, a guidance mechanism is proposed to selectively emphasize semantic inter-dependencies from region attention to object attention. Extensive experiments on the MSCOCO dataset demonstrate the superiority of the proposed method; additional ablation studies and visualizations further validate its effectiveness.
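The abstract names two concrete mechanisms: a spatial-GCN structure that refines region and object features against each other, and a dual-attention inference step in which region attention guides object attention. As a reading aid, here is a minimal PyTorch-style sketch of one plausible form of that dual-attention guidance; the paper's code is not part of this record, so every name, dimension, and the exact conditioning form below are illustrative assumptions rather than the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttention(nn.Module):
    """Attend over regions, then let the region context guide object attention."""
    def __init__(self, feat_dim: int, hidden_dim: int):
        super().__init__()
        self.region_score = nn.Linear(feat_dim + hidden_dim, 1)
        # Object scores additionally see the attended region context (the guidance).
        self.object_score = nn.Linear(feat_dim + hidden_dim + feat_dim, 1)

    def forward(self, regions, objects, h):
        # regions: (B, R, D) region features; objects: (B, O, D) object features;
        # h: (B, H) decoder hidden state at the current word step.
        B, R, _ = regions.shape
        O = objects.shape[1]
        h_r = h.unsqueeze(1).expand(B, R, h.shape[-1])
        alpha = F.softmax(
            self.region_score(torch.cat([regions, h_r], dim=-1)).squeeze(-1), dim=-1)
        region_ctx = (alpha.unsqueeze(-1) * regions).sum(dim=1)          # (B, D)

        # Guidance: objects consistent with the salient region context are emphasized.
        g = region_ctx.unsqueeze(1).expand(B, O, region_ctx.shape[-1])
        h_o = h.unsqueeze(1).expand(B, O, h.shape[-1])
        beta = F.softmax(
            self.object_score(torch.cat([objects, h_o, g], dim=-1)).squeeze(-1), dim=-1)
        object_ctx = (beta.unsqueeze(-1) * objects).sum(dim=1)           # (B, D)
        return region_ctx, object_ctx

The spatial-GCN refining step can likewise be sketched as message passing over a bipartite region-object graph; again, the containment adjacency and the residual update are assumptions made only for illustration.

def spatial_gcn_refine(regions, objects, contains, W):
    # regions: (B, R, D); objects: (B, O, D);
    # contains: (B, R, O) binary adjacency, 1 where region r spatially contains object o;
    # W: a shared nn.Linear(D, D) transform.
    deg_r = contains.sum(dim=-1, keepdim=True).clamp(min=1.0)            # (B, R, 1)
    region_msg = contains @ W(objects) / deg_r                           # objects -> regions
    deg_o = contains.sum(dim=1).unsqueeze(-1).clamp(min=1.0)             # (B, O, 1)
    object_msg = contains.transpose(1, 2) @ W(regions) / deg_o           # regions -> objects
    return F.relu(regions + region_msg), F.relu(objects + object_msg)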
Author Details
1. Liu, An-An (ORCID: 0000-0001-5755-9145), School of Electrical and Information Engineering, Tianjin University, Tianjin, China
2. Zhai, Yingchen (ORCID: 0000-0002-7980-7901), School of Electrical and Information Engineering, Tianjin University, Tianjin, China
3. Xu, Ning (ORCID: 0000-0002-7526-4356; email: ningxu@tju.edu.cn), School of Electrical and Information Engineering, Tianjin University, Tianjin, China
4. Nie, Weizhi (ORCID: 0000-0002-0578-8138), School of Electrical and Information Engineering, Tianjin University, Tianjin, China
5. Li, Wenhui (ORCID: 0000-0001-9609-6120; email: liwenhui@tju.edu.cn), School of Electrical and Information Engineering, Tianjin University, Tianjin, China
6. Zhang, Yongdong (ORCID: 0000-0002-1151-1792), School of Information Science and Technology, University of Science and Technology of China, Hefei, China
CODEN ITCTEM
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
DOI 10.1109/TCSVT.2021.3107035
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE/IET Electronic Library
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Discipline Engineering
EISSN 1558-2205
EndPage 3696
ExternalDocumentID 10_1109_TCSVT_2021_3107035
9521159
Genre orig-research
GrantInformation
– National Natural Science Foundation of China, grants 61772359 and 62002257 (funder ID: 10.13039/501100001809)
– Grant of Tianjin New Generation Artificial Intelligence Major Program, grants 19ZXZNGX00110 and 18ZXZNGX00150
– China Postdoctoral Science Foundation, grant 2021M692395 (funder ID: 10.13039/501100002858)
ISSN 1051-8215
IsPeerReviewed true
IsScholarly true
Issue 6
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
PageCount 12
PublicationDate 2022-06-01
PublicationPlace New York
PublicationTitle IEEE Transactions on Circuits and Systems for Video Technology
PublicationTitleAbbrev TCSVT
PublicationYear 2022
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 3685
SubjectTerms Ablation
Baseball
Computer vision
Feature extraction
Free form
image captioning
Inference
interaction learning
Learning
Learning systems
Object recognition
Proposals
Region modeling
Salience
Semantics
Sports
Task analysis
Visualization
Title Region-Aware Image Captioning via Interaction Learning
URI https://ieeexplore.ieee.org/document/9521159
https://www.proquest.com/docview/2672805953
Volume 32