Task-Adaptive Attention for Image Captioning
Published in | IEEE Transactions on Circuits and Systems for Video Technology, Vol. 32, No. 1, pp. 43–51
Main Authors | Chenggang Yan; Yiming Hao; Liang Li; Jian Yin; Anan Liu; Zhendong Mao; Zhenyu Chen; Xingyu Gao
Format | Journal Article
Language | English
Published | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.01.2022
Subjects | Adaptation models; attention mechanism; Computational modeling; Decoding; Feature extraction; Feeds; Image captioning; Modules; Regularization; Task analysis; transformer; Visualization
Abstract | Attention mechanisms are now widely used in image captioning models. However, most attention models focus only on visual features. When generating syntax-related words, little visual information is needed; in such cases, visual attention can mislead word generation. In this paper, we propose a Task-Adaptive Attention module for image captioning, which alleviates this misleading problem and learns implicit non-visual clues that help generate non-visual words. We further introduce a diversity regularization to enhance the expressive ability of the Task-Adaptive Attention module. Extensive experiments on the MSCOCO captioning dataset demonstrate that plugging our Task-Adaptive Attention module into a vanilla Transformer-based image captioning model yields a performance improvement.
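The abstract describes the mechanism only at a high level. As a rough illustration, below is a minimal, hypothetical PyTorch sketch of one way such a module could work: learnable non-visual "task" vectors are appended to the visual keys/values so the decoder can route attention away from the image when generating function words. The class and parameter names, the number of task vectors, and the orthogonality-style diversity penalty are all assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskAdaptiveAttention(nn.Module):
    """Hypothetical sketch: attention over visual regions plus M learnable
    non-visual "task" slots. Names and sizes are illustrative assumptions."""

    def __init__(self, dim: int, num_task_vectors: int = 4):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        # Learnable non-visual slots appended to the visual keys/values.
        self.task_vectors = nn.Parameter(0.02 * torch.randn(num_task_vectors, dim))
        self.scale = dim ** -0.5

    def forward(self, query: torch.Tensor, visual_feats: torch.Tensor) -> torch.Tensor:
        # query: (B, D) decoder hidden state; visual_feats: (B, N, D) region features.
        B = query.size(0)
        tasks = self.task_vectors.unsqueeze(0).expand(B, -1, -1)  # (B, M, D)
        kv = torch.cat([visual_feats, tasks], dim=1)              # (B, N+M, D)
        q = self.q_proj(query).unsqueeze(1)                       # (B, 1, D)
        k, v = self.k_proj(kv), self.v_proj(kv)
        attn = F.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)
        return (attn @ v).squeeze(1)                              # (B, D)

    def diversity_loss(self) -> torch.Tensor:
        # One plausible "diversity regularization": push the normalized task
        # vectors toward mutual orthogonality so the slots do not collapse.
        t = F.normalize(self.task_vectors, dim=-1)
        gram = t @ t.t()
        eye = torch.eye(t.size(0), device=t.device)
        return (gram - eye).pow(2).sum()
```

Under these assumptions, training would add a small weight times `diversity_loss()` to the usual captioning loss, so the decoder can attend to a task slot instead of a region when a word needs little visual evidence.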
Author Affiliations |
– Chenggang Yan (ORCID 0000-0003-1204-0512; cgyan@sdu.edu.cn): School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, China
– Yiming Hao (ORCID 0000-0003-3225-5887; m17863135918_1@163.com): School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, China
– Liang Li (ORCID 0000-0001-8437-4824; liang.li@ict.ac.cn): Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
– Jian Yin (ORCID 0000-0002-4820-0226; yinjian@sdu.edu.cn): School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, China
– Anan Liu (ORCID 0000-0001-5755-9145; anan0422@gmail.com): School of Electrical and Information Engineering, Tianjin University, Tianjin, China
– Zhendong Mao (ORCID 0000-0001-5739-8126; zdmao@ustc.edu.cn): School of Information Science and Technology, University of Science and Technology of China, Hefei, China
– Zhenyu Chen (czy9907@gmail.com): Big Data Center, State Grid Corporation of China, Beijing, China
– Xingyu Gao (gaoxingyu@ime.ac.cn): Institute of Microelectronics, Chinese Academy of Sciences, Beijing, China
CODEN | ITCTEM |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
DOI | 10.1109/TCSVT.2021.3067449 |
Discipline | Engineering |
EISSN | 1558-2205 |
EndPage | 51 |
Genre | orig-research |
Funding | National Natural Science Foundation of China (Grants 61771457, 61732007, 61931008, 61672497, 61971268, 61772494, 62022083); National Key Research and Development Program of China (Grant 2020YFB1406604)
ISSN | 1051-8215 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
PageCount | 9 |
PublicationTitleAbbrev | TCSVT |
StartPage | 43 |
URI | https://ieeexplore.ieee.org/document/9381876 https://www.proquest.com/docview/2619024178 |
Volume | 32 |