Attribution rollout: a new way to interpret visual transformer

Bibliographic Details
Published in Journal of ambient intelligence and humanized computing Vol. 14; no. 1; pp. 163-173
Main Authors Xu, Li, Yan, Xin, Ding, Weiyue, Liu, Zechao
Format Journal Article
Language English
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.01.2023
Springer Nature B.V
Abstract Transformer-based models dominate natural language processing and are becoming increasingly popular in computer vision. However, the black-box characteristics of transformers seriously hamper their application in certain fields. Prior work relies on raw attention scores or employs heuristic propagation along the attention graph. In this work, we propose a new way to visualize transformer models. The method computes attention scores based on attribution and then propagates these scores through the layers; the propagation covers both the attention layers and the multi-head attention mechanism. Our method extracts the salient dependencies in each layer to visualize prediction results. We benchmark it on recent visual transformer networks and demonstrate its advantages over existing interpretability methods. Our code is available at: https://github.com/yxheartipp/attr-rollout.
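The abstract's two-step recipe (attribution-weighted attention scores, then layer-wise propagation) can be sketched in a few lines. The following is a minimal illustration, assuming per-layer attention maps and matching integrated-gradients attribution maps have already been extracted from a vision transformer; the function name, the positive-part weighting, and the head-averaging step are illustrative assumptions, not the authors' reference implementation (see the linked repository for that).

```python
import numpy as np

def attribution_rollout(attentions, attributions):
    """Propagate attribution-weighted attention through the layers.

    attentions:   list of per-layer attention maps, each (heads, tokens, tokens)
    attributions: list of matching attribution maps (e.g. from integrated
                  gradients with respect to the attention scores), same shapes
    Returns a (tokens, tokens) token-to-token relevance map.
    """
    num_tokens = attentions[0].shape[-1]
    rollout = np.eye(num_tokens)  # each token starts fully relevant to itself
    for attn, attr in zip(attentions, attributions):
        # Weight each attention score by its attribution, keep the positive
        # part, and average over heads to fuse the multi-head mechanism.
        weighted = np.clip(attn * attr, 0.0, None).mean(axis=0)
        # Add the identity to account for the residual connection, then
        # renormalize rows so each row remains a distribution over tokens.
        weighted = weighted + np.eye(num_tokens)
        weighted = weighted / weighted.sum(axis=-1, keepdims=True)
        # Chain the layers: relevance flows through successive attention maps.
        rollout = weighted @ rollout
    return rollout
```

For a ViT-style classifier, the class-token row of the returned map (dropping its own entry) could then be reshaped into a patch-level heatmap over the input image.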
Author Xu, Li (ORCID 0000-0003-4950-0789), College of Computer Science and Technology, Harbin Engineering University; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University
Yan, Xin, College of Computer Science and Technology, Harbin Engineering University
Ding, Weiyue, Department of Medicine, Harvard Medical School
Liu, Zechao (email: liuzechao@hrbeu.edu.cn), College of Computer Science and Technology, Harbin Engineering University
CitedBy 10.1007/s11263-024-02034-6; 10.3390/e26110974
ContentType Journal Article
Copyright The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022. Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
DOI 10.1007/s12652-022-04354-2
Discipline Engineering
EISSN 1868-5145
EndPage 173
GrantInformation_xml – fundername: Ministry of Science and Technology
  grantid: 2021ZD0200406
  funderid: http://dx.doi.org/10.13039/100007225
– fundername: Fundamental Research Funds for the Central Universities
  grantid: 93K172021K04
  funderid: http://dx.doi.org/10.13039/501100012226
– fundername: National Natural Science Foundation of China
  grantid: 62172122
  funderid: http://dx.doi.org/10.13039/501100001809
ISSN 1868-5137
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords Vision transformer
Integrated gradients
Interpretability
Attention
PageCount 11
PublicationDate 2023-01-01
PublicationPlace Berlin/Heidelberg
PublicationTitle Journal of ambient intelligence and humanized computing
PublicationTitleAbbrev J Ambient Intell Human Comput
PublicationYear 2023
Publisher Springer Berlin Heidelberg
Springer Nature B.V
StartPage 163
SubjectTerms Algorithms
Artificial Intelligence
Back propagation
Computational Intelligence
Computer vision
Decision making
Engineering
Methods
Natural language processing
Original Research
Robotics and Automation
Transformers
User Interfaces and Human Computer Interaction
Visualization
Title Attribution rollout: a new way to interpret visual transformer
URI https://link.springer.com/article/10.1007/s12652-022-04354-2
https://www.proquest.com/docview/2919499638
Volume 14