A Note about: Local Explanation Methods for Deep Neural Networks lack Sensitivity to Parameter Values

Bibliographic Details
Main Authors: Sundararajan, Mukund; Taly, Ankur
Format: Journal Article
Language: English
Published: 11.06.2018
Subjects: Computer Science - Learning; Statistics - Machine Learning
Online Access: https://arxiv.org/abs/1806.04205

Abstract: Local explanation methods, also known as attribution methods, attribute a deep network's prediction to its input (cf. Baehrens et al. (2010)). We respond to the claim from Adebayo et al. (2018) that local explanation methods lack sensitivity, i.e., DNNs with randomly-initialized weights produce explanations that are both visually and quantitatively similar to those produced by DNNs with learned weights. Further investigation reveals that their findings are due to two choices in their analysis: (a) ignoring the signs of the attributions; and (b) for integrated gradients (IG), including pixels in their analysis that have zero attributions by choice of the baseline (an auxiliary input relative to which the attributions are computed). When both factors are accounted for, IG attributions for a random network and the actual network are uncorrelated. Our investigation also sheds light on how these issues affect visualizations, although we note that more work is needed to understand how viewers interpret the difference between the random and the actual attributions.
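
The abstract hinges on two analysis choices: keeping the signs of the integrated-gradients attributions, and excluding pixels that equal the baseline (which receive exactly zero attribution by construction). The following sketch is not from the paper; it is a minimal illustration, with hypothetical names such as grad_fn and toy linear gradient oracles, of how one might compute IG and then correlate the signed, baseline-masked attributions of a trained versus a randomly-initialized network.

# Illustrative sketch (not the authors' code): integrated gradients plus the
# sign-aware, baseline-excluding correlation check described in the abstract.
# grad_fn, trained_grad, and random_grad are hypothetical stand-ins for the
# gradient of a real model's output with respect to its input.
import numpy as np
from scipy.stats import spearmanr


def integrated_gradients(x, baseline, grad_fn, steps=50):
    """Riemann-sum approximation of IG along the straight path from the
    baseline to x. Pixels where x == baseline get zero attribution by
    construction, which is why they are excluded from the comparison."""
    alphas = np.linspace(0.0, 1.0, steps)
    grads = np.stack([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)


def attribution_correlation(x, baseline, trained_grad, random_grad):
    """Rank-correlate the signed IG attributions of a trained and a
    randomly-initialized network, restricted to non-baseline pixels."""
    ig_trained = integrated_gradients(x, baseline, trained_grad)
    ig_random = integrated_gradients(x, baseline, random_grad)
    mask = x != baseline              # drop zero-by-construction attributions
    # Keep the signs (no abs()): discarding them is the other analysis choice
    # the note identifies.
    rho, _ = spearmanr(ig_trained[mask], ig_random[mask])
    return rho


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random(784)               # a flattened toy "image"
    baseline = np.zeros_like(x)       # black-image baseline
    # Hypothetical gradient oracles standing in for real networks; a linear
    # model has a constant gradient, so these are only for illustration.
    w_trained, w_random = rng.normal(size=784), rng.normal(size=784)
    print(attribution_correlation(
        x, baseline,
        trained_grad=lambda z: w_trained,
        random_grad=lambda z: w_random))

Under this toy setup the two attribution maps come from independent random weight vectors, so the reported correlation should be near zero, mirroring the paper's conclusion once signs are kept and baseline pixels are excluded.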
Copyright: http://arxiv.org/licenses/nonexclusive-distrib/1.0
DOI: 10.48550/arXiv.1806.04205