A Note about: Local Explanation Methods for Deep Neural Networks lack Sensitivity to Parameter Values

Bibliographic Details
Main Authors: Sundararajan, Mukund; Taly, Ankur
Format: Journal Article
Language: English
Published: 11.06.2018
Subjects: Computer Science - Learning; Statistics - Machine Learning
Online Access: https://arxiv.org/abs/1806.04205

Abstract: Local explanation methods, also known as attribution methods, attribute a deep network's prediction to its input (cf. Baehrens et al. (2010)). We respond to the claim from Adebayo et al. (2018) that local explanation methods lack sensitivity, i.e., DNNs with randomly-initialized weights produce explanations that are both visually and quantitatively similar to those produced by DNNs with learned weights. Further investigation reveals that their findings are due to two choices in their analysis: (a) ignoring the signs of the attributions; and (b) for integrated gradients (IG), including pixels in their analysis that have zero attributions by choice of the baseline (an auxiliary input relative to which the attributions are computed). When both factors are accounted for, IG attributions for a random network and the actual network are uncorrelated. Our investigation also sheds light on how these issues affect visualizations, although we note that more work is needed to understand how viewers interpret the difference between the random and the actual attributions.
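
The abstract hinges on two analysis choices: keeping the signs of the integrated-gradients attributions, and excluding pixels that equal the baseline (which receive exactly zero attribution by construction). The following sketch is not from the paper; it is a minimal illustration, with hypothetical names such as grad_fn and toy linear gradient oracles, of how one might compute IG and then correlate the signed, baseline-masked attributions of a trained versus a randomly-initialized network.

# Illustrative sketch (not the authors' code): integrated gradients plus the
# sign-aware, baseline-excluding correlation check described in the abstract.
# grad_fn, trained_grad, and random_grad are hypothetical stand-ins for the
# gradient of a real model's output with respect to its input.
import numpy as np
from scipy.stats import spearmanr


def integrated_gradients(x, baseline, grad_fn, steps=50):
    """Riemann-sum approximation of IG along the straight path from the
    baseline to x. Pixels where x == baseline get zero attribution by
    construction, which is why they are excluded from the comparison."""
    alphas = np.linspace(0.0, 1.0, steps)
    grads = np.stack([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)


def attribution_correlation(x, baseline, trained_grad, random_grad):
    """Rank-correlate the signed IG attributions of a trained and a
    randomly-initialized network, restricted to non-baseline pixels."""
    ig_trained = integrated_gradients(x, baseline, trained_grad)
    ig_random = integrated_gradients(x, baseline, random_grad)
    mask = x != baseline              # drop zero-by-construction attributions
    # Keep the signs (no abs()): discarding them is the other analysis choice
    # the note identifies.
    rho, _ = spearmanr(ig_trained[mask], ig_random[mask])
    return rho


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random(784)               # a flattened toy "image"
    baseline = np.zeros_like(x)       # black-image baseline
    # Hypothetical gradient oracles standing in for real networks; a linear
    # model has a constant gradient, so these are only for illustration.
    w_trained, w_random = rng.normal(size=784), rng.normal(size=784)
    print(attribution_correlation(
        x, baseline,
        trained_grad=lambda z: w_trained,
        random_grad=lambda z: w_random))

Under this toy setup the two attribution maps come from independent random weight vectors, so the reported correlation should be near zero, mirroring the paper's conclusion once signs are kept and baseline pixels are excluded.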
Copyright: http://arxiv.org/licenses/nonexclusive-distrib/1.0
DOI: 10.48550/arXiv.1806.04205