Shallow-Deep Networks: Understanding and Mitigating Network Overthinking
Published in | arXiv.org |
---|---|
Main Authors | Kaya, Yigitcan; Hong, Sanghyun; Tudor Dumitras |
Format | Paper |
Language | English |
Published | Ithaca: Cornell University Library, arXiv.org, 09.05.2019 |
Subjects | Classification; Classifiers; Computation; Neural networks; Task complexity |
Abstract | We characterize a prevalent weakness of deep neural networks (DNNs)---overthinking---which occurs when a DNN can reach correct predictions before its final layer. Overthinking is computationally wasteful, and it can also be destructive when, by the final layer, a correct prediction changes into a misclassification. Understanding overthinking requires studying how each prediction evolves during a DNN's forward pass, which conventionally is opaque. For prediction transparency, we propose the Shallow-Deep Network (SDN), a generic modification to off-the-shelf DNNs that introduces internal classifiers. We apply SDN to four modern architectures, trained on three image classification tasks, to characterize the overthinking problem. We show that SDNs can mitigate the wasteful effect of overthinking with confidence-based early exits, which reduce the average inference cost by more than 50% and preserve the accuracy. We also find that the destructive effect occurs for 50% of misclassifications on natural inputs and that it can be induced, adversarially, with a recent backdooring attack. To mitigate this effect, we propose a new confusion metric to quantify the internal disagreements that will likely lead to misclassifications. |
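The abstract describes the SDN mechanism only at a high level: internal classifiers attached to an off-the-shelf backbone, confidence-based early exits that cut inference cost, and a confusion metric over internal disagreements. The snippet below is a minimal PyTorch sketch of that idea for illustration only; the toy backbone, the head layout, the 0.9 confidence threshold, and the disagreement-counting `confusion_score` are assumptions of this sketch, not the authors' implementation or their exact metric.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SDNSketch(nn.Module):
    """Toy backbone with one lightweight internal classifier per stage."""

    def __init__(self, num_classes: int = 10, exit_threshold: float = 0.9):
        super().__init__()
        # Backbone split into stages so internal classifiers can observe
        # intermediate feature maps (a stand-in for a real architecture).
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),
            nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU()),
            nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU()),
        ])
        # Internal classifiers: global average pooling + a linear layer.
        self.heads = nn.ModuleList([
            nn.Linear(16, num_classes),
            nn.Linear(32, num_classes),
            nn.Linear(64, num_classes),
        ])
        self.exit_threshold = exit_threshold  # illustrative value

    def forward(self, x):
        """Return the logits of every classifier (for training or analysis)."""
        all_logits = []
        for stage, head in zip(self.stages, self.heads):
            x = stage(x)
            pooled = F.adaptive_avg_pool2d(x, 1).flatten(1)
            all_logits.append(head(pooled))
        return all_logits

    @torch.no_grad()
    def predict_early_exit(self, x):
        """Confidence-based early exit for a single input of shape (1, 3, H, W):
        stop at the first classifier whose softmax confidence clears the threshold."""
        for i, (stage, head) in enumerate(zip(self.stages, self.heads)):
            x = stage(x)
            pooled = F.adaptive_avg_pool2d(x, 1).flatten(1)
            probs = F.softmax(head(pooled), dim=1)
            confidence, prediction = probs.max(dim=1)
            if confidence.item() >= self.exit_threshold or i == len(self.stages) - 1:
                return prediction.item(), i  # predicted class and exit position


def confusion_score(all_logits):
    """Illustrative stand-in for the confusion metric: the fraction of internal
    classifiers whose prediction disagrees with the final classifier's."""
    predictions = [logits.argmax(dim=1) for logits in all_logits]
    final = predictions[-1]
    disagreements = [(p != final).float().mean().item() for p in predictions[:-1]]
    return sum(disagreements) / len(disagreements)
```

With a trained instance and a single preprocessed image `x` of shape (1, 3, H, W), `model.predict_early_exit(x)` would return the predicted class and the index of the internal classifier that produced it; the earlier the exit, the larger the saving relative to a full forward pass, which is the source of the reported inference-cost reduction.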
Copyright | 2019. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
Discipline | Physics |
EISSN | 2331-8422 |
Genre | Working Paper/Pre-Print |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
URI | https://www.proquest.com/docview/2121209350 |