DMS: Dynamic Model Scaling for Quality-Aware Deep Learning Inference in Mobile and Embedded Devices
Published in | IEEE Access, Vol. 7, pp. 168048-168059 |
---|---|
Main Authors | Kang, Woochul; Kim, Daeyeon; Park, Junyoung |
Format | Journal Article |
Language | English |
Published | Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2019 |
ISSN | 2169-3536 |
DOI | 10.1109/ACCESS.2019.2954546 |
Abstract | Deep learning has recently transformed many mobile and embedded systems that interact with the physical world through continuous video streams. Although there have been significant efforts to reduce the computational overhead of deep learning inference in such systems, previous approaches have focused on delivering 'best-effort' performance, which results in unpredictable performance under variable environments. In this paper, we propose a runtime control method, called DMS (Dynamic Model Scaling), that enables dynamic resource-accuracy trade-offs to support the various QoS requirements of deep learning applications. In DMS, the resource demands of deep learning inference are controlled by adaptively pruning computation-intensive convolution filters. DMS avoids irregularity in the pruned models by reorganizing filters according to their importance, so that a varying number of filters can be applied efficiently. Because DMS's pruning method incurs no runtime overhead and preserves the full capacity of the original deep learning models, DMS can tailor the models at runtime for concurrent deep learning applications, each with its own resource-accuracy trade-off. We demonstrate the viability of DMS by implementing a prototype. The evaluation results show that, if properly coordinated with system-level resource managers, DMS can support highly robust and efficient inference performance against unpredictable workloads. |
---|---|
Author | Kim, Daeyeon; Park, Junyoung; Kang, Woochul |
Author_xml | 1. Kang, Woochul (ORCID: 0000-0002-4757-8999; email: wchkang@inu.ac.kr); 2. Kim, Daeyeon; 3. Park, Junyoung. All authors: Department of Embedded Systems Engineering, Incheon National University, Incheon, South Korea |
CODEN | IAECCG |
CitedBy_id | crossref_primary_10_1109_ACCESS_2024_3384233 crossref_primary_10_1016_j_neucom_2025_129402 crossref_primary_10_1109_TCSII_2020_3017538 crossref_primary_10_3390_s22207714 crossref_primary_10_1145_3476990 crossref_primary_10_1007_s11265_022_01784_1 crossref_primary_10_1038_s41598_022_25089_2 |
Cites_doi | 10.1145/3005348 10.1109/MPRV.2009.82 10.1109/CVPR.2016.90 10.1007/978-3-030-01261-8_25 10.23919/DATE.2018.8342119 10.1145/3079856.3080246 10.1109/ACCESS.2018.2887099 10.1109/CVPR.2016.91 10.1145/2893356 10.1109/CVPR.2017.194 10.1109/TPAMI.2018.2878258 10.1162/neco.1989.1.4.541 10.1109/CPSNA.2015.23 10.1007/978-3-030-01234-2_48 10.1145/2647868.2654889 10.1145/3241539.3241559 10.21437/Interspeech.2013-552 10.1145/3007787.3001163 10.1109/CVPR.2014.81 10.1109/ICPR.2016.7900006 10.1109/MC.2007.443 10.1145/2815400.2815403 10.1109/IPSN.2016.7460664 10.1109/ICCV.2017.298 10.1007/s11241-018-9314-y 10.1007/s11263-015-0816-y 10.1007/978-3-319-06486-4_7 10.1002/047166880X 10.1145/2749469.2750389 10.1109/MS.2017.79 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019 |
DOI | 10.1109/ACCESS.2019.2954546 |
Discipline | Engineering |
EISSN | 2169-3536 |
EndPage | 168059 |
Genre | orig-research |
GrantInformation_xml | National Research Foundation of Korea (funder ID 10.13039/501100003725), grant NRF-2019R1F1A1060959; Ministry of Education (funder ID 10.13039/100010002), grant NRF-2016R1D1A1B03934266 |
ISSN | 2169-3536 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
License | https://creativecommons.org/licenses/by/4.0/legalcode |
ORCID | 0000-0002-4757-8999 |
OpenAccessLink | https://doaj.org/article/57c106e5aea4419da51bdbd4112d77e8 |
PageCount | 12 |
PublicationCentury | 2000 |
PublicationDate | 2019 |
PublicationDateYYYYMMDD | 2019-01-01 |
PublicationDecade | 2010 |
PublicationPlace | Piscataway |
PublicationTitle | IEEE Access |
PublicationTitleAbbrev | Access |
PublicationYear | 2019 |
Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
StartPage | 168048 |
SubjectTerms | Adaptation models; Adaptive filters; Computational modeling; Control methods; Convolution; Deep learning; Dynamic models; edge devices; Electronic devices; Embedded systems; energy efficiency; feedback control; filter pruning; Inference; mobile devices; Model accuracy; model compression; Pruning; QoS; Quality of service; Run time (computers); Runtime; Task analysis; Tradeoffs; Video data |
Title | DMS: Dynamic Model Scaling for Quality-Aware Deep Learning Inference in Mobile and Embedded Devices |
URI | https://ieeexplore.ieee.org/document/8907822 https://www.proquest.com/docview/2455599911 https://doaj.org/article/57c106e5aea4419da51bdbd4112d77e8 |
Volume | 7 |
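
The abstract describes reorganizing convolution filters by importance so that a varying number of them can be applied at runtime. The following is only a minimal sketch of that idea, not the authors' implementation. It assumes the L1 norm of a filter's weights as the importance score (this record does not state which metric the paper uses), and the names `l1_importance`, `reorder_by_importance`, and `scaled_filters` are hypothetical.

```python
from typing import List

# One convolution filter, flattened to a list of weights (illustrative layout).
Filter = List[float]


def l1_importance(filt: Filter) -> float:
    """Assumed importance score: the sum of absolute weights (L1 norm)."""
    return sum(abs(w) for w in filt)


def reorder_by_importance(filters: List[Filter]) -> List[int]:
    """Return filter indices sorted from most to least important.

    This is done once, ahead of time, so no ranking is needed at runtime.
    """
    return sorted(
        range(len(filters)),
        key=lambda i: l1_importance(filters[i]),
        reverse=True,
    )


def scaled_filters(
    filters: List[Filter], order: List[int], scale: float
) -> List[Filter]:
    """Keep the leading fraction `scale` of the reordered filters.

    Because the filters are already ordered, scaling the model is just
    taking a contiguous prefix; at least one filter is always kept.
    """
    if not 0.0 < scale <= 1.0:
        raise ValueError("scale must be in (0, 1]")
    keep = max(1, int(scale * len(order)))
    return [filters[i] for i in order[:keep]]


if __name__ == "__main__":
    layer = [[0.1, -0.1], [1.0, -2.0], [0.5, 0.5], [0.0, 0.05]]
    order = reorder_by_importance(layer)
    # Apply the layer at half capacity, then at full capacity.
    print(scaled_filters(layer, order, 0.5))
    print(scaled_filters(layer, order, 1.0))
```

Under these assumptions, the full set of filters stays in memory and only the prefix length changes, which is one way to read the abstract's claim that the full capacity of the original model is preserved while the pruned model stays regular.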