Pre-Defined Sparse Neural Networks With Hardware Acceleration
Published in | IEEE Journal on Emerging and Selected Topics in Circuits and Systems, Vol. 9, No. 2, pp. 332–345 |
Main Authors | Dey, Sourya; Huang, Kuan-Wen; Beerel, Peter A.; Chugg, Keith M. |
Format | Journal Article |
Language | English |
Published | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.06.2019 |
Subjects | Acceleration; Artificial intelligence; Artificial neural networks; Complexity; Complexity theory; Computation; Computer architecture; Energy storage; Field programmable gate arrays; Hardware; hardware acceleration; Inference; Junctions; Machine learning; multilayer perceptron; Multilayer perceptrons; neural network; Neural networks; Neurons; Sparsity; Training |
Abstract | Neural networks have proven to be extremely powerful tools for modern artificial intelligence applications, but computational and storage complexity remain limiting factors. This paper presents two compatible contributions towards reducing the time, energy, computational, and storage complexities associated with multilayer perceptrons. Pre-defined sparsity is proposed to reduce the complexity during both training and inference, regardless of the implementation platform. Our results show that storage and computational complexity can be reduced by factors greater than 5X without significant performance loss. The second contribution is an architecture for hardware acceleration that is compatible with pre-defined sparsity. This architecture supports both training and inference modes and is flexible in the sense that it is not tied to a specific number of neurons. For example, this flexibility implies that various-sized neural networks can be supported on various-sized field-programmable gate arrays (FPGAs). |
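The abstract describes pre-defined sparsity only at a high level. The minimal sketch below illustrates the general idea under stated assumptions: a binary connectivity mask fixed before training is applied to one multilayer-perceptron layer, so pruned junctions are never stored or updated during either training or inference. The random mask, the 0.2 density, and the NumPy training step are illustrative choices, not the paper's actual connectivity patterns or hardware architecture.

```python
import numpy as np

# Hypothetical sketch of pre-defined sparsity: the connectivity mask is
# fixed BEFORE training, so absent junctions are never stored or updated
# (unlike post-training pruning, which trains a dense network first).
rng = np.random.default_rng(0)

def predefined_mask(n_in, n_out, density):
    """Keep a random `density` fraction of the n_in x n_out junctions."""
    return (rng.random((n_in, n_out)) < density).astype(np.float32)

class SparseLayer:
    def __init__(self, n_in, n_out, density):
        self.mask = predefined_mask(n_in, n_out, density)
        # Only masked-in weights ever hold nonzero values.
        self.W = (0.1 * rng.standard_normal((n_in, n_out))).astype(np.float32) * self.mask
        self.b = np.zeros(n_out, dtype=np.float32)

    def forward(self, x):
        self.x = x                      # cache input for backprop
        return x @ self.W + self.b

    def backward(self, grad_out, lr=0.01):
        # Masking the weight gradient keeps absent junctions absent
        # during training as well as inference.
        grad_W = (self.x.T @ grad_out) * self.mask
        grad_x = grad_out @ self.W.T
        self.W -= lr * grad_W
        self.b -= lr * grad_out.sum(axis=0)
        return grad_x

layer = SparseLayer(784, 100, density=0.2)
x = rng.standard_normal((32, 784)).astype(np.float32)
out = layer.forward(x)
_ = layer.backward(np.ones_like(out))
print(f"stored junctions: {int(layer.mask.sum())} of {layer.mask.size}")
```

A density somewhat below 0.2 would correspond to the greater-than-5X storage and computational reduction the abstract reports.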
Author | Dey, Sourya (souryade@usc.edu; ORCID 0000-0003-3084-1428); Huang, Kuan-Wen (kuanwenh@usc.edu); Beerel, Peter A. (pabeerel@usc.edu; ORCID 0000-0002-8283-0168); Chugg, Keith M. (chugg@usc.edu). All authors: Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
CODEN | IJESLY |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019 |
DOI | 10.1109/JETCAS.2019.2910864 |
Discipline | Engineering |
EISSN | 2156-3365 |
EndPage | 345 |
Genre | orig-research |
GrantInformation | Division of Computing and Communication Foundations, grant 1763747 (funder ID 10.13039/100000143)
ISSN | 2156-3357 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 2 |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
ORCID | 0000-0002-8283-0168 0000-0003-3084-1428 |
PageCount | 14 |
PublicationDate | 2019-06-01 |
PublicationPlace | Piscataway |
PublicationTitle | IEEE Journal on Emerging and Selected Topics in Circuits and Systems
PublicationTitleAbbrev | JETCAS |
PublicationYear | 2019 |
Publisher | IEEE (The Institute of Electrical and Electronics Engineers, Inc.)
StartPage | 332 |
URI | https://ieeexplore.ieee.org/document/8689061 https://www.proquest.com/docview/2239663745 |
Volume | 9 |