Pre-Defined Sparse Neural Networks With Hardware Acceleration
Published in | IEEE Journal on Emerging and Selected Topics in Circuits and Systems, Vol. 9, No. 2, pp. 332–345 |
Main Authors | Dey, Sourya; Huang, Kuan-Wen; Beerel, Peter A.; Chugg, Keith M. |
Format | Journal Article |
Language | English |
Published | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.06.2019 |
Subjects | Acceleration; Artificial intelligence; Artificial neural networks; Complexity; Complexity theory; Computation; Computer architecture; Energy storage; Field programmable gate arrays; Hardware; hardware acceleration; Inference; Junctions; Machine learning; multilayer perceptron; Multilayer perceptrons; neural network; Neural networks; Neurons; Sparsity; Training |
Abstract | Neural networks have proven to be extremely powerful tools for modern artificial intelligence applications, but computational and storage complexity remain limiting factors. This paper presents two compatible contributions towards reducing the time, energy, computational, and storage complexities associated with multilayer perceptrons. Pre-defined sparsity is proposed to reduce the complexity during both training and inference, regardless of the implementation platform. Our results show that storage and computational complexity can be reduced by factors greater than 5X without significant performance loss. The second contribution is an architecture for hardware acceleration that is compatible with pre-defined sparsity. This architecture supports both training and inference modes and is flexible in the sense that it is not tied to a specific number of neurons. For example, this flexibility implies that various-sized neural networks can be supported on various-sized field-programmable gate arrays (FPGAs). |
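The abstract describes pre-defined sparsity only at a high level. The minimal sketch below illustrates the general idea under stated assumptions: a binary connectivity mask fixed before training is applied to one multilayer-perceptron layer, so pruned junctions are never stored or updated during either training or inference. The random mask, the 0.2 density, and the NumPy training step are illustrative choices, not the paper's actual connectivity patterns or hardware architecture.

```python
import numpy as np

# Hypothetical sketch of pre-defined sparsity: the connectivity mask is
# fixed BEFORE training, so absent junctions are never stored or updated
# (unlike post-training pruning, which trains a dense network first).
rng = np.random.default_rng(0)

def predefined_mask(n_in, n_out, density):
    """Keep a random `density` fraction of the n_in x n_out junctions."""
    return (rng.random((n_in, n_out)) < density).astype(np.float32)

class SparseLayer:
    def __init__(self, n_in, n_out, density):
        self.mask = predefined_mask(n_in, n_out, density)
        # Only masked-in weights ever hold nonzero values.
        self.W = (0.1 * rng.standard_normal((n_in, n_out))).astype(np.float32) * self.mask
        self.b = np.zeros(n_out, dtype=np.float32)

    def forward(self, x):
        self.x = x                      # cache input for backprop
        return x @ self.W + self.b

    def backward(self, grad_out, lr=0.01):
        # Masking the weight gradient keeps absent junctions absent
        # during training as well as inference.
        grad_W = (self.x.T @ grad_out) * self.mask
        grad_x = grad_out @ self.W.T
        self.W -= lr * grad_W
        self.b -= lr * grad_out.sum(axis=0)
        return grad_x

layer = SparseLayer(784, 100, density=0.2)
x = rng.standard_normal((32, 784)).astype(np.float32)
out = layer.forward(x)
_ = layer.backward(np.ones_like(out))
print(f"stored junctions: {int(layer.mask.sum())} of {layer.mask.size}")
```

A density somewhat below 0.2 would correspond to the greater-than-5X storage and computational reduction the abstract reports.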
Author | Dey, Sourya (souryade@usc.edu; ORCID 0000-0003-3084-1428); Huang, Kuan-Wen (kuanwenh@usc.edu); Beerel, Peter A. (pabeerel@usc.edu; ORCID 0000-0002-8283-0168); Chugg, Keith M. (chugg@usc.edu). All authors: Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
CODEN | IJESLY |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019 |
DOI | 10.1109/JETCAS.2019.2910864 |
Discipline | Engineering |
EISSN | 2156-3365 |
EndPage | 345 |
Genre | orig-research |
GrantInformation | Division of Computing and Communication Foundations, grant 1763747 (funder ID 10.13039/100000143)
ISSN | 2156-3357 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 2 |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
ORCID | 0000-0002-8283-0168 0000-0003-3084-1428 |
PageCount | 14 |
PublicationDate | 2019-06-01 |
PublicationPlace | Piscataway |
PublicationTitle | IEEE Journal on Emerging and Selected Topics in Circuits and Systems
PublicationTitleAbbrev | JETCAS |
PublicationYear | 2019 |
Publisher | IEEE (The Institute of Electrical and Electronics Engineers, Inc.)
StartPage | 332 |
URI | https://ieeexplore.ieee.org/document/8689061 https://www.proquest.com/docview/2239663745 |
Volume | 9 |