Approximate Memory Compression
Published in | IEEE transactions on very large scale integration (VLSI) systems Vol. 28; no. 4; pp. 980 - 991 |
---|---|
Main Authors | Ranjan, Ashish; Raha, Arnab; Raghunathan, Vijay; Raghunathan, Anand |
Format | Journal Article |
Language | English |
Published | New York; IEEE; 01.04.2020; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
ISSN | 1063-8210 1557-9999 |
DOI | 10.1109/TVLSI.2020.2970041 |
Abstract | Memory subsystems are a major energy bottleneck in computing platforms due to frequent transfers between processors and off-chip memory. We propose approximate memory compression, a technique that leverages the intrinsic resilience of emerging workloads such as machine learning and data analytics to reduce off-chip memory traffic, thereby improving energy and performance. We realize approximate memory compression by enhancing the memory controller to be aware of approximate memory regions (regions in memory that contain approximation-resilient data) and to transparently compress (decompress) the data written to (read from) these regions. To provide control over approximations, each approximate memory region is associated with an error constraint such as the maximum error that may be introduced in each data element. The quality-aware memory controller subjects memory transactions to a compression scheme that introduces approximations, thereby reducing memory traffic, while adhering to the specified error constraint for each approximate memory region. A software interface is provided to allow programmers to identify data structures (DSs) that are resilient to approximations. A runtime quality control framework automatically determines the error constraints for the identified DSs such that a given target application-level quality is maintained. We evaluate our proposal by applying it to three different main memory technologies in the context of a general-purpose computing system: DDR3 DRAM, LPDDR3 DRAM, and spin-transfer torque magnetic RAM (STT-MRAM). To demonstrate the feasibility of the proposed concepts, we also implement a hardware prototype using the Intel UniPHY-DDR3 memory controller and Nios-II processor, a Hynix DDR3 DRAM module, and a Stratix-IV field-programmable gate array (FPGA) development board.
Across a wide range of machine learning benchmarks, approximate memory compression obtains significant benefits in main memory energy ($1.18\times$ for DDR3 DRAM, $1.52\times$ for LPDDR3 DRAM, and $2.0\times$ for STT-MRAM) and a simultaneous improvement in execution time (5.2% for DDR3 DRAM, 5.4% for LPDDR3 DRAM, and 9.3% for STT-MRAM) with nearly identical application output quality. |
---|---|
Author | Raha, Arnab; Raghunathan, Anand; Ranjan, Ashish; Raghunathan, Vijay |
Author_xml | – sequence: 1 givenname: Ashish orcidid: 0000-0003-2434-0475 surname: Ranjan fullname: Ranjan, Ashish email: aranjan@purdue.edu organization: School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA – sequence: 2 givenname: Arnab orcidid: 0000-0002-8848-1069 surname: Raha fullname: Raha, Arnab email: araha@purdue.edu organization: School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA – sequence: 3 givenname: Vijay orcidid: 0000-0003-4713-5386 surname: Raghunathan fullname: Raghunathan, Vijay email: vr@purdue.edu organization: School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA – sequence: 4 givenname: Anand surname: Raghunathan fullname: Raghunathan, Anand email: raghunathan@purdue.edu organization: School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA |
CODEN | IEVSE9 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
Discipline | Engineering |
EISSN | 1557-9999 |
EndPage | 991 |
ExternalDocumentID | 10_1109_TVLSI_2020_2970041 9004534 |
Genre | orig-research |
GrantInformation_xml | – fundername: National Science Foundation grantid: 1423290 funderid: 10.13039/501100008982 – fundername: Defense Advanced Research Projects Agency funderid: 10.13039/100000185 |
ISSN | 1063-8210 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 4 |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
ORCID | 0000-0003-4713-5386 0000-0003-2434-0475 0000-0002-8848-1069 |
PageCount | 12 |
PublicationCentury | 2000 |
PublicationDate | 2020-04-01 |
PublicationDateYYYYMMDD | 2020-04-01 |
PublicationDate_xml | – month: 04 year: 2020 text: 2020-04-01 day: 01 |
PublicationDecade | 2020 |
PublicationPlace | New York |
PublicationPlace_xml | – name: New York |
PublicationTitle | IEEE transactions on very large scale integration (VLSI) systems |
PublicationTitleAbbrev | TVLSI |
PublicationYear | 2020 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
StartPage | 980 |
SubjectTerms | Approximate memories; Computation; Controllers; Data structures; DRAM; Field programmable gate arrays; Hardware; Machine learning; main memory; memory compression; Micromechanical devices; Microprocessors; Program processors; Quality control; Random access memory; Runtime; spin-transfer torque magnetic RAM (STT-MRAM); Traffic congestion |
Title | Approximate Memory Compression |
URI | https://ieeexplore.ieee.org/document/9004534 https://www.proquest.com/docview/2381791554 |
Volume | 28 |
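
The abstract describes compression that must respect a per-region error constraint, plus a framework that chooses those constraints to hold a target quality. The following is only a minimal software sketch of that idea, not the authors' hardware scheme: it assumes uniform quantization as the lossy step, and the names `compress`, `decompress`, `bits_per_element`, and `pick_error_bound` are hypothetical, introduced here for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def compress(values: Sequence[float], max_error: float) -> Tuple[float, List[int]]:
    """Quantize each value to an integer code with step 2 * max_error.

    Rounding to the nearest multiple of the step keeps the reconstruction
    error of every element at or below max_error (up to float rounding).
    Assumption: uniform quantization stands in for the paper's scheme.
    """
    if max_error <= 0:
        raise ValueError("max_error must be positive")
    step = 2.0 * max_error
    return step, [round(v / step) for v in values]


def decompress(step: float, codes: Sequence[int]) -> List[float]:
    """Reconstruct approximate values from their integer codes."""
    return [c * step for c in codes]


def bits_per_element(codes: Sequence[int]) -> int:
    """Bits needed to store the widest code, including one sign bit."""
    widest = max((abs(c) for c in codes), default=0)
    return widest.bit_length() + 1


def pick_error_bound(
    values: Sequence[float],
    quality: Callable[[Sequence[float], Sequence[float]], float],
    target: float,
    candidates: Sequence[float],
) -> Optional[float]:
    """Return the largest candidate bound whose quality meets the target.

    Hypothetical stand-in for a quality control loop: it tries each
    candidate offline instead of adapting at runtime.
    """
    best = None
    for bound in sorted(candidates):
        step, codes = compress(values, bound)
        if quality(values, decompress(step, codes)) >= target:
            best = bound
    return best
```

A looser error bound yields smaller integer codes, so fewer bits cross the memory interface per element; the selection loop trades that saving against whatever quality metric the application supplies.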