Approximate Memory Compression

Bibliographic Details
Published in IEEE Transactions on Very Large Scale Integration (VLSI) Systems Vol. 28; no. 4; pp. 980-991
Main Authors Ranjan, Ashish, Raha, Arnab, Raghunathan, Vijay, Raghunathan, Anand
Format Journal Article
Language English
Published New York IEEE 01.04.2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 1063-8210
EISSN 1557-9999
DOI 10.1109/TVLSI.2020.2970041

Abstract Memory subsystems are a major energy bottleneck in computing platforms due to frequent transfers between processors and off-chip memory. We propose approximate memory compression, a technique that leverages the intrinsic resilience of emerging workloads such as machine learning and data analytics to reduce off-chip memory traffic, thereby improving energy and performance. We realize approximate memory compression by enhancing the memory controller to be aware of approximate memory regions (regions in memory that contain approximation-resilient data) and to transparently compress (decompress) the data written to (read from) these regions. To provide control over approximations, each approximate memory region is associated with an error constraint such as the maximum error that may be introduced in each data element. The quality-aware memory controller subjects memory transactions to a compression scheme that introduces approximations, thereby reducing memory traffic, while adhering to the specified error constraint for each approximate memory region. A software interface is provided to allow programmers to identify data structures (DSs) that are resilient to approximations. A runtime quality control framework automatically determines the error constraints for the identified DSs such that a given target application-level quality is maintained. We evaluate our proposal by applying it to three different main memory technologies in the context of a general-purpose computing system: DDR3 DRAM, LPDDR3 DRAM, and spin-transfer torque magnetic RAM (STT-MRAM). To demonstrate the feasibility of the proposed concepts, we also implement a hardware prototype using the Intel UniPHY-DDR3 memory controller and Nios-II processor, a Hynix DDR3 DRAM module, and a Stratix-IV field-programmable gate array (FPGA) development board.
Across a wide range of machine learning benchmarks, approximate memory compression obtains significant benefits in main memory energy (1.18× for DDR3 DRAM, 1.52× for LPDDR3 DRAM, and 2.0× for STT-MRAM) and a simultaneous improvement in execution time (5.2% for DDR3 DRAM, 5.4% for LPDDR3 DRAM, and 9.3% for STT-MRAM) with nearly identical application output quality.
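Illustrative sketch (not from the article): the abstract does not specify the compression scheme or the quality control algorithm, so the following Python fragment only illustrates the general idea of compressing a data region under a per-element maximum-error constraint and then choosing that constraint from an application-level quality target. The function names, the use of uniform quantization followed by zlib, and the candidate-search loop are all assumptions made for illustration; the article's hardware memory controller may work quite differently.

```python
import struct
import zlib


def compress_region(values, max_error):
    """Lossy-compress floats so that each element is off by at most max_error.

    Hypothetical scheme: uniform quantization with step 2 * max_error
    (rounding error is at most half a step), then lossless zlib packing.
    """
    if max_error <= 0:
        raise ValueError("max_error must be positive")
    step = 2.0 * max_error
    codes = [round(v / step) for v in values]
    raw = struct.pack(f"<{len(codes)}i", *codes)  # 32-bit signed codes
    return zlib.compress(raw)


def decompress_region(blob, max_error):
    """Inverse of compress_region for the same max_error."""
    step = 2.0 * max_error
    raw = zlib.decompress(blob)
    codes = struct.unpack(f"<{len(raw) // 4}i", raw)
    return [c * step for c in codes]


def pick_error_bound(values, quality_fn, target_quality, candidates):
    """Return the largest candidate bound whose round trip meets the target.

    quality_fn(original, approximate) is a caller-supplied stand-in for an
    application-level quality metric; higher means better. Returns None if
    no candidate meets the target.
    """
    best = None
    for bound in sorted(candidates):
        approx = decompress_region(compress_region(values, bound), bound)
        if quality_fn(values, approx) >= target_quality:
            best = bound
    return best
```

In this sketch a looser error bound yields fewer distinct quantization codes, which generally compress better, at the cost of larger per-element error; the search loop simply tries each candidate bound and keeps the loosest one that still satisfies the quality function.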
Author Raha, Arnab
Raghunathan, Anand
Ranjan, Ashish
Raghunathan, Vijay
Author_xml – sequence: 1
  givenname: Ashish
  orcidid: 0000-0003-2434-0475
  surname: Ranjan
  fullname: Ranjan, Ashish
  email: aranjan@purdue.edu
  organization: School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
– sequence: 2
  givenname: Arnab
  orcidid: 0000-0002-8848-1069
  surname: Raha
  fullname: Raha, Arnab
  email: araha@purdue.edu
  organization: School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
– sequence: 3
  givenname: Vijay
  orcidid: 0000-0003-4713-5386
  surname: Raghunathan
  fullname: Raghunathan, Vijay
  email: vr@purdue.edu
  organization: School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
– sequence: 4
  givenname: Anand
  surname: Raghunathan
  fullname: Raghunathan, Anand
  email: raghunathan@purdue.edu
  organization: School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
CODEN IEVSE9
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
Discipline Engineering
EISSN 1557-9999
EndPage 991
ExternalDocumentID 10_1109_TVLSI_2020_2970041
9004534
Genre orig-research
GrantInformation_xml – fundername: National Science Foundation
  grantid: 1423290
  funderid: 10.13039/501100008982
– fundername: Defense Advanced Research Projects Agency
  funderid: 10.13039/100000185
IsPeerReviewed true
IsScholarly true
Issue 4
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
PageCount 12
PublicationDate 2020-04-01
PublicationDateYYYYMMDD 2020-04-01
PublicationDate_xml – month: 04
  year: 2020
  text: 2020-04-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on very large scale integration (VLSI) systems
PublicationTitleAbbrev TVLSI
PublicationYear 2020
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 980
SubjectTerms Approximate memories
Computation
Controllers
Data structures
DRAM
Field programmable gate arrays
Hardware
Machine learning
main memory
memory compression
Micromechanical devices
Microprocessors
Program processors
Quality control
Random access memory
Runtime
spin-transfer torque magnetic RAM (STT-MRAM)
Traffic congestion
Title Approximate Memory Compression
URI https://ieeexplore.ieee.org/document/9004534
https://www.proquest.com/docview/2381791554
Volume 28