MLPerf Inference Benchmark

Bibliographic Details
Published in 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), pp. 446-459
Main Authors Reddi, Vijay Janapa; Cheng, Christine; Kanter, David; Mattson, Peter; Schmuelling, Guenther; Wu, Carole-Jean; Anderson, Brian; Breughe, Maximilien; Charlebois, Mark; Chou, William; Chukka, Ramesh; Coleman, Cody; Davis, Sam; Deng, Pan; Diamos, Greg; Duke, Jared; Fick, Dave; Gardner, J. Scott; Hubara, Itay; Idgunji, Sachin; Jablin, Thomas B.; Jiao, Jeff; St. John, Tom; Kanwar, Pankaj; Lee, David; Liao, Jeffery; Lokhmotov, Anton; Massa, Francisco; Meng, Peng; Micikevicius, Paulius; Osborne, Colin; Pekhimenko, Gennady; Rajan, Arun Tejusve Raghunath; Sequeira, Dilip; Sirasao, Ashish; Sun, Fei; Tang, Hanlin; Thomson, Michael; Wei, Frank; Wu, Ephrem; Xu, Lingjie; Yamada, Koichi; Yu, Bing; Yuan, George; Zhong, Aaron; Zhang, Peizhao; Zhou, Yuchen
Format Conference Proceeding
Language English
Published IEEE 01.05.2020
DOI 10.1109/ISCA45697.2020.00045

Abstract Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability.
Author affiliations
1. Reddi, Vijay Janapa: Harvard University
2. Cheng, Christine: Intel
3. Kanter, David: Real World Insights
4. Mattson, Peter: Google
5. Schmuelling, Guenther: Microsoft
6. Wu, Carole-Jean: Facebook
7. Anderson, Brian: Google
8. Breughe, Maximilien: NVIDIA
9. Charlebois, Mark: Qualcomm
10. Chou, William: Qualcomm
11. Chukka, Ramesh: Intel
12. Coleman, Cody: Stanford University
13. Davis, Sam: Myrtle
14. Deng, Pan: Tencent
15. Diamos, Greg: Landing AI
16. Duke, Jared: Google
17. Fick, Dave: Mythic
18. Gardner, J. Scott: Advantage Engineering
19. Hubara, Itay: Habana Labs
20. Idgunji, Sachin: NVIDIA
21. Jablin, Thomas B.: Google
22. Jiao, Jeff: Alibaba T-Head
23. St. John, Tom: Tesla
24. Kanwar, Pankaj: Google
25. Lee, David: Facebook (formerly at MediaTek)
26. Liao, Jeffery: OPPO (formerly at Synopsys)
27. Lokhmotov, Anton: dividiti
28. Massa, Francisco: Facebook
29. Meng, Peng: Tencent
30. Micikevicius, Paulius: NVIDIA
31. Osborne, Colin: Arm
32. Pekhimenko, Gennady: University of Toronto & Vector Institute
33. Rajan, Arun Tejusve Raghunath: Intel
34. Sequeira, Dilip: NVIDIA
35. Sirasao, Ashish: Xilinx
36. Sun, Fei: Alibaba (formerly at Facebook)
37. Tang, Hanlin: Intel
38. Thomson, Michael: Centaur Technology
39. Wei, Frank: Alibaba Cloud
40. Wu, Ephrem: Xilinx
41. Xu, Lingjie: Biren Research (formerly at Alibaba)
42. Yamada, Koichi: Intel
43. Yu, Bing: Facebook (formerly at MediaTek)
44. Yuan, George: NVIDIA
45. Zhong, Aaron: Alibaba T-Head
46. Zhang, Peizhao: Facebook
47. Zhou, Yuchen: General Motors
EISBN 1728146615, 9781728146614
Subjects Accuracy; Benchmark testing; Benchmarking; Computer architecture; Degradation; Hardware; Inference; Machine Learning; Optimization; Organizations; Servers; Standards organizations; Throughput
URI https://ieeexplore.ieee.org/document/9138989