MLPerf Inference Benchmark
Published in | 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), pp. 446 - 459 |
Main Authors | Reddi, Vijay Janapa; Cheng, Christine; Kanter, David; Mattson, Peter; Schmuelling, Guenther; Wu, Carole-Jean; Anderson, Brian; Breughe, Maximilien; Charlebois, Mark; Chou, William; Chukka, Ramesh; Coleman, Cody; Davis, Sam; Deng, Pan; Diamos, Greg; Duke, Jared; Fick, Dave; Gardner, J. Scott; Hubara, Itay; Idgunji, Sachin; Jablin, Thomas B.; Jiao, Jeff; John, Tom St; Kanwar, Pankaj; Lee, David; Liao, Jeffery; Lokhmotov, Anton; Massa, Francisco; Meng, Peng; Micikevicius, Paulius; Osborne, Colin; Pekhimenko, Gennady; Rajan, Arun Tejusve Raghunath; Sequeira, Dilip; Sirasao, Ashish; Sun, Fei; Tang, Hanlin; Thomson, Michael; Wei, Frank; Wu, Ephrem; Xu, Lingjie; Yamada, Koichi; Yu, Bing; Yuan, George; Zhong, Aaron; Zhang, Peizhao; Zhou, Yuchen |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.05.2020 |
Subjects | Accuracy; Benchmark testing; Benchmarking; Computer architecture; Degradation; Hardware; Inference; Machine Learning; Optimization; Organizations; Servers; Standards organizations; Throughput |
Online Access | https://ieeexplore.ieee.org/document/9138989 |
DOI | 10.1109/ISCA45697.2020.00045 |
EISBN | 9781728146614; 1728146615 |
Abstract | Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability. |
Author Affiliations | Vijay Janapa Reddi (Harvard University); Christine Cheng (Intel); David Kanter (Real World Insights); Peter Mattson (Google); Guenther Schmuelling (Microsoft); Carole-Jean Wu (Facebook); Brian Anderson (Google); Maximilien Breughe (NVIDIA); Mark Charlebois (Qualcomm); William Chou (Qualcomm); Ramesh Chukka (Intel); Cody Coleman (Stanford University); Sam Davis (Myrtle); Pan Deng (Tencent); Greg Diamos (Landing AI); Jared Duke (Google); Dave Fick (Mythic); J. Scott Gardner (Advantage Engineering); Itay Hubara (Habana Labs); Sachin Idgunji (NVIDIA); Thomas B. Jablin (Google); Jeff Jiao (Alibaba T-Head); Tom St John (Tesla); Pankaj Kanwar (Google); David Lee (Facebook, formerly at MediaTek); Jeffery Liao (OPPO, formerly at Synopsys); Anton Lokhmotov (dividiti); Francisco Massa (Facebook); Peng Meng (Tencent); Paulius Micikevicius (NVIDIA); Colin Osborne (Arm); Gennady Pekhimenko (University of Toronto & Vector Institute); Arun Tejusve Raghunath Rajan (Intel); Dilip Sequeira (NVIDIA); Ashish Sirasao (Xilinx); Fei Sun (Alibaba, formerly at Facebook); Hanlin Tang (Intel); Michael Thomson (Centaur Technology); Frank Wei (Alibaba Cloud); Ephrem Wu (Xilinx); Lingjie Xu (Biren Research, formerly at Alibaba); Koichi Yamada (Intel); Bing Yu (Facebook, formerly at MediaTek); George Yuan (NVIDIA); Aaron Zhong (Alibaba T-Head); Peizhao Zhang (Facebook); Yuchen Zhou (General Motors) |