Ten Lessons From Three Generations Shaped Google's TPUv4i: Industrial Product

Bibliographic Details
Published in: Proceedings - International Symposium on Computer Architecture, pp. 1-14
Main Authors: Jouppi, Norman P.; Yoon, Doe Hyun; Ashcraft, Matthew; Gottscho, Mark; Jablin, Thomas B.; Kurian, George; Laudon, James; Li, Sheng; Ma, Peter; Ma, Xiaoyu; Norrie, Thomas; Patil, Nishant; Prasad, Sushma; Young, Cliff; Zhou, Zongwei; Patterson, David
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2021
ISSN: 2575-713X
DOI: 10.1109/ISCA52012.2021.00010

Summary: Google deployed several TPU generations since 2015, teaching us lessons that changed our views: semiconductor technology advances unequally; compiler compatibility trumps binary compatibility, especially for VLIW domain-specific architectures (DSA); target total cost of ownership vs initial cost; support multi-tenancy; deep neural networks (DNN) grow 1.5X annually; DNN advances evolve workloads; some inference tasks require floating point; inference DSAs need air-cooling; apps limit latency, not batch size; and backwards ML compatibility helps deploy DNNs quickly. These lessons molded TPUv4i, an inference DSA deployed since 2020.