Performance modeling on DaVinci AI core

Bibliographic Details
Published in	Journal of parallel and distributed computing Vol. 175; pp. 134–149
Main Authors	Tang, Yifeng; Wang, Cho-li
Format	Journal Article
Language	English
Published	Elsevier Inc 01.05.2023
Subjects	AI processors; Benchmarking; Performance modeling

Summary:	The extensive use of Deep Neural Networks (DNNs) has encouraged the design of domain-specific hardware called Artificial Intelligence (AI) processors. This novel hardware makes optimization challenging without a proper performance model that reveals working details and performance implications. This paper presents Verrocchio, a performance model for the Huawei DaVinci AI Core that predicts the execution time of real-world DaVinci kernels. We propose specially crafted micro-benchmarks to identify contention sources, runtime behaviors, and bandwidth sharing, which significantly affect performance. Because the DaVinci Core adopts a binary semaphore mechanism for synchronization, Verrocchio views each instruction as a discrete event and manages its execution time based on the programming logic. In evaluation, Verrocchio achieves average error rates of 2.62% and 2.30% on sample kernels for single-core and double-core execution, respectively. We demonstrate an optimization process for matrix multiplications with Verrocchio, achieving speedups of 1.70× for operators and 1.53× for applications, with error rates of 5.06% and 5.25%, respectively.
• Detailed dissections of Huawei DaVinci AI Core, a novel AI processor.
• Benchmarking the DaVinci Core bandwidth contention, the key performance factor.
• Performance model for accurate execution time prediction of kernel programs.
• Demonstration of DaVinci kernel optimization and prediction accuracy evaluation.
ISSN:	0743-7315 1096-0848
DOI:	10.1016/j.jpdc.2023.01.008
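
Illustrative note: the summary describes a model that treats each instruction as a discrete event and synchronizes pipelines through binary semaphores. The toy sketch below shows that general idea only. It is not Verrocchio and does not reflect the paper's actual implementation. The pipe names ("mte", "cube"), the flag names, the cycle counts, and the `Instr` and `simulate` helpers are all hypothetical, and the sketch assumes each flag is consumed before it is set again.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    pipe: str                        # hypothetical pipeline name, e.g. "mte" or "cube"
    cycles: int                      # assumed fixed execution time of the instruction
    wait_flag: Optional[str] = None  # flag that must be set before the instruction starts
    set_flag: Optional[str] = None   # flag set when the instruction finishes


def simulate(program):
    """Return the finish time (in cycles) of the last instruction.

    Each pipe runs its own instructions in program order. An instruction
    starts once its pipe is free and, if it waits on a flag, once that
    flag has been set. Waiting consumes the flag (binary semaphore).
    """
    queues = {}
    for ins in program:
        queues.setdefault(ins.pipe, []).append(ins)

    pipe_free = {pipe: 0 for pipe in queues}  # time at which each pipe is idle again
    flag_set_at = {}                          # flag name -> time it was set
    makespan = 0

    while any(queues.values()):
        progressed = False
        for pipe, queue in queues.items():
            if not queue:
                continue
            ins = queue[0]
            if ins.wait_flag is not None and ins.wait_flag not in flag_set_at:
                continue  # blocked until another pipe sets the flag
            start = pipe_free[pipe]
            if ins.wait_flag is not None:
                # Consume the flag; the instruction cannot start before it was set.
                start = max(start, flag_set_at.pop(ins.wait_flag))
            end = start + ins.cycles
            if ins.set_flag is not None:
                if ins.set_flag in flag_set_at:
                    raise RuntimeError(f"flag {ins.set_flag!r} set twice before being consumed")
                flag_set_at[ins.set_flag] = end
            pipe_free[pipe] = end
            makespan = max(makespan, end)
            queue.pop(0)
            progressed = True
        if not progressed:
            raise RuntimeError("deadlock: every pipe waits on a flag that is never set")
    return makespan


# Two data moves feed two compute steps; the second move overlaps the first compute.
program = [
    Instr("mte", 100, set_flag="tile0_ready"),
    Instr("mte", 100, set_flag="tile1_ready"),
    Instr("cube", 50, wait_flag="tile0_ready"),
    Instr("cube", 50, wait_flag="tile1_ready"),
]
print(simulate(program))
```

In this made-up program the second compute step cannot start until the second data move finishes, so the predicted total is bounded by the data-movement pipe rather than by the compute pipe. A real model would additionally need measured instruction latencies and bandwidth-contention effects, which the summary says the paper obtains from micro-benchmarks.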