NASH: Neural Architecture and Accelerator Search for Multiplication-Reduced Hybrid Models

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems I: Regular Papers, pp. 1-13
Main Authors: Xu, Yang; Shi, Huihong; Wang, Zhongfeng
Format: Journal Article
Language: English
Published: IEEE, 16.09.2024
Summary: The significant computational cost of multiplications hinders the deployment of deep neural networks (DNNs) on edge devices. While multiplication-free models offer enhanced hardware efficiency, they typically sacrifice accuracy. As a solution, multiplication-reduced hybrid models have emerged to combine the benefits of both approaches. In particular, the prior works NASA and NASA-F leverage Neural Architecture Search (NAS) to construct such hybrid models, enhancing hardware efficiency while maintaining accuracy. However, they either entail costly retraining or encounter gradient conflicts, limiting both search efficiency and accuracy. Additionally, they overlook the acceleration opportunity introduced by accelerator search, yielding sub-optimal hardware performance. To overcome these limitations, we propose NASH, a Neural architecture and Accelerator Search framework for multiplication-reduced Hybrid models. For NAS, we propose a tailored zero-shot metric to pre-identify promising hybrid models before training, enhancing search efficiency while alleviating gradient conflicts. For accelerator search, we introduce a coarse-to-fine search to streamline the search process. We then seamlessly integrate these two levels of search to unveil NASH and obtain an optimal model-accelerator pairing. Experiments validate its effectiveness: compared with the state-of-the-art multiplication-based system, NASH achieves ↑2.14× throughput and ↑2.01× FPS with ↑0.25% accuracy on CIFAR-100, and ↑1.40× throughput and ↑1.19× FPS with ↑0.56% accuracy on Tiny-ImageNet. Code is available at https://github.com/xuyang527/NASH.
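The summary names two search components without detail: a zero-shot metric that ranks candidate hybrid models before any training, and a coarse-to-fine sweep over accelerator configurations. The Python sketch below is illustrative only: it assumes a SNIP-style saliency score as a stand-in for the paper's tailored zero-shot metric and a generic two-stage grid search for the accelerator side; the helper names (build_hybrid, evaluate_config, neighborhood) are hypothetical and not from the paper.

import torch
import torch.nn as nn

def zero_shot_score(model: nn.Module, x: torch.Tensor, y: torch.Tensor) -> float:
    # Score an untrained candidate architecture with a SNIP-style
    # saliency proxy (illustrative stand-in, not the NASH metric).
    model.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # Sum |w * dL/dw| over all weights: larger saliency suggests a
    # more trainable candidate, so it ranks higher before training.
    score = 0.0
    for p in model.parameters():
        if p.grad is not None:
            score += (p * p.grad).abs().sum().item()
    return score

def coarse_to_fine(evaluate_config, coarse_grid, neighborhood):
    # Stage 1: cheap sweep over a coarse grid of accelerator configs.
    best = max(coarse_grid, key=evaluate_config)
    # Stage 2: refine only around the coarse winner, shrinking the space.
    return max(neighborhood(best), key=evaluate_config)

# Hypothetical usage: pre-select hybrid models with the zero-shot proxy,
# then pair the best one with an accelerator config from the two-stage sweep.
# models = [build_hybrid(cfg) for cfg in sampled_architectures]
# best_model = max(models, key=lambda m: zero_shot_score(m, x, y))
# best_accel = coarse_to_fine(evaluate_config, coarse_grid, neighborhood)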
ISSN: 1549-8328
DOI: 10.1109/TCSI.2024.3457628