CaW-NAS: Compression Aware Neural Architecture Search

Bibliographic Details
Published in	2022 25th Euromicro Conference on Digital System Design (DSD), pp. 391-397
Main Authors	Benmeziane, Hadjer; Ouranoughi, Hamza; Niar, Smail; El Maghraoui, Kaoutar
Format	Conference Proceeding
Language	English
Published	IEEE 01.08.2022
Subjects	Architecture; Buildings; computer vision; Deep learning; Digital systems; Energy consumption; hardware-aware neural architecture search; quantization; Quantization (signal); Search problems

More Information
Summary:	With the ever-growing demand for deep learning (DL) at the edge, building small and efficient DL architectures has become a significant challenge. Optimization techniques such as quantization, pruning, and hardware-aware neural architecture search (HW-NAS) have been proposed. In this paper, we present an efficient HW-NAS method, Compression-Aware Neural Architecture Search (CaW-NAS), that combines the search for the architecture with the search for its quantization policy. Whereas prior works search over a fully quantized search space, we define a search space that contains both quantized and non-quantized architectures. Our search strategy finds the best trade-off between accuracy and latency for the target hardware. Experimental results on a mobile platform show that our method yields networks that are more efficient in terms of accuracy, execution time, and energy consumption than the state of the art.
ISSN:	2771-2508
DOI:	10.1109/DSD57027.2022.00059
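
Illustrative sketch (not from the paper): the summary describes a search space that mixes quantized and non-quantized architectures and a search that trades accuracy against latency. The Python snippet below is one minimal way to picture that idea, assuming an exhaustive enumeration and a Pareto filter. The names Candidate, toy_accuracy, toy_latency_ms, and pareto_front, the search dimensions (depth, width, bits), and both scoring formulas are hypothetical placeholders; the record does not describe the authors' actual search strategy or cost models.

```python
from itertools import product
from typing import List, NamedTuple, Optional, Tuple


class Candidate(NamedTuple):
    depth: int            # number of layers (hypothetical search dimension)
    width: int            # channels per layer (hypothetical search dimension)
    bits: Optional[int]   # None = non-quantized, otherwise the bit width


def toy_accuracy(c: Candidate) -> float:
    # Placeholder proxy, NOT a measured value: larger networks score
    # higher, and quantization costs a small fixed penalty.
    base = 0.60 + 0.03 * c.depth + 0.001 * c.width
    penalty = 0.0 if c.bits is None else 0.02
    return min(base - penalty, 0.99)


def toy_latency_ms(c: Candidate) -> float:
    # Placeholder proxy, NOT a hardware measurement: cost grows with
    # network size, and quantized candidates are assumed to run 2x faster.
    cost = 0.05 * c.depth * c.width
    speedup = 1.0 if c.bits is None else 2.0
    return cost / speedup


Scored = Tuple[Candidate, float, float]


def pareto_front(cands: List[Candidate]) -> List[Scored]:
    """Keep candidates that no other candidate beats on both objectives."""
    scored = [(c, toy_accuracy(c), toy_latency_ms(c)) for c in cands]
    front = []
    for c, acc, lat in scored:
        dominated = any(
            a2 >= acc and l2 <= lat and (a2 > acc or l2 < lat)
            for _, a2, l2 in scored
        )
        if not dominated:
            front.append((c, acc, lat))
    # Sort by latency so the trade-off curve reads from fastest to slowest.
    return sorted(front, key=lambda t: t[2])


# Joint space: every architecture appears both non-quantized and at 8 bits.
SEARCH_SPACE = [
    Candidate(d, w, b)
    for d, w, b in product((2, 4, 6), (16, 32, 64), (None, 8))
]
FRONT = pareto_front(SEARCH_SPACE)

if __name__ == "__main__":
    for cand, acc, lat in FRONT:
        print(cand, f"accuracy={acc:.3f}", f"latency_ms={lat:.2f}")
```

In a real setting the two proxy functions would be replaced by trained-model accuracy and on-device latency measurements, and the exhaustive loop by a proper search algorithm; the sketch only shows why keeping non-quantized candidates in the space can change which designs end up on the accuracy/latency front.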