Graceful CNN Model Degradation in Uncorrected Flash Storage for Embedded Edge Devices

Bibliographic Details
Published in	ACM Transactions on Storage
Main Authors	Chen, Hung-Yi; Chang, Jin-Wei; Lin, Hong-Ruei; Chang, Li-Pin
Format	Journal Article
Language	English
Published	07.07.2025
ISSN	1553-3077 1553-3093
DOI	10.1145/3747298

Summary:	Computing near the data source has proven effective for conserving energy, reducing latency, and preserving privacy. In this context, edge intelligence refers to running CNN inference locally on embedded edge devices. Because edge devices are highly resource-constrained, in practice they store CNN models in external flash memory and load them, in whole or in part, at inference time. However, flash memory is subject to time-related retention errors, which creates a dilemma: without error correction, a CNN model stored in flash quickly deteriorates and becomes useless, whereas strong error correction requires extra read sensing and read retries. Because of these reliability concerns, current embedded serial flash product lines offer only low-density solutions. In this study, we investigate graceful degradation of CNN models in uncorrected high-density flash storage for edge intelligence. Inspired by the observation that, in popular CNN models, the majority of weight parameters share a few monotonic bit patterns in their high-order bits, we propose mapping the frequent bit patterns to low flash-cell voltage levels to protect them against retention errors. Furthermore, because CNN layers differ in their tolerance to bit errors, we also propose using adaptive cell-bit density to provide non-uniform error protection across layers. We conducted experiments on popular CNN designs, including VGG16, ResNet50, and InceptionV3, under realistic flash-aging effects. The results show that with prior methods, the inference accuracy of ResNet50 degrades to 70.7% in just three months; by contrast, with our approach, the degradation is less than 1% and the accuracy remains at 91.6% after one year of retention.
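The pattern-to-voltage mapping idea in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation; it only shows the general shape of a frequency-ranked mapping. Everything specific here is an assumption made for illustration: weights are stored as big-endian IEEE 754 float32, only the most significant byte of each weight is inspected, each flash cell holds 2 bits (four voltage levels), and the function names `high_order_patterns` and `build_level_map` are hypothetical. Real flash devices also use Gray coding and page-level bit layouts, which this sketch ignores.

```python
import struct
from collections import Counter

BITS_PER_CELL = 2  # assumption: MLC-style cell with 4 voltage levels


def high_order_patterns(weights, bits_per_cell=BITS_PER_CELL):
    """Yield the cell-sized bit groups found in the top byte of each float32 weight."""
    mask = (1 << bits_per_cell) - 1
    for w in weights:
        # Reinterpret the float32 value as a 32-bit unsigned integer.
        (raw,) = struct.unpack(">I", struct.pack(">f", w))
        top_byte = raw >> 24  # sign bit plus the upper 7 exponent bits
        # Walk the byte from its most significant group to its least.
        for shift in range(8 - bits_per_cell, -1, -bits_per_cell):
            yield (top_byte >> shift) & mask


def build_level_map(weights, bits_per_cell=BITS_PER_CELL):
    """Map each bit pattern to a voltage level; frequent patterns get low levels."""
    counts = Counter(high_order_patterns(weights, bits_per_cell))
    all_patterns = range(1 << bits_per_cell)
    # Most frequent first; ties are broken by pattern value for determinism.
    ranked = sorted(all_patterns, key=lambda p: (-counts[p], p))
    return {pattern: level for level, pattern in enumerate(ranked)}


# Hypothetical small-magnitude weights, typical of trained CNN layers.
example_weights = [0.01, -0.02, 0.05, 0.003]
level_map = build_level_map(example_weights)
```

Under these assumptions, `level_map` assigns level 0 (the lowest voltage, taken here to be the state least affected by charge loss) to the pattern that occurs most often in the high-order bits, so the bits that matter most for the weight's magnitude sit in the most retention-tolerant cell states.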