Graceful CNN Model Degradation in Uncorrected Flash Storage for Embedded Edge Devices

Bibliographic Details
Published in: ACM Transactions on Storage
Main Authors: Chen, Hung-Yi; Chang, Jin-Wei; Lin, Hong-Ruei; Chang, Li-Pin
Format: Journal Article
Language: English
Published: 07.07.2025
ISSN: 1553-3077, 1553-3093
DOI: 10.1145/3747298

Summary: Computing near the source of data has proven effective for energy conservation, latency reduction, and privacy preservation. In this context, edge intelligence refers to local CNN inference on embedded edge devices. Because edge devices are highly resource-constrained, in practice they store CNN models in external flash memory and load them, in whole or in part, at inference time. However, flash memory is subject to time-related retention errors, which creates a dilemma: without error correction, a CNN model in flash memory quickly deteriorates and becomes useless, whereas strong error correction involves extra read sensing and read retries. Because of these reliability concerns, embedded serial flash memory product lines offer only low-density solutions. In this study, we investigate graceful degradation of CNN models in uncorrected high-density flash storage for edge intelligence. Inspired by the observation that in popular CNN models the majority of weight parameters share a few monotonic bit patterns in their high-order bits, we propose mapping the frequent bit patterns to low flash-cell voltage levels for protection against retention errors. Furthermore, because CNN layers show different levels of tolerance to bit errors, we also propose using adaptive cell-bit density to provide non-uniform error protection among layers. We conducted experiments on popular CNN designs, including VGG16, ResNet50, and InceptionV3, under realistic effects of flash aging. Results show that with prior methods, the inference accuracy of ResNet50 degrades to 70.7% in just three months; by contrast, with our approach, the degradation is less than 1% and the accuracy remains at 91.6% after one year of retention.
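
The idea of mapping frequent high-order bit patterns to low cell voltage levels can be illustrated with a minimal Python sketch. This is not the authors' encoding: it assumes float32 weights, a hypothetical 3-bit (8-level) cell, and a simple frequency ranking in which the most common pattern of the top three bits is assigned to level 0. The function names `high_bits` and `build_level_map` are invented for illustration.

```python
import struct
from collections import Counter


def high_bits(weight: float, n_bits: int = 3) -> int:
    """Return the top n_bits of the IEEE 754 float32 encoding of weight."""
    (raw,) = struct.unpack(">I", struct.pack(">f", weight))
    return raw >> (32 - n_bits)


def build_level_map(weights, n_bits: int = 3) -> dict:
    """Assign each n_bits pattern a cell level; frequent patterns get low levels.

    Assumption: level 0 is the lowest-voltage state and therefore the one
    best protected against retention errors.
    """
    counts = Counter(high_bits(w, n_bits) for w in weights)
    # Rank all possible patterns by descending frequency; break ties by
    # pattern value so the mapping is deterministic.
    ranked = sorted(range(1 << n_bits), key=lambda p: (-counts[p], p))
    return {pattern: level for level, pattern in enumerate(ranked)}


# Hypothetical sample of small weights, as is typical for trained CNN layers.
weights = [0.02, -0.01, 0.05, 0.003, -0.04, 0.5, -0.007, 0.01]
level_map = build_level_map(weights)
```

In this sample, every weight has a magnitude below 2, so the top three bits depend only on the sign: the positive weights share one pattern and the negative weights share another, and those two patterns receive the two lowest levels.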