Defect Mechanisms Responsible for Silent Data Errors

Bibliographic Details
Published in 2024 IEEE International Reliability Physics Symposium (IRPS), pp. 1-5
Main Authors Shamsa, Manu; Lerner, David
Format Conference Proceeding
Language English
Published IEEE 14.04.2024
Summary: As the scale of silicon integration increases, and as System-on-Chip (SoC) devices are installed in datacenters in ever larger numbers, silicon faults that are undetected by the machine check architecture must be tightly managed [1]. While many undetected faults will result in a work stoppage through an application crash or a detected uncorrected error (DUE), those that manifest as silent data errors (SDE) are of greater concern because they may cause data loss or data corruption [2]. Intel has analyzed devices that exhibit SDE to better understand the underlying physical defect mechanisms. This paper reports the first detailed defect characterization study regarding the types of defects that lead to SDE events.
ISSN: 1938-1891
DOI: 10.1109/IRPS48228.2024.10529392