Resilient linear classification: an approach to deal with attacks on training data

Bibliographic Details
Published in	2017 ACM/IEEE 8th International Conference on Cyber-Physical Systems (ICCPS), pp. 155-164
Main Authors	Park, Sangdon; Weimer, James; Lee, Insup
Format	Conference Proceeding
Language	English
Published	New York, NY, USA: ACM, 18.04.2017
Series	ACM Other Conferences
Subjects	Classification algorithms; Computer systems organization > Embedded and cyber-physical systems; Computing methodologies > Machine learning > Learning paradigms > Supervised learning > Supervised learning by classification; Computing methodologies > Machine learning > Learning settings > Batch learning; Cyber-physical systems; linear classification; Measurement; Resilience; Robot sensing systems; Security; Security and privacy > Software and application security > Domain-specific security and privacy architectures; Training data; training data attacks cyber-physical systems training data attacks linear classification

More Information
Summary:	Data-driven techniques are used in cyber-physical systems (CPS) for controlling autonomous vehicles, handling demand responses for energy management, and modeling human physiology for medical devices. These techniques extract models from training data, and their performance is often analyzed with respect to random errors in that data. However, the effect of training data that has been maliciously altered by attackers on the learning algorithms underpinning data-driven CPS has yet to be considered. In this paper, we analyze the resilience of classification algorithms to training data attacks. Specifically, we propose a generic metric tailored to measure the resilience of classification algorithms with respect to worst-case tampering of the training data. Using the metric, we show that traditional linear classification algorithms are resilient only under restricted conditions. To overcome these limitations, we propose a linear classification algorithm with a majority constraint and prove that it is strictly more resilient than the traditional algorithms. Evaluations on both synthetic data and a real-world retrospective arrhythmia medical case study show that the traditional algorithms are vulnerable to tampered training data, whereas the proposed algorithm is more resilient (as measured by worst-case tampering).
ISBN:	9781450349659, 145034965X
DOI:	10.1145/3055004.3055006