Breaking Fair Binary Classification with Optimal Flipping Attacks

Bibliographic Details
Published in	2022 IEEE International Symposium on Information Theory (ISIT), pp. 1453-1458
Main Authors	Jo, Changhun; Sohn, Jy-Yong; Lee, Kangwook
Format	Conference Proceeding
Language	English
Published	IEEE 26.06.2022
Subjects	Classification algorithms; Collaborative work; Computational efficiency; Computational modeling; Perturbation methods; Training; Upper bound

More Information
Summary:	Minimizing risk subject to fairness constraints is a popular approach to learning a fair classifier. Recent work has shown that this approach yields an unfair classifier if the training set is corrupted. In this work, we study the minimum amount of data corruption required for a successful flipping attack. First, we find lower and upper bounds on this quantity and show that these bounds are tight when the target model is the unique unconstrained risk minimizer. Second, we propose a computationally efficient data poisoning attack algorithm that can compromise the performance of fair learning algorithms.
ISSN:	2157-8117
DOI:	10.1109/ISIT50566.2022.9834475