Demographic Parity: Mitigating Biases in Real-World Data
Main Authors | , |
---|---|
Format | Journal Article |
Language | English |
Published | 27.09.2023 |
Summary: | Computer-based decision systems are widely used to automate decisions in many aspects of everyday life, including sensitive areas such as hiring, lending, and even criminal sentencing. A decision pipeline relies heavily on large volumes of historical real-world data to train its models. However, historical training data often contain gender, racial, or other biases, which propagate to the trained models and influence computer-based decisions. In this work, we propose a robust methodology that guarantees the removal of unwanted biases while maximally preserving classification utility. Our approach can always achieve this in a model-independent way by deriving from real-world data the asymptotic dataset that uniquely encodes demographic parity and realism. As a proof of principle, we deduce such an asymptotic dataset from public census records, from which synthetic samples can be generated to train well-established classifiers. Benchmarking the generalization capability of these classifiers trained on our synthetic data, we confirm the absence of any explicit or implicit bias in the computer-aided decision. |
---|---|
DOI: | 10.48550/arxiv.2309.17347 |
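
For readers unfamiliar with the criterion named in the summary: demographic parity is commonly stated as requiring that the rate of positive decisions be the same for every group of a protected attribute. The following is a minimal sketch of how that gap could be measured on a classifier's outputs. It is not the paper's method (the asymptotic-dataset construction is not reproduced here), and the function name `demographic_parity_gap` and the toy labels are assumptions made for illustration only.

```python
from collections import defaultdict


def demographic_parity_gap(predictions, groups):
    """Return (gap, rates) for binary predictions.

    rates maps each group label to its share of positive (== 1) predictions;
    gap is the largest difference between any two of those shares.
    A gap of 0 means the predictions satisfy demographic parity exactly.
    Hypothetical helper for illustration, not taken from the paper.
    """
    if len(predictions) != len(groups):
        raise ValueError("predictions and groups must have the same length")
    if len(predictions) == 0:
        raise ValueError("at least one sample is required")

    counts = defaultdict(int)     # samples seen per group
    positives = defaultdict(int)  # positive predictions per group
    for y_hat, group in zip(predictions, groups):
        counts[group] += 1
        if y_hat == 1:
            positives[group] += 1

    rates = {group: positives[group] / counts[group] for group in counts}
    gap = max(rates.values()) - min(rates.values())
    return gap, rates


if __name__ == "__main__":
    # Toy data (made up): group "a" receives 3 of 4 positive decisions,
    # group "b" receives 1 of 4.
    preds = [1, 0, 1, 1, 0, 0, 1, 0]
    attrs = ["a", "a", "a", "a", "b", "b", "b", "b"]
    gap, rates = demographic_parity_gap(preds, attrs)
    print(rates, gap)
```

A check of this kind only covers the explicit parity condition on the chosen attribute; the summary's claim about implicit bias would need the benchmarking described in the paper itself.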