Do Not Harm Protected Groups in Debiasing Language Representation Models
Format | Journal Article
---|---
Language | English
Published | 27.10.2023
Summary: Language Representation Models (LRMs) trained with real-world data may capture and exacerbate undesired bias and cause unfair treatment of people in various demographic groups. Several techniques have been investigated for applying interventions to LRMs to remove bias in benchmark evaluations on, for example, word embeddings. However, the negative side effects of debiasing interventions are usually not revealed in the downstream tasks. We propose xGAP-DEBIAS, a set of evaluations for assessing the fairness of debiasing. In this work, we examine four debiasing techniques on a real-world text classification task and show that reducing bias comes at the cost of degrading performance for all demographic groups, including those the debiasing techniques aim to protect. We advocate that a debiasing technique should have good downstream performance under the constraint of ensuring no harm to the protected groups.
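The closing criterion, that debiasing must not degrade downstream performance for the groups it is meant to protect, can be operationalized as a simple per-group comparison of task metrics before and after a debiasing intervention. The sketch below is illustrative only and is not the paper's released tooling; the function names, group labels, and toy predictions are assumptions made for the example.

```python
# Minimal sketch of a "no harm to protected groups" check: compare per-group
# downstream accuracy of a model before and after a debiasing intervention,
# and flag protected groups whose accuracy dropped. Illustrative assumptions
# throughout, not the paper's actual evaluation code.

def per_group_accuracy(y_true, y_pred, groups):
    """Compute accuracy separately for each demographic group."""
    totals, correct = {}, {}
    for t, p, g in zip(y_true, y_pred, groups):
        totals[g] = totals.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + (t == p)
    return {g: correct[g] / totals[g] for g in totals}

def no_harm_check(before, after, protected, tolerance=0.0):
    """Return protected groups whose accuracy fell by more than `tolerance`."""
    return {
        g: (before[g], after[g])
        for g in protected
        if after[g] < before[g] - tolerance
    }

if __name__ == "__main__":
    # Toy labels/predictions for a binary text classification task.
    y_true        = [1, 0, 1, 1, 0, 1, 0, 0]
    groups        = ["A", "A", "A", "B", "B", "B", "B", "A"]
    pred_original = [1, 0, 1, 1, 1, 1, 0, 0]   # model before debiasing
    pred_debiased = [1, 0, 0, 1, 1, 0, 0, 0]   # model after debiasing

    before = per_group_accuracy(y_true, pred_original, groups)
    after  = per_group_accuracy(y_true, pred_debiased, groups)
    print("before:", before)
    print("after: ", after)
    print("harmed protected groups:", no_harm_check(before, after, protected=["B"]))
```

In this toy run the debiased model loses accuracy on group "B", so the check reports it as harmed; under the no-harm criterion advocated above, such an intervention would be rejected even if it reduces measured bias.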
DOI: 10.48550/arxiv.2310.18458