FAE: A Fairness-Aware Ensemble Framework

Bibliographic Details
Published in	2019 IEEE International Conference on Big Data (Big Data), pp. 1375-1380
Main Authors	Iosifidis, Vasileios; Fetahu, Besnik; Ntoutsi, Eirini
Format	Conference Proceeding
Language	English
Published	IEEE 01.12.2019
Subjects	Big Data; Boosting; class imbalance; class overlap; Data models; ensemble learning; fairness-aware classification; group imbalance; Machine learning algorithms; Training; Training data
DOI	10.1109/BigData47090.2019.9006487

Summary:	Automated decision making based on big data and machine learning (ML) algorithms can result in discriminatory decisions against protected groups defined by personal attributes such as gender, race, or sexual orientation. Algorithms designed to discover patterns in big data might not only pick up societal biases encoded in the training data but, even worse, reinforce them, resulting in more severe discrimination. Most fairness-aware machine learning approaches proposed so far focus solely on the pre-, in-, or post-processing step of the machine learning process, that is, on the input data, the learning algorithm, or the derived model, respectively. However, the fairness problem cannot be isolated to a single step of the ML process. Rather, discrimination is often the result of complex interactions between big data and algorithms, so a more holistic approach is required. The proposed FAE (Fairness-Aware Ensemble) framework combines fairness-related interventions at both the pre- and post-processing steps of the data analysis process. In the pre-processing step, we tackle the under-representation of the protected group (group imbalance) and class imbalance by generating balanced training samples. In the post-processing step, we tackle class overlap by shifting the decision boundary in the direction of fairness.
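The two interventions named in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not specify the sampling scheme or the fairness criterion, so the sketch assumes equal-sized draws from every (protected, label) cell for the balanced samples, and parity of positive prediction rates as the target of the boundary shift. The function names `balanced_bags` and `shifted_threshold` are hypothetical.

```python
import random
from collections import defaultdict


def balanced_bags(rows, n_bags, seed=0):
    """Build training samples that are balanced by group and by class.

    rows: list of (features, protected, label) tuples.
    Assumption (not from the source): each bag draws the same number of
    rows, without replacement, from every non-empty (protected, label) cell.
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for row in rows:
        cells[(row[1], row[2])].append(row)
    k = min(len(cell) for cell in cells.values())  # size of the smallest cell
    return [
        [row for cell in cells.values() for row in rng.sample(cell, k)]
        for _ in range(n_bags)
    ]


def shifted_threshold(scores, protected, base=0.5):
    """Return a lowered decision threshold for the protected group.

    Assumption (not from the source): the threshold is lowered until the
    protected group's positive prediction rate reaches that of the
    non-protected group, which keeps the default threshold `base`.
    """
    prot = sorted((s for s, p in zip(scores, protected) if p), reverse=True)
    rest = [s for s, p in zip(scores, protected) if not p]
    if not prot or not rest:
        return base
    rest_pos = sum(s >= base for s in rest)
    # Ceiling division in integers: positives the protected group needs.
    need = (rest_pos * len(prot) + len(rest) - 1) // len(rest)
    have = sum(s >= base for s in prot)
    if need <= have:
        return base  # protected group is already at or above the target rate
    return prot[need - 1]  # score of the last instance that must be accepted
```

An ensemble would train one base learner per bag and then apply the shifted threshold to the combined scores of protected-group instances. Tied scores at the returned threshold may admit slightly more positives than strictly required.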