Efficient algorithms for fair clustering with a new notion of fairness

Bibliographic Details
Published in: Data Mining and Knowledge Discovery, Vol. 37, no. 5, pp. 1959-1997
Main Authors: Gupta, Shivam; Ghalme, Ganesh; Krishnan, Narayanan C.; Jain, Shweta
Format: Journal Article
Language: English
Published: New York: Springer US, 01.09.2023; Springer Nature B.V
Summary: We revisit the problem of fair clustering, first introduced by Chierichetti et al. (Fair clustering through fairlets, 2017), which requires each protected attribute to have approximately equal representation in every cluster, i.e., a Balance property. Existing solutions to fair clustering are either not scalable or do not achieve an optimal trade-off between clustering objectives and fairness. In this paper, we propose a new notion of fairness, which we call τ-ratio fairness, that strictly generalizes the Balance property and enables a fine-grained efficiency vs. fairness trade-off. Furthermore, we show that a simple greedy round-robin-based algorithm achieves this trade-off efficiently. Under a more general setting of multi-valued protected attributes, we rigorously analyze the theoretical properties of the proposed algorithm, the Fair Round-Robin Algorithm for Clustering Over-End (FRAC_OE). We also propose a heuristic algorithm, Fair Round-Robin Algorithm for Clustering (FRAC), that applies round-robin allocation at each iteration of a vanilla clustering algorithm. Our experimental results suggest that both FRAC and FRAC_OE outperform all the state-of-the-art algorithms and work exceptionally well even for a large number of clusters.
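To make the round-robin idea in the summary concrete, here is a minimal Python sketch. It is not the authors' algorithm, and the summary does not give the formal definition of τ-ratio fairness. The sketch assumes, purely for illustration, that fairness means each cluster must hold at least floor(τ · n_g) points of every protected group g (with τ at most 1/k for k clusters), and that the cluster centers are already fixed by an ordinary clustering run. The function names, the quota rule, and the turn order are all hypothetical.

```python
from math import floor


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def round_robin_assign(points, groups, centers, tau):
    """Assign points to fixed centers under an ASSUMED per-group quota.

    For each protected group, clusters take turns (in index order) picking
    their nearest unassigned point of that group until every cluster holds
    floor(tau * n_group) such points. Leftover points go to their nearest
    center. Assumes tau <= 1 / len(centers) so the quotas can be met.
    """
    k = len(centers)
    assignment = [None] * len(points)
    for g in sorted(set(groups)):
        remaining = [i for i, gi in enumerate(groups) if gi == g]
        quota = floor(tau * len(remaining))  # assumed fairness quota
        for _ in range(quota):
            for c in range(k):
                # Cluster c greedily takes its closest remaining point.
                best = min(remaining, key=lambda i: sq_dist(points[i], centers[c]))
                assignment[best] = c
                remaining.remove(best)
        for i in remaining:
            # No quota left to fill: fall back to the nearest center.
            assignment[i] = min(range(k), key=lambda c: sq_dist(points[i], centers[c]))
    return assignment


# Toy 1-D data: two groups, two fixed centers (illustrative values only).
points = [(0.0,), (1.0,), (9.0,), (10.0,), (0.5,), (1.5,), (2.0,), (9.5,)]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
centers = [(0.0,), (10.0,)]

print(round_robin_assign(points, groups, centers, tau=0.0))
print(round_robin_assign(points, groups, centers, tau=0.5))
```

With τ = 0 the sketch reduces to plain nearest-center assignment. With τ = 1/k each cluster receives an equal share of every group, which is the Balance-style extreme. Intermediate values trade clustering cost against representation, which is the kind of trade-off the summary refers to.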
ISSN: 1384-5810, 1573-756X
DOI: 10.1007/s10618-023-00928-6