TabFairGAN: Fair Tabular Data Generation with Generative Adversarial Networks

Bibliographic Details
Published in Machine Learning and Knowledge Extraction Vol. 4; no. 2; pp. 488-501
Main Authors Rajabi, Amirarsalan; Garibay, Ozlem Ozmen
Format Journal Article
Language English
Published Basel MDPI AG 01.06.2022
ISSN 2504-4990
DOI 10.3390/make4020022

Abstract With the growing reliance on automated decision making, algorithmic fairness has become increasingly important. In this paper, we propose a Generative Adversarial Network for tabular data generation. The model is trained in two phases. In the first phase, it is trained to generate synthetic data that closely resembles the reference dataset. In the second phase, we modify the value function to add a fairness constraint and continue training the network so that the generated data is both accurate and fair. We evaluate the model in both the unconstrained and the constrained (fair) data generation settings. We show that, with a fairly simple architecture and a quantile transformation of the numerical attributes, the model achieves promising performance. In the unconstrained case, i.e., when the model is trained only in the first phase and is meant only to generate accurate data following the joint probability distribution of the real data, the results show that the model outperforms the state-of-the-art GANs proposed in the literature for producing synthetic tabular data. In the constrained case, in which the first phase of training is followed by the second, we train and test the network on four datasets studied in the fairness literature, compare our results with another state-of-the-art pre-processing method, and present the promising results that it achieves. Compared to other studies that use GANs for fair data generation, our model is comparatively more stable because it uses only one critic and avoids major problems of the original GAN model, such as mode-dropping and non-convergence.
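The two-phase objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state which fairness notion is used or how the value function is written, so the code below assumes a demographic-parity gap as the fairness term and a Wasserstein-style critic score as the accuracy term. The names `demographic_parity_gap`, `generator_loss`, and `fairness_weight` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def demographic_parity_gap(outcomes: Sequence[int], protected: Sequence[int]) -> float:
    """Return P(y=1 | s=1) - P(y=1 | s=0) for binary outcomes y and a binary
    protected attribute s. Assumed fairness notion, not taken from the abstract."""
    if len(outcomes) != len(protected):
        raise ValueError("outcomes and protected must have the same length")
    group_1 = [y for y, s in zip(outcomes, protected) if s == 1]
    group_0 = [y for y, s in zip(outcomes, protected) if s == 0]
    if not group_1 or not group_0:
        raise ValueError("both protected groups must be present in the batch")
    return sum(group_1) / len(group_1) - sum(group_0) / len(group_0)


def generator_loss(
    critic_scores: Sequence[float],
    outcomes: Sequence[int],
    protected: Sequence[int],
    phase: int,
    fairness_weight: float = 1.0,
) -> float:
    """Hypothetical generator loss for a batch of generated rows.

    Phase 1: accuracy only (negative mean critic score, Wasserstein-style).
    Phase 2: the same term plus a weighted penalty on the absolute
    demographic-parity gap of the generated batch (assumed form).
    """
    accuracy_term = -sum(critic_scores) / len(critic_scores)
    if phase == 1:
        return accuracy_term
    penalty = abs(demographic_parity_gap(outcomes, protected))
    return accuracy_term + fairness_weight * penalty


# Toy batch of eight generated rows: the critic score, the decision column y,
# and the protected attribute s for each row.
scores = [0.5] * 8
y = [1, 1, 0, 1, 0, 0, 0, 1]
s = [1, 1, 1, 1, 0, 0, 0, 0]

phase1_loss = generator_loss(scores, y, s, phase=1)
phase2_loss = generator_loss(scores, y, s, phase=2, fairness_weight=2.0)
```

In this sketch the only difference between the phases is the added penalty, which mirrors the abstract's description of continuing training with a modified value function. In a real model the penalty would have to be computed from differentiable generator outputs rather than hard 0/1 labels.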
Author Rajabi, Amirarsalan (ORCID 0000-0001-8328-8473)
Garibay, Ozlem Ozmen (ORCID 0000-0001-9215-694X)
Copyright 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
SubjectTerms Accuracy
Algorithms
Artificial intelligence
Automation
Bias
Constraints
Datasets
Decision making
fair data generation
fairness in artificial intelligence
Generative adversarial networks
Machine learning
Methods
Minority & ethnic groups
Random variables
Tables (data)
Training
URI https://www.proquest.com/docview/2679758126
https://doaj.org/article/c2a944b12a0a4f63b0e164e3cbc1b3f7