TabFairGAN: Fair Tabular Data Generation with Generative Adversarial Networks
Published in | Machine learning and knowledge extraction Vol. 4; no. 2; pp. 488 - 501 |
---|---|
Main Authors | Rajabi, Amirarsalan; Garibay, Ozlem Ozmen |
Format | Journal Article |
Language | English |
Published | Basel: MDPI AG, 01.06.2022 |
ISSN | 2504-4990 |
DOI | 10.3390/make4020022 |
Abstract | With the increasing reliance on automated decision making, the issue of algorithmic fairness has gained increasing importance. In this paper, we propose a Generative Adversarial Network for tabular data generation. The model includes two phases of training. In the first phase, the model is trained to accurately generate synthetic data similar to the reference dataset. In the second phase, we modify the value function to add a fairness constraint and continue training the network to generate data that is both accurate and fair. We test our results in both the unconstrained and the constrained (fair) data generation cases. We show that, using a fairly simple architecture and applying a quantile transformation to numerical attributes, the model achieves promising performance. In the unconstrained case, i.e., when the model is trained only in the first phase and is only meant to generate accurate data following the same joint probability distribution as the real data, the results show that the model beats the state-of-the-art GANs proposed in the literature for producing synthetic tabular data. In the constrained case, in which the first phase of training is followed by the second phase, we train the network and test it on four datasets studied in the fairness literature, compare our results with another state-of-the-art pre-processing method, and present the promising results that it achieves. Compared to other studies utilizing GANs for fair data generation, our model is more stable because it uses only one critic and avoids major problems of the original GAN model, such as mode-dropping and non-convergence. |
---|---|
Author | Rajabi, Amirarsalan (ORCID 0000-0001-8328-8473); Garibay, Ozlem Ozmen (ORCID 0000-0001-9215-694X) |
Copyright | 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
License | https://creativecommons.org/licenses/by/4.0 |
PageCount | 14 |
SubjectTerms | Accuracy; Algorithms; Artificial intelligence; Automation; Bias; Constraints; Datasets; Decision making; fair data generation; fairness in artificial intelligence; Generative adversarial networks; Machine learning; Methods; Minority & ethnic groups; Random variables; Tables (data); Training |
URI | https://www.proquest.com/docview/2679758126 https://doaj.org/article/c2a944b12a0a4f63b0e164e3cbc1b3f7 |
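The abstract in this record mentions applying a quantile transformation to numerical attributes before training. As a rough illustration only, the sketch below shows one common form of that idea: a rank-based mapping of a skewed numeric column to standard-normal scores, plus an approximate inverse for turning generated scores back into data values. The function names (`fit_quantile_transform`, `to_normal`, `from_normal`) and the choice of a normal target distribution are assumptions made for this example; the record does not specify the authors' exact procedure. A library routine such as scikit-learn's `QuantileTransformer` offers a comparable transform.

```python
from bisect import bisect_left
from statistics import NormalDist


def fit_quantile_transform(values):
    """Return the sorted training values, used as an empirical quantile table."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return sorted(values)


def to_normal(x, table):
    """Map x to a standard-normal score via its empirical rank in `table`."""
    n = len(table)
    rank = bisect_left(table, x)  # count of training values strictly below x
    # Clamp the probability inside (0, 1) so inv_cdf stays finite.
    p = min(max((rank + 0.5) / n, 0.5 / n), 1 - 0.5 / n)
    return NormalDist().inv_cdf(p)


def from_normal(z, table):
    """Approximate inverse: map a normal score back to a training quantile."""
    n = len(table)
    p = NormalDist().cdf(z)
    index = min(max(int(p * n), 0), n - 1)
    return table[index]


# Hypothetical, heavily skewed numeric column (for illustration only).
incomes = [12.0, 15.0, 15.5, 40.0, 250.0]
table = fit_quantile_transform(incomes)
scores = [to_normal(v, table) for v in incomes]
recovered = [from_normal(s, table) for s in scores]
```

The design point is that the transformed column depends only on ranks, so an extreme value such as 250.0 no longer dominates the scale, while the stored quantile table lets generated scores be mapped back to values in the original range.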