Efficient Multidimensional Suppression for K-Anonymity
Many applications that employ data mining techniques involve mining data that include private and sensitive information about the subjects. One way to enable effective data mining while preserving privacy is to anonymize the data set that includes private information about subjects before being rele...
Published in | IEEE Transactions on Knowledge and Data Engineering, Vol. 22, no. 3, pp. 334-347 |
---|---|
Main Authors | Kisilevich, S.; Rokach, L.; Elovici, Y.; Shapira, B. |
Format | Journal Article |
Language | English |
Published | New York, NY; IEEE; 01.03.2010; IEEE Computer Society; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Abstract | Many applications that employ data mining techniques involve mining data that include private and sensitive information about the subjects. One way to enable effective data mining while preserving privacy is to anonymize the data set that includes private information about subjects before it is released for data mining. One way to anonymize a data set is to manipulate its content so that the records adhere to k-anonymity. Two common manipulation techniques used to achieve k-anonymity of a data set are generalization and suppression. Generalization refers to replacing a value with a less specific but semantically consistent value, while suppression refers to not releasing a value at all. Generalization is more commonly applied in this domain since suppression may dramatically reduce the quality of the data mining results if not properly used. However, generalization has a major drawback: it requires a manually generated domain hierarchy taxonomy for every quasi-identifier in the data set on which k-anonymity has to be performed. In this paper, we propose a new method for achieving k-anonymity named K-anonymity of Classification Trees Using Suppression (kACTUS). In kACTUS, efficient multidimensional suppression is performed, i.e., values are suppressed only on certain records depending on other attribute values, without the need for manually produced domain hierarchy trees. Thus, in kACTUS, we identify attributes that have less influence on the classification of the data records and suppress them if needed in order to comply with k-anonymity. The kACTUS method was evaluated on 10 separate data sets to assess its accuracy compared to other k-anonymity generalization- and suppression-based methods. Encouraging results suggest that kACTUS' predictive performance is better than that of existing k-anonymity algorithms. Specifically, on average, the accuracies of TDS, TDR, and kADET are lower than that of kACTUS by 3.5, 3.3, and 1.9 percent, respectively, despite their use of manually defined domain trees. The accuracy gap increases to 5.3, 4.3, and 3.1 percent, respectively, when no domain trees are used. |
---|---|
Author | Kisilevich, S. Elovici, Y. Shapira, B. Rokach, L. |
Author_xml | 1. Kisilevich, S. (Dept. of Comput. & Inf. Sci., Univ. of Konstanz/Germany, Konstanz, Germany); 2. Rokach, L. (Dept. of Inf. Syst. Eng., Ben-Gurion Univ. of the Negev, Beer-Sheva, Israel); 3. Elovici, Y. (Dept. of Inf. Syst. Eng., Ben-Gurion Univ. of the Negev, Beer-Sheva, Israel); 4. Shapira, B. (Dept. of Inf. Syst. Eng., Ben-Gurion Univ. of the Negev, Beer-Sheva, Israel) |
BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=22410285$$DView record in Pascal Francis |
CODEN | ITKEEH |
ContentType | Journal Article |
Copyright | 2015 INIST-CNRS Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Mar 2010 |
DOI | 10.1109/TKDE.2009.91 |
Discipline | Engineering Computer Science Applied Sciences |
EISSN | 1558-2191 |
EndPage | 347 |
Genre | orig-research |
ISSN | 1041-4347 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 3 |
Keywords | Private life, Data privacy, Data analysis, Taxonomy, Anonymity, Information extraction, Graph theory, Data mining, Decision tree, Identifier, Data integrity, Semantics, Classification, decision trees, Database, Privacy-preserving data mining, deidentified data, k-anonymity |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html CC BY 4.0 |
PageCount | 14 |
PublicationCentury | 2000 |
PublicationDate | 2010-03-01 |
PublicationDecade | 2010 |
PublicationPlace | New York, NY |
PublicationTitle | IEEE Transactions on Knowledge and Data Engineering |
PublicationTitleAbbrev | TKDE |
PublicationYear | 2010 |
Publisher | IEEE IEEE Computer Society The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
StartPage | 334 |
SubjectTerms | Algorithms Applied sciences Classification Classification tree analysis Computer science; control theory; systems Data mining Data privacy Data processing. List processing. Character string processing Decision trees deindentified data Delta modulation Diseases Exact sciences and technology Hierarchies Information retrieval. Graph Information systems. Data bases k-anonymity Memory and file management (including protection and security) Memory organisation. Data processing Multidimensional systems National security Preserving Privacy-preserving data mining Releasing Software Studies Taxonomy Theoretical computing Trees |
Title | Efficient Multidimensional Suppression for K-Anonymity |
URI | https://ieeexplore.ieee.org/document/4840348 https://www.proquest.com/docview/911987584 https://www.proquest.com/docview/753726229 https://www.proquest.com/docview/875022613 |
Volume | 22 |
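
The abstract describes k-anonymity and suppression that is applied only to certain records. The following Python sketch illustrates those two ideas on a toy table. It is not the kACTUS algorithm: kACTUS derives the attributes to suppress from a classification tree, whereas here the suppression order is simply supplied by the caller. The function names, the `*` marker, and the sample records are all hypothetical, and the greedy loop does not guarantee that the output is k-anonymous, so the result should be checked afterwards.

```python
from collections import Counter

SUPPRESSED = "*"  # hypothetical marker for a value that is not released


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(c >= k for c in counts.values())


def suppress_small_groups(records, quasi_identifiers, k, suppression_order):
    """Greedy illustration of record-level (local) suppression.

    Attributes are visited in `suppression_order` (assumed to run from least
    to most influential on classification). For each attribute, only records
    whose quasi-identifier combination is shared by fewer than k records have
    that attribute suppressed; all other records keep their values.
    """
    result = [dict(r) for r in records]  # work on copies, leave input intact
    for attribute in suppression_order:
        counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in result)
        if all(c >= k for c in counts.values()):
            break  # already k-anonymous, no further suppression needed
        for r in result:
            key = tuple(r[q] for q in quasi_identifiers)
            if counts[key] < k:
                r[attribute] = SUPPRESSED
    return result


# Toy data: "age" and "zip" are quasi-identifiers, "label" is the class.
records = [
    {"age": "30", "zip": "111", "label": "yes"},
    {"age": "30", "zip": "111", "label": "no"},
    {"age": "30", "zip": "222", "label": "yes"},
    {"age": "30", "zip": "333", "label": "no"},
]
anonymized = suppress_small_groups(records, ["age", "zip"], 2, ["zip", "age"])
for row in anonymized:
    print(row)
print(is_k_anonymous(anonymized, ["age", "zip"], 2))
```

In this toy run only the last two records lose their `zip` value, while the first two keep it, which is the sense in which suppression depends on the other attribute values of each record rather than removing a whole column.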