Efficient Multidimensional Suppression for K-Anonymity

Bibliographic Details
Published in IEEE Transactions on Knowledge and Data Engineering, Vol. 22, no. 3, pp. 334-347
Main Authors Kisilevich, S., Rokach, L., Elovici, Y., Shapira, B.
Format Journal Article
Language English
Published New York, NY IEEE 01.03.2010
IEEE Computer Society
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Abstract Many applications of data mining involve data that include private and sensitive information about the subjects. One way to enable effective data mining while preserving privacy is to anonymize the data set before it is released for mining. One way to anonymize a data set is to manipulate its content so that the records adhere to k-anonymity. Two common manipulation techniques used to achieve k-anonymity are generalization and suppression. Generalization replaces a value with a less specific but semantically consistent value, while suppression withholds the value altogether. Generalization is more commonly applied in this domain, since suppression may dramatically reduce the quality of the data mining results if not used properly. However, generalization has a major drawback: it requires a manually generated domain hierarchy taxonomy for every quasi-identifier in the data set on which k-anonymity is to be enforced. In this paper, we propose a new method for achieving k-anonymity named K-anonymity of Classification Trees Using Suppression (kACTUS). kACTUS performs efficient multidimensional suppression, i.e., values are suppressed only in certain records, depending on other attribute values, without the need for manually produced domain hierarchy trees. Thus, kACTUS identifies attributes that have less influence on the classification of the data records and suppresses them if needed in order to comply with k-anonymity. The method was evaluated on 10 separate data sets to compare its accuracy with that of other generalization- and suppression-based k-anonymity methods. Encouraging results suggest that the predictive performance of kACTUS is better than that of existing k-anonymity algorithms. Specifically, on average, the accuracies of TDS, TDR, and kADET are lower than that of kACTUS by 3.5, 3.3, and 1.9 percent, respectively, despite their use of manually defined domain trees. The accuracy gap increases to 5.3, 4.3, and 3.1 percent, respectively, when no domain trees are used.
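
The two ideas the abstract relies on, checking k-anonymity and suppressing a value only in certain records, can be illustrated with a minimal Python sketch. This is not the kACTUS algorithm: kACTUS chooses what to suppress with the help of a classification tree, whereas the sketch below simply suppresses one attribute in every record whose quasi-identifier combination is too rare. The function names, the `*` placeholder, and the toy records are all assumptions made for illustration.

```python
from collections import Counter

SUPPRESSED = "*"  # hypothetical placeholder for a value that is not released


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def is_k_anonymous(records, quasi_identifiers, k):
    """A table is k-anonymous if every quasi-identifier combination occurs at least k times."""
    return all(n >= k for n in group_sizes(records, quasi_identifiers).values())


def suppress_in_small_groups(records, quasi_identifiers, k, attribute):
    """Suppress `attribute` only in records whose group has fewer than k members.

    Records in sufficiently large groups are released unchanged, so the
    suppression is record-specific rather than applied to the whole column.
    """
    sizes = group_sizes(records, quasi_identifiers)
    released = []
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        # Copy the record so the input table is left untouched.
        released.append({**r, attribute: SUPPRESSED} if sizes[key] < k else dict(r))
    return released


# Toy table (invented values): "age" and "zip" are the quasi-identifiers.
QI = ["age", "zip"]
records = [
    {"age": 30, "zip": "A", "label": "yes"},
    {"age": 30, "zip": "A", "label": "no"},
    {"age": 30, "zip": "B", "label": "yes"},
    {"age": 30, "zip": "C", "label": "no"},
]

released = suppress_in_small_groups(records, QI, k=2, attribute="zip")
```

In this toy table the last two records form groups of size one, so only their `zip` values are suppressed; they then share the combination `(30, "*")` and the released table satisfies 2-anonymity. A single pass like this does not guarantee k-anonymity in general, because the suppressed records may still form a group smaller than k.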
Author_xml – sequence: 1
  givenname: S.
  surname: Kisilevich
  fullname: Kisilevich, S.
  organization: Dept. of Comput. & Inf. Sci., Univ. of Konstanz/Germany, Konstanz, Germany
– sequence: 2
  givenname: L.
  surname: Rokach
  fullname: Rokach, L.
  organization: Dept. of Inf. Syst. Eng., Ben-Gurion Univ. of the Negev, Beer-Sheva, Israel
– sequence: 3
  givenname: Y.
  surname: Elovici
  fullname: Elovici, Y.
  organization: Dept. of Inf. Syst. Eng., Ben-Gurion Univ. of the Negev, Beer-Sheva, Israel
– sequence: 4
  givenname: B.
  surname: Shapira
  fullname: Shapira, B.
  organization: Dept. of Inf. Syst. Eng., Ben-Gurion Univ. of the Negev, Beer-Sheva, Israel
CODEN ITKEEH
ContentType Journal Article
Copyright 2015 INIST-CNRS
Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Mar 2010
DOI 10.1109/TKDE.2009.91
Discipline Engineering
Computer Science
Applied Sciences
EISSN 1558-2191
EndPage 347
Genre orig-research
ISSN 1041-4347
IsPeerReviewed true
IsScholarly true
Issue 3
Keywords Private life
Data privacy
Data analysis
Taxonomy
Anonymity
Information extraction
Graph theory
Data mining
Decision tree
Identifier
Data integrity
Semantics
Classification
decision trees
Database
Privacy-preserving data mining
deidentified data
k-anonymity
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
CC BY 4.0
PageCount 14
PublicationDate 2010-03-01
PublicationPlace New York, NY
PublicationTitle IEEE transactions on knowledge and data engineering
PublicationTitleAbbrev TKDE
PublicationYear 2010
Publisher IEEE
IEEE Computer Society
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 334
SubjectTerms Algorithms
Applied sciences
Classification
Classification tree analysis
Computer science; control theory; systems
Data mining
Data privacy
Data processing. List processing. Character string processing
Decision trees
deidentified data
Delta modulation
Diseases
Exact sciences and technology
Hierarchies
Information retrieval. Graph
Information systems. Data bases
k-anonymity
Memory and file management (including protection and security)
Memory organisation. Data processing
Multidimensional systems
National security
Preserving
Privacy-preserving data mining
Releasing
Software
Studies
Taxonomy
Theoretical computing
Trees
Title Efficient Multidimensional Suppression for K-Anonymity
URI https://ieeexplore.ieee.org/document/4840348
https://www.proquest.com/docview/911987584
https://www.proquest.com/docview/753726229
https://www.proquest.com/docview/875022613
Volume 22