STUDY OF BIG DATA BASED PROBLEMS FOR DATA ANONYMIZATION

Bibliographic Details
Published in Information Management and Computer Science Vol. 6; no. 1; pp. 17-21
Main Author Singh, Monika
Format Journal Article
Language English
Published 2023
Summary: We studied several techniques for defeating typical attacks on anonymized data, such as the similarity attack and the probabilistic inference attack. Most of these techniques propose a privacy-preserving distributed framework that anonymizes a data set and disseminates it in a distributed environment without endangering data privacy. A better balance between privacy and data utility can be achieved, and the data utility is demonstrated in terms of conventional measures. Several classifiers are also applied to the privacy-preserved data set to measure the utility of the data. Data privacy is a crucial requirement when sharing and processing data in a distributed setting or with the Internet of Things. Collaborative privacy-preserving data mining based on secure multiparty computation incurs high communication and computational costs. Data anonymization, a promising technology in the field of privacy-preserving data mining, protects data against identity disclosure. Anonymization faces significant difficulties, including information loss and the attacks that can be mounted on the anonymized data. Recently, data anonymization that uses data mining techniques has demonstrated a considerable increase in the value of the data. Still, the methods currently in use are ineffective against these attacks. This study therefore proposes a clustering-based anonymization approach that is resistant to similarity attacks and inference-based attacks. The anonymized data is distributed on the Hadoop Distributed File System. The technique achieves a better balance between utility and privacy.
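The summary does not give the clustering algorithm itself, so the following is only a minimal sketch of the general idea behind clustering-based anonymization, not the author's method. It assumes numeric quasi-identifiers, a minimum group size `k` (a hypothetical parameter), and a simple sort-and-chunk grouping standing in for a real clustering step. Every record is replaced by the centroid of its group, so no record is distinguishable from at least `k - 1` others. It does not by itself defend against similarity or inference attacks, which would also require constraints on the sensitive attribute within each group.

```python
from statistics import mean


def anonymize(records, k):
    """Replace each record with the centroid of a group of at least k records.

    records: list of equal-length tuples of numeric quasi-identifiers.
    k: minimum group size (illustrative parameter, not taken from the source).
    Returns a list of centroid tuples in the same order as the input.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")

    # Stand-in for clustering: sort record indices so that similar
    # records end up adjacent, then cut the ordering into chunks of k.
    order = sorted(range(len(records)), key=lambda i: records[i])
    groups = [order[i:i + k] for i in range(0, len(order), k)]

    # A trailing chunk smaller than k would break the size guarantee,
    # so fold it into the previous group.
    if len(groups) > 1 and len(groups[-1]) < k:
        groups[-2].extend(groups.pop())

    width = len(records[0])
    anonymized = [None] * len(records)
    for group in groups:
        centroid = tuple(
            mean(records[i][d] for i in group) for d in range(width)
        )
        for i in group:
            anonymized[i] = centroid
    return anonymized


if __name__ == "__main__":
    # Hypothetical (age, income) records, for illustration only.
    data = [(25, 50000), (27, 52000), (45, 90000), (47, 88000), (26, 51000)]
    for original, released in zip(data, anonymize(data, k=2)):
        print(original, "->", released)
```

Larger values of `k` give stronger indistinguishability at the cost of more information loss, which is the privacy and utility trade-off the summary refers to.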
ISSN:2616-5961
DOI:10.26480/imcs.01.2023.17.21