Privacy-preserving predictive modeling for early detection of chronic kidney disease

Bibliographic Details
Published in: Network modeling and analysis in health informatics and bioinformatics (Wien) Vol. 13; no. 1; p. 16
Main Authors: Gogoi, Prokash; Valan, J. Arul
Format: Journal Article
Language: English
Published: Vienna: Springer Vienna, 06.04.2024
Springer Nature B.V
Summary: With the rapid growth of machine learning, privacy protection has become a major concern in medical science. Sensitive medical information, such as chronic kidney disease (CKD) data, is subject to privacy protections and cannot be shared unless patients’ confidentiality and data security are guaranteed. This study aims to design a secure and effective methodology for developing a predictive model for the early detection of CKD. We addressed two major problems: accurate diagnosis of CKD and data security. The workflow is divided into two stages. In the first stage, to shorten processing time and improve model performance, we applied two meta-heuristic algorithms, the genetic algorithm and the bat algorithm, as feature selection methods to identify the most relevant features for accurate CKD diagnosis. In the second stage, we proposed a privacy-preserving, logistic regression-based CKD inference model that performs predictive analysis on encrypted CKD data. The model uses the Paillier homomorphic cryptosystem, which provides strong security assurances for the communication and processing of medical data and so maintains patients’ confidentiality during diagnosis. With genetic algorithm feature selection, the proposed model achieves an accuracy of 98.75%, higher than with the original features or the bat algorithm-selected features, and it also requires less computation time than either. To confirm the efficiency of the proposed model, we compared its performance with that obtained by applying logistic regression to plaintext data. In all cases, the performance on encrypted data matched the performance on plaintext data.
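
The second stage described in the summary, logistic regression inference over Paillier-encrypted data, can be illustrated with a minimal sketch. This is not the authors' implementation: the protocol split (a client encrypts features, a server holding plaintext weights computes the encrypted linear score, the client decrypts and applies the sigmoid), the fixed-point scale `SCALE`, the toy key built from two Mersenne primes, and all function names are assumptions made for illustration. The key is far too weak for real use.

```python
import math
import secrets

# Toy key: two known Mersenne primes (NOT secure; illustration only).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
N_SQ = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)   # valid because the generator is g = N + 1
SCALE = 1000           # assumed fixed-point scale for features and weights


def encrypt(m: int) -> int:
    """Paillier encryption with g = N + 1: c = (1 + m*N) * r^N mod N^2."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) * pow(r, N, N_SQ) % N_SQ


def decrypt(c: int) -> int:
    """Recover the signed plaintext; values above N/2 decode as negatives."""
    m = (pow(c, PHI, N_SQ) - 1) // N * MU % N
    return m - N if m > N // 2 else m


def encrypted_score(enc_x, weights, bias):
    """Server side: compute Enc(w.x + b) without decrypting any feature."""
    acc = encrypt(round(bias * SCALE * SCALE))
    for c, w in zip(enc_x, weights):
        k = round(w * SCALE) % N             # negative weights wrap mod N
        acc = acc * pow(c, k, N_SQ) % N_SQ   # Enc(a) * Enc(x)^k = Enc(a + k*x)
    return acc


def predict(features, weights, bias):
    """Round trip: client encrypts, server scores, client decrypts."""
    enc_x = [encrypt(round(v * SCALE)) for v in features]
    enc_z = encrypted_score(enc_x, weights, bias)
    z = decrypt(enc_z) / (SCALE * SCALE)     # undo the two fixed-point scalings
    return 1.0 / (1.0 + math.exp(-z))        # sigmoid applied in plaintext


if __name__ == "__main__":
    # Hypothetical feature values and weights, chosen only for the demo.
    print(predict([1.2, 0.5, 3.0], [0.8, -1.5, 0.25], -0.4))
```

Because Paillier is additively homomorphic, the weighted sum is computed exactly on the fixed-point integers, so the decrypted score equals the plaintext score up to the rounding introduced by `SCALE`. That is consistent with the summary's observation that encrypted and plaintext performance matched.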
ISSN: 2192-6670, 2192-6662
DOI: 10.1007/s13721-024-00452-7