A Neighborhood-Similarity-Based Imputation Algorithm for Healthcare Data Sets: A Comparative Study

Bibliographic Details
Published in	Electronics (Basel) Vol. 12; no. 23; p. 4809
Main Authors	Wilcox, Colin; Giagos, Vasileios; Djahel, Soufiene
Format	Journal Article
Language	English
Published	Basel: MDPI AG, 01.12.2023
Subjects	Algorithms; Comparative studies; Datasets; Electronic data processing; Health services; Mathematical functions; Medical care; Medical records; Missing data; Neighborhoods; Similarity; Technology application; United Kingdom; Variables
Summary:	The increasing computerisation of medical services has highlighted inconsistencies in the way in which patients’ historic medical data were recorded. Differences in process and practice between medical services and facilities have led to many incomplete and inaccurate medical histories being recorded. To create a single point of truth going forward, it is necessary to correct these inconsistencies. A common way to do this has been to use imputation techniques to predict missing data values based on the known values in the data set. In this paper, we propose a neighborhood similarity measure-based imputation technique and analyze its achieved prediction accuracy in comparison with a number of traditional imputation methods using both an incomplete anonymized diabetes medical data set and a number of simulated data sets as the sources of our data. The aim is to determine whether any improvement could be made in the accuracy of predicting a diabetes diagnosis using the known outcomes of the diabetes patients’ data set. The obtained results have proven the effectiveness of our proposed approach compared to other state-of-the-art single-pass imputation techniques.
ISSN:	2079-9292
DOI:	10.3390/electronics12234809