A novel data preprocessing method for boosting neural network performance: A case study in osteoporosis prediction


Bibliographic Details
Published in: Information Sciences, Vol. 380, pp. 92-100
Main Authors: Iliou, Theodoros; Anagnostopoulos, Christos-Nikolaos; Stephanakis, Ioannis M.; Anastassopoulos, George
Format: Journal Article
Language: English
Published: Elsevier Inc, 20.02.2017
ISSN: 0020-0255; 1872-6291
DOI: 10.1016/j.ins.2015.10.026

Summary: Data preprocessing methods are used in machine learning classification problems to transform datasets into a form that improves classification performance. In this paper, a novel data preprocessing method is proposed and evaluated on a difficult classification dataset, on which various classifiers achieve an average performance below 50%. The dataset relates to osteoporosis, a bone disease that increases the risk of fracture and is characterized by low bone mineral density and micro-architectural deterioration of bone tissue. The dataset consists of 589 subjects whose diagnosis was based on laboratory and osteal bone densitometry examination. In all cases, thirty-three diagnostic factors for osteoporosis risk prediction were used to categorize subjects into three classes (normal, osteopenia, and osteoporosis). The performance of the proposed multilayer perceptron classifier, in various topologies and with various training parameters, was evaluated using the well-known 10-fold cross-validation method, and the results are reported analytically. The results indicate that the features generated by preprocessing the original dataset significantly improve the accuracy of the resulting classifiers.
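
For readers unfamiliar with the evaluation protocol named in the summary, the following is a minimal sketch of 10-fold cross-validation in plain Python. It is not the authors' code: the labels are synthetic (only the counts of 589 subjects and three classes come from the summary), the function names are hypothetical, and a majority-class baseline stands in for the multilayer perceptron, so the thirty-three features are not modeled here.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def majority_class_fit_predict(train_labels, n_test):
    """Placeholder classifier (assumption): always predicts the most
    common training label. A real study would train a model here."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n_test


def cross_validate(labels, k=10, seed=0):
    """Return the per-fold accuracy of the placeholder classifier."""
    folds = k_fold_indices(len(labels), k, seed)
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        # Train on every sample that is not in the held-out fold.
        train_labels = [labels[j] for j in range(len(labels)) if j not in held_out]
        predictions = majority_class_fit_predict(train_labels, len(test_idx))
        correct = sum(p == labels[j] for p, j in zip(predictions, test_idx))
        accuracies.append(correct / len(test_idx))
    return accuracies


# Synthetic labels only: 589 subjects, three classes, drawn at random.
rng = random.Random(42)
CLASSES = ["normal", "osteopenia", "osteoporosis"]
labels = [rng.choice(CLASSES) for _ in range(589)]

fold_accuracies = cross_validate(labels, k=10)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print(f"mean accuracy over {len(fold_accuracies)} folds: {mean_accuracy:.3f}")
```

Each subject is held out exactly once, and the reported figure is the mean of the ten fold accuracies. Swapping the placeholder for a trained classifier leaves the splitting and averaging logic unchanged.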