Deep learning for missing value imputation of continuous data and the effect of data discretization

Bibliographic Details
Published in Knowledge-based systems Vol. 239; p. 108079
Main Authors Lin, Wei-Chao, Tsai, Chih-Fong, Zhong, Jia Rong
Format Journal Article
Language English
Published Amsterdam Elsevier B.V 05.03.2022
Elsevier Science Ltd
Abstract Real-world datasets are often incomplete and contain missing attribute values, and many data mining and machine learning techniques cannot handle incomplete datasets directly. Missing value imputation is the main solution: a learning model is constructed to estimate specific values that replace the missing ones. Deep learning techniques have been employed for missing value imputation and have demonstrated their superiority over many other well-known imputation methods. However, very few studies have assessed the imputation performance of deep learning techniques on tabular or structured data with continuous values. Moreover, the effect on the imputation results when the continuous data need to be discretized has never been examined. In this paper, two supervised deep neural networks, multilayer perceptron (MLP) and deep belief network (DBN), are compared for missing value imputation, and two differently ordered combinations of the data discretization and imputation steps are examined. The results show that MLP and DBN significantly outperform baseline imputation methods based on the mean, KNN, CART, and SVM, with DBN performing best. When continuous data must be discretized, the order in which the two steps are combined matters less than the choice of imputation algorithm: the final performance is much better with DBN imputation than with the other imputation methods, regardless of whether discretization is performed in the first or second step.
• Deep learning for imputing missing continuous values of tabular or structured data is studied.
• In particular, multilayer perceptron (MLP) and deep belief networks (DBN) are employed.
• Two different ordered combinations of data discretization and imputation steps are examined.
• MLP and DBN significantly outperform the baseline imputation methods.
• DBN is the better choice for imputation when the discretization of continuous data is required.
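The two orderings of discretization and imputation mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' method: the record does not say which discretizer was used, and the paper's imputers are MLP, DBN, KNN, CART, and SVM. The sketch assumes equal-width binning, mean imputation for continuous values, and mode imputation for already-discretized values; all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import mean


def equal_width_bin(value, lo, hi, n_bins):
    """Map a continuous value to a bin index in [0, n_bins - 1] (assumed discretizer)."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value falls into the last bin


def impute_then_discretize(column, n_bins=3):
    """Order 1: fill missing values (None) with the observed mean, then bin everything."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    filled = [fill if v is None else v for v in column]
    lo, hi = min(observed), max(observed)
    return [equal_width_bin(v, lo, hi, n_bins) for v in filled]


def discretize_then_impute(column, n_bins=3):
    """Order 2: bin the observed values, then fill missing entries with the most frequent bin."""
    observed = [v for v in column if v is not None]
    lo, hi = min(observed), max(observed)
    binned = [None if v is None else equal_width_bin(v, lo, hi, n_bins) for v in column]
    mode_bin = Counter(b for b in binned if b is not None).most_common(1)[0][0]
    return [mode_bin if b is None else b for b in binned]


column = [1.0, 2.0, None, 9.0, 10.0, 1.5]
print(impute_then_discretize(column))   # [0, 0, 1, 2, 2, 0]
print(discretize_then_impute(column))   # [0, 0, 0, 2, 2, 0]
```

In this toy column the missing entry lands in a different bin depending on the order of the two steps. The example only shows what "differently ordered combinations" means mechanically; it says nothing about the performance results reported in the paper.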
ArticleNumber 108079
Author Tsai, Chih-Fong
Lin, Wei-Chao
Zhong, Jia Rong
Author_xml – sequence: 1
  givenname: Wei-Chao
  surname: Lin
  fullname: Lin, Wei-Chao
  organization: Department of Information Management, Chang Gung University, Taoyuan, Taiwan
– sequence: 2
  givenname: Chih-Fong
  surname: Tsai
  fullname: Tsai, Chih-Fong
  email: cftsai@mgt.ncu.edu.tw
  organization: Department of Information Management, National Central University, Zhongli, Taoyuan, Taiwan
– sequence: 3
  givenname: Jia Rong
  surname: Zhong
  fullname: Zhong, Jia Rong
  organization: Department of Information Management, National Central University, Zhongli, Taoyuan, Taiwan
ContentType Journal Article
Copyright 2021 Elsevier B.V.
Copyright Elsevier Science Ltd. Mar 5, 2022
DOI 10.1016/j.knosys.2021.108079
DatabaseName CrossRef
Computer and Information Systems Abstracts
Technology Research Database
Library & Information Sciences Abstracts (LISA)
Library & Information Science Abstracts (LISA)
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Library and Information Science Abstracts (LISA)
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList Technology Research Database

Discipline Computer Science
EISSN 1872-7409
ExternalDocumentID 10_1016_j_knosys_2021_108079
S0950705121011527
ISSN 0950-7051
IsPeerReviewed true
IsScholarly true
Keywords Deep learning
Data science
Missing value imputation
Machine learning
Data discretization
Language English
PublicationCentury 2000
PublicationDate 2022-03-05
PublicationDateYYYYMMDD 2022-03-05
PublicationDate_xml – month: 03
  year: 2022
  text: 2022-03-05
  day: 05
PublicationDecade 2020
PublicationPlace Amsterdam
PublicationPlace_xml – name: Amsterdam
PublicationTitle Knowledge-based systems
PublicationYear 2022
Publisher Elsevier B.V
Elsevier Science Ltd
Publisher_xml – name: Elsevier B.V
– name: Elsevier Science Ltd
StartPage 108079
SubjectTerms Algorithms
Artificial neural networks
Belief networks
Data discretization
Data mining
Data science
Datasets
Deep learning
Discretization
Machine learning
Missing value imputation
Multilayer perceptrons
Structured data
Tables (data)
Title Deep learning for missing value imputation of continuous data and the effect of data discretization
URI https://dx.doi.org/10.1016/j.knosys.2021.108079
https://www.proquest.com/docview/2638773066
Volume 239