Deep learning for missing value imputation of continuous data and the effect of data discretization
Published in | Knowledge-based systems Vol. 239; p. 108079 |
---|---|
Main Authors | Lin, Wei-Chao; Tsai, Chih-Fong; Zhong, Jia Rong |
Format | Journal Article |
Language | English |
Published | Amsterdam; Elsevier B.V; 05.03.2022; Elsevier Science Ltd |
Abstract | Often real-world datasets are incomplete and contain some missing attribute values. Furthermore, many data mining and machine learning techniques cannot directly handle incomplete datasets. Missing value imputation is the major solution for constructing a learning model to estimate specific values to replace the missing ones. Deep learning techniques have been employed for missing value imputation and demonstrated their superiority over many other well-known imputation methods. However, very few studies have attempted to assess the imputation performance of deep learning techniques for tabular or structured data with continuous values. Moreover, the effect on the imputation results when the continuous data need to be discretized has never been examined. In this paper, two supervised deep neural networks, i.e., multilayer perceptron (MLP) and deep belief networks (DBN), are compared for missing value imputation. Moreover, two differently ordered combinations of data discretization and imputation steps are examined. The results show that MLP and DBN significantly outperform the baseline imputation methods based on the mean, KNN, CART, and SVM, with DBN performing the best. On the other hand, when considering the discretization of continuous data, the order in which the two steps are combined is not the most important, but rather, the chosen imputation algorithm. That is, the final performance is much better when using DBN for imputation, regardless of whether discretization is performed in the first or second step, than the other imputation methods.
• Deep learning for imputing missing continuous values of tabular or structured data is studied. • In particular, multilayer perceptron (MLP) and deep belief networks (DBN) are employed. • Two different ordered combinations of data discretization and imputation steps are examined. • MLP and DBN significantly outperform the baseline imputation methods. • DBN is the better choice for imputation when the discretization of continuous data is required. |
---|---|
ArticleNumber | 108079 |
Author | Tsai, Chih-Fong; Lin, Wei-Chao; Zhong, Jia Rong |
Author_xml | – sequence: 1 givenname: Wei-Chao surname: Lin fullname: Lin, Wei-Chao organization: Department of Information Management, Chang Gung University, Taoyuan, Taiwan – sequence: 2 givenname: Chih-Fong surname: Tsai fullname: Tsai, Chih-Fong email: cftsai@mgt.ncu.edu.tw organization: Department of Information Management, National Central University, Zhongli, Taoyuan, Taiwan – sequence: 3 givenname: Jia Rong surname: Zhong fullname: Zhong, Jia Rong organization: Department of Information Management, National Central University, Zhongli, Taoyuan, Taiwan |
CitedBy_id | crossref_primary_10_1016_j_dajour_2023_100341 crossref_primary_10_1007_s42835_024_01827_6 crossref_primary_10_1109_ACCESS_2022_3218067 crossref_primary_10_1109_JIOT_2023_3305006 crossref_primary_10_1016_j_knosys_2022_109440 crossref_primary_10_1007_s10115_024_02159_7 crossref_primary_10_1109_ACCESS_2024_3357533 crossref_primary_10_54525_tbbmd_1167316 crossref_primary_10_1016_j_eswa_2022_117298 crossref_primary_10_1016_j_asoc_2023_110163 crossref_primary_10_1029_2021WR030827 crossref_primary_10_1016_j_fss_2023_108683 crossref_primary_10_1016_j_asoc_2022_109273 crossref_primary_10_1016_j_knosys_2023_111171 crossref_primary_10_1016_j_istruc_2023_105277 crossref_primary_10_1061_JPSEA2_PSENG_1486 crossref_primary_10_1016_j_engappai_2023_107285 crossref_primary_10_1016_j_eswa_2023_122307 crossref_primary_10_3390_agriculture13091718 crossref_primary_10_1016_j_compbiomed_2022_106097 crossref_primary_10_1016_j_envres_2023_115549 crossref_primary_10_32628_IJSRST52411130 crossref_primary_10_3390_agriculture13051015 crossref_primary_10_3390_su151712790 crossref_primary_10_3390_pr11061594 crossref_primary_10_1016_j_ins_2024_120824 crossref_primary_10_1016_j_knosys_2023_110603 crossref_primary_10_3390_s22155645 crossref_primary_10_1109_JIOT_2024_3382878 crossref_primary_10_4108_eetpht_10_5147 crossref_primary_10_1016_j_knosys_2023_111215 crossref_primary_10_1016_j_eswa_2024_123745 crossref_primary_10_1007_s00521_024_09676_0 crossref_primary_10_3390_app12178774 crossref_primary_10_1007_s00253_022_11963_6 crossref_primary_10_3233_JIFS_238245 crossref_primary_10_1109_ACCESS_2023_3323435 crossref_primary_10_1371_journal_pone_0295032 crossref_primary_10_1016_j_ins_2022_06_060 |
Cites_doi | 10.1007/s10462-014-9426-2 10.1016/j.patcog.2013.05.025 10.1007/s10462-019-09709-4 10.1162/neco.2006.18.7.1527 10.1613/jair.1.12312 10.1007/s10489-019-01560-y 10.1109/TKDE.2012.35 10.1016/j.asoc.2014.09.052 10.1007/s00521-009-0295-6 10.1016/j.neucom.2019.10.118 10.4018/IJDWM.2017100104 10.1007/s10115-019-01427-1 10.1016/j.dss.2021.113624 10.1016/B978-1-55860-377-6.50032-3 10.1023/A:1016304305535 10.1007/s10115-017-1025-5 10.1007/s42044-020-00065-z 10.1016/j.dss.2020.113339 10.1142/S0218001403002460 10.1109/CIT/IUCC/DASC/PICOM.2015.184 10.1145/3234150 10.3389/fpsyt.2020.00673 10.1016/j.csda.2011.04.012 10.1109/32.962560 |
ContentType | Journal Article |
Copyright | 2021 Elsevier B.V. Copyright Elsevier Science Ltd. Mar 5, 2022 |
Copyright_xml | – notice: 2021 Elsevier B.V. – notice: Copyright Elsevier Science Ltd. Mar 5, 2022 |
DBID | AAYXX CITATION 7SC 8FD E3H F2A JQ2 L7M L~C L~D |
DOI | 10.1016/j.knosys.2021.108079 |
DatabaseName | CrossRef Computer and Information Systems Abstracts Technology Research Database Library & Information Sciences Abstracts (LISA) Library & Information Science Abstracts (LISA) ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Library and Information Science Abstracts (LISA) ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
DatabaseTitleList | Technology Research Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Computer Science |
EISSN | 1872-7409 |
ExternalDocumentID | 10_1016_j_knosys_2021_108079 S0950705121011527 |
IEDL.DBID | AIKHN |
ISSN | 0950-7051 |
IngestDate | Thu Oct 10 17:27:18 EDT 2024 Thu Sep 26 16:18:33 EDT 2024 Fri Feb 23 02:39:56 EST 2024 |
IsPeerReviewed | true |
IsScholarly | true |
Keywords | Deep learning; Data science; Missing value imputation; Machine learning; Data discretization |
Language | English |
LinkModel | DirectLink |
PQID | 2638773066 |
PQPubID | 2035257 |
ParticipantIDs | proquest_journals_2638773066 crossref_primary_10_1016_j_knosys_2021_108079 elsevier_sciencedirect_doi_10_1016_j_knosys_2021_108079 |
PublicationCentury | 2000 |
PublicationDate | 2022-03-05 |
PublicationDateYYYYMMDD | 2022-03-05 |
PublicationDate_xml | – month: 03 year: 2022 text: 2022-03-05 day: 05 |
PublicationDecade | 2020 |
PublicationPlace | Amsterdam |
PublicationPlace_xml | – name: Amsterdam |
PublicationTitle | Knowledge-based systems |
PublicationYear | 2022 |
Publisher | Elsevier B.V; Elsevier Science Ltd |
Publisher_xml | – name: Elsevier B.V – name: Elsevier Science Ltd |
SSID | ssj0002218 |
Score | 2.5530968 |
SourceID | proquest crossref elsevier |
SourceType | Aggregation Database Publisher |
StartPage | 108079 |
SubjectTerms | Algorithms; Artificial neural networks; Belief networks; Data discretization; Data mining; Data science; Datasets; Deep learning; Discretization; Machine learning; Missing value imputation; Multilayer perceptrons; Structured data; Tables (data) |
Title | Deep learning for missing value imputation of continuous data and the effect of data discretization |
URI | https://dx.doi.org/10.1016/j.knosys.2021.108079 https://www.proquest.com/docview/2638773066 |
Volume | 239 |
hasFullText | 1 |
inHoldings | 1 |
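The abstract compares two orderings of the discretization and imputation steps. The following is a minimal sketch of what those two orderings could look like on a single column, not the authors' implementation. It assumes equal-width binning, and it uses mean imputation (for continuous values) and mode imputation (for bin labels) as simple stand-ins for the MLP and DBN imputers studied in the article. All function names are hypothetical.

```python
from collections import Counter
from statistics import mean
from typing import List, Optional

Column = List[Optional[float]]  # None marks a missing value


def bin_index(x: float, lo: float, hi: float, n_bins: int) -> int:
    """Map x to an equal-width bin index in [0, n_bins - 1] (assumed scheme)."""
    if hi == lo:
        return 0
    idx = int((x - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)


def impute_then_discretize(col: Column, n_bins: int) -> List[int]:
    """Ordering 1: fill missing continuous values first, then bin everything."""
    observed = [v for v in col if v is not None]
    fill = mean(observed)  # stand-in for a learned imputer such as MLP or DBN
    lo, hi = min(observed), max(observed)
    completed = [fill if v is None else v for v in col]
    return [bin_index(v, lo, hi, n_bins) for v in completed]


def discretize_then_impute(col: Column, n_bins: int) -> List[int]:
    """Ordering 2: bin the observed values first, then fill missing bin labels."""
    observed = [v for v in col if v is not None]
    lo, hi = min(observed), max(observed)
    bins = [None if v is None else bin_index(v, lo, hi, n_bins) for v in col]
    # Most frequent bin is a stand-in for a learned imputer over discrete labels.
    mode_bin = Counter(b for b in bins if b is not None).most_common(1)[0][0]
    return [mode_bin if b is None else b for b in bins]


if __name__ == "__main__":
    column = [0.0, 0.0, None, 10.0, 10.0, 10.0]
    print(impute_then_discretize(column, 3))
    print(discretize_then_impute(column, 3))
```

With these stand-in imputers the two orderings can assign the missing entry to different bins, which is the kind of difference the article evaluates. The abstract reports that the choice of imputation algorithm matters more than the ordering.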