A Review on Data Preprocessing Techniques Toward Efficient and Reliable Knowledge Discovery From Building Operational Data

Bibliographic Details
Published in: Frontiers in energy research, Vol. 9
Main Authors: Fan, Cheng; Chen, Meiling; Wang, Xinghua; Wang, Jiayuan; Huang, Bufu
Format: Journal Article
Language: English
Published: Frontiers Media S.A., 29.03.2021

Abstract: The rapid development in data science and the increasing availability of building operational data have provided great opportunities for developing data-driven solutions for intelligent building energy management. Data preprocessing serves as the foundation for valid data analyses. It is an indispensable step in building operational data analysis considering the intrinsic complexity of building operations and deficiencies in data quality. Data preprocessing refers to a set of techniques for enhancing the quality of the raw data, such as outlier removal and missing value imputation. This article serves as a comprehensive review of data preprocessing techniques for analysing massive building operational data. A wide variety of data preprocessing techniques are summarised in terms of their applications in missing value imputation, outlier detection, data reduction, data scaling, data transformation, and data partitioning. In addition, three state-of-the-art data science techniques are proposed to tackle practical data challenges in the building field, i.e., data augmentation, transfer learning, and semi-supervised learning. In-depth discussions have been presented to describe the pros and cons of existing preprocessing methods, possible directions for future research and potential applications in smart building energy management. The research outcomes are helpful for the development of data-driven research in the building field.
DOI: 10.3389/fenrg.2021.652801
Discipline: Engineering
EISSN: 2296-598X
ISSN: 2296-598X
DOI Open Access: yes
Open Access: yes
Peer Reviewed: yes
Scholarly: yes
Open Access Link: https://doaj.org/article/b38e6947a35041a8b6ea7136264cb9a1
Subject Terms: building energy management; building operational data analysis; data preprocessing; data science; knowledge discovery