The choice of scaling technique matters for classification performance

Dataset scaling, also known as normalization, is an essential preprocessing step in a machine learning pipeline. It is aimed at adjusting attributes scales in a way that they all vary within the same range. This transformation is known to improve the performance of classification models, but there a...

Bibliographic Details
Published in Applied soft computing Vol. 133; p. 109924
Main Authors de Amorim, Lucas B.V.; Cavalcanti, George D.C.; Cruz, Rafael M.O.
Format Journal Article
Language English
Published Elsevier B.V., 01.01.2023

Abstract
Dataset scaling, also known as normalization, is an essential preprocessing step in a machine learning pipeline. It is aimed at adjusting attributes scales in a way that they all vary within the same range. This transformation is known to improve the performance of classification models, but there are several scaling techniques to choose from, and this choice is not generally done carefully. In this paper, we execute a broad experiment comparing the impact of 5 scaling techniques on the performances of 20 classification algorithms among monolithic and ensemble models, applying them to 82 publicly available datasets with varying imbalance ratios. Results show that the choice of scaling technique matters for classification performance, and the performance difference between the best and the worst scaling technique is relevant and statistically significant in most cases. They also indicate that choosing an inadequate technique can be more detrimental to classification performance than not scaling the data at all. We also show how the performance variation of an ensemble model, considering different scaling techniques, tends to be dictated by that of its base model. Finally, we discuss the relationship between a model’s sensitivity to the choice of scaling technique and its performance and provide insights into its applicability on different model deployment scenarios. Full results and source code for the experiments in this paper are available in a GitHub repository (https://github.com/amorimlb/scaling_matters).

• Compares classification performances after applying five scaling techniques.
• Performance difference between best and worst scaling technique is largely relevant.
• This difference increases when highly imbalanced datasets are considered.
• The performance variation of an ensemble tends to be dictated by that of its base model.
• Provides an analysis of sensitivity to the choice of scaling tech. vs model performance.
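As an illustration of what "adjusting attribute scales so that they all vary within the same range" means, the sketch below rescales two attributes in plain Python. It is not taken from the paper or its repository: the abstract does not name the five techniques compared, so min-max scaling and standardization are used here only as two widely known examples, and the function names and toy data are hypothetical.

```python
from statistics import mean, pstdev


def min_max_scale(column):
    """Rescale values linearly so the column spans [0, 1]."""
    lo, hi = min(column), max(column)
    span = hi - lo
    # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
    return [0.0 if span == 0 else (v - lo) / span for v in column]


def standardize(column):
    """Centre on the mean and divide by the population standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in column]


# Hypothetical toy data: two attributes measured on very different scales.
ages = [20, 30, 40, 50]
incomes = [20_000, 40_000, 60_000, 100_000]

scaled_ages = min_max_scale(ages)
scaled_incomes = min_max_scale(incomes)
z_ages = standardize(ages)
```

After min-max scaling, both attributes lie in [0, 1] and contribute comparably to distance-based models. Standardization instead gives each attribute zero mean and unit variance, which is one reason different techniques can lead to different classifier behaviour.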
Article Number 109924
Author details
de Amorim, Lucas B.V. (ORCID 0000-0003-2725-6527; email lucas@ic.ufal.br), Centro de Informática - Universidade Federal de Pernambuco, Brazil
Cavalcanti, George D.C. (ORCID 0000-0001-7714-2283), Centro de Informática - Universidade Federal de Pernambuco, Brazil
Cruz, Rafael M.O. (ORCID 0000-0001-9446-1040), École de Technologie Supérieure, Université du Québec, Canada
Copyright 2022 Elsevier B.V.
DOI 10.1016/j.asoc.2022.109924
Discipline Computer Science
EISSN 1872-9681
ISSN 1568-4946
IsPeerReviewed true
IsScholarly true
Keywords Multiple Classifier System
Preprocessing
Classification
Standardization
Scaling
Ensemble of classifiers
Normalization