RB-CCR: Radial-Based Combined Cleaning and Resampling algorithm for imbalanced data classification

Bibliographic Details
Published in Machine learning Vol. 110; no. 11-12; pp. 3059-3093
Main Authors Koziarski, Michał, Bellinger, Colin, Woźniak, Michał
Format Journal Article
Language English
Published New York Springer US 01.12.2021
Springer Nature B.V
ISSN 0885-6125
EISSN 1573-0565
DOI 10.1007/s10994-021-06012-8

Abstract Real-world classification domains, such as medicine, health and safety, and finance, often exhibit imbalanced class priors and have asymmetric misclassification costs. In such cases, the classification model must achieve a high recall without significantly impacting precision. Resampling the training data is the standard approach to improving classification performance on imbalanced binary data. However, the state-of-the-art methods ignore the local joint distribution of the data or correct it as a post-processing step. This can cause sub-optimal shifts in the training distribution, particularly when the target data distribution is complex. In this paper, we propose Radial-Based Combined Cleaning and Resampling (RB-CCR). RB-CCR utilizes the concept of class potential to refine the energy-based resampling approach of CCR. In particular, RB-CCR exploits the class potential to accurately locate sub-regions of the data space for synthetic oversampling. The category of sub-region for oversampling can be specified as an input parameter to meet domain-specific needs or be automatically selected via cross-validation. Our 5 × 2 cross-validated results on 57 benchmark binary datasets with 9 classifiers show that RB-CCR achieves a better precision-recall trade-off than CCR and generally outperforms the state-of-the-art resampling methods in terms of AUC and G-mean.
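The abstract refers to two quantities without defining them: the class potential and the G-mean. The sketch below is illustrative only. It assumes a Gaussian radial basis function for the potential (the abstract does not state the exact form), and the names `class_potential`, `g_mean`, and `gamma` are hypothetical rather than taken from the authors' implementation. The G-mean is computed in its usual form, as the geometric mean of sensitivity and specificity.

```python
import math


def class_potential(x, points, gamma=1.0):
    """Potential of one class at location x.

    Assumption: each observation in `points` contributes a Gaussian RBF
    term exp(-(distance / gamma)^2). `gamma` is a hypothetical spread
    parameter; the abstract does not specify the kernel.
    """
    return sum(math.exp(-(math.dist(x, p) / gamma) ** 2) for p in points)


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (recall) and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    minority = [(0.0, 0.0), (0.2, 0.1)]
    # The potential is higher close to the observations than far from them.
    print(class_potential((0.1, 0.0), minority))
    print(class_potential((5.0, 5.0), minority))
    print(g_mean([1, 1, 0, 0], [1, 0, 0, 0]))
```

Under these assumptions, a candidate location for a synthetic minority sample could be ranked by its potential with respect to each class. How RB-CCR actually partitions the data space into sub-region categories is described in the full paper, not in this record.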
Author Koziarski, Michał
Bellinger, Colin
Woźniak, Michał
Author_xml – sequence: 1
  givenname: Michał
  surname: Koziarski
  fullname: Koziarski, Michał
  email: michal.koziarski@agh.edu.pl
  organization: Department of Electronics, AGH University of Science and Technology
– sequence: 2
  givenname: Colin
  surname: Bellinger
  fullname: Bellinger, Colin
  organization: Digital Technologies, National Research Council of Canada
– sequence: 3
  givenname: Michał
  surname: Woźniak
  fullname: Woźniak, Michał
  organization: Department of Systems and Computer Networks, Wrocław University of Science and Technology
ContentType Journal Article
Copyright Crown 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Discipline Computer Science
EISSN 1573-0565
EndPage 3093
ExternalDocumentID 10_1007_s10994_021_06012_8
GrantInformation_xml – fundername: Narodowe Centrum Nauki
  grantid: 2017/27/N/ST6/01705; 2017/27/B/ST6/01325
  funderid: http://dx.doi.org/10.13039/501100004281
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 11-12
Keywords Imbalanced data
Radial basis functions
Oversampling
Machine learning
Classification
OpenAccessLink https://proxy.k.utb.cz/login?url=https://link.springer.com/10.1007/s10994-021-06012-8
PQID 2601154865
PQPubID 54194
PageCount 35
PublicationPlace New York
PublicationPlace_xml – name: New York
– name: Dordrecht
PublicationTitle Machine learning
PublicationTitleAbbrev Mach Learn
PublicationYear 2021
Publisher Springer US
Springer Nature B.V
Publisher_xml – name: Springer US
– name: Springer Nature B.V
StartPage 3059
SubjectTerms Algorithms
Artificial Intelligence
Binary data
Classification
Cleaning
Computer Science
Control
Machine Learning
Mechatronics
Natural Language Processing (NLP)
Oversampling
Recall
Resampling
Robotics
Simulation and Modeling
Special Issue: Foundations of Data Science
Training
Title RB-CCR: Radial-Based Combined Cleaning and Resampling algorithm for imbalanced data classification
URI https://link.springer.com/article/10.1007/s10994-021-06012-8
https://www.proquest.com/docview/2601154865
Volume 110