A Weakly Supervised Learning-Based Oversampling Framework for Class-Imbalanced Fault Diagnosis

Bibliographic Details
Published in IEEE Transactions on Reliability Vol. 71; no. 1; pp. 429-442
Main Authors Qian, Min; Li, Yan-Fu
Format Journal Article
Language English
Published New York IEEE 01.03.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 0018-9529
1558-1721
DOI 10.1109/TR.2021.3138448

Abstract Because failure data are scarce, class imbalance has become a common challenge in the fault diagnosis of industrial systems. Oversampling methods tackle the class-imbalance problem by generating minority samples to balance the training set. However, a main challenge for existing oversampling methods is how to generate high-quality minority samples. Traditional oversampling methods treat all synthetic samples as minority samples and add them to the training set without filtering. Low-quality synthetic samples can distort the distribution of the dataset and worsen classification performance. In this article, we propose a weakly supervised oversampling method that treats all synthetic samples as unlabeled samples and develops a graph semisupervised learning algorithm to select high-quality synthetic samples, which are then added to the final training set as minority samples. To improve the quality of synthetic samples, we propose a cost-sensitive neighborhood component analysis dimensionality reduction method to enhance the validity of domain information in high-dimensional datasets. Finally, by combining these components with a boosting-based ensemble framework, we propose a new imbalanced learning framework suitable for high-dimensional and highly imbalanced fault diagnosis in industrial systems. The experimental validation is performed on five real-world wind turbine blade cracking failure datasets, with comparison against 15 benchmark methods. The experimental results show that the average performance and robustness of the proposed framework are significantly better than those of the benchmark methods.
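The selection idea in the abstract (generate synthetic samples, treat them as unlabeled, keep only those a graph-based semisupervised learner scores as minority) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes plain SMOTE-style interpolation as the generator and harmonic-function label propagation on a dense Gaussian graph as the semisupervised learner, and it omits the cost-sensitive neighborhood component analysis and boosting steps. All function names, the `sigma` bandwidth, and the `threshold` value are hypothetical choices made for illustration.

```python
import numpy as np

def interpolate_minority(X_min, n_new, k=3, rng=None):
    """SMOTE-style generator (an assumption, not the paper's generator).

    Each synthetic point lies on the segment between a minority sample and
    one of its k nearest minority neighbours. Needs >= 2 minority samples.
    """
    rng = np.random.default_rng(rng)
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # a point is not its own neighbour
    k = min(k, len(X_min) - 1)
    nn = np.argsort(d, axis=1)[:, :k]      # indices of the k nearest neighbours
    base = rng.integers(0, len(X_min), size=n_new)
    nbr = nn[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))           # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[nbr] - X_min[base])

def harmonic_minority_score(X_lab, y_lab, X_unl, sigma=1.0):
    """Harmonic-function label propagation on a dense Gaussian graph.

    Solves (D_uu - W_uu) f_u = W_ul f_l and returns f_u, a minority score
    in [0, 1] for every unlabeled (synthetic) point.
    """
    X = np.vstack([X_lab, X_unl])
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq / (2.0 * sigma ** 2))   # Gaussian edge weights
    np.fill_diagonal(W, 0.0)
    n_l = len(X_lab)
    L = np.diag(W.sum(axis=1)) - W         # unnormalised graph Laplacian
    return np.linalg.solve(L[n_l:, n_l:], W[n_l:, :n_l] @ y_lab.astype(float))

def select_synthetic(X_maj, X_min, n_new, threshold=0.5, sigma=1.0, rng=None):
    """Generate synthetic samples and keep those scored as minority."""
    X_syn = interpolate_minority(X_min, n_new, rng=rng)
    X_lab = np.vstack([X_maj, X_min])
    y_lab = np.concatenate([np.zeros(len(X_maj)), np.ones(len(X_min))])
    score = harmonic_minority_score(X_lab, y_lab, X_syn, sigma=sigma)
    return X_syn[score >= threshold], score
```

In this sketch, synthetic points that fall close to majority samples receive a low score and are discarded rather than being added to the training set as minority samples. The dense pairwise graph is only practical for small datasets; a k-nearest-neighbour graph would be a natural substitute at scale.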
Author Li, Yan-Fu
Qian, Min
Author_xml – sequence: 1
  givenname: Min
  orcidid: 0000-0002-8622-1773
  surname: Qian
  fullname: Qian, Min
  email: qm19@mails.tsinghua.edu.cn
  organization: Department of Industrial Engineering, Tsinghua University, Beijing, China
– sequence: 2
  givenname: Yan-Fu
  orcidid: 0000-0001-5755-7115
  surname: Li
  fullname: Li, Yan-Fu
  email: liyanfu@tsinghua.edu.cn
  organization: Department of Industrial Engineering, Tsinghua University, Beijing, China
CODEN IERQAD
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
Discipline Engineering
EISSN 1558-1721
EndPage 442
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 71731008
  funderid: 10.13039/501100001809
– fundername: National Key Research and Development Program of China
  grantid: 2018YFB1306100
– fundername: Natural Science Foundation of Beijing Municipality
  grantid: L191022
  funderid: 10.13039/501100004826
ISSN 0018-9529
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0001-5755-7115
0000-0002-8622-1773
PageCount 14
PublicationCentury 2000
PublicationDate 2022-March
2022-3-00
20220301
PublicationDateYYYYMMDD 2022-03-01
PublicationDate_xml – month: 03
  year: 2022
  text: 2022-March
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on reliability
PublicationTitleAbbrev TR
PublicationYear 2022
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 429
SubjectTerms Algorithms
Benchmarks
Class-imbalanced classification
Cost analysis
Datasets
Fault diagnosis
Labeling
Machine learning
Oversampling
Principal component analysis
Semisupervised learning
Supervised learning
Symmetric matrices
Training
Turbine blades
weakly supervised learning (WSL)
wind turbine
Wind turbines
Title A Weakly Supervised Learning-Based Oversampling Framework for Class-Imbalanced Fault Diagnosis
URI https://ieeexplore.ieee.org/document/9677063
https://www.proquest.com/docview/2635044483
Volume 71