A Weakly Supervised Learning-Based Oversampling Framework for Class-Imbalanced Fault Diagnosis
Published in | IEEE transactions on reliability Vol. 71; no. 1; pp. 429 - 442 |
---|---|
Main Authors | Qian, Min; Li, Yan-Fu |
Format | Journal Article |
Language | English |
Published | New York, IEEE, 01.03.2022, The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
ISSN | 0018-9529 1558-1721 |
DOI | 10.1109/TR.2021.3138448 |
Abstract | Because failure data are scarce, class imbalance has become a common challenge in the fault diagnosis of industrial systems. Oversampling methods tackle the class-imbalance problem by generating minority samples to balance the training set. However, one of the main challenges for existing oversampling methods is how to generate high-quality minority samples. Traditional oversampling methods regard all synthetic samples as minority samples and add them to the training set without filtering. Low-quality synthetic samples can distort the distribution of the dataset and worsen classification performance. In this article, we propose a weakly supervised oversampling method that treats all synthetic samples as unlabeled samples and develops a graph semisupervised learning algorithm to select high-quality synthetic samples, which are then added to the final training set as minority samples. To improve the quality of synthetic samples, we propose a cost-sensitive neighborhood component analysis dimensionality reduction method to enhance the validity of domain information in high-dimensional datasets. Finally, combining these with a boosting-based ensemble framework, we propose a new imbalanced learning framework suitable for high-dimensional and highly imbalanced fault diagnosis in industrial systems. The experimental validation is performed on five real-world wind turbine blade cracking failure datasets, and the framework is compared with 15 benchmark methods. The experimental results show that the average performance and robustness of the proposed framework are significantly better than those of the benchmark methods. |
---|---|
Author | Li, Yan-Fu; Qian, Min |
Author_xml | – sequence: 1 givenname: Min orcidid: 0000-0002-8622-1773 surname: Qian fullname: Qian, Min email: qm19@mails.tsinghua.edu.cn organization: Department of Industrial Engineering, Tsinghua University, Beijing, China – sequence: 2 givenname: Yan-Fu orcidid: 0000-0001-5755-7115 surname: Li fullname: Li, Yan-Fu email: liyanfu@tsinghua.edu.cn organization: Department of Industrial Engineering, Tsinghua University, Beijing, China |
CODEN | IERQAD |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
Discipline | Engineering |
EISSN | 1558-1721 |
EndPage | 442 |
Genre | orig-research |
GrantInformation_xml | – fundername: National Natural Science Foundation of China grantid: 71731008 funderid: 10.13039/501100001809 – fundername: National Key Research and Development Program of China grantid: 2018YFB1306100 – fundername: Natural Science Foundation of Beijing Municipality grantid: L191022 funderid: 10.13039/501100004826 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
ORCID | 0000-0001-5755-7115 0000-0002-8622-1773 |
PageCount | 14 |
PublicationDate | 2022-03-01 |
PublicationPlace | New York |
PublicationTitle | IEEE transactions on reliability |
PublicationTitleAbbrev | TR |
PublicationYear | 2022 |
Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
StartPage | 429 |
SubjectTerms | Algorithms; Benchmarks; Class-imbalanced classification; Cost analysis; Datasets; Fault diagnosis; Labeling; Machine learning; Oversampling; Principal component analysis; Semisupervised learning; Supervised learning; Symmetric matrices; Training; Turbine blades; weakly supervised learning (WSL); wind turbine; Wind turbines |
Title | A Weakly Supervised Learning-Based Oversampling Framework for Class-Imbalanced Fault Diagnosis |
URI | https://ieeexplore.ieee.org/document/9677063; https://www.proquest.com/docview/2635044483 |
Volume | 71 |
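
The selection idea summarized in the abstract (generate synthetic minority samples, treat them as unlabeled, and keep only those that a graph semisupervised step assigns to the minority class) can be illustrated with a minimal sketch. The record does not specify the article's actual algorithm, so everything here is an assumption for illustration: the SMOTE-style interpolation, the Gaussian affinity graph, the harmonic-function label propagation, the `threshold` and `sigma` parameters, and all function names are hypothetical stand-ins, and the cost-sensitive dimensionality reduction and boosting stages are omitted.

```python
import numpy as np


def interpolate_minority(X_min, n_new, rng):
    """SMOTE-style stand-in: interpolate between random pairs of minority samples."""
    i = rng.integers(0, len(X_min), size=n_new)
    j = rng.integers(0, len(X_min), size=n_new)
    lam = rng.random((n_new, 1))
    return X_min[i] + lam * (X_min[j] - X_min[i])


def rbf_affinity(X, sigma):
    """Dense Gaussian affinity matrix with a zero diagonal."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def harmonic_minority_score(X_lab, y_lab, X_unl, sigma=1.0):
    """Harmonic-function label propagation (assumed here, not taken from the article).

    Returns, for each unlabeled point, a score in [0, 1] that is high when the
    graph connects the point mainly to minority-labeled samples (y == 1).
    """
    X = np.vstack([X_lab, X_unl])
    W = rbf_affinity(X, sigma)
    n_l = len(X_lab)
    L = np.diag(W.sum(axis=1)) - W           # graph Laplacian
    L_uu = L[n_l:, n_l:]                     # unlabeled-unlabeled block
    W_ul = W[n_l:, :n_l]                     # unlabeled-labeled weights
    f_l = (y_lab == 1).astype(float)         # 1 = minority, 0 = majority
    return np.linalg.solve(L_uu, W_ul @ f_l)


def select_synthetic(X_maj, X_min, n_new, threshold=0.5, sigma=1.0, seed=0):
    """Generate n_new synthetic samples and keep those scored as minority."""
    rng = np.random.default_rng(seed)
    X_syn = interpolate_minority(X_min, n_new, rng)
    X_lab = np.vstack([X_maj, X_min])
    y_lab = np.r_[np.zeros(len(X_maj), dtype=int), np.ones(len(X_min), dtype=int)]
    scores = harmonic_minority_score(X_lab, y_lab, X_syn, sigma=sigma)
    return X_syn[scores >= threshold], scores


if __name__ == "__main__":
    X_maj = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    X_min = np.array([[5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
    kept, scores = select_synthetic(X_maj, X_min, n_new=10)
    print(len(kept), "of", len(scores), "synthetic samples kept")
```

In this sketch the retained samples would be appended to the training set with the minority label, while synthetic samples that fall closer to the majority region receive low scores and are discarded. The dense affinity matrix is only practical for small datasets; a sparse k-nearest-neighbor graph would be the usual substitute at scale.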