Cross‐version defect prediction using threshold‐based active learning

Bibliographic Details
Published in Journal of software : evolution and process Vol. 36; no. 4
Main Authors Mei, Yuanqing, Liu, Xutong, Lu, Zeyu, Yang, Yibiao, Liu, Huihui, Zhou, Yuming
Format Journal Article
Language English
Published Chichester Wiley Subscription Services, Inc 01.04.2024

Abstract Because defects in software modules (e.g., classes) can lead to product failure and financial loss, software defect prediction helps us better understand and control software quality. Software development is a dynamic, evolutionary process, so data distributions (e.g., defect characteristics) may vary from version to version, which makes effective cross‐version defect prediction (CVDP) hard to achieve. In this paper, we investigate whether threshold‐based active learning (TAL) can tackle the problem of differing data distributions between successive versions. Our TAL method has two stages. At the active learning stage, a committee of investigated metrics is constructed to vote on the unlabeled modules of the current version. We select the unlabeled module with the median voting score and hand it to domain experts, who test and label it. We then merge the newly labeled module, together with the remaining current‐version modules carrying pseudo‐labels, into the labeled modules of the prior version to form enhanced training data. From this training data, we derive the metric thresholds used in the next iteration. At the defect prediction stage, the iterations stop when a predefined threshold is reached. Finally, we apply a cutoff of 50% to the voting scores to predict the defect‐proneness of the remaining unlabeled modules. We evaluate TAL on 31 versions of 10 projects with three prevalent performance indicators. The results show that TAL outperforms the baseline methods, including three variant methods, two common supervised methods, and the state‐of‐the‐art method Hybrid Active Learning and Kernel PCA (HALKP). The results indicate that TAL can effectively address differing data distributions between successive versions.
Furthermore, to keep the cost of extensive testing low in practice, selecting 5% of the candidate modules from the current version is sufficient for TAL to achieve good defect prediction performance. In summary, we propose TAL to address the underperformance of defect prediction caused by differing data distributions between successive versions: TAL actively selects unlabeled modules from the current version for domain experts to label and merges them into the prior version to mitigate the distribution differences.
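
The voting, median selection, and 50% cutoff described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not say how a metric casts its vote or how thresholds are derived, so the sketch assumes that a metric votes "defect‐prone" when its value exceeds its threshold and that the voting score is the fraction of metrics voting that way. The metric names (`loc`, `wmc`, `cbo`, `rfc`), the threshold values, and all function names are hypothetical.

```python
from statistics import median_low


def voting_scores(modules, thresholds):
    """Return, per module, the fraction of committee metrics voting defect-prone.

    Assumption (not stated in the abstract): a metric votes defect-prone
    when the module's value for that metric exceeds the metric's threshold.
    """
    scores = []
    for metrics in modules:
        votes = sum(1 for name, limit in thresholds.items() if metrics[name] > limit)
        scores.append(votes / len(thresholds))
    return scores


def pick_median_module(scores):
    """Return the index of the module whose voting score is the (lower) median."""
    target = median_low(scores)  # always an actual element of `scores`
    return scores.index(target)


def predict_defect_prone(scores, cutoff=0.5):
    """Label a module defect-prone when its voting score reaches the cutoff."""
    return [score >= cutoff for score in scores]


# Hypothetical thresholds and unlabeled modules, for illustration only.
thresholds = {"loc": 100, "wmc": 10, "cbo": 5, "rfc": 20}
modules = [
    {"loc": 40, "wmc": 3, "cbo": 1, "rfc": 8},      # no metric exceeds its threshold
    {"loc": 150, "wmc": 12, "cbo": 2, "rfc": 15},   # two of four metrics exceed
    {"loc": 300, "wmc": 25, "cbo": 9, "rfc": 60},   # all four metrics exceed
]

scores = voting_scores(modules, thresholds)
query_index = pick_median_module(scores)   # module to send to the domain experts
labels = predict_defect_prone(scores)      # final prediction with the 50% cutoff
```

In a full loop, the expert label for the queried module and pseudo‐labels for the rest would be merged with the prior version's labeled modules, and the thresholds would be re‐derived before the next iteration; that derivation step is left out here because the abstract does not specify it.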
Author Details
Mei, Yuanqing (Nanjing University; ORCID 0000-0003-3122-8887)
Liu, Xutong (Nanjing University)
Lu, Zeyu (Nanjing University)
Yang, Yibiao (Nanjing University; yangyibiao@nju.edu.cn)
Liu, Huihui (Nanjing University)
Zhou, Yuming (Nanjing University; ORCID 0000-0002-4645-2526; zhouyuming@nju.edu.cn)
Copyright 2023 John Wiley & Sons Ltd.
2024 John Wiley & Sons, Ltd.
DOI 10.1002/smr.2563
Discipline Computer Science
EISSN 2047-7481
Funding National Natural Science Foundation of China (62172205; 62072194)
ISSN 2047-7473
IsPeerReviewed true
IsScholarly true
PageCount 26
Subjects Active learning
cross‐version defect prediction (CVDP)
Defects
Labels
Learning
median
Modules
Software
Software development
Subject specialists
threshold‐based active learning (TAL)
Voting
URI https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fsmr.2563
https://www.proquest.com/docview/3031419191