Cross‐version defect prediction using threshold‐based active learning

Bibliographic Details
Published in	Journal of software : evolution and process Vol. 36; no. 4
Main Authors	Mei, Yuanqing; Liu, Xutong; Lu, Zeyu; Yang, Yibiao; Liu, Huihui; Zhou, Yuming
Format	Journal Article
Language	English
Published	Chichester Wiley Subscription Services, Inc 01.04.2024
Subjects	Active learning; cross‐version defect prediction (CVDP); Defects; Labels; Learning; median; Modules; Software; Software development; Subject specialists; threshold‐based active learning (TAL); Voting

Abstract	Because defects in software modules (e.g., classes) can lead to product failure and financial loss, software defect prediction helps us better understand and control software quality. Software development is a dynamic, evolutionary process, so data distributions (e.g., defect characteristics) may vary from version to version, which makes effective cross‐version defect prediction (CVDP) difficult to achieve. In this paper, we investigate whether threshold‐based active learning (TAL) can tackle the problem of differing data distributions between successive versions. Our TAL method has two stages. At the active learning stage, a committee of the investigated metrics votes on the unlabeled modules of the current version. We select the unlabeled module whose voting score is the median and pass it to domain experts, who test and label it. We then merge the newly labeled module, together with the remaining current‐version modules carrying pseudo‐labels, into the labeled modules of the prior version to form enhanced training data. From this training data, we derive the metric thresholds used in the next iteration. At the defect prediction stage, the iterations stop when a predefined threshold is reached, and we use a cutoff of 50% on the voting scores to predict the defect‐proneness of the remaining unlabeled modules. We evaluate TAL on 31 versions of 10 projects with three prevalent performance indicators. The results show that TAL outperforms the baseline methods, including three variant methods, two common supervised methods, and the state‐of‐the‐art method Hybrid Active Learning and Kernel PCA (HALKP), which indicates that TAL can effectively address the differing data distributions between successive versions.
Furthermore, to keep the cost of extensive testing low in practice, selecting 5% of the candidate modules from the current version is sufficient for TAL to achieve good defect prediction performance. We propose a threshold‐based active learning (TAL) approach to address the underperformance of defect prediction caused by differing data distributions between successive versions. TAL actively selects unlabeled modules from the current version for domain experts to label and merges them into the prior version to mitigate the distribution difference. Our extensive experiments show that TAL outperforms the baseline methods, including three variants, two common supervised methods, and the state‐of‐the‐art method Hybrid Active Learning and Kernel PCA (HALKP).
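
The two‐stage loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how metric thresholds are derived, how pseudo‐labels are assigned, or what the stopping threshold is. The sketch therefore assumes (1) a per‐metric threshold equal to the midpoint between the median value of defective modules and the median value of clean modules, (2) pseudo‐labels from the same 50% voting cutoff used for final prediction, and (3) a fixed labeling budget as the stopping rule. The names `derive_thresholds`, `vote_score`, `tal`, `oracle`, and `budget` are all hypothetical.

```python
from statistics import median


def derive_thresholds(rows, labels):
    """ASSUMED rule (not stated in the abstract): for each metric, use the
    midpoint between the median of defective (1) and clean (0) modules."""
    thresholds = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        bad = [v for v, y in zip(column, labels) if y == 1]
        good = [v for v, y in zip(column, labels) if y == 0]
        if bad and good:
            thresholds.append((median(bad) + median(good)) / 2)
        else:
            # Fallback when one class is missing: overall median.
            thresholds.append(median(column))
    return thresholds


def vote_score(row, thresholds):
    """Fraction of committee metrics that vote 'defective' (value > threshold)."""
    votes = sum(1 for v, t in zip(row, thresholds) if v > t)
    return votes / len(thresholds)


def tal(prior_rows, prior_labels, current_rows, oracle, budget):
    """Return (expert_labels, predictions), both keyed by module index.

    oracle(i) stands in for the domain expert labeling module i;
    budget is the ASSUMED stopping rule (number of expert queries).
    """
    unlabeled = dict(enumerate(current_rows))
    expert = {}
    thresholds = derive_thresholds(prior_rows, prior_labels)

    # Active learning stage.
    for _ in range(min(budget, len(unlabeled))):
        scores = {i: vote_score(r, thresholds) for i, r in unlabeled.items()}
        ranked = sorted(scores, key=lambda i: (scores[i], i))
        pick = ranked[len(ranked) // 2]  # module with the median voting score
        expert[pick] = oracle(pick)
        del unlabeled[pick]

        # ASSUMED: pseudo-label the remaining modules with the 50% cutoff.
        pseudo = {i: int(scores[i] >= 0.5) for i in unlabeled}

        # Merge expert-labeled and pseudo-labeled modules into the prior version.
        rows = (prior_rows
                + [current_rows[i] for i in expert]
                + [current_rows[i] for i in pseudo])
        labels = prior_labels + list(expert.values()) + list(pseudo.values())
        thresholds = derive_thresholds(rows, labels)

    # Defect prediction stage: 50% cutoff on the voting score.
    predictions = {i: int(vote_score(r, thresholds) >= 0.5)
                   for i, r in unlabeled.items()}
    return expert, predictions
```

With the 5% figure reported in the abstract, `budget` would be roughly `len(current_rows) // 20`; whether a score of exactly 50% counts as defect‐prone is a choice made here, not something the abstract specifies.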
Author	Zhou, Yuming Lu, Zeyu Liu, Huihui Liu, Xutong Mei, Yuanqing Yang, Yibiao
Author_xml	– sequence: 1 givenname: Yuanqing orcidid: 0000-0003-3122-8887 surname: Mei fullname: Mei, Yuanqing organization: Nanjing University – sequence: 2 givenname: Xutong surname: Liu fullname: Liu, Xutong organization: Nanjing University – sequence: 3 givenname: Zeyu surname: Lu fullname: Lu, Zeyu organization: Nanjing University – sequence: 4 givenname: Yibiao surname: Yang fullname: Yang, Yibiao email: yangyibiao@nju.edu.cn organization: Nanjing University – sequence: 5 givenname: Huihui surname: Liu fullname: Liu, Huihui organization: Nanjing University – sequence: 6 givenname: Yuming orcidid: 0000-0002-4645-2526 surname: Zhou fullname: Zhou, Yuming email: zhouyuming@nju.edu.cn organization: Nanjing University
CitedBy_id	crossref_primary_10_1016_j_ins_2024_120786 crossref_primary_10_1142_S0218194024500414
Cites_doi	10.1109/TSE.2011.103 10.1109/TSE.2018.2794977 10.1145/2393596.2393669 10.1109/TSE.2010.51 10.1145/2556777 10.1007/978‐3‐319‐66854‐3_7 10.1109/TPAMI.2014.2307881 10.1109/TSE.2008.35 10.1109/SBES.2015.9 10.1016/j.eswa.2016.05.018 10.1109/ISSRE.2014.35 10.1007/s10664‐019‐09777‐8 10.1007/s10515‐011‐0092‐1 10.1109/TSE.2017.2724538 10.1145/2970276.2970353 10.1145/2786805.2786813 10.1007/s10664‐011‐9182‐8 10.1109/ICRSE.2015.7366475 10.1145/3183339 10.1016/j.infsof.2017.11.005 10.1109/TSE.2007.256941 10.1023/A:1007330508534 10.5120/20693-3582 10.1002/smr.404 10.1007/s12065-019-00201-0 10.1109/TSE.2017.2731766 10.1145/1390156.1390183 10.1109/QRS.2016.33 10.1016/j.infsof.2014.11.006 10.1080/09540091.2022.2077913 10.23940/ijpe.20.04.p12.609617 10.1016/j.jss.2011.05.044 10.1109/TR.2020.2996261 10.1109/TSE.2014.2370048 10.1109/TR.2018.2864206 10.1109/ICSM.2015.7332511 10.1145/2365324.2365335 10.23940/ijpe.20.02.p5.203213 10.1007/s10489‐020‐01935‐6 10.1016/j.eswa.2021.116217 10.1109/TSE.2010.9 10.1016/j.infsof.2015.01.014 10.1162/153244302760185243 10.23940/ijpe.19.10.p16.27012708 10.1023/A:1010933404324 10.1007/s10664‐015‐9396‐2 10.4324/9781315806730 10.1109/ACCESS.2021.3095559 10.1587/transinf.E95.D.1680
ContentType	Journal Article
Copyright	2023 John Wiley & Sons Ltd. 2024 John Wiley & Sons, Ltd.
Copyright_xml	– notice: 2023 John Wiley & Sons Ltd. – notice: 2024 John Wiley & Sons, Ltd.
DOI	10.1002/smr.2563
DatabaseName	CrossRef Computer and Information Systems Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional
DatabaseTitle	CrossRef Computer and Information Systems Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic Advanced Technologies Database with Aerospace ProQuest Computer Science Collection Computer and Information Systems Abstracts Professional
DatabaseTitleList	Computer and Information Systems Abstracts CrossRef
DeliveryMethod	fulltext_linktorsrc
Discipline	Computer Science
EISSN	2047-7481
EndPage	n/a
ExternalDocumentID	10_1002_smr_2563 SMR2563
Genre	article
GrantInformation_xml	– fundername: National Natural Science Foundation of China funderid: 62172205; 62072194
IEDL.DBID	DR2
ISSN	2047-7473
IngestDate	Sun Jul 13 05:35:35 EDT 2025 Tue Jul 01 01:44:44 EDT 2025 Thu Apr 24 23:03:05 EDT 2025 Wed Jan 22 17:21:27 EST 2025
IsPeerReviewed	true
IsScholarly	true
Issue	4
Language	English
LinkModel	DirectLink
Notes	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14
ORCID	0000-0003-3122-8887 0000-0002-4645-2526
PQID	3031419191
PQPubID	2034650
PageCount	26
ParticipantIDs	proquest_journals_3031419191 crossref_citationtrail_10_1002_smr_2563 crossref_primary_10_1002_smr_2563 wiley_primary_10_1002_smr_2563_SMR2563
ProviderPackageCode	CITATION AAYXX
PublicationCentury	2000
PublicationDate	April 2024 2024-04-00 20240401
PublicationDateYYYYMMDD	2024-04-01
PublicationDate_xml	– month: 04 year: 2024 text: April 2024
PublicationDecade	2020
PublicationPlace	Chichester
PublicationPlace_xml	– name: Chichester
PublicationTitle	Journal of software : evolution and process
PublicationYear	2024
Publisher	Wiley Subscription Services, Inc
Publisher_xml	– name: Wiley Subscription Services, Inc
SubjectTerms	Active learning; cross‐version defect prediction (CVDP); Defects; Labels; Learning; median; Modules; Software; Software development; Subject specialists; threshold‐based active learning (TAL); Voting
Title	Cross‐version defect prediction using threshold‐based active learning
URI	https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fsmr.2563 https://www.proquest.com/docview/3031419191
Volume	36
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
link	http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwnV1LSwMxEA7iyYv1idUqEURP2-5mH26OUixV0EO1UPCwJJNsD2ot3daDJ3-Cv9Ff4sw-WhUF8bSXCWSTTOab5Ms3jB2hx2jAbMfBYANOYEJwYgGBI-gSyWpMxyBX-7yOuv3gchAOSlYlvYUp9CHmB27kGfl-TQ6udNZaiIZmj5MmxmsS-iSqFuGhnpgfr7iRQDBCBEZBWgQImv1KetYVrart12C0QJifcWoeaDo1dld1seCX3DdnU92El2_qjf_7hzW2WuJPflYsmHW2ZEcbrFbVduClq2-yizb1-P317bk4UOPGEvGDjyd0s0OzyYkyP-RTXAwZ3WGhLYVEw1W-hfKyHsVwi_U757ftrlOWXXAAY7_vBGksLUgVuNZ6IfgGhHaDNJU2MhC7BkJpQ5EaHSlDYj0uKBNKBFbKkwpS8LfZ8uhpZHcYt_EpaNwVMImUgavT2FNKG6E1ZnkKkV6dnVTjn0CpSU6lMR6SQk1ZJDhCCY1QnR3OLceFDscPNo1qCpPSE7PEJ31-TEqlV2fH-Vz82j65uerRd_evhntsRSDGKYg8DbY8nczsPmKUqT7IV-MH8YPnIw
linkProvider	Wiley-Blackwell
linkToHtml	http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwnV1LT8MwDLZgHODCeIrxDBKCU0eXtqMVJ4RA47XDAIkDUpU46Q7AmPbgwImfwG_kl2D3MR4CCXHqxZHSOI4_O85ngC2yGI0U7TjkbNDxTYBOKNF3JF8iWU3hGKZsn81649o_vQluxmC_eAuT8UOMEm5sGel5zQbOCendD9bQ_kOvSg7bG4cJbuidxlMtOUqwuHVJcIRLGCWzERBs9gryWVfuFoO_uqMPjPkZqaau5rgMt8UkswqTu-pwoKv4_I2_8Z9_MQPTOQQVB9memYUx25mDctHeQeTWPg8nhzzlt5fXpyynJozl2g_R7fHlDitUcNV8WwxoP_T5Gotk2SsaodJTVOQtKdoLcH18dHXYcPLOCw6S-_ccPwkji5HyXWtrAXoGpXb9JIls3WDoGgwiG8jE6LoyzNfjojJBRNhK1SKFCXqLUOo8duwSCBvuoaaDgeLIyHd1EtaU0kZqTYGeIrBXgZ1CATHmtOTcHeM-zgiVZUwrFPMKVWBzJNnNqDh-kFktdBjnxtiPPabop7g0qlVgO1XGr-Pjy4sWf5f_KrgBk42ri_P4_KR5tgJTkiBPVtezCqVBb2jXCLIM9Hq6Nd8BU8PrPg
linkToPdf	http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwnV1LSwMxEB60gnjxLVarRhA9bd1mH-4eRS2-ER8geFiSSbYHtZa2evDkT_A3-kuc2Ud9oCCe9jKBbDKPb5LJNwBrZDEaKdtxKNig45sAnUii70i-RLKa0jHM2D5Pw_0r__A6uC6qKvktTM4PMThwY8vI_DUbeMekmx-kob37bp3itTcMI37oRqzRu-dycL7ihpLQCFcwSiYjINTsldyzrtwsB3-NRh8Q8zNQzSJNcwJuyjnmBSa39ce-ruPzN_rG__3EJIwXAFRs5xozBUO2PQ0TZXMHUdj6DBzs8IzfXl6f8hM1YSxXfohOl692eDsF18y3RJ-0oceXWCTLMdEIlflQUTSkaM3CVXPvcmffKfouOEjB33P8NIotxsp3rW0E6BmU2vXTNLahwcg1GMQ2kKnRoTLM1uOiMkFMyEo1YoUpenNQaT-07TwIG22hJrdAWWTsuzqNGkppI7WmNE8R1KvCRrn-CRak5Nwb4y7J6ZRlQiuU8ApVYXUg2cmJOH6QqZVbmBSm2Es8JuinrDRuVGE924tfxycXJ-f8Xfir4AqMnu02k-OD06NFGJOEd_KinhpU-t1Hu0R4pa-XM8V8B84f6fY
openUrl	ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Cross%E2%80%90version+defect+prediction+using+threshold%E2%80%90based+active+learning&rft.jtitle=Journal+of+software+%3A+evolution+and+process&rft.au=Mei%2C+Yuanqing&rft.au=Liu%2C+Xutong&rft.au=Lu%2C+Zeyu&rft.au=Yang%2C+Yibiao&rft.date=2024-04-01&rft.issn=2047-7473&rft.eissn=2047-7481&rft.volume=36&rft.issue=4&rft.epage=n%2Fa&rft_id=info:doi/10.1002%2Fsmr.2563&rft.externalDBID=10.1002%252Fsmr.2563&rft.externalDocID=SMR2563
thumbnail_l	http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=2047-7473&client=summon
thumbnail_m	http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=2047-7473&client=summon
thumbnail_s	http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=2047-7473&client=summon