Cross‐version defect prediction using threshold‐based active learning
Published in | Journal of software : evolution and process Vol. 36; no. 4 |
---|---|
Main Authors | Mei, Yuanqing; Liu, Xutong; Lu, Zeyu; Yang, Yibiao; Liu, Huihui; Zhou, Yuming |
Format | Journal Article |
Language | English |
Published | Chichester: Wiley Subscription Services, Inc, 01.04.2024 |
Abstract | Because defects in software modules (e.g., classes) might lead to product failure and financial loss, software defect prediction enables us to better understand and control software quality. Software development is a dynamic evolutionary process that may result in data distributions (e.g., defect characteristics) varying from version to version. In this case, effective cross‐version defect prediction (CVDP) is not easy to achieve. In this paper, we investigate whether a threshold‐based active learning (TAL) defect prediction method can tackle the problem of differing data distributions between successive versions. Our TAL method includes two stages. At the active learning stage, a committee of investigated metrics is constructed to vote on the unlabeled modules of the current version. We select the unlabeled module with the median voting score and pass it to domain experts. The domain experts test and label the selected module. Then, we merge the selected labeled module and the remaining modules with pseudo‐labels from the current version into the labeled modules of the prior version to form enhanced training data. Based on the training data, we derive the metric thresholds used for the next iteration. At the defect prediction stage, the iterations stop when a predefined threshold is reached. Finally, we use the cutoff threshold of voting scores, that is, 50%, to predict the defect‐proneness of the remaining unlabeled modules. We evaluate the TAL method on 31 versions of 10 projects with three prevalent performance indicators. The results show that TAL outperforms the baseline methods, including three variant methods, two common supervised methods, and the state‐of‐the‐art method Hybrid Active Learning and Kernel PCA (HALKP). The results indicate that TAL can effectively address the differing data distributions between successive versions. 
Furthermore, to keep the cost of extensive testing low in practice, selecting 5% of candidate modules from the current version is sufficient for TAL to achieve good defect prediction performance.
We propose a threshold‐based active learning (TAL) approach to address the underperformance of defect prediction caused by differing data distributions between successive versions. TAL actively selects unlabeled modules from the current version for domain experts to label and merges them into the prior version to mitigate the distribution differences. The results of our extensive experiments show that TAL outperforms the baseline methods, including three variants, two common supervised methods, and the state‐of‐the‐art method Hybrid Active Learning and Kernel PCA (HALKP). |
---|---|
Author | Zhou, Yuming; Lu, Zeyu; Liu, Huihui; Liu, Xutong; Mei, Yuanqing; Yang, Yibiao |
Author_xml | 1. Mei, Yuanqing (Nanjing University; ORCID 0000-0003-3122-8887); 2. Liu, Xutong (Nanjing University); 3. Lu, Zeyu (Nanjing University); 4. Yang, Yibiao (Nanjing University; yangyibiao@nju.edu.cn); 5. Liu, Huihui (Nanjing University); 6. Zhou, Yuming (Nanjing University; ORCID 0000-0002-4645-2526; zhouyuming@nju.edu.cn) |
ContentType | Journal Article |
Copyright | 2023 John Wiley & Sons Ltd. 2024 John Wiley & Sons, Ltd. |
DOI | 10.1002/smr.2563 |
Discipline | Computer Science |
EISSN | 2047-7481 |
Genre | article |
GrantInformation_xml | National Natural Science Foundation of China; funder ID: 62172205; 62072194 |
ISSN | 2047-7473 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 4 |
Language | English |
ORCID | 0000-0003-3122-8887 0000-0002-4645-2526 |
PageCount | 26 |
PublicationDate | 2024-04-01 |
PublicationPlace | Chichester |
PublicationTitle | Journal of software : evolution and process |
PublicationYear | 2024 |
Publisher | Wiley Subscription Services, Inc |
SubjectTerms | Active learning; cross‐version defect prediction (CVDP); Defects; Labels; Learning; median; Modules; Software; Software development; Subject specialists; threshold‐based active learning (TAL); Voting |
Title | Cross‐version defect prediction using threshold‐based active learning |
URI | https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fsmr.2563 https://www.proquest.com/docview/3031419191 |
Volume | 36 |
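
The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several details are assumptions because the abstract does not specify them: a metric is taken to vote "defect‐prone" when its value exceeds its threshold, the voting score is taken to be the fraction of such votes, the threshold rule (midpoint between the per‐class means of each metric) is a hypothetical stand‐in for whatever derivation the paper uses, and the stopping criterion is assumed to be a labeling budget expressed as a fraction of the current version's modules. The names `derive_thresholds`, `voting_score`, `tal`, and `oracle` are illustrative only.

```python
from statistics import mean


def derive_thresholds(rows, labels):
    """Hypothetical rule: midpoint between the defective and clean means of each metric."""
    thresholds = []
    for j in range(len(rows[0])):
        defective = [r[j] for r, y in zip(rows, labels) if y == 1]
        clean = [r[j] for r, y in zip(rows, labels) if y == 0]
        if defective and clean:
            thresholds.append((mean(defective) + mean(clean)) / 2)
        else:
            # Fall back to the overall mean when one class is missing.
            thresholds.append(mean(r[j] for r in rows))
    return thresholds


def voting_score(row, thresholds):
    """Assumed committee vote: fraction of metrics whose value exceeds its threshold."""
    votes = sum(1 for value, t in zip(row, thresholds) if value > t)
    return votes / len(thresholds)


def tal(prior_rows, prior_labels, current_rows, oracle, budget=0.05, cutoff=0.5):
    """Sketch of threshold-based active learning across two versions.

    oracle(i) stands in for the domain expert and returns the true label
    (1 = defective, 0 = clean) of current_rows[i].
    """
    thresholds = derive_thresholds(prior_rows, prior_labels)
    expert_labels = {}
    n_queries = max(1, round(budget * len(current_rows)))  # assumed stopping rule

    for _ in range(n_queries):
        unlabeled = [i for i in range(len(current_rows)) if i not in expert_labels]
        if not unlabeled:
            break
        # Active learning stage: query the module with the median voting score.
        ranked = sorted(
            unlabeled, key=lambda i: (voting_score(current_rows[i], thresholds), i)
        )
        pick = ranked[len(ranked) // 2]
        expert_labels[pick] = oracle(pick)

        # Expert labels where available, pseudo-labels from the current vote elsewhere.
        current_labels = [
            expert_labels.get(i, int(voting_score(row, thresholds) >= cutoff))
            for i, row in enumerate(current_rows)
        ]
        # Merge into the prior version and re-derive thresholds for the next iteration.
        thresholds = derive_thresholds(
            prior_rows + current_rows, prior_labels + current_labels
        )

    # Defect prediction stage: apply the 50% cutoff to the remaining modules.
    predictions = {
        i: int(voting_score(row, thresholds) >= cutoff)
        for i, row in enumerate(current_rows)
        if i not in expert_labels
    }
    return predictions, expert_labels, thresholds
```

Calling `tal` with the labeled modules of a prior version, the unlabeled metric vectors of the current version, and a callable that returns an expert label yields predictions for every module the expert was not asked about. The default `budget=0.05` mirrors the 5% figure reported in the abstract; everything else in the sketch should be checked against the paper itself.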