Bug Detection and Report: A Case Study on Dataset for Software Management using Security Bug Report

Bibliographic Details
Published in: International Journal for Research in Applied Science and Engineering Technology, Vol. 10, no. 5, pp. 1567-1588
Main Authors: Vinothini, A; Hariharan, R
Format: Journal Article
Language: English
Published: 31.05.2022

Summary: To build a prediction model by mining software repositories, we must label a large amount of data, and the accuracy of those labels has a significant impact on the model's performance. However, few studies have examined how mislabeled instances influence a prediction model, even though such label noise has the potential to mislead research. To close this gap, we conduct a study on security bug report (SBR) prediction in this paper. We first improve the label validity of five datasets by manually evaluating every bug report, and we discover 749 SBRs that had previously been mislabeled as non-security bug reports (NSBRs). We then compare the performance of classification models on the noisy (before our correction) and cleaner (after our relabeling) datasets to measure the impact of label correctness. The results suggest that cleaning the datasets improves the performance of the classification models.
Index Terms: security bug report prediction, data quality, software detection and report
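The effect the abstract describes can be illustrated with a small sketch: train the same text classifier twice, once with labels in which some SBRs are flipped to NSBR and once with corrected labels, and compare the predictions. The corpus, the mislabeling pattern, and the tiny Naive Bayes classifier below are all illustrative assumptions, not the paper's datasets or models.

```python
from collections import Counter
import math

# Hypothetical toy "bug report summaries"; the study itself uses five
# public bug-report datasets, not this synthetic data.
SECURITY = ["buffer overflow in parser allows code execution",
            "sql injection in login form leaks credentials",
            "xss vulnerability in comment field",
            "privilege escalation via crafted request"]
NON_SECURITY = ["button misaligned on settings page",
                "typo in help text",
                "slow rendering of large tables",
                "crash when file name is empty"]

def train(docs, labels):
    """Fit per-class word counts for a multinomial Naive Bayes classifier."""
    counts = {0: Counter(), 1: Counter()}
    priors = Counter(labels)
    for doc, y in zip(docs, labels):
        counts[y].update(doc.split())
    return counts, priors

def predict(counts, priors, doc):
    """Return the class (1 = SBR) with the higher log-posterior,
    using add-one smoothing over the training vocabulary."""
    vocab = set(counts[0]) | set(counts[1])
    best, best_score = 0, -math.inf
    for y in (0, 1):
        total = sum(counts[y].values())
        score = math.log(priors[y] / sum(priors.values()))
        for w in doc.split():
            score += math.log((counts[y][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = y, score
    return best

docs = SECURITY + NON_SECURITY
clean = [1] * 4 + [0] * 4          # ground-truth labels (1 = SBR)
noisy = [1, 0, 0, 1] + [0] * 4     # two SBRs mislabeled as NSBRs

test_docs = ["sql injection in search form", "typo on settings page"]
test_labels = [1, 0]

for name, labels in [("noisy", noisy), ("clean", clean)]:
    counts, priors = train(docs, labels)
    acc = sum(predict(counts, priors, d) == y
              for d, y in zip(test_docs, test_labels)) / len(test_docs)
    print(name, acc)   # noisy 0.5, clean 1.0
```

With the noisy labels, the injection-related vocabulary ends up in the NSBR class, so the held-out security report is misclassified; after relabeling, the same classifier gets it right. This mirrors, on a toy scale, the paper's finding that cleaning mislabeled SBRs improves classification performance.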
ISSN: 2321-9653
DOI: 10.22214/ijraset.2022.42562