An unsupervised cross project model for crashing fault residence identification
Published in | IET software Vol. 16; no. 6; pp. 630 - 646 |
---|---|
Main Authors | , , , , , , , |
Format | Journal Article |
Language | English |
Published | Wiley 01.12.2022 |
Summary: | Effectively detecting the root cause of faults that cause software crashes (i.e. crashing faults) is a critical quality assurance activity. Previous studies extracted features to characterise crash instances and built models to identify whether the crashing fault resides inside the stack trace. These models are all supervised learning methods, which require labelled crash data. This study proposes an unsupervised model, called Transfer Spectral Clustering (TSC), for crashing fault residence identification when only unlabelled data are available. Unlike traditional unsupervised methods, which are applied to the data of an individual project, TSC transfers knowledge from auxiliary unlabelled data in the source project to assist the clustering task on the unlabelled data of the target project. As an unsupervised transfer learning method, TSC simultaneously considers the data manifold information of each individual project and the feature manifold information across projects to improve the clustering. Extensive experiments are conducted on a benchmark dataset containing seven software projects, with five indicators used for performance evaluation. The results show that TSC achieves better performance than four clustering‐based unsupervised methods and competitive performance compared with eight supervised cross‐project methods. |
---|---|
ISSN: | 1751-8806 1751-8814 |
DOI: | 10.1049/sfw2.12073 |