Evaluating the performance of clone detection tools in detecting cloned co-change candidates
Co-change candidates are the group of code fragments that require a change if any of these fragments experience a modification in a commit operation during software evolution. The cloned co-change candidates are a subset of the co-change candidates, and the members in this subset are clones of one a...
Saved in:
Published in | The Journal of systems and software Vol. 187; p. 111229 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published |
Elsevier Inc
01.05.2022
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Co-change candidates are the group of code fragments that require a change if any of these fragments experience a modification in a commit operation during software evolution. The cloned co-change candidates are a subset of the co-change candidates, and the members in this subset are clones of one another. The cloned co-change candidates are usually created by reusing existing code fragments in a software system. Detecting cloned co-change candidates is essential for clone-tracking, and studies have shown that we can use clone detection tools to find cloned co-change candidates. However, although several studies evaluate clone detection tools for their accuracy in detecting cloned fragments, we found no study that evaluates clone detection tools for detecting cloned co-change candidates. In this study, we explore the dimension of code clone research for detecting cloned co-change candidates. We compare the performance of 12 different configurations of nine promising clone detection tools in identifying cloned co-change candidates from eight open-source C and Java-based subject systems of various sizes and application domains. A ranked list and analysis of the results provides valuable insights and guidelines into selecting and configuring a clone detection tool for identifying co-change candidates and leads to a new dimension of code clone research into change impact analysis.
•Syntax-based tools perform better than textual similarity-based tools.•Detecting all types of clones (i.e., Type-1, 2 & 3) improves performance.•Higher unique line coverage by clone fragments is helpful.•The source splitting technique used (e.g., token, line) impacts the results. |
---|---|
ISSN: | 0164-1212 1873-1228 |
DOI: | 10.1016/j.jss.2022.111229 |