Code similarity detection through control statement and program features
Published in | Expert systems with applications Vol. 132; pp. 63 - 75 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | New York, Elsevier Ltd, 15.10.2019, Elsevier BV |
Summary: |
• Methods to identify duplicate code (code clones) are introduced.
• All four types of clones can be identified.
• No external lexer or parser is required to process the code.
• The approach is less complex than AST- and PDG-based approaches.
Software clone detection is an emerging research area in software engineering. Software systems undergo continuous source-code modification to improve performance, which may lead to code redundancy. Duplicate code, or a code clone, is a piece of code reused several times in software programs because of copy-paste activity or reuse of existing software. Code clones are a prime subject in software evolution. Detecting software clones during software evolution may improve software performance and reduce maintenance cost and effort. Because software clones are a universal problem in large-scale programming environments, this paper proposes two metric-based approaches that detect code clones by comparing (i) control statement features and (ii) program features such as the different types of statements, operators, and operands. To demonstrate the effectiveness of the proposed approaches, extensive experiments are conducted on two datasets: the C projects of Bellon's benchmark dataset and student lab programs (SLP). The methods efficiently identify similar functional clones. The proposed models compare whole programs, yet can highlight similar code segments across program files. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2019.04.045 |
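
The summary describes comparing control statement features without an external lexer or parser. The Python sketch below illustrates that general idea only. The keyword list, the regex-based token scan, and the use of cosine similarity are all assumptions made for illustration; this record does not state which features or which similarity measure the paper actually uses.

```python
import math
import re
from collections import Counter

# Hypothetical feature set: C control-statement keywords. This choice is an
# assumption; the catalog record does not list the paper's actual features.
CONTROL_KEYWORDS = ("if", "else", "for", "while", "do", "switch",
                    "case", "break", "continue", "return", "goto")


def strip_comments_and_strings(source: str) -> str:
    """Blank out C comments and string/char literals so that keywords
    appearing inside them are not counted."""
    pattern = r'/\*.*?\*/|//[^\n]*|"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\''
    return re.sub(pattern, " ", source, flags=re.DOTALL)


def control_features(source: str) -> list[int]:
    """Count each control keyword using a plain regex scan (no lexer or parser)."""
    tokens = re.findall(r"[A-Za-z_]\w*", strip_comments_and_strings(source))
    counts = Counter(tokens)
    return [counts[keyword] for keyword in CONTROL_KEYWORDS]


def cosine_similarity(a: list[int], b: list[int]) -> float:
    """Cosine of the angle between two count vectors; 0.0 if either is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


# Two functions that differ only in identifier names and comments,
# plus one function with a different control structure.
SNIPPET_A = """
int sum(int *a, int n) {
    int s = 0; /* if this were code */
    for (int i = 0; i < n; i++) {
        if (a[i] > 0) s += a[i];
    }
    return s;
}
"""

SNIPPET_B = """
int total(int *v, int len) {
    int acc = 0;
    for (int k = 0; k < len; k++) {
        if (v[k] > 0) acc += v[k];
    }
    return acc; // while loop not used
}
"""

SNIPPET_C = """
void f(int n) {
    while (n > 0) {
        switch (n % 2) { case 0: n--; break; default: n -= 3; }
    }
}
"""

if __name__ == "__main__":
    fa, fb, fc = (control_features(s) for s in (SNIPPET_A, SNIPPET_B, SNIPPET_C))
    print("A vs B:", cosine_similarity(fa, fb))
    print("A vs C:", cosine_similarity(fa, fc))
```

Because renaming identifiers leaves the control keyword counts unchanged, the first two snippets map to the same feature vector, while the third shares no control keywords with them. Counts alone ignore statement order and nesting, so a practical detector would likely need richer features than this sketch uses.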