Tree-Pattern-Based Clone Detection with High Precision and Recall

Author
도경구
Keywords
Software maintenance; code clone; clone detection; abstract syntax tree; CODE
Issue Date
2018-06
Publisher
KSII-KOR SOC INTERNET INFORMATION
Citation
KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, v. 12, No. 5, Page. 1932-1950
Abstract
The paper proposes a code-clone detection method that gives the highest possible precision and recall, without giving much attention to efficiency and scalability. The goal is to automatically create a reliable reference corpus that can serve as a basis for evaluating the precision and recall of clone detection tools. The algorithm takes an abstract-syntax-tree representation of source code and thoroughly examines every possible pair of duplicate tree patterns in the tree, while avoiding unnecessary and duplicated comparisons wherever possible. The largest possible duplicate patterns are then collected into a set of pattern clusters that are used to identify code clones. The method is implemented and evaluated on a standard set of open-source Java applications. The experimental results show very high precision and recall. The false-negative clones missed by our method are all non-contiguous clones. Finally, the concept of neighbor patterns, which can be used to improve recall by detecting non-contiguous and intertwined clones, is proposed.
URI
http://www.itiis.org/digital-library/manuscript/2000
http://repository.hanyang.ac.kr/handle/20.500.11754/105584
ISSN
1976-7277
DOI
10.3837/tiis.2018.05.002
Appears in Collections:
COLLEGE OF COMPUTING[E] > COMPUTER SCIENCE(소프트웨어학부) > Articles
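
Illustrative sketch
The abstract describes collecting the largest duplicate patterns of an abstract syntax tree into clusters. The Python sketch below illustrates that general idea in a much simplified form and is not the paper's algorithm: it matches only exact, whole subtrees (no tree patterns with holes, no neighbor patterns), and it groups them by a structural key instead of comparing pairs. The tree encoding as (label, children) tuples, the function name find_clone_clusters, and the min_size parameter are all assumptions made for this illustration.

```python
from collections import defaultdict


def find_clone_clusters(root, min_size=3):
    """Group identical subtrees and keep only the largest duplicates.

    A node is a (label, children) tuple. Each subtree occurrence is reported
    as a path: the tuple of child indices leading from the root to it.
    Simplifying assumption: only exact whole-subtree matches are found.
    """
    occurrences = defaultdict(list)  # structural key -> paths of matching subtrees
    key_at = {}                      # path -> structural key of the subtree there
    size_of = {}                     # structural key -> number of nodes

    def visit(node, path):
        label, children = node
        child_keys = tuple(visit(c, path + (i,)) for i, c in enumerate(children))
        key = (label, child_keys)
        size_of[key] = 1 + sum(size_of[k] for k in child_keys)
        occurrences[key].append(path)
        key_at[path] = key
        return key

    visit(root, ())

    def duplicated(key):
        # A subtree counts as a clone candidate if it occurs at least twice
        # and is large enough to be interesting.
        return len(occurrences[key]) >= 2 and size_of[key] >= min_size

    clusters = []
    for key, paths in occurrences.items():
        if not duplicated(key):
            continue
        # Drop a cluster when every occurrence sits directly inside a larger
        # duplicated subtree, since the larger cluster already covers it.
        if any(not duplicated(key_at[p[:-1]]) for p in paths):
            clusters.append(sorted(paths))
    return sorted(clusters)


def leaf(label):
    return (label, [])


# Hypothetical toy tree: two methods share the same body, and one assignment
# from that body also appears on its own at the top level.
assign = ("assign", [leaf("x"), leaf("1")])
call = ("call", [leaf("f"), leaf("x")])
body = ("block", [assign, call])
tree = ("class", [("method_a", [body]), ("method_b", [body]), assign])

clusters = find_clone_clusters(tree)
for cluster in clusters:
    print(cluster)
```

In this toy tree the shared method body forms one cluster, and the assignment forms a second cluster because one of its occurrences lies outside the shared body. The duplicated call is dropped, since both of its occurrences are covered by the larger body cluster.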