sixwing | 6 years ago

very doable, at least for certain types of clones, and a topic of active research.

leveraging the ASTs and scope graphs produced by semantic lets you attack the more complicated clone types (e.g., code that has nearly the same meaning but a significantly different implementation), while various parsing + hashing methods have proven useful for the simpler cases.
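
to make the "parsing + hashing" idea concrete, here is a minimal sketch. it is an illustration only: it uses python's standard `ast` module rather than semantic, and the names `fingerprint` and `find_clones` are made up for this example. the assumption is that each snippet is parsed, identifiers and literal values are replaced with placeholders, and the resulting tree is hashed, so snippets that differ only in naming or constants land in the same bucket.

```python
import ast
import hashlib


class _Normalizer(ast.NodeTransformer):
    """Blank out names and literal values so only the tree shape remains."""

    def visit_FunctionDef(self, node):
        node.name = "_"              # ignore the function's own name
        self.generic_visit(node)     # keep normalizing args and body
        return node

    def visit_arg(self, node):
        node.arg = "_"               # ignore parameter names
        node.annotation = None       # ignore type annotations
        return node

    def visit_Name(self, node):
        node.id = "_"                # ignore variable names
        return node

    def visit_Constant(self, node):
        # keep only the kind of literal (int, str, ...), not its value
        node.value = type(node.value).__name__
        return node


def fingerprint(source):
    """Hash the normalized AST of a snippet (hypothetical helper)."""
    tree = _Normalizer().visit(ast.parse(source))
    # ast.dump omits line/column attributes by default, so layout is ignored
    return hashlib.sha256(ast.dump(tree).encode("utf-8")).hexdigest()


def find_clones(snippets):
    """Group snippet names that share a fingerprint (hypothetical helper)."""
    groups = {}
    for name, source in snippets.items():
        groups.setdefault(fingerprint(source), []).append(name)
    return [names for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    snippets = {
        "add": "def add(x, y):\n    return x + y\n",
        "plus": "def plus(a, b):\n    return a + b\n",
        "sub": "def sub(x, y):\n    return x - y\n",
    }
    print(find_clones(snippets))
```

because every name collapses to the same placeholder, this will also match snippets that merely share a shape (`x + y` and `y + x` hash identically), so treat a hit as a candidate to inspect rather than a confirmed clone. it also does nothing for the harder case of equivalent meaning with a different implementation.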

it's useful for far more than detecting plagiarism, too. it can boost signal for search, allow for more nuanced semantic navigation, assist in refactoring, and help you understand the propagation/provenance of code (which can be important for understanding licenses, etc.).
