Things Structural Clones Tell that Simple Clones Don’t Hamid Abdul Basit
Software Clones • Simple clones: the same or similar code fragments • Structural clones: higher-level, larger similarities • Similarity of code and similarity of structure
Structural Clones [Figure: a collaborative structural clone. Two parallel structures, CreateTask and CreateProject, each span three layers. User Interface: CreateTask.UI (CreateTask()) and CreateProject.UI (CreateProject()). Business Logic: CreateTask.BL (ValidateTask()) and CreateProject.BL (ValidateProject()). Database: Task.DB (AddTask(), Task Table) and Project.DB (AddProject(), Project Table). The components in each structure are linked by relationships labeled executes, visualizes, and accesses.]
When are structural clones useful? • Showing the bigger picture of the similarity situation: seeing the forest, not just the trees • Finding refactoring opportunities • Architecture recovery, program understanding, and maintenance • Structural clones often represent application domain or design concepts • Re-engineering for reuse • The bigger the clones, the better for reuse • Some benefits for plagiarism detection
A structure is a graph • Entities {A,B,C,D} • Relationships {w,x,y,z} [Figure: a graph whose nodes are the entities A, B, C, D and whose edges are the relationships w, x, y, z]
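The idea of a structure as a graph can be made concrete with a small sketch. The class names `Relationship` and `Structure` are hypothetical, and the edge endpoints are assumptions chosen for illustration, since the slide text only gives the entity set and the relationship labels.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    """A labeled edge between two entities (hypothetical helper type)."""
    label: str
    source: str
    target: str


@dataclass
class Structure:
    """A structure: a set of entities plus the relationships among them."""
    entities: set = field(default_factory=set)
    relationships: set = field(default_factory=set)

    def relate(self, label, source, target):
        # Adding a relationship also registers both endpoint entities.
        self.entities.update((source, target))
        self.relationships.add(Relationship(label, source, target))

    def labels(self):
        # The set of relationship labels used in this structure.
        return {r.label for r in self.relationships}


# Endpoints below are assumed; only the labels w, x, y, z and the
# entities A, B, C, D come from the slide.
structure = Structure()
structure.relate("x", "A", "B")
structure.relate("y", "A", "C")
structure.relate("z", "B", "D")
structure.relate("w", "C", "D")

print(sorted(structure.entities))
print(sorted(structure.labels()))
```

Entities can stand for anything from code fragments to sub-systems, which is why this sketch keeps them as plain strings.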
Entities • Physically defined: code fragments, files, web pages, directories • Semantically defined: methods, classes, packages • Conceptually defined: components, sub-systems
Relationships • Physical co-location: same file, same directory • Runtime: message passing, hyperlinks between web pages • Design level: inheritance, association, composition
Structural Clones [Figure: five similar structures, S4 to S8. Each structure Sn contains entities ena, enb, enc, end (for example e4a, e4b, e4c, e4d in S4), connected by relationships labeled w, x, y, and z.]
Observation • Higher-level similarities are composed of lower-level similarities • They can be recovered by finding repeating configurations of lower-level similarities
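The observation above can be illustrated with a minimal sketch: given, for each file, the set of simple clone classes that occur in it, look for sets of clone classes that recur together in several files. The function name, the file names, and the clone IDs are all hypothetical, and the pairwise-intersection strategy is a simplification chosen for brevity, not necessarily how Clone Miner works.

```python
from itertools import combinations


def repeating_configurations(clones_in_file, min_size=2, min_support=2):
    """Find sets of simple clone classes that co-occur in several files.

    clones_in_file maps a file name to the set of simple clone class IDs
    found in that file. Candidates are the pairwise intersections; each
    candidate is then matched against every file to collect its support.
    """
    candidates = set()
    for (_, a), (_, b) in combinations(clones_in_file.items(), 2):
        common = frozenset(a & b)
        if len(common) >= min_size:
            candidates.add(common)

    result = {}
    for pattern in candidates:
        files = sorted(
            name for name, clones in clones_in_file.items() if pattern <= clones
        )
        if len(files) >= min_support:
            result[pattern] = files
    return result


# Illustrative input only: file names and clone IDs are made up.
clones_in_file = {
    "F1": {1, 4, 8, 10},
    "F2": {1, 4, 7, 8, 10},
    "F3": {2, 5, 9},
    "F4": {2, 5, 9, 13},
    "F5": {3},
}

patterns = repeating_configurations(clones_in_file)
for pattern, files in patterns.items():
    print(sorted(pattern), "recurs in", files)
```

With this input, clone classes 1, 4, 8, 10 recur together in F1 and F2, and 2, 5, 9 recur together in F3 and F4. The same idea can be repeated one level up, treating files within directories the way simple clones are treated within files.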
Detecting Structural Clones [Figure: simple clones numbered 1 to 15, shown in files. Labels: "Simple clones in files" and "clone patterns". Listed sets: {1,4,8,10,11,12}, {1,4,7,8,10,11,12}, {2,5,9,13,15}]
Detecting Structural Clones: File Analysis [Figure: simple clones 1, 2, 4, 5, 7, 8, 9, 10, 11, 12, 13, 15 grouped by file]
Detecting Structural Clones: Directory Analysis [Figure: simple clones 1, 2, 4, 5, 7, 8, 9, 10, 11, 12, 13, 15 grouped by directory]
Detecting Structural Clones: File Level Structural Clone Across Directories [Figure: simple clones 1, 2, 4, 5, 7, 8, 9, 10, 13, 15]
Simple Clone Structure (SCS) Across Files [Figure: files F1 and F2 sharing the simple clone pairs (a1, a2), (b1, b2), (c1, c2)]
File Clone Class (FCC) [Figure: files F1 and F2]
File Clone Structure (FCS) Across / Within Directories [Figure: directories D1 and D2 with files F1, F2, X1, X2, Y1, Y2, Z1, Z2]
In earlier work • We hypothesized the benefits of structural clones • Re-engineering for reuse, architecture recovery, … • Defined structural clones • Implemented Clone Miner, a structural clone detector • Performed an initial empirical evaluation
In the work presented here • How frequent are the different types of structural clones? • Are structural clones more meaningful for program understanding and design recovery than simple clones? • What is the value added by structural clone detection in identifying refactoring opportunities?
1. How frequent are the different types of structural clones?
Structural Clones are Frequent • Simple clones tend to occur in groups • 56% of simple clones are within structural clones • There are fewer structural clones than simple clones
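A figure such as the share of simple clones that fall within structural clones could be computed as in the sketch below. The function name and all of the numbers are hypothetical, chosen only to show the arithmetic; they are not the data behind the 56% reported above.

```python
def coverage(simple_clones, structural_clones):
    """Fraction of simple clones that belong to at least one structural clone.

    simple_clones is a set of simple clone IDs; structural_clones is a list
    of sets, each holding the simple clone IDs that make up one structural
    clone.
    """
    if not simple_clones:
        return 0.0
    covered = set().union(*structural_clones) & simple_clones
    return len(covered) / len(simple_clones)


# Made-up example: 10 simple clones, 3 structural clones.
simple_clones = set(range(1, 11))
structural_clones = [{1, 2, 3}, {3, 4}, {7, 8}]

ratio = coverage(simple_clones, structural_clones)
print(f"{ratio:.0%} of simple clones are within structural clones")
```

In this made-up case six of the ten simple clones are covered by only three structural clones, which mirrors the slide's point that structural clones are fewer but account for a large share of the simple ones.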
Simple Clone Classes (SCC) and Simple Clone Structures (SCS)
Method Clone Classes (MCC) and Method Clone Structures (MCS)
2. Are structural clones more meaningful for program understanding and design recovery than simple clones?
Improved Program Understanding and Design Recovery • Analysis is more qualitative than quantitative • Anecdotal evidence from interesting examples of various types of structural clones • Larger program parts recovered as clones from a system are expected to be more meaningful than smaller ones • High-level structural clones such as FCC and FCS appear to be very useful for design recovery: because of their size, they highlight design-level similarities between various parts of the system