Software Architecture Reconstruction: An Approach Based on Combining Graph Clustering and Partitioning
Ioana Sora, Gabriel Glodean, Mihai Gligor
Department of Computers, Politehnica University of Timisoara
Context: Software Architecture
• What it is:
  • A model of the software system expressed at a high level of abstraction (interaction of "black box" elements)
• Why it is important:
  • An explicit representation of the system architecture is crucial for maintaining, understanding, and evaluating a large software application.
• Where to find it:
  • Architecture is not explicitly represented in the code, so it has to be documented
  • Problem: architectural documentation is missing or outdated
  • Solution: Software Architecture Reconstruction
    • A reverse engineering process that extracts information from code and maps it onto high-level design/architecture concepts
• Our goal: to develop a quasi-automatic reconstruction technique that yields a good-quality reconstructed architectural model (similar to the one extracted by a human expert).
Our reconstruction approach
• Construction of the analysis model
  • Lightweight dependency model
  • Operates on information extractable from compiled code (Java bytecode)
  • General OO model
• Reconstruction process
  • Goal: improving the quality/accuracy of the automatically reconstructed architectural models
  • Approach: clustering combined with partitioning
• Validation
  • Metric-based comparison of the automatically obtained results with reference solutions given by a human expert
Clustering
• Clustering software based on a similarity/dissimilarity metric derived only from direct coupling/cohesion does not provide satisfactory results
Our approach: clustering combined with partitioning
• Starting assumptions:
  • Two classes belonging to layers of different abstraction levels are highly unlikely to be part of the same architectural subsystem (even if there is a strong dependency between them).
  • Two classes belonging to the same layer have a higher chance of being part of the same architectural subsystem.
• Our solution: the similarity metric for clustering is the dependency strength weighted by the layer distance adjustment.
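The idea of weighting dependency strength by layer distance can be illustrated with a minimal sketch. The slides do not give the actual adjustment formula, so the function `layer_adjustment` and its `1 / 2**distance` shape are purely hypothetical; the only property taken from the slides is that classes in the same layer keep their full dependency strength while classes in distant layers are penalized.

```python
def layer_adjustment(layer_a: int, layer_b: int) -> float:
    """Hypothetical adjustment factor: 1.0 within a layer, halved per layer of distance.

    This shape is an assumption for illustration; the slides do not specify it.
    """
    return 1.0 / (2 ** abs(layer_a - layer_b))


def adjusted_similarity(strength: float, layer_a: int, layer_b: int) -> float:
    """Dependency strength weighted by the layer distance adjustment."""
    return strength * layer_adjustment(layer_a, layer_b)


# A strong dependency across two layers ends up weaker than
# a moderate dependency inside a single layer.
print(adjusted_similarity(8.0, 0, 2))  # 2.0
print(adjusted_similarity(3.0, 1, 1))  # 3.0
```

With this weighting, a clustering algorithm that merges the most similar classes first would prefer the same-layer pair, which matches the starting assumptions stated on the slide.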
Our approach: clustering combined with partitioning (cont)
• Partitioning preprocessing: layers result from applying a partitioning algorithm to the directed graph of dependencies.
[Figure: example dependency graph partitioned into layers; recovered node labels: 1, 2, 3 4 5 6,7 8]
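The slides do not name the partitioning algorithm. As one possible illustration, the sketch below assigns layers by longest dependency path: a class with no outgoing dependencies sits in layer 0, and every other class sits one layer above its highest dependency. The function `assign_layers`, the class names, and the assumption that the dependency graph is acyclic are all hypothetical (cyclic dependencies would first have to be collapsed, for example into strongly connected components).

```python
from functools import lru_cache


def assign_layers(deps: dict[str, set[str]]) -> dict[str, int]:
    """Map each class to a layer index, assuming `deps` describes an acyclic graph.

    `deps[c]` is the set of classes that class `c` depends on.
    """

    @lru_cache(maxsize=None)
    def layer(node: str) -> int:
        targets = deps.get(node, set())
        # Layer 0 for classes without dependencies, else one above the highest target.
        return 0 if not targets else 1 + max(layer(t) for t in targets)

    nodes = set(deps) | {t for targets in deps.values() for t in targets}
    return {n: layer(n) for n in sorted(nodes)}


# Hypothetical four-class system
deps = {"Ui": {"Service"}, "Service": {"Dao", "Util"}, "Dao": {"Util"}}
print(assign_layers(deps))
# {'Dao': 1, 'Service': 2, 'Ui': 3, 'Util': 0}
```

The resulting layer indices are what a layer distance between two classes would be computed from.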
Our approach: clustering combined with partitioning (cont)
• Clustering algorithms used as starting points:
  • Two graph-theoretical clustering algorithms that have already been tried for software clustering, with medium-to-low quality results:
    • ZMST
    • MMST
• Types of layer distance adjustments used:
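For readers unfamiliar with spanning-tree-based clustering, here is a generic sketch of the family these algorithms belong to. It is not the ZMST or MMST algorithm from the slides, whose details are not given here: it runs Kruskal's algorithm over edges sorted by decreasing similarity and stops at a cut-off, which is equivalent to building a maximum spanning tree and deleting its edges below the cut-off. The function name `mst_clusters` and the `min_similarity` parameter are assumptions for illustration.

```python
def mst_clusters(nodes, edges, min_similarity):
    """Cluster `nodes` by cutting weak edges of the maximum spanning tree.

    `edges` is a list of (node_a, node_b, similarity) tuples.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Kruskal: visit edges from most to least similar, skip those below the cut-off.
    for a, b, sim in sorted(edges, key=lambda e: e[2], reverse=True):
        if sim < min_similarity:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(sorted(c) for c in clusters.values())


edges = [("A", "B", 5.0), ("B", "C", 1.0), ("C", "D", 4.0)]
print(mst_clusters(["A", "B", "C", "D"], edges, min_similarity=2.0))
# [['A', 'B'], ['C', 'D']]
```

Feeding such an algorithm a layer-adjusted similarity instead of the raw dependency strength changes which edges survive the cut, and therefore which classes end up in the same cluster.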
Approach used for validation
• A good automatically produced clustering should approximate the clustering produced by a human expert (the architect).
• To validate our clustering technique, we measure how close the automatically obtained solution is to a reference solution.
• The two clustering solutions are compared automatically using the MoJo metric defined by Tzerpos and Holt.
• The MoJo metric measures the distance between two clusterings of the same system: it counts the minimum number of operations (moves and joins) needed to transform one clustering solution into the other.
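To make the move/join counting concrete, the sketch below computes a greedy upper bound on the number of operations needed to turn clustering A into clustering B. It is not the exact MoJo computation, which requires finding the true minimum. Each cluster of A is tagged with the B cluster it overlaps most; entities that do not belong to their cluster's tag are counted as moves, and A clusters sharing a tag are counted as joins. The function name and the dictionary-based input format are assumptions for illustration.

```python
from collections import Counter


def move_join_upper_bound(a: dict[str, str], b: dict[str, str]) -> int:
    """Upper bound on moves + joins needed to transform clustering `a` into `b`.

    `a` and `b` map every entity to its cluster label in each clustering.
    """
    overlap: dict[str, Counter] = {}  # cluster of A -> how its entities spread over B
    for entity, a_cluster in a.items():
        overlap.setdefault(a_cluster, Counter())[b[entity]] += 1

    moves = 0
    tags = set()
    for counts in overlap.values():
        tag, kept = counts.most_common(1)[0]  # B cluster with the largest overlap
        tags.add(tag)
        moves += sum(counts.values()) - kept  # entities that must leave this cluster

    joins = len(overlap) - len(tags)  # A clusters sharing a tag get merged
    return moves + joins


automatic = {"a": "X", "b": "X", "c": "X", "d": "Y", "e": "Y"}
reference = {"a": "P", "b": "P", "c": "Q", "d": "Q", "e": "Q"}
print(move_join_upper_bound(automatic, reference))  # 1
```

In the example a single move (entity `c` from cluster `X` to cluster `Y`) makes the automatic clustering identical to the reference, so the distance is 1; a smaller value means the automatic result is closer to the expert's decomposition.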
Results – MST clustering with/without layer distance adjustment, with different settings, applied to ARTkernel
Results – MMST clustering with/without layer distance adjustment, with different settings, applied to ARTkernel
Results – Impact of layer distance adjustment on MST clustering on several systems
Results – Impact of layer distance adjustment on MMST clustering on several systems
Conclusions
• Our approach to automatic software architecture reconstruction combines traditional clustering approaches with partitioning
  • The “layer distance adjustment” for the coupling/cohesion similarity metric
• Advantage of our approach: it improves the quality of the automatically reconstructed architectural model, bringing it closer to the one extracted by a human expert.
• The approach has been validated for 2 clustering algorithms, on several software systems
• Future work:
  • Prove the generality of this conclusion by applying layer distance adjustments to more clustering algorithms
  • Investigate other types of “adjustments” for the similarity metric