CLUSTERING SUPPORT FOR FAULT PREDICTION IN SOFTWARE

Maria La Becca, Dipartimento di Matematica e Informatica, University of Basilicata, Potenza, Italy, marialabecca@gmail.com. Topics: Fault Prediction Approaches; Lexical & Structural Information; New SW Clustering Approach; Process Metrics.


Presentation Transcript


  1. CLUSTERING SUPPORT FOR FAULT PREDICTION IN SOFTWARE. Maria La Becca, Dipartimento di Matematica e Informatica, University of Basilicata, Potenza, Italy. marialabecca@gmail.com

  2. INTRODUCTION: Fault Prediction Approaches
     • Fault Predictors: Product Metrics, Process Metrics (Component / Package Level)
     • New SW Clustering Approach: Lexical & Structural Information (Cluster Level Predictor)
     • Fault Prediction Models
     • Goal: SW Quality, Testing, Refactoring

  3. Software Clustering Approach: Steps (SOFTWARE CLUSTERING)
     • Lexical
       1 - Corpus creation: OO SW system; identifier and comment terms collected into documents D1, D2, ..., Dn
       2 - Corpus normalization: splitting identifiers, special token elimination, stop word removal, stemming
       3 - Corpus indexing: Vector Space Model (VSM), term-by-document matrix
       4 - Computing similarities (lexically similar)
     • Structural
       5 - Extracting dependencies: JRipples (structurally dependent)
       6 - Clustering: BorderFlow algorithm on the graph G' = (V, E, ω)
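The lexical steps listed on slide 3 (normalization, indexing, computing similarities) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration, not the author's tooling: the function names, the tiny stop-word list, the `naive_stem` suffix stripper (a stand-in for a real stemmer), raw term counts as VSM weights, and cosine similarity as the similarity measure. The slides do not state which weighting scheme or similarity measure was used.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list (a few Java keywords and English words).
STOP_WORDS = {"class", "int", "void", "public", "return", "the", "a", "of"}

def split_identifier(identifier):
    """Split camelCase / snake_case identifiers into lower-case words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", identifier)
    # Splitting on non-letters also drops digits and underscores
    # (a crude form of special token elimination).
    return [part.lower() for part in re.split(r"[^A-Za-z]+", spaced) if part]

def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def normalize(text):
    """Corpus normalization: split, remove stop words, stem."""
    terms = []
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text):
        for word in split_identifier(token):
            if word not in STOP_WORDS:
                terms.append(naive_stem(word))
    return terms

def term_document_matrix(documents):
    """Corpus indexing: map each document name to its term counts."""
    return {name: Counter(normalize(text)) for name, text in documents.items()}

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy corpus: one "document" per class (hypothetical source snippets).
docs = {
    "FaultCounter": "class FaultCounter { int faultCount; }",
    "FaultStats": "int countFaults() // counting faults",
    "Window": "void drawWindow()",
}
matrix = term_document_matrix(docs)
similarity = cosine(matrix["FaultCounter"], matrix["FaultStats"])
```

In a pipeline like the one on the slide, such pairwise similarities would presumably contribute to the edge weights ω of the graph handed to the clustering step; how they are combined with the structural dependencies is not stated in the transcript.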

  4. Fault Prediction Models (FAULT PREDICTION MODELS)
     • Classes: Lexically Similar, Structurally Dependent
     • Product Metrics
     • Multivariate Linear Regression
     • Logistic Regression

  5. Definition and Context (CASE STUDY)
     • Baseline Approach (Class & Package) vs. SW Clustering Approach
     • Different (!=): granularity level of the fault prediction models (Class & Package Granularity Level vs. Cluster Granularity Level)
     • Same (=): Metrics; SWLR and LGR
     • RQ: Does the cluster-level approach improve fault prediction as compared with the baseline (i.e., class and package level)?
     • Source Code: 15 Releases & Popularity
     • Dataset: SW Metrics & Faults

  6. Planning (CASE STUDY)
     • OO SW System; Previous Knowledge; Selected Variables
     • Training Set and Test Set: INTRA-release and INTER-release, over releases X.0 and X.1
     • Fault Prediction; Empirical Evaluation

  7. Validation and Evaluation: Intra- & Inter-Release Analysis (CASE STUDY)
     • K-Fold Cross Validation: K rounds; results averaged over the rounds, to assess and compare the SWLR and LGR predictors
     • Intra-Release: training set and test set both taken from the dataset of version X.0
     • Inter-Release: training set and test set taken from the datasets of different versions (V X.0 and V X.1)
     • SWLR Models: SAR (close to 1)
     • SWLR Predictors: Kendall τ & Spearman ρ, in [-1; +1]
     • LGR Models: AIC & RD (lower values indicate better goodness of fit)
     • LGR Predictors: Precision, Recall, F-measure
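The k-fold procedure and the precision, recall, and F-measure named on slide 7 can be sketched as follows. This is a minimal illustration under stated assumptions: contiguous folds without shuffling or stratification, binary fault labels (1 = faulty), and a hypothetical `fit_predict` callback standing in for whatever predictor is trained in each round. None of these details come from the slides.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop

def precision_recall_f(actual, predicted):
    """Precision, recall, and F-measure for binary labels (1 = faulty)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

def cross_validate(features, labels, k, fit_predict):
    """Run k rounds and average precision, recall, F-measure over them.

    fit_predict(train_x, train_y, test_x) is a hypothetical callback that
    trains a predictor on the training fold and returns predicted labels
    for the test fold.
    """
    scores = []
    for train, test in k_fold_indices(len(labels), k):
        predicted = fit_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in test],
        )
        scores.append(precision_recall_f([labels[i] for i in test], predicted))
    return tuple(sum(score[j] for score in scores) / k for j in range(3))
```

For an inter-release setting, the same `precision_recall_f` function would apply unchanged: the predictor is trained on the dataset of one version and its predictions are scored against the dataset of another, with no folding needed.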

  8. Results (RESULTS)
     • OO Software System; Intra- and Inter-Release
     • 6 predictors: 3 SWLR (Cluster + Baseline) and 3 LGR (Cluster + Baseline)
     • Legend: Best Values, Worst Values, No Prevalence

  9. CONCLUSION. Thanks. Acknowledgements: Carmine Gravino, Andrian Marcus, Tim Menzies, Giuseppe Scanniello
