Software Defect Survey


Presentation Transcript


  1. Software Defect Survey • CS598 YYZ • James Newell, Lin Tan

  2. Outline • Motivation • Defect characteristics • Generally True • Controversial • Use of the results • Prediction • Classification • Conclusions

  3. Motivation • Many papers study bug characteristics • 2-3 applications • Open source vs. closed source software • Various results • What is generally true? • What are the differences between open source software and closed source software • In terms of bug characteristics • What has been done to use these results? • How well have we done in prediction?

  4. Defect Characteristics • A small number of modules contains most of the bugs. • Support: Commercial [Basili84] [Compton90] [Munson92] [Ohlsson96] [Kaaniche96] [Fenton00] [Ostrand02] [Pighin03] [Ostrand05] • Weak Support: 30% of files contain 55% of bugs. Open source [Chou01]

  5. Defect Characteristics • A small number of modules contains most of the bugs. – Simply because these modules contain most of the code • Support: Commercial [Compton90] • Reject/Weak Reject: Commercial [Kaaniche96] [Fenton00] [Ostrand02] [Ostrand05]
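The size explanation on this slide can be checked by comparing the share of faults held by the most fault-prone modules with the share of code those same modules hold. The following is a minimal sketch of that comparison; the function name `concentration` and the module data are hypothetical and do not come from any of the cited studies.

```python
def concentration(modules, fraction=0.2):
    """Return (fault_share, code_share) for the most fault-prone modules.

    modules:  list of (lines_of_code, fault_count) pairs.
    fraction: portion of modules to select, ranked by fault count.
    """
    ranked = sorted(modules, key=lambda m: m[1], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top = ranked[:k]
    total_loc = sum(loc for loc, _ in modules)
    total_faults = sum(faults for _, faults in modules)
    fault_share = sum(faults for _, faults in top) / total_faults
    code_share = sum(loc for loc, _ in top) / total_loc
    return fault_share, code_share


# Hypothetical data: (lines of code, faults) for ten modules.
modules = [(500, 30), (400, 20), (1200, 5), (900, 4), (800, 3),
           (700, 3), (600, 2), (300, 2), (350, 1), (250, 0)]

fault_share, code_share = concentration(modules, 0.2)
print(f"top 20% of modules: {fault_share:.0%} of faults, {code_share:.0%} of code")
```

When the fault share is much larger than the code share, as in this made-up data, size alone does not account for the concentration, which is the pattern the rejecting studies report.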

  6. Defect Characteristics • Files with the largest numbers of faults in an early release seem more likely to have large numbers of faults in the next release and later releases. • Support: Commercial [Ostrand02] [Ostrand05] [Pighin03]

  7. Defect Characteristics • Faultier in PRE-releases, faultier in POST-releases • Reject: Commercial [Fenton00] [Ostrand02]

  8. Defect Characteristics • OSS developments exhibit very rapid responses to customer problems. • 50% of problems are resolved within 1 day, 75% within 42 days, and 90% within 140 days. • The higher the priority, the faster the fix. • Priority: how many users are affected by the bug. [Mockus00] [Mockus02]

  9. Defect Characteristics • Size is the best predictor when assessed in terms of number of faults; not good if considering fault density. • Support: Commercial [Fenton00] [Ostrand05]

  10. Defect Characteristics • The smaller the average component size, the more satisfied the users. • Support: Open Source [Stamelos02] • User satisfaction is subjective

  11. Defect Characteristics • Defect density in open source releases is lower than that of commercial code that has received a comparable level of testing. • Support: [Mockus00] [Mockus02] [Paulson04] • Weak reject: [Stamelos02]

  12. Outline • Motivation • Defect characteristics • Generally True • Controversial • Use of the results • Prediction • Classification • Conclusions

  13. Defect Characteristics • Most bugs in released software are transient bugs. • Support: Commercial [Gray91] • Reject: • Commercial [Sullivan91] [Sullivan92] [Lee93] • Open Source [Chandra00]

  14. Defect Characteristics • Small modules are more fault-prone. • No relationship: Commercial [Fenton00] • Support/Weak Support: Commercial [Basili84] [Moller95] [Ostrand02] • Reject: Open Source [Chou01]

  15. Defect Characteristics • Newly written files are more likely to be faulty than old files. • Support: Commercial [Ostrand02]; Open Source [Chou01] • No substantial difference: Commercial [Pighin03]

  16. Generally True Characteristics • A small number of modules contains most of the bugs. • Files with the largest numbers of faults in an early release seem more likely to have large numbers of faults in the next release and later releases. • Faultier in PRE-releases, less faulty in POST-releases (faultier: higher number of faults) • Defect density in open source releases is lower than that of commercial code that has received a comparable level of testing.

  17. Outline • Motivation • Defect characteristics • Generally True • Controversial • Use of the results • Prediction • Classification • Conclusions

  18. Prediction - Commercial • Top 20% of files: ~80% of faults [Ostrand04] [Ostrand05] • Negative Binomial Regression Model • Ideal: Top 20% of files: ~100% of faults • Top 20% of files: ~47% of faults [Ohlsson96] • Four models: equivalent performance • Ideal: 20% of files contain 60% of faults
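The "top 20% of files" figures on this slide can be read as: rank files by predicted fault count, take the top 20%, and measure what share of the actual faults they contain. The "ideal" figure is the same measure with files ranked by their actual fault counts. The following is a minimal sketch of that evaluation only; it does not implement negative binomial regression, and the function name `faults_captured` and all numbers are hypothetical stand-ins for a model's output.

```python
def faults_captured(predicted, actual, fraction=0.2):
    """Share of actual faults found in the top `fraction` of files,
    where files are ranked by their predicted fault counts."""
    k = max(1, round(len(actual) * fraction))
    order = sorted(range(len(actual)), key=lambda i: predicted[i], reverse=True)
    return sum(actual[i] for i in order[:k]) / sum(actual)


# Hypothetical per-file data: actual faults and a model's predicted faults.
actual = [12, 0, 3, 25, 1, 0, 7, 2, 0, 0]
predicted = [9.5, 0.4, 6.0, 10.0, 1.2, 0.3, 9.8, 2.0, 0.1, 0.6]

model_share = faults_captured(predicted, actual)   # ranking from the model
ideal_share = faults_captured(actual, actual)      # best possible ranking
print(f"model: {model_share:.0%} of faults, ideal: {ideal_share:.0%}")
```

Reporting the ideal share alongside the model's share shows how much of the gap is due to the model and how much is a ceiling set by the fault distribution itself.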

  19. Classification – Open Source • Classify bugs according to root causes [Podgurski03] • Motivation • Accuracy: In 71-86% of clusters, the majority has the same root cause • Auto-Assign bugs to developers [Cubranic04] • Motivation • Bayesian learning approach • Accuracy: 30%
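The Bayesian learning approach to assigning bugs can be illustrated with a small multinomial naive Bayes text classifier. This is a sketch under stated assumptions, not the method of [Cubranic04] as published: the whitespace tokenization, the Laplace smoothing, the developer names, and the bug reports are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(reports):
    """Build word and class counts from (report_text, developer) pairs."""
    word_counts = defaultdict(Counter)  # developer -> word frequencies
    dev_counts = Counter()              # developer -> number of reports
    vocab = set()
    for text, dev in reports:
        words = text.lower().split()
        word_counts[dev].update(words)
        dev_counts[dev] += 1
        vocab.update(words)
    return word_counts, dev_counts, vocab


def predict(model, text):
    """Return the developer with the highest posterior log-probability."""
    word_counts, dev_counts, vocab = model
    total_reports = sum(dev_counts.values())
    best_dev, best_score = None, -math.inf
    for dev in dev_counts:
        score = math.log(dev_counts[dev] / total_reports)  # class prior
        total_words = sum(word_counts[dev].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log((word_counts[dev][word] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_dev, best_score = dev, score
    return best_dev


# Hypothetical training data: past reports and who fixed them.
reports = [
    ("crash when rendering table layout", "alice"),
    ("layout engine draws table border wrong", "alice"),
    ("network timeout on http request", "bob"),
    ("http proxy connection refused", "bob"),
]
model = train(reports)
print(predict(model, "table layout broken"))
print(predict(model, "http connection timeout"))
```

With real bug trackers there are many developers and noisy report text, which is consistent with the modest accuracy reported on the slide.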

  20. Classification • Error rates of device drivers are 3-7 times higher than those of the rest of the kernel. [Chou01] • Bug lifetime: 1.8 years [Chou01] • Open Source • Undefined state errors dominate the error type distribution [Sullivan92] • Commercial

  21. Conclusions • Many controversial results • Not many studies on OSS • No general characteristics for OSS • Prediction accuracy is reasonably good • Different prediction models produce similar accuracy • Classification reveals interesting results

  22. Questions? Comments?

  23. Reference • [Basili84] Software Errors and Complexity: An Empirical Investigation, Comm. ACM • [Chandra00] • [Chou01] An Empirical Study of Operating System Errors, OSDI • [Compton90] Prediction and Control of ADA Software Defects, J. Systems Software • [Cubranic04] Automatic Bug Triage Using Text Categorization, SEKE • [Fenton00] Quantitative Analysis of Faults and Failures in a Complex Software System, TSE • [Gray91] Why Do Computers Stop and What Can Be Done About It? Technical Report • [Kaaniche96] Software Reliability Analysis of Three Successive Generations of a Switching System, EDCC-1 • [Lee93] • [Mockus00] A Case Study of Open Source Software Development: The Apache Server, ICSE • [Mockus02] Two Case Studies of Open Source Software Development: Apache and Mozilla, TOSEM • [Moller95]

  24. Reference • [Munson92] The Detection of Fault-Prone Programs, TSE • [Ohlsson96] Predicting Fault-Prone Software Modules in Telephone Switches, TSE • [Ostrand02] The Distribution of Faults in a Large Industrial Software System, ISSTA • [Ostrand04] • [Ostrand05] Predicting the Location and Number of Faults in Large Software Systems, TSE • [Paulson04] An Empirical Study of Open-Source and Closed-Source Software Products, TSE • [Pighin03] An Empirical Analysis of Fault Persistence through Software Releases, ISESE • [Podgurski03] Automated Support for Classifying Software Failure Reports, ICSE • [Stamelos02] Code Quality Analysis in Open Source Software Development, Info Systems J • [Sullivan91] • [Sullivan92]
