Cooperative Developer Testing: How Human and Machine Cooperate to Get the Job Done

Presentation Transcript


  1. Cooperative Developer Testing: How Human and Machine Cooperate to Get the Job Done. Tao Xie, North Carolina State University. In collaboration with Xusheng Xiao @ NCSU ASE, Nikolai Tillmann and Peli de Halleux @ Microsoft Research, and students

  2. Why Automate Testing? • Software testing is important • Software errors cost the U.S. economy about $59.5 billion each year (0.6% of the GDP) [NIST 02] • Improving testing infrastructure could save 1/3 of that cost [NIST 02] • Software testing is costly • Can account for up to half the total cost of software development [Beizer 90] • Automated testing reduces manual testing effort • Test execution: JUnit, NUnit, xUnit, etc. • Test generation: Pex, AgitarOne, Parasoft Jtest, etc. • Test-behavior checking: Pex, AgitarOne, Parasoft Jtest, etc.

  3. Software Testing Problems [Diagram: Test Inputs → Program → Outputs; Test Oracles check actual Outputs against Expected Outputs] • Test Generation (machine) • Generating high-quality test inputs (e.g., achieving high code coverage) • Test Oracles (human) • Specifying high-quality test oracles (e.g., guarding against various faults)

  4. Test Generation • Human • Expensive, incomplete, … • Brute force • Pairwise, predefined data, etc. • Random • Cheap, fast • Gives only an "it passed a thousand tests" feeling • Dynamic symbolic execution: Pex, CUTE, EXE • Automated, white-box • Not random – constraint solving

  5. Dynamic Symbolic Execution • Loop: choose next path → solve → execute & monitor, negating one branch condition per iteration, until no path is left • Code to generate inputs for:

```csharp
void CoverMe(int[] a) {
    if (a == null) return;
    if (a.Length > 0)
        if (a[0] == 1234567890)
            throw new Exception("bug");
}
```

| Data | Observed constraints | Constraint to solve (negated condition) |
|------|----------------------|-----------------------------------------|
| null | a==null | a!=null |
| {} | a!=null && !(a.Length>0) | a!=null && a.Length>0 |
| {0} | a!=null && a.Length>0 && a[0]!=1234567890 | a!=null && a.Length>0 && a[0]==1234567890 |
| {1234567890} | a!=null && a.Length>0 && a[0]==1234567890 | Done: there is no path left |
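
As a usage sketch of how such a method is handed to a DSE tool, a parameterized unit test in Pex's style might look as follows (attribute names follow Pex's documented API; the test-class name and wiring are hypothetical):

```csharp
using System;
using Microsoft.Pex.Framework;

[PexClass]
public partial class CoverMeTest {
    static void CoverMe(int[] a) {
        if (a == null) return;
        if (a.Length > 0)
            if (a[0] == 1234567890)
                throw new Exception("bug");
    }

    // Pex explores this parameterized unit test, negating one branch
    // condition at a time to derive the inputs null, {}, {0}, {1234567890}.
    [PexMethod]
    public void DriveCoverMe(int[] a) {
        CoverMe(a);
    }
}
```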

  6. Pex: Visual Studio Power Tool http://research.microsoft.com/projects/pex/ • Download counts (20 months: Feb. 2008 – Oct. 2009) • Academic: 17,366 • DevLabs: 13,022 • Total: 30,388

  7. Challenges of DSE • Loops/path explosion • Fitnex [Xie et al. DSN 09] • Method sequences • MSeqGen [Thummalapenta et al. ESEC/FSE 09] • External methods or environments, e.g., file systems, network, databases • Parameterized mock objects [Taneja et al. ASE 10-sp] Opportunities • Regression testing [Taneja et al. ICSE 09-nier] • Manually written unit tests [Thummalapenta et al. FASE 11] • Developer guidance (cooperative developer testing) [Xiao et al. ICSE 11]

  8. Open Source Pex Extensions http://pexase.codeplex.com/ Publications: http://research.microsoft.com/en-us/projects/pex/community.aspx#publications

  9. Problems Faced by DSE

  10. DSE Challenges - Preliminary Study • External-method call problems (EMCPs): 44 reported vs. 0 real • Object-creation problems (OCPs): 18 reported vs. 5 real

  11. DSE Challenges - Preliminary Study • Object-creation problems (OCPs) – 64.79% • External-method call problems (EMCPs) – 26.76% • Boundary problems – 5.63% • Limitations of the constraint solver used – 2.82% Preliminary results show that the total block coverage achieved is 49.87%, with the lowest coverage being 15.54%.

  12. External-Method Call Problems (EMCP) Example • Example 1: • File.Exists has data dependencies on the program input • The subsequent branch at Line 1 uses the return value of File.Exists • Example 2: • Path.GetFullPath has data dependencies on the program input • Path.GetFullPath throws exceptions • Example 3: String.Format does not cause any problem
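
The three cases can be sketched in code as follows (a hypothetical method reconstructed to match the slide's bullets; "Line 1" refers to the File.Exists branch):

```csharp
using System;
using System.IO;

public class EmcpExamples {
    public static void Process(string path) {
        // Example 1: the branch ("Line 1") depends on the return value of the
        // external call File.Exists, which in turn depends on the program input.
        if (File.Exists(path)) {
            // stays uncovered unless a file with that name really exists
        }

        // Example 2: Path.GetFullPath also depends on the program input and
        // throws for invalid paths, so code after the call site may never run.
        string full = Path.GetFullPath(path);

        // Example 3: String.Format causes no problem; no branch or exception
        // depends on its result.
        Console.WriteLine(String.Format("processed {0}", full));
    }
}
```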

  13. Object-Creation Problems (OCP) Example • To cover the true branch at Line 5, tools need to generate a sequence of method calls: • Stack s1 = new Stack(); • s1.Push(new object()); • … • s1.Push(new object()); • FixedSizeStack s2 = new FixedSizeStack(s1); • Most tools cannot generate such a sequence • The true branch at Line 5 has data dependencies on stack.items (a List<object>); stack.Count() returns the size of stack.items
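
A minimal sketch of the running example's shape (the class bodies and the capacity of 10 are reconstructions from the slides, not the original source); the branch at "Line 5" guarded by stack.Count() is what the tools struggle to reach:

```csharp
using System;
using System.Collections.Generic;

public class Stack {
    private List<object> items = new List<object>();
    public void Push(object o) { items.Add(o); }
    public int Count() { return items.Count; } // returns the size of items
}

public class FixedSizeStack {
    private Stack stack;
    public FixedSizeStack(Stack s) { stack = s; }
    public void Push(object o) {
        if (stack.Count() == 10) // "Line 5": reachable only after ten pushes
            throw new InvalidOperationException("stack is full");
        stack.Push(o);
    }
}
```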

  14. Cooperative Developer Testing • Developers provide guidance to help tools achieve higher structural coverage • Apply tools to generate tests • Tools report achieved coverage & problems • Developers provide guidance • EMCP: instrumentation or mock objects • OCP: factory methods (see the sketch below)
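
For the OCP above, such guidance could be a factory method in the style Pex supports (a sketch: the attribute and assumption helper follow Pex's documented API; the stack classes are the running example's reconstructed shapes):

```csharp
using Microsoft.Pex.Framework;

public static class StackFactories {
    // Teaches the tool how to build a FixedSizeStack; DSE chooses `count`,
    // so branches that need, e.g., ten prior pushes become reachable.
    [PexFactoryMethod(typeof(FixedSizeStack))]
    public static FixedSizeStack Create(int count) {
        PexAssume.IsTrue(count >= 0 && count <= 16); // keep exploration small
        var s = new Stack();
        for (int i = 0; i < count; i++)
            s.Push(new object());
        return new FixedSizeStack(s);
    }
}
```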

  15. Existing Solution for Problem Identification • Existing solution (e.g., in Pex) • identify all external-method calls in the program • report all the non-primitive object types of program inputs and their fields • Limitations • the number of reports can be high • some identified problems are irrelevant: they are not the reason the tools fail to achieve high structural coverage

  16. DSE Challenges - Preliminary Study • EMCPs: 44 reported vs. 0 real • OCPs: 18 reported vs. 5 real

  17. Proposed Approach: Covana • Precisely identify the problems that tools face in achieving structural coverage • Insight: not-covered branches have data dependencies on the candidates of real problems • Three main steps: • Problem Candidate Identification • Forward Symbolic Execution • Data Dependence Analysis [Xiao et al. ICSE 11]

  18. Overview of Covana [Architecture diagram: Program / PUT → Problem Candidate Identification → Problem Candidates → Forward Symbolic Execution → Runtime Information (Generated Test Inputs, Runtime Events, Coverage) → Data Dependence Analysis → Identified Problems]

  19. Overview of Covana (focus: Problem Candidate Identification) [Same architecture diagram as above]

  20. Problem Identification • EMCP Candidate Identification • External-method calls whose arguments have data dependencies on program inputs (e.g., NOT method calls that print constant strings or put a thread to sleep for some time) • OCP Candidate Identification • Only non-primitive argument types (e.g., NOT int, boolean, double)

  21. Example: EMCP Candidate Identification [Code figure: data dependencies flow from the program input to the external-method calls, marking them as EMCP candidates]

  22. Example: OCP Candidate Identification • OCP candidates: • FixedSizeStack • FixedSizeStack.stack • Stack.items • object

  23. Overview of Covana (focus: Forward Symbolic Execution) [Same architecture diagram as above]

  24. Forward Symbolic Execution • Turn elements of problem candidates symbolic • EMCP: return values of external-method calls • OCP: non-primitive program inputs and their fields • Perform symbolic execution (e.g., DSE with Pex) • Collect runtime information • Symbolic expressions in branches • Uncaught exceptions

  25. Overview of Covana (focus: Data Dependence Analysis) [Same architecture diagram as above]

  26. Data Dependence Analysis • Symbolic expression: return(File.Exists) == true • Element of EMCP candidate: return(File.Exists) • Result: the branch statement at Line 1 has a data dependency on File.Exists at Line 1

  27. EMCP Analysis • Data dependence analysis: • partially-covered branch statements have data dependencies on EMCP candidates through their return values • Exception analysis: • extract external-method calls from the exception trace • the parts of the program after the call site of the external-method call are not covered

  28. Example EMCP Analysis • The branch statement at Line 1 has a data dependency on File.Exists at Line 1, and the false branch at Line 1 is not covered → File.Exists is reported • Code after Line 6 is not covered, and Path.GetFullPath throws exceptions for all executions → Path.GetFullPath is reported

  29. OCP Analysis • Data dependence analysis for partially-covered branch statements • data dependencies on a non-primitive program input → report the program input • data dependencies on fields of a program input → report the object type of the field directly?

  30. Example OCP Analysis • stack.Count() returns the size of the field stack.items • The true branch at Line 5 is not covered → report List<object>, the object type of stack.items • False warning! An object of type List<object> cannot be used by the tools: it is not assignable to the field Stack.items by invoking a public constructor or a public setter method of its declaring class Stack

  31. FixedSizeStack Field Declaration Hierarchy • FixedSizeStack.stack • Stack.items • Reflection can construct this hierarchy: first look at all fields of FixedSizeStack, then all fields of FixedSizeStack.stack, and finally Stack.items
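
A minimal sketch of that reflection walk (assuming the reconstructed stack classes above; the depth limit guards against recursive types):

```csharp
using System;
using System.Reflection;

static class FieldHierarchy {
    public static void Print(Type type, string indent, int depth) {
        if (depth == 0) return;
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
        foreach (FieldInfo f in type.GetFields(flags)) {
            // e.g., FixedSizeStack.stack, then Stack.items beneath it
            Console.WriteLine(indent + f.DeclaringType.Name + "." + f.Name);
            if (!f.FieldType.IsPrimitive)
                Print(f.FieldType, indent + "  ", depth - 1);
        }
    }
}
// FieldHierarchy.Print(typeof(FixedSizeStack), "", 3) visits
// FixedSizeStack.stack first and Stack.items under it.
```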

  32. OCP Analysis Algorithm • If the candidate is only the program input, report it directly • Otherwise, check whether the field is assignable for its declaring class: • if so, report the field itself • if not, report its declaring class (see the sketch below)
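
A rough sketch of that decision logic (the assignability check here is a crude stand-in; Covana's actual analysis is richer):

```csharp
using System;
using System.Linq;
using System.Reflection;

static class OcpReporter {
    // Stand-in for "assignable for its declaring class": the declaring class
    // exposes a public constructor or public instance method with a parameter
    // of the field's type through which the field could plausibly be set.
    static bool IsAssignableForDeclaringClass(FieldInfo f) {
        Type decl = f.DeclaringType;
        return decl.GetConstructors()
                   .Concat<MethodBase>(decl.GetMethods(
                       BindingFlags.Public | BindingFlags.Instance))
                   .Any(m => m.GetParameters()
                              .Any(p => p.ParameterType == f.FieldType));
    }

    public static string Report(FieldInfo field, Type programInputType) {
        if (field == null)                        // only the program input
            return programInputType.Name;         // report it directly
        if (IsAssignableForDeclaringClass(field))
            return field.FieldType.Name;          // report the field itself
        return field.DeclaringType.Name;          // else report its declaring class
    }
}
```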

  33. Implementation • An extension to Pex • identify problem candidates • turn elements of problem candidates symbolic • collect runtime information • Data dependence analyzer • analyze runtime information to identify problems • Graphical user interface (GUI) component • show identified problems with detailed analysis information

  34. Evaluation – Subjects and Setup • Subjects: • xUnit: unit testing framework for .NET • 223 classes and interfaces with 11.4 KLOC • QuickGraph: C# graph library • 165 classes and interfaces with 8.3 KLOC • Evaluation setup: • Pex with the implemented extension as our DSE test-generation tool • Apply Pex to generate tests for the program under test • Collect coverage and runtime information for identifying EMCPs and OCPs

  35. Evaluation – Research Questions • RQ1: How effective is Covana in identifying the two main types of problems, EMCPs and OCPs? • RQ2: How effective is Covana in pruning irrelevant problem candidates of EMCPs and OCPs?

  36. Evaluation – RQ1: Problem Identification • Covana identifies • 43 EMCPs with only 1 false positive and 2 false negatives • 155 OCPs with 20 false positives and 30 false negatives

  37. Example Identified OCP • For ClassStart, Pex achieved block coverage of 2/27 (7.41%) • Higher coverage requires the field typeUnderTest of TestClassCommand to be non-null and to implement at least one interface • typeUnderTest is assignable for TestClassCommand → report ITypeInfo, the object type of typeUnderTest, as an OCP

  38. Evaluation – RQ2: Irrelevant-Problem-Candidate Pruning • Covana prunes • 97.33% (1,567 of 1,610) of EMCP candidates with 1 false positive and 2 false negatives • 65.63% (296 of 451) of OCP candidates with 20 false positives and 30 false negatives

  39. Discussion: Assisting Other Structural Test-Generation Approaches • Automatic mock-object generation: needs to deal only with the external-method calls of EMCPs • Random approaches: can assign higher probability to exploring the object types of OCPs • Advanced method-sequence-generation approaches (e.g., MSeqGen): need to deal only with the object types of OCPs

  40. Software Testing [Diagram: Test Inputs → Program → Outputs; Test Oracles check actual Outputs against Expected Outputs] • Test Generation (machine) • Generating high-quality test inputs (e.g., achieving high code coverage) • Test Oracles (human) • Specifying high-quality test oracles (e.g., guarding against various faults)

  41. Regression Test Generation • Given a method f(x) (old version) and g(x) (new version), synthesize a meta-program whose branch coverage encodes the assertion h(x) := Assert(f(x) == g(x)): if (f(x) != g(x)) throw new Exception("changed behavior!"); • Complications: • What if x is of a non-primitive type? Deep clone, method-sequence generation, … • How to compare receiver objects? Deep state comparison, … [Taneja and Xie, ASE 08-sp]
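
A minimal sketch of the synthesized meta-program (f and g are placeholder versions, not from the talk; they disagree for negative odd x, which is exactly the kind of input DSE would find):

```csharp
using System;

public static class RegressionDemo {
    static int f(int x) { return x / 2; }   // old version (placeholder)
    static int g(int x) { return x >> 1; }  // new version (placeholder)

    // The synthesized meta-program: DSE searches for an input that covers
    // the throw, i.e., an x where the two versions disagree (e.g., x = -1:
    // -1 / 2 == 0 but -1 >> 1 == -1).
    public static void h(int x) {
        if (f(x) != g(x))
            throw new Exception("changed behavior!");
    }
}
```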

  42. Migrating Pex to the Web/Cloud • Try it at http://www.pexforfun.com/ • Engineering Pex for serious games in computer science • Training problem-solving/programming skills and abstraction skills • Demo

  43. Software Testing [Diagram: Test Inputs → Program → Outputs; Test Oracles check actual Outputs against Expected Outputs] • Test Generation (machine) • Generating high-quality test inputs (e.g., achieving high code coverage) • Test Oracles (human) • Specifying high-quality test oracles (e.g., guarding against various faults)

  44. Thank you! Questions? https://sites.google.com/site/asergrp/

  45. Observation of Path Condition • Path condition that leads to the true branch at Line 5: FixedSizeStack s3 = new FixedSizeStack() ∧ Stack s2 = s3.stack ∧ List<object> s1 = s2.items ∧ int s0 = s1._size ∧ s0 == 10 • This path condition contains all the required fields, since all of them are assigned symbolic values


  47. FixedSizeStack: Constructing the Field Declaration Hierarchy • FixedSizeStack.stack • Stack.items • Extract fields from the path condition and construct a field declaration hierarchy: FixedSizeStack s3 = new FixedSizeStack() ∧ Stack s2 = s3.stack ∧ List<object> s1 = s2.items ∧ int s0 = s1._size ∧ s0 == 10

  48. Discussion cont. • Static fields • initialized inside the class • previous tests can side-effect the symbolic analysis through them • Concrete arguments of external-method calls • using constant strings to access the external environment • affecting achieved coverage

  49. Discussion cont. • Other potential issues • argument side effects of external-method calls • control dependencies • static analysis • Future work • carry out experiments to evaluate the effectiveness of incorporating these three additional features
