
Abstract Interpretation and Future Program Analysis Problems

Martin Rinard and Alexandru Salcianu, Laboratory for Computer Science, Massachusetts Institute of Technology


Presentation Transcript


  1. Abstract Interpretation and Future Program Analysis Problems
     Martin Rinard, Alexandru Salcianu
     Laboratory for Computer Science, Massachusetts Institute of Technology

  2. Abstract Interpretation: The Early Years
     • Formal Connection Between
         • Sound analysis of program
         • Execution of program
     • Broader Impact
         • Insight that analysis is execution
         • Reduced need to think of analysis as reasoning about all possible executions!
     • Good fit with analysis problems of that era
         • Properties of local variables
         • Within single procedure

  3. How Is Abstract Interpretation Holding Up?
     • Technical result as relevant as ever
     • Moore’s Law effects
         • Much more computing power for analysis
         • More complex programs
     • Ambitious analyses
         • Heap properties
         • Multiple threads
         • Interprocedural partial program analyses
     • Stretch intuitive vision of analysis as execution

  4. Outline
     • Combined pointer and escape analysis
         • Rationale behind design decisions
         • Alternative choices in design space
     • Challenges and Predictions
     • Bigger Picture

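  5. Goal of Pointer Analysis
     • Characterize objects to which pointers point
         • Synthesize finite set of object representatives
         • Derive representative(s) each pointer points to
     Example: r = p.f;
     [Figure: points-to graph with nodes p and r and an edge labeled f. The slide's caption named specific object representatives; those names did not survive extraction.]
     p.f points to one object representative, so after the execution of r = p.f, r may point to that representative, but not to any of the others.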

  6. Our Pointer Analysis Goals
     • Accurate for multithreaded programs
     • Compositional, partial program analysis
         • Analyze each procedure once
         • Independently of callers
         • May skip analysis of invoked procedures
     • Why?
         • Parts of program unavailable (different language, not written yet)
         • Parts may be irrelevant for desired result

  7. Analysis Abstraction
     Basic abstraction is points-to graph
     • Nodes represent objects in heap
     • Edges represent references in heap
     [Figure: example points-to graph with nodes p, q, u and edges labeled f]

  8. Two Kinds of Edges
     • Inside edges (solid): represent references created inside analyzed part of program
     • Outside edges (dashed): represent references created outside analyzed part of program
     [Figure: example points-to graph with nodes p, q, u and edges labeled f]

  9. Two Kinds of Nodes
     • Inside nodes (solid): represent objects created inside analyzed part of program
     • Outside nodes (dashed): represent objects
         • Created outside analyzed part of program, or
         • Accessed via edges created outside analyzed part of program
     [Figure: example points-to graph with nodes p, q, u and edges labeled f]
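
The two kinds of nodes and edges can be held in a very small data structure. The following is a minimal sketch, not the authors' implementation: the class names `Node` and `PointsToGraph`, the method names, and the node name `load@s=s.f` are all hypothetical, and the check that outside edges end at outside nodes is my reading of the definition of outside nodes above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    name: str
    inside: bool  # True: object created inside the analyzed part of the program


@dataclass
class PointsToGraph:
    # Each edge is a (source node, field name, target node) triple.
    inside_edges: set = field(default_factory=set)
    outside_edges: set = field(default_factory=set)

    def add_inside_edge(self, src, fld, dst):
        self.inside_edges.add((src, fld, dst))

    def add_outside_edge(self, src, fld, dst):
        # Assumption: a node reached through an outside edge is an outside node.
        if dst.inside:
            raise ValueError("outside edges must end at outside nodes")
        self.outside_edges.add((src, fld, dst))

    def targets(self, src, fld):
        """Nodes that src.fld may point to, over both kinds of edges."""
        edges = self.inside_edges | self.outside_edges
        return {d for (s, f, d) in edges if s == src and f == fld}


p = Node("p", inside=False)              # parameter: created by the caller
r = Node("r", inside=True)               # r = new C()
load = Node("load@s=s.f", inside=False)  # object read through an outside edge

g = PointsToGraph()
g.add_inside_edge(p, "f", r)             # p.f = r
g.add_outside_edge(p, "f", load)         # a read of p.f that may see a foreign write
```

Keeping the two edge sets separate is what later lets the analysis distinguish references it created from references it only observed.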

  10. Key Question
      What does the heap look like when the procedure begins its execution?
      • Previous algorithms analyzed callers before callees, so model of heap always available
      • Unfortunately, this approach requires analysis of entire program in top-down fashion
      • Our solution: use code to reconstruct what (accessed part of) heap must look like

  11. Analysis In Example
      m(p, q) {
        r = new C();
        p.f = r;
        s = q;
        do s = s.f; until (s = null);
        t = new C();
        s.f = t;
        u = new C();
      }
      [Figure: points-to graph with nodes p, q]

  12. Analysis In Example
      Code: m(p, q), unchanged from slide 11
      [Figure: points-to graph with nodes p, r, q]

  13. Analysis In Example
      Code: m(p, q), unchanged from slide 11
      [Figure: points-to graph with nodes p, r, q and one edge labeled f]

  14. Analysis In Example
      Code: m(p, q), unchanged from slide 11
      [Figure: points-to graph with nodes p, r, q, s and one edge labeled f]

  15. Analysis In Example
      Code: m(p, q), unchanged from slide 11
      [Figure: points-to graph with nodes p, r, q, s and two edges labeled f]

  16. Analysis In Example
      Code: m(p, q), unchanged from slide 11
      [Figure: points-to graph with nodes p, r, q, s and three edges labeled f]
      One option: continue to expand graph
      But the analysis may never terminate…

  17. Analysis In Example
      Code: m(p, q), unchanged from slide 11
      [Figure: points-to graph with nodes p, r, q, s and three edges labeled f]
      Instead have one outside node per load statement
      • Represents all objects loaded at that statement
      • Bounds graph and guarantees termination
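
The termination argument can be checked on the loop from the example. This is a simplified sketch under my own assumptions, not the authors' algorithm: it tracks only the nodes `s` may point to at the loop head, the function name `analyze_loop` and the node names are hypothetical, and the non-terminating variant models "continue to expand graph" by creating one node per access path.

```python
def analyze_loop(one_node_per_load, max_rounds=10):
    """Abstractly run `s = q; do s = s.f; until (s == null)`.

    Returns (number of outside nodes created, whether a fixed point was reached).
    """
    s_targets = {"q"}        # nodes s may point to at the loop head
    outside_edges = set()    # (source node, field, outside node)
    for _ in range(max_rounds):
        before = (frozenset(s_targets), frozenset(outside_edges))
        loaded = set()
        for src in s_targets:
            if one_node_per_load:
                target = "load@s=s.f"   # one node for the whole statement
            else:
                target = src + ".f"     # a new node per access path
            outside_edges.add((src, "f", target))
            loaded.add(target)
        s_targets = s_targets | loaded  # join: the loop body may run again
        if (frozenset(s_targets), frozenset(outside_edges)) == before:
            return len({t for (_, _, t) in outside_edges}), True
    return len({t for (_, _, t) in outside_edges}), False


bounded = analyze_loop(one_node_per_load=True)
unbounded = analyze_loop(one_node_per_load=False)
```

With one node per load statement the state stops changing after a few rounds; with one node per access path a new node appears in every round, so the iteration is cut off by `max_rounds` without reaching a fixed point.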

  18. Consequences of This Decision
      • Multiple objects represented by single node (load node in loop)
      • But can also have single object represented by multiple nodes in graph (!!) (object loaded at multiple statements)
      do a = q.f; until (a = null);
      do b = q.f; until (b = null);
      [Figure: points-to graph with node q and edges labeled f]

  19. Consequences of This Decision
      • Form of points-to graph depends on program
      • Programs with identical behavior but different graphs…
      First program:  do s = s.f; until (s = null);
      Second program: s = s.f; while (s != null) s = s.f
      [Figure: two points-to graphs, one per program, each over nodes p, r, q, s with edges labeled f]

  20. Analysis In Example
      Code: m(p, q), unchanged from slide 11
      [Figure: points-to graph with nodes p, r, q, s and three edges labeled f]

  21. Analysis In Example
      Code: m(p, q), unchanged from slide 11
      [Figure: points-to graph with nodes p, r, q, s, t and three edges labeled f]

  22. Analysis In Example
      Code: m(p, q), unchanged from slide 11
      [Figure: points-to graph with nodes p, r, q, s, t and four edges labeled f]

  23. Analysis In Example
      Code: m(p, q), unchanged from slide 11
      [Figure: points-to graph with nodes p, r, q, s, t, u and four edges labeled f]

  24. What Does Result Tell Us?
      • Nodes (outside)
          • Created outside analyzed part of program
          • Incomplete information
      • Nodes (inside, escaped)
          • Created inside analyzed part of program
          • But reachable from unanalyzed part of program
          • Incomplete information
      • Nodes (inside, captured)
          • Created inside analyzed part of program
          • Unreachable from unanalyzed part of program
          • Complete information about referencing relationships!
      [Figure: points-to graph with nodes p, r, q, s, t, u and edges labeled f]
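
The escaped/captured split amounts to a reachability computation. The sketch below is illustrative only: the function name `classify` is hypothetical, the edge set is my reading of the running example rather than a copy of the figure, and it treats outside nodes as the only escape route (a full analysis would also have to consider others, such as returned values or objects handed to threads).

```python
def classify(nodes, edges, outside_nodes):
    """Split inside nodes into (escaped, captured).

    nodes: all node names; edges: (src, field, dst) triples, inside and
    outside edges alike; outside_nodes: parameter and load nodes.
    """
    reachable = set(outside_nodes)
    worklist = list(outside_nodes)
    while worklist:
        n = worklist.pop()
        for (src, _, dst) in edges:
            if src == n and dst not in reachable:
                reachable.add(dst)
                worklist.append(dst)
    inside = set(nodes) - set(outside_nodes)
    return inside & reachable, inside - reachable


# Assumed graph for m(p, q): "load" stands for the load node of s = s.f.
nodes = {"p", "q", "load", "r", "t", "u"}
edges = {
    ("p", "f", "r"),        # p.f = r
    ("q", "f", "load"),     # first iteration of s = s.f
    ("load", "f", "load"),  # later iterations of s = s.f
    ("load", "f", "t"),     # s.f = t
}
escaped, captured = classify(nodes, edges, outside_nodes={"p", "q", "load"})
```

Under these assumptions `r` and `t` are escaped because they hang off nodes the caller can reach, while `u` is captured because nothing outside the procedure refers to it.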

  25. Crucial Distinction
      • Escaped vs. Captured
      • Enables analysis to identify regions of heap where it has complete information
      • Crucial for both
          • Accuracy of analysis
          • Effective use of analysis results
      [Figure: points-to graph with nodes p, r, q, s, t, u and edges labeled f]

  26. Multiple Calling Contexts
      • Two Key Assumptions
          • p and q refer to different objects
          • Parallel threads may access objects
      Code: m(p, q), unchanged from slide 11
      [Figure: points-to graph with nodes p, r, q, s, t and edges labeled f]

  27. Multiple Calling Contexts
      What if p and q refer to the same object? (i.e. p and q aliased)
      Code: m(p, q), unchanged from slide 11
      [Figure: two points-to graphs, each over nodes p, r, q, s, t with edges labeled f]

  28. Multiple Calling Contexts
      What if p and q refer to the same object and there are no parallel threads?
      Code: m(p, q), unchanged from slide 11
      [Figure: two points-to graphs, each over nodes p, r, q, s, t with edges labeled f]

  29. Multiple Calling Contexts
      What if p and q refer to the same object and there are no parallel threads?
      Code: m(p, q), unchanged from slide 11
      [Figure: points-to graph with nodes r, p, q, s, t and two edges labeled f]

  30. Issues
      • Substantially different results for different calling contexts
      • But caller is unavailable at analysis time…
      • New analysis for each possible context?
          • Lots of contexts…
          • Most of which probably won’t be needed…

  31. Our Solution
      • Analyze assuming
          • Distinct parameters
          • Parallel threads
      • Aliased parameters at caller? Merge nodes…
      • No parallel threads? Remove outside edges and nodes…
      [Figure: three points-to graphs over nodes p, r, q, s, t with edges labeled f]
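
The "merge nodes" step for aliased parameters can be pictured as a renaming over the edge set. This sketch covers only that step, with a hypothetical helper `merge_nodes` and an assumed edge set for m(p, q); the removal of outside edges for the no-parallel-threads case is not modeled here.

```python
def merge_nodes(edges, keep, drop):
    """Specialize a graph for aliased parameters: fold node `drop` into `keep`."""
    def rename(n):
        return keep if n == drop else n
    return {(rename(s), f, rename(d)) for (s, f, d) in edges}


# Assumed result for m(p, q) analyzed with distinct parameters.
distinct = {
    ("p", "f", "r"),
    ("q", "f", "load"),
    ("load", "f", "load"),
    ("load", "f", "t"),
}
aliased = merge_nodes(distinct, keep="p", drop="q")
```

After the merge, the single node for p and q carries both the inside edge to `r` and the outside edge to the load node, which is the information a caller with aliased arguments needs.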

  32. Solution Is Not Perfect
      • Specialization can lose precision: can have two procedures such that when analyzed with
          • Distinct parameters: same analysis result
          • Aliased parameters: different analysis result
      • Conceptually complex analysis
          • Think about all contexts during analysis
          • Start to lose intuition of analysis as execution
          • Difficult time applying abstract interpretation framework

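  33. Abstract Interpretation and Analysis
      Abstract interpretation is parameterized framework
      • V: concrete values
      • A: abstract values
      • α: abstraction function
      • γ: concretization function
      [Figure: diagram in which ta takes abstract value a1 to a2 and tv takes concrete value v1 to v2, with α and γ connecting the concrete and abstract levels]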
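
A tiny instance of the framework may help fix the roles of V, A, α, γ, tv and ta. The sign domain and the statement `x = x * x` below are my own choices for illustration and do not come from the slides; γ is written as a membership test because the concretization of a sign is an infinite set.

```python
NEG, ZERO, POS, TOP = "neg", "zero", "pos", "top"


def alpha(values):
    """Abstraction: map a non-empty set of integers to a sign."""
    signs = {NEG if v < 0 else ZERO if v == 0 else POS for v in values}
    return signs.pop() if len(signs) == 1 else TOP


def gamma_contains(a, v):
    """Concretization as a membership test: is v in gamma(a)?"""
    return a == TOP or alpha({v}) == a


def t_v(v):
    """Concrete transfer function for the statement x = x * x."""
    return v * v


def t_a(a):
    """Abstract transfer function for the same statement."""
    return {NEG: POS, ZERO: ZERO, POS: POS, TOP: TOP}[a]


def sound_on(values):
    """Check that running abstractly covers every concrete result."""
    a2 = t_a(alpha(values))
    return all(gamma_contains(a2, t_v(v)) for v in values)
```

`sound_on` checks one square of the diagram: abstracting and then stepping with `t_a` must describe everything obtained by stepping concretely with `t_v`.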

  34. Applying Framework
      • A: points-to graphs
      • V: concrete heaps
      • α: points-to graph for a given heap
          • Points-to graph depends on program
          • Need to augment heap with access history
      • γ: all heaps that correspond to points-to graph
      • OK, I give up…

  35. Correctness Proof
      • Inductively construct a relation between
          • Objects in heap
          • Nodes that represent objects
      • Invariants that characterize the relation
      • Transfer function
          • Takes points-to graph and the relation
          • Gives new points-to graph and relation
      • Prove that transfer functions preserve invariants

  36. Threads and Abstract Interpretation
      • Philosophy of Abstract Interpretation
          • Come up with a decent abstraction
          • Execute program on that abstraction
      • Problem with threads
          • Execution usually modeled as interleaving
          • Too many interleavings!

  37. Our Solution
      Points-to graphs explicitly represent all possible interactions between parallel threads
      • Outside edges: interactions in which one thread reads a reference created by parallel thread
      • Inside edges: interactions in which one thread creates a reference read by parallel thread
      Basic Analysis Approach
      • Analyze each thread in isolation
      • To compute combined effect of multiple threads
          • Retrieve result for each thread
          • Compute interactions that may occur

  38. Interthread Analysis n(p,q) || m(p,q)

  39. Interthread Analysis
      n(p,q) || m(p,q)
      Retrieve points-to graph from analysis of each thread
      [Figure: one points-to graph per thread, each with parameter nodes p and q]

  40. Interthread Analysis
      n(p,q) || m(p,q)
      Establish correspondence between nodes
      Start with parameter nodes
      [Figure: the two points-to graphs with parameter nodes p and q; legend: A is linked to B if A may represent same object as B]

  41. Interthread Analysis
      n(p,q) || m(p,q)
      • Compute Interactions Between Threads
          • Match inside and outside edges
          • For each outside node, compute nodes in other graph that it represents
      [Figure: the two points-to graphs with parameter nodes p and q]

  42. Interthread Analysis
      n(p,q) || m(p,q)
      • Compute Interactions Between Threads
          • Match inside and outside edges
          • For each outside node, compute nodes in other graph that it represents
      [Figure: the two points-to graphs with parameter nodes p and q, next step of the same build]

  43. Interthread Analysis
      n(p,q) || m(p,q)
      • Use computed representation relationship to
          • combine graphs and
          • obtain single graph for the execution of both threads
      [Figure: the two per-thread graphs and the combined graph, with parameter nodes p and q]
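
The matching of inside and outside edges can be sketched as a fixed-point computation. This is a deliberately simplified sketch and not the authors' algorithm: the function name `combine` and the load-node names `L1` and `L3` are hypothetical, parameter nodes with the same name are assumed to denote the same object in both threads, and the per-thread summaries used as input are my own abstraction of the two threads n(a,b,c) { p = b.f; p.f = a; a.f = b } and m(a,c) { q = a.f; q.f = c }.

```python
def combine(threads):
    """threads: {name: (inside_edges, outside_edges)}, edges as (src, field, dst).

    Returns the inside edges of the combined graph.
    """
    inside = {t: set(ins) for t, (ins, _) in threads.items()}
    maps = {t: {} for t in threads}   # per thread: load node -> nodes it may represent

    def rep(t, n):
        return {n} | maps[t].get(n, set())

    changed = True
    while changed:
        changed = False
        for t, (_, outs) in threads.items():
            others = set().union(*(inside[o] for o in threads if o != t))
            # Match each outside edge of t against inside edges of other threads.
            for (src, fld, load) in outs:
                targets = maps[t].setdefault(load, set())
                for s in rep(t, src):
                    for (s2, f2, dst) in others:
                        if s2 == s and f2 == fld and dst not in targets:
                            targets.add(dst)
                            changed = True
            # Re-express t's own inside edges through what its load nodes represent.
            for (src, fld, dst) in list(inside[t]):
                for s in rep(t, src):
                    for d in rep(t, dst):
                        if (s, fld, d) not in inside[t]:
                            inside[t].add((s, fld, d))
                            changed = True
    return set().union(*inside.values())


summaries = {
    "n": ({("L1", "f", "a"), ("a", "f", "b")}, {("b", "f", "L1")}),
    "m": ({("L3", "f", "c")}, {("a", "f", "L3")}),
}
combined = combine(summaries)
```

Because the matching ignores the order of statements across threads, the result for this input contains both `b.f -> c` and `c.f -> a`.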

  44. Property of Analysis
      • Flow-sensitive within each thread (if reorder statements, get different result)
      • Flow-insensitive between threads
          • Assumes interactions can happen
              • Any number of times
              • In any order
          • Analysis models interactions that can’t actually happen in any interleaved execution

  45. Imprecision Due To Flow Insensitivity
      n(a,b,c) {
        1: p = b.f
           p.f = a
        2: a.f = b
      }
      ||
      m(a,c) {
        3: q = a.f
        4: q.f = c
      }
      [Figure: interthread analysis result over nodes a, b, c with one edge drawn in blue, next to a diagram of the execution order over statements 1, 2, 3, 4 required to produce the blue edge]
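
One way to see which heap edges real executions can produce is to run every interleaving of the two threads. The sketch below rests on assumptions that the slide does not state: all `f` fields start out null, memory is sequentially consistent, and a store through a null pointer is treated as a no-op. The names `run` and `all_heaps` are hypothetical.

```python
from itertools import combinations

N_STEPS = ["p = b.f", "p.f = a", "a.f = b"]   # thread n (labels 1, 1, 2)
M_STEPS = ["q = a.f", "q.f = c"]              # thread m (labels 3, 4)


def run(schedule):
    """Execute one interleaving; schedule is a tuple such as ('n','m','n','n','m')."""
    heap = {"a": None, "b": None, "c": None}  # value of field f for each object
    local = {"p": None, "q": None}
    pc = {"n": 0, "m": 0}
    for t in schedule:
        step = (N_STEPS if t == "n" else M_STEPS)[pc[t]]
        pc[t] += 1
        if step == "p = b.f":
            local["p"] = heap["b"]
        elif step == "q = a.f":
            local["q"] = heap["a"]
        elif step == "a.f = b":
            heap["a"] = "b"
        elif step == "p.f = a" and local["p"] is not None:
            heap[local["p"]] = "a"
        elif step == "q.f = c" and local["q"] is not None:
            heap[local["q"]] = "c"
    return heap


def all_heaps():
    """Final heaps for all ways of placing m's two steps among five slots."""
    heaps = []
    for m_slots in combinations(range(5), 2):
        schedule = tuple("m" if i in m_slots else "n" for i in range(5))
        heaps.append(run(schedule))
    return heaps


heaps = all_heaps()
c_points_to_a = any(h["c"] == "a" for h in heaps)
b_points_to_c = any(h["b"] == "c" for h in heaps)
```

Under these assumptions some interleaving sets `b.f` to `c`, but none sets `c.f` to `a`: that would need statement 2 to run before 3, 4 before 1, and 1 before 2 all at once.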

  46. Weak Memory Consistency Models

  47. Initially: y=1, x=0
      Thread 1: y=0; x=1
      Thread 2: z = x+y
      What is value of z?

  48. Three Interleavings
      Initially: y=1, x=0
      Thread 1: y=0; x=1
      Thread 2: z = x+y
      • y=0; z = x+y; x=1 gives z = 0
      • z = x+y; y=0; x=1 gives z = 1
      • y=0; x=1; z = x+y gives z = 1
      What is value of z?

  49. Three Interleavings
      Initially: y=1, x=0
      Thread 1: y=0; x=1
      Thread 2: z = x+y
      • y=0; z = x+y; x=1 gives z = 0
      • z = x+y; y=0; x=1 gives z = 1
      • y=0; x=1; z = x+y gives z = 1
      What is value of z? z can be 0 or 1
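
The three-interleaving argument can be reproduced mechanically. The sketch below enumerates every merge of the two threads and runs each one; it assumes, as the argument itself does, that each statement takes effect atomically and in program order. The helper names are hypothetical.

```python
def interleavings(a, b):
    """All merges of sequences a and b that keep each one's internal order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def set_y(s):
    s["y"] = 0


def set_x(s):
    s["x"] = 1


def read_z(s):
    s["z"] = s["x"] + s["y"]


def run(schedule):
    state = {"x": 0, "y": 1, "z": None}   # initially y=1, x=0
    for step in schedule:
        step(state)
    return state["z"]


thread1 = [set_y, set_x]
thread2 = [read_z]
schedules = list(interleavings(thread1, thread2))
results = sorted({run(s) for s in schedules})
```

The enumeration finds three schedules and the values 0 and 1 for z, which is exactly the conclusion reached by reasoning about interleavings.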

  50. Three Interleavings
      Initially: y=1, x=0
      Thread 1: y=0; x=1
      Thread 2: z = x+y
      • y=0; z = x+y; x=1 gives z = 0
      • z = x+y; y=0; x=1 gives z = 1
      • y=0; x=1; z = x+y gives z = 1
      What is value of z? z can be 0 or 1
      INCORRECT REASONING!
