Design-Driven Compilation

Radu Rugina and Martin Rinard, Laboratory for Computer Science, Massachusetts Institute of Technology


Presentation Transcript


  1. Design-Driven Compilation • Radu Rugina and Martin Rinard • Laboratory for Computer Science, Massachusetts Institute of Technology

  2. Overview • Goal: Parallelization • Analysis Problems: Points-to Analysis, Region Analysis • Two Potential Solutions: Fully Automatic, Design Driven • Evaluation

  3-7. Example - Divide and Conquer Sort • Input: 7 4 6 1 3 5 8 2 • Divide: 7 4 | 6 1 | 3 5 | 8 2 • Conquer: 4 7 | 1 6 | 3 5 | 2 8 • Combine: 1 4 6 7 | 2 3 5 8 • Result: 1 2 3 4 5 6 7 8

  8-10. Divide and Conquer Algorithms • Lots of Recursively Generated Concurrency • Recursively Solve Subproblems in Parallel • Combine Results in Parallel

  11. “Sort n Items in d, Using t as Temporary Storage”

      void sort(int *d, int *t, int n) {
        if (n > CUTOFF) {
          sort(d,t,n/4);
          sort(d+n/4,t+n/4,n/4);
          sort(d+n/2,t+n/2,n/4);
          sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
          merge(d,d+n/4,d+n/2,t);
          merge(d+n/2,d+3*(n/4),d+n,t+n/2);
          merge(t,t+n/2,t+n,d);
        } else insertionSort(d,d+n);
      }

  12. “Recursively Sort Four Quarters of d” (the four recursive sort calls) • Divide array into subarrays and recursively sort subarrays

  13. “Recursively Sort Four Quarters of d” • Subproblems Identified Using Pointers Into Middle of Array • For d = 7 4 6 1 3 5 8 2, the quarters start at d, d+n/4, d+n/2, and d+3*(n/4)

  14-15. “Recursively Sort Four Quarters of d” • Sorted Results Written Back Into Input Array • d = 7 4 6 1 3 5 8 2 becomes d = 4 7 | 1 6 | 3 5 | 2 8

  16. “Merge Sorted Quarters of d Into Halves of t” (the first two merge calls) • d = 4 7 1 6 3 5 2 8 • Half at t = 1 4 6 7, half at t+n/2 = 2 3 5 8

  17. “Merge Sorted Halves of t Back Into d” (the final merge call) • t = 1 4 6 7 2 3 5 8 • d = 1 2 3 4 5 6 7 8

  18-19. “Use a Simple Sort for Small Problem Sizes” (the insertionSort call) • insertionSort(d,d+n) works on the range from d up to, but not including, d+n • Array shown as 7 4 6 1 3 5 8 2, then as 7 4 1 6 3 5 8 2

  20. Parallel Sort

      void sort(int *d, int *t, int n) {
        if (n > CUTOFF) {
          spawn sort(d,t,n/4);
          spawn sort(d+n/4,t+n/4,n/4);
          spawn sort(d+n/2,t+n/2,n/4);
          spawn sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
          sync;
          spawn merge(d,d+n/4,d+n/2,t);
          spawn merge(d+n/2,d+3*(n/4),d+n,t+n/2);
          sync;
          merge(t,t+n/2,t+n,d);
        } else insertionSort(d,d+n);
      }
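The `spawn` and `sync` keywords in the Parallel Sort slide are Cilk constructs. As a rough sketch for readers without a Cilk compiler, the same structure can be written with OpenMP tasks. The use of OpenMP, the `CUTOFF` value, and the wrapper `parallel_sort` are assumptions made for illustration and are not part of the slides. The quarter boundaries are computed from `q = n/4` so that the sketch also handles sizes that are not a multiple of 4, and `insertionSort` is a variant of the slide's routine that never forms a pointer before the start of the array.

```c
#include <assert.h>

#define CUTOFF 8  /* assumed threshold; the slides do not give a value */

/* Sort the range [l, h) in place. */
static void insertionSort(int *l, int *h) {
    for (int *p = l + 1; p < h; p++) {
        int k = *p, *q = p;
        for (; q > l && k < *(q - 1); q--)
            *q = *(q - 1);
        *q = k;
    }
}

/* Merge sorted [l1, m) and [m, h2) into the buffer starting at d. */
static void merge(int *l1, int *m, int *h2, int *d) {
    int *h1 = m, *l2 = m;
    while (l1 < h1 && l2 < h2) {
        if (*l1 < *l2) *d++ = *l1++;
        else           *d++ = *l2++;
    }
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
}

/* Four-way divide and conquer sort; each task plays the role of "spawn",
   each taskwait the role of "sync". */
static void sort(int *d, int *t, int n) {
    if (n > CUTOFF) {
        int q = n / 4;
        #pragma omp task
        sort(d, t, q);
        #pragma omp task
        sort(d + q, t + q, q);
        #pragma omp task
        sort(d + 2 * q, t + 2 * q, q);
        #pragma omp task
        sort(d + 3 * q, t + 3 * q, n - 3 * q);
        #pragma omp taskwait
        #pragma omp task
        merge(d, d + q, d + 2 * q, t);
        #pragma omp task
        merge(d + 2 * q, d + 3 * q, d + n, t + 2 * q);
        #pragma omp taskwait
        merge(t, t + 2 * q, t + n, d);
    } else {
        insertionSort(d, d + n);
    }
}

/* Hypothetical entry point: d holds n items, t is scratch space of size n. */
void parallel_sort(int *d, int *t, int n) {
    assert(n >= 0);
    if (n < 2) return;
    #pragma omp parallel
    #pragma omp single
    sort(d, t, n);
}
```

When compiled without OpenMP support the pragmas are ignored and the code runs sequentially with the same result, which is only safe because the spawned calls touch disjoint regions.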

  21. What Do You Need To Know To Exploit This Form of Parallelism? • Points-to Information (data blocks that pointers point to) • Region Information (accessed regions within data blocks)

  22. Information Needed To Exploit Parallelism • d and t point to different memory blocks • Calls to sort access disjoint parts of d and t • Together, calls access [d,d+n-1] and [t,t+n-1] • sort(d,t,n/4); • sort(d+n/4,t+n/4,n/4); • sort(d+n/2,t+n/2,n/4); • sort(d+3*(n/4),t+3*(n/4),n-3*(n/4)); [Diagram: for each call, the accessed parts of the blocks d..d+n-1 and t..t+n-1]

  23. Information Needed To Exploit Parallelism • d and t point to different memory blocks • First two calls to merge access disjoint parts of d and t • Together, calls access [d,d+n-1] and [t,t+n-1] • merge(d,d+n/4,d+n/2,t); • merge(d+n/2,d+3*(n/4),d+n,t+n/2); • merge(t,t+n/2,t+n,d); [Diagram: for each call, the accessed parts of the blocks d..d+n-1 and t..t+n-1]

  24. Information Needed To Exploit Parallelism • Calls to insertionSort access [d,d+n-1] • insertionSort(d,d+n); [Diagram: the accessed block d..d+n-1]

  25. What Do You Need To Know To Exploit This Form of Parallelism? • Points-to Information (d and t point to different data blocks) • Symbolic Region Information (accessed regions within d and t blocks)
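To make the region requirement concrete, here is a small sketch that checks, for one given `n`, that the four recursive `sort` calls touch pairwise disjoint regions that together cover `[0, n-1]` of a block. It assumes the slides' claim that `sort(d,t,n)` accesses `[d,d+n-1]` and `[t,t+n-1]`. The `Region` type and function names are hypothetical, and the offsets are taken as multiples of `n/4` so the check also holds when `n` is not a multiple of 4. The real analysis has to establish this symbolically for every `n`; this sketch only tests concrete values.

```c
#include <assert.h>
#include <stdbool.h>

/* Inclusive range of element offsets within one memory block. */
typedef struct { int lo, hi; } Region;

/* Region accessed in d (and likewise in t) by sort(d + off, t + off, len). */
static Region sort_region(int off, int len) {
    Region r = { off, off + len - 1 };
    return r;
}

/* Two regions are disjoint when one ends before the other starts. */
static bool disjoint(Region a, Region b) {
    return a.hi < b.lo || b.hi < a.lo;
}

/* True if the four recursive calls access pairwise disjoint regions that
   together cover exactly the offsets 0 .. n-1. */
bool quarters_independent(int n) {
    int q = n / 4;
    Region r[4] = {
        sort_region(0, q),     sort_region(q, q),
        sort_region(2 * q, q), sort_region(3 * q, n - 3 * q)
    };
    for (int i = 0; i < 4; i++)
        for (int j = i + 1; j < 4; j++)
            if (!disjoint(r[i], r[j])) return false;
    /* Consecutive regions must abut, starting at 0 and ending at n-1. */
    if (r[0].lo != 0 || r[3].hi != n - 1) return false;
    for (int i = 0; i + 1 < 4; i++)
        if (r[i].hi + 1 != r[i + 1].lo) return false;
    return true;
}
```

Because `d` and `t` point to different blocks, the same offset check applies to each block separately.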

  26-27. How Hard Is It To Figure These Things Out? • Challenging

  28. How Hard Is It To Figure These Things Out?

      void insertionSort(int *l, int *h) {
        int *p, *q, k;
        for (p = l+1; p < h; p++) {
          for (k = *p, q = p-1; l <= q && k < *q; q--)
            *(q+1) = *q;
          *(q+1) = k;
        }
      }

      Not immediately obvious that insertionSort(l,h) accesses [l,h-1]

  29. How Hard Is It To Figure These Things Out?

      void merge(int *l1, int *m, int *h2, int *d) {
        int *h1 = m;
        int *l2 = m;
        while ((l1 < h1) && (l2 < h2))
          if (*l1 < *l2) *d++ = *l1++;
          else *d++ = *l2++;
        while (l1 < h1) *d++ = *l1++;
        while (l2 < h2) *d++ = *l2++;
      }

      Not immediately obvious that merge(l,m,h,d) accesses [l,h-1] and [d,d+(h-l)-1]
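One way to build intuition for the claim that `merge(l,m,h,d)` accesses `[l,h-1]` and `[d,d+(h-l)-1]` is to instrument the routine and record the lowest and highest offsets it dereferences. The sketch below does that; the `Touched` type and the name `traced_merge` are hypothetical. A single run only shows the region for one input, whereas the region analysis must bound the accesses for all inputs.

```c
#include <assert.h>
#include <limits.h>

/* Lowest and highest element offsets dereferenced in one block. */
typedef struct { long lo, hi; } Touched;

static void touch(Touched *t, long off) {
    if (off < t->lo) t->lo = off;
    if (off > t->hi) t->hi = off;
}

/* The merge routine from the slide, with every dereference recorded as an
   offset from the initial l1 (input block) or the initial d (output block). */
void traced_merge(int *l1, int *m, int *h2, int *d,
                  Touched *in, Touched *out) {
    int *l = l1, *d0 = d;          /* bases for computing offsets */
    int *h1 = m, *l2 = m;
    in->lo = out->lo = LONG_MAX;   /* start with empty ranges */
    in->hi = out->hi = LONG_MIN;
    while (l1 < h1 && l2 < h2) {
        touch(in, l1 - l);
        touch(in, l2 - l);
        touch(out, d - d0);
        if (*l1 < *l2) *d++ = *l1++;
        else           *d++ = *l2++;
    }
    while (l1 < h1) { touch(in, l1 - l); touch(out, d - d0); *d++ = *l1++; }
    while (l2 < h2) { touch(in, l2 - l); touch(out, d - d0); *d++ = *l2++; }
}
```

For two sorted halves of four elements each, the recorded input and output offsets both span 0 through 7, which matches `[l,h-1]` and `[d,d+(h-l)-1]` with `h-l = 8`.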

  30. Issues • Heavy Use of Pointers: Pointers into Middle of Arrays, Pointer Arithmetic, Pointer Comparison • Multiple Procedures: sort(int *d, int *t, int n), insertionSort(int *l, int *h), merge(int *l, int *m, int *h, int *t) • Recursion

  31. Fully Automatic Solution • Whole-program pointer analysis: context-sensitive, flow-sensitive (Rugina and Rinard, PLDI 1999) • Whole-program region analysis: symbolic constraint systems, solved by reducing to linear programs (Rugina and Rinard, PLDI 2000)

  32. Key Complication: Need for sophisticated interprocedural analyses • Pointer analysis: propagate analysis results through call graph; fixed-point algorithm for recursive programs • Region analysis: formulation avoids fixed-point algorithms; single constraint system for each strongly connected component • Need to have whole program in analyzable form

  33. Bigger Picture • Points-to and region information is (implicitly) part of the interface of each procedure • Programmer understands procedure interfaces • Programmer knows: points-to relationships on entry; effect of procedure on points-to relationships; regions of memory blocks that procedure accesses

  34. Idea: Enhance procedure interface to make points-to and region information explicit • Points-to language: points-to graphs at entry and exit; effect on points-to relationships • Region language: symbolic specification of accessed regions • Programmer provides information • Analysis verifies that it is correct

  35-36. Points-to Language

      f(p, q, n) {
        context {
          entry: p->_a, q->_b;
          exit: p->_a, _a->_c, q->_b, _b->_d;
        }
        context {
          entry: p->_a, q->_a;
          exit: p->_a, _a->_c, q->_a;
        }
      }

      [Diagram: Contexts for f(p,q,n), showing the entry and exit points-to graphs of each context]

  37-45. Verifying Points-to Information • One (flow-sensitive) analysis per context of f(p,q,n) { . . . } • Start with entry points-to graph • Analyze procedure • Check result against exit points-to graph • Similarly for other context [Diagram: Contexts for f(p,q,n), entry and exit points-to graphs, with the graph under analysis shown at each step]
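The verification steps (start from the declared entry graph, analyze the body, check the result against the declared exit graph) can be sketched on a toy scale. Everything concrete below is an assumption made for illustration: the adjacency-matrix `Graph`, the single statement form `*ptr = &target`, the hypothetical body `*p = &c;` for `f`, and the choice of "every derived edge must be declared" as the check. The slides do not specify these details.

```c
#include <assert.h>
#include <stdbool.h>

/* Nodes of the toy graph: p, q, _a, _b, _c, _d. */
enum { NODE_P, NODE_Q, NODE_A, NODE_B, NODE_C, NODE_D, NODES };

/* Points-to graph: edge[x][y] is true when x may point to y. */
typedef struct { bool edge[NODES][NODES]; } Graph;

static void add_edge(Graph *g, int from, int to) {
    g->edge[from][to] = true;
}

/* Abstract effect of the store "*ptr = &target": every block that ptr may
   point to gains an edge to target (a weak update, edges are only added). */
static void store(Graph *g, int ptr, int target) {
    bool pointee[NODES];
    for (int blk = 0; blk < NODES; blk++)   /* snapshot before mutating */
        pointee[blk] = g->edge[ptr][blk];
    for (int blk = 0; blk < NODES; blk++)
        if (pointee[blk]) add_edge(g, blk, target);
}

/* The check: every edge the analysis derived must appear in the declared
   exit graph. */
static bool covered_by(const Graph *derived, const Graph *declared) {
    for (int x = 0; x < NODES; x++)
        for (int y = 0; y < NODES; y++)
            if (derived->edge[x][y] && !declared->edge[x][y]) return false;
    return true;
}

/* Verify one context of a hypothetical f whose body is "*p = &c;".
   entry is passed by value, so the caller's graph is left untouched. */
bool verify_context(Graph entry, const Graph *declared_exit) {
    store(&entry, NODE_P, NODE_C);   /* analyze the body from the entry graph */
    return covered_by(&entry, declared_exit);
}
```

Running `verify_context` once per declared context mirrors the "one analysis per context" step: each context is checked in isolation against its own exit graph.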

  46-50. Analysis of Call Statements • Call site: g(r,n) { . . f(r,s,n); . . } • Analysis produces points-to graph before call (for r and s) • Retrieve declared contexts from callee • Find context with matching entry graph [Diagram: caller graph for r and s next to the Contexts for f(p,q,n), entry and exit points-to graphs]
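The matching step at a call site can be illustrated with a deliberately simplified sketch. It assumes each pointer has exactly one target and that placeholder names such as `_a` and `_b` can bind to any concrete block, so a context matches when its entry graph has the same aliasing pattern as the actual arguments. The `Entry` type and `match_context` are hypothetical names, and the real matching over full points-to graphs is more general than this.

```c
#include <assert.h>

/* Entry graph of one declared context for f(p, q, n), reduced to the
   placeholder block that each formal parameter points to. */
typedef struct { int p_target, q_target; } Entry;

/* Return the index of the first declared context whose entry graph has the
   same aliasing pattern as the actuals r and s, or -1 if none matches. */
int match_context(const Entry *ctx, int count, int r_target, int s_target) {
    int actual_alias = (r_target == s_target);
    for (int i = 0; i < count; i++) {
        int declared_alias = (ctx[i].p_target == ctx[i].q_target);
        if (declared_alias == actual_alias) return i;
    }
    return -1;
}
```

With the two contexts from the Points-to Language slide, a call where `r` and `s` point to different blocks selects the first context, and a call where they point to the same block selects the second. A return value of -1 corresponds to a call for which the programmer declared no suitable context.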
