1 / 43

Register Allocation

Register Allocation. Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://www.math.tau.ac.il/~msagiv/courses/wcc.html. Source program (string). Already Studied. lexical analysis. Tokens. syntax analysis. Abstract syntax tree. semantic analysis. Abstract syntax tree.

harley
Download Presentation

Register Allocation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Register Allocation Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://www.math.tau.ac.il/~msagiv/courses/wcc.html

  2. Source program (string) Already Studied lexical analysis Tokens syntax analysis Abstract syntax tree semantic analysis Abstract syntax tree Translate Tree IR Cannon Cannonical Tree IR Instruction Selection Assem (with many reg)

  3. Register Allocation • Input: • Sequence of machine code instructions(assembly) • Unbounded number of temporary registers • Output • Sequence of machine code instructions(assembly) • Machine registers • Some MOVE instructions removed • Missing prologue and epilogue

  4. Register Allocation Process Repeat Construct a interference graph Nodes = Machine registers and temporaries Interference-Edges = Temporaries with overlapping lifetimes Move-edges = Source and Target of MOVE Color graph nodes with machine registers Adjacent nodes are not colored by the same register Spill a temporary into activation record Until no more spill

  5. Running Example /* k, j */ g := mem[j+12] h := k - 1 f := g * h e := mem[j+8] m := mem[j+16] b := mem[f] c := e + 8 d := c k := m + 4 j := b /* d, k, j */

  6. Challenges • The Coloring problem is computationally hard • The number of machine registers may be small • Avoid too many MOVEs • Handle “pre-colored” nodes

  7. Coloring by SimplificationKempe 1879 • Let K be the number of machine • Let G(V, E) be the interference graph • Consider a node v V which has K-1 or less neighbors • Color G – v in K colors • Color v in a color different that its (colored) neighbors

  8. Graph Coloring by Simplification Build: Construct the interference graph Simplify: Recursively remove nodes with less than K neighbors ; Push removed nodes into stack Potential-Spill: Spill some nodes and remove nodes Push removed nodes into stack Select: Assign actual registers (from simplify/spill stack) Actual-Spill: Spill some potential spills and repeat the process

  9. Coalescing • MOVs can be removed if the source and the target share the same register • The source and the target of the move can be merged into a single node (unifying the sets of neighbors) • May require more registers • Conservative Coalescing • Merge nodes only if the resulting node has fewer than K neighbors with degree  K (in the resulting graph)

  10. Constrained Moves • A instruction T := S is constrained • if S and T interfere • May happen after coalescing • Assume X and Z interfere and • X := Y • Y := Z • After coalescing X and Y we get • XY := Z with interference between XY and Z • Constrained MOVs are not coalesced

  11. Graph Coloring with Coalescing Build: Construct the interference graph Simplify: Recursively remove non MOVE nodes with less than K neighbors; Push removed nodes into stack Coalesce: Conservatively merge unconstrained MOV related nodes with fewer that K “heavy” neighbors Freeze: Give-Up Coalescing on some low-degree MOV related nodes Potential-Spill: Spill some nodes and remove nodes Push removed nodes into stack Select: Assign actual registers (from simplify/spill stack) Actual-Spill: Spill some potential spills and repeat the process

  12. Spilling • Many heuristics exist • Maximal degree • Live-ranges • Number of usages in loops • The whole process need to be repeated after an actual spill

  13. Pre-Colored Nodes • Some registers are in the intermediate language are pre-colored • correspond to real registers (stack-pointer, frame-pointer, …) • Cannot be Simplified, Coalesced, or Spilled (infinite degree) • Interfered with each other • But normal temporaries can be coalesced into pre-colored registers • Register allocation is completed when all the nodes are pre-colored

  14. Caller-Save and Callee-Save Registers • callee-save-registers (MIPS 16-23) • Saved by the callee • caller-save-registers • Saved by the caller • Interprocedural analysis may improve results int f(int a) { int b=a+1; g(); h(b); return(b+2); } void h (int y) { int x=y+1; f(y); f(2); }

  15. Saving Callee-Save Registers enter: def(r7) … exit: use(r7) enter: def(r7) t231 := r7 … r7 := t231 exit: use(r7)

  16. A Realistic Example enter: c := r3 a := r1 b := r2 d := 0 e := a loop: d := d+b e := e-1 if e>0 goto loop r1 := d r3 := c return /* r1,r3 */ enter: /* r2, r1, r3 */ c := r3 /* c, r2, r1 */ a := r1 /* a, c, r2 */ b := r2 /* a, c, b */ d := 0 /* a, c, b, d */ e := a / * e, c, b, d */ loop: d := d+b /* e, c, b, d */ e := e-1 /* e, c, b, d */ if e>0 goto loop /* c, d */ r1 := d /* r1, c */ r3 := c /* r1, r3 */ return /* r1,r3 */ r1, r2 caller save r3 callee-save

  17. A Realistic Example(cont) enter: /* r2, r1, r3 */ c := r3 /* c, r2, r1 */ a := r1 /* a, c, r2 */ b := r2 /* a, c, b */ d := 0 /* a, c, b, d */ e := a / * e, c, b, d */ loop: d := d+b /* e, c, b, d */ e := e-1 /* e, c, b, d */ if e>0 goto loop /* c, d */ r1 := d /* r1, c */ r3 := c /* r1, r3 */ return /* r1,r3 */

  18. Graph Coloring with Coalescing Build: Construct the interference graph Simplify: Recursively remove non MOVE nodes with less than K neighbors; Push removed nodes into stack Coalesce: Conservatively merge unconstrained MOV related nodes with fewer that K “heavy” neighbors Freeze: Give-Up Coalescing on some low-degree MOV related nodes Potential-Spill: Spill some nodes and remove nodes Push removed nodes into stack Select: Assign actual registers (from simplify/spill stack) Actual-Spill: Spill some potential spills and repeat the process

  19. spill = (uo + 10 ui)/deg enter: /* r2, r1, r3 */ c := r3 /* c, r2, r1 */ a := r1 /* a, c, r2 */ b := r2 /* a, c, b */ d := 0 /* a, c, b, d */ e := a / * e, c, b, d */ loop: d := d+b /* e, c, b, d */ e := e-1 /* e, c, b, d */ if e>0 goto loop /* c, d */ r1 := d /* r1, c */ r3 := c /* r1, r3 */ return /* r1,r3 */

  20. stack stack c Spilling c

  21. stack c Coalescing a+e stack c

  22. stack c Coalescing b+r2 stack c

  23. stack c Coalescing ae+r1 stack c r1ae and d are constrained

  24. stack d c Simplifying d stack c

  25. stack stack d c c Pop d d is assigned to r3

  26. stack stack c Pop c c actual spill!

  27. enter: /* r2, r1, r3 */ c := r3 /* c, r2, r1 */ a := r1 /* a, c, r2 */ b := r2 /* a, c, b */ d := 0 /* a, c, b, d */ e := a / * e, c, b, d */ loop: d := d+b /* e, c, b, d */ e := e-1 /* e, c, b, d */ if e>0 goto loop /* c, d */ r1 := d /* r1, c */ r3 := c /* r1, r3 */ return /* r1,r3 */ enter: /* r2, r1, r3 */ c1 := r3 /* c1, r2, r1 */ M[c_loc] := c1 /* r2 */ a := r1 /* a, r2 */ b := r2 /* a, b */ d := 0 /* a, b, d */ e := a / * e, b, d */ loop: d := d+b /* e, b, d */ e := e-1 /* e, b, d */ if e>0 goto loop /* d */ r1 := d /* r1 */ c2 := M[c_loc] /* r1, c2 */ r3 := c2 /* r1, r3 */ return /* r1,r3 */

  28. enter: /* r2, r1, r3 */ c1 := r3 /* c1, r2, r1 */ M[c_loc] := c1 /* r2 */ a := r1 /* a, r2 */ b := r2 /* a, b */ d := 0 /* a, b, d */ e := a / * e, b, d */ loop: d := d+b /* e, b, d */ e := e-1 /* e, b, d */ if e>0 goto loop /* d */ r1 := d /* r1 */ c2 := M[c_loc] /* r1, c2 */ r3 := c2 /* r1, r3 */ return /* r1,r3 */

  29. stack stack Coalescing c1+r3; c2+c1r3

  30. stack stack Coalescing a+e; b+r2

  31. stack stack d Coalescing ae+r1; Simplify d

  32. stack Pop d stack d a r1 b r2 c1 r3 c2 r3 d r3 e r1 d

  33. enter: c1 := r3 M[c_loc] := c1 a := r1 b := r2 d := 0 e := a loop: d := d+b e := e-1 if e>0 goto loop r1 := d c2 := M[c_loc] r3 := c2 return /* r1,r3 */ enter: r3 := r3 M[c_loc] := r3 r1 := r1 r2 := r2 r3 := 0 r1 := r1 loop: r3 := r3+r2 r1 := r1-1 if r1>0 goto loop r1 := r3 r3 := M[c_loc] r3 := r3 return /* r1,r3 */ a r1 b r2 c1 r3 c2 r3 d r3 e r1

  34. enter: r3 := r3 M[c_loc] := r3 r1 := r1 r2 := r2 r3 := 0 r1 := r1 loop: r3 := r3+r2 r1 := r1-1 if r1>0 goto loop r1 := r3 r3 := M[c_loc] r3 := r3 return /* r1,r3 */ enter: M[c_loc] := r3 r3 := 0 loop: r3 := r3+r2 r1 := r1-1 if r1>0 goto loop r1 := r3 r3 := M[c_loc] return /* r1,r3 */

  35. Graph Coloring Implementation • Two kinds of queries on interference graph • Get all the nodes adjacent to node X • Is X and Y adjacent? • Significant save is obtained by not explicitly representing adjacency lists for machine registers • Need to be traversed when nodes are coalesced • Heuristic conditions for coalescing temporary X with machine register R: • For every T that is a neighbor of X any of the following is true(guarantee that the number of neighbors of $T$ are not increased from <K to =K): • T already interferes with R • T is a machine register • Degree(T) < K

  36. Worklist Management • Use work-list to avoid expensive iterative queries • Maintain various invariants as data structures are updated • Implement work-lists as doubly linked lists to allow element-deletion

  37. Graph Coloring Expression Trees • Drastically simpler • No need for data flow analysis • Optimal solution can be computed (using dynamic programming)

  38. Simple Register Allocation int n := 0 function SimpleAlloc(t) { for each non trivial tile u that is a child of t SimpleAlloc(u) for each non trivial tile u that is a child of t n := n - 1 n := n + 1 assign rn to hold the value of t }

  39. Bad Example BINOP BINOP PLUS MEM TIMES MEM MEM NAME a NAME b NAME c

  40. Two Phase SolutionDynamic ProgrammingSethi & Ullman • Bottom-up (labeling) • Compute for every subtree • The minimal number of registers needed • Top-Down • Generate the code using labeling by preferring “heavier” subtrees (larger labeling)

  41. void label(t) { for each tile u that is a child of t do label(u) if t is trivial then need[t] := 0 else if t has two children, left and rightthen if need[left] = need[right] then need[t] := need[left] + 1 else need[t] := max(1, need[left], need[right]) else if t has one child, c then need[t] := max(1, need[c]) else if t has no children then need[t] := 1 } The Labeling Algorithm

  42. int n := 0 void StehiUllman(t) { if t has two children,left and right then if need[left]>K and need[right]>K then StehiUllman (right) ; n := n – 1 ; spill: emit instruction to store reg[right] StehiUllman (left) unspill: emit instruction to fetch reg[right] else if need[left]> need[right] then StehiUllman (left) ; StehiUllman (right) ; n := n - 1 else if need[right]> need[left] then StehiUllman (right) ; StehiUllman (left) ; n := n – 1 reg[t] := rn; emit(OPER, instruction[t], reg[t], reg[left], reg[right]) else if t has one child c then StehiUllman(c) ; reg[t] : = rn; emit(OPER, instruction[t], reg[t], reg[c]) else if t is non trivial but has no children n := n + 1 ; reg[t] := rn ; emit (OPER, instruction[t], reg[t]) else if t a trivial node TEMP(ri) then reg[t] := ri

  43. Summary • Two Register Allocation Methods • Local of every IR tree • Simultaneous instruction selection and register allocation • Optimality (under certain conditions) • Global of every function • Applied after instruction selection • Performs well for machines with many registers • Can handle instruction level parallelism • Missing • Interprocedural allocation

More Related