Connectivity-Based Garbage Collection. Presenter: Feng Xian. Author: Martin Hirzel et al. Published in OOPSLA 2003.
Connectivity-Based Garbage Collection
Presenter: Feng Xian
Author: Martin Hirzel et al.
Published in OOPSLA 2003
Garbage Collection Benefits
Garbage collection leads to simpler:
• Design: no complex deallocation protocols
• Implementation: automatic deallocation
• Maintenance: fewer bugs
Benefits are widely accepted:
• Java, C#, Python, …
Garbage Collection: Haven’t we solved this problem yet?
• For a state-of-the-art garbage collector:
  • time: ~14% of execution time
  • space: 3x high watermark
  • pauses: 0.8 seconds
• Can reduce any one cost
• Challenge: reduce all three costs
Example Heap
[Figure: heap graph with objects o1 to o15, stack variables s1 and s2, and global g]
• Boxes: heap objects
• Arrows: pointers
• Long box: stack + global variables
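As a minimal sketch of what the picture conveys, a heap can be modeled as a directed graph, with liveness defined as reachability from the stack and global variables. The object names and pointers below are hypothetical and are not the ones drawn on the slide.

```python
from collections import deque

def reachable(roots, pointers):
    """Return every node reachable from the roots by following pointers."""
    seen = set(roots)
    work = deque(roots)
    while work:
        obj = work.popleft()
        for target in pointers.get(obj, ()):
            if target not in seen:
                seen.add(target)
                work.append(target)
    return seen

# Hypothetical heap: s1 is a stack variable, g is a global, a..e are objects.
pointers = {
    "s1": ["a"],
    "g": ["b"],
    "a": ["c"],
    "b": [],
    "c": [],
    "d": ["e"],  # d and e point only at each other's region, no root reaches them
    "e": [],
}

live = reachable(["s1", "g"], pointers)
garbage = set(pointers) - live  # nodes that no root can reach
```

In this toy heap, `d` and `e` form a connected structure that becomes garbage as a unit.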
Thesis
1. Objects form distinct data structures
2. Connected objects die together
3. Garbage collectors can exploit 1. and 2. to reclaim objects efficiently
[Figure: example heap, objects o1 to o15 with stack + globals]
Experimental Infrastructure
JikesRVM Research Virtual Machine:
• From IBM Research
• Written in Java
• Application and runtime system share the heap, so good garbage collection is even more important
Benchmarks:
• SPECjvm98 suite and SPECjbb2000
• Java Olden suite
• xalan, ipsixql, nfc, jigsaw
Outline
• Garbage Collector Design Principles
• Family of Garbage Collectors
• Design Space Exploration
• Conclusion
Garbage Collector Design Principles: “Do partial collections.”
• Don’t collect the full heap every time
• Shorter pause times
[Figure: example heap, objects o1 to o15 with stack + globals]
Garbage Collector Design Principles: “Predict lifetime based on age.”
• Generational hypothesis: most objects die young
• Generational garbage collection:
  • Partition by age
  • Collect young objects most often
  • Low time overhead
• That’s the state of the art.
[Figure: example heap split into a young generation and an old generation]
Garbage Collector Design Principles: Generational GC Problems
• Regular full collections: long peak pause
• Old-to-young pointers: need bookkeeping
[Figure: example heap split into a young generation and an old generation]
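The bookkeeping for old-to-young pointers is commonly done with a write barrier that records old objects pointing into the young generation (a remembered set). The following is a toy sketch of that general idea, not the collector evaluated in the talk; the class and method names are made up for illustration.

```python
class GenerationalHeap:
    """Toy two-generation heap; objects are names, fields are lists of names."""

    def __init__(self):
        self.young = set()
        self.old = set()
        self.fields = {}         # object -> list of objects it points to
        self.remembered = set()  # old objects that point into the young generation

    def allocate(self, name):
        self.young.add(name)
        self.fields[name] = []

    def write(self, src, dst):
        """Pointer store with a write barrier for old-to-young pointers."""
        self.fields[src].append(dst)
        if src in self.old and dst in self.young:
            self.remembered.add(src)

    def minor_collect(self, roots):
        """Collect only the young generation; survivors are promoted."""
        work = [r for r in roots if r in self.young]
        # Remembered old objects act as extra roots (conservatively kept alive).
        for src in self.remembered:
            work.extend(t for t in self.fields[src] if t in self.young)
        live = set()
        while work:
            obj = work.pop()
            if obj in live:
                continue
            live.add(obj)
            work.extend(t for t in self.fields[obj] if t in self.young)
        dead = self.young - live
        for obj in dead:
            del self.fields[obj]
        self.old |= live
        self.young = set()
        self.remembered.clear()
        return dead

heap = GenerationalHeap()
heap.allocate("a")
heap.allocate("b")
first = heap.minor_collect(roots=["a"])   # "b" is unreachable and reclaimed
heap.allocate("c")
heap.allocate("d")
heap.write("a", "c")                      # old-to-young pointer, recorded by the barrier
second = heap.minor_collect(roots=[])     # "c" survives only because of the remembered set
```

Note that in the second collection `a` itself is no longer reachable, yet `c` is kept: the old generation is not examined during a partial collection, which is one reason regular full collections remain necessary.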
Garbage Collector Design Principles: “Collect connected objects together.”
[Figure: pairs of objects o1 and o2]
Garbage Collector Design Principles: “Focus on objects with few ancestors.”
• Short-lived objects are easy to collect
Garbage Collector Design Principles: “Predict lifetime based on roots.”
[Figure: objects o1 to o4, stack variable s, global g, stack + globals]
For details, see the [ISMM’02] paper.
Family of Garbage Collectors: Connectivity-Based Garbage Collection (CBGC)
• Do partial collections.
• Collect connected objects together.
• Predict lifetime based on age.
• Focus on objects with few ancestors.
• Predict lifetime based on roots.
[Figure: example heap, objects o1 to o15 grouped into partitions p1 to p4, with stack + globals]
Family of Garbage Collectors: Components of CBGC
Before allocation:
• Partitioning: decide into which partition to put each object
Collection algorithm:
• Estimator: estimate dead + live objects for each partition
• Chooser: choose “good” set of partitions
• Partial collection: collect chosen partitions
Family of Garbage Collectors: Partitioning Problem
Find fine-grained partitions, where:
• Partition edges respect pointers
• Objects don’t move between partitions
[Figure: example heap, objects o1 to o15 grouped into partitions p1 to p4, with stack + globals]
Family of Garbage Collectors: Partitioning Solutions
Pointer analysis:
• Type-based [Harris]: o1 may point to o2 if o1 has a field of a type compatible with the type of o2
• Conservative: the analysis reports the absence of a pointer between two heap objects only if it can prove that such a pointer cannot exist
[Figure: example heap, objects o1 to o15 grouped into partitions p1 to p4, with stack + globals]
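One way to turn a type-based may-point-to relation into partitions is to group types that can reach each other through fields, so that the resulting partition graph has no cycles. The sketch below does this with strongly connected components; treating each component as a partition is an assumption made for illustration, and the type names are hypothetical.

```python
def strongly_connected_components(graph):
    """Kosaraju's algorithm; returns a map from node to its component's representative."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    order, seen = [], set()

    def visit(node):
        # First pass: depth-first search recording finishing order.
        seen.add(node)
        for target in graph.get(node, ()):
            if target not in seen:
                visit(target)
        order.append(node)

    for node in sorted(nodes):
        if node not in seen:
            visit(node)

    # Second pass: search the reversed graph in decreasing finishing order.
    reverse = {node: [] for node in nodes}
    for node, targets in graph.items():
        for target in targets:
            reverse[target].append(node)

    component = {}
    for node in reversed(order):
        if node in component:
            continue
        component[node] = node
        stack = [node]
        while stack:
            current = stack.pop()
            for pred in reverse[current]:
                if pred not in component:
                    component[pred] = node
                    stack.append(pred)
    return component

# Hypothetical may-point-to relation between types: "X": ["Y"] means
# an object of type X has a field whose type is compatible with Y.
may_point_to = {
    "List": ["Node"],
    "Node": ["Node", "Item"],
    "Item": ["Owner"],
    "Owner": ["Item"],
    "String": [],
}

partition_of_type = strongly_connected_components(may_point_to)
partitions = set(partition_of_type.values())
```

Because an object's type is fixed at allocation, the partition chosen this way never changes, which matches the requirement that objects don't move between partitions.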
Family of Garbage Collectors: Estimator Problem
For each partition, guess:
• dead: objects that can be reclaimed (pay-off)
• live: objects that must be traversed (cost)
[Figure: partitions with estimates. p1: 1 dead + 2 live; p2: 3 dead + 3 live; p3: 2 dead + 0 live; p4: 2 dead + 2 live; stack + globals]
Family of Garbage Collectors: Estimator Solutions
Heuristics:
• Connected objects die together
• Most objects die young
• Objects reachable from globals live long
• The past predicts the future
[Figure: partitions with estimates. p1: 1 dead + 2 live; p2: 3 dead + 3 live; p3: 2 dead + 0 live; p4: 2 dead + 2 live; stack + globals]
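To make "the past predicts the future" concrete, here is one possible estimator that tracks a per-partition survival rate and applies it to the current object count. The exponential averaging, the default rate of 0.5, and all names are assumptions for illustration; the slides do not specify how the estimators are computed.

```python
class SurvivalEstimator:
    """Guess dead and live counts from survival rates seen in past collections."""

    def __init__(self, initial_rate=0.5, weight=0.5):
        self.initial_rate = initial_rate  # assumed rate for partitions never collected
        self.weight = weight              # how strongly the latest observation counts
        self.rate = {}

    def record(self, partition, objects_before, survivors):
        """Update the partition's survival rate after it has been collected."""
        if objects_before == 0:
            return
        observed = survivors / objects_before
        previous = self.rate.get(partition, self.initial_rate)
        self.rate[partition] = (1 - self.weight) * previous + self.weight * observed

    def estimate(self, partition, objects_now):
        """Return (estimated dead, estimated live) for the partition."""
        rate = self.rate.get(partition, self.initial_rate)
        live = round(objects_now * rate)
        return objects_now - live, live

estimator = SurvivalEstimator()
fresh = estimator.estimate("p2", 6)    # no history yet: falls back to the initial rate
estimator.record("p3", objects_before=4, survivors=0)
after = estimator.estimate("p3", 4)    # history says objects in p3 rarely survive
```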
Family of Garbage Collectors: Chooser Problem
Pick subset of partitions:
• Maximize total dead
• Minimize total live
• Closed under predecessor relation: no bookkeeping for external pointers
[Figure: partitions with estimates. p1: 1 dead + 2 live; p2: 3 dead + 3 live; p3: 2 dead + 0 live; p4: 2 dead + 2 live; chosen set totals 7 dead + 5 live; stack + globals]
Family of Garbage Collectors: Chooser Solutions
• Optimal algorithm based on network flow [TR]
• Simpler, greedy algorithm
[Figure: partitions with estimates. p1: 1 dead + 2 live; p2: 3 dead + 3 live; p3: 2 dead + 0 live; p4: 2 dead + 2 live; chosen set totals 7 dead + 5 live; stack + globals]
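The following sketch shows one possible greedy chooser, not necessarily the one used by the authors. It uses the per-partition estimates as read from the slide, but the partition edges and the scoring rule (estimated dead minus estimated live) are assumptions made for illustration.

```python
def predecessor_closure(partition, preds):
    """The partition plus every partition that may, transitively, point into it."""
    closed, work = set(), [partition]
    while work:
        current = work.pop()
        if current not in closed:
            closed.add(current)
            work.extend(preds.get(current, ()))
    return closed

def greedy_choose(estimates, preds):
    """Repeatedly add the predecessor-closed group with the best positive gain."""
    chosen = set()
    while True:
        best, best_gain = None, 0
        for partition in sorted(estimates):
            extra = predecessor_closure(partition, preds) - chosen
            gain = sum(estimates[p][0] - estimates[p][1] for p in extra)
            if gain > best_gain:
                best, best_gain = extra, gain
        if best is None:
            return chosen
        chosen |= best

# (estimated dead, estimated live) per partition, as read from the slide.
estimates = {"p1": (1, 2), "p2": (3, 3), "p3": (2, 0), "p4": (2, 2)}
# Hypothetical edges: preds[p] lists the partitions that may point into p.
preds = {"p1": ["p2"], "p3": ["p2", "p4"]}

chosen = greedy_choose(estimates, preds)
total_dead = sum(estimates[p][0] for p in chosen)
total_live = sum(estimates[p][1] for p in chosen)
```

Under these assumed edges the result is p2, p3, and p4, which adds up to the 7 dead + 5 live total shown on the slide. Since a union of predecessor-closed sets is itself predecessor-closed, the chosen set never has pointers entering it from unchosen partitions.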
Family of Garbage Collectors: Partial Collection Problem
• Look only at chosen partitions
• Traverse reachable objects
• Reclaim unreachable objects
[Figure: chosen partitions p2, p3, and p4 with their objects; rest of heap; stack + globals]
Family of Garbage Collectors: Partial Collection Solutions
Generalize canonical full-heap algorithms:
• Mark and sweep [McCarthy’60]
• Semi-space copying [Cheney’70]
• Treadmill [Baker’92]
[Figure: chosen partitions p2, p3, and p4 with their objects; rest of heap; stack + globals]
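As a rough illustration of generalizing mark and sweep to a partial collection, the sketch below marks only within the chosen partitions and sweeps only their objects. It assumes the chosen set is closed under the predecessor relation, so the only pointers into it come from the stack and globals; the heap contents and function name are hypothetical.

```python
def partial_mark_sweep(roots, fields, partition_of, chosen):
    """Return the unreachable objects in the chosen partitions.

    roots: objects pointed to directly by the stack and globals.
    fields: object -> list of objects it points to.
    """
    marked = set()
    work = [obj for obj in roots if partition_of[obj] in chosen]
    while work:
        obj = work.pop()
        if obj in marked:
            continue
        marked.add(obj)
        # Pointers leaving the chosen partitions are not followed:
        # objects in the rest of the heap are simply left alone.
        work.extend(t for t in fields[obj] if partition_of[t] in chosen)
    candidates = {obj for obj in fields if partition_of[obj] in chosen}
    return candidates - marked

# Hypothetical heap: p2 may point into p1 and p3, so {p2, p3} is predecessor-closed.
partition_of = {"a": "p1", "b": "p2", "c": "p2", "d": "p3", "e": "p3"}
fields = {"a": [], "b": ["a", "d"], "c": ["e"], "d": [], "e": []}

reclaimed = partial_mark_sweep(
    roots=["b"], fields=fields, partition_of=partition_of, chosen={"p2", "p3"}
)
```

Object `a` is never visited because its partition was not chosen, which is what keeps the pause proportional to the chosen partitions rather than the whole heap.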
Design Space Exploration: Questions
• How good is a naïve CBGC?
• How good could CBGC be in 20 years?
• How well does CBGC do in a JVM?
Design Space Exploration: Simulator Methodology
Garbage collection simulator (under GPL):
• Uses traces of allocations and pointer writes from our benchmark runs
Simulator advantages:
• Easier to implement a variety of collector algorithms
• Know entire trace beforehand: can use that for “in 20 years” experiments
Currently adding CBGC to JikesRVM
Design Space Exploration: How good is a naïve CBGC?
[Figure: three plots for the jack, xalan, jbb, and javac benchmarks]
Design Space Exploration: How good could CBGC be in 20 years?
[Figure: three plots for the jack, xalan, jbb, and javac benchmarks]
Design Space Exploration: How good could CBGC be in 20 years?
CBGC with oracles beats Appel:
• We did not find a “performance wall”
• CBGC has potential
The performance gap between CBGC with oracles and naïve CBGC is large:
• Research challenges
How well does CBGC do in a Java virtual machine?
• Implementation in progress
• Need a pointer analysis for the partitioning
Contributions presented in this talk
• Connectivity-based GC design principles [ISMM’02]
• CBGC, a new family of garbage collectors
• Design space exploration with simulator [OOPSLA’03]