UniProcessor Garbage Collection Techniques Paul R. Wilson University of Texas Presented By Naomi Sapir Tel-Aviv University
Overall Goal • Introduction of garbage collectors for uniprocessors. • Clarify basic issues in the field.
Motivation • Modular programming. • Unreclaimed memory leads to slow memory leaks. • Reclaiming too soon may cause unpredictable results. • GC should be built into a language implementation.
Two Phase Abstraction • Distinguish live objects from garbage in terms of the root set: - reference counting - mark-sweep - copying • Reclaim the garbage storage.
Root Set • Global variables. • Local variables in the activation stack. • Registers. • Live object: any object reachable from the root set. • What is garbage?
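The definition of liveness above can be sketched in a few lines of Python. This is only an illustration: the toy `heap` dictionary, the object names, and the `live_objects` helper are assumptions made for the example, not part of the original slides.

```python
# Hypothetical toy heap: object name -> names of the objects it points to.
heap = {
    "A": ["B"],
    "B": ["C"],
    "C": [],
    "D": ["E"],  # D and E reference each other, but no root reaches them
    "E": ["D"],
}
roots = ["A"]  # stand-in for globals, stack variables and registers


def live_objects(heap, roots):
    """Return the set of objects reachable from the root set."""
    live = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj in live:
            continue
        live.add(obj)
        pending.extend(heap[obj])
    return live


live = live_objects(heap, roots)
garbage = set(heap) - live  # everything that is not reachable
print(sorted(live), sorted(garbage))
```

Under this definition, garbage is simply whatever the traversal never visits, including the D/E cycle.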
Cons • Conservative approximation of true liveness. • Efficiency: cost proportional to the number of objects allocated at run time. - a real pointer points to another object - short-lived stack variables • Fragmentation.
Pros • Used in distributed systems combined with other techniques.
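Reference counting, the first technique named under the two-phase abstraction, can be sketched as follows. The `RCObj` class and the `add_ref`, `hold`, and `release` helpers are hypothetical names chosen for this illustration. The example shows one way a reference count can be a conservative approximation of liveness: a cycle keeps its counts above zero after the last root pointer is gone.

```python
class RCObj:
    def __init__(self, name, freed_log):
        self.name = name
        self.count = 0          # number of pointers to this object
        self.refs = []          # outgoing pointers
        self.freed_log = freed_log


def add_ref(src, dst):
    """Store a pointer from src to dst and bump dst's count."""
    src.refs.append(dst)
    dst.count += 1


def hold(obj):
    """A root (for example a stack variable) starts pointing at obj."""
    obj.count += 1


def release(obj):
    """A pointer to obj is destroyed; reclaim obj if its count drops to zero."""
    obj.count -= 1
    if obj.count == 0:
        obj.freed_log.append(obj.name)
        for child in obj.refs:
            release(child)
        obj.refs = []


freed = []

# A simple chain: root -> a -> b. Dropping the root frees both.
a, b = RCObj("a", freed), RCObj("b", freed)
hold(a)
add_ref(a, b)
release(a)

# A cycle: root -> x -> y -> x. Dropping the root frees nothing.
x, y = RCObj("x", freed), RCObj("y", freed)
hold(x)
add_ref(x, y)
add_ref(y, x)
release(x)

print(freed, x.count, y.count)
```

The cycle is unreachable after the final `release`, yet both counts stay at one, so a pure reference-counting scheme never reclaims it.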
[Figure: mark-sweep collection; labels: Page, Root, Object, Mark-swept]
Cons • Fragmentation: difficult to allocate large objects. • Locality of reference: objects of different ages are interleaved, which causes many page swaps. • Cost: proportional to the heap size.
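A minimal mark-sweep sketch makes the cost and fragmentation points concrete. The `Obj` class, the slot-list model of the heap, and the function names are assumptions for illustration only. The sweep visits every slot, so its cost follows the heap size, and the freed slots end up scattered between survivors.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []       # outgoing pointers
        self.marked = False


def mark(roots):
    """Mark phase: set the mark bit on everything reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: visit every slot, free unmarked objects in place."""
    free_slots = []
    for slot, obj in enumerate(heap):
        if obj is None:
            free_slots.append(slot)
        elif obj.marked:
            obj.marked = False   # reset for the next collection
        else:
            heap[slot] = None    # reclaim in place; survivors do not move
            free_slots.append(slot)
    return free_slots


a, b, c, d = (Obj(n) for n in "abcd")
a.refs = [b]
c.refs = [d]
d.refs = [c]            # c and d form an unreachable cycle
heap = [a, c, b, d]     # live and dead objects are interleaved

mark([a])
free_slots = sweep(heap)
layout = [o.name if o else None for o in heap]
print(layout, free_slots)
```

The two free slots are not adjacent, so an object needing two contiguous slots could not be allocated even though enough memory is free in total.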
Mark-Compact Collection • Compact after mark. Pros • Solves fragmentation. • Preserves locality. Cons • Cost: several passes over the data: mark, compute new locations, update pointers, and move the objects.
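The passes listed above can be sketched as a sliding compactor. This is an assumed, simplified model: the heap is a list of slots, pointers are slot indices, and the mark phase is taken as already done (its result is passed in as `marked`). None of these names come from the slides.

```python
def compact(heap, marked):
    """Slide live objects to the start of the heap.

    heap:   list of dicts {"name": str, "refs": [slot, ...]} or None
    marked: set of slot indices found live by a previous mark phase
    Returns the index of the first free slot after compaction.
    """
    # Pass 1: compute new locations, preserving the original order.
    new_addr = {}
    next_free = 0
    for addr in range(len(heap)):
        if addr in marked:
            new_addr[addr] = next_free
            next_free += 1

    # Pass 2: update pointers so they refer to the new locations.
    for addr in marked:
        heap[addr]["refs"] = [new_addr[r] for r in heap[addr]["refs"]]

    # Pass 3: move the objects (ascending order, so nothing live is overwritten).
    for old in sorted(new_addr):
        heap[new_addr[old]] = heap[old]
    for addr in range(next_free, len(heap)):
        heap[addr] = None
    return next_free


heap = [
    {"name": "A", "refs": [2]},
    {"name": "dead1", "refs": []},
    {"name": "B", "refs": [4]},
    {"name": "dead2", "refs": [0]},
    {"name": "C", "refs": []},
]
first_free = compact(heap, marked={0, 2, 4})
names = [o["name"] if o else None for o in heap]
print(names, first_free)
```

After compaction the free space is one contiguous region starting at `first_free`, and the survivors keep their relative order, which is why locality is preserved.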
Efficiency of Copying Collection • Cost is proportional to the amount of live data at collection time. • Decreasing the collection frequency decreases the collection effort. • Requires more heap memory. • Objects that die before GC needn’t be copied.
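A Cheney-style copying pass, sketched under simplifying assumptions, shows why the work depends only on live data. Here fromspace is a list of objects, each object is a list of pointer fields, and pointers are list indices; `cheney_copy` and the `forward` table are illustrative names, not taken from the slides.

```python
def cheney_copy(fromspace, roots):
    """Copy everything reachable from roots into a fresh tospace.

    fromspace: list of objects; each object is a list of fromspace indices
    roots:     list of fromspace indices
    Returns (tospace, new_roots) with all pointers as tospace indices.
    """
    tospace = []
    forward = {}  # fromspace index -> tospace index (forwarding pointers)

    def copy(index):
        if index not in forward:
            forward[index] = len(tospace)
            tospace.append(list(fromspace[index]))  # fields still point into fromspace
        return forward[index]

    new_roots = [copy(r) for r in roots]

    scan = 0  # objects before scan are fully updated; those after are copied but unscanned
    while scan < len(tospace):
        tospace[scan] = [copy(field) for field in tospace[scan]]
        scan += 1
    return tospace, new_roots


# Object 2 is garbage: it points at a live object, but nothing points at it.
fromspace = [[3], [0, 4], [1], [], [3]]
tospace, new_roots = cheney_copy(fromspace, roots=[1])
print(tospace, new_roots)
```

The garbage object is never touched: the loop runs once per live object, so an object that dies before the collection costs nothing.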
Basic Techniques - Conclusions • High performance systems use hybrid techniques. • Copy collectors use a separate large objects area. • In-place collectors (mark-sweep, treadmill) can be conservative with respect to untyped objects; a copying collector must identify pointers.
Problems with basic GC • When memory is not large, collection causes excessive paging. • A copying collection might cause paging. • Locality in cache memory is important. • Time consuming, not usable for real-time applications.
Incremental Tracing Collectors • Incremental tracing for garbage detection. • The running program may mutate the graph of reachable objects. - keep track of changes. - floating garbage.
Tricolor Marking [Figure: before and after a violation; D is not reachable]
Incremental Approaches • Coordinate the collector with the mutator. • Read barrier: when the mutator accesses a pointer to a white object, the object is colored grey. • Write barrier - a direct method - Traps the write of a pointer to a white object into a black object. - Traps the death of a pointer before it is reached by the GC.
Baker’s Incremental Copying • The best known real-time garbage collector. • Free list(tospace), Live list (fromspace). • Object: two pointers (next,prev) and a color (for the set). • Fast allocation: copying in Cheney fashion. • Read Barrier.
Baker’s Incremental Copying (cont’) • Tricolors: - Black: Scanned area in tospace - Grey: copied but not scanned. - White: unreached objects in fromspace. • Use scan-pointer on unscanned area of tospace, and move referred-to objects from fromspace.
Baker’s Incremental Copying (cont’) • Rule: scanned objects in tospace (black) cannot point to objects in fromspace (white). • If the mutator tries to access a pointer into fromspace (white), the referent is copied into tospace (grey) before the access. • New objects allocated during GC are placed in tospace; they are treated as live (black).
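The rule and the read barrier described above can be sketched as follows. This is a simplified illustration, not Baker's actual implementation: pointers are tagged tuples, tospace is a single Python list rather than a region allocated from both ends, and the `BakerHeap` class, its method names, and `work_per_alloc` are all assumptions made for the example.

```python
FROM, TO = "from", "to"


class BakerHeap:
    def __init__(self, fromspace, root, work_per_alloc=1):
        self.fromspace = fromspace  # list of objects; each is a list of fromspace indices
        self.tospace = []           # list of objects; each is a list of (space, index) pointers
        self.forward = {}           # fromspace index -> tospace index
        self.scan = 0               # objects before this index are black
        self.work_per_alloc = work_per_alloc
        self.root = self._evacuate(root)

    def _evacuate(self, index):
        """Copy a white fromspace object into tospace once, making it grey."""
        if index not in self.forward:
            self.forward[index] = len(self.tospace)
            self.tospace.append([(FROM, f) for f in self.fromspace[index]])
        return self.forward[index]

    def read(self, obj, field):
        """Read barrier: the mutator never receives a fromspace pointer."""
        space, index = self.tospace[obj][field]
        if space == FROM:
            index = self._evacuate(index)
            self.tospace[obj][field] = (TO, index)
        return index

    def scan_step(self):
        """Scan one grey object, turning it black. Returns False when done."""
        if self.scan >= len(self.tospace):
            return False
        for field in range(len(self.tospace[self.scan])):
            self.read(self.scan, field)
        self.scan += 1
        return True

    def allocate(self, fields):
        """Allocate in tospace; collector work is tied to allocation."""
        for _ in range(self.work_per_alloc):
            self.scan_step()
        # New objects hold only tospace pointers, so they are effectively black.
        self.tospace.append([(TO, f) for f in fields])
        return len(self.tospace) - 1


# Fromspace objects 0 -> 1 -> 2 are live; object 3 is garbage.
heap = BakerHeap([[1], [2], [], [0]], root=0)
b = heap.read(heap.root, 0)   # barrier copies object 1 before the mutator sees it
new = heap.allocate([b])      # allocation also performs one scan step
while heap.scan_step():
    pass
copied = sorted(heap.forward)
all_in_tospace = all(space == TO for obj in heap.tospace for space, _ in obj)
print(copied, new, all_in_tospace)
```

Because every pointer the mutator obtains has passed through `read`, it can only ever store tospace pointers, which is what keeps black objects from pointing into fromspace.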
Baker’s Incremental Copying implementation • Rate of copying is tied to the rate of run-time allocation. • Read barrier: compiled in software or implemented by hardware checks and/or microcode routines (Lisp Machines). • Time overhead on the order of 20%.
The Treadmill (Baker) • Non-copying. • Doubly linked lists.
The Treadmill (cont’) [Figure: the treadmill during allocation]
The Treadmill Conservatism • Allocated objects are marked live, but might die before the collection finishes. • A pre-existing object that is marked live might die after being reached. • If the mutator destroys all the pointers from grey objects to a white object before the collector reaches it, the white object is never reached and its memory is reclaimed.
Snapshot-at-Beginning Write-Barrier (Yuasa) • Cheaper than a read barrier, as heap writes are less common than heap reads. • The graph of reachable objects is fixed from the beginning. • All objects are accessed by the GC during collection (it saves overwritten values). • More conservative than Baker: all pointers are retained, and nothing is freed during GC.
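A snapshot-at-beginning barrier can be sketched with a small tricolor marker. The `Node` and `SnapshotMarker` classes and their method names are hypothetical, and the barrier here simply shades the overwritten referent grey, which is one assumed way of saving the overwritten value.

```python
WHITE, GREY, BLACK = "white", "grey", "black"


class Node:
    def __init__(self, name):
        self.name = name
        self.fields = {}     # field name -> Node or None
        self.color = WHITE


class SnapshotMarker:
    def __init__(self, roots):
        self.grey = []
        for root in roots:
            self.shade(root)

    def shade(self, node):
        """Turn a white node grey so the collector will visit it."""
        if node is not None and node.color == WHITE:
            node.color = GREY
            self.grey.append(node)

    def write(self, obj, field, new_target):
        """Write barrier: save the overwritten pointer before it is lost."""
        self.shade(obj.fields.get(field))
        obj.fields[field] = new_target

    def step(self):
        """Scan one grey node. Returns False when marking is complete."""
        if not self.grey:
            return False
        node = self.grey.pop()
        for child in node.fields.values():
            self.shade(child)
        node.color = BLACK
        return True


a, b, d, e = Node("a"), Node("b"), Node("d"), Node("e")
a.fields["x"] = b
b.fields["y"] = d
b.fields["w"] = e

marker = SnapshotMarker([a])
marker.step()                  # a is now black, b is grey, d and e are white

marker.write(a, "z", d)        # mutator hides d behind the black object a ...
marker.write(b, "y", None)     # ... and erases the only path the collector knew
marker.write(b, "w", None)     # e becomes garbage during the collection

while marker.step():
    pass
colors = {n.name: n.color for n in (a, b, d, e)}
print(colors)
```

Object `d` is correctly retained, and so is `e`, even though it died during the collection; that retained `e` is the extra conservatism mentioned above.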
Tricolor Marking [Figure: before; D is reachable]
Incremental Update Write-Barrier (Dijkstra) • Heuristically retains live objects at the end of GC. • Objects that die during GC, before being reached by the collector, may be reclaimed. • Records a pointer that escapes into an already reached object (black → white) (grey → white).
Incremental Update Write-Barrier (Dijkstra) cont’ • New objects are allocated white: short-lived objects will not be traversed early, but will be reclaimed quickly (advantage).
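An incremental-update barrier can be sketched in the same tricolor setting. This is an assumed simplification in the spirit of the description above, not a faithful reproduction of Dijkstra's algorithm: the barrier shades a white object grey when a pointer to it is stored into a black object. The class and method names are hypothetical.

```python
WHITE, GREY, BLACK = "white", "grey", "black"


class Node:
    def __init__(self, name):
        self.name = name
        self.fields = {}     # field name -> Node or None
        self.color = WHITE   # new objects are allocated white


class IncrementalUpdateMarker:
    def __init__(self, roots):
        self.grey = []
        for root in roots:
            self.shade(root)

    def shade(self, node):
        if node is not None and node.color == WHITE:
            node.color = GREY
            self.grey.append(node)

    def write(self, obj, field, new_target):
        """Write barrier: record a pointer that escapes into a black object."""
        if obj.color == BLACK:
            self.shade(new_target)
        obj.fields[field] = new_target

    def step(self):
        """Scan one grey node. Returns False when marking is complete."""
        if not self.grey:
            return False
        node = self.grey.pop()
        for child in node.fields.values():
            self.shade(child)
        node.color = BLACK
        return True


a, b, d, e = Node("a"), Node("b"), Node("d"), Node("e")
a.fields["x"] = b
b.fields["y"] = d
b.fields["w"] = e

marker = IncrementalUpdateMarker([a])
marker.step()                  # a is black, b is grey, d and e are white

marker.write(a, "z", d)        # black -> white store is trapped; d becomes grey
marker.write(b, "y", None)
marker.write(b, "w", None)     # e dies before the collector reaches it
temp = Node("temp")            # allocated white during GC and never stored anywhere

while marker.step():
    pass
colors = {n.name: n.color for n in (a, b, d, e, temp)}
print(colors)
```

At the end `d` is retained, while `e` and the short-lived `temp` stay white and can be reclaimed in this cycle.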
Generational Garbage Collection • New objects are allocated in the New Gen. • When it is full, only the New Gen is scavenged; objects that have become old are then copied to the Old Gen. • Pointers Old Gen → New Gen are included in the root set. • Does not copy all live data at a collection. • Does not copy old objects repeatedly.
Generational Garbage Collection Cont’ • Copy collector: all pointers to moved objects are updated. • Conservative approximation of true liveness: not all Old Gen → New Gen pointers are live; the objects they reference float until the Old Gen is scavenged.
Generational Garbage Collection Cont’ • Newer generations are usually smaller than older ones, so scanning them is faster. • Better locality. • Recording of intergenerational pointers is not tied to the rate of object creation, but it still might slow the program.
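The generational scheme from the slides above can be sketched with two lists and a remembered set. This is an illustrative model only: the `GenHeap` class, its method names, and the policy of promoting every survivor after one scavenge are assumptions, and promotion is modelled by moving objects between Python lists rather than by copying memory.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.old = False   # True once promoted to the old generation


class GenHeap:
    def __init__(self):
        self.young = []
        self.old = []
        self.remembered = set()  # old objects that may point into the young gen

    def allocate(self, name):
        obj = Obj(name)
        self.young.append(obj)
        return obj

    def write(self, src, target):
        """Write barrier: record Old Gen -> New Gen pointers."""
        src.refs.append(target)
        if src.old and not target.old:
            self.remembered.add(src)

    def minor_collect(self, roots):
        """Scavenge only the young generation; return the names reclaimed."""
        pending = [r for r in roots if not r.old]
        for src in self.remembered:          # treated as part of the root set
            pending.extend(t for t in src.refs if not t.old)
        live = set()
        while pending:
            obj = pending.pop()
            if obj in live:
                continue
            live.add(obj)
            pending.extend(t for t in obj.refs if not t.old)
        reclaimed = [o.name for o in self.young if o not in live]
        for o in self.young:
            if o in live:
                o.old = True                 # promote survivors
                self.old.append(o)
        self.young = []
        self.remembered.clear()              # no young objects remain
        return reclaimed


heap = GenHeap()
a = heap.allocate("a")
first = heap.minor_collect([a])      # a survives and is promoted

b = heap.allocate("b")
c = heap.allocate("c")
d = heap.allocate("d")
heap.write(a, b)                     # old -> young pointer is recorded
second = heap.minor_collect([c])     # a is no longer a root, but the old gen is not scanned
old_names = [o.name for o in heap.old]
print(first, second, old_names)
```

In the second collection `b` survives only because the dead old object `a` still points to it, which illustrates the floating garbage that remains until the Old Gen itself is scavenged.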
Conclusions • Generational techniques reduce cost as objects tend to die fast. • Generational techniques with write barrier can support incremental update collection. • We studied several kinds of GCs. • Most important characteristics of GCs. • Constant factors of cost (locality effects). • Understanding current research.