

Thread-specific Heaps for Multi-threaded Programs

Bjarne Steensgaard

Microsoft Research

GC and Threads

Traditional approaches:

  • Pseudo-concurrency => no concurrency

  • Concurrent GC => synchronization overhead

  • Stop and GC => no concurrency during GC

Observations leading to our approach:

  • Much data is only used by a single thread

  • When collecting data used only by a single thread, other threads can be ignored

GC and Thread-specific Heaps

Thread-specific Heaps

  • Contain data accessed by only a single thread

  • Can be GC’ed independently of and concurrently with other thread-specific heaps (no pointers from the outside into these heaps)

Shared Heap

  • Contains data possibly shared among threads

  • GC’ed using one of the traditional approaches
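The independence property can be illustrated with a small model. This is a minimal sketch, not Marmot's collector: the `Obj` class, the heap labels, and `live_in_thread_heap` are hypothetical names introduced here. It assumes the invariant stated above holds, so tracing can stop as soon as it leaves the thread's own heap.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # identity-based hashing, so objects can be stored in a set
class Obj:
    name: str
    heap: str                      # "shared" or an owning thread such as "t1"
    refs: list = field(default_factory=list)

def live_in_thread_heap(thread, thread_roots):
    """Names of live objects in one thread's heap, traced from that thread's roots only."""
    live, seen, stack = set(), set(), list(thread_roots)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        if obj.heap != thread:
            # Shared objects never point back into a thread heap (assumed invariant),
            # so there is nothing of ours to find beyond this object.
            continue
        live.add(obj.name)
        stack.extend(obj.refs)
    return live

s = Obj("s", "shared")
c = Obj("c", "t1")
a = Obj("a", "t1", [s, c])
b = Obj("b", "t1")                 # not reachable from any root of t1
print(sorted(live_in_thread_heap("t1", [a])))
```

No other thread's roots or heap are consulted, which is what allows each thread-specific heap to be collected concurrently with the others.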

Advantages

  • Concurrent collection of thread heaps

  • Increased locality of GC

  • Reduced GC latency (shorter “stops”)

  • Reduced memory overhead for two-space copying components of GC

    • “To”-space is only needed for heaps actively being copied; “from”-space can be released as copying of each heap is completed
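The memory saving can be made concrete with a worst-case estimate. The sketch below is an illustration under stated assumptions: every object is live, the heap sizes are made-up numbers, and `to_space_needed` is a hypothetical helper, not part of any collector.

```python
def to_space_needed(heap_sizes, concurrent_copies):
    """Worst-case to-space (all objects live) when at most
    `concurrent_copies` heaps are being copied at the same time."""
    largest = sorted(heap_sizes, reverse=True)[:concurrent_copies]
    return sum(largest)

# Hypothetical sizes in MB: one shared heap and three thread heaps.
sizes = [16, 4, 4, 4]
print(to_space_needed(sizes, len(sizes)))  # all heaps copied at once
print(to_space_needed(sizes, 1))           # one heap copied at a time
```

With a single undivided heap the reserve must cover everything; with divided heaps it only has to cover whichever heaps are in flight.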

Enabling Thread-specific Heaps

Memory requests must be specialized

  • Shared or thread-specific; choose conservatively

  • Must observe the invariant that there are no pointers from shared data to thread-specific data

Root set division

  • May distinguish shared and thread-specific roots

  • Not necessary (and not implemented), but could reduce GC latency
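The pointer invariant can be stated as a check over an object graph. This is a sketch with an assumed representation (a dictionary from object name to its heap and outgoing references); `violations` and the sample object names are hypothetical. It flags any pointer that enters a thread-specific heap from outside, which covers pointers from shared data.

```python
SHARED = "shared"

def violations(objects):
    """objects: dict name -> (heap, [names of referenced objects]).
    Returns the (source, target) edges that point into a thread-specific
    heap from a different heap."""
    bad = []
    for name, (heap, refs) in objects.items():
        for target in refs:
            target_heap = objects[target][0]
            if target_heap != SHARED and target_heap != heap:
                bad.append((name, target))
    return bad

graph = {
    "cache": (SHARED, ["buf"]),   # shared -> thread-specific: not allowed
    "buf":   ("t1", []),
    "tmp":   ("t1", ["cache"]),   # thread-specific -> shared: fine
}
print(violations(graph))
```

Choosing conservatively means that whenever the compiler cannot rule out an edge like `cache -> buf`, the target is allocated in the shared heap instead.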

Compiler Support in Marmot

Escape and Access Analysis

  • Interprocedural, flow-insensitive, context-sensitive

  • Polymorphic type inference (monomorphic recursion) for a non-standard type system

  • Tracks object flow and which threads access each object

  • Objects “escape” only when potentially accessed by multiple threads (as opposed to being visible to multiple threads)
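The escape criterion can be illustrated with a toy classifier. This is only a sketch of the classification rule, not of Marmot's type-inference-based analysis: the input format (pairs of thread and allocation site) and the name `classify` are assumptions made for the example. Ignoring the order of the pairs loosely mirrors flow-insensitivity.

```python
from collections import defaultdict

def classify(accesses):
    """accesses: iterable of (thread, allocation_site) pairs, in any order.
    A site escapes only if more than one thread may access its objects."""
    threads_by_site = defaultdict(set)
    for thread, site in accesses:
        threads_by_site[site].add(thread)
    return {
        site: "shared" if len(threads) > 1 else "thread-specific"
        for site, threads in threads_by_site.items()
    }

accesses = [("main", "config"), ("worker", "config"), ("worker", "scratch")]
print(classify(accesses))
```

An object that another thread could reach but never touches contributes no access pair for that thread, so under this rule it stays thread-specific.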

Compiler Support in Marmot

Method specialization

  • Duplicate methods as necessary to specialize memory requests according to analysis results (and to call other specialized methods)

  • Crucial for achieving a usable separation of objects into shared and thread-specific objects

Very similar to Ruf’s PLDI’00 work

  • Analysis and transformation stages are similar to Ruf’s work to remove synchronization ops
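What method specialization produces can be sketched by hand. The example below is hypothetical throughout (`alloc`, `make_pair_shared`, `make_pair_local`, `publish`, and `sum_pair` are invented names): one source-level method with a single allocation site is duplicated into two clones, and each caller is bound to the clone that matches what the analysis concluded about its result.

```python
def alloc(heap, value):
    # Stand-in for a specialized memory request; records which heap was chosen.
    return {"heap": heap, "value": value}

# Two clones of the same source method, differing only in the memory request.
def make_pair_shared(x, y):
    return alloc("shared", (x, y))

def make_pair_local(x, y):
    return alloc("thread", (x, y))

def publish(registry, x, y):
    # The result is stored where other threads may access it,
    # so this caller is bound to the shared clone.
    registry.append(make_pair_shared(x, y))

def sum_pair(x, y):
    # The result never leaves this call, so the thread-specific clone is safe.
    pair = make_pair_local(x, y)
    return pair["value"][0] + pair["value"][1]

registry = []
publish(registry, 1, 2)
print(registry[0]["heap"], sum_pair(1, 2))
```

Without the duplication, the single allocation site would have to be treated as shared for every caller, which is why specialization matters for a usable separation.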

Thread specific gc in marmot l.jpg
Thread-specific GC in Marmot

Prototype! Proof of concept

  • Modified two-generation copying GC

  • Each heap has two generations

When a GC is triggered, all heaps are GC’ed

  • Reachable objects in the shared heap are copied first by a single thread

  • Threads then copy objects from their own heaps (helper threads are available for blocked threads)

  • When thread copying is complete, thread is restarted

  • Minimal synchronization needed for copying shared objects after initial copy of shared objects
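The collection sequence above can be simulated in a few lines. This is a rough sketch under several assumptions, not the Marmot implementation: "copying" is modeled as recording object names, object names are assumed unique, generations, helper threads, and thread restart are omitted, and `Obj` and `collect_all` are hypothetical names. The lock stands in for the synchronization needed when a thread reaches a shared object that the initial pass did not copy.

```python
import threading

class Obj:
    def __init__(self, name, heap, refs=()):
        self.name, self.heap, self.refs = name, heap, list(refs)

def collect_all(shared_roots, thread_roots):
    """thread_roots: dict thread name -> list of root objects.
    Returns (copied shared names, dict thread -> copied names)."""
    copied_shared = set()
    shared_lock = threading.Lock()
    copied_by_thread = {t: set() for t in thread_roots}

    def copy_shared(obj):
        # Shared objects only reference shared objects (assumed invariant).
        stack = [obj]
        while stack:
            o = stack.pop()
            with shared_lock:
                if o.name in copied_shared:
                    continue
                copied_shared.add(o.name)
            stack.extend(o.refs)

    # Phase 1: a single thread copies what is reachable from the shared roots.
    for root in shared_roots:
        copy_shared(root)

    # Phase 2: each thread copies its own heap, concurrently with the others.
    def copy_thread_heap(thread):
        mine = copied_by_thread[thread]
        stack = list(thread_roots[thread])
        while stack:
            o = stack.pop()
            if o.heap == "shared":
                copy_shared(o)          # late-discovered shared object
            elif o.name not in mine:
                mine.add(o.name)
                stack.extend(o.refs)

    workers = [threading.Thread(target=copy_thread_heap, args=(t,))
               for t in thread_roots]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return copied_shared, copied_by_thread

s1, s2 = Obj("s1", "shared"), Obj("s2", "shared")
a, b = Obj("a", "t1", [s2]), Obj("b", "t2")
shared, per_thread = collect_all([s1], {"t1": [a], "t2": [b]})
print(sorted(shared), per_thread)
```

Here `s2` is reachable only through a thread-specific object, so it is copied during the second phase under the lock, while each thread's own set is touched by that thread alone.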

Example

[Figure: a shared root and roots for threads 1, 2, and 3, with objects marked as thread-specific or shared]

Performance and Efficacy

Performance

  • On par with existing garbage collector for most programs, better for others

Efficacy

  • Unknown! Most available programs do not use multi-threading for interesting purposes

Efficacy Examples

  • VolanoMark (chat client/server) shares almost all long-lived data among threads

    • Client: allocates ½MB thread, 16MB shared data; copies 4KB thread, 1.2MB shared data

    • Server: allocates 5MB thread, 10MB shared data; copies 5KB thread, 1.7MB shared data

    • GC has improved locality, but otherwise little benefit

  • Mtrt benefits greatly, but is a poor benchmark

    • Allocates 27MB thread, ½MB shared data; copies 6.5MB thread, 170MB shared data
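To compare the benchmarks, the allocation figures above can be turned into the fraction of allocated data that is thread-specific. This is plain arithmetic on the quoted numbers; the helper name `thread_share` is introduced here for illustration.

```python
def thread_share(thread_mb, shared_mb):
    """Percentage of allocated data that went to thread-specific heaps."""
    return round(100 * thread_mb / (thread_mb + shared_mb), 1)

# (thread-specific MB, shared MB) allocated, as quoted on the slide
allocated = {
    "VolanoMark client": (0.5, 16),
    "VolanoMark server": (5, 10),
    "mtrt": (27, 0.5),
}
for name, (thread_mb, shared_mb) in allocated.items():
    print(f"{name}: {thread_share(thread_mb, shared_mb)}% thread-specific")
```

This gives roughly 3% for the VolanoMark client, 33% for the server, and 98% for mtrt, which matches the observation that VolanoMark shares most of its data while mtrt does not.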

Future Work

  • Variations on how to collect the heaps

  • Heaps for thread groups or groups of threads

  • Allowing non-followed pointers from shared objects to thread-specific objects

  • Allowing thread-specific objects in shared containers using programmer annotations

Multi-layer Heap Division

[Figure: six heaps labeled Heap A through Heap F]

Partially ordered rather than per-thread heaps

Completely ordered heaps

  • If very fine-grained, then we are approaching Tofte & Talpin’s “Stack of Regions” approach
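One way to read a partial order over heaps is as a generalization of the two-level rule, where thread-specific data may point to shared data but not the reverse. The sketch below is an assumption-heavy illustration: the direction of the order, the sample ordering of heaps A through F, and the names `heaps_above` and `pointer_allowed` are all hypothetical and are not taken from the slide's figure.

```python
def heaps_above(order, start):
    """order: dict heap -> set of heaps directly above it.
    Returns `start` plus every heap reachable by going upward."""
    seen, stack = {start}, [start]
    while stack:
        for up in order.get(stack.pop(), ()):
            if up not in seen:
                seen.add(up)
                stack.append(up)
    return seen

def pointer_allowed(order, src_heap, dst_heap):
    # Assumed rule: a pointer may stay within a heap or go upward, never downward or sideways.
    return dst_heap in heaps_above(order, src_heap)

# Hypothetical ordering: A at the top, D, E, and F at the bottom.
order = {"B": {"A"}, "C": {"A"}, "D": {"B"}, "E": {"B", "C"}, "F": {"C"}}
print(pointer_allowed(order, "E", "A"))
print(pointer_allowed(order, "A", "E"))
print(pointer_allowed(order, "D", "F"))
```

A completely ordered set of heaps is the special case where every heap has exactly one heap directly above it, which is where the comparison with a stack of regions comes from.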

Other Heap Divisions

User-defined divisions checked by compiler

  • FX with regions

Divisions according to major data structures

  • Example: a compiler could use different heap for program representation and analysis results

  • Permits customizing the collector to the nature of the data structure

  • The IBM folks are experimenting with “memory contexts”

Related Work

  • Andy King & Richard Jones, University of Kent

    • Static division into thread-specific heaps

  • Pat Caudill & Allen Wirfs-Brock, Instantiations, Inc. (makers of Jove)

    • Dynamic division into thread-specific heaps

    • Use write-barrier and copy-on-GC to deal with objects that are really shared among threads