
Multiprocessor Cache Consistency

(or, what does volatile mean?) Andrew Whitaker, CSE451


Presentation Transcript


  1. Multiprocessor Cache Consistency (or, what does volatile mean?) Andrew Whitaker CSE451

  2. What Does This Program Print?

       public class VisiblityExample extends Thread {
           private static int x = 1;
           private static int y = 1;
           private static boolean ready = false;

           public static void main(String[] args) {
               Thread t = new VisiblityExample();
               t.start();
               x = 2;
               y = 2;
               ready = true;
           }

           public void run() {
               while (!ready)
                   Thread.yield(); // give up the processor
               System.out.println("x= " + x + " y= " + y);
           }
       }

  3. Answer • It’s a race condition. Many different outputs are possible: • x=2, y=2 • x=1,y=2 • x=2,y=1 • x=1,y=1 • Or, the program may print nothing! • The ready loop runs forever

  4. What’s Going on Here? • Processor caches ($) can get out-of-sync [Diagram: four CPUs, each with its own cache ($), all connected to one shared Memory]

  5. A Mental Model • Every thread/processor has its own copy of every variable • Yikes!

       // Not real code; for illustration purposes only
       public class Example extends Thread {
           private static final int NUM_PROCESSORS = 4;
           private static int[] x = new int[NUM_PROCESSORS];
           private static int[] y = new int[NUM_PROCESSORS];
           private static boolean[] ready = new boolean[NUM_PROCESSORS];
           // …

  6. Two Issues • Cache coherence • Do caches eventually converge on the same state? • All modern caches are coherent • Cache consistency • When are operations by one processor visible on other processors? • Sometimes called “publication” • How much re-ordering is possible across processors?

  7. Subjective View of Cache Consistency Strategies [Figure: a spectrum from Strict to Relaxed along the axis “Amount of reordering”; label: “Fast and scalable”]

  8. Factors Pushing Towards Relaxed Consistency Models • Hardware perspective: consistency operations are expensive • Writing processor must invalidate all other processors • Reading processor must re-validate its cached state • Compiler perspective: optimizations frequently re-arrange memory operations to hide latency • These are guaranteed to be transparent, but only on a single processor

  9. Caches 101 • Caches store blocks of main memory • Blocks are fairly small (perhaps 64 bytes) • Each cache block exists in one of three states • Invalid, shared, exclusive • Memory operations cause the cache block to change states • CPUs must communicate to implement cache block state changes

  10. Cache Block State During a Coherence Operation [Diagram: cache block states Invalid, Shared (read-only), and Exclusive (read-write), shown for the reading processors and the writing processor]
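
The three block states and the transitions between them can be sketched as a toy, single-threaded model. This is only an illustration under simplifying assumptions (one block, no data values, no bus timing); the class and method names (`CacheBlockModel`, `read`, `write`, `stateOf`) are hypothetical and do not come from the slides or from any real hardware interface.

```java
import java.util.Arrays;

// Toy model of ONE cache block as seen by several CPUs (hypothetical names).
class CacheBlockModel {
    enum State { INVALID, SHARED, EXCLUSIVE }

    private final State[] states;

    CacheBlockModel(int numCpus) {
        states = new State[numCpus];
        Arrays.fill(states, State.INVALID);
    }

    // A read needs at least a shared (read-only) copy.
    void read(int cpu) {
        if (states[cpu] != State.INVALID) {
            return; // cache hit: no communication needed
        }
        // An exclusive holder must downgrade so the block can be shared.
        for (int i = 0; i < states.length; i++) {
            if (states[i] == State.EXCLUSIVE) {
                states[i] = State.SHARED;
            }
        }
        states[cpu] = State.SHARED;
    }

    // A write needs the exclusive (read-write) copy; every other copy is invalidated.
    void write(int cpu) {
        for (int i = 0; i < states.length; i++) {
            states[i] = (i == cpu) ? State.EXCLUSIVE : State.INVALID;
        }
    }

    State stateOf(int cpu) {
        return states[cpu];
    }

    public static void main(String[] args) {
        CacheBlockModel block = new CacheBlockModel(3);
        block.read(0);
        block.read(1);
        System.out.println(Arrays.toString(block.states)); // [SHARED, SHARED, INVALID]
        block.write(2);
        System.out.println(Arrays.toString(block.states)); // [INVALID, INVALID, EXCLUSIVE]
        block.read(0);
        System.out.println(Arrays.toString(block.states)); // [SHARED, INVALID, SHARED]
    }
}
```

Every state change that touches another CPU's entry stands for a message between processors, which is why these operations are costly.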

  11. Some Terminology • Publication: A CPU announces its updates to some or all of cache memory • Fetch: A CPU loads the latest values for previously published updates

  12. Hardware Support: Memory Fences (Barriers) • No memory operation can be moved across a fence • No operation after the fence appears before the fence • No operation before the fence appears after the fence • Several variants: • Write fences (for publication) • Read fences (for fetch) • Read/write (total) fences
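
Java code rarely issues fences directly, but since Java 9 the atomic classes offer release and acquire access modes that behave roughly like a write fence on the publishing side and a read fence on the fetching side. The sketch below is an assumption about how the slide's idea maps onto Java, not something the slides show; `FenceSketch`, `handOff`, and `payload` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class FenceSketch {
    private int payload;                                       // plain, non-volatile field
    private final AtomicBoolean ready = new AtomicBoolean(false);

    // Hands a value from a writer thread to the calling thread.
    int handOff() {
        Thread writer = new Thread(() -> {
            payload = 42;
            // Release store: the payload write above cannot be moved below it (publication).
            ready.setRelease(true);
        });
        writer.start();

        // Acquire load: the payload read below cannot be moved above it (fetch).
        while (!ready.getAcquire()) {
            Thread.onSpinWait();
        }
        int seen = payload;

        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(new FenceSketch().handOff()); // 42
    }
}
```

Explicit fences also exist as static methods on `java.lang.invoke.VarHandle` (`releaseFence()`, `acquireFence()`, `fullFence()`), which correspond more directly to the three variants listed above.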

  13. Sequential Consistency • All writes are immediately published • All reads fetch the latest value • All processors agree on order of memory accesses • Every operation is a fence • Behaves like shuffling cards

  14. Sequential Consistency Example
       Processor 1:  A. x = 2;  B. y = 3;
       Processor 2:  C. x = 4;  D. y = 5;
       A subset of legal orderings:
         A. x = 2;  B. y = 3;  C. x = 4;  D. y = 5;
         C. x = 4;  D. y = 5;  A. x = 2;  B. y = 3;
         C. x = 4;  A. x = 2;  D. y = 5;  B. y = 3;
         A. x = 2;  C. x = 4;  D. y = 5;  B. y = 3;
       A always appears before B; C always appears before D
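
The full set of legal orderings can be enumerated mechanically: it is every merge of the two instruction streams that keeps each processor's own order, much like riffle-shuffling two stacks of cards. The following is a minimal sketch; the class `Interleavings` and its methods are hypothetical helpers, and each operation is represented only by its letter.

```java
import java.util.ArrayList;
import java.util.List;

class Interleavings {
    // Returns every merge of p1 and p2 that preserves each processor's program order.
    static List<String> merge(String p1, String p2) {
        List<String> out = new ArrayList<>();
        build(p1, p2, "", out);
        return out;
    }

    private static void build(String p1, String p2, String prefix, List<String> out) {
        if (p1.isEmpty() && p2.isEmpty()) {
            out.add(prefix);
            return;
        }
        if (!p1.isEmpty()) { // next operation comes from processor 1
            build(p1.substring(1), p2, prefix + p1.charAt(0), out);
        }
        if (!p2.isEmpty()) { // next operation comes from processor 2
            build(p1, p2.substring(1), prefix + p2.charAt(0), out);
        }
    }

    public static void main(String[] args) {
        // Processor 1 runs A then B; processor 2 runs C then D.
        System.out.println(merge("AB", "CD"));
        // [ABCD, ACBD, ACDB, CABD, CADB, CDAB]
    }
}
```

There are six legal orderings in total; the four listed on the slide are among them, and none of the six places B before A or D before C.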

  15. The Cost of Sequential Consistency • Every write requires a complete cache invalidation • Writing processor acquires exclusive access • Writing processor sends an invalidation message • Writing processor receives acknowledgements from all processors • Expensive!

  16. Relaxed Consistency Models • Updates are published lazily • Therefore, updates may appear out-of-order • Challenge: Exposing a programming model that a human can understand

  17. Release Consistency • Observation: concurrent programs usually use proper synchronization • “All shared, mutable state must be properly synchronized” • It suffices to sync-up memory during synchronized operations • Big performance win: the number of cache coherency operations scales with synchronization, not the number of loads and stores

  18. Simple Example • Within the critical section, updates can be re-ordered • Without publication, updates may never be visible

       synchronized (this) {   // entry: fetch current values
           x++;
           y++;
       }                       // exit: publish new values
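
To see the fetch-on-entry and publish-on-exit behavior in a complete program, the critical section can be wrapped in a small class and driven from two threads. This is a sketch under assumptions: `SyncCounter`, `bump`, `snapshot`, and `run` are hypothetical names, not part of the original slides.

```java
class SyncCounter {
    private int x = 0;
    private int y = 0;

    void bump() {
        synchronized (this) {   // entry: fetch current values
            x++;
            y++;
        }                       // exit: publish new values
    }

    // Reads under the same lock, so it observes the published values.
    synchronized int[] snapshot() {
        return new int[] {x, y};
    }

    // Runs two threads that each call bump() perThread times.
    static int[] run(int perThread) {
        SyncCounter c = new SyncCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                c.bump();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return c.snapshot();
    }

    public static void main(String[] args) {
        int[] r = run(100_000);
        System.out.println(r[0] + " " + r[1]); // 200000 200000
    }
}
```

Removing the `synchronized` block would make both lost updates and stale reads possible, so the totals would no longer be guaranteed.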

  19. Java Volatile Variables • Java synchronized does double-duty • It provides mutual exclusion, atomicity • It ensures safe publication of updates • Sometimes, we don’t want to pay the cost of mutual exclusion • Volatile variables provide safe publication without mutual exclusion

       volatile int x = 7;

  20. More on Volatile • Updates to volatile fields are propagated immediately • “Don’t cache me!” • Effectively, this activates sequential consistency • Volatile serves as a fence to the compiler and hardware • Memory operations are not re-ordered around a volatile
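
Applied to the program from slide 2, marking only the `ready` flag as volatile is enough to make the output deterministic: the writes to `x` and `y` come before the volatile write, so a thread that sees `ready == true` must also see both of them. The version below is a sketch restructured for testing; the class name `VisibilityFixed` and the `runOnce` helper are hypothetical.

```java
class VisibilityFixed {
    private static int x = 1;
    private static int y = 1;
    private static volatile boolean ready = false; // the only change that matters

    // Runs the experiment once and returns what the reader thread observed.
    static String runOnce() {
        x = 1;
        y = 1;
        ready = false;
        final String[] seen = new String[1];

        Thread reader = new Thread(() -> {
            while (!ready) {
                Thread.yield(); // give up the processor
            }
            seen[0] = "x=" + x + " y=" + y;
        });
        reader.start();

        x = 2;
        y = 2;
        ready = true; // volatile write publishes x and y as well

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // x=2 y=2
    }
}
```

The loop is also guaranteed to terminate, because a volatile read cannot be satisfied forever from a stale cached value.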

  21. Rule #1, Revised • All shared, mutable state must be properly synchronized • With a synchronized statement, an Atomic variable, or with volatile

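  22. Example: Lazy Initialization • Need synchronization to ensure publication

       import java.util.LinkedList;
       import java.util.List;

       class Example {
           static List list = null;

           // Unsynchronized as written: the new list may not be safely published
           public static List getList() {
               if (list == null) {
                   list = new LinkedList();
               }
               return list;
           }
       }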
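
One way to supply the missing synchronization is to make the accessor `synchronized`, so the check, the initialization, and the publication of the new list all happen under one lock. This is a sketch of one possible fix rather than the slide's own answer; the class name `SafeLazyExample` and the `List<Object>` element type are assumptions.

```java
import java.util.LinkedList;
import java.util.List;

class SafeLazyExample {
    private static List<Object> list = null;

    // The lock gives mutual exclusion (one initialization) and safe publication.
    static synchronized List<Object> getList() {
        if (list == null) {
            list = new LinkedList<>();
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(getList() == getList()); // true
    }
}
```

Every caller now receives the same, fully constructed list, at the cost of taking the lock on each call.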
