Thread-Level Speculation as a Memory Consistency Protocol for Software DSM?

Marcelo Cintra, University of Edinburgh, http://www.dcs.ed.ac.uk/home/mc

Presentation Transcript


  1. Thread-Level Speculation as a Memory Consistency Protocol for Software DSM? University of Edinburgh http://www.dcs.ed.ac.uk/home/mc Marcelo Cintra

  2. Thread-Level Speculation (TLS)
     • Speculatively run whole “threads” and backtrack if necessary
     • Track data accesses to detect cross-thread “conflicting” memory accesses
     • Buffer state of speculative threads and commit when appropriate
     • Enforce some expected correct execution behavior
     Dagstuhl Seminar - October 2003

  3. Example 1: Speculative Parallelization
     • Original code: sequential, with dependences that cannot be decided statically
     • Squash on data flow dependences

       for(i=0; i<100; i++) {
         … = A[L[i]]+…
         A[K[i]] = …
       }

     Iteration J:    … = A[4]+…    A[5] = …
     Iteration J+1:  … = A[2]+…    A[2] = …
     Iteration J+2:  … = A[5]+…    A[5] = …
     RAW: iteration J+2 loads A[5], which iteration J stores
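
The flow dependence in this example can be found mechanically once the index arrays are known at run time. The following C fragment is a minimal sketch, not taken from the slides: `first_raw_dependence` is a hypothetical helper that scans a window of iterations and reports the first one whose load `A[L[j]]` touches an element stored by an earlier iteration through `A[K[i]]`. A real TLS run-time would squash only if the later iteration actually read the stale value before the earlier store committed; this sketch conservatively ignores timing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the slides): returns the index of the first
 * iteration j whose load A[L[j]] reads an element that some earlier
 * iteration i < j stores through A[K[i]], or -1 if the window contains no
 * cross-iteration flow (RAW) dependence. */
int first_raw_dependence(const int *L, const int *K, int n)
{
    assert(L != NULL && K != NULL && n >= 0);
    for (int j = 1; j < n; j++) {
        for (int i = 0; i < j; i++) {
            if (K[i] == L[j]) {
                return j;   /* iteration j would have to be squashed */
            }
        }
    }
    return -1;
}
```

With the three iterations shown on the slide (loads from A[4], A[2], A[5]; stores to A[5], A[2], A[5]) the helper reports the third iteration, since it loads A[5] after the first iteration stores it. The load and store of A[2] fall inside one iteration and do not count.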

  4. Example 2: Speculative Synchronization [Martinez and Torrellas, ASPLOS02]
     • Original code: parallel with locks and barriers
     • Squash on conflicting accesses

     Thread A:  acquire    … = A[4]+…    A[5] = …    release
     Thread B:  acquire    … = A[2]+…    A[2] = …    release
     Thread C:  acquire    … = A[5]+…    A[5] = …    release
     Conflicts marked in the diagram: RAW, WAW

  5. Example 2: Speculative Synchronization
     • Non-conflicting memory operations can perform out of order
     • Conflicting memory operations eventually complete in order after rollback
     • Relaxes the order of non-conflicting memory operations while still providing the release consistency (RC) abstraction
     • At release/commit all pending stores must complete
     TLS is used to enforce RC in a more “relaxed” way by means of speculation and rollback

  6. Outline
     • Background and motivation
     • A TLS-based protocol for software DSM
     • Summary
     • Related work
     • Conclusions

  7. LRC Consistency Protocol
     • Block on acquires and wait for lock
     • Obtain lock along with invalidations
     • On load page fault allocate local page and get diff update
     • On store page fault generate twin copy
     • On release compare twin and private copy to generate diff; send invalidations and lock to next thread in line
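
The twin and diff steps can be sketched in C as follows. This is an illustration under stated assumptions, not code from any particular DSM system: the page is modeled as a tiny array of `int` words, and `PAGE_WORDS`, `diff_entry`, `make_twin`, `make_diff`, and `apply_diff` are all hypothetical names. Page-fault handling, invalidations, and lock transfer are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_WORDS 8   /* hypothetical tiny page, for illustration only */

/* One diff entry: which word changed and its new value. */
typedef struct {
    size_t index;
    int value;
} diff_entry;

/* On the first store to a page: save an unmodified "twin" copy. */
void make_twin(int *twin, const int *page)
{
    assert(twin != NULL && page != NULL);
    memcpy(twin, page, PAGE_WORDS * sizeof(int));
}

/* On release: compare twin and working copy and record the words that
 * differ. Returns the number of entries written (at most PAGE_WORDS). */
size_t make_diff(const int *twin, const int *page, diff_entry *diff)
{
    size_t n = 0;
    assert(twin != NULL && page != NULL && diff != NULL);
    for (size_t i = 0; i < PAGE_WORDS; i++) {
        if (twin[i] != page[i]) {
            diff[n].index = i;
            diff[n].value = page[i];
            n++;
        }
    }
    return n;
}

/* On a load page fault elsewhere: bring a local copy up to date. */
void apply_diff(int *page, const diff_entry *diff, size_t n)
{
    assert(page != NULL && diff != NULL);
    for (size_t k = 0; k < n; k++) {
        assert(diff[k].index < PAGE_WORDS);
        page[diff[k].index] = diff[k].value;
    }
}
```

A thread that stores only to A[5] inside its critical section would produce a one-entry diff, which the next thread applies to its own copy of the page.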

  8. Example LRC Operation
     Thread A:  acquire    … = A[4]+…    …    A[5] = …    release
     Thread B:  acquire    … = A[2]+…    …    A[2] = …    release
     Thread C:  acquire    … = A[5]+…    …    A[5] = …    release
     Diagram annotations:
     • Generate diff
     • Obtain diff from Thread A

  9. TLS-based Consistency Protocol
     • On load or store miss allocate local page and twin copy
     • Expand loads and stores to keep a record of the accesses to individual fields of shared objects
     • On commit
       • Wait for “diff” from non-speculative thread
       • Check for violations
       • Merge “diffs” and pass to next speculative thread in line
     • If violation detected
       • Incorporate received “diff” into twin copy and discard local copy
       • Discard own “diff”
       • Discard some private data (may require extra buffering)
       • Re-execute
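
The recovery path after a violation might look like the C sketch below. Everything here is an assumption made for illustration: `spec_copy`, `N_FIELDS`, and `tls_recover` are hypothetical names, the received “diff” is modeled as a value array plus a modified flag per field, and discarding private (non-shared) data and the re-execution itself are left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define N_FIELDS 8   /* hypothetical object size, for illustration only */

/* Hypothetical per-thread speculative state for one shared object. */
typedef struct {
    int  local[N_FIELDS];    /* working copy the thread loads and stores */
    int  twin[N_FIELDS];     /* last known committed contents */
    bool touched[N_FIELDS];  /* the thread's own access record ("diff") */
} spec_copy;

/* Recovery after a violation: fold the committed values received from the
 * non-speculative thread into the twin, replace the local copy with the
 * updated twin, and clear the thread's own access record so that the
 * critical section can be re-executed from a clean state. */
void tls_recover(spec_copy *c, const int *recv_value, const bool *recv_modified)
{
    assert(c != NULL && recv_value != NULL && recv_modified != NULL);
    for (size_t i = 0; i < N_FIELDS; i++) {
        if (recv_modified[i]) {
            c->twin[i] = recv_value[i];
        }
    }
    memcpy(c->local, c->twin, sizeof c->local);   /* discard local copy */
    memset(c->touched, 0, sizeof c->touched);     /* discard own "diff" */
}
```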

  10. TLS “diff” and Violations
      • 3 possible states for each field of shared object:
        • NotAccessed: thread did not touch this field
        • Loaded: thread loaded this field but did not store to it
        • Modified: thread stored to this field and possibly loaded it
      • Violation and merging of “diff”s (rows: state in the speculative thread; columns: state in the non-speculative thread):

        Speculative    Non-spec:
                       NotAccessed    Loaded         Modified
        NotAccessed    NotAccessed    NotAccessed    Modified
        Loaded         NotAccessed    NotAccessed    Violation
        Modified       Modified       Modified       Violation
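
The check-and-merge rule can be written as a small C function. This is a sketch under one reading of the slide: a violation is raised whenever the non-speculative thread Modified a field that the speculative thread Loaded or Modified, and otherwise the merged “diff” records only whether either thread Modified the field. The names `access_state`, `tls_merge_field`, and `tls_check_and_merge` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { NOT_ACCESSED, LOADED, MODIFIED } access_state;

/* Merge the states of one field. Returns false on a violation, in which
 * case *merged is left untouched. */
bool tls_merge_field(access_state spec, access_state nonspec,
                     access_state *merged)
{
    assert(merged != NULL);
    if (nonspec == MODIFIED && spec != NOT_ACCESSED) {
        return false;   /* speculative thread may have used a stale value */
    }
    *merged = (spec == MODIFIED || nonspec == MODIFIED) ? MODIFIED
                                                        : NOT_ACCESSED;
    return true;
}

/* Check a whole "diff" of n fields and, if there is no violation, merge
 * the non-speculative diff into spec[] in place. On a violation spec[] is
 * left unchanged and false is returned. */
bool tls_check_and_merge(access_state *spec, const access_state *nonspec,
                         size_t n)
{
    assert(spec != NULL && nonspec != NULL);
    for (size_t i = 0; i < n; i++) {            /* pass 1: detect */
        access_state scratch;
        if (!tls_merge_field(spec[i], nonspec[i], &scratch)) {
            return false;
        }
    }
    for (size_t i = 0; i < n; i++) {            /* pass 2: merge */
        tls_merge_field(spec[i], nonspec[i], &spec[i]);
    }
    return true;
}
```

Checking before merging keeps the speculative thread's own record intact when a violation is found, so the caller can decide how to discard it.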

  11. Example TLS DSM Operation
      Thread A:  TLS_start    … = A[4]+… (TLS_load)    …    A[5] = … (TLS_store)    TLS_end
      Thread B:  TLS_start    … = A[2]+… (TLS_load)    …    A[2] = … (TLS_store)    TLS_end
      Thread C:  TLS_start    … = A[5]+… (TLS_load)    …    A[5] = … (TLS_store)    TLS_end
      Diagram annotations:
      • No need to update “diff”
      • Update “diff” to have A[2] as Loaded
      • Get page with stale data
      • Update “diff” to have A[5] as Modified
      • Thread B at TLS_end: Wait for non-spec (A) to finish. Obtain “diff” from A. Compare “diff” with own “diff”. No violations, so become non-spec. Merge “diffs”
      • Thread C at TLS_end: Wait for non-spec (B) to finish. Obtain “diff” from B. Compare “diff” with own “diff”. Violation detected.

  12. Example Implementation
      • TLS_load:
          if (SA[i] == NotAccessed) SA[i] = Loaded;
      • TLS_store:
          SA[i] = Modified;
      • TLS_start:
        • Try to acquire lock with a non-blocking operation
        • If successful then become non-speculative
        • Otherwise get a place in line for the lock, and execute speculatively

  13. Example Implementation
      • TLS_end:
        • If non-speculative then “pass” lock to next thread in line; next thread becomes non-speculative
        • Else, if next thread waiting for lock then
          • Wait for non-speculative to finish
          • Get “diff” from non-speculative thread
          • Check for violations
          • Merge “diff”s
          • “Pass” lock to next thread in line
        • Else, wait for lock
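
One possible way to realize the lock hand-off in TLS_start and TLS_end is a ticket counter, sketched below with C11 atomics. This swaps in a ticket lock for the slides' unspecified lock and queue, so it is an assumption rather than the author's design; `tls_lock`, `tls_thread`, and the function names are hypothetical. Taking a ticket combines the non-blocking acquire attempt and getting a place in line. The “diff” exchange and the wait at TLS_end are not shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ticket-style lock: the thread whose ticket equals
 * now_serving is non-speculative; all later tickets are speculative. */
typedef struct {
    atomic_uint next_ticket;
    atomic_uint now_serving;
} tls_lock;

typedef struct {
    unsigned ticket;   /* this thread's place in line */
} tls_thread;

void tls_lock_init(tls_lock *lk)
{
    assert(lk != NULL);
    atomic_init(&lk->next_ticket, 0u);
    atomic_init(&lk->now_serving, 0u);
}

/* True if thread t currently holds the lock (is non-speculative). */
bool tls_is_non_speculative(tls_lock *lk, const tls_thread *t)
{
    assert(lk != NULL && t != NULL);
    return t->ticket == atomic_load(&lk->now_serving);
}

/* TLS_start: take a place in line without blocking. Returns true if the
 * caller got the lock immediately, false if it must run speculatively. */
bool tls_start(tls_lock *lk, tls_thread *t)
{
    assert(lk != NULL && t != NULL);
    t->ticket = atomic_fetch_add(&lk->next_ticket, 1u);
    return tls_is_non_speculative(lk, t);
}

/* TLS_end for the non-speculative thread: pass the lock to the next
 * thread in line, which thereby becomes non-speculative. */
void tls_pass_lock(tls_lock *lk, const tls_thread *t)
{
    assert(tls_is_non_speculative(lk, t));
    atomic_fetch_add(&lk->now_serving, 1u);
}
```

A speculative thread reaching TLS_end would wait until `tls_is_non_speculative` holds, obtain and check the predecessor's “diff”, and then call `tls_pass_lock` itself.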

  14. Outline
      • Background and motivation
      • A TLS-based protocol for software DSM
      • Summary
      • Related work
      • Conclusions

  15. Will It Work?
      • Overheads
        • Augmented loads and stores
          • Both speculative parallelization and optimistic concurrency control in software have been done successfully
          • Compiler instrumentation for write trapping in DSM is not so bad [Adve et al., HPCA96]
        • Serialization of commits
      • Implementation
        • Hopefully not much more complex than a software DSM
        • Use source code augmentation and user help
      • Applications
        • Irregular applications with little overlap of modifications in critical sections
        • Easy to switch back to normal DSM operation

  16. Outline
      • Background and motivation
      • A TLS-based protocol for software DSM
      • Summary
      • Related work
      • Conclusions

  17. Related Work
      Speculative Synchronization:
      • Martinez and Torrellas (ASPLOS 2002); Rajwar and Goodman (MICRO 2001)
        • Hardware-based
      Optimistic Concurrency Control and Software Transactional Memory:
      • Herlihy (ACM TDBS 1990); Kung and Robinson (ACM TDBS 1981)
        • Source-code level speculation for transaction processing
      • Shavit and Touitou (PODC 1995); Herlihy et al. (PODC 2003)
        • Run-time system speculation on top of hardware coherent systems

  18. Related Work
      Speculation and consistency models:
      • Gniady, Falsafi, and Vijaykumar (ISCA 1999)
        • SC plus speculation in hardware
        • Speculation only within instruction window and ld/st queue

  19. Related Work
      Software Speculative Parallelization:
      • Dang, Yu, and Rauchwerger (IPDPS 2002); Rundberg and Stenström (WSSMM 2000); Cintra and Llanos (PPoPP 2003)
        • Speculative parallelization at source-code level
      • Papadimitriou and Mowry (CMU-CS-01-145)
        • Speculative parallelization on software DSM protocol

  20. Related Work
      Software DSM systems:
      • TreadMarks: Amza et al. (IEEE Computer 1996)
        • Lazy RC (LRC)
      • Midway: Bershad, Zekauskas, and Sawdon (CompCon 1993)
        • Entry Consistency (EC)
      • Adve et al. (HPCA 1996)
        • Compared LRC versus EC
        • Compared twinning versus compiler instrumentation for write trapping

  21. Outline
      • Background and motivation
      • A TLS-based protocol for software DSM
      • Summary
      • Related work
      • Conclusions

  22. Conclusions and Future Work
      • TLS can provide RC with more relaxed synchronization
      • Hardware speculative synchronization and software speculative parallelization have been successful
      • Must find applications
      • Must perform detailed performance evaluation
      • ?

  23. Thread-Level Speculation as a Memory Consistency Protocol for Software DSM? University of Edinburgh http://www.dcs.ed.ac.uk/home/mc Marcelo Cintra
