1 / 28

Efficient Data-Race Detection

Efficient Data-Race Detection. Recap. Happens-before based data-race detection Shaz’s guest lecture Vector clock algorithm Lock-set based data-race detection Eraser paper discussion. New Definition of a Data Race. Two instruction conflict if They access the same memory location

Download Presentation

Efficient Data-Race Detection

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Efficient Data-Race Detection Efficient Data-Race Detection

  2. Efficient Data-Race Detection Recap • Happens-before based data-race detection • Shaz’s guest lecture • Vector clock algorithm • Lock-set based data-race detection • Eraser paper discussion

  3. Efficient Data-Race Detection New Definition of a Data Race • Two instruction conflict if • They access the same memory location • At least one of them is a write • A program contains a data race if • One can schedule the program on a multiprocessor • Such that conflicting instructions execute simultaneously

  4. Efficient Data-Race Detection New Definition vs Happens-Before X = 1 Lock L X = 2 Unlock L Lock L X = 3 Unlock L

  5. Efficient Data-Race Detection New Definition vs Happens-Before Lock L X = 1 X = 2 Unlock L Lock L X = 3 Unlock L

  6. Efficient Data-Race Detection New Definition vs Happens-Before • True or False ? • If a program has a data race as per the new definition • Then it has a data race as per the happens-before definition

  7. Efficient Data-Race Detection New Definition vs Happens-Before • True or False ? • If a program has a data race as per the new definition • Then it has a happens-before data race • If a program has a happens-before data race • Then it has a data race as per the new definition

  8. Efficient Data-Race Detection New Definition vs Lock Set • True or False ? • If a program has a data race as per the new definition • Then it has a lock-set based data race • If a program has a lock-set based data race • Then it has a data race as per the new definition

  9. Efficient Data-Race Detection Challenges For Dynamic Data Race Detection • Dynamically monitor memory accesses • Need instrumentation infrastructure • Performance overhead

  10. Efficient Data-Race Detection Challenges For Dynamic Data Race Detection • Dynamically monitor memory accesses • Need instrumentation infrastructure • Performance overhead • Understand synchronization mechanisms • Handle home-grown locks • Need to annotate the happens-before relationship

  11. Efficient Data-Race Detection Challenges For Dynamic Data Race Detection • Dynamically monitor memory accesses • Need instrumentation infrastructure • Performance overhead • Understand synchronization mechanisms • Handle home-grown locks • Need to annotate the happens-before relationship • Maintain detection meta-data per variable, per synchronization object • Manage memory • Performance overhead

  12. Efficient Data-Race Detection DataCollider [OSDI ‘10] • Lightweight • < 5% overhead • Effective • Found data races in all applications we have run • Easy to implement • The algorithm can be described in ~ 10 lines

  13. Efficient Data-Race Detection Design Choice #1 • Sample memory accesses • Detect data races on a small, random sample of memory accesses

  14. Efficient Data-Race Detection Design Choice #1 • Sample memory accesses • Detect data races on a small, random sample of memory accesses • It is acceptable to miss races • Dynamic techniques cannot find all data races anyway • A lightweight tool will have more coverage • Likely to be run on a lot more tests • But data races are ‘rare’ events • Need to be careful when sampling

  15. Efficient Data-Race Detection Design Choice #2 • Perturb thread schedules • By inserting delays • Force the data race to happen, rather than infer from past memory/synchronization accesses

  16. Efficient Data-Race Detection Design Choice #2 • Perturb thread schedules X = 1 Lock L X = 2 Unlock L Lock L X = 3 Unlock L

  17. Efficient Data-Race Detection Design Choice #2 • Perturb thread schedules X = 1 Insert Delay Lock L X = 1 X = 2 Unlock L Lock L X = 3 Unlock L

  18. Efficient Data-Race Detection The Algorithm At_Every_Memory_Access ( x ) { if (not sampled) return; delay for some time; if another thread performed a conflicting operation during the delay report data race; }

  19. Efficient Data-Race Detection Use Data Breakpoints for Detecting Conflicting Operations At_Every_Memory_Access ( x ) { if (not sampled) return; rw = accessIsWrite ? RW : W; SetDataBreakpoint ( x, rw); Delay(); ClearDataBreakpoint ( x ); if (breakpoint fired) ReportDataRace(); }

  20. Efficient Data-Race Detection Sampling Memory Accesses • Need to execute the sampling function at every access • Performance overhead • Need a good sampling algorithm • Data races are ‘rare’ events

  21. Efficient Data-Race Detection Dynamic Sampling • Toss a biased coin at each memory access • Sample if coin turns up heads (with small probability)

  22. Efficient Data-Race Detection Dynamic Sampling • Toss a biased coin at each memory access • Sample if coin turns up heads (with small probability) • Hot (frequent) instructions are sampled overwhelmingly more that cold instructions • Bugs (data races) are likely to be present on cold instructions

  23. Efficient Data-Race Detection Code Breakpoint Sampling • Pick a random instruction X • Set a code breakpoint at X • Overwrite the first byte of X with a ‘0xcc’ • Sample X when the breakpoint fires and goto Step 1

  24. Efficient Data-Race Detection Sampling Using Code Breakpoints • Samples instructions independent of their execution frequency • Hot and cold instructions are sampled uniformly

  25. Efficient Data-Race Detection Sampling Using Code Breakpoints • Samples instructions independent of their execution frequency • Hot and code instructions are sampled uniformly • Sampling distribution is determined by you (fair_coin_toss) • Sampling rate is determined by the program (unfair_coin_toss) repeat { t = fair_coin_toss(); while( t != unfair_coin_toss() ); print( t ); }

  26. Efficient Data-Race Detection Sampling Using Code Breakpoints • Samples instructions independent of their execution frequency • Hot and code instructions are sampled uniformly • Sampling distribution is determined by you (fair_coin_toss) • Sampling rate is determined by the program (unfair_coin_toss) Set a breakpoint at location X repeat { t = fair_coin_toss(); while( t != unfair_coin_toss() ); print( t ); } Run the program till it executes X Sample X

  27. Efficient Data-Race Detection The DataCollider Algorithm Init: SetRandomCodeBreakpoints( n ); AtCodeBreakpoint( x ) { SetDataBreakpoint ( x); Delay(); if (breakpoint fired) ReportDataRace(); ClearDataBreakPoint( x ); SetRandomCodeBreakpoints( 1 ); }

  28. Efficient Data-Race Detection DataCollider Summary • Tunable runtime overhead • Tune code breakpoint rate so that we get k samples per second • Finds lots of data races in practice • The fastest implementation • Someone hacked it up during the talk • Pruning intended data races is a problem • Sampling algorithm improvements?

More Related