
Distributed Concurrency Control, Lecture 4 (BHG, Chap. 4 + Comp. Surveys Article)



  1. Distributed Concurrency Control, Lecture 4 (BHG, Chap. 4 + Comp. Surveys Article) (c) Oded Shmueli 2004

  2. Motivation • Distributed usage. • Local autonomy. • Maintainability. • Allows for growth. • Reliability – a number of copies. • Components: • Reliable communication. • Local DBs – may be identical. • Problems: • Query processing. • Maintaining multiple copies. • Concurrency control and recovery. • Distributed data dictionary. (c) Oded Shmueli 2004

  3. Topics • Part I: Distributed 2PL • Part II: Distributed Deadlocks • Part III: Timestamp based Algorithms • Part IV: Optimistic Concurrency Control (c) Oded Shmueli 2004

  4. Part I: Distributed 2PL • Each item may have a number of copies. • Intuitively – behave as if there is a single copy. • Mechanisms: • Writers lock all copies. • Central copy. • Central locking site. • Majority locking. • A generalization. • Moving central copy – not covered. (c) Oded Shmueli 2004

  5. Writers lock all copies • Each copy may be locked individually. • Read[x]: lock some copy of x. • Write[x]: lock all copies of x. • Resulting executions are SR. • Problems: • Writers tend to deadlock. • Many messages. (c) Oded Shmueli 2004
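
As a rough illustration of the read-any / write-all rule above, here is a minimal sketch; the CopyLock class and the helper functions are assumptions made for illustration, not part of the lecture or of any real DBMS.

```python
# Minimal sketch of "writers lock all copies" (read-any / write-all).
# CopyLock, read_lock and write_lock are illustrative names only.
import threading

class CopyLock:
    """One copy of an item at one site (a reader/writer lock is simplified
    to a plain mutex here for brevity)."""
    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self):
        self._lock.acquire()

    def release(self):
        self._lock.release()

def read_lock(copies):
    # Read[x]: locking any single copy of x suffices.
    copies[0].acquire()
    return [copies[0]]

def write_lock(copies):
    # Write[x]: every copy must be locked (one message per copy), which is
    # why writers exchange many messages and tend to deadlock.
    for copy in copies:
        copy.acquire()
    return list(copies)
```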

  6. Central copy • A central copy per item. • Read[x]: read-lock the central copy. • Write[x]: write-lock the central copy. • Advantage: fewer messages. (c) Oded Shmueli 2004

  7. Central locking site • A single site maintains a global lock table. • Advantages: • Few messages. • Easy to check the WFG (waits-for graph) for deadlocks. • Disadvantage: the central site is a possible bottleneck. (c) Oded Shmueli 2004

  8. Majority locking • The previous solutions are vulnerable to site failure (any site in the first, the central site in the other two). • Read[x]: lock a majority of x’s copies. • Write[x]: lock a majority of x’s copies. • Thus, for all x, two transactions that conflict on x cannot both hold a majority – an effective lock. • Disadvantage: many messages; can trade time for number of messages using “forwarding”. (c) Oded Shmueli 2004

  9. A generalization • Suppose there are n copies of x. • Let k, l be s.t. k + l > n and l > n/2. • Read[x]: obtain locks on k out of the n copies. • Write[x]: obtain locks on l out of the n copies. • Then a reader and a writer of x, or two writers of x, cannot run concurrently – x is effectively locked. • Choosing l, k: • Many readers: small k. • Many writers: small l. (c) Oded Shmueli 2004
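
A quick numeric check of the quorum conditions above (a sketch; the function name is mine): k + l > n forces every read quorum to intersect every write quorum, and l > n/2 forces any two write quorums to intersect.

```python
# Check the quorum conditions from the slide: with n copies, read quorum k
# and write quorum l must satisfy k + l > n and l > n/2.
def quorums_valid(n, k, l):
    return k + l > n and 2 * l > n

# Examples with n = 5 copies:
assert quorums_valid(5, k=1, l=5)      # read-one / write-all
assert quorums_valid(5, k=3, l=3)      # majority locking on both sides
assert not quorums_valid(5, k=2, l=3)  # 2 + 3 = 5 is not > 5: a reader and a
                                       # writer could lock disjoint copies
```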

  10. Part II: Distributed Deadlocks • Left as reading material. (c) Oded Shmueli 2004

  11. Part III: Timestamp based Algorithms • A system model. • Assumptions. • Operations in a distributed environment. • Timestamp Ordering (TO). • Conservative Timestamp Ordering (CTO). • Transaction classes. (c) Oded Shmueli 2004

  12. A system model [Architecture diagram: transactions submit operations to TMs; each TM communicates with the DMs; each DM manages its local DATA.] (c) Oded Shmueli 2004

  13. Assumptions • No concurrency within a transaction. • Write into private workspaces at the various DMs. • Each transaction is managed by a single TM. • Each item x may have a number of physical copies x1, … , xn. (c) Oded Shmueli 2004

  14. Operations in a distributed environment • Begin: set up a private workspace. • Read[x]: If x is in the workspace, read it from there. Otherwise read x from some copy xi by issuing dm_read. • Write[x,v]: The single copy of x in the private workspace is assigned v. • END: perform a 2-phase commit: • For each updated x, for all copies of x: • Issue a pre-stable-write command to store x on stable storage. • Once all DMs confirm: issue dm-write commands to the DMs to install the new value in the database. (c) Oded Shmueli 2004
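
A minimal sketch of this workspace model, assuming a DM object exposing dm_read, pre_stable_write and dm_write methods that correspond to the slide's commands; the class and method names are illustrative, not a prescribed API.

```python
# Sketch of a transaction's private workspace and its END protocol.
# The DM interface (dm_read / pre_stable_write / dm_write) mirrors the
# slide's commands; everything else is an illustrative assumption.
class TransactionWorkspace:
    def __init__(self, dms_for_copy):
        self.workspace = {}               # Begin: private workspace, item -> value
        self.dms_for_copy = dms_for_copy  # item -> list of DMs holding a copy

    def read(self, item):
        # Read[x]: prefer the workspace, otherwise fetch some copy via dm_read.
        if item in self.workspace:
            return self.workspace[item]
        return self.dms_for_copy[item][0].dm_read(item)

    def write(self, item, value):
        # Write[x, v]: only the single private copy of x is assigned v.
        self.workspace[item] = value

    def end(self):
        # END: 2-phase commit over all DMs holding copies of updated items.
        dms = {dm for item in self.workspace for dm in self.dms_for_copy[item]}
        # Phase 1: every DM stores the new values on stable storage.
        if not all(dm.pre_stable_write(self.workspace) for dm in dms):
            return False
        # Phase 2: only after all DMs confirm, install the values.
        for dm in dms:
            dm.dm_write(self.workspace)
        return True
```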

  15. Timestamp Ordering (TO) - skip • Idea: produce executions that are conflict equivalent to a serial history in timestamp order. • Item = <S_read, S_write, stable, value> • S_read – set of readers’ timestamps of the item. • S_write – set of writers’ timestamps of the item. • stable – a flag indicating a committed value. (c) Oded Shmueli 2004

  16. Timestamp Ordering (TO) - skip • On accessing an item with stable = no: • wait → possible deadlock. • abort → may be wasteful. • DM_Read with ts: • if ts < max {t | t ∈ S_write} → abort. • otherwise, read and add ts to S_read. • DM_Write with ts: • if ts < max {t | t ∈ S_read} → abort. • if ts < max {t | t ∈ S_write} → ignore (TWR). • otherwise, set stable = no; write and add ts to S_write. • Commit: After all writes are performed, set stable = yes. • Abort: Remove ts from S_read and S_write. Make all items the transaction updated stable = yes. (c) Oded Shmueli 2004
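
A sketch of the per-item checks just described, following the <S_read, S_write, stable, value> layout; the Python class and the abort mechanism are assumptions for illustration.

```python
# Per-item timestamp checks from the slide, including the Thomas Write Rule.
class Aborted(Exception):
    pass

class Item:
    def __init__(self, value):
        self.s_read, self.s_write = set(), set()   # reader / writer timestamps
        self.stable, self.value = True, value      # stable = committed value

    def dm_read(self, ts):
        if self.s_write and ts < max(self.s_write):
            raise Aborted("a later write was already applied")
        self.s_read.add(ts)
        return self.value

    def dm_write(self, ts, value):
        if self.s_read and ts < max(self.s_read):
            raise Aborted("a later read was already performed")
        if self.s_write and ts < max(self.s_write):
            return                 # TWR: a later write exists, ignore this one
        self.stable = False        # uncommitted until Commit sets stable = yes
        self.value = value
        self.s_write.add(ts)
```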

  17. Another Timestamp Ordering Algorithm • Terminology • DM_r: read item. • DM_w: write item at transaction end. • p[x]: synchronization due to private write. • WTS(x), RTS(x): ts of the latest dm-write, dm-read. • Buffering: Delaying operations for future execution. • min_r(x): ts of the earliest buffered read op. • min_w(x), min_p(x): same idea. • DM_r[x] is ready if ts(DM_r[x]) < min_p(x) in the buffer, if any. • DM_w[x] is ready if ts(DM_w[x]) < min_r(x), if any, and min_p(x) = ts(DM_w[x]) in the buffer. (c) Oded Shmueli 2004

  18. Another Timestamp Ordering Algorithm • DM_r[x]: • if ts(r[x]) < WTS(x) → abort. • if ts(r[x]) > min_p(x) → put in buffer. • otherwise, perform and update RTS(x). • p[x]: • if ts(p[x]) < RTS(x) → abort. • if ts(p[x]) < WTS(x) → abort. • otherwise, put in buffer. • DM_w[x]: (note: a p[x] was previously executed, so no abort) • if ts(w[x]) > min_r(x) → put in buffer. • if ts(w[x]) > min_p(x) → put in buffer. • otherwise, perform, update WTS(x) and discard the buffered p[x]. • Occasionally check whether actions changed min_r(x) and min_p(x) → some buffered operation may now be ready. • Observations: • No deadlocks are possible (why?). • Upon abort, discard the private workspace and all the transaction’s operations. Need to update RTS(x). (c) Oded Shmueli 2004
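
A small sketch of the readiness tests behind the buffering decisions above, using the min_r(x) / min_p(x) definitions from the previous slide; the buffer representation is an assumption for illustration.

```python
# Readiness tests for buffered operations, per the slide's definitions.
# The buffer is modelled as a list of (kind, ts) pairs, kind in {"r", "w", "p"}.
def min_ts(buffer, kind):
    """Smallest timestamp among buffered operations of the given kind, or None."""
    tss = [ts for k, ts in buffer if k == kind]
    return min(tss) if tss else None

def read_is_ready(ts, buffer):
    # DM_r[x] is ready if its ts precedes every buffered p[x] (if any).
    mp = min_ts(buffer, "p")
    return mp is None or ts < mp

def write_is_ready(ts, buffer):
    # DM_w[x] is ready if its ts precedes every buffered DM_r[x] (if any)
    # and its own p[x] is the earliest buffered prewrite.
    mr = min_ts(buffer, "r")
    return (mr is None or ts < mr) and min_ts(buffer, "p") == ts
```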

  19. Conservative Timestamp Ordering (CTO) • To prevent aborts, do the following: • Perform an operation only when it is certain that a later operation will not be restarted due to a smaller ts. • No aborts and no deadlocks, but less concurrency. • But, how long to wait? • Solution – use CTO. • Operations: DM_r, DM_w. (c) Oded Shmueli 2004

  20. Conservative Timestamp Ordering Architecture [Architecture diagram: TM1 … TMn submit operations to DM1 … DMk; each DM holds a per-TM queue of operations in ts order.] (c) Oded Shmueli 2004

  21. Conservative Timestamp Ordering Algorithm • TMs must submit dm-read operations in ts order: if an operation with ts t is issued, one with ts s < t will not be issued in the future. • Similarly for dm-writes. • Achieve this by: • each TM working serially. • each transaction first reading all its data and then writing all results at the end (still in ts order, but allows execution parallelism). Transactions terminate in ts order. • Data items need no associated timestamps. (c) Oded Shmueli 2004

  22. CTO - Ready Operations • Maintain at each DM a queue for read and write ops. • Buffer DM_r and DM_w operations. • Output a DM_r operation if: • There is a DM_w operation from each TMi and all such operations have higher timestamps. • Output a DM_w operation if: • There is a DM_r operation from each TMi and all such operations have higher timestamps. • There is a DM_w operation from each TMi and all such operations have higher timestamps. • DM_w operations are never rejected! • Overall effect: a serial execution in timestamp order! (c) Oded Shmueli 2004
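
The two output conditions can be phrased as a small sketch over per-TM queues of (kind, ts) pairs; the queue representation and function names are assumptions for illustration.

```python
# CTO output tests at a DM that keeps one ts-ordered queue per TM.
# Each queue is a list of ("r" | "w", ts) pairs; names are illustrative.
def can_output_read(read_ts, queues):
    # Output DM_r only if every TM's queue holds a DM_w with a higher ts,
    # so no write with a smaller ts can still arrive.
    return all(any(k == "w" and ts > read_ts for k, ts in q)
               for q in queues.values())

def can_output_write(write_ts, queues):
    # Output DM_w only if every TM's queue holds both a DM_r and a DM_w with
    # higher ts; under this rule DM_w operations are never rejected.
    return all(any(k == "r" and ts > write_ts for k, ts in q) and
               any(k == "w" and ts > write_ts for k, ts in q)
               for q in queues.values())
```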

  23. Conservative Timestamp Ordering - Problems • Problem: What if a TM issues no operation to this queue? • Solution: null operations (they have a ts but are no-ops). • Can send ‘infinite timestamps’ to indicate expected long inactivity. (c) Oded Shmueli 2004

  24. Transaction classes • CTO synchronizes everything, which is overkill! • Transaction class = <readset, writeset> • If transactions are known in advance, each transaction can be assigned to one or more classes. • If T reads X and writes Y, T belongs to a class c = <rs, ws> if X ⊆ rs and Y ⊆ ws. • A TM manages a single class. • A transaction must belong to the class of the TM managing it. • Run CTO, where only ops of the relevant TMs are considered: • To output DM_r[x], wait until there are DM_w operations with higher ts from all TMs (classes) that have x in their write sets. • To output DM_w[x], wait until there are DM_r operations with higher ts from all TMs (classes) that have x in their read sets and DM_w operations with higher ts from all TMs (classes) that have x in their write sets. (c) Oded Shmueli 2004
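
A tiny sketch of the class-membership test (the function name and example item names are made up for illustration):

```python
# T with read set X and write set Y belongs to class c = <rs, ws>
# exactly when X ⊆ rs and Y ⊆ ws.
def belongs_to_class(read_set, write_set, cls):
    rs, ws = cls
    return read_set <= rs and write_set <= ws

# Hypothetical example class and transactions:
cls = ({"accounts", "rates"}, {"accounts", "audit_log"})
assert belongs_to_class({"accounts"}, {"audit_log"}, cls)
assert not belongs_to_class({"orders"}, {"audit_log"}, cls)
```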

  25. Part IV: Optimistic Concurrency Control • Can be based on locking or timestamps. • First, a centralized algorithm. • We show a timestamp-based algorithm (Kung-Robinson). • Then, adaptation to a distributed environment. (c) Oded Shmueli 2004

  26. Rules for Validation: Centralized • A transaction has read, validate and write phases. During the read phase it also computes and writes to a private space. • Executions will be serializable in timestamp order. • To ensure this, for all transactions Tk s.t. ts(Tk) < ts(T), one of the following should hold: • Tk completed its write phase prior to T starting its read phase. • Tk completed its write phase while T is in its read phase and write-set(Tk) ∩ read-set(T) = ∅. • Tk completed its read phase before T completes its read phase, write-set(Tk) ∩ read-set(T) = ∅ and write-set(Tk) ∩ write-set(T) = ∅. • A timestamp is assigned only once validation succeeds. Do it after the write phase. • Different validations can be executed in parallel. • So, each transaction T uses START(T) and FINISH(T) to determine the transactions against which it should be validated. (c) Oded Shmueli 2004
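
The three conditions can be collected into one validation predicate; this is a sketch, with the Txn record and field names invented for illustration.

```python
# Validation of T against an earlier transaction Tk (Kung-Robinson style),
# following the three rules on the slide. The Txn record is illustrative.
from dataclasses import dataclass

@dataclass
class Txn:
    read_start: int   # when the read phase started
    read_end: int     # when the read phase ended
    write_end: int    # when the write phase ended
    read_set: set
    write_set: set

def validates_against(t: Txn, tk: Txn) -> bool:
    # Rule 1: Tk finished its write phase before T started its read phase.
    if tk.write_end < t.read_start:
        return True
    # Rule 2: Tk finished its write phase during T's read phase and wrote
    # nothing that T read.
    if tk.write_end < t.read_end and not (tk.write_set & t.read_set):
        return True
    # Rule 3: Tk finished its read phase before T finished its read phase and
    # Tk's write set is disjoint from both T's read set and T's write set.
    if (tk.read_end < t.read_end
            and not (tk.write_set & t.read_set)
            and not (tk.write_set & t.write_set)):
        return True
    return False
```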

  27. Rules for T and Tk • Rule a: Tk completed its write phase prior to T starting its read phase; ts(Tk) < start(T). • Rule b: Tk completed its write phase while T is in its read phase and write-set(Tk) ∩ read-set(T) = ∅; start(T) < ts(Tk) < finish(T). • Rule c: Tk completed its read phase before T completes its read phase, write-set(Tk) ∩ read-set(T) = ∅ and write-set(Tk) ∩ write-set(T) = ∅; finish(Tk) < finish(T). [Each rule is illustrated on the slide by R/V/W (read/validate/write) phase timelines for Tk and T.] (c) Oded Shmueli 2004

  28. Distributed Setting • A transaction can execute at many sites. • Perform a validation phase at each site at which T operated. This is called local validation. • A local site may have purely local transactions as well as sub-transactions of global transactions. • If validation is successful at all sites, ensure global consistency: • Build HB(Tj) for each sub-transaction of T at site j. This is a set of ids of global transactions that must precede T. It is built during local validation. • Global validation is done by making sure that each transaction in the HB set is either committed or aborted. • Deadlocks are possible. • After the global validation phase, a timestamp can be issued. It will be the same one for all local sub-transactions of a global transaction. • Use 2-phase commit. Notify local sub-transactions. (c) Oded Shmueli 2004
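
A short sketch of the global-validation check over the HB sets (names and data shapes are assumptions for illustration):

```python
# Global validation: T may receive its timestamp only after every transaction
# in the union of its per-site HB sets has reached a final state.
def globally_validated(hb_sets, state_of):
    """hb_sets: iterable of sets of global transaction ids (one per site).
    state_of: maps a transaction id to 'committed', 'aborted' or 'active'."""
    must_precede = set().union(*hb_sets)
    return all(state_of(tid) in ("committed", "aborted") for tid in must_precede)
```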
