Exploring multiversion concurrency control in transactions for enhanced database performance. Discover benefits, costs, equivalence theories, and more.
Transactions Lecture 5: Multiversion Concurrency Control (Chapter 5, BHG) • More than one version per data item presents opportunities for better performance. (c) Oded Shmueli 2004
The basic idea • Each Write(x) produces a new version of x. • For Read(x), the DM has to decide which version to use. • Benefit: avoid rejecting Read operations that arrive “too late”. • Older versions may be useful for recovery. • Cost: storage, management complexity. • Versions are produced by active or committed transactions. • Users cannot “see” versions. • The DBS should behave as if there was one version per item.
Extending the theory • DM executions are represented by MV histories. • Users regard serial 1V histories as correct. • To prove a CC algorithm correct, we need to show that its MV histories are equivalent to serial 1V histories. • Denote the versions of x as xi, xj, etc. • The subscript indicates the writing transaction. • A write is of the form wi[xi]. • A read is of the form ri[xj] (j = i is possible).
Equivalence: first try • “Definition”: Hmv is equivalent to H1v if every pair of conflicting operations in Hmv is in the same order in H1v. • H1 = w0[x0] c0 w1[x1] c1 r2[x0] w2[y2] c2 • H2 = w0[x] c0 w1[x] c1 r2[x] w2[y] c2 • However, in H1, T2 reads x from T0, whereas in H2 it reads x from T1. • We therefore need view equivalence (same reads-from relationships and same final writes).
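The difference between H1 and H2 can be checked mechanically. The following is a minimal sketch, assuming histories are written as strings in the slide notation and contain no aborts; the helper names `parse`, `reads_from_mv` and `reads_from_1v` are hypothetical and not part of the lecture.

```python
import re

def parse(history):
    """Turn 'w0[x0] c0 r2[x]' into (op, txn, item, version) tuples."""
    found = re.findall(r"([rwc])(\d+)(?:\[([a-z])(\d*)\])?", history)
    return [(op, int(t), item, int(v) if v else None) for op, t, item, v in found]

def reads_from_mv(history):
    """MV history: the version subscript of a read names the writer directly."""
    return {(t, item, v) for op, t, item, v in parse(history) if op == "r"}

def reads_from_1v(history):
    """1V history: a read returns the value of the last preceding write (no aborts assumed)."""
    last_writer, result = {}, set()
    for op, t, item, _ in parse(history):
        if op == "w":
            last_writer[item] = t
        elif op == "r":
            result.add((t, item, last_writer[item]))
    return result

H1 = "w0[x0] c0 w1[x1] c1 r2[x0] w2[y2] c2"
H2 = "w0[x] c0 w1[x] c1 r2[x] w2[y] c2"
# Each triple is (reader, item, writer).
print(reads_from_mv(H1))  # {(2, 'x', 0)}
print(reads_from_1v(H2))  # {(2, 'x', 1)}
```

The two sets differ, which is why ordering conflicting operations alike is not enough and reads-from relationships have to be compared.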
Is an acyclic SG(H) sufficient? • H3 = w0[x0] w0[y0] c0 r1[x0] r1[y0] w1[x1] w1[y1] c1 r2[x0] r2[y1] c2 • SG(H3) is acyclic, yet H3 is not equivalent to a serial 1V history. • For example, consider the serial 1V histories H4 and H5: • H4 = w0[x] w0[y] c0 r1[x] r1[y] w1[x] w1[y] c1 r2[x] r2[y] c2 • In H4, T2 reads x and y from T1; in H3, T2 reads x from T0. • H5 = w0[x] w0[y] c0 r2[x] r2[y] c2 r1[x] r1[y] w1[x] w1[y] c1 • In H5, T2 reads x and y from T0; in H3, T2 reads y from T1. • [Figure: SG(H3) over nodes T0, T1, T2.]
An overview • A complete mv history H is serial if for all Ti, Tj in H either all operations of Ti precede all operations of Tj or vice versa. • A serial mv history H is 1-serial if for all i, j, x, if Ti reads x from Tj (ri[xj]) then either i = j or Tj is the last transaction preceding Ti to write any version of x. • H is one-copy serializable (1SR) if C(H) is equivalent to a 1-serial mv history. • Let H be an mv history over T. C(H) is equivalent to a serial 1V history over T iff H is 1SR. • An mv history H is 1SR iff there exists a version order << such that MVSG(H, <<) is acyclic.
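Complete mv history • [Figure: transactions T0 to T4 and a complete mv history H6 over them, drawn as partial orders; the ordering arrows did not survive extraction.] • T0: w0[x], w0[y], w0[z], c0 • T1: r1[x], w1[y], c1 • T2: r2[x], r2[z], w2[x], c2 • T3: r3[z], w3[y], w3[z], c3 • T4: r4[x], r4[y], r4[z], c4 • Operations of H6: w0[x0], w0[y0], w0[z0], c0; r1[x0], w1[y1], c1; r2[x0], r2[z0], w2[x2], c2; r3[z0], w3[y3], w3[z3], c3; r4[x2], r4[y3], r4[z3], c4.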
Another complete mv history • [Figure: H6 (as on the previous slide) and H7, drawn as partial orders.] • Operations of H7: w0[x0], w0[y0], w0[z0], c0; r1[x0], w1[y1], c1; r2[x0], r2[z0], w2[x2], c2; r3[z0], w3[y3], w3[z3], c3; r4[x2], r4[y1], r4[z3], c4. • H7 differs from H6 only in that T4 reads y1 instead of y3.
mv History: Definitions • T = {T0, …, Tn} is a set of transactions; the operations of each Ti are ordered by <i. • h maps wi[x] to wi[xi], ri[x] to ri[xj] for some j, ci to ci, and ai to ai. • A complete mv history H over T is a partial order with ordering relation < such that: 1. H = h(T0 ∪ … ∪ Tn) for some translation function h; 2. for each Ti and all operations pi, qi in Ti, if pi <i qi, then h(pi) < h(qi); 3. if h(rj[x]) = rj[xi], then wi[xi] < rj[xi]; 4. if wi[x] <i ri[x], then h(ri[x]) = ri[xi]; and 5. if h(rj[x]) = rj[xi], i ≠ j, and cj ∈ H, then ci < cj.
More Definitions • If H satisfies (4), it preserves reflexive reads-from relationships. • If H satisfies (5), it is recoverable. • An mv history H is a prefix of a complete mv history. • An mv history preserves reflexive reads-from relationships (or is recoverable) if it is the prefix of a complete mv history that does so. Isn’t this always true? • C(H) is as defined for 1V histories. • If H is an mv history then C(H) is a complete mv history.
Equivalence of mv histories • Again, we assume no transaction reads or writes the same item x twice. • Tj reads x from Ti in mv history H if Tj reads the version of x produced by Ti (that is, iff rj[xi] ∈ H). • Two mv histories are equivalent (≡) if they have the same operations and the same reads-from relationships. • But same operations ⇒ same reads-from relationships, because each read names the version it reads. • Proposition 1: two mv histories over the same set of transactions are equivalent iff they have the same operations.
Equivalence of an mv history to a 1V history • Intuitively, the 1V history is a valid one-version view of the mv history. • Formally: • Same set T = {T0, …, Tn} of transactions. Same <i. • Their operations are in 1-1 correspondence (a 1-1, onto function mapping ai to ai, ci to ci, wi[x] to wi[xi], and ri[x] to some ri[xj]). • Their reads-from relationships are the same if they are preserved under that function. • So, an mv history H and a 1V history are equivalent if they have the same reads-from relationships. • Note: we don’t worry about final writes, as all writes are in the state produced by the mv history.
Serialization graphs • Two operations conflict if they operate on the same version and one is a write; in an mv history such a pair has the form wi[xi] < rj[xi]. • Let H be an mv history. SG(H) has a node for each committed transaction in H and an edge Ti → Tj (i ≠ j) if, for some x, rj[xi] is an operation in C(H). • Proposition 2: Let H, H’ be mv histories. If H ≡ H’ then SG(H) = SG(H’).
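Because SG(H) depends only on which reads occur, it can be built in a few lines. The sketch below assumes the string notation of the slides and lists the operations of H6 in one possible order; `parse` and `serialization_graph` are hypothetical helper names.

```python
import re

def parse(history):
    """Turn 'w0[x0] c0 r2[x0]' into (op, txn, item, version) tuples."""
    found = re.findall(r"([rwc])(\d+)(?:\[([a-z])(\d*)\])?", history)
    return [(op, int(t), item, int(v) if v else None) for op, t, item, v in found]

def serialization_graph(history):
    """Edges (i, j) meaning Ti -> Tj: committed Tj read a version written by committed Ti."""
    ops = parse(history)
    committed = {t for op, t, _, _ in ops if op == "c"}
    return {(v, t) for op, t, _, v in ops
            if op == "r" and v != t and t in committed and v in committed}

# Operations of H6, written in one order compatible with the transactions.
H6 = ("w0[x0] w0[y0] w0[z0] c0 r1[x0] w1[y1] c1 r2[x0] r2[z0] w2[x2] c2 "
      "r3[z0] w3[y3] w3[z3] c3 r4[x2] r4[y3] r4[z3] c4")
print(sorted(serialization_graph(H6)))
# [(0, 1), (0, 2), (0, 3), (2, 4), (3, 4)]
```

Two histories with the same operations give the same edge set here, which is Proposition 2 in miniature.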
Serialization graphs • [Figure: H6 with its serialization graph SG(H6) over nodes T0, T1, T2, T3, T4.] • By the definition, the edges of SG(H6) are T0 → T1, T0 → T2, T0 → T3, T2 → T4, T3 → T4.
Another complete mv history • [Figure: H7 with its serialization graph SG(H7) over nodes T0, T1, T2, T3, T4.] • By the definition, the edges of SG(H7) are T0 → T1, T0 → T2, T0 → T3, T1 → T4, T2 → T4, T3 → T4.
One Copy Serializability • A complete mv history H is serial if for all Ti, Tj in H either all operations of Ti precede all operations of Tj or vice versa. • Recall that H3, although serial, behaves differently from a serial 1V history. • H3 = w0[x0] w0[y0] c0 r1[x0] r1[y0] w1[x1] w1[y1] c1 r2[x0] r2[y1] c2 • A serial mv history H is 1-serial if for all i, j, x, if Ti reads x from Tj (ri[xj]) then either i = j or Tj is the last transaction preceding Ti to write any version of x. • H3 is not 1-serial, as T2 does not read x from T1. • H8 is 1-serial: • H8 = w0[x0] w0[y0] w0[z0] c0 r1[x0] w1[y1] c1 r2[x0] r2[z0] w2[x2] c2 r3[z0] w3[y3] w3[z3] c3 r4[x2] r4[y3] r4[z3] c4
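The 1-serial condition amounts to a single pass over a serial history: every read must name the most recent writer of its item. A minimal sketch under that reading follows; the function names are hypothetical, and H3 is written with T1 reading x0 and y0, matching the r1[x] r1[y] of its 1V counterpart H4.

```python
import re

def parse(history):
    """Turn 'w0[x0] c0 r2[x0]' into (op, txn, item, version) tuples."""
    found = re.findall(r"([rwc])(\d+)(?:\[([a-z])(\d*)\])?", history)
    return [(op, int(t), item, int(v) if v else None) for op, t, item, v in found]

def is_serial(history):
    """Serial: the operations of each transaction form one contiguous block."""
    order = [t for _, t, _, _ in parse(history)]
    blocks = [t for i, t in enumerate(order) if i == 0 or order[i - 1] != t]
    return len(blocks) == len(set(blocks))

def is_one_serial(history):
    """1-serial: serial, and each read names the last preceding writer of its item."""
    if not is_serial(history):
        return False
    last_writer = {}
    for op, t, item, version in parse(history):
        if op == "w":
            last_writer[item] = t
        elif op == "r" and last_writer.get(item) != version:
            return False
    return True

H3 = "w0[x0] w0[y0] c0 r1[x0] r1[y0] w1[x1] w1[y1] c1 r2[x0] r2[y1] c2"
H8 = ("w0[x0] w0[y0] w0[z0] c0 r1[x0] w1[y1] c1 r2[x0] r2[z0] w2[x2] c2 "
      "r3[z0] w3[y3] w3[z3] c3 r4[x2] r4[y3] r4[z3] c4")
print(is_one_serial(H3), is_one_serial(H8))  # False True
```

H3 fails at r2[x0], because T1 wrote x after T0 and before T2.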
One Copy Serializability • An mv history H is one-copy serializable (1SR) if C(H) is equivalent to a 1-serial mv history. • 1SR is a prefix commit-closed property. • Reason: the committed projection of a 1SR history is equivalent to a 1-serial mv history (Exercise 5.4). • So, unlike view serializability, we need not require that the committed projection of every prefix of an MV history be 1SR.
One Copy Serializability • Again: an mv history H is one-copy serializable (1SR) if C(H) is equivalent to a 1-serial mv history. • H6 = C(H6) is equivalent to H8, so it is 1SR. • H7 = C(H7) is equivalent to no 1-serial history, so it is not 1SR. • A serial history can be 1SR and not 1-serial: • H10 = w0[x0] c0 r1[x0] w1[x1] c1 r2[x0] c2 is not 1-serial (T2 reads x from T0). • But H10 is 1SR, since it is equivalent to: • H11 = w0[x0] c0 r2[x0] c2 r1[x0] w1[x1] c1 • We take 1SR as the correctness criterion; this needs to be justified.
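For small examples the 1SR definition can be tested by brute force: by Proposition 1, any serial arrangement of the same committed transactions is equivalent to C(H), so it is enough to look for one arrangement that is 1-serial. The sketch below is an illustration only (factorial cost, hypothetical function names), not an algorithm from the lecture.

```python
import re
from itertools import permutations

def parse(history):
    """Turn 'w0[x0] c0 r2[x0]' into (op, txn, item, version) tuples."""
    found = re.findall(r"([rwc])(\d+)(?:\[([a-z])(\d*)\])?", history)
    return [(op, int(t), item, int(v) if v else None) for op, t, item, v in found]

def one_serial_order(history):
    """Return a transaction order whose serial history is 1-serial, or None."""
    ops = parse(history)
    committed = [t for op, t, _, _ in ops if op == "c"]
    for order in permutations(committed):
        last_writer, ok = {}, True
        for t in order:                        # replay each transaction as one block
            for op, u, item, version in ops:
                if u != t:
                    continue
                if op == "w":
                    last_writer[item] = t
                elif op == "r" and last_writer.get(item) != version:
                    ok = False                 # read does not see the last writer
        if ok:
            return order
    return None

H10 = "w0[x0] c0 r1[x0] w1[x1] c1 r2[x0] c2"
H7 = ("w0[x0] w0[y0] w0[z0] c0 r1[x0] w1[y1] c1 r2[x0] r2[z0] w2[x2] c2 "
      "r3[z0] w3[y3] w3[z3] c3 r4[x2] r4[y1] r4[z3] c4")
print(one_serial_order(H10))  # (0, 2, 1), i.e. the order used by H11
print(one_serial_order(H7))   # None: H7 is not 1SR
```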
Justifying the correctness criterion • Theorem 3: Let H be an mv history over T. C(H) is equivalent to a serial 1V history over T iff H is 1SR.
(If) • H is 1SR, so there exists a 1-serial mv history Hs s.t. C(H) ≡ Hs. • Let H’s be the serial 1V history obtained from Hs by eliminating subscripts (mapping each version to its item). • H’s ≡ Hs: it suffices to show that they have the same reads-from relationships. • Say Tj reads x from Ti in Hs. Hs is 1-serial ⇒ no wk[xk] lies between wi[xi] and rj[xi]. So, no wk[x] lies between wi[x] and rj[x] in H’s. So, Tj reads x from Ti in H’s. • Say Tj reads x from Ti in H’s. If rj[x] was obtained from rj[xi], the same holds in Hs. If rj[x] was obtained from rj[xk], k ≠ i, then: • j = i: Ti reads x from Ti in H’s; by (4) in the mv history definition, Ti reads x from Ti in Hs, so k = i, a contradiction. • j ≠ i: since Hs is 1-serial, wi[xi] < wk[xk] or rj[xk] < wi[xi]. After translation, wi[x] < wk[x] < rj[x] or rj[x] < wi[x], so Tj does not read x from Ti in H’s, a contradiction. • C(H) ≡ Hs and Hs ≡ H’s ⇒ C(H) ≡ H’s. Done.
(Only If) • Given C(H) ≡ H’s, where H’s is a serial 1V history. • Translate H’s to a serial mv history Hs: • ci → ci • wi[x] → wi[xi] • rj[x] → rj[xi] s.t. Tj reads x from Ti in H’s • Hs ≡ H’s: same reads-from relationships by construction. • Claim 1: Hs is a complete mv history. Next slide. • Claim 2: Hs is 1-serial. • Say Tj reads x from Ti, i ≠ j. Since H’s is a serial 1V history, no wk[x] lies between wi[x] and rj[x]. So, no wk[xk] lies between wi[xi] and rj[xi] in Hs. So, Hs is 1-serial. • Claim 3: Hs ≡ H’s ≡ C(H) ⇒ H is 1SR.
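The translation from a serial 1V history to a serial mv history is mechanical: each write is tagged with its own transaction, and each read with the last writer of the item. A minimal sketch, assuming string histories without aborts and a hypothetical function name `to_mv`:

```python
import re

def to_mv(history_1v):
    """Translate a serial 1V history into the mv history with the same reads-from."""
    last_writer, out = {}, []
    for op, t, item in re.findall(r"([rwc])(\d+)(?:\[([a-z])\])?", history_1v):
        if op == "c":
            out.append(f"c{t}")                        # ci -> ci
        elif op == "w":
            last_writer[item] = t
            out.append(f"w{t}[{item}{t}]")             # wi[x] -> wi[xi]
        else:
            out.append(f"r{t}[{item}{last_writer[item]}]")  # rj[x] -> rj[xi]
    return " ".join(out)

H4 = "w0[x] w0[y] c0 r1[x] r1[y] w1[x] w1[y] c1 r2[x] r2[y] c2"
print(to_mv(H4))
# w0[x0] w0[y0] c0 r1[x0] r1[y0] w1[x1] w1[y1] c1 r2[x1] r2[y1] c2
```

Because operation positions are kept, the result is serial, and because every read is tagged with the last writer, it is 1-serial, as Claim 2 states.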
Claim 1: Hs is a complete mv history. • The translation from H’s to Hs ⇒ (1), (2). • Condition (3): Since H is an mv history, each rj[xk] in C(H) is preceded by wk[xk]. So, in H’s (serial 1V), each rj[x] is preceded by some write on x. So, in Hs each rj[xi] is preceded by wi[xi]. • Condition (4): Suppose wj[x] < rj[x] in H’s. H’s is serial, so Tj reads x from Tj in H’s, so rj[x] is translated into rj[xj] in Hs. • Condition (5): Say rj[xi] is in Hs; then Tj reads x from Ti in H’s. H’s is serial, and both Ti and Tj commit in it. So, ci < cj in H’s. The translation retains positions, so ci < cj in Hs.
The 1-serializability theorem • Given a CC mechanism, we must ensure that all of its histories are 1SR. • All known mv CC algorithms totally order the versions of each item. • A version order << for x is a total order on the versions of x. • A version order for H is the union of the version orders of all data items. • MVSG(H, <<) is SG(H) with additional version order edges (recall that nodes are committed transactions): • For each rk[xj] and wi[xi] in C(H) (i, j, k distinct): • If xi << xj, add Ti → Tj. • Otherwise, add Tk → Ti.
MVSG(H6, <<) with version order x0 << x2, y0 << y1 << y3, z0 << z3 • [Figure: SG(H6) (old edges) next to MVSG(H6, <<) (old and new edges) over nodes T0, T1, T2, T3, T4.] • New version order edges annotated in the figure: • r4[y3] with w1[y1] (k = 4, j = 3, i = 1): y1 << y3, so add T1 → T3. • r1[x0] with w2[x2] (k = 1, j = 0, i = 2): x0 << x2, so add T1 → T2. • r2[z0] with w3[z3] (k = 2, j = 0, i = 3): z0 << z3, so add T2 → T3. • Rule: for each rk[xj] and wi[xi] in C(H) (i, j, k distinct): if xi << xj, add Ti → Tj; otherwise, add Tk → Ti.
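The construction of MVSG(H, <<) can be sketched directly from the rule. In the code below the version order is passed as a dictionary from item to the list of writers in << order; that representation, and the names `mvsg` and `is_acyclic`, are assumptions made for this illustration. Cycle detection relies on `graphlib` from the Python standard library (3.9 or later).

```python
import re
from graphlib import TopologicalSorter, CycleError

def parse(history):
    found = re.findall(r"([rwc])(\d+)(?:\[([a-z])(\d*)\])?", history)
    return [(op, int(t), item, int(v) if v else None) for op, t, item, v in found]

def mvsg(history, version_order):
    """SG(H) edges plus version order edges, over committed transactions."""
    ops = parse(history)
    committed = {t for op, t, _, _ in ops if op == "c"}
    reads = [(t, item, v) for op, t, item, v in ops if op == "r" and t in committed]
    writes = [(t, item) for op, t, item, _ in ops if op == "w" and t in committed]
    edges = {(j, k) for k, _, j in reads if j != k}          # SG(H): Tj -> Tk
    for k, item, j in reads:
        for i, written in writes:
            if written != item or len({i, j, k}) < 3:
                continue                                     # need i, j, k distinct
            rank = version_order[item].index
            edges.add((i, j) if rank(i) < rank(j) else (k, i))
    return edges

def is_acyclic(edges):
    sorter = TopologicalSorter()
    for a, b in edges:
        sorter.add(b, a)            # b has predecessor a
    try:
        sorter.prepare()            # raises CycleError if the graph has a cycle
        return True
    except CycleError:
        return False

ORDER = {"x": [0, 2], "y": [0, 1, 3], "z": [0, 3]}
H6 = ("w0[x0] w0[y0] w0[z0] c0 r1[x0] w1[y1] c1 r2[x0] r2[z0] w2[x2] c2 "
      "r3[z0] w3[y3] w3[z3] c3 r4[x2] r4[y3] r4[z3] c4")
H7 = H6.replace("r4[y3]", "r4[y1]")
print(is_acyclic(mvsg(H6, ORDER)), is_acyclic(mvsg(H7, ORDER)))  # True False
```

For H7 the read r4[y1] together with w3[y3] adds T4 → T3, which closes a cycle with the SG edge T3 → T4. This shows a cycle for this particular version order only; ruling out 1SR requires that no version order gives an acyclic graph.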
Using MVSG • Suppose SG(H) is acyclic. • A serial mv history Hs obtained by topologically sorting SG(H) may not be equivalent to any serial 1V history. • This is because reads-from relationships can change when version operations are mapped to item operations. • Version order edges detect this change. • If rk[xj] and wi[xi] are in C(H), the version order edges force wi[xi] either to precede wj[xj] or to follow rk[xj]. • So, in translating operations on xi and xj to operations on x, the reads-from relationship is not changed. • MVSG must be acyclic for all this to work.
Using MVSG(H,<<) • Theorem 4: • An mv history H is 1SR iff • there exists a version order << s.t. MVSG(H,<<) is acyclic.
(If) • Topologically sort G = MVSG(H, <<) to produce a serial history Hs = Ti1, …, Tin. • C(H) is an mv history. • Hs is an mv history (same ops as C(H)). • Since C(H) and Hs have the same ops, they are equivalent by Proposition 1, i.e., C(H) ≡ Hs. • Hs is 1-serial since: • Say Tk reads x from Tj, k ≠ j. Let wi[xi] be any other write on x (i ≠ j, i ≠ k). • If xi << xj then G includes Ti → Tj, so Tj follows Ti in Hs. • If xj << xi then G includes Tk → Ti, so Ti follows Tk in Hs. • So, no transaction that writes x lies between Tj and Tk in Hs. • So, Hs is 1-serial. • Conclusion: H is 1SR.
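The first step of this proof, topologically sorting the MVSG to obtain Hs, can be tried on H6. The edge set below is MVSG(H6, <<) for the version order x0 << x2, y0 << y1 << y3, z0 << z3, written out by hand from the edge rule; the variable and function names are hypothetical.

```python
from graphlib import TopologicalSorter

# Edges (a, b) mean Ta -> Tb; derived by hand from the MVSG definition for H6.
MVSG_H6 = {(0, 1), (0, 2), (0, 3), (2, 4), (3, 4), (1, 2), (1, 3), (2, 3)}
TXN_OPS = {
    0: "w0[x0] w0[y0] w0[z0] c0",
    1: "r1[x0] w1[y1] c1",
    2: "r2[x0] r2[z0] w2[x2] c2",
    3: "r3[z0] w3[y3] w3[z3] c3",
    4: "r4[x2] r4[y3] r4[z3] c4",
}

def serial_history(edges, txn_ops):
    """Topologically sort the graph and lay out each transaction as one block."""
    sorter = TopologicalSorter()
    for t in txn_ops:
        sorter.add(t)
    for a, b in edges:
        sorter.add(b, a)                  # a must come before b
    order = list(sorter.static_order())
    return order, " ".join(txn_ops[t] for t in order)

order, hs = serial_history(MVSG_H6, TXN_OPS)
print(order)  # [0, 1, 2, 3, 4]
print(hs)     # the 1-serial history H8
```

Here the graph contains the chain T0 → T1 → T2 → T3 → T4, so the topological order is unique and the resulting serial history is exactly H8.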
(Only if) • Define MV(H, <<) as a graph containing only version order edges. These edges depend only on the operations in H and on <<. • Let Hs ≡ C(H) be a 1-serial mv history (one exists). • In SG(Hs), Ti → Tj implies Ti precedes Tj in Hs. • Define <<: xi << xj only if Ti precedes Tj in Hs. • All edges in MV(Hs, <<) are s.t. Ti → Tj implies Ti precedes Tj in Hs. • So, in MVSG(Hs, <<) = SG(Hs) ∪ MV(Hs, <<), Ti → Tj implies Ti precedes Tj in Hs. MVSG(Hs, <<) is thus acyclic. • Hs ≡ C(H). By Proposition 1, they have the same ops. • So, MV(C(H), <<) = MV(Hs, <<). • SG(C(H)) = SG(Hs). • MVSG(C(H), <<) = MVSG(Hs, <<). • MVSG(C(H), <<) is acyclic. • MVSG(C(H), <<) = MVSG(H, <<). • MVSG(H, <<) is acyclic.
Mv CC mechanisms • These can be defined based on 2PL, TO and SGT.
Multiversion timestamp ordering (MVTO) • Each transaction Ti has a unique timestamp ts(Ti). • Operations are ts-tagged; versions are tagged with the ts of the writing transaction. • The scheduler processes ops first-come-first-served. • ri[x] → ri[xk], where xk is the version of x with the largest ts ≤ ts(Ti). Send to the DM. • wi[x]: • If the scheduler has already processed some rj[xk] s.t. ts(Tk) < ts(Ti) < ts(Tj), then it rejects wi[x]. • Otherwise, it translates wi[x] into wi[xi] and sends it to the DM. • ci is delayed until cj for all Tj that wrote versions Ti has read.
Intuition • “Simulate” a 1V ts-order execution. • In such an execution, a read of x gets the latest value of x produced by a transaction with a smaller ts. • In such an execution, if x was produced at ts t1 and read by T3 with ts t3 > t1, then a write of x with ts t2 s.t. t1 < t2 < t3 would invalidate the read.
Implementation • For each version xi maintain interval(xi) = [wts, rts]: • wts = ts(xi), the ts of the writer • rts = max(wts, largest ts of a read op on xi) • intervals(x) = {interval(xi) | xi is a version of x}. • Op processing: find the interval I in intervals(x) with the maximal I.wts ≤ ts(Ti): • ri[x]: set I.rts = max(I.rts, ts(Ti)). • wi[x]: if I.rts > ts(Ti) then reject; else send to the DM and create a new interval(xi) = [wts = ts(Ti), rts = ts(Ti)]. • Space: ‘old’ versions need to be deleted. • Delete from oldest to newest; otherwise wrong versions may be read. • It is also possible that when a read arrives, the version with a smaller ts that it needs is no longer available.
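The interval bookkeeping can be sketched as a small class. This is a simplified illustration, assuming integer timestamps and one scheduler object; it leaves out commit delays, aborts and version garbage collection, and the class and method names are hypothetical.

```python
class MVTOScheduler:
    """Toy MVTO scheduler: tracks interval(xi) = [wts, rts] for every version."""

    def __init__(self):
        # item -> {wts: rts}; a version is identified by its writer's timestamp.
        self.intervals = {}

    def _latest_at_or_before(self, item, ts):
        candidates = [wts for wts in self.intervals.get(item, {}) if wts <= ts]
        return max(candidates) if candidates else None

    def read(self, ts, item):
        """Translate ri[x] into a read of the version with the largest wts <= ts."""
        wts = self._latest_at_or_before(item, ts)
        if wts is None:
            raise KeyError(f"no version of {item} visible at ts {ts}")
        versions = self.intervals[item]
        versions[wts] = max(versions[wts], ts)      # extend the read timestamp
        return wts

    def write(self, ts, item):
        """Accept wi[x] unless a later transaction already read the preceding version."""
        wts = self._latest_at_or_before(item, ts)
        if wts is not None and self.intervals[item][wts] > ts:
            return False                            # would invalidate an earlier read
        self.intervals.setdefault(item, {})[ts] = ts
        return True

s = MVTOScheduler()
print(s.write(0, "x"))  # True: first version x0
print(s.read(3, "x"))   # 0: T3 reads x0, so interval(x0) becomes [0, 3]
print(s.write(2, "x"))  # False: ts 2 falls inside [0, 3]
print(s.read(1, "x"))   # 0: a "late" read is still served from x0
```

The last call shows the main benefit named at the start of the lecture: a read that arrives late is served from an older version instead of being rejected.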
Correctness: properties of histories • p1: ts(Ti) = ts(Tj) iff i = j. • p2: For all rk[xj]: wj[xj] < rk[xj] and ts(Tj) ≤ ts(Tk). • p3: For all rk[xj], wi[xi] in H s.t. i ≠ j: • ts(Ti) < ts(Tj), or • ts(Tk) < ts(Ti), or • i = k and rk[xj] < wi[xi]. • That is: • rk[xj] ⇒ there is no other version of x with ts between ts(Tj) and ts(Tk). • If xk exists and k ≠ j, then rk[xj] < wk[xk]. • p4: For all rj[xi] in H s.t. i ≠ j and cj in H, ci < cj. That is, H is recoverable. See Note. • p1 to p4 ⇒ H preserves reflexive reads-from relationships. Otherwise, wk[xk] < rk[xj] and j ≠ k. By p2 and p1, ts(Tj) < ts(Tk). By p3 (with i = k), either ts(Tk) < ts(Tj) (impossible), or ts(Tk) < ts(Tk) (impossible), or rk[xj] < wk[xk] (impossible). Contradiction.
Theorem: Every history H produced by MVTO is 1SR. • Define the version order xi << xj iff ts(Ti) < ts(Tj). • Claim: in G = MVSG(H, <<), Ti → Tj ⇒ ts(Ti) < ts(Tj). • Suppose Ti → Tj is in SG(H); it is due to a read. By p2, ts(Ti) ≤ ts(Tj). By p1, ts(Ti) ≠ ts(Tj), so ts(Ti) < ts(Tj). • Let rk[xj], wi[xi] be s.t. i, j, k are distinct. They generate a version order edge: • Case 1: xi << xj. Ti → Tj is in G, and ts(Ti) < ts(Tj). • Case 2: xj << xi. Tk → Ti is in G. By p3, one of the following holds: • i = k: impossible, since i, j, k are distinct. • ts(Ti) < ts(Tj): impossible, as xj << xi. • ts(Tk) < ts(Ti): the only possible option. • Every edge goes from a smaller to a larger timestamp, so G is acyclic. By Theorem 4, H is 1SR.
Two version 2PL (2V2PL) • Idea: use versions for rw synchronization and 2PL for ww synchronization. • The DM stores one or two versions of each data item. • If there are two versions, only one was written by a committed transaction. • If Ti wrote x and has not yet committed, one version is the before image of x and the other is the one Ti wrote. • Once Ti commits, the older version can be deleted.
2V2PL: the mechanism • Use three lock types: R (read), W (write), C (certify). • Upon Ti’s completion, all of Ti’s W locks are converted to C locks. • wi[x]: request wli[x]. Note: the lock is on x, not xi. Conflicts may block the request; otherwise set the write lock on x and issue wi[xi]. • ri[x]: request rli[x]. It conflicts only with C locks. • If Ti owns wli[x], issue ri[xi]. • Otherwise, once rli[x] is granted, send ri[xj] to the DM, where xj is the only committed version of x. Recoverable! • ci: convert Ti’s wl’s to cl’s. • On write-locked items no other wl’s are possible, but other rl’s are. cl[x] is granted only when there are no more readers on x! • Abort during commit ⇒ no lock release prior to obtaining all cl’s. • Effect: minor reader delays! But readers delay commits!
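The lock rules above can be summarized as a compatibility table. The sketch below encodes one reading of the slide: read locks conflict only with certify locks, write locks conflict with write and certify locks, and certify locks conflict with every lock held by another transaction. The table layout and the function name `can_grant` are assumptions made for this illustration.

```python
# COMPATIBLE[requested][held] is True if the request can coexist with a lock
# of type `held` that ANOTHER transaction holds on the same item.
COMPATIBLE = {
    "R": {"R": True,  "W": True,  "C": False},  # reads wait only for certifiers
    "W": {"R": True,  "W": False, "C": False},  # ww synchronization via 2PL
    "C": {"R": False, "W": False, "C": False},  # certify waits for all readers
}

def can_grant(requested, held_by_others):
    """Return True if `requested` conflicts with no lock held by other transactions."""
    return all(COMPATIBLE[requested][held] for held in held_by_others)

print(can_grant("R", ["W"]))   # True: readers do not wait for an uncommitted writer
print(can_grant("W", ["R"]))   # True: a writer does not wait for readers
print(can_grant("C", ["R"]))   # False: the commit is delayed by a reader
```

The asymmetry between the W and C rows is what gives "minor reader delays" while letting readers delay commits.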
Correctness: properties of histories • Let fi denote the certification of Ti. • q1: For all committed Ti’s, fi follows all of Ti’s operations and precedes ci. • q2: For all rk[xj] (Tk committed): • j ≠ k ⇒ cj < rk[xj] • j = k ⇒ wk[xk] < rk[xk] • That is, read your own version or a committed version.
Correctness: properties of histories • q3: For all wk[xk] and rk[xj], • wk[xk] < rk[xj] ⇒ j = k. • That is, read your own produced value. • q4: If rk[xj], fi and wi[xi] are in H, then: • fi < rk[xj], or • rk[xj] < fi • That is, rk[xj] is totally ordered with respect to certification operations. See Note 1. • q5: For all rk[xj], wi[xi] s.t. i, j, k are distinct and committed, fi < rk[xj] ⇒ fi < fj. See Note 2.
Correctness: properties of histories • q6: For all rk[xj] and wi[xi], i ≠ j and i ≠ k, if rk[xj] < fi then fk < fi. See Note. • q7: For all wi[xi] and wj[xj], i ≠ j, fi < fj or fj < fi.
Theorem: Every history H produced by 2V2PL is 1SR. • Writing the proof carefully is a homework assignment.
Using more than 2 versions • Suppose write locks don’t conflict. • Will have more than two versions, still only the most recently committed is read. • Then, we can allow transactions to read these versions: • Can’t certify a transaction until all versions it read from other transactions are certified. • Convert a write lock on x to a certify lock only if there are no read locks on certified versions of x. • So, read locks on uncertified versions are ignored. • Cascading aborts are now possible. If Ti produced x that was read by Tj and Ti aborts, Tj is aborted as well.
Multiversion mixed method • Distinguish pre-identified queries from updaters. • Updaters use strict 2PL. • When the TM receives an updater’s commit, it assigns the updater a timestamp: • Updaters have timestamps consistent with their order in SG(H) (readers committed). • Each write produces a new ts-tagged version. • Upon receiving a query’s first operation, the TM assigns it a ts less than that of all committed updaters. • A query q’s read of x is translated to a read of the version xi with the largest ts(xi) < ts(q). • This means xi was written by a committed transaction. • Future writes will not be rejected due to this read. • Queries set no locks, never wait, and never cause updaters to wait. • Disadvantages: queries read out-of-date data; timestamps and space must be managed.
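Version selection for a query is a predecessor search over the committed version timestamps of an item. A minimal sketch, assuming those timestamps are kept in a sorted list; the function name `version_for_query` is hypothetical.

```python
from bisect import bisect_left

def version_for_query(version_timestamps, query_ts):
    """Return the largest version timestamp strictly below query_ts, or None."""
    idx = bisect_left(version_timestamps, query_ts)   # first position with ts >= query_ts
    return version_timestamps[idx - 1] if idx > 0 else None

x_versions = [10, 20, 30]                 # sorted commit timestamps of versions of x
print(version_for_query(x_versions, 25))  # 20
print(version_for_query(x_versions, 20))  # 10: the comparison is strict
print(version_for_query(x_versions, 5))   # None: no version old enough survives
```

The `None` case corresponds to the space management issue listed among the disadvantages: if old versions are deleted too eagerly, a query may find no version to read.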
Claim: Every history H produced by the multiversion mixed method is 1SR. • Writing the proof carefully is a homework assignment.