1 / 30

Computer Science 328 Distributed Systems

Computer Science 328 Distributed Systems Lecture 6 Global States & Causality Detecting Global Properties Process Histories and States For a process P i , history(P i ) = h i = <e i 0 , e i 1 , … > prefix history(P i k ) = h i k = <e i 0 , e i 1 , …,e i k >

niveditha
Download Presentation

Computer Science 328 Distributed Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Computer Science 328Distributed Systems Lecture 6 Global States & Causality

  2. Detecting Global Properties

  3. Process Histories and States • For a process Pi, history(Pi) = hi = <ei0, ei1, … > prefix history(Pik) = hik = <ei0, ei1, …,eik > Sik : Pi ‘s state immediately before kth event • For a set of processes, global history:H = i (hi) global state:S = i (Siki) a cut C  H = h1c1 h2c2 …  hncn the frontier of C = {eici, i = 1,2, … n}

  4. Consistent cut Inconsistent cut Consistent States • A cut C is consistent if e  C(if f  e then f  C) • A global state S is consistent if it corresponds to a consistent cut e12 e10 e11 e13 P1 e21 P2 e22 e20 P3 e31 e30 e32

  5. Global States • ARunis a total ordering of events in H that is consistent with each hi’s ordering • A Linearization is a run consistent with happens-before () relation in H. • Linearizations pass through consistent global states. • A global state Sk is reachable from global state Si, if there is a lineralization, L, that passes through Si and then through Sk. • A DS evolves as a series of transitions between global states S0 ,S1 , ….

  6. Global State Predicates • Aglobal-state-predicateis a function from the set of global states to {true, false} , e.g. deadlock, termination • A stable global-state-predicateis one that once it becomes true, it remains true in subsequent global states, e.g. deadlock • if P is a gobal-state-predicate of being deadlocked, then safety S reachable from S0, P(S) = false • If P is a global-state predicate of reaching termination, then liveness(P)  L lineralizatoins from S0  SL :passes through SL & P(SL) = true • We need a way to record global states

  7. The “Snapshot” Algorithm • Records a set of process and channel states such that the combination is a consistent GS. • Assumptions: • No failure, all messages arrive intact, exactly once • Communication channels are unidirectional and FIFO-ordered • There is a comm. path between any two processes • Any process may initiate the snapshot (sends Marker) • Snapshot does not interfere with normal execution • Each process records its state and the state of its incoming channels (no central collection)

  8. The “Snapshot” Algorithm (2) 1. Marker sending rule for process Pk • After Pk has recorded its state • for each outgoing channel C, send a marker on C 2. Marker receiving rule for a process Pk on receipt of a marker over channel C • ifPk has not yet recorded its state • record Pk’s state • record the state of C` as “empty” • turn on recording of messages over other incoming channels else • record the state of C` as all the messages received over C` since Pk saved its state

  9. Chandy and Lamport’s ‘Snapshot’ Algorithm • Marker receiving rule for process pi • On pi’s receipt of a marker message over channel c: • if (pi has not yet recorded its state) it • records its process state now; • records the state of c as the empty set; • turns on recording of messages arriving over other incoming channels; • else • pi records the state of c as the set of messages it has received over c • since it saved its state. • end if • Marker sending rule for process pi • After pi has recorded its state, for each outgoing channel c: • pi sends one marker message over c • (before it sends any other message over c).

  10. Two Processes and Their Initial States

  11. Execution of the Processes Send 5 widgets

  12. e11,2 e14 e13 M M e24 M M e21,2,3 M M e31 e32,3,4 1- P1 initiates snapshot: records its state (S1); sends Markers to P2 & P3; turns on recording for channels C21 and C31 2- P2 receives Marker over C12, records its state (S2), sets state(C12) = {} sends Marker to P1 & P3; turns on recording for channel C32 3- P1 receives Marker over C21, sets state(C21) = {a} 4- P3 receives Marker over C13, records its state (S3), sets state(C13) = {} sends Marker to P1 & P2; turns on recording for channel C23 5- P2 receives Marker over C32, sets state(C32) = {b} 6- P3 receives Marker over C23, sets state(C23) = {} 7- P1 receives Marker over C31, sets state(C31) = {} Snapshot Example e10 e13 P1 a e23 P2 e20 b P3 e30

  13. Reachability Between States in Snapshot Algorithm • Let ei and ej be events occurring at pi and pj, respectively such that ei ej • the snapshot algorithm ensures that • if ej is in the cut then ei is also in the cut. • if ej occurred before pj recorded its state, then ei must have occurred before • pi recorded its state.

  14. Include(obj1) obj1.method() P2 has obj1 Causality Violation Physical Time 1 2 P1 0 1 P2 0 5 6 2 4 P3 0 4 3 • Causality violation occurs when order of messages causes an action based on information that another host has not yet received. • In designing a DS, potential for causality violation is important

  15. Violation: (1,0,0) < (2,1,2) Detecting Causality Violation Physical Time 1,0,0 2,0,0 0,0,0 P1 (1,0,0) P2 0,0,0 2,1,2 2,2,2 (2,0,0) (2,0,2) P3 0,0,0 2,0,2 2,0,1 • Potential causality violation can be detected by vector timestamps. • If the vector timestamp of a message is less than the local vector timestamp, on arrival, there is a potential causality violation.

  16. Communication Modes in DS • Unicast (best effort or reliable) • Messages are sent from exactly one host to one host • Best effort guarantees that if a message is delivered it would be intact • Reliable guarantees delivery of messages. • Broadcast • Messages are sent from exactly one host to all hosts on the same network. • Reliable broadcast protocols are not practical • Multicast • Messages are sent from exactly one host to several hosts on the same or different networks. • Multicast can be implemented above a reliable unicast

  17. Basic Multicast • A basic multicast primitive guarantees a correct process will eventually deliver the message, as long as the multicaster does not crash. • A straightforward way to implement B-multicast is to use a reliable one-to-one send operation: • B-multicast(g,m): for each process p in g, send (p,m). • receive(m): B-deliver(m) at p.

  18. Reliable Multicast • Integrity: A correct process pdelivers a message m at most once. • Validity: If a correct process multicasts message m, then it will eventually deliver m. • Guarantees liveness to the sender. • Agreement: If a correct process delivers message m, then all the other correct processes in group(m) will eventually deliver m. • Property of “all or nothing.” • Validity and agreement together ensure that overall liveness: if one process (the sender) eventually delivers a message m, then, since the correct processes agree on the set of messages they deliver, it follows that m will eventually be delivered to all the group’s correct members.

  19. Reliable Multicast Algorithm

  20. Ordered Multicast • FIFO ordering: If a correct process issues multicast(g,m) and then multicast(g,m’), then every correct process that delivers m’ will deliver m before m’. • Causal ordering: If multicast(g,m)  multicast(g,m’) then any correct process that delivers m’ will deliver m before m’. • Totally ordering: If a correct process delivers message m before m’, then any other correct process that delivers m’ will deliver m before m’.

  21. Total, FIFO and Causal Ordering • Totally ordered messages T1 and T2. • FIFO-related messages F1 and F2. • Causally related messages C1 and C3 • Total ordering does not imply causal ordering. • Causal ordering does not imply total ordering. • Hybrid mode: causal-total ordering, FIFO-total ordering.

  22. os.interesting Bulletin board: From Subject Item 23 A.Hanlon Mach 24 G.Joseph Microkernels 25 A.Hanlon Re: Microkernels 26 T.L’Heureux RPC performance 27 M.Walker Re: Mach end Display From Bulletin Board Program

  23. Ordering Guarantees (FIFO) • Process messages from each host in the order they were sent: • Each process keeps a sequence number for each host. • When a message is received, as expected, accept higher than expected, buffer in a queue lower than expected, reject If Message# is

  24. Implementing FIFO Ordering • Spg: the number of messages p has sent to g. • Rqg: the sequence number of the latest group-g message p has delivered from q. • For p to FO-multicast m to g • p piggy backs the value Spg onto the message. • p B-multicasts m to g. • p increments Spg by 1. • Upon receipt of m from q with sequence number S: • p checks whether S=Rqg+1. If so, p FO-delivers m. • If S > Rqg+1, p places the message in the hold-back queue until the intervening messages have been delivered and S=Rqg+1.

  25. Hold-back Queue for Arrived Multicast Messages

  26. Accept: 2 = 1 + 1 Accept 1 = 0 + 1 Reject: 1 < 1 + 1 2 0 0 Accept 1 = 0 + 1 2 0 0 Buffer 2 > 0 + 1 Accept Buffer 2 = 1 + 1 Example: FIFO Multicast Physical Time Reject: 1 < 1 + 1 2 0 0 2 1 0 1 0 0 2 1 0 P1 0 0 0 1 2 2 1 1 1 P2 0 0 0 2 1 0 2 0 0 1 0 0 1 P3 0 0 0 2 1 0 0 0 0 1 0 0 Accept: 1 = 0 + 1 0 0 0 Sequence Vector

  27. Total Ordering Using a Sequencer

  28. ISIS algorithm for total ordering P 2 1 Message 3 2 P 2 4 2 Proposed Seq 1 3 Agreed Seq 1 2 P 1 3 P 3

  29. Causal Ordering using vector timestamps The number of group-g messages from process j that happened before the current message to be multicast

  30. 0,0,0 1,1,0 1,0,0 Accept: Accept Buffered message Example: Causal Ordering Multicast Reject: Accept 1,0,0 1,1,0 1,1,0 P1 (1,1,0) (1,1,0) (1,0,0) P2 0,0,0 1,1,0 1,0,0 (1,0,0) (1,1,0) P3 0,0,0 1,1,0 Accept Buffer, missing P1(1) Physical Time

More Related