Synchronization

Synchronization • Difficult to implement synchronization in distributed environment • Memory is not shared • Clock is not shared • Decisions are usually based on local information • Centralized solutions undesirable (single point of failure, performance bottleneck) • Synchronization mechanisms used to facilitate cooperative or competitive sharing • Clock synchronization • Event ordering • Mutual exclusion • Deadlock • Election algorithms

Clock synchronization • A timer mechanism is used to keep track of the current time, for accounting purposes, and to measure the duration of distributed activities that start on one node and terminate on another. • Computer clocks consist of: • A quartz crystal that oscillates at a fixed frequency. • A counter register whose value is decremented by one for every oscillation of the quartz crystal. • A constant register used to reinitialize the counter register when its value reaches zero, at which point an interrupt (a clock tick) is generated.

Drifting of Clocks • Differences between crystals cause two clocks to run at different rates. • As this difference accumulates over time, a computer clock drifts from the real-time clock that was used for its initial setting. • If ρ is the maximum allowable drift rate, a clock C is said to be non-faulty if 1 - ρ ≤ dC/dt ≤ 1 + ρ, where t is real time. • The nodes of a distributed system must periodically resynchronize their local clocks to maintain a global time base across the entire system.

Clock Drift

Synchronization techniques • External Synchronization • Synchronization with real-time (external) clocks • Clocks are synchronized to UTC (Coordinated Universal Time) • Mutual (Internal) Synchronization • Gives a consistent view of time across all nodes of the system • External synchronization also ensures internal synchronization.

Issues in Clock synchronization • The difference in the time values of two clocks is called clock skew. • A set of clocks is said to be synchronized if the clock skew of any two of them is less than δ (a specified constant). • Because of unpredictable communication delays, a node can obtain only an approximate view of its clock skew with respect to the other nodes' clocks. • Fast- and slow-running clocks must be readjusted. • If a fast clock is set back to the actual time all at once, time appears to run backward for that clock. • Instead, use an intelligent interrupt routine that spreads the correction gradually over many clock ticks, so that the clock always moves forward.

Question: • A distributed system has 3 nodes n1, n2, n3, each having its own clock. The clocks at nodes n1, n2, n3 tick 495, 500 & 505 times per millisecond. The system uses an external clock synchronization mechanism in which all nodes receive the real time every 20 seconds from an external time source & readjust their clocks. What is the maximum clock skew that will occur in this system?

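Answer: Taking 500 ticks per millisecond as the nominal rate, in 1 second of real time n1 advances 495/500 = 0.990 s, n2 advances 1.000 s and n3 advances 505/500 = 1.010 s. The maximum skew accumulated in 1 second, between n1 & n3, is 1.010 - 0.990 = 0.020 s. Thus, the maximum skew just before resynchronization at 20 seconds is 0.020 * 20 = 0.4 s.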

Synchronization algorithms • Centralized Algorithms: Passive Time Server, Active Time Server • Distributed Algorithms: Global Averaging, Local Averaging

Centralized Algorithms • Keep the clocks of all nodes synchronized with the clock time of the time server node, a real time receiver. • Drawbacks • Single point of failure • Poor scalability • Passive time server centralized algorithm • Active time server centralized algorithm

Passive Time Server Centralized • Each node periodically sends a message (“time=?”) to the time server. • The server responds with (“time = T”). • On receiving the reply, the client adjusts its clock to one of the following estimates: • T + (T1 - T0)/2, where T0 is the client's clock time when it sent the request and T1 is its clock time when it received the reply, so the one-way message propagation time is estimated as (T1 - T0)/2. • T + (T1 - T0 - I)/2, where I is the time taken by the time server to handle the request message. • T + (average of (T1 - T0))/2, using the average of several measured round-trip times.
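The three estimates above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample numbers are assumptions, not part of the original slides.

```python
def estimate_clock(server_time, t0, t1, handling_time=0):
    """Clock value the client adopts when the reply arrives.

    server_time   -- T, the time reported by the server
    t0, t1        -- client clock when the request was sent / reply received
    handling_time -- I, time the server spent handling the request (0 if unknown)
    """
    one_way_delay = (t1 - t0 - handling_time) / 2
    return server_time + one_way_delay


def estimate_from_average(server_time, round_trips):
    """Variant that uses the average of several measured (T1 - T0) values."""
    average_round_trip = sum(round_trips) / len(round_trips)
    return server_time + average_round_trip / 2


# Hypothetical numbers: request sent at 40, reply received at 60, T = 1000.
print(estimate_clock(1000, 40, 60))                   # T + (T1 - T0)/2
print(estimate_clock(1000, 40, 60, handling_time=4))  # T + (T1 - T0 - I)/2
print(estimate_from_average(1000, [20, 24, 16]))      # T + average/2
```

Subtracting I matters only when the server's handling time is a noticeable fraction of the round trip; otherwise the first estimate is usually close enough.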

Active Time Server Centralized • The time server node periodically broadcasts its clock time (“time=T”). • The propagation time (Ta) from server to client is known to all clients. • Each client adjusts its clock to T + Ta. • Drawbacks • Not fault tolerant: if the message reaches a node too late, that node's clock is adjusted to a wrong value. • Requires a broadcast facility. • These drawbacks are overcome by the Berkeley algorithm.

Berkeley Algorithm • Used for internal clock synchronization of a group of computers. • The time server periodically sends a message (“time=?”) to all computers in the group. • Each computer in the group sends its clock value to the server. • The server has prior knowledge of the propagation time from each node to the server. • The time server readjusts the clock values in the reply messages using the propagation times & then takes a fault-tolerant average. • The time server readjusts its own clock & sends each node the adjustment (positive or negative) that node must apply.
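One round of the steps above might look like the following sketch. The slides do not say how the fault-tolerant average is computed; here it is assumed to mean "average only the clocks within `max_deviation` of the median clock". The function name, the `"server"` key, and the sample values (minutes after midnight) are all illustrative assumptions.

```python
from statistics import median_low


def berkeley_adjustments(server_clock, reported, propagation, max_deviation):
    """Return the adjustment each participant should apply to its clock.

    reported    -- {node: clock value contained in its reply}
    propagation -- {node: known node-to-server propagation time}
    """
    # Readjust each reported value by the known propagation time.
    adjusted = {node: value + propagation[node] for node, value in reported.items()}
    adjusted["server"] = server_clock

    # Assumed fault-tolerant average: ignore clocks far from the median.
    # median_low always returns one of the actual values, so at least
    # one clock is always trusted.
    middle = median_low(adjusted.values())
    trusted = [v for v in adjusted.values() if abs(v - middle) <= max_deviation]
    average = sum(trusted) / len(trusted)

    # Positive adjustment = advance the clock, negative = slow it down.
    return {node: average - value for node, value in adjusted.items()}


adjustments = berkeley_adjustments(
    server_clock=180,                      # 3:00
    reported={"A": 204, "B": 169, "C": 600},
    propagation={"A": 1, "B": 1, "C": 0},
    max_deviation=60,
)
print(adjustments)
```

In this example node C's clock is far from the others, so it is left out of the average but still receives a (large) correction.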

Distributed Algorithms • All nodes are equipped with real time receiver so that each node’s clock is independently synchronized with real time • External synchronization also results in internal synchronization. • Internal synchronization performed for better results • Global averaging distributed algorithms • Localized averaging distributed algorithms

Global Averaging Distributed Algorithms • Each node broadcasts its local time in a “resync” message at the beginning of every fixed-length resynchronization interval, that is, when its local clock reads T0 + iR, where R is a system parameter. • The broadcasts do not all happen simultaneously, because the local clocks differ. • After broadcasting, a node waits for a time T, during which it collects the “resync” messages sent by other nodes & records the time of receipt of each according to its own clock. • At the end of the waiting time, it estimates the skew of its clock with respect to each of the other nodes on the basis of the times at which it received their “resync” messages. • It then calculates a fault-tolerant average of the estimated skews & uses it to correct its own local clock before the next resynchronization interval starts.

Localized Averaging Distributed Algorithms • The nodes of a distributed system are logically arranged in some kind of pattern, such as a ring or a grid. • Periodically, each node exchanges its clock time with its neighbors in the ring, grid, etc. • It then sets its clock time to the average of its own clock time and the clock times of its neighbors.
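As a small illustration of localized averaging, the sketch below performs one exchange round on a logical ring, where each node's neighbors are its predecessor and successor. The function name and the clock values are assumptions made for the example, and message delays are ignored.

```python
def ring_average_step(clocks):
    """One round of localized averaging on a ring.

    clocks[i] is the clock of node i; node i averages its own value with
    those of its two ring neighbors. All nodes use the values from the
    start of the round.
    """
    n = len(clocks)
    return [
        (clocks[i - 1] + clocks[i] + clocks[(i + 1) % n]) / 3
        for i in range(n)
    ]


before = [99, 102, 105, 102]
after = ring_average_step(before)
print(after, max(after) - min(after))
```

Repeating the step pulls the clocks closer together; no node ever needs to talk to anyone except its neighbors.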

Event Ordering • Observations by Lamport • If two processes do not interact, it is not necessary to keep their clocks synchronized. • All processes need not agree on exactly what time it is; rather, they must agree on the order in which events occur. • Events can be • Procedure invocation • Instruction execution • Message exchange (Send/ Receive) • Lamport defined the happened-before relation, which gives a partial ordering of events.

Happened Before Relation • Happened-before relation (causal ordering) • If a and b are events in the same process, and a occurs before b, then a → b is true. • If a is the event of a message being sent by one process, and b is the event of the same message being received by another process, then a → b is also true. • If a → b and b → c, then a → c (transitive). • The relation is irreflexive: a → a is never true. • Concurrent events: events a and b are concurrent (a||b) if neither a → b nor b → a is true.

[Figure: Space-time diagram for three processes. Process P1 has events e10 to e13, Process P2 has events e20 to e24, and Process P3 has events e30 to e32.] • Causally ordered events: (e10 → e11), (e20 → e24), (e21 → e23), (e21 → e13), (e30 → e24), (e11 → e32) • Concurrent events: (e12, e20), (e21, e30), (e10, e30), (e12, e32), (e13, e22)

[Figure: Space-time diagram for two processes, P1 and P2, with events e1 to e9.] • Question • List all pairs of concurrent events according to the happened-before relation.

Logical Clocks • Determining a → b from physical time would require globally synchronized clocks. • Instead, use the logical clock concept: associate a timestamp with every event. • Timestamps assigned to events by logical clocks must satisfy the clock condition: if a → b, then C(a) < C(b). • Conditions that must hold to satisfy the clock condition: • If a occurs before b in process Pi, then Ci(a) < Ci(b). • If a is the sending of a message by process Pi and b is the receipt of that message by process Pj, then Ci(a) < Cj(b). • Clocks should always go forward, that is, every correction should be a positive value. • Lamport's algorithm meets the above conditions.

The implementation rules of Lamport's algorithm • IR1: Each process Pi increments Ci between any two successive events. • IR2: If event a is the sending of message m by process Pi, the message m contains a timestamp Tm = Ci(a). Upon receiving the message m, a process Pj sets Cj to a value greater than or equal to its present value and strictly greater than Tm.

Implementation of Logical Clock using Counters • All processes have counters acting as logical clocks. • Counters are initialized to zero & incremented by 1 whenever an event occurs in that process. • On sending a message, the process includes the incremented value of the counter in the message. • On receiving a message, the counter value is incremented by 1 & then checked against the counter value received with the message. • If the counter value is less than or equal to the timestamp of the received message, the counter is set to (timestamp in the received message + 1). Ex. e13. • If not, the counter value is left as it is. Ex. e08.
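The counter rules above can be sketched as a small Python class. The class and method names are assumptions for illustration. The receive step uses "less than or equal to" in its comparison, because the clock condition requires the receive event to carry a strictly larger value than the timestamp in the message.

```python
class LamportClock:
    """Logical clock implemented as a simple counter."""

    def __init__(self):
        self.value = 0  # counters start at zero

    def local_event(self):
        """Any internal event: increment the counter by 1."""
        self.value += 1
        return self.value

    def send(self):
        """Sending is an event; the incremented value is the message timestamp."""
        self.value += 1
        return self.value

    def receive(self, timestamp):
        """Receiving is an event; jump past the timestamp if we are behind it."""
        self.value += 1
        if self.value <= timestamp:
            self.value = timestamp + 1
        return self.value


# Replays the two-process scenario described on the slide.
p1, p2 = LamportClock(), LamportClock()
for _ in range(3):
    p1.local_event()          # e01..e03
ts = p1.send()                # e04, timestamp 4
p2.local_event()              # e11
p2.local_event()              # e12
print(p2.receive(ts))         # e13: counter 3 is behind 4, so it jumps ahead
ts = p2.send()                # e14
for _ in range(3):
    p1.local_event()          # e05..e07
print(p1.receive(ts))         # e08: counter is already ahead, left as it is
```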

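[Figure: Implementation of Logical Clock. Process 1 starts with C1=0 and has events e01 to e08 with C1=1 to C1=8. Process 2 starts with C2=0 and has events e11 (C2=1), e12 (C2=2), e13 (C2 changed from 3 to 5 on receiving a message with Timestamp=4) and e14 (C2=6). At e08, Process 1 receives a message with Timestamp=6 and C1 stays at 8.]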

Implementation of Logical Clock using Physical Clocks

Example: • Three processes run on different machines, each with its own clock running at its own speed. • When the clock has ticked 6 times in process 0, it has ticked 8 times in process 1 and 10 times in process 2. • Each clock runs at a constant rate, but the rates differ because of differences in the crystals. • At time 6, process 0 sends message A to process 1. • The clock in process 1 reads 16 when it arrives. • If the message carries its sending time, 6, process 1 will conclude that it took 10 ticks to make the journey. • By the same reasoning, message B from process 1 to process 2 takes 16 (40 - 24) ticks, again a plausible value. • Message C from process 2 to process 1 leaves at 60 and arrives at 56. Impossible! • Message D from process 1 to process 0 leaves at 64 and arrives at 54. Impossible!

Lamport's solution: • Each message carries its sending time, according to the sender's clock. • When a message arrives and the receiver's clock shows a value earlier than or equal to the time the message was sent, the receiver fast-forwards its clock to one more than the sending time. • Since C left at 60, it must arrive at 61 or later. • With this correction, C arrives at 61. Similarly, D arrives at 70.

Total Ordering • Total ordering requires that no two events ever be assigned exactly the same time. • Say events a & b occur in processes P1 & P2 respectively, both at time 100 according to their own clocks. • Process identity numbers are appended to the timestamps to order such events: the timestamp of a is 100.001 (process id of P1 is 001) & that of b is 100.002 (process id of P2 is 002).
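Appending the process id to the timestamp is equivalent to comparing (timestamp, process id) pairs. A minimal sketch, in which the event tuples and the function name are assumptions for illustration:

```python
def total_order(events):
    """Sort events by logical timestamp, breaking ties by process id.

    Each event is a (name, timestamp, process_id) tuple.
    """
    return sorted(events, key=lambda event: (event[1], event[2]))


events = [("b", 100, 2), ("a", 100, 1), ("c", 99, 3)]
print([name for name, _, _ in total_order(events)])
```

Because process ids are unique, two events can never compare as equal, so every process that sorts the same events obtains the same order.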

Mutual Exclusion • Exclusive access to shared resources achieved by critical sections & mutual exclusion. • Conditions to be satisfied by mutual exclusion: • Mutual exclusion - Given a shared resource accessed by multiple concurrent processes, at any time only one process should access the resource. A process that has been granted the resource must release it before it can be granted to another process. • No starvation – If every process that is granted resource eventually releases it, every request will be eventually granted.

Centralized Approach • A coordinator process coordinates entry to the critical section. • It allows only one process at a time to enter the critical section. • It ensures no starvation because requests are granted on a first come, first served basis. • Simple to implement. • Requires 3 messages per critical section entry: request, reply, release. • The coordinator is a single point of failure & a performance bottleneck.
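The coordinator's bookkeeping can be sketched as a FIFO queue plus a record of the current holder. This is a single-process illustration under assumed names (`Coordinator`, `request`, `release`); real message passing and failure handling are left out.

```python
from collections import deque


class Coordinator:
    """Grants the critical section to one process at a time, first come first served."""

    def __init__(self):
        self.holder = None      # process currently in the critical section
        self.queue = deque()    # waiting requests, in arrival order

    def request(self, pid):
        """Handle a request message. Returns True if a reply is sent at once."""
        if self.holder is None:
            self.holder = pid
            return True
        self.queue.append(pid)  # reply is deferred until the CS is free
        return False

    def release(self, pid):
        """Handle a release message. Returns the process that now gets a reply, if any."""
        if pid != self.holder:
            raise ValueError("release from a process that does not hold the CS")
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder


pc = Coordinator()
print(pc.request("P1"), pc.request("P2"), pc.request("P3"))
print(pc.release("P1"), pc.release("P2"), pc.release("P3"))
```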

[Figure: Centralized Approach. Processes P1, P2 and P3 exchange numbered Request, Reply and Release messages (1 to 9) with the coordinator Pc. The status of the coordinator's request queue is shown in its initial state and after messages 3, 4, 5 and 7.]

Distributed Approach • Ricart & Agrawala's Algorithm • When a process wants to enter the CS, it sends a request message to all other processes; it is allowed to enter the CS only after it has received a reply from every one of them. • The request message contains the following information: • Process identifier • Name of the CS • Unique timestamp generated by the process for the request message

The decision whether receiving process replies immediately to a request message or defers its reply is based on three cases: • If receiver process is in its critical section, then it defers its reply • If receiver process does not want to enter its critical section, then it immediately sends a reply. • If receiver process itself is waiting to enter critical section, then it compares its own request timestamp with the timestamp in request message • If its own request timestamp is greater than timestamp in request message, then it sends a reply immediately. • Otherwise, the reply is deferred
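The three cases can be written as a single decision function. In this sketch a request is represented as a (timestamp, process id) pair so that comparisons are never tied; that representation, the `State` enum and the function name are assumptions for illustration.

```python
from enum import Enum


class State(Enum):
    IDLE = "does not want the critical section"
    WAITING = "has requested the critical section"
    IN_CS = "is inside the critical section"


def replies_immediately(state, own_request, incoming_request):
    """Decide whether the receiver replies now (True) or defers (False).

    own_request and incoming_request are (timestamp, process_id) pairs;
    own_request is ignored unless the receiver is WAITING.
    """
    if state is State.IN_CS:
        return False            # defer until the receiver leaves the CS
    if state is State.IDLE:
        return True             # not interested, reply at once
    # Both want the CS: the request with the smaller timestamp wins.
    return own_request > incoming_request


# P1 (timestamp 6) and P2 (timestamp 4) both request the critical section.
print(replies_immediately(State.WAITING, (6, 1), (4, 2)))  # P1 answering P2
print(replies_immediately(State.WAITING, (4, 2), (6, 1)))  # P2 answering P1
```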

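[Figure: Distributed Approach, panels (a) to (d), with four processes P1 to P4. (a) P1 and P2 send request messages with timestamps TS=6 and TS=4 to all other processes while P4 is already in the CS. (b) P3 replies OK to both; P4 defers sending replies to P1 and P2 and queues them; P2 defers sending a reply to P1 and queues it. (c) P4 exits the CS and sends OK to the queued processes; P2 enters the CS. (d) P2 exits the CS and sends OK to P1; P1 enters the CS.]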

Guarantees mutual exclusion, as a process enters the critical section only after getting permission from all other processes. • Guarantees no starvation, as requests are scheduled according to timestamp ordering. • If there are n processes, 2(n-1) messages (n-1 request & n-1 reply) are required per critical section entry.

Drawbacks • There are n points of failure in a system of n processes: all requesting processes wait indefinitely if a single process fails. Remedy: instead of deferring its reply, a receiver sends “permission denied” at once & sends “ok” later when permission is granted, so that a missing reply indicates a failed process. • Every process needs to know the identity of all other processes in the system, which makes the dynamic addition and removal of processes more complex. The algorithm is suitable for groups whose member processes are fixed. • A process enters the critical section only after the exchange of 2(n-1) messages, and the waiting time for this exchange can be quite large. The algorithm is suitable for small groups of co-operating processes.

Token Passing Approach • Processes organized in logical ring • Single token circulated among processes in system • Token holder allowed to enter critical section • On receiving token, process: • If it wants to enter CS, keeps token & enters CS. It passes token to its neighbor on leaving CS. A process can enter only one CS when it receives the token. • If it does not want to enter CS, it passes token to its neighbor.

Mutual exclusion is achieved because only a single token is present. • There is no starvation, because the ring is unidirectional & a process is permitted to enter only one CS each time it holds the token. • The number of messages per CS entry varies from 1 to an unbounded value. • The wait time to enter a CS varies from 0 to n-1 messages.
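The wait-time bounds can be checked with a small simulation of the token moving around a ring of n processes numbered 0 to n-1. This is an illustrative sketch: the function name and process numbering are assumptions, and each request stands for a single CS entry.

```python
def serve_requests(n, wants_cs, holder):
    """Circulate the token until every pending request has been served.

    n        -- number of processes in the logical ring (ids 0..n-1)
    wants_cs -- ids of processes currently waiting to enter the CS
    holder   -- id of the process that holds the token at the start
    Returns (order in which processes entered the CS, number of token passes).
    """
    pending = set(wants_cs)
    if not pending <= set(range(n)):
        raise ValueError("unknown process id in wants_cs")
    order, passes = [], 0
    while pending:
        if holder in pending:
            order.append(holder)      # keeps the token and enters the CS
            pending.remove(holder)
            if not pending:
                break
        holder = (holder + 1) % n     # pass the token to the ring neighbor
        passes += 1
    return order, passes


print(serve_requests(5, {3, 1}, holder=2))
print(serve_requests(5, {2}, holder=2))   # token already here: no wait
print(serve_requests(5, {1}, holder=2))   # token just left: n-1 passes
```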

Two types of failure can occur: • Process failure • Requires detection of the failed process & dynamic reconfiguration of the logical ring. • A process receiving the token sends an acknowledgement back to the neighbor that sent it. • When a process detects the failure of its neighbor, it removes the failed process from the ring by skipping it & passing the token to the process after it. • Lost token • There must be a mechanism to detect the loss & regenerate the token. • Designate one process as monitor. The monitor periodically circulates a “who has token” message. • The owner of the token writes its process identifier in the message & passes it on. • On receipt of the message, the monitor checks the process identifier field. If it is empty, the monitor generates a new token & passes it on. • Multiple monitors can be used.

Election Algorithm • Used for electing a coordinator process from among the currently running processes in such a manner that at any instant of time there is a single coordinator for all processes in the system. • Election algorithms are based on the following assumptions: • Each process has a unique priority number. • The highest-priority process among the currently active processes is elected as the coordinator. • On recovery, a failed process can rejoin the set of active processes.

Bully Algorithm • Assumes that every process knows the priority of every other process in the system. • When a process Pj does not receive a reply to its request from the coordinator within a fixed time period, it assumes that the coordinator has failed. • Pj initiates an election by sending an election message to the processes having priority higher than itself. • If Pj does not receive any reply, it takes up the job of the coordinator & sends a message to all processes with lower priority announcing that it is the new coordinator. • If Pj receives any reply, a process with higher priority is alive. The processes with higher priority now take over the election activity. • The highest-priority process (the one that does not receive any reply) becomes the new coordinator.

A failed process initiates an election on recovery. • If the process with the highest priority recovers from failure, it simply sends a coordinator message to all other processes & bullies the current coordinator into submission. • In a system of n processes, when the process with the lowest priority detects the coordinator's failure, n-2 elections are performed, requiring O(n^2) messages. • If the process just below the coordinator detects its failure, it elects itself as coordinator & sends n-2 messages. • On recovery of a failed process, O(n^2) messages are required in the worst case and n-1 messages in the best case, depending on the priority of the recovering process.
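The way higher-priority processes take over an election can be sketched as follows. Processes are identified by their priority numbers, and the set of live processes is given as an input rather than discovered through timeouts; both simplifications, as well as the function name, are assumptions for illustration.

```python
def bully_election(initiator, alive):
    """Simulate a bully election started by `initiator`.

    alive -- set of priority numbers of the processes that are currently up
             (the initiator must be one of them)
    Returns (new coordinator, sorted list of processes that held an election).
    """
    to_run = [initiator]
    held_election = set()
    coordinator = None
    while to_run:
        p = to_run.pop()
        if p in held_election:
            continue
        held_election.add(p)
        # p sends an election message to every process with higher priority.
        higher_alive = [q for q in alive if q > p]
        if higher_alive:
            to_run.extend(higher_alive)   # each one replies and takes over
        else:
            coordinator = p               # nobody replied: p becomes coordinator
    return coordinator, sorted(held_election)


# Five processes with priorities 1..5; the coordinator (5) has failed.
print(bully_election(1, {1, 2, 3, 4}))   # lowest-priority process detects it
print(bully_election(4, {1, 2, 3, 4}))   # process just below the coordinator detects it
```

The first call shows why detection by the lowest-priority process is the expensive case: every live process with higher priority ends up running its own election.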

Ring Algorithm • Processes are ordered in a logical ring. • Each process knows its successor. • No token is involved. • Any process noticing that the coordinator is not responding sends an election message containing its priority number to its next live successor. • Each receiving process adds its own priority number to the message and passes it along. • When the message gets back to the election initiator, the initiator elects the process having the highest priority number as the coordinator, changes the message to a coordinator message & circulates it to all members. • The coordinator is the process with the highest priority number.

When a failed process recovers, it does not initiate an election. It simply circulates an inquiry message in the ring to find the current coordinator. • What if more than one process detects a crashed coordinator? • More than one election will be started, but all election messages will contain the same information (member process numbers, order of members), so the same coordinator is chosen (highest number). • Irrespective of which process detects the failure, an election always requires 2(n-1) messages. • More efficient & easier to implement
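A ring election can be sketched by walking the ring once from the initiator and collecting the priority numbers of the live processes. The ring layout, the function name and the message count (one election round plus one coordinator round among the live processes) are assumptions for illustration.

```python
def ring_election(ring, alive, initiator):
    """Simulate one ring election.

    ring      -- priority numbers in ring order (each process's successor is the next entry)
    alive     -- set of priority numbers of live processes (includes the initiator)
    initiator -- process that noticed the coordinator is not responding
    Returns (coordinator, priorities collected in the election message, messages sent).
    """
    start = ring.index(initiator)
    visiting_order = ring[start:] + ring[:start]
    # Failed processes are skipped; every live process adds its number.
    collected = [p for p in visiting_order if p in alive]
    coordinator = max(collected)
    # One message per live process for the election round,
    # and the same again for circulating the coordinator message.
    messages = 2 * len(collected)
    return coordinator, collected, messages


ring = [3, 6, 1, 5, 2, 4]                 # n = 6 processes, 6 was the coordinator
print(ring_election(ring, alive={1, 2, 3, 4, 5}, initiator=1))
print(ring_election(ring, alive={1, 2, 3, 4, 5}, initiator=4))
```

Both calls elect the same coordinator and use 2(n-1) = 10 messages, whichever process starts the election.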
