Algorithms for Extracting Timeliness Graphs Carole Delporte, LIAFA, Univ. D. Diderot Stéphane Devismes, VERIMAG, Univ. J. Fourier Hugues Fauconnier, LIAFA, Univ. D. Diderot Mikel Larrea, University of the Basque Country SIROCCO'2010
Goals (In partially synchronous distributed systems) • How to determine the timeliness relations between processes? • That is, whether communication from p to q occurs within a bounded delay • What does "determine" mean? • Eventually all processes agree on the timeliness of some links
Why? • For example • (leader) There exists a process that communicates in a timely way with all others -> leader election • (tree) There exist timely paths from p to every other process -> routing • (ring) There exists at least one timely ring linking all correct processes -> ring overlay
Also… • Timeliness is often used to determine correct processes (p receives messages from q in a timely way => q is correct) • Leader -> Ω • Tree-Routing -> The source is correct (Ω) • Ring -> exactly all correct processes (◊P) (Failure Detectors)
Context… • Processes: timely • (bounds on the time to execute a step -> time can be measured accurately) • Some processes crash • (correct / faulty) • Communication: fully connected graph • Communication: by messages • Reliable links (no message loss)
Timeliness • The link (p,q) is timely: • There exists an unknown bound D: any message sent at time t by p cannot be received by q after time t+D • (if (p,q) is not timely, the communication delays from p to q are unbounded) • (there exists an unknown bound <=> eventually there exists an unknown bound) • (Timeliness is a property that is defined with respect to a given run)
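The definition above can be written formally. The following is one possible formalization; the predicates send_p(m,t) ("p sends message m to q at time t") and recv_q(m,t') ("q receives m at time t'") are notation assumed here for illustration and do not appear in the slides.

```latex
\[
(p,q)\ \text{is timely in run } r
\iff
\exists D \;\; \forall m,\, t,\, t' :\;
\bigl(\mathrm{send}_p(m,t) \wedge \mathrm{recv}_q(m,t')\bigr)
\Rightarrow t' \le t + D
\]
```

The bound D exists but is not known to the processes, which is why it cannot simply be hard-coded as a timeout.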
Reminders • In asynchronous systems, no assumption is made on link timeliness • In synchronous systems, all links are timely • Asynchronous <-> no consensus • Synchronous <-> consensus • Partially synchronous: Some links are timely
Partially synchronous systems: • Examples: • There exists a process having all its outgoing links timely • There exists a time from which all links are timely • Remark: in both cases, consensus is possible • (Ω can be implemented in the first one and ◊P in the second one)
Timeliness: • The timeliness graph of a given run r: T(r)=<S,E> • Nodes: correct processes • Directed edges: (p,q) is an edge iff the link from p to q is timely in r
Basic tool: Watchdog • q can test the link from p to q: • p regularly sends "ALIVE" on the link (p,q) • q arms a timer of period T; if it does not receive "ALIVE" from p within T time, q blames (p,q) and increases T • If the link (p,q) is timely (and p is correct), eventually T is sufficiently large so that q never blames (p,q) again • If the link (p,q) is not timely (and q is correct), q will blame (p,q) infinitely often • Timely link <=> finite number of blamings • (assumption: FIFO links)
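The watchdog can be illustrated with a small offline simulation. This is a minimal sketch, not the authors' implementation: the function name `watchdog`, the choice of increasing T by 1 after each blame, and the representation of a run as a list of ALIVE arrival times at q are all assumptions made here.

```python
def watchdog(arrivals, horizon, timeout=1):
    """Simulate q's watchdog on link (p, q) up to time `horizon`.

    arrivals: times at which q receives ALIVE from p (hypothetical input).
    Returns (number of blames, final value of the timeout T).
    """
    blames = 0
    deadline = timeout                      # timer armed at time 0
    for t in sorted(arrivals) + [horizon]:
        # Each timer expiry strictly before time t is one blame of (p, q).
        while deadline < t:
            blames += 1
            timeout += 1                    # assumed policy: T grows by 1
            deadline += timeout             # timer re-armed at the expiry
        deadline = t + timeout              # ALIVE received: re-arm the timer
    return blames, timeout


# Timely link: ALIVE sent every time unit with a constant delay of 3.
timely = list(range(3, 101))
# Non-timely link: gaps between arrivals keep growing.
growing = [2 ** k - 1 for k in range(1, 10)]

print(watchdog(timely, horizon=100))
print(watchdog(growing, horizon=growing[-1]))
```

On the timely input the number of blames stops growing once T exceeds the gap between arrivals, whereas on the input with growing gaps every longer prefix of the run yields more blames.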
Systems • G=<S,E> is compatible with T(r)=<Sr,Er> if: • (1) S=Sr • (2) All edges of E are timely in T(r). • A system X is defined by a set of timeliness graphs: • Let R(X) be the set of runs of X: r is in R(X) if there exists G in X that is compatible with T(r)
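The two compatibility conditions translate directly into set operations. The sketch below assumes a graph is represented as a pair (nodes, edges) of Python sets, with edges as (p, q) tuples; the function names are hypothetical.

```python
def is_compatible(g, t):
    """G=<S,E> is compatible with T(r)=<Sr,Er> iff S == Sr and E is a subset of Er."""
    (s, e), (sr, er) = g, t
    return set(s) == set(sr) and set(e) <= set(er)


def run_in_system(x, t):
    """A run with timeliness graph t is in R(X) iff some G in X is compatible with t."""
    return any(is_compatible(g, t) for g in x)


# Timeliness graph of some run: processes 1, 2, 3 are correct,
# and links (1,2), (1,3), (2,3) are timely.
T = ({1, 2, 3}, {(1, 2), (1, 3), (2, 3)})
star1 = ({1, 2, 3}, {(1, 2), (1, 3)})   # star centered at 1
star2 = ({1, 2, 3}, {(2, 1), (2, 3)})   # star centered at 2: (2,1) is not timely

print(is_compatible(star1, T), is_compatible(star2, T))
print(run_in_system([star2, star1], T))
```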
Some systems… • ASYNC: G=<S,∅> • COMPLETE: all complete graphs • STAR: all star graphs • TREE: all out-trees • RING: all rings • SC: all strongly connected graphs • PAIR: all cycles of two elements
Extraction • Examples: We want (when it is possible) • To build a star • To build an (out-)tree • To build a ring • Moreover, we want: • Only timely links • All nodes must be (or almost be) correct processes
Almost? • In the general case, it is not possible to ensure that all processes of the extracted graph are correct… • (We can only use timeliness to evaluate whether a process is correct) • However, we can evaluate: • if G satisfies • G contains at least all the correct processes • We don’t know G[Correct] but…
Di-cut (directed cut) • In the extracted graph, if there is no link outgoing from p supposed to be timely (e.g. p is a sink), no process can determine if p is correct… The same holds if all the links from p lead to faulty processes. • (X,Y) is a dicut of G=<S,E> iff (X,Y) is a partition of S such that there is no (directed) link from Y to X
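The dicut definition can be checked mechanically. In this sketch the function name and the graph representation (sets of nodes and of (p, q) edge tuples) are assumptions; it also assumes that both parts of the partition must be non-empty, as in the usual definition of a partition.

```python
def is_dicut(x, y, nodes, edges):
    """(X, Y) is a dicut of G=<S,E>: a partition of S with no edge from Y to X."""
    x, y = set(x), set(y)
    # Assumption: both parts are required to be non-empty.
    is_partition = bool(x) and bool(y) and x | y == set(nodes) and not (x & y)
    return is_partition and not any(u in y and v in x for u, v in edges)


nodes = {1, 2, 3}
path = {(1, 2), (2, 3)}            # 3 is a sink
ring = {(1, 2), (2, 3), (3, 1)}    # strongly connected

print(is_dicut({1, 2}, {3}, nodes, path))   # sink 3 can be cut off
print(is_dicut({3}, {1, 2}, nodes, path))   # edge (2,3) goes from Y to X
print(is_dicut({1, 2}, {3}, nodes, ring))   # edge (3,1) goes from Y to X
```

The sink case matches the remark on the slide: nothing in the graph ever tests a link leaving p, so p can sit on the Y side of a dicut.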
Almost? • In the general case, it is not possible to ensure that all nodes of the extracted graph are correct… • However, we can ensure that: • the extracted graph G satisfies • G contains at least all the correct processes • G[Correct] is either G or (Correct, F) is a dicut of G where F is a subset of faulty processes
Extraction: • Algorithm for extracting a graph from X • Each p has a variable Gp; for every run r there exists G in X such that: • Convergence: for every correct process p, there exists a time t from which Gp=G • Compatibility: G[Correct(r)] is compatible with T(r) • Closure: G[Correct(r)] is a dicut reduction of G (or G itself)
Some results: • If G is extracted, (p,q) is an edge of G, and q is correct, then p is correct. • If p0,…,pm, with p0 and pm correct, is a path of the extracted graph, then for 0≤i<m, (pi,pi+1) is timely and all pi are correct • (in particular, we obtain a route from p0 to pm that only contains timely links) • If G is strongly connected, G[Correct]=G.
Main result: • If a family of graphs X is closed under dicut reduction (for G in X and (A,B) a dicut of G, we have G[A] is in X), then we can always extract a graph from X. • If every graph of X is strongly connected, then the extracted graph G satisfies G[Correct]=G
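For a small, explicitly enumerated family, the closure condition can be tested by brute force. The sketch below is only an illustration under assumptions: graphs are hashable pairs of frozensets, dicut parts are taken to be non-empty, and all function names are hypothetical. It enumerates every subset A, so it is exponential in the number of nodes.

```python
from itertools import combinations


def induced(graph, a):
    """G[A]: the subgraph of G induced by the node set A."""
    _, edges = graph
    a = frozenset(a)
    return (a, frozenset((u, v) for u, v in edges if u in a and v in a))


def dicuts(graph):
    """Yield every dicut (A, B) of the graph: no edge goes from B to A."""
    nodes, edges = graph
    for k in range(1, len(nodes)):
        for a in map(frozenset, combinations(sorted(nodes), k)):
            if not any(u not in a and v in a for u, v in edges):
                yield a, nodes - a


def closed_under_dicut_reduction(family):
    """Check that G[A] is in the family for every G and every dicut (A, B) of G."""
    family = set(family)
    return all(induced(g, a) in family for g in family for a, _ in dicuts(g))


def star(center, nodes):
    """Out-star: one edge from the center to every other node."""
    nodes = frozenset(nodes)
    return (nodes, frozenset((center, q) for q in nodes if q != center))


procs = (1, 2, 3)
all_stars = {star(c, s) for k in (1, 2, 3)
             for s in combinations(procs, k) for c in s}
full_stars_only = {star(c, procs) for c in procs}

print(closed_under_dicut_reduction(all_stars))
print(closed_under_dicut_reduction(full_stars_only))
```

Stars over every subset of processes form a closed family, because any dicut must keep the center on the A side. Restricting the family to stars over all three processes breaks closure, since cutting off a leaf gives a smaller star that is no longer in the family.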
Example • In STAR, we extract a star graph whose center is a correct process (Ω) • In TREE, we extract an out-tree whose root is a correct process p0 and such that for every correct process q, there exists a tree-path from p0 to q that only contains correct processes and timely links • In RING, we extract a ring among all correct processes and containing only timely links • (In contrast, for PAIR, there is no extraction algorithm)
Principles of the algorithm • Watch and punish • Regularly test (p,q): • (p,q) timely <=> q blames (p,q) only a finite number of times • For each (p,q)-blaming, punish all G containing (p,q): increase the counter of G • For each process p, punish all G that do not contain p • (reliably) broadcast the counters • Choose the graph with the smallest counter value • Any graph whose links are all timely and that contains all correct processes of the run is blamed only finitely often -> finite counter • Any graph having at least one asynchronous link or missing some correct process will be blamed infinitely often -> infinite counter
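The punish-and-choose principle can be sketched as local bookkeeping at a single process. This is not the authors' algorithm: the reliable broadcast of counters is omitted, the class and method names are hypothetical, punishing graphs that miss p is assumed to be triggered each time p is heard from, and ties are broken by a fixed order on graphs so that all processes would make the same choice.

```python
def star(center, nodes):
    """Out-star as a hashable pair (nodes, edges)."""
    nodes = frozenset(nodes)
    return (nodes, frozenset((center, q) for q in nodes if q != center))


class Extractor:
    """Local view of one process: one punishment counter per candidate graph."""

    def __init__(self, candidates):
        self.counters = {g: 0 for g in candidates}

    def on_blame(self, p, q):
        """The watchdog blamed link (p, q): punish every graph containing it."""
        for g in self.counters:
            if (p, q) in g[1]:
                self.counters[g] += 1

    def on_alive(self, p):
        """Assumed trigger: p was heard from, so punish graphs that omit p."""
        for g in self.counters:
            if p not in g[0]:
                self.counters[g] += 1

    def choose(self):
        """Pick the graph with the smallest counter, with a deterministic tie-break."""
        return min(self.counters,
                   key=lambda g: (self.counters[g], sorted(g[0]), sorted(g[1])))


s1 = star(1, {1, 2, 3})
s2 = star(2, {1, 2, 3})
small = star(1, {1, 2})          # omits process 3

ex = Extractor([s1, s2, small])
ex.on_blame(2, 3)                # link (2,3) missed a deadline: punishes s2
ex.on_alive(3)                   # process 3 is alive: punishes `small`
print(ex.choose() == s1)
```

In a run where (2,3) is not timely and 3 is correct, the counters of `s2` and `small` grow without bound while that of `s1` stabilizes, so the choice eventually settles on `s1`.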
Moreover… • Enhancement: • If there exists a spanning out-tree in every graph of X, eventually the messages are only sent through the links of the extracted graph • Examples: • STAR, TREE, RING: O(n) links are used (instead of O(n^2))
Conclusion and perspectives • Timeliness <-> failures • Timeliness makes it possible to detect failures (the only way?) • Timeliness is useful (independently of failure detection) • Algorithm complexity… • Impossibility results