
Failure Detection and Recovery in MULTI6 draft-arkko-multi6dt-failure-detection-00.txt

Multi6 Design Team -- Jari Arkko, Marcelo Bagnulo, Geoff Huston, Erik Nordmark, Margaret Wasserman, Iljitsch van Beijnum, Jukka Ylitalo.


Presentation Transcript


  1. Failure Detection and Recovery in MULTI6 draft-arkko-multi6dt-failure-detection-00.txt Multi6 Design Team -- Jari Arkko, Marcelo Bagnulo, Geoff Huston, Erik Nordmark, Margaret Wasserman, Iljitsch van Beijnum, Jukka Ylitalo

  2. Presentation Outline • Background • Addresses • Interfaces to other components • Reachability • Principles of failure detection • Principles of alternative search • A sketch of a protocol • Design decisions • Architectural issues

  3. Background

  4. Background • MULTI6 design team work • HIP multihoming work • MOBIKE multihoming work • What SCTP has done • Movement detection in mobility protocols • Host address configuration mechanisms

  5. Addresses

  6. Multihoming Basics There’s more than one path for traffic. Typically, there are multiple prefixes for some of the participants. Observations: • Multiple addresses on one or both end hosts • Nodes should know about their own addresses • And learn about the peer’s addresses over the MULTI6 protocol

  7. Node’s Own Addresses -- Where Do They Come From? Addresses come from the other parts of the stack • The addresses are typically configured through protocols such as DHCP or IPv6 Neighbor Discovery • Processes related to addresses are not trivial -- Duplicate Address Detection, valid/deprecated, scoping, ... • Relationship to what the rest of the IP layer does (e.g., Router Discovery) or what the L2 does (e.g., 802.11 attachment)

  8. Node’s Own Addresses -- Where Do They Go To? • Addresses are taken away by the same mechanisms

  9. Addresses -- What is their Status? • Security: we need to believe what the configuration mechanisms tell us • If an address is no longer usable, we need to believe it • Address allocation can be secure (but often not turned on); Nevertheless, M6 has nothing else to rely on either, so an address given to it should be considered at least as a candidate • Even if an address is assigned to an interface, it is not guaranteed that you will actually be able to use it • Link temporarily broken • Router down • Etc

  10. A Few Address-Related Definitions Available address • Address is assigned to an interface • The address is valid (in IPv6) and has completed uniqueness tests Locally operational address • Address is available • L2 green light is on • Default router is reachable (IPv6 NUD)
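
As a minimal sketch of the two definitions above, the predicates below encode "available" and "locally operational" as boolean checks. The `AddressState` record and its field names are hypothetical, chosen only for illustration; in a real stack these facts would be reported by the configuration and lower-layer modules rather than computed by MULTI6 itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressState:
    """Hypothetical snapshot of what the stack knows about one address."""
    address: str
    assigned: bool                  # assigned to an interface
    valid: bool                     # valid (not expired) in the IPv6 sense
    dad_complete: bool              # uniqueness test has completed
    link_up: bool                   # "L2 green light is on"
    default_router_reachable: bool  # e.g. confirmed by IPv6 NUD


def is_available(state: AddressState) -> bool:
    """Available: assigned, valid, and past its uniqueness test."""
    return state.assigned and state.valid and state.dad_complete


def is_locally_operational(state: AddressState) -> bool:
    """Locally operational: available, link up, default router reachable."""
    return (is_available(state)
            and state.link_up
            and state.default_router_reachable)


# An address whose default router is down is available but not operational.
example = AddressState("2001:db8::1", True, True, True, True, False)
print(is_available(example), is_locally_operational(example))
```

Keeping the two predicates separate mirrors the slide: an address can remain a candidate (available) while temporarily failing the local operational checks.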

  11. Interfaces to Related Modules • An obvious set of “configuration” modules and protocols that handle address assignment and deletion and other related tasks • A growing body of work for improving the characteristics related to changing connectivity at the “lower layer” -- e.g. DNA WG • DNA WG: draft-ietf-dna-goals-03.txt • DHC WG: draft-ietf-dhc-dna-ipv4-09.txt

  12. Reachability

  13. Are Two Locally Operational Addresses Enough? Answer: No (not even if you can talk to someone else). [Figure labels: host1, host2, two routers (R), cnn.com, and a link marked (broken)]

  14. The Definition of an Address Pair Address pair • A pair of addresses (src, dst) used in communications between two peers Operational address pair • Both addresses are locally operational • Traffic flows when the pair is used

  15. Symmetric vs. Asymmetric Address Pair Reachability Note that reachability may not always be two-way… Host1 should send from p to r, host2 from s to q. Ping would never work here! [Figure labels: host1 with addresses p and q, host2 with addresses r and s, four routers (R), one path marked "=> only" and the other "<= only"]
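
The asymmetric case on this slide can be illustrated with a small sketch. It models reachability as a set of directed (src, dst) pairs along which packets are delivered; the set `delivers` and the helper names are assumptions made for this example, not part of any MULTI6 specification.

```python
from itertools import product

# Directed reachability: a packet sent with (src, dst) arrives only if the
# pair is in this set. Values follow the slide: p -> r works, s -> q works.
delivers = {("p", "r"), ("s", "q")}

host1_addrs = ["p", "q"]
host2_addrs = ["r", "s"]


def operational_pairs(src_addrs, dst_addrs, delivers):
    """All (src, dst) pairs along which traffic flows in this one direction."""
    return [pair for pair in product(src_addrs, dst_addrs) if pair in delivers]


def ping_works(src, dst, delivers):
    """A ping needs the request on (src, dst) and the reply on (dst, src)."""
    return (src, dst) in delivers and (dst, src) in delivers


host1_to_host2 = operational_pairs(host1_addrs, host2_addrs, delivers)
host2_to_host1 = operational_pairs(host2_addrs, host1_addrs, delivers)
any_ping = any(ping_works(s, d, delivers)
               for s, d in product(host1_addrs, host2_addrs))
print(host1_to_host2, host2_to_host1, any_ping)
```

Each host has a usable outbound pair, yet no single pair works in both directions, which is why a request/reply test on one pair reports failure even though communication is possible.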

  16. Detection and Search

  17. Selecting an Address • How do we know there is a problem? • The address went away (certain) • Explicit test failed (certain…but might be a transient problem) • Lack of TCP progress, ICMP, … (hmm...) • Picking another pair • No existing protocol proposals for finding operational address pairs (MULTI6, HIP, and MOBIKE are looking at this)

  18. Picking Another Address Pair • The selection should not itself cause a new problem through congestion • If a site link goes down, it would be a bad idea for all hosts in the site to suddenly start a Cartesian ping bomb • All hosts must obey exponential back-off while searching • Downside: • With 4 addresses on each side (16 pairs) and a 0.1 s initial timeout, • exponential back-off would take about 3200 seconds! • Suggestion: either this or a slight relaxation
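
The slide does not spell out how the 3200-second figure is derived. One reading that lands close to it, offered here as an assumption, is that all 4 × 4 = 16 address pairs are tried in turn and the timeout doubles after every failed attempt. Under that reading, the wait before the last pair is even tried is 0.1 × (2^15 − 1) ≈ 3277 seconds. The function name and parameters below are illustrative only.

```python
def backoff_timeouts(n_local, n_peer, first_timeout=0.1, factor=2.0):
    """Timeout used for each address pair if it grows by `factor` per attempt.

    Assumes every (local, peer) combination is tried once, in sequence.
    """
    n_pairs = n_local * n_peer
    return [first_timeout * factor ** i for i in range(n_pairs)]


timeouts = backoff_timeouts(4, 4)

# Time spent waiting on the first 15 pairs before the 16th is attempted.
wait_before_last_pair = sum(timeouts[:-1])

# Time spent if the 16th pair also times out.
wait_for_all_pairs = sum(timeouts)

print(f"pairs: {len(timeouts)}")
print(f"wait before last pair: {wait_before_last_pair:.1f} s")
print(f"wait for all pairs:    {wait_for_all_pairs:.1f} s")
```

Either total is on the order of an hour, which is the point of the slide: strict doubling across the full Cartesian product is too slow, hence the suggestion of "a slight relaxation".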

  19. Picking Another Address Pair, Cont’d • As a result, the order in which you try things out is important • Some signaling of preferences can be made while nodes tell each other what addresses they have • For the rest, a number of heuristics can apply to the order • Example: an address that worked 30 seconds ago would be a useful candidate to try • Suggestion: Leave details to implementations
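
Since the slide leaves ordering to implementations, the following is only one possible heuristic, sketched under assumptions: pairs whose destination the peer has signaled a higher preference for are tried first, and ties are broken by how recently the pair was last seen working. The names `order_candidates`, `peer_pref`, and `last_ok` are hypothetical.

```python
from itertools import product


def order_candidates(local_addrs, peer_addrs, last_ok, now, peer_pref=None):
    """Return all (src, dst) pairs, most promising first.

    last_ok   -- dict mapping a pair to the time it was last seen working
    peer_pref -- optional dict mapping a peer address to a preference value
                 (higher is better); unlisted addresses default to 0
    """
    peer_pref = peer_pref or {}

    def sort_key(pair):
        _, dst = pair
        preference = peer_pref.get(dst, 0)
        # Pairs never seen working sort after all pairs that were.
        age = now - last_ok[pair] if pair in last_ok else float("inf")
        return (-preference, age)

    # sorted() is stable, so remaining ties keep their original order.
    return sorted(product(local_addrs, peer_addrs), key=sort_key)


# The pair (A2, B2) worked 30 seconds ago, so it is tried first.
ordered = order_candidates(
    ["A1", "A2"], ["B1", "B2"],
    last_ok={("A2", "B2"): 970.0}, now=1000.0,
)
print(ordered)
```

The ordering only decides which pair is probed next; it does not replace the back-off between probes.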

  20. Picking Another Address Pair, Cont’d • Testing for bidirectional reachability is easy • Testing for unidirectional reachability is harder • Reachability may depend on packet! • Multi6 protocol vs. PING • Multi6 protocol vs. payload packet • Significant?

  21. Finding Pairs -- Unidirectional Case
      Peer A                                        Peer B
      |                                                  |
      |                                                  |
      A decides that it has a problem

  22. Finding Pairs -- Unidirectional Case
      Peer A                                        Peer B
      |                                                  |
      |             Poll 1 (src=A1, dst=B1)              |
      |------------------------------------------------->|
      |                                                  |

  23. Finding Pairs -- Unidirectional Case
      Peer A                                        Peer B
      |                                                  |
      |             Poll 1 (src=A1, dst=B1)              |
      |------------------------------------------------->|
      |                                                  |
      B sees that apparently A has a problem, starts the same process

  24. Finding Pairs -- Unidirectional Case
      Peer A                                        Peer B
      |                                                  |
      |             Poll 1 (src=A1, dst=B1)              |
      |------------------------------------------------->|
      |                                                  |
      |          Poll 2 (src=B1, dst=A1) OK: 1           |
      |               X----------------------------------|
      |                                                  |

  25. Finding Pairs -- Unidirectional Case
      Peer A                                        Peer B
      |                                                  |
      |             Poll 1 (src=A1, dst=B1)              |
      |------------------------------------------------->|
      |                                                  |
      |          Poll 2 (src=B1, dst=A1) OK: 1           |
      |               X----------------------------------|
      |                                                  |
      |             Poll 3 (src=A2, dst=B1)              |
      |------------------------------X                   |
      |                                                  |

  26. Finding Pairs -- Unidirectional Case
      Peer A                                        Peer B
      |                                                  |
      |             Poll 1 (src=A1, dst=B1)              |
      |------------------------------------------------->|
      |                                                  |
      |          Poll 2 (src=B1, dst=A1) OK: 1           |
      |               X----------------------------------|
      |                                                  |
      |             Poll 3 (src=A2, dst=B1)              |
      |------------------------------X                   |
      |                                                  |
      |          Poll 4 (src=B2, dst=A1) OK: 1           |
      |<-------------------------------------------------|
      |                                                  |

  27. Finding Pairs -- Unidirectional Case
      Peer A                                        Peer B
      |                                                  |
      |             Poll 1 (src=A1, dst=B1)              |
      |------------------------------------------------->|
      |                                                  |
      |          Poll 2 (src=B1, dst=A1) OK: 1           |
      |               X----------------------------------|
      |                                                  |
      |             Poll 3 (src=A2, dst=B1)              |
      |------------------------------X                   |
      |                                                  |
      |          Poll 4 (src=B2, dst=A1) OK: 1           |
      |<-------------------------------------------------|
      |                                                  |
      |          Poll 5 (src=A1, dst=B1) OK: 4           |
      |------------------------------------------------->|
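
The five-poll exchange above can be replayed with a short simulation. This is a sketch under stated assumptions, not the protocol from the draft: the peers simply take turns (a real implementation would be timer-driven with back-off), each poll carries the id of the last poll heard from the other side (the "OK: n" field), and a peer treats a pair as working once one of its own polls is acknowledged. The `Peer` class, `run_exchange`, and the candidate orderings are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

Pair = Tuple[str, str]  # (src, dst)


@dataclass
class Peer:
    name: str
    candidates: List[Pair]                  # pairs to try, in order
    sent: Dict[int, Pair] = field(default_factory=dict)  # poll id -> pair
    last_heard: Optional[int] = None        # id of last poll received
    working: Optional[Pair] = None          # confirmed outbound pair
    tried: int = 0

    def next_pair(self) -> Pair:
        """Stick with a confirmed pair, else cycle through candidates."""
        if self.working is not None:
            return self.working
        pair = self.candidates[self.tried % len(self.candidates)]
        self.tried += 1
        return pair


def run_exchange(delivers: Set[Pair], max_polls: int = 10):
    """Alternate polls between A and B until both know a working pair."""
    a = Peer("A", [("A1", "B1"), ("A2", "B1")])
    b = Peer("B", [("B1", "A1"), ("B2", "A1")])
    log = []
    sender, receiver = a, b
    for poll_id in range(1, max_polls + 1):
        pair = sender.next_pair()
        ok = sender.last_heard              # the "OK: n" field, if any
        sender.sent[poll_id] = pair
        delivered = pair in delivers
        log.append((poll_id, sender.name, pair, ok, delivered))
        if delivered:
            receiver.last_heard = poll_id
            if ok is not None:
                # The acknowledged poll tells the receiver which of its
                # own outbound pairs got through.
                receiver.working = receiver.sent[ok]
        if a.working and b.working:
            break
        sender, receiver = receiver, sender
    return a.working, b.working, log


# Only A1 -> B1 and B2 -> A1 deliver packets, as in the slides.
a_pair, b_pair, log = run_exchange({("A1", "B1"), ("B2", "A1")})
for entry in log:
    print(entry)
```

With this reachability set the log reproduces the slide sequence: polls 2 and 3 are lost, poll 4 tells A that its pair (A1, B1) works, and poll 5 tells B that its pair (B2, A1) works.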

  28. Design Decisions

  29. Some Suggested Design Principles • Multi6 should not venture into the area of the configuration modules or protocols -- we shall not reinvent DHCP, and we shall believe what ND tells us • Own addresses are learned locally, peer addresses are communicated • Search procedures need to apply some form of exponential back-off • Multi6 only works as a fail-over • Not load balancing (would cause problems to TCP) • Not selection of “best” path (harder than “a working” path) • No mandated search order, no application input on “primary” or “backup” connection

  30. Some Open Design Principles • Do we need to support unidirectional reachability? • It complicates the protocols • Many failure modes cause unidirectional reachability, particularly given ingress filtering • Is there any limitation in the scope of addresses allowed? • Statistically unique site-locals should work as well as global addresses with MULTI6

  31. Some Architectural Issues

  32. Some Architectural Issues • Division of work between configuration / lower-layer modules and MULTI6 • Some cross-layer communication is needed: • ULP progress information helps failure detection (similar to what IPv6 NUD already needs) • The multihoming layer needs to inform ULPs that a slow start is needed after we have switched to a new address • Division of work between MULTI6 and transport/application layers • Reachability information at MULTI6 or transport layers • Congestion information at transport layer • Application requirements for what is an acceptable connection
