Optimal External Route Selection: Tips and Techniques for ISPs Avi Freedman Net Access
Overview • Brief review of BGP routing concepts • Safe routing • Determining policy • Using MEDs • Setting MEDs on internal routes • as-path padding to tune external traffic • Using local-prefs to tune external traffic • Setting MEDs to tune external traffic
BGP Intro • BGP4 is the protocol used on the Internet to exchange routing information between providers, and to propagate external routing information through networks. • Each autonomous network is called an Autonomous System. • ASs which inject routing information on their own behalf have ASNs.
BGP Peering • BGP-speaking routers peer with each other over TCP sessions, and exchange routes through the peering sessions. • Providers typically try to peer at multiple places. Either by peering with the same AS multiple times, or because some ASs are multi-homed, a typical network will have many candidate paths to a given prefix.
The BGP Route • The BGP route is, conceptually, a “promise” to carry data to a section of IP space. The route is a “bag” of attributes. • The section of IP space is called the “prefix” attribute of the route. • As a BGP route travels from AS to AS, the ASN of each AS is stamped on it when it leaves that AS. Called the AS_PATH attribute, or “as-path” in Cisco-speak.
BGP Route Attributes • In addition to the prefix, the as-path, and the next-hop, the BGP route has other attributes, affectionately known as “knobs and twiddles” - • weight, rarely used - “sledgehammer” • local-pref, sometimes used - “hammer” • origin code, rarely used • MED (“metric”) - a gentle nudge
BGP Policy • BGP was designed to allow ASs to express a routing policy. This is done by filtering certain routes, based on prefix, as-path, or other attributes - or by adjusting some of the attributes to influence the best-route selection process.
BGP Best-Route Selection • With all of the paths that a router may accumulate to a given prefix, how does the BGP router choose which is the “best” path? • Through an RFC-specified (mostly) route selection algorithm.
BGP Best-Route Selection • Do not consider IBGP path if not synchronized • Do not consider path if no route to next hop • Highest weight (local to router) • Highest local preference (global within AS) • Shortest AS path • Lowest origin code IGP < EGP < incomplete • Lowest MED • Prefer EBGP path over IBGP path • Path with shortest next-hop metric wins • Lowest router-id
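The selection order on this slide can be sketched as a filter plus a sort key. This is only an illustrative model, not router code: the `Path` class and its field names are hypothetical, router-ids are simplified to integers, and MED is compared across all paths (which real routers do only for paths from the same neighbor AS unless configured otherwise).

```python
from dataclasses import dataclass

# Lower rank wins: IGP < EGP < incomplete
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Path:
    """Hypothetical container for the attributes the slide lists."""
    name: str
    as_path: tuple
    weight: int = 0
    local_pref: int = 100
    origin: str = "igp"
    med: int = 0
    ebgp: bool = True
    next_hop_metric: int = 0
    router_id: int = 0          # simplified: real router-ids are dotted quads
    synchronized: bool = True
    next_hop_reachable: bool = True


def usable(p):
    """First two steps: drop paths that may not be considered at all."""
    if not p.next_hop_reachable:
        return False
    if not p.ebgp and not p.synchronized:
        return False
    return True


def sort_key(p):
    """Remaining steps, in slide order; the smallest tuple wins."""
    return (
        -p.weight,               # highest weight
        -p.local_pref,           # highest local preference
        len(p.as_path),          # shortest AS path
        ORIGIN_RANK[p.origin],   # lowest origin code
        p.med,                   # lowest MED (assumed comparable here)
        0 if p.ebgp else 1,      # prefer EBGP over IBGP
        p.next_hop_metric,       # shortest next-hop metric
        p.router_id,             # lowest router-id
    )


def best_path(paths):
    candidates = [p for p in paths if usable(p)]
    return min(candidates, key=sort_key) if candidates else None
```

For example, a path with local-pref 150 and a three-AS path beats a path with the default local-pref and a two-AS path, because local-pref is compared first.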
BGP Selection, Summary • So, local-pref is stronger than as-path is stronger than MED. • Setting local-pref without careful planning can cause strange things (preferring other paths to get to your own customers)…
Safe Routing • BGP routes are “promises” to carry traffic to a certain destination. Still, not every provider makes good promises {at all times}. • So, it is best to sanity-filter all eBGP sessions.
Safe Routing • Method 1: • The Cisco “maximum-prefix” keyword • neighbor <remote-ip> maximum-prefix <max> [threshold-percent] [warning-only] • Sets a maximum number of prefixes allowed for a peer. • Behavior 1 (default) - Shut down the session and log the fact. • Behavior 2 (with warning-only) - Leave the session up; just log the warning.
Safe Routing - Filtering • Another method of sanity filtering is to restrict your peers based on routes or as-paths. • Usually, it is hard to filter based on routes (except for our friends, the fanatics at ANS). • So, from smaller providers it is a good idea to prevent random route redistribution.
Safe Routing - Filtering
ip as-path access-list 40 deny _701_
ip as-path access-list 40 deny _1239_
ip as-path access-list 40 deny _3561_
ip as-path access-list 40 deny _1_
ip as-path access-list 40 deny _1673_
ip as-path access-list 40 deny _174_
ip as-path access-list 40 permit .*
• Apply this access-list inbound for sanity.
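To see which as-paths list 40 rejects, here is a rough Python emulation. It assumes the usual Cisco meaning of `_` (start of string, end of string, space, comma, brace, or parenthesis) and first-match-wins evaluation with an implicit deny at the end; the helper names are made up for this sketch.

```python
import re

# Assumed translation of Cisco's "_" metacharacter into a Python regex.
DELIM = r"(?:^|$|[ ,{}()])"

ACL_40 = [
    ("deny", "_701_"),
    ("deny", "_1239_"),
    ("deny", "_3561_"),
    ("deny", "_1_"),
    ("deny", "_1673_"),
    ("deny", "_174_"),
    ("permit", ".*"),
]


def cisco_to_python(pattern):
    """Replace each "_" with the delimiter alternation."""
    return pattern.replace("_", DELIM)


def permitted(as_path, acl=ACL_40):
    """Return True if the first matching entry is a permit."""
    for action, pattern in acl:
        if re.search(cisco_to_python(pattern), as_path):
            return action == "permit"
    return False  # implicit deny when nothing matches
```

A smaller peer announcing `64512 701 5000` (a hypothetical path) would be rejected because 701 appears as a whole AS number, while `7010 17400` passes because neither 701 nor 174 appears on its own.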
“I am Blackholio” • In sufficiently strange circumstances, this won’t help. • If someone (AS 7007, perhaps) strips the as-path information, as-path filters do no good.
Determining Policy • What do you want to do? • The tricky part. • Configuring is easy… • Do you want to prefer higher-quality connections? • Optimize for cost of the links?
Connection Quality • We will assume that you want to optimize for connection quality. • This generally means, in the Platonic zero-packet-loss Internet, minimizing latency and avoiding small pipes. • We’ll come back to small pipes and backup paths when we talk about local-prefs. • We’ll talk about minimizing latency when we explore MEDs.
Connection Quality • At all times, we must minimize packet loss. • In general, this means avoiding public exchanges in favor of private peering and/or transit. • Sometimes this might not be economically desirable, but if you don’t tune this way, stay vigilant about inter-connection quality. • Best to measure it if you really care...
Measuring Packet Loss with MRTG • [MRTG graph] Legend: Max Max 423.0 ms (352.5%); Average Max 32.0 ms (26.7%); Current Max 37.0 ms (30.8%); Max Min 9.0 ms (7.5%); Average Min 5.0 ms (4.2%); Current Min 6.0 ms (5.0%)
Peering Points • You want to prefer paths that you hear over uncongested pipes. • Assuming you have non-full private interconnects, PIs will be better than public exchanges. • Of course, that can depend on which Gigaswitch you’re on; whether you’re at PSK, PACBell, AADS, or the MAEs.
Hot-Potato • In general, traffic is handed off as soon as possible to external providers to minimize backbone utilization and costs. • This is not always the best plan if you want to maximize connection quality (assuming your inter-LATA and/or cross-country links are not full). • Solution - Listen to and use MEDs.
Asymmetry • For this presentation, we are going to ignore the return path - data coming back into your network. • Still, for best tuning you will want to explore this and use as-path padding and possibly controlled de-aggregation (to willing partners)...
Review: Policy • Somehow, you want to prefer better-quality links. • In the examples that follow, we’ll assume a small but national network, peering at MAE-West, MAE-East, and Pennsauken. • Additionally, private interconnects with IDT, PSI, Digex, above.net, and Exodus. • Transit through above.net and UUNET.
Goals • Our goals will be to prefer, in this order: • Private interconnects • Regionality of traffic • Pennsauken over MAE-East • Public Exchanges • Transit pipes, above.net first
Introduction to MEDs • The MULTI_EXIT_DISCRIMINATOR, or MED, is a BGP attribute used to: • Describe internal network topology. • Pass on this topology to external peers. • A smaller knob than others, like local-pref or as-path padding. • Major problem - no inter-provider consistency on MED semantics. • Internally, also called “metrics”.
Setting MEDs • Use an internally consistent scheme. • Usually, people’s MEDs are in the low hundreds or less. • Suggestion - use average delay in ms between POPs. • Set MEDs in one direction only. • To be advanced, MEDs can be set on a per-router basis in a POP, but usually are not.
Network Diagram • [Diagram: four POPs - SF, CHI, PHL, DC]
Setting MEDs • For SF, CHI, PHL, DC: • SF-DC +60 • SF-CHI +40 • CHI-PHL +30 • CHI-DC +25 • PHL-DC +10 • PHL-PSK +0 • DC-MAE-E +5 • SF-MAE-W +5
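The increments above can be turned into per-POP metrics with a little arithmetic. The sketch below assumes the increments simply add along whatever loop-free POP-to-POP path a route follows; how routes actually propagate depends on the iBGP design, so treat the numbers as an illustration. The `LINKS` table and function names are invented for this example, and only the four backbone POPs are modeled.

```python
# Inter-POP MED increments from the slide (backbone links only).
LINKS = {
    ("SF", "DC"): 60,
    ("SF", "CHI"): 40,
    ("CHI", "PHL"): 30,
    ("CHI", "DC"): 25,
    ("PHL", "DC"): 10,
}


def neighbors(pop):
    """Yield (adjacent POP, increment) pairs; links are bidirectional."""
    for (a, b), inc in LINKS.items():
        if a == pop:
            yield b, inc
        elif b == pop:
            yield a, inc


def path_metrics(origin, target):
    """Sorted accumulated MEDs over every loop-free path origin -> target."""
    totals = []

    def walk(pop, seen, total):
        if pop == target:
            totals.append(total)
            return
        for nxt, inc in neighbors(pop):
            if nxt not in seen:
                walk(nxt, seen | {nxt}, total + inc)

    walk(origin, {origin}, 0)
    return sorted(totals)
```

Under these assumptions, `path_metrics("PHL", "CHI")[:2]` gives the direct and one-detour values for a PHL route as seen in CHI, and the first element of each list is the metric that would win when MED is the deciding attribute.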
Network Diagram w/ MEDs • [Diagram: SF, CHI, PHL, DC with link MEDs - SF-CHI 40, SF-DC 60, CHI-PHL 30, CHI-DC 25, PHL-DC 10]
Route Maps in DC
route-map from-sf
 set metric +60
route-map from-chi
 set metric +25
route-map from-phl
 set metric +10
neighbor <sf-ip> route-map from-sf in
etc...
What this Does • A route originating in PHL will have: • metric 70 in SF, via either CHI or DC (unless there are multiple link failures) • metric 30 or 35 in CHI • metric 10 or 55 in DC • etc… • Thus, a provider honoring MEDs (not doing hot-potato) will hand off packets destined for that route at PSK, into PHL.
Slight Improvement? • Or, change things to weight PSK vs. DC over PHL vs. DC: • PSK +0 • MAE-E +20 • Thus, a provider honoring MEDs will send a PHL-destined packet to PSK. This is generally a good thing.
as-path padding • Some think that modifying as-paths is a nasty business. • It is a good way to begin expressing preferences. • If providers have already padded to de-prefer, padding preserves that “de-preference”. • Simple to do.
as-path padding • First, policy? • Private interconnects - pad no times • Regionality of traffic - pad four times x-country • Pennsauken over MAE-East - pad once at Pennsauken; twice at MAE-East • Public Exchanges - twice at MAE-West • Transit pipes, above.net first - pad three • Problem - can’t pad easily going cross-country. • But we can do the rest. • Problem - lots of route-maps and typing. • Why? Can’t prepend our own AS inside the network, so we must have a separate route-map per session.
route-maps
• On everyone, at above.net:
route-map prepend-once permit 10
 set as-path prepend 6461 6461 6461
• On everyone, at UUNET:
route-map prepend-once permit 10
 set as-path prepend 701 701 701
• On PSI, at MAE-East and MAE-West:
route-map prepend-once permit 10
 set as-path prepend 174 174
• On PSI, at Pennsauken:
route-map prepend-once permit 10
 set as-path prepend 174
Local-prefs • Most common method of preferring external routes. • Local-pref is a number, by default 100, put on routes and passed to all routers within a network. • Never passed to an eBGP peer.
Implementing Policy • Customers - local-pref 200 • Private interconnects - local-pref 150 • Pennsauken over MAE-East - 120 for Pennsauken • Public Exchanges - 100 at MAE-East and MAE-West • Transit pipes, above.net first - 80 from transit pipes • Regionality of traffic - defer to MEDs for equal local-pref. May want to add a PACBell connection and make it 120.
route-maps
• At Pennsauken:
route-map psk permit 10
 set local-preference 120
 set community 4969:800
neighbor external-peer-psk route-map psk in
or
neighbor <remoteip> route-map psk in
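Problem: Prefers Bad Paths • The problem with this approach: • Take AS 14000, which has a T1 to Sprintlink and a backup-backup-backup 56k to another local provider, say, AS 13000. • It announces as: • 1239 14000 and • 701 13000 14000 14000 14000 14000 14000 • Local-pref is compared before as-path length, so a higher local-pref on the session where you hear the padded path overrides the padding and sends traffic down the 56k.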
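A toy comparison makes the failure mode concrete. It models only the two relevant steps (local-pref, then as-path length); the dictionaries, the `prefer` helper, and the choice of 120 versus 100 as the local-prefs on the two sessions are assumptions made for illustration.

```python
def prefer(a, b):
    """Pick between two paths using local-pref first, then AS-path length."""
    def key(p):
        return (-p["local_pref"], len(p["as_path"]))
    return min((a, b), key=key)


# AS 14000's primary path (T1 to Sprintlink) and its heavily padded backup.
primary = {"via": 1239, "as_path": [1239, 14000], "local_pref": 100}
backup = {
    "via": 701,
    "as_path": [701, 13000, 14000, 14000, 14000, 14000, 14000],
    "local_pref": 100,
}

# With equal local-pref, the padding works and the short path wins.
same_pref_choice = prefer(primary, backup)["via"]

# If the padded path is heard on a session tagged with local-pref 120,
# the padding is never consulted and the backup path wins.
boosted_backup = dict(backup, local_pref=120)
boosted_choice = prefer(primary, boosted_backup)["via"]
```

Here `same_pref_choice` is 1239 and `boosted_choice` is 701, which is exactly the case where the customer's padding is ignored.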
Listening to MEDs: Same Peer • Nothing special is required to listen to MEDs. • Because MEDs mean different things to different networks, one approach is to only set MEDs inbound for your own routes. • When listening to MEDs at multiple locations from a peer, set to internal MEDs if you want to hot-potato.
route-map on DC, v2
route-map from-sf permit 10
 match community 1
 set metric +60
MEDs from Diff. eBGP Peers • “bgp always-compare-med” keyword allows Ciscos to use MEDs among different providers. • Otherwise, will use them to compare iBGP routes, or eBGP routes from the same AS.
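The effect of the keyword can be sketched as a guard on the MED step. This is a simplified model: it treats the first AS in the as-path as the neighbor AS, assumes non-empty as-paths, and uses hypothetical function names and example paths.

```python
def med_comparable(a, b, always_compare_med=False):
    """MED is compared only for the same neighbor AS unless the knob is on."""
    same_neighbor_as = a["as_path"][0] == b["as_path"][0]
    return always_compare_med or same_neighbor_as


def pick_by_med(a, b, always_compare_med=False):
    """Return the lower-MED path, or None if MED does not decide."""
    if med_comparable(a, b, always_compare_med) and a["med"] != b["med"]:
        return a if a["med"] < b["med"] else b
    return None  # fall through to the later tie-breakers


east = {"name": "east", "as_path": [174, 14000], "med": 10}
west = {"name": "west", "as_path": [174, 14000], "med": 40}
other = {"name": "other", "as_path": [701, 14000], "med": 5}
```

With the default setting, `pick_by_med(east, west)` chooses `east`, while `pick_by_med(east, other)` returns `None` because the two paths come from different neighbor ASs; passing `always_compare_med=True` lets `other` win on its lower MED.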