Learn about window-based congestion control, TCP protocols, rate-based control, and network-indicated approaches in managing congestion in networking. Explore methods like Leaky Bucket and Buffer Preallocation for efficient data flow.
End-end congestion control: window-based congestion control Sending transport entity maintains congestion window over sequence number space • can send a packet if packet seq. # in window • distinct from flow control window On timeout • loss assumed • decrease congestion window size • increase timeout value (in case packet was delayed) ACK received: increase window size • everything is OK, so allow even larger window
End-end congestion control: TCP Uses window-based congestion control Two variables used • cwnd: congestion window size • ssthresh: threshold for slowing down rate of increase TCP slow start + congestion avoidance: • assume 4K segment size • TCP window size = min(flow control window size, congestion control window size)
initialize: cwnd = 1, ssthresh = 16
loop:
    if (ACK received and cwnd <= ssthresh)
        cwnd = cwnd + 1
    else if (ACK received and cwnd > ssthresh)
        cwnd = cwnd + 1/cwnd
    else if (packet timeout)
        ssthresh = cwnd/2   /* new thresh half current win */
        cwnd = 1            /* new window size back to 1 */
forever
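The window rules above can be sketched in Python. This is a minimal illustration, not TCP itself: the function names `on_ack` and `on_timeout` are hypothetical, the window is counted in segments, and the run of 20 ACKs followed by one timeout is an invented scenario.

```python
def on_ack(cwnd, ssthresh):
    """Grow the congestion window in response to one ACK."""
    if cwnd <= ssthresh:
        return cwnd + 1        # slow start: one extra segment per ACK
    return cwnd + 1 / cwnd     # congestion avoidance: about one segment per window of ACKs


def on_timeout(cwnd, ssthresh):
    """Return (new_cwnd, new_ssthresh) after a packet timeout."""
    return 1, cwnd / 2         # threshold becomes half the current window, window back to 1


cwnd, ssthresh = 1, 16
trace = []
for _ in range(20):            # 20 ACKs arrive with no loss
    cwnd = on_ack(cwnd, ssthresh)
    trace.append(cwnd)

cwnd, ssthresh = on_timeout(cwnd, ssthresh)   # then one timeout occurs
```

The first 16 ACKs each add a full segment; once cwnd exceeds ssthresh, each further ACK adds only 1/cwnd, so growth slows sharply.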
Network-indicated Congestion Control Window-based control strictly end-end • network layer not involved, but congestion occurs in network layer! One network-indicated approach: network "marks" packets passing through congested node • receiver sees congestion indication mark and tells sender to slow down • congestion-experienced flag in ISO CLNP, CWI (change window indicator) in IBM SNA virtual route pacing
Network-indicated Congestion Control (cont) Second network-indicated approach: upon detecting congestion, congested router sends explicit message back to traffic sources to slow them down • text: choke packets • source quench in ICMP (Internet control message protocol) • VR-RWI bit in SNA VR pacing
Network-Indicated Congestion Control: Difficulties Receiver-initiated control may have long feedback time in high-speed networks • sender may have 1000’s of sent (but unACKed) packets before congestion indicator is received • pipe filled already Both approaches require coupling of network and transport layer • OK for homogeneous networks • difficult in internetworked environment, with different network layers
Rate-Based congestion control • congestion control particularly hard in high-speed networks • e.g.: 1 Gbit/sec link, 1 Kbyte packet takes 8 microseconds to transmit • thousands of packets "in the wire" propagating cross-country • when congestion occurs, too late to react • avoid congestion by regulating flow of packets into network • smoother flows will avoid bursts of packets from different senders arriving at same node and causing congestion • smooth out packet bursts at network edge on per session basis before they enter network.
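Working the figures through for a 1 Gbit/sec link and a 1 Kbyte packet shows why thousands of packets can be in flight. The sketch below takes 1 Kbyte as 1000 bytes and assumes a one-way cross-country propagation delay of 20 ms; that delay is an illustrative assumption, not a value from the slides.

```python
LINK_RATE_BPS = 1e9            # 1 Gbit/sec link
PACKET_BITS = 1000 * 8         # 1 Kbyte packet, taken as 1000 bytes
PROPAGATION_DELAY_S = 0.020    # assumed one-way cross-country delay (illustrative)

# Time to clock one packet onto the link.
transmit_time_s = PACKET_BITS / LINK_RATE_BPS

# Packets a sender can emit back to back before the first one arrives.
packets_in_flight = PROPAGATION_DELAY_S / transmit_time_s

print(f"transmit time: {transmit_time_s * 1e6:.0f} microseconds")
print(f"packets in flight: {packets_in_flight:.0f}")
```

Under these assumptions a packet takes 8 microseconds to transmit, so roughly 2500 packets are already on the wire by the time any feedback could start its way back.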
Rate-Based Congestion Control: Leaky Bucket Goal: regulate rate at which sender can inject packets into network • a packet must match up with (and remove) a token before entering network • tokens added to bucket at rate r: controls long-term rate of packet entry into network • at most b tokens accumulate in bucket; bucket size b controls "burstiness" of arrivals • maximum number of packets entering network in t time units is b+rt • XTP uses rate and burst parameters analogous to r, b
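A minimal sketch of the token mechanism described above, with r and b as on the slide. The class name `TokenBucket`, the method `try_send`, the choice to start with a full bucket, and the greedy-sender scenario are all assumptions made for illustration.

```python
class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate        # r: tokens added per time unit
        self.burst = burst      # b: most tokens the bucket can hold
        self.tokens = burst     # assumption: the bucket starts full
        self.last = 0.0         # time of the previous refill

    def try_send(self, now):
        """Admit one packet at time `now` if a token is available."""
        # Add the tokens earned since the last call, capped at b.
        earned = (now - self.last) * self.rate
        self.tokens = min(self.burst, self.tokens + earned)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1    # the packet removes its token
            return True
        return False


# A greedy sender: 10 attempts at time 0, then one attempt every 0.25 time units up to t = 10.
bucket = TokenBucket(rate=2, burst=5)
times = [0.0] * 10 + [i * 0.25 for i in range(1, 41)]
sent = sum(bucket.try_send(t) for t in times)
```

With r = 2 and b = 5, the sender gets an initial burst of 5 packets and then 2 per time unit, so over t = 10 it sends exactly b + rt = 25 packets, the bound stated on the slide.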
Congestion Control by Buffer Preallocation • lack of buffering in network is fundamental cause of congestion • avoid congestion by allocating buffers to an end-end connection • if insufficient buffers for a new connection, block it (as in circuit switching) • buffers dedicated to connection, link still shared • protects well-behaved connections from misbehaving ones • data link layer involved in transport layer congestion control
Congestion Control in ATM ABR Service ATM ABR (Available Bit Rate) service: • allows sender to send at rate up to a peak cell rate (PCR) • guarantees a rate of at least minimum cell rate (MCR) when needed • sender rate may fluctuate between 0 and PCR, depending on sender need, network congestion • provides max-min fairness among sessions Congestion Control in ABR service: • combines aspects of rate-based and network-indicated congestion control
ATM ABR Congestion Control: EFCI EFCI: explicit forward congestion indication • based on negative feedback ("bad things are happening") to sender • congested node (queue length > threshold) marks EFCI bit in sender-to-receiver cell • receiver sees EFCI set and notifies sender • sender decreases cell rate: ACR = max(ACR * multiplicative_decrease, MCR), where ACR is the allowed cell rate • sender increases cell rate if no negative feedback in an update interval: ACR = min(ACR + additive_increase, PCR)
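The two ACR update rules can be written as one function. This is a sketch only: the name `update_acr` and the default values for the decrease factor and the additive increase are illustrative assumptions, not ABR parameters from any specification.

```python
def update_acr(acr, congested, mcr, pcr,
               multiplicative_decrease=0.875, additive_increase=1.0):
    """Return the new allowed cell rate after one update interval.

    The default factor and increment are made-up illustrative values.
    """
    if congested:
        # Negative feedback: scale the rate down, but never below MCR.
        return max(acr * multiplicative_decrease, mcr)
    # No negative feedback: step the rate up, but never above PCR.
    return min(acr + additive_increase, pcr)
```

For example, with MCR = 10 and PCR = 150, a congested update from ACR = 100 gives 87.5, while an uncongested update from ACR = 149.5 is capped at 150.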
Additive-Increase Multiplicative-Decrease Congestion Avoidance • r_i: rate after i-th feedback • r_{i+1} = r_i + c if i-th feedback is no congestion (c > 0) • r_{i+1} = a*r_i if i-th feedback indicates congestion (0 < a < 1) • yields fair share of bandwidth of a congested link
Example • two sources • initial rates r^1 and r^2 • bandwidth C • as time goes on (i increases), source rates converge to a fair share [Figure: trajectory of the rate pair (r^1_i, r^2_i), with axes r^1_i and r^2_i, each marked at C]
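The convergence in the two-source example can be checked numerically. The sketch below assumes both sources receive the same feedback at the same time and that congestion is signalled whenever the summed rate exceeds the bandwidth C; the function name `aimd` and all numeric values are illustrative assumptions.

```python
def aimd(r1, r2, capacity, c=1.0, a=0.5, steps=200):
    """Run additive-increase multiplicative-decrease for two sources."""
    for _ in range(steps):
        if r1 + r2 > capacity:
            # Congestion feedback: both rates are scaled by a.
            r1, r2 = a * r1, a * r2
        else:
            # No congestion: both rates grow by the same constant c.
            r1, r2 = r1 + c, r2 + c
    return r1, r2


# Very unequal starting rates sharing a link of bandwidth 50.
r1, r2 = aimd(1.0, 40.0, capacity=50.0)
gap = abs(r1 - r2)
```

Additive increase leaves the difference between the two rates unchanged, while each multiplicative decrease shrinks it by the factor a, so the gap decays towards zero and the rates approach an equal share.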
ATM ABR Congestion Control: explicit rates Sender declares every N-th cell as "RM" cell • RM: resource management • records its PCR and allowed cell rate (ACR) in RM cell • ER field in RM cell: used by switches to set source rate Switch on sender-to-receiver path: if congested • determine new rate for that source (consider PCR, ACR) • set ER field to indicate new rate only if new rate less than current ER value
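Because each switch lowers ER only when its own rate is smaller, the RM cell ends up carrying the minimum rate along the path. A minimal sketch, assuming the sender initialises ER to its PCR and representing an uncongested switch by `None`; the function name `forward_rm_cell` is hypothetical.

```python
def forward_rm_cell(er, switch_rates):
    """Carry an RM cell's ER field across the switches on a path.

    switch_rates holds, per switch, the new rate it computed for this
    source, or None if that switch is not congested.
    """
    for rate in switch_rates:
        # A congested switch overwrites ER only with a smaller value.
        if rate is not None and rate < er:
            er = rate
    return er


pcr = 100
final_er = forward_rm_cell(pcr, [None, 60, 80, None])
```

Here the second switch is the bottleneck, so the ER value that reaches the far end is 60 even though a later congested switch would have allowed 80.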
Connection Management: Connection Paradigms Connection-oriented • explicitly set up/tear down connections • set up session context • initial sequence number, flow control window size • exchange data within context of connection • e.g., TCP, ISO TP4
Connection Paradigms (cont) Connectionless service • pure datagram • one-time unreliable send • e.g., UDP (RFC 768), ISO CLTP (ISO 8072) • transaction oriented • single request from sender, single reply from receiver • VMTP protocol
Connection Management: Fundamental Issues Source of problems: • network can delay, reorder, or lose packets • timeout/retransmit introduces duplicates of data, ACKs, connect, close packets On packet arrival: is it real or is it memorex? • new connection request/release from "real live client" or an old one • transport protocols must create/maintain/destroy enough "state" information to answer the memorex question • explicit connection establishment/teardown with connection-oriented service
Connection Management: Two basic approaches for setting up a connection • two way handshake with wristwatch • three way handshake
Connection Management: Choosing a Unique Identifier Problem: choose identifier (e.g., number) so that no other packet associated with this host currently in network has same identifier • host id unique globally, so concatenated address and identifier unique • assume we know maximum lifetime of packet in network (T) Approach: maintain state • keep list of all values used in last 2T (why 2T ?) • don’t reuse value in list • if list lost: wait 2T • concerns:
Choosing a Unique Identifier Approach: what me worry? • choose at random from large set (e.g., 2**32) of numbers • unlikely to choose new number previously chosen in last 2T • can be combined with used value list for more protection • good enough for many people (except academics)
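The random choice can be combined with the used-value list in a few lines. This sketch assumes identifiers are 32-bit values, that calls arrive with non-decreasing timestamps, and that the caller supplies the current time; the class name `IdentifierChooser` is hypothetical.

```python
import random
from collections import OrderedDict


class IdentifierChooser:
    def __init__(self, max_lifetime, rng=None):
        self.window = 2 * max_lifetime   # remember values for 2T
        self.used = OrderedDict()        # identifier -> time chosen, oldest first
        self.rng = rng or random.Random()

    def choose(self, now):
        """Pick a random 32-bit identifier not used in the last 2T."""
        # Forget values chosen at least 2T ago; they may be reused.
        while self.used:
            oldest_time = next(iter(self.used.values()))
            if now - oldest_time < self.window:
                break
            self.used.popitem(last=False)
        while True:
            candidate = self.rng.getrandbits(32)
            if candidate not in self.used:   # a collision is unlikely but checked
                self.used[candidate] = now
                return candidate
```

The retry loop almost never runs more than once with a 2**32 space, which is the slide's point: random choice alone is usually enough, and the list only covers the rare collision.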
Connection Establishment: Two-way Handshake • initiator sends req_conn(x) message to other side (respondent) • x is unique identifier • respondent accepts connection via acc_conn(x) reply
Two-way Handshake: Old Messages acc_conn(y) recognized as old!
Two-way Handshake: Failure Scenario Receiver cannot tell if req_conn(x) is a duplicate or not
Two-way Handshake with Timers Receiver won’t delete connection record for x until sure no more req_conn(x) in network • hold record until T after connection close
Two-way Handshake: Transactions Request to open connection, pass data, close connection accomplished with one packet • only one round trip delay needed for transactions • receiver: on receipt, perform operation on data, return reply, close connection
Three-way Handshake Two-way handshake: • sender chose unique identifier, x • allowed sender to detect old replies from receiver (receiver had to reply with x) • allowed receiver (with timers on x value) to detect old sender message
Three-way Handshake (cont) Three-way handshake: • let receiver also choose its own unique identifier, y, and require sender to reply back using y • allows receiver to detect old sender messages without using timers • requires three way exchange of messages to set up connection
Three-way Handshake Illustrated Three way handshake: • used in TCP, TP4, DECnet • header bits in TCP packet for SYN, ACK • trades extra round-trip requirements for no timers
Handshaking Scenarios in TCP • responder: passive open, accept() • initiator: active open, connect() • initiator sends syn(x) • responder enters SYN recvd, chooses unique y, replies synack(y,x+1) • initiator: x+1 tells us there’s a live connection, go to ESTAB; replies ack(x+1,y+1) • responder: y+1 tells us there’s a live client, go to ESTAB
Handshaking Scenarios in TCP • responder: passive open, accept() • old syn(x) message from an old connection arrives (no live client) • responder enters SYN recvd, chooses unique y, replies synack(y,x+1) • ack(x+1,z+1) arrives: z+1 tells us this is NOT in response to y message!
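The responder's side of both scenarios can be sketched as a small state table. This is an illustration of the identifier check only, not TCP: the class name `Responder` and its methods are hypothetical, identifiers are random 32-bit values, and sequence-number wraparound is ignored.

```python
import random


class Responder:
    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.pending = {}          # x -> y for connections in the SYN recvd state
        self.established = set()   # (x, y) pairs that reached ESTAB

    def on_syn(self, x):
        """Handle syn(x): choose a unique y and reply synack(y, x+1)."""
        y = self.rng.getrandbits(32)
        self.pending[x] = y
        return ("synack", y, x + 1)

    def on_ack(self, ack_x, ack_y):
        """Handle ack(x+1, y+1); accept only if it echoes our own y."""
        x = ack_x - 1
        y = self.pending.get(x)
        if y is not None and ack_y == y + 1:
            del self.pending[x]
            self.established.add((x, y))   # live client confirmed
            return True
        return False                       # old or unrelated ack: not a reply to y
```

An ack carrying z+1 for some other z is rejected because only a live client that just received synack(y, x+1) can know y, which is what lets the responder do without timers.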
Closing a Connection Two approaches for closing a connection • abort: send close msg to peer, close connection, delete state info • what if close message lost? • graceful: send close msg, but before deleting state info wait for peer to acknowledge close
TCP graceful close: • initiator sends FIN(x) msg to other side (respondent), waits until ACK(x+1) recvd • respondent: how to know if ACK(x+1) was received? • if not received, don’t want to quit, because will need to resend ACK(x+1) later • ask initiator to ACK the ACK(x+1)? • wait 2T after sending ACK(x+1)?
Closing a Connection: Reaching Agreement Q: can I decide to close, knowing the other entity has also agreed to close and knows that I will close? A similar scenario: can two armies coordinate their attacks if communication is unreliable?
Transport Protocol Timers We’ve seen several transport-level timers • timeout-retransmit timer • implicit connection close timer: in two way handshake • timeout on connection establishment: give up if no reply to SYN in 75 secs • delayed ACK timer: try to piggyback receiver-to-sender data on ACK, wait for 200 ms before generating standalone ACK Additional timers: • flow control timer (TCP persist timer): if receiver has set my window to 0 and no recent update, query receiver • keep-alive timer: if no activity in period, generate "I’m OK, are you OK?" packet (2 hours in TCP)
Timers: Implementation Physically: just one countdown timer • initialized, started, stopped via software • will generate interrupt after counting down to zero • protocol code (timer interrupt handler) initiated in response to interrupt
Timers: Implementation Logically: many timers may be running • all future timer timeout values recorded in data structures • hardware timer counts down to earliest interrupt time • on interrupt: • perform required activity • consult data structures, load physical timer with time until next closest interrupt time, start timer • return from interrupt [Figure: timer list, a linked list of entries, each holding the time in future for this interrupt, the interrupt type, and a next pointer]
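The bookkeeping for many logical timers over one physical timer can be sketched as follows. Note the swap: the slide shows a linked timer list, whereas this sketch uses a binary heap from Python's `heapq` for the same earliest-first ordering. The class name `TimerList` and the timer kinds are illustrative assumptions.

```python
import heapq


class TimerList:
    def __init__(self):
        self._heap = []   # entries are (expiry time, insertion order, kind)
        self._seq = 0     # tie-breaker so equal expiry times fire in insertion order

    def add(self, expiry, kind):
        """Record a logical timer that should fire at time `expiry`."""
        heapq.heappush(self._heap, (expiry, self._seq, kind))
        self._seq += 1

    def next_expiry(self):
        """Time to load into the physical timer, or None if nothing is pending."""
        return self._heap[0][0] if self._heap else None

    def on_interrupt(self, now):
        """Fire every timer due by `now`; return (fired kinds, next expiry)."""
        fired = []
        while self._heap and self._heap[0][0] <= now:
            _, _, kind = heapq.heappop(self._heap)
            fired.append(kind)
        return fired, self.next_expiry()


timers = TimerList()
timers.add(5, "retransmit")
timers.add(2, "delayed_ack")
timers.add(9, "keepalive")
```

After each interrupt, the second element of the result is the value the handler would load into the hardware countdown timer before returning.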
Estimating Round Trip Delays Retransmit timer values based on round trip delay estimate End-end delays variable in wide area network (congestion) Estimating round trip time (RTT): exponential averaging • record time from packet send to ACK receipt • compute new RTT estimate using new measured round trip delay (M) and old estimate: RTT <- a*RTT + (1-a)M • a in range [0,1], RTT changes slowly when a close to 1, quickly when a close to 0 Complication: how to deal with retransmissions
Timers: Retransmit Timer Value • retransmission timer should be function of RTT (as estimated above) • timeout value = b * RTT • original TCP spec. recommends b=2 • on packet timeout • increase timer value: loss or delay assumed due to congestion (rather than corruption) • doubling of timeout value is common (up to some threshold) More art than science!
TCP Retransmit Timer: Jacobson’s Algorithm Original TCP timer algorithm replaced in late 1980’s New approach: • adjust timer as a function of RTT and a measure of jitter (variability) (D): D = aD + (1-a)|RTT-M| • timeout value = RTT + 4*D
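The two exponential averages and the resulting timeout can be sketched together. The class name `RttEstimator`, the smoothing value a = 0.875, the zero initial deviation, and the sample values are illustrative assumptions; the sketch also does not address the retransmission ambiguity mentioned earlier.

```python
class RttEstimator:
    def __init__(self, first_sample, a=0.875):
        self.a = a                 # smoothing factor in [0, 1] (illustrative value)
        self.rtt = first_sample    # smoothed round trip time estimate
        self.dev = 0.0             # D: smoothed deviation, assumed to start at 0

    def update(self, m):
        """Fold in a new measured round trip delay M and return the timeout."""
        # D = aD + (1-a)|RTT - M|, using the RTT estimate from before this sample.
        self.dev = self.a * self.dev + (1 - self.a) * abs(self.rtt - m)
        # RTT <- a*RTT + (1-a)M
        self.rtt = self.a * self.rtt + (1 - self.a) * m
        return self.timeout()

    def timeout(self):
        return self.rtt + 4 * self.dev


est = RttEstimator(first_sample=100.0)
steady = est.update(100.0)    # a sample equal to the estimate
jittery = est.update(180.0)   # one much slower sample
```

A single late sample moves the RTT estimate only from 100 to 110, but the deviation term raises the timeout to 150, which is how the timer adapts to jitter rather than to the mean alone.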
Multiplexing and Addressing • transport layer protocol often manages multiple higher layer connections • packet at layer N must contain info about which layer N+1 protocol to pass "data" to • TCP protocol will handle multiple connections (open sockets) simultaneously • must be able to dispatch (demultiplex) incoming packets to correct upper layer connection (e.g., socket) • TCP uses local and remote IP address/port info to demultiplex • network layer may need to demultiplex upward to one of several possible transport protocols (e.g., UDP or TCP)
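Demultiplexing on local and remote address/port information amounts to a table lookup. A minimal sketch, in which the class name `Demultiplexer`, its methods, and the example addresses are hypothetical and a handler is any callable that accepts the payload:

```python
class Demultiplexer:
    def __init__(self):
        # (local IP, local port, remote IP, remote port) -> handler
        self.connections = {}

    def register(self, local_ip, local_port, remote_ip, remote_port, handler):
        key = (local_ip, local_port, remote_ip, remote_port)
        self.connections[key] = handler

    def dispatch(self, local_ip, local_port, remote_ip, remote_port, payload):
        """Pass an incoming payload to the matching connection, if any."""
        key = (local_ip, local_port, remote_ip, remote_port)
        handler = self.connections.get(key)
        if handler is None:
            return False          # no such connection
        handler(payload)
        return True


# Two clients connected to the same local port are kept apart by their remote address.
inbox_a, inbox_b = [], []
demux = Demultiplexer()
demux.register("10.0.0.1", 80, "10.0.0.7", 5001, inbox_a.append)
demux.register("10.0.0.1", 80, "10.0.0.8", 5001, inbox_b.append)
demux.dispatch("10.0.0.1", 80, "10.0.0.8", 5001, b"hello")
```

Using the full four-part key is what allows many simultaneous connections to share one local port.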