Simulating the Internet: challenges & methods

Simulating the Internet: challenges & methods • Kevin Fall • Network Research Group, Lawrence Berkeley National Laboratory • Berkeley, CA USA

LBNL’s Network Research Group • Members: • Van Jacobson, group leader • Kevin Fall • Sally Floyd * • Craig Leres • Vern Paxson * • http://www-nrg.ee.lbl.gov

Outline • Simulating the Internet is not easy • The VINT project: an effort in Internet-style simulation

Simulations for Network Research • Models of interesting behavior • Easily-varied parameters • Controlled environment, reproducible results

Problems in Characterizing the Internet • Large Scale: • even a small fraction of misbehaving entities is non-negligible • scale stresses assumptions in protocol design and implementation • Drastic Change: • will the rate of change continue? • predominant use not obvious (e.g. the web, continuous media, ?) • Heterogeneity everywhere!

Link and Topology Heterogeneity • Delay and bandwidth span 5 to 6 orders of magnitude! • 20 µs to 2 s round-trip propagation delay • 10 Kb/s to 10 Gb/s bandwidth range • Topology • hierarchy and clustering chosen by ISPs • performance tied to which path packets take in network • paths may change dynamically • IP routes are frequently asymmetric

Protocol Heterogeneity • Adaptive and non-adaptive Internet protocols • react to congestion (TCP) • nonreactive (UDP) • Application Dynamics • multi-protocol interactions • user activity • application mix varies greatly by site • Implementations may not be consistent

Traffic • Internet traffic not easily characterized • no commonly accepted model • traffic may be shaped by congestion response • Dependent on source behavior • application protocol limitations • new applications • pricing policies

So, what can be done in simulation? • Strategy • 1: Look for invariants • 2: Explore the parameter space • 3: Understand the limits of simulation

1: Searching for Invariants • What do we really know about Internet dynamics? • How to characterize statistically? • traffic • users • sessions • congestion, etc. • Mathematical simplicity does not imply accuracy

The Self-Similar Nature of Traffic • packet arrivals not exponentially distributed • thus, arrival process is not Poisson • bursts over multiple time-scales • they exhibit long-range dependence • suggests self-similar models • (there is still contention on this point) • Implications • aggregation does not “smooth out” variation • traffic synthesis more difficult • network buffering may be much less effective than thought based on Markovian models
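
One common way to check whether aggregation "smooths out" a trace is a variance-time computation: average the trace over non-overlapping blocks of size m and watch how the variance of the averaged series falls as m grows. The sketch below is a minimal illustration under that assumption; the function names `aggregate` and `variance_time` are hypothetical and are not part of the VINT tools.

```python
from statistics import pvariance

def aggregate(series, m):
    """Average non-overlapping blocks of m samples.

    A trailing partial block is dropped.
    """
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]

def variance_time(series, block_sizes):
    """Map each block size m to the variance of the m-aggregated series.

    Every block size must leave at least one full block, otherwise
    pvariance raises StatisticsError.
    """
    return {m: pvariance(aggregate(series, m)) for m in block_sizes}
```

For independent per-interval counts the variance falls roughly as 1/m. For long-range dependent traffic it falls more slowly, roughly as m^(2H-2) with Hurst parameter H > 0.5, which is one way to see that burstiness survives aggregation.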

User-generated Sessions look Poisson • user-generated session arrivals look Poisson (machine-generated connection arrivals are not) • distribution is invariant, parameterized only by a (fixed, hourly) rate
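
A minimal sketch of generating session arrivals under the model above: a Poisson process whose rate is constant within each hour. The function name `session_arrivals` and the representation of rates as "expected sessions per hour" are assumptions made for illustration.

```python
import random

def session_arrivals(hourly_rates, seed=0):
    """Return sorted arrival times in seconds.

    hourly_rates[h] is the expected number of sessions in hour h.
    Within an hour, inter-arrival times are exponential; restarting
    the draw at each hour boundary is valid because the exponential
    distribution is memoryless.
    """
    rng = random.Random(seed)
    arrivals = []
    for hour, rate in enumerate(hourly_rates):
        if rate <= 0:
            continue  # no sessions expected in this hour
        per_second = rate / 3600.0
        end = (hour + 1) * 3600.0
        t = hour * 3600.0 + rng.expovariate(per_second)
        while t < end:
            arrivals.append(t)
            t += rng.expovariate(per_second)
    return arrivals
```

This covers only user-generated session starts; the connections and packets inside each session would need a separate model.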

Network Activity tends to have a heavy-tailed distribution • Examples: packets in a user’s TELNET session; bytes in FTP-DATA transfers • distribution looks Pareto with 0.9 < b < 1.0 • Pareto distribution with shape b has: • infinite mean if b <= 1 • infinite variance if b <= 2 • This type of Pareto has infinite mean and variance (and is very unlike an exponential) • burstiness remains across aggregation
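
To get a feel for how unlike an exponential such a distribution is, the sketch below draws Pareto variates by inverse-transform sampling and tracks the running sample mean. It assumes a classical Pareto with scale 1; the helper names are hypothetical. With a shape in the 0.9 < b < 1.0 range, the running mean keeps jumping as rare, very large samples arrive.

```python
import random

def pareto_sample(shape, scale=1.0, rng=random):
    """Draw one classical Pareto variate by inverse-transform sampling."""
    u = 1.0 - rng.random()  # u lies in (0, 1], so no division by zero
    return scale / (u ** (1.0 / shape))

def running_means(shape, n, seed=1):
    """Return the sample mean after each of n draws."""
    rng = random.Random(seed)
    total = 0.0
    means = []
    for i in range(1, n + 1):
        total += pareto_sample(shape, rng=rng)
        means.append(total / i)
    return means

def largest_share(samples):
    """Fraction of the total contributed by the single largest sample."""
    return max(samples) / sum(samples)
```

Comparing `largest_share` for Pareto samples against exponential samples of the same size (for example from `random.expovariate`) shows how much of the total a few heavy-tailed values can carry.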

2: Exploring the Parameter Space • Consider a large range for parameters • recall, 5-6 orders of magnitude range in bandwidth and delay • note that behavior is often non-linear in parameter values • Repeat, repeat, repeat • topology generators • randomness
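
A minimal sketch of such a sweep: parameters are spaced on a log scale so that every order of magnitude is covered, and each setting is repeated with several random seeds. The names `log_space` and `sweep`, and the `run` callback signature, are assumptions for illustration; `run` stands in for whatever simulation is being driven.

```python
import itertools
import math
import random

def log_space(lo, hi, points_per_decade=1):
    """Return values from lo to hi spaced evenly on a log scale."""
    decades = math.log10(hi / lo)
    steps = max(1, round(decades * points_per_decade))
    return [lo * (hi / lo) ** (i / steps) for i in range(steps + 1)]

def sweep(bandwidths_bps, delays_s, seeds, run):
    """Call run(bandwidth, delay, rng) for every combination.

    Each run gets its own seeded generator so that results are
    reproducible and the seeds can be varied independently.
    """
    results = {}
    for bw, delay, seed in itertools.product(bandwidths_bps, delays_s, seeds):
        results[(bw, delay, seed)] = run(bw, delay, random.Random(seed))
    return results
```

For example, `log_space(1e4, 1e10)` covers the 10 Kb/s to 10 Gb/s bandwidth range with one point per decade. Because behavior is often non-linear in the parameters, a denser grid near any observed transition is usually worth a second pass.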

3: The Limits of Simulation • Simplified Models • useful for gaining intuition and exploring parameters • danger of oversimplification • Need for a Reality Check • compare simulation results with measurement • Internet measurements often offer “surprises”

The VINT Project (Virtual InterNet Testbed) • USC/ISI: Deborah Estrin, Mark Handley, John Heidemann, Ahmed Helmy, Polly Huang, Satish Kumar, Kannan Varadhan, Daniel Zappala • LBNL: Kevin Fall, Sally Floyd • UC Berkeley: Elan Amir, Steven McCanne • Xerox PARC: Lee Breslau, Scott Shenker • VINT is currently funded by DARPA through mid-1999

VINT Goals • provide common platform for network research • explore issues of scale and multi-protocol interaction • Specific Areas: • multicast, end-to-end transport • simulation scaling • traffic management • emulation

Multicast Research • Reliable Multicast Transport • Large Scale • “SRM”: Scalable Reliable Multicast • Multicast Congestion Management • Group formation • (still ongoing) • Layered Transmission • layered encoding • dynamic multi-group join/leave

Simulation Scaling • Simulator capable of 1000s of nodes • Want 100,000s of nodes (or more) • “Session” Abstraction • abstract away some simulation details • trade detail for time/space • scales simulation by about 10X

Traffic Management • Active Buffer Management • Random Early Detection Gateways • Explicit Congestion Notification (ECN) • Packet Scheduling • Class-Based Queuing (CBQ) • Round-Robin and Fair Queuing Variants • Differentiated Services • Admission Control • Reservation Support

Emulation • Interface Simulator with Live Network • Live Traffic Passes through Simulated Topology • Special “Real-Time” Scheduler • may not keep synchronized under load

The VINT Simulation Environment • Components: ns2 and nam • NS2 (network simulator, version 2): • Discrete-event C++ simulation engine • scheduling, timers, packets • Split OTcl/C++ object “library” • protocol agents, links, nodes, classifiers, routing, error generators, traces, queuing, math support (random variables, integrals, etc.) • Nam (network animator) • Tcl/Tk application for animating simulator traces • available on UNIX and Windows 95/NT
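
For readers unfamiliar with discrete-event simulation, the core of such an engine is a time-ordered event queue. The sketch below illustrates the general idea in Python; it is not ns2's implementation or API, and the `Scheduler` class and its method names are hypothetical.

```python
import heapq
import itertools

class Scheduler:
    """Minimal discrete-event scheduler (illustrative, not ns2's)."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        # The counter breaks ties so that events scheduled for the
        # same time fire in the order they were scheduled.
        self._counter = itertools.count()

    def at(self, delay, handler, *args):
        """Schedule handler(*args) to run `delay` time units from now."""
        if delay < 0:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(
            self._queue,
            (self.now + delay, next(self._counter), handler, args),
        )

    def run(self, until=float("inf")):
        """Fire events in time order, advancing the simulated clock."""
        while self._queue and self._queue[0][0] <= until:
            self.now, _, handler, args = heapq.heappop(self._queue)
            handler(*args)
```

Simulated time jumps directly from one event to the next, so a link with a 2 s propagation delay costs no more wall-clock time than one with a 20 ms delay; a handler models a timer or a packet arrival by scheduling further events with `at`.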

NS Supported Components • Protocols: • TCP (2 modes + variants), UDP, IP, RTP/RTCP, SRM, 802.3 MAC, 802.11 MAC • Routing • global topology map, classifiers • static unicast, dynamic unicast (distance-vector), multicast • Queuing and packet scheduling • FIFO/drop-tail, RED, CBQ, WRR, DRR, SFQ • Topology: nodes, links • Failures: link errors/failures • Emulation: interface to a live network

TCP Animation

SRM Animation

Benefits • Common simulation environment • simulations expressed in scripting language • separate visualization tool • topology and “scenario” generators • modular structure is extensible; sources provided • Unique Features • Rich Protocol Set • “Session” abstraction • scales simulations by about 10X • Visualization and Emulation capabilities • separate Network Animator (nam) tool • low-level interface to system’s protocols

The NS Architecture • Simulator is an Object-Tcl “shell” • Split Objects • fine-grain, easily composed • objects exist in both C++ and Tcl contexts • library handles object consistency

Work in Progress • Adaptive Web Caching (LBNL, UCLA) • Nam Improvements (USC, ISI) • Simulator Scaling (USC, ISI) • Simulator Addressing Hierarchy (USC, ISI) • Protocol Robustness (USC, ISI) • Emulation (LBNL, UCB) • Quality of Service (Xerox PARC) • Router-Based Congestion Control (LBNL) • Topology and “Scenario” Generation

Router-Based Congestion Control • Two main classes of traffic on Internet: • TCP (reduces sending rate in face of loss) • UDP (application decides when and how much to send) • Internet stability due in large part to TCP’s congestion response • Danger with growing use of UDP-based applications • UDP will “steal” bandwidth from TCP • currently no incentives to prevent this behavior

Encouraging Congestion Control • Combine RED Gateway with analysis and regulation • RED (Random Early Detection) Gateways: • keep smoothed average queue size measure • when measure exceeds threshold, drop or mark packets with increasing probability • a flow’s fraction of the aggregate random packet drop rate is roughly equal to its fraction of the aggregate arrival rate • Select candidate “bad” flows with high drop rate
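
The two RED mechanisms named above, a smoothed average queue size and a drop probability that rises once the average passes a threshold, can be sketched as follows. This is a simplified illustration: it leaves out refinements of the full algorithm such as spacing drops by the count of packets since the last drop and handling idle periods. The class name and the default parameter values are assumptions chosen for illustration.

```python
import random

class RedQueue:
    """Simplified RED drop decision (illustrative parameters)."""

    def __init__(self, min_th=5.0, max_th=15.0, max_p=0.1,
                 weight=0.002, seed=0):
        self.min_th = min_th    # below this average, never drop
        self.max_th = max_th    # at or above this average, always drop
        self.max_p = max_p      # drop probability just below max_th
        self.weight = weight    # EWMA weight for the average
        self.avg = 0.0
        self.rng = random.Random(seed)

    def update_average(self, queue_len):
        """Exponentially weighted moving average of the queue size."""
        self.avg = (1.0 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def drop_probability(self):
        """Probability rises linearly between the two thresholds."""
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        span = self.max_th - self.min_th
        return self.max_p * (self.avg - self.min_th) / span

    def should_drop(self, queue_len):
        """Update the average for an arrival and decide its fate."""
        self.update_average(queue_len)
        return self.rng.random() < self.drop_probability()
```

Because each arriving packet faces the same drop probability, a flow that sends a larger share of the packets collects a proportionally larger share of the drops, which is what makes the drop history usable for picking candidate flows.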

“Bad” Flow Selection Criteria • Flow is not “TCP-friendly” • throughput exceeds factor times analytic model: • Flow is not responsive • does not alter arrival rate with increased packet drops • Flow is “high-bandwidth” • uses more than its “fair share”
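
The analytic model itself is not reproduced in this text. As a sketch of how the first test might look, the code below assumes the widely cited steady-state bound on TCP throughput, T <= 1.5 * sqrt(2/3) * B / (R * sqrt(p)), where B is the packet size in bytes, R the round-trip time in seconds, and p the packet drop rate. Whether this is the exact model on the original slide is an assumption; the function names and the `factor` parameter are hypothetical.

```python
import math

def tcp_friendly_rate(packet_bytes, rtt_s, drop_rate):
    """Assumed upper bound on a conformant TCP flow's rate (bytes/s)."""
    if not 0.0 < drop_rate <= 1.0:
        raise ValueError("drop_rate must be in (0, 1]")
    if rtt_s <= 0.0:
        raise ValueError("rtt_s must be positive")
    return (1.5 * math.sqrt(2.0 / 3.0) * packet_bytes
            / (rtt_s * math.sqrt(drop_rate)))

def is_tcp_friendly(measured_bytes_per_s, packet_bytes, rtt_s,
                    drop_rate, factor=1.0):
    """True if the measured rate stays within factor times the bound."""
    bound = tcp_friendly_rate(packet_bytes, rtt_s, drop_rate)
    return measured_bytes_per_s <= factor * bound
```

A router generally cannot observe a flow's round-trip time or packet size directly, so in practice conservative estimates would have to be supplied for both, and `factor` would leave some headroom before a flow is flagged.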

Flow Regulation • Need bandwidth-regulating packet scheduler • CBQ • others • Use “good” and “bad” scheduling partitions • Bad partition gets allocation below current usage • decays over time with continued offered load • flows may be reclassified as “ok” if they adapt

Conclusion • Simulating the Internet is difficult • Simulation is useful, but must be used carefully • The VINT project provides a common simulation framework that addresses many of the issues

Additional Information • Web pages: • http://www-nrg.ee.lbl.gov/ • http://www-mash.cs.berkeley.edu/ns • http://netweb.usc.edu/vint • http://www.ito.darpa.mil/Summaries97/E243_0.html • NS Users Mailing list: • majordomo@mash.cs.berkeley.edu • “subscribe ns-users”
