Network Simulation Performance Optimizations, and the Need for Validation. David M. Nicol University of Illinois, Urbana-Champaign. www.project-moses.net. Overview. Simulation of very large-scale networks introduces some interesting technical challenges Fast generation of background traffic
Network Simulation Performance Optimizations, and the Need for Validation David M. Nicol University of Illinois, Urbana-Champaign www.project-moses.net
Overview Simulation of very large-scale networks introduces some interesting technical challenges • Fast generation of background traffic • Integration of real devices with simulation Talk describes background and solutions for these problems • We then point out how each optimization creates interesting challenges for validation
High Performance Simulation of Low-Resolution Traffic Flows
Motivation Large-scale network simulations with • “background” traffic where details aren’t needed • Congestion affecting results • traffic where principal interest is delivered volume • e.g. worm scans, flooding attack • Our specific motivation is for cyber-defense training (RINSE) Possible solution : simulate such traffic as “flows” at a coarse time-scale • Inject flow rates at edge of network • Compute delivered volume for each flow • Compute link utilization throughout network Challenges: • Capture interactions between flows, routing infrastructure, fine scale traffic
Big Picture
Define a time-step larger than the end-to-end latency (e.g. 1 sec). Each time-step:
• Define (src, dest, rate) triples
• At all network ingress points
• Rate can depend on feedback
• “Push” flows through the network
• Fine time-scale traffic is viewed in aggregate, with its own (historical) flow rates
• Routing is based on forwarding tables
• Loss occurs at router ports where the aggregate input rate exceeds the port bandwidth
• Record bandwidth consumption
[Figure: backbone networks AT&T, AboveNet, Exodus, Cable&Wireless, Level3, Verio, Sprint, UUNet]
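Modeling Congestion
Even though flows are acyclic, dependency cycles may form in the definition of flow rates.
[Equation lost in extraction: the output rate of each flow at a port is defined from the port's input rates, with one case for no congestion and an “otherwise” case.]
[Figure lost in extraction: congested ports whose “in” and “out” flow rates depend on one another.]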
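The port-level loss rule (loss where the aggregate input rate exceeds the port bandwidth) can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes that a congested port scales every input flow by the same factor, which matches the "fair proportion" wording used in the state change rules but is not spelled out in the surviving slide text. The function name `port_outputs` and the flow labels are hypothetical.

```python
def port_outputs(input_rates, bandwidth):
    """Return the output rate of every flow crossing one port.

    input_rates: dict mapping a flow label to its input rate at this port.
    bandwidth:   output bandwidth of the port, in the same units.
    Assumption: under congestion every flow is scaled by bandwidth / total.
    """
    total = sum(input_rates.values())
    if total <= bandwidth:
        # No congestion: every flow passes through unchanged.
        return dict(input_rates)
    scale = bandwidth / total
    return {flow: rate * scale for flow, rate in input_rates.items()}


# Two flows offer 150 units to a port that can carry 100.
rates = port_outputs({"a": 60.0, "b": 90.0}, bandwidth=100.0)
uncongested = port_outputs({"a": 10.0, "b": 20.0}, bandwidth=100.0)
```

Because a flow's output at one port is its input at the next port on its path, two ports that each carry a flow already shaped by the other end up with mutually dependent rates, which is the dependency cycle the slide refers to.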
Resolution and Transparency
Try to resolve final output flow values based on upper bounds.
• All of a port’s final output flows can be resolved once all of its input flow values are resolved
• But to break cycles we need to be smarter…
• A port is transparent if the sum of input rate bounds is no greater than the output bandwidth
• Notice that every output flow is bounded from above by its input flow rate, so every flow can be bounded by its ingress rate
Example (the accompanying formulas and figure were lost in extraction):
1. Flow rate becomes resolved
2. Port becomes resolved
3. Flows become resolved; port becomes transparent
4. Repeat
Dependency Reduction Formalization Flow states are {settled, bounded} Port states are {resolved, transparent, unresolved} A port’s state may change, depending on input flows An output flow state may settle, when the port state becomes resolved or transparent Iterate: { • Select port or flow whose state may change • Process state/value change • Identify ports/flows affected by the change }
State Change Rules
Port states are {resolved, transparent, unresolved}; flow states are {settled, bounded}.
Rule 1: port resolution
• Pre-condition: Port state is not resolved and all input flow states are settled
• Action: Mark port state as resolved, compute all output flow values, mark each as settled
Rule 2: port transparency
• Pre-condition: Port state is unresolved and the sum of input rate bounds is less than the bandwidth
• Action: Mark port state as transparent. For every input rate that is settled, mark the corresponding output rate as settled
Rule 3: settle state transition
• Pre-condition: Port state is transparent, some input flow is settled, and the corresponding output flow is not
• Action: Mark the corresponding output flow as settled, with value equal to the input flow value
Rule 4: flow bound transition #1
• Pre-condition: Port state is unresolved, and the fair proportion (relative to settled flows) of an input flow rate exceeds the bound on the output flow. The fair proportion is computed using the sum of settled flow rates
• Action: Lower the corresponding output flow bound to be equal to the fair proportion of the input flow bound
Rule 5: flow bound transition #2
• Pre-condition: Port state is not resolved, and the flow rate bound of an input flow is less than the corresponding output flow bound
• Action: Set the bound of the output flow equal to the bound on the input rate
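The iteration over these rules can be sketched as a simple loop that keeps applying rules until nothing changes. This is a sketch under stated assumptions, not the authors' implementation: it covers Rules 1, 2, 3 and 5 only (Rule 4 is omitted), it assumes proportional scaling at a congested port, it uses the "no greater than" form of the transparency test, and it sweeps all ports repeatedly rather than selecting only the ports or flows whose state may change. The name `reduce_dependencies` and the data layout are hypothetical.

```python
def reduce_dependencies(bandwidth, paths, ingress):
    """bandwidth: {port: capacity}; paths: {flow: [port, ...]}; ingress: {flow: rate}.

    value[f][h] is the bound (or settled value) of flow f after h hops;
    hop 0 is the ingress rate, which is settled from the start.
    """
    # Every flow is bounded by its ingress rate (Rule 5 applied up front).
    value = {f: [ingress[f]] * (len(p) + 1) for f, p in paths.items()}
    settled = {f: [True] + [False] * len(p) for f, p in paths.items()}
    state = {p: "unresolved" for p in bandwidth}
    inputs = {p: [] for p in bandwidth}  # (flow, hop) pairs entering each port
    for f, path in paths.items():
        for h, p in enumerate(path):
            inputs[p].append((f, h))

    changed = True
    while changed:
        changed = False
        for p, ins in inputs.items():
            if state[p] == "resolved":
                continue
            total = sum(value[f][h] for f, h in ins)
            if all(settled[f][h] for f, h in ins):
                # Rule 1: all inputs settled, so compute and settle outputs.
                scale = 1.0 if total <= bandwidth[p] else bandwidth[p] / total
                for f, h in ins:
                    value[f][h + 1] = value[f][h] * scale
                    settled[f][h + 1] = True
                state[p] = "resolved"
                changed = True
                continue
            if state[p] == "unresolved" and total <= bandwidth[p]:
                # Rule 2: input bounds fit, so the port cannot lose traffic.
                state[p] = "transparent"
                changed = True
            for f, h in ins:
                if settled[f][h + 1]:
                    continue
                if state[p] == "transparent" and settled[f][h]:
                    # Rule 3: a settled input passes through unchanged.
                    value[f][h + 1] = value[f][h]
                    settled[f][h + 1] = True
                    changed = True
                elif value[f][h] < value[f][h + 1]:
                    # Rule 5: an output bound never exceeds its input bound.
                    value[f][h + 1] = value[f][h]
                    changed = True
    return state, value


# Three ports in a ring; each flow enters at one port and exits at the next.
paths = {"f1": ["A", "B"], "f2": ["B", "C"], "f3": ["C", "A"]}
ingress = {"f1": 60.0, "f2": 60.0, "f3": 60.0}
state_open, value_open = reduce_dependencies({"A": 150.0, "B": 100.0, "C": 100.0}, paths, ingress)
state_cyclic, _ = reduce_dependencies({"A": 100.0, "B": 100.0, "C": 100.0}, paths, ingress)
```

In the first call port A is transparent, which settles one input of B and lets the whole ring resolve. In the second call every port is potentially congested, so all three ports remain unresolved and must be handled by cycle resolution.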
Cycle Resolution
After all that, we may still be left with cycles of unresolved ports. The general problem is the solution of a system of non-linear equations.
• Solution methods are generally iterative
• The number of iterations, and the cost of each iteration, is the principal issue
• We explore “fixed-point” iteration. Each iteration:
• freeze all input rates
• compute output rates based on the frozen input rates
• compare new solutions with old for convergence
• Our experiments define convergence as a relative difference between successive flow value solutions of less than 0.1% for all flow values
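The freeze, compute, compare loop can be sketched as below. This is an illustrative sketch rather than the authors' solver: it assumes proportional scaling at congested ports, it iterates over every port in the example instead of only the ports left in cycles, and the name `fixed_point` and the `max_iter` cap are hypothetical. The tolerance `tol=1e-3` corresponds to the 0.1% relative difference used in the experiments.

```python
def fixed_point(bandwidth, paths, ingress, tol=1e-3, max_iter=100):
    """Iterate flow values until successive solutions differ by < tol (relative).

    value[f][h] is the rate of flow f after h hops; hop 0 is the ingress rate.
    Returns the converged values and the number of iterations used.
    """
    value = {f: [ingress[f]] * (len(p) + 1) for f, p in paths.items()}
    for iteration in range(1, max_iter + 1):
        new = {f: list(v) for f, v in value.items()}
        for port, capacity in bandwidth.items():
            ins = [(f, h) for f, path in paths.items()
                   for h, q in enumerate(path) if q == port]
            # Input rates are read from the frozen previous solution.
            total = sum(value[f][h] for f, h in ins)
            scale = 1.0 if total <= capacity else capacity / total
            for f, h in ins:
                new[f][h + 1] = value[f][h] * scale
        converged = all(
            abs(n - o) < tol * max(abs(o), 1e-12)
            for f in value for n, o in zip(new[f], value[f])
        )
        value = new
        if converged:
            return value, iteration
    raise RuntimeError("fixed-point iteration did not converge")


# Ring of three congested ports: no port can be resolved by the rules alone.
paths = {"f1": ["A", "B"], "f2": ["B", "C"], "f3": ["C", "A"]}
ingress = {"f1": 60.0, "f2": 60.0, "f3": 60.0}
solution, iterations = fixed_point({"A": 100.0, "B": 100.0, "C": 100.0}, paths, ingress)
```

In this symmetric example the first-hop rate y of each flow satisfies y(60 + y) = 6000, so the iteration settles near y ≈ 53.07.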
Experiments Topologies obtained from Rocketfuel database of observed Internet topologies Traffic loads derived from Poisson-Pareto Burst Processes We ask • How many cycles form, as a function of load? • How many iterations needed to converge, as a function of load? • How fast does it run? • What is speedup relative to pure packet simulation? • What is the accuracy? • Does it scale on a parallel machine?
Results
Convergence behavior
• Examine # ports in cycles and # iterations for convergence
• Vary topology
• 50% average link utilization

Topology   #routers   #links   #flows   Mbps
Top-1      27         88       702      100
Top-2      244        1080     12200    2488
Top-3      610        3012     61000    2488
Top-4      1126       6238     168900   2488

Topology   median #ports in cycles   median #iterations
Top-2      20                        5
Top-3      40                        9
Top-4      125                       11

Dependency reduction is effective. The fixed-point algorithm converges quickly.
Results
Experiments run on a PC: 1.5 GHz CPU, 3 GB memory, Linux OS.

Topology   secs/time-step (20% link util.)   secs/time-step (50% link util.)
Top-1      0.0026                            0.0026
Top-2      0.051                             0.051
Top-3      0.283                             0.285
Top-4      0.852                             0.907

For a 1 sec time-step, faster than real-time on a model equivalent to 1.9G pkt-evts/sec (1K bytes/pkt).
Results
Directly compare with packet-oriented simulation, using exactly the same input flow rates, on Top-1. Experiments run on a PC: 1.5 GHz CPU, 3 GB memory, Linux OS.

Link util.   speedup
10%          213
20%          1665
30%          2112
40%          2728
50%          3436
60%          3725
70%          1023
80%          1135

Speedup over a wide range of loads.
Results
Experiments gather statistics of foreground UDP and TCP flows, comparing equivalent packet-based and fluid-based background flows. UDP foreground traffic is largely insensitive to the difference in background flows.
TCP goodput accuracy Compare pure packet and fluid background flows on ATT model (Top-1)
Results
Experiment: run on a 3.2 GHz Xeon cluster with a Myrinet backplane, using 1, 2, 4, 8, 16, and 32 processors.
• Exodus backbone per processor; # flows = 118,828 × # procs
Results:
• Substantial speedup of a fixed problem size on 2-32 processors
• Excellent scaling as problem size and # processors increase
[Figure: Execution Time, Fixed Problem Size; x-axis: Number of Processors]
[Figure: Speedup, Fixed Problem Size; x-axis: Number of Processors]
Scalability
Increase problem size and # processors linearly (same work per processor).
[Figure: x-axis: Number of Processors]
Summary • Coarse scale simulation of network flows is a necessary component of large-scale network simulation • We’ve shown how to do it efficiently • Faster than real-time on large problems • Accurate enough for the training context for which it was designed • Efficiently parallelizable
Observations on Validation • In the motivating context we cared about validation of other behaviors as they interact with abstracted background traffic • We did not need to capture fine details in background flows • What are the most important behavioral requirements on the background flows (e.g. end-to-end loss, latency) against which they can be validated? • Is there a mathematical way of doing the validation without the need for huge experiments? • E.g., validated discrete-event fluid formulation of TCP [Nicol&Yan, TOMACS 2003].
Outline • Introduction to network emulation. • Real-time network simulation. • RINSE, a real-time simulator for cyber-security exercises.
Available Tools for Network Researchers
[Figure: tools plotted by fidelity (level of detail) against scalability (size of the network): physical testbed, emulation, simulation, analytical model. Annotations: parallel simulation, real-time simulation, fluid models, multi-scale models.]
What’s Emulation? • Simulation is purely virtual. • Separation between virtual time and real-time clock. • Emulation: part of the system involves physical devices or real applications. • Most likely, physical devices and real applications operate in real time. • The emulated system must keep up with the wall-clock time.
Why Emulation? • Improve model validity as real-world applications/devices are used. • Maintain flexibility and repeatability (to some extent) as simulation models are still part of the system. • Reduce software development time as implementations of network protocols and services are used both in real world and in model world.
Real-Time Network Simulation • Simulation-based emulation, or event-driven network emulation. • Run network simulation and interact with real applications in real time. • Some existing real-time network simulators: • NSE (Fall, 1999): an emulation extension of ns-2. • IP-TNE (Bradford et al., 2000): first to combine emulation with parallel simulation. • MaSSF (Liu et al., 2003): added emulation capabilities in SSFNet for grid computing. • Maya (Zhou et al., 2004): an emulation extension of QualNet for wireless mobile networks.
Real-Time Network Simulation • Major benefits: • Provide the flexibility of running detailed network models and the capability of emulation. • One can study the performance of network applications under controllable and repeatable networking conditions. • One can study network performance using realistic application traffic. • One can monitor and change the state evolution of the network model in real time.
Real-Time Requirements • Responsiveness: real-time simulation must be able to interact with real applications in real time. • Techniques supporting real-time interactions include source-code modification, packet capturing, link/run-time library replacement, kernel virtualization, executable modification. • Timeliness: the simulation system must be able to run the network model in real time. • Techniques supporting high-performance simulation include parallel and distributed simulation, and multi-scale modeling.
Network Viewer Client • GUI for user to monitor and control the virtual network. • Five command types: • Attack: launching attack against a simulated network entity. • Defense: initiating defense mechanisms, such as packet filtering. • Diagnostic networking tools: monitoring network health. • Device control: controlling virtual network devices. • Simulator data collection.
RINSE Capabilities • To support efficient modeling of large-scale networks: • iSSFNet is a network simulator developed with parallel and distributed simulation capabilities. • Multi-scale network traffic modeling, including both packet and fluid-oriented models. • To support efficient modeling of network security attacks: • Multi-scale models focusing on network assets, such as network bandwidth, and computational and memory resources. • To support real-time interactive simulation: • Augmented emulation capabilities in SSF. • Latency hiding technique for latency sensitive applications. • To support fault-tolerance for longer simulation life cycles: • Checkpoint and restart. • Automatic failover with support for emulation speed adjustment.
Interaction through CPU/memory resources
• Load-dependent processing costs associated with each stack layer
• e.g. crypto, filtering, application
• High load through filters
• reduces throughput
• induces packet loss
Packet loss on a TCP stream induces window collapse at the sender.
[Figure: protocol stack with TransactionServer, Transactions, Services, TCP, UDP, ICMP, IPv4, Filter, Interface]
Emulation Extensions in RINSE • Emulation support is an extension to the popular SSF simulation API. • The definitions of input and output channels are extended to import and export simulation events. • Capability of dynamically throttling the emulation speed. • Necessary for supporting fault tolerance. • A priority-based scheduling algorithm for real-time events. • To reduce timing faults in real-time simulation. • Latency absorption technique for latency-sensitive applications.
Latency Absorption
A real packet created at real time $t_R$ is injected into the simulator at real time $t'_R$. The agent application computes a virtual time deficit of $d = (t'_R - t_R)/\alpha$, where $\alpha$ is the emulation speed. At the network interface, the packet is inserted before packet $k$, where $k = \max\{\, i \mid i \ge 0 \text{ and } d < \sum_{j=i}^{N} (\delta + \tau_j) \,\}$, where $\delta$ is the link delay, $\tau_j$ is the transmission delay of packet $j$, and $N$ is the number of packets in the queue.
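The insertion rule can be sketched as a scan from the tail of the interface queue, accumulating link delay plus transmission delay until the accumulated delay exceeds the deficit. This is a sketch under assumptions, not RINSE code: the queue is represented only by a list of per-packet transmission delays, indices are 0-based list positions (the slide's indexing convention is not recoverable), and falling back to position 0 when the deficit exceeds the delay of the whole queue is an assumption. The name `insertion_index` and its parameters are hypothetical.

```python
def insertion_index(t_created, t_injected, speed, link_delay, tx_delays):
    """Return (deficit, k): the virtual time deficit and the queue position
    before which the real packet is inserted.

    tx_delays[j] is the transmission delay of the j-th queued packet.
    """
    deficit = (t_injected - t_created) / speed
    suffix = 0.0
    k = 0  # assumption: insert at the head if no position qualifies
    # Scan from the tail; the first position whose suffix delay exceeds the
    # deficit is the largest qualifying index, because suffix sums only grow.
    for i in range(len(tx_delays) - 1, -1, -1):
        suffix += link_delay + tx_delays[i]
        if deficit < suffix:
            k = i
            break
    return deficit, k


# Four queued packets, each with transmission delay 1.0, on a link with delay 1.0.
deficit, k = insertion_index(0.0, 3.0, speed=1.0, link_delay=1.0,
                             tx_delays=[1.0, 1.0, 1.0, 1.0])
_, k_large = insertion_index(0.0, 100.0, speed=1.0, link_delay=1.0,
                             tx_delays=[1.0, 1.0, 1.0, 1.0])
```

In the first call the deficit is 3.0; the last packet alone accounts for 2.0 of queued delay and the last two for 4.0, so the real packet is placed before position 2.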
What about Validation?
Where does the model take liberties with reality?
• Latency between actual clients and simulated servers
• Best-effort ordering of unsynchronized inputs
• High-level model of resource consumption and its impact on behavior
What metrics do we use to validate?