Distributed simulation with MPI in ns-3
Joshua Pelkey and Dr. George Riley
WNS3, March 25, 2011
Overview
• Standard sequential simulation techniques with substantial network traffic
  • Lengthy execution times
  • Large amount of computer memory
• Parallel and distributed discrete event simulation [1]
  • Allows a single simulation program to run on multiple interconnected processors
  • Reduced execution time! Larger topologies!
Overview (cont.)
• Important note
  • A distributed simulation must produce the same results as the identical sequential simulation
Overview: terminology
• Logical Process (LP)
  • An individual sequential simulation
• Rank or system id
  • The unique number assigned to each LP
Figure 1. Simple point-to-point topology, distributed
Overview: related work
• Parallel/Distributed ns (PDNS) [2]
• Georgia Tech Network Simulator (GTNetS) [3]
• Both use a federated approach and a conservative (blocking) mechanism
Implementation Details in ns-3
• LP communication
  • Message Passing Interface (MPI) standard
  • Send/Receive time-stamped messages
  • MpiInterface in ns-3
• Synchronization
  • Conservative algorithm using lookahead
  • DistributedSimulator in ns-3
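The idea behind conservative synchronization with lookahead can be illustrated with a small, self-contained sketch. It is a generic illustration, not the ns-3 DistributedSimulator code: the function names, the event lists, and the choice of integer milliseconds are all assumptions made for the example. Each LP reports the timestamp of its next pending event; no message sent by any LP can arrive earlier than the smallest of those timestamps plus the lookahead, so every LP may process events up to that bound before synchronizing again.

```python
def safe_time(next_event_times, lookahead):
    """Bound up to which every LP may process events without risking
    a message from another LP arriving in its past."""
    return min(next_event_times) + lookahead


def split_by_grant(events, grant):
    """Split a sorted list of event timestamps into those that may run
    in this window and those that must wait for the next one."""
    ready = [t for t in events if t <= grant]
    deferred = [t for t in events if t > grant]
    return ready, deferred


# Hypothetical pending event timestamps (in ms) on two LPs.
lp_events = {0: [10, 150, 400], 1: [120, 260]}
lookahead = 200  # ms; illustrative delay of the link joining the LPs

# In a real run the minimum would be agreed on via MPI; here it is local.
grant = safe_time([events[0] for events in lp_events.values()], lookahead)
ready0, deferred0 = split_by_grant(lp_events[0], grant)
ready1, deferred1 = split_by_grant(lp_events[1], grant)

print(grant, ready0, deferred0, ready1, deferred1)
```

Once every LP has processed its ready events, the LPs exchange next-event times again and compute a new bound.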
Implementation Details in ns-3 (cont.)
• Assigning rank to nodes
  • Handled manually in simulation script
• Remote point-to-point links
  • Created automatically between nodes with different ranks through point-to-point helper
  • When a packet is set to cross a remote point-to-point link, the packet is transmitted via MPI using our interface
• Merged since ns-3.8
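The rule for remote links, a link is remote when its two endpoints carry different ranks, can be sketched as follows. This is plain Python rather than the ns-3 point-to-point helper; the node names, the rank table, and the `remote_links` function are hypothetical.

```python
def remote_links(node_rank, links):
    """Return the links whose endpoints live on different LPs."""
    return [(a, b) for a, b in links if node_rank[a] != node_rank[b]]


# Hypothetical four-node chain split across two LPs (ranks 0 and 1).
node_rank = {"n0": 0, "n1": 0, "n2": 1, "n3": 1}
links = [("n0", "n1"), ("n1", "n2"), ("n2", "n3")]

crossing = remote_links(node_rank, links)
print(crossing)  # only the n1-n2 link crosses the LP boundary
```

Packets on the crossing link are the ones that would be handed to MPI; packets on the other two links stay inside a single LP.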
Implementation Details in ns-3: limitations
• All nodes created on all LPs, regardless of rank
  • It is up to the user to only install applications on the correct rank
• Nodes are assigned rank manually
  • An MpiHelper class could be used to assign rank to nodes automatically. This would enable easy distribution of existing simulation scripts.
• Pure distributed wireless is currently not supported
  • At least one point-to-point link must exist in order to divide the simulation
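Because every LP builds the full topology, the script itself has to skip application setup for nodes owned by another rank. A minimal sketch of that guard is shown below; the `Node` class, `install_apps` function, and application names are hypothetical stand-ins, not ns-3 classes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a simulation node."""
    name: str
    system_id: int                      # rank of the LP that owns the node
    apps: list = field(default_factory=list)


def install_apps(nodes, my_rank, app_name):
    """Install an application only on nodes owned by this LP."""
    for node in nodes:
        if node.system_id == my_rank:   # guard: skip nodes of other ranks
            node.apps.append(app_name)


# Every LP creates all four nodes, but rank 1 configures only its own.
nodes = [Node("n0", 0), Node("n1", 0), Node("n2", 1), Node("n3", 1)]
install_apps(nodes, my_rank=1, app_name="sink")

configured = [n.name for n in nodes if n.apps]
print(configured)
```

Running the same script with `my_rank=0` would configure `n0` and `n1` instead, which is how one script serves all LPs.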
Performance Study
• DARPA NMS campus network simulation
  • Using the nms-p2p-nix-distributed example available in ns-3
• Allows creation of very large topologies
  • Any number of campus networks (CNs) are created and connected together
  • Different campus networks can be placed on different LPs
• Tested with 2, 4, 6, 8, and 10 CNs
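One simple way to place campus networks on LPs is round-robin assignment. The sketch below assumes that policy for illustration only; it is not taken from the nms-p2p-nix-distributed example, and the function name is hypothetical.

```python
from collections import Counter


def place_campus_networks(num_cns, num_lps):
    """Map each campus network index to an LP rank, round-robin."""
    return {cn: cn % num_lps for cn in range(num_cns)}


# Illustrative case: 10 campus networks spread over 4 LPs.
placement = place_campus_networks(10, 4)
load = Counter(placement.values())  # campus networks per LP

print(placement)
print(dict(load))
```

With 10 CNs on 4 LPs the load is uneven (two LPs hold three CNs, two hold two), which is one reason to pick a CN count that divides evenly across the LPs.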
Performance Study: campus network topology
Figure 2. Campus network topology block [4] (figure labels: 200 ms, 10 µs)
Performance Study: Georgia Tech clusters used
• Hogwarts Cluster
  • 6 nodes, each with 2 quad-core processors and 48 GB of RAM
• Ferrari Cluster
  • Mix of machines, including 3 quad-core nodes and 8 dual-core nodes
Performance Study: simulations on Hogwarts
Figure 3. Campus network simulations on Hogwarts with (A) 2 CNs (B) 4 CNs (C) 6 CNs (D) 8 CNs (E) 10 CNs
Performance Study: simulations on Ferrari
Figure 4. Campus network simulations on Ferrari with (A) 2 CNs (B) 4 CNs (C) 6 CNs (D) 8 CNs (E) 10 CNs
Performance Study: speedup
Figure 5. Speedup using distributed simulation for campus network topologies on the (A) Hogwarts cluster and (B) Ferrari cluster
Performance Study: speedup (cont.)
• Linear speedup for Hogwarts, but not for Ferrari
  • Further investigation revealed that Ferrari consists of a mix of machines, with the first two nodes considerably faster
Table 1: Speedup for Hogwarts and Ferrari
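For reference, speedup is the sequential run time divided by the distributed run time, and linear speedup means that ratio equals the number of LPs. The sketch below uses made-up timings purely to show the arithmetic; they are not measurements from Hogwarts or Ferrari.

```python
def speedup(t_sequential, t_distributed):
    """Ratio of sequential to distributed wall-clock time."""
    return t_sequential / t_distributed


def efficiency(t_sequential, t_distributed, num_lps):
    """Speedup per LP; 1.0 corresponds to linear speedup."""
    return speedup(t_sequential, t_distributed) / num_lps


# Hypothetical timings in seconds, NOT results from the study.
t_seq, t_dist, lps = 120.0, 30.0, 4

s = speedup(t_seq, t_dist)
e = efficiency(t_seq, t_dist, lps)
print(s, e)
```

An efficiency below 1.0 on a heterogeneous cluster is expected, since the slowest machine bounds the pace of every synchronization window.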
Performance Study: changing the lookahead
• The lookahead was varied from 200 ms to 10 µs by changing the delay between campus networks
• For Hogwarts and Ferrari, the 10 µs simulations ran, on average, 25% and 47% slower, respectively
• As expected, a smaller lookahead decreases the potential speedup, because the simulators must synchronize more frequently
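A rough way to see why the lookahead matters: if each synchronization window advances simulated time by at least one lookahead, the number of synchronization rounds is bounded by the simulated duration divided by the lookahead. The sketch below computes that bound for the two delays used above; the one-second duration and the function name are assumptions for illustration, and real runs can need far fewer rounds when events are sparse.

```python
def max_sync_rounds(duration_us, lookahead_us):
    """Upper bound on synchronization rounds, assuming each round
    advances simulated time by at least one lookahead."""
    return -(-duration_us // lookahead_us)  # ceiling division on integers


ONE_SECOND_US = 1_000_000  # illustrative simulated duration

rounds_200ms = max_sync_rounds(ONE_SECOND_US, 200_000)
rounds_10us = max_sync_rounds(ONE_SECOND_US, 10)

print(rounds_200ms, rounds_10us, rounds_10us // rounds_200ms)
```

The bound grows by a factor of 20,000 when the lookahead shrinks from 200 ms to 10 µs, while the measured slowdown was only 25% to 47%, which suggests synchronization cost was small relative to event processing in these runs.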
Future Work
• MpiHelper class to facilitate creating distributed topologies
  • Nodes assigned rank automatically
  • Existing simulation scripts could be distributed easily
• Distributing the topology could occur at the node level, rather than the application level
  • Ghost nodes to save memory
• Pure distributed wireless support
Summary
• Distributed simulation in ns-3 allows a user to run a single simulation in parallel on multiple processors
• Very large-scale simulations can be run in ns-3 using the distributed simulator
• Distributed simulation in ns-3 can achieve linear speedup, the best case, compared to identical sequential simulations
References
[1] R. M. Fujimoto. Parallel and Distributed Simulation Systems. Wiley Interscience, 2000.
[2] PDNS - Parallel/Distributed ns. http://www.cc.gatech.edu/computing/compass/pdns, March 2004.
[3] G. F. Riley. The Georgia Tech Network Simulator. In Proceedings of the ACM SIGCOMM Workshop on Models, Methods and Tools for Reproducible Network Research, MoMeTools ’03, pages 5-12, New York, NY, USA, 2003. ACM.
[4] Standard baseline NMS challenge topology. http://www.ssfnet.org/Exchange/gallery/baseline, July 2002.