
EventWave : Programming Model and Runtime Support for Tightly-Coupled Elastic Cloud Applications

Wei-Chiu Chuang, Bo Sang, Sunghwan Yoo, Rui Gu, Charles Killian, Milind Kulkarni


Presentation Transcript


  1. EventWave: Programming Model and Runtime Support for Tightly-Coupled Elastic Cloud Applications • Wei-Chiu Chuang, Bo Sang, Sunghwan Yoo, Rui Gu, Charles Killian, Milind Kulkarni

  2. Motivation [Figure: clients connected to a server hosting a world of buildings and rooms; plots of # clients and response time over time]

  3. Motivation • Scale up with the number of clients • Elasticity is hard [Figure: clients and server; plots of # clients and response time over time]

  4. Objectives • A programming model which supports: • Stateful computation • Transparent elasticity • Simple sequential semantics

  5. Related Work • MapReduce [Dean et al. OSDI '04] • Stateless • No simple sequential semantics • Data flow • Dryad [Isard et al. EuroSys '07] • No stateful computation • No transparent elasticity • CIEL [Murray et al. NSDI '11] • Live Migration of Virtual Machines [Clark et al. NSDI '05] • Does not change scale: "split"/"merge" state • Zephyr [Elmore et al. SIGMOD '11] • Live migration • Transactional, reconcile conflicts • Orleans [Bykov et al. SoCC '11] • Scalable programming model

  6. Event-Driven Systems • Typical event-driven systems are not scalable. [Figure: a client sends events 1, 2, 3 to the server's event queue; Event 1 commits, then Event 2, then Event 3]

  7. Context • Scalability comes from parallelism • Partition program state into contexts • An event accesses one or more contexts • Events accessing disjoint contexts can run in parallel • Contexts enable implicit parallelism [Figure: world, building, room, and hallway contexts]
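The rule on this slide (events touching disjoint contexts can run in parallel) can be illustrated with a minimal Python sketch. This is not EventWave code: the `Event` class, the `batch_events` helper, and the flat context names are all assumptions made for illustration. It greedily groups a queue, in arrival order, into batches whose events share no context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    contexts: frozenset  # names of the contexts this event accesses (hypothetical)


def batch_events(queue):
    """Group events, in arrival order, into batches of mutually disjoint events."""
    batches = []
    current, used = [], set()
    for event in queue:
        if used & event.contexts:
            # Conflict with an earlier event in this batch: start a new batch.
            batches.append(current)
            current, used = [], set()
        current.append(event)
        used |= event.contexts
    if current:
        batches.append(current)
    return batches


queue = [
    Event("e1", frozenset({"Room<1>"})),
    Event("e2", frozenset({"Room<2>"})),
    Event("e3", frozenset({"Room<1>", "Room<2>"})),
    Event("e4", frozenset({"Hallway"})),
]
batches = [[e.name for e in batch] for batch in batch_events(queue)]
print(batches)  # [['e1', 'e2'], ['e3', 'e4']]
```

Events within a batch could run concurrently; batches run one after another, which preserves the arrival order between conflicting events.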

  8. EventWave • Stateful • Sequential semantics • Parallelism • Enforce sequential ordering: Event 2 cannot commit until Event 1 commits [Figure: events 1, 2, 3 running in Contexts 1, 2, 3; each event finishes and then commits in order]
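The ordering constraint on this slide (an event may finish early but cannot commit before its predecessor) can be sketched as follows. The `CommitSequencer` class and its ticket numbering are hypothetical names chosen for this illustration, not part of the EventWave runtime.

```python
class CommitSequencer:
    """Events may finish in any order but commit strictly in ticket order."""

    def __init__(self):
        self.next_to_commit = 1   # ticket of the next event allowed to commit
        self.finished = set()     # finished events still waiting to commit
        self.commit_log = []      # tickets in the order they committed

    def finish(self, ticket):
        """Record that an event finished; commit every event now unblocked."""
        self.finished.add(ticket)
        while self.next_to_commit in self.finished:
            self.finished.remove(self.next_to_commit)
            self.commit_log.append(self.next_to_commit)
            self.next_to_commit += 1


seq = CommitSequencer()
seq.finish(2)                      # event 2 finishes first but cannot commit yet
after_two = list(seq.commit_log)
seq.finish(1)                      # event 1 commits, which unblocks event 2
seq.finish(3)
print(after_two, seq.commit_log)   # [] [1, 2, 3]
```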

  9. Access Multiple Contexts • A player can move from one room to another • Remove it from the source room • Insert it into the destination room • An event may access multiple contexts [Figure: world with a player list, buildings, and rooms; players Alice and Bob; Bob moves between Room 1 and Room 2]

  10. Access Multiple Contexts • Must ensure both sequential semantics and parallelism • To be scalable, events cannot access contexts arbitrarily [Figure: events 1 and 2 over Contexts 1 to 3, annotated "Event 2 can't start before event 1 finishes", "Event 1 finishes", "Event 2 commits"]

  11. Hierarchical Contexts • Contexts are not completely independent • The world has many buildings • A building has many rooms [Figure: context tree with world at the root, Building<1> and Building<2> below it, and Room<1> and Room<2> under each building]

  12. Wave of Events • Must access contexts from top to bottom • The hierarchical access enables parallelism [Figure: context tree of world, Building<1>, Building<2>, and their Room<1> and Room<2>]

  13. Wave of Events • Move a player from room 1 to room 2 [Figure: context tree of world, Building<1>, Building<2>, and their Room<1> and Room<2>]

  14. Wave of Events • Allow the next event to access Building<1> [Figure: context tree of world, Building<1>, Building<2>, and their Room<1> and Room<2>]

  15. Wave of Events • Enter Room<1> [Figure: context tree of world, Building<1>, Building<2>, and their Room<1> and Room<2>]

  16. Wave of Events • Remove player • Release exclusive access [Figure: context tree of world, Building<1>, Building<2>, and their Room<1> and Room<2>]

  17. Wave of Events • Enter Room<2> [Figure: context tree of world, Building<1>, Building<2>, and their Room<1> and Room<2>]

  18. Wave of Events • Insert player • Event finishes, releasing all contexts [Figure: context tree of world, Building<1>, Building<2>, and their Room<1> and Room<2>]

  19. Wave of Events [Figure: context tree of world, Building<1>, Building<2>, and their Room<1> and Room<2>]

  20. Wave of Events • Event commits, releasing snapshot [Figure: context tree of world, Building<1>, Building<2>, and their Room<1> and Room<2>]
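The top-to-bottom access rule walked through in slides 12 to 20 can be sketched in Python. This is only an illustration of the ordering constraint, not the runtime's locking or snapshot protocol; contexts are modeled as path tuples, and `ancestors`, `access_plan`, and `is_top_down` are hypothetical helpers.

```python
def ancestors(path):
    """Proper ancestors of a context path, outermost first."""
    return [path[:i] for i in range(1, len(path))]


def access_plan(targets):
    """All contexts an event passes through, ordered top to bottom."""
    needed = set()
    for path in targets:
        needed.update(ancestors(path))
        needed.add(path)
    # Shallower contexts first; ties broken by name for determinism.
    return sorted(needed, key=lambda p: (len(p), p))


def is_top_down(sequence):
    """True if no context is accessed after one of its descendants."""
    seen = []
    for path in sequence:
        if any(len(s) > len(path) and s[:len(path)] == path for s in seen):
            return False
        seen.append(path)
    return True


WORLD = ("world",)
B1 = WORLD + ("Building<1>",)
ROOM1 = B1 + ("Room<1>",)
ROOM2 = B1 + ("Room<2>",)

# Moving a player from Room<1> to Room<2> inside Building<1>.
plan = access_plan([ROOM1, ROOM2])
print([p[-1] for p in plan])       # ['world', 'Building<1>', 'Room<1>', 'Room<2>']
print(is_top_down(plan))           # True
print(is_top_down([ROOM1, B1]))    # False: Building<1> accessed after its room
```

In this sketch, sibling rooms may be visited in either order; only going back up to an ancestor is rejected.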

  21. Distributed Execution • Scale further by executing events across multiple nodes • Map contexts to nodes • Head node [Figure: world, Building<1>, Building<2>, and Room<1> to Room<4> mapped onto nodes, including the head node]

  22. Distributed Execution • Logical node: a set of physical nodes [Figure: one server logical node; client logical nodes #1 and #2]

  23. Distributed Execution

  24. Elasticity • Request nodes from the cloud scheduler • Update the context mapping [Figure: context tree of world, Building<1>, Building<2>, and their Room<1> and Room<2>]

  25. Elasticity • Transfer contexts to the new node [Figure: context tree of world, Building<1>, Building<2>, and their Room<1> and Room<2>]
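The two elasticity steps (update the context mapping, then transfer contexts to the new node) can be sketched as below. The slides do not state a placement policy, so round-robin assignment is purely an assumption here, as are the function names and node labels.

```python
def assign_round_robin(contexts, nodes):
    """Map each context to a node, cycling through the nodes in order."""
    return {ctx: nodes[i % len(nodes)] for i, ctx in enumerate(contexts)}


def scale_out(mapping, nodes):
    """Recompute the mapping for a new node list and report which contexts move."""
    new_mapping = assign_round_robin(list(mapping), nodes)
    moves = {
        ctx: (mapping[ctx], new_mapping[ctx])   # (old node, new node)
        for ctx in mapping
        if mapping[ctx] != new_mapping[ctx]
    }
    return new_mapping, moves


contexts = ["Building<1>", "Building<2>", "Room<1>", "Room<2>"]
mapping = assign_round_robin(contexts, ["node0", "node1"])

# A third node arrives from the (hypothetical) cloud scheduler.
new_mapping, moves = scale_out(mapping, ["node0", "node1", "node2"])
print(moves)
```

Only the contexts listed in `moves` would need their state transferred; the rest stay where they are.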

  26. Evaluation • Does it scale? Microbenchmarks: scalability • What is the cost of migration? Microbenchmarks: migration latency • Case study: multi-player game server • In the paper: key-value store

  27. Microbenchmark-Scalability • Setup: one logical node with a fixed context mapping; EC2 Small Instances (1 vCPU, 1.7 GB RAM, 160 GB local disk); 160 contexts distributed to physical nodes • Measure: throughput

  28. Microbenchmark-Scalability • Takeaway: throughput grows with the number of nodes [Figure: throughput plot; legend P: workload]

  29. Microbenchmark-Migration Latency • Setup: 2 x 8-core 2.0 GHz Xeon, 8 GB RAM; 1 Gb Ethernet connection • Scale does not change

  30. Microbenchmark-Migration Latency • Migrate a 100 MB context • Measure: throughput of events • Finished events must wait for the migration event [Figure: event throughput plot, marking where the migration event commits]

  31. Multi-player Game Server • Setup: server logical node with 1 x Extra Large Instance (head) and 64 x Small Instances; client logical nodes with 128 clients on 16 EC2 Small Instances • Measure: latency

  32. Multi-player Game Server • Synthetic workload [Figure: plot annotated where server contexts spread to 64 physical nodes and where they merge to 1 physical node]

  33. Conclusion • Elasticity is crucial for cloud applications. • Our programming model enables transparent elasticity for tightly-coupled applications. • Case studies show EventWave is efficient. • http://www.macesystems.org

  34. Backups

  35. Language Construct • Declare implicit parallelism • state_variables{ Hallway hw; vector<Room> rooms; } • context Hallway{ int x; } context Room<int>{ int y; } • Mace [Killian et al. PLDI '07] [Figure: Hallway, Room[0], Room[1], ... shown alongside contexts Hallway, Room<0>, Room<1>]

  36. Event Handler • upcall deliver(Message m){ } • Annotation: specify what context to access • upcall[Room<m.roomID>]deliver(Message m){ } • Message(roomID = 2) accesses context Room<2>
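How an annotation such as `[Room<m.roomID>]` could resolve to a concrete context for a given message is sketched below in Python rather than Mace. The `in_context` decorator and `resolve_context` function are hypothetical stand-ins for what the Mace compiler and runtime do; they are not Mace APIs.

```python
from dataclasses import dataclass


@dataclass
class Message:
    roomID: int
    text: str = ""


def in_context(template):
    """Attach a context-name template to a handler (hypothetical helper)."""
    def wrap(handler):
        handler.context_template = template
        return handler
    return wrap


def resolve_context(handler, message):
    """Fill the handler's template with fields of the incoming message."""
    return handler.context_template.format(m=message)


@in_context("Room<{m.roomID}>")
def deliver(message):
    return f"delivered {message.text!r}"


resolved = resolve_context(deliver, Message(roomID=2))
print(resolved)  # Room<2>
```

A dispatcher could use the resolved name to decide which context, and therefore which node, runs the handler.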

  37. Key-value store • Setup: 2 x 8-core 2.0 GHz Xeon, 8 GB RAM; 1 Gb Ethernet connection • Measure: latency

  38. Key-value store

  39. Microbenchmark-Migration Latency • Setup: 2 x 8-core 2.0 GHz Xeon, 8 GB RAM; 1 Gb Ethernet connection • Scale does not change [Figure: context]

  40. Context Migration • Head: update the context-node mapping • Event 1 goes to the old node • Event 3 goes to the new node • Old node: copy context state • New node: replicate context state [Figure: head ordering events 1, M (migration), and 3 across the old and new nodes]
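The routing on this slide (events ordered before the migration event M go to the old node, later ones to the new node) can be sketched as follows. The `Head` class, its ticket counter, and the node labels are assumptions for illustration; the state copy itself is not modeled.

```python
class Head:
    """Orders events and routes each by the mapping in effect at its ticket."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)   # context -> node
        self.next_ticket = 1
        self.log = []                  # (ticket, kind, context, node)

    def _take_ticket(self):
        ticket = self.next_ticket
        self.next_ticket += 1
        return ticket

    def submit(self, context):
        """Order an ordinary event and record the node it is routed to."""
        ticket = self._take_ticket()
        self.log.append((ticket, "event", context, self.mapping[context]))
        return ticket

    def migrate(self, context, new_node):
        """The migration is itself an ordered event; later events see new_node."""
        ticket = self._take_ticket()
        old_node = self.mapping[context]
        self.mapping[context] = new_node
        self.log.append((ticket, "migrate", context, f"{old_node}->{new_node}"))
        return ticket


head = Head({"Room<1>": "old"})
head.submit("Room<1>")             # event 1: routed to the old node
head.migrate("Room<1>", "new")     # migration event M takes ticket 2
head.submit("Room<1>")             # event 3: routed to the new node
routes = [(t, node) for t, kind, _, node in head.log if kind == "event"]
print(routes)  # [(1, 'old'), (3, 'new')]
```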
