Capriccio: Scalable Threads for Internet Services

Capriccio: Scalable Threads for Internet Services. Rob von Behren, Jeremy Condit, Feng Zhou, George Necula, and Eric Brewer, University of California at Berkeley. Presenter: Olusanya Soyannwo. Outline: Motivation, Background, Goals, Approach, Experiments, Results, Related work.

Presentation Transcript


  1. Capriccio: Scalable Threads for Internet Services • Rob von Behren, Jeremy Condit, Feng Zhou, George Necula, and Eric Brewer, University of California at Berkeley • Presenter: Olusanya Soyannwo

  2. Outline • Motivation • Background • Goals • Approach • Experiments • Results • Related work • Conclusion & Future work EECS Advanced Operating Systems Northwestern University

  3. Motivation • Increasing scalability demands for Internet services • Hardware improvements are limited by existing software • Current implementations are event-based

  4. Background: Event-Based Systems - Drawbacks • Event systems hide the control flow • Difficult to understand and debug • Programmers need to match related events • Burdens programmers
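To make the "hidden control flow" point concrete, here is a minimal sketch of one request flow (read, lookup, write) written in both styles. Everything in it is hypothetical and only for illustration: the handler names, the `ctx` dictionary, and the use of a Python generator to stand in for a thread. It is not code from Capriccio.

```python
# Event style: the flow is split across callbacks, and the programmer
# carries state by hand and must match each event to the right handler.
def on_read(ctx, data):
    ctx["request"] = data
    return on_lookup              # next step is only visible at run time

def on_lookup(ctx, data):
    ctx["reply"] = ctx["request"].upper() + "=" + data
    return on_write

def on_write(ctx, data):
    ctx["sent"] = ctx["reply"]
    return None                   # flow ends here

def run_events(events):
    ctx, handler = {}, on_read
    for data in events:
        handler = handler(ctx, data)
        if handler is None:
            break
    return ctx["sent"]

# Thread style: the same flow reads top to bottom, and local variables
# hold the state. Each `yield` marks a point where the thread would block.
def handle_request():
    request = yield "read"
    value = yield "lookup"
    reply = request.upper() + "=" + value
    yield "write"
    return reply

def run_thread(events):
    thread = handle_request()
    next(thread)                  # run up to the first blocking point
    for data in events:
        try:
            thread.send(data)
        except StopIteration as stop:
            return stop.value

events = ["key", "42", None]
print(run_events(events), run_thread(events))
```

Both versions compute the same reply; the difference is that the threaded version keeps the sequence of steps in one place, which is the readability argument the slide makes.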

  5. Goals: Capriccio • Support for existing thread API • Scalability to hundreds of thousands of threads • Automate application-specific customization

  6. Approach: Capriccio • Thread package • Cooperative scheduling • Linked stacks • Address the problem of stack allocation for large numbers of threads • Combine compile-time and run-time analysis • Resource-aware scheduler

  7. Approach: User-Level Threads - The Choice • POSIX API (-) Complex preemption (-) Bad interaction with kernel scheduler • Performance • Lower thread synchronization overhead • No kernel crossing for preemptive threading • More efficient memory management at user level • Flexibility • Decoupling user and kernel threads allows faster innovation • Can use new kernel thread features without changing application code • Scheduler tailored for applications

  8. Approach: User-Level Threads - Disadvantages • Additional overhead • Replacing blocking calls with non-blocking calls • Multiple-CPU synchronization

  9. Approach: User-Level Threads - Implementation • Context switches • Built on top of Edgar Toernig’s coroutine library • Fast context switches when threads voluntarily yield • I/O • Capriccio intercepts blocking I/O calls • Uses epoll for asynchronous I/O • Scheduling • Very much like an event-driven application • Events are hidden from programmers • Synchronization • Supports cooperative threading on single-CPU machines • Requires only Boolean checks
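The implementation points above (voluntary yields, intercepted blocking I/O, an event loop hidden under the scheduler) can be sketched in a few lines. This is only an illustration under stated assumptions: Python generators stand in for coroutines, the `selectors` module stands in for direct epoll use (it is epoll-backed on Linux), and `run_with_io`, `reader`, and `writer` are hypothetical names, not Capriccio functions.

```python
import selectors
import socket
from collections import deque

def run_with_io(threads):
    """Cooperative scheduler: a thread yields ("read", sock) where it would block."""
    sel = selectors.DefaultSelector()
    ready = deque(threads)
    waiting = 0
    while ready or waiting:
        if not ready:
            # Nothing runnable: wait for I/O readiness, like an event loop.
            for key, _ in sel.select():
                sel.unregister(key.fileobj)
                waiting -= 1
                ready.append(key.data)
            continue
        thread = ready.popleft()
        try:
            request = next(thread)        # context switch into the thread
        except StopIteration:
            continue                      # thread finished
        if request is None:
            ready.append(thread)          # plain voluntary yield
        else:
            _, sock = request             # intercepted blocking read
            sel.register(sock, selectors.EVENT_READ, data=thread)
            waiting += 1
    sel.close()

def reader(sock, out):
    yield ("read", sock)                  # stands in for a blocking read()
    out.append(sock.recv(16))

def writer(sock):
    yield                                 # give other threads a turn first
    sock.send(b"ping")

a, b = socket.socketpair()
received = []
run_with_io([reader(a, received), writer(b)])
a.close()
b.close()
print(received)
```

The thread code never sees the readiness events; only the scheduler does, which mirrors the slide's point that events are hidden from programmers.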

  10. Approach: Linked Stack • The problem: fixed stacks • Overflow vs. wasted space • Limits thread numbers • The solution: linked stacks • Allocate space as needed • Compiler analysis • Add runtime checkpoints • Guarantee enough space until next check [Figure: fixed stacks vs. linked stack]
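A minimal sketch of the run-time side of linked stacks follows. It models the stack as a list of chunks and shows a checkpoint linking a new chunk only when the current one cannot cover the space needed before the next checkpoint. The class, the abstract size units, and the `MIN_CHUNK` value are assumptions for illustration; the real mechanism works on machine stacks through compiler-inserted code.

```python
MIN_CHUNK = 8  # hypothetical minimum chunk size, in abstract units

class LinkedStack:
    def __init__(self):
        self.chunks = [[MIN_CHUNK, 0]]    # each chunk is [capacity, used]

    def checkpoint(self, needed):
        """Guarantee `needed` free units until the next checkpoint.

        Returns True if a new chunk had to be linked."""
        capacity, used = self.chunks[-1]
        if capacity - used >= needed:
            return False                  # current chunk is large enough
        self.chunks.append([max(needed, MIN_CHUNK), 0])
        return True

    def push(self, size):
        chunk = self.chunks[-1]
        if chunk[1] + size > chunk[0]:
            raise OverflowError("checkpoint guarantee violated")
        chunk[1] += size

    def pop(self, size, linked):
        self.chunks[-1][1] -= size
        if linked:
            self.chunks.pop()             # unlink the chunk on return

stack = LinkedStack()
stack.push(5)                             # caller's frame
linked = stack.checkpoint(6)              # only 3 units left, so link a chunk
stack.push(6)                             # callee's frame lands in the new chunk
stack.pop(6, linked)                      # return: chunk is unlinked again
print(linked, len(stack.chunks))
```

Because space is linked on demand, a thread that never calls deep pays only for its first small chunk, which is what lets the thread count grow.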

  11. Approach: Linked Stack • Parameters • MaxPath • MinChunk • Steps • Break cycles • Trace back • Special cases • Function pointers • External calls [Figure: call graph annotated with the values 3, 3, 2, 5, 2, 4, 3, 6; MaxPath = 8]
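The two steps named on the slide (break cycles, then trace back) can be sketched as a depth-first walk over a call graph whose nodes carry frame sizes. This is a simplified reading of the idea, not the authors' algorithm: the function name, the graph, and the frame sizes below are hypothetical (the structure of the slide's figure is not recoverable from the transcript), and function pointers and external calls are ignored.

```python
def place_checkpoints(calls, frame, root, max_path):
    """Return the set of call edges (caller, callee) that get a checkpoint."""
    checkpoints = set()
    state = {}      # node -> "active" while on the DFS stack, then "done"
    longest = {}    # node -> longest checkpoint-free path starting at node

    def visit(node):
        state[node] = "active"
        best = 0
        for callee in calls.get(node, []):
            if state.get(callee) == "active":
                checkpoints.add((node, callee))   # break the cycle (recursion)
                continue
            if callee not in state:
                visit(callee)
            if frame[node] + longest[callee] > max_path:
                checkpoints.add((node, callee))   # path too long: cut it here
            else:
                best = max(best, longest[callee])
        longest[node] = frame[node] + best
        state[node] = "done"

    visit(root)
    return checkpoints

# Hypothetical call graph: c is recursive, and main calls a and b.
frame = {"main": 3, "a": 3, "b": 2, "c": 5}
calls = {"main": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["c"]}
print(sorted(place_checkpoints(calls, frame, "main", max_path=8)))
```

With MaxPath = 8, the path a to c uses exactly 8 units and needs no checkpoint, while the calls out of main would exceed the bound and are cut. A larger MaxPath means fewer checkpoints but more space reserved at each one.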

  16. Approach: Scheduling • Advantages of event-based scheduling • Tailored for applications • With event handlers • Events provide two important pieces of information for scheduling • Whether a process is close to completion • Whether a system is overloaded

  17. Approach: Scheduling - The Blocking Graph • Thread-based • View applications as a sequence of stages, separated by blocking calls • Analogous to event-based scheduler [Figure: blocking graph with nodes Sleep, Read, Write, Close, Main, Thread create, Write]
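As a rough illustration of the blocking graph, the sketch below builds one from a trace of blocking points: each node is a place where a thread blocked, and each edge connects two consecutive blocking points of the same thread. The trace format, the function name, and the smoothing factor `alpha` used to average the cost on each edge are assumptions made for this example.

```python
def build_blocking_graph(trace, alpha=0.5):
    """trace: list of (thread_id, blocking_point, cpu_time_since_last_block).

    Returns {(from_point, to_point): smoothed cpu time along that edge}."""
    last_point = {}   # thread_id -> most recent blocking point
    edges = {}
    for thread_id, point, cost in trace:
        if thread_id in last_point:
            edge = (last_point[thread_id], point)
            if edge in edges:
                # Exponentially weighted average of the observed cost.
                edges[edge] = alpha * cost + (1 - alpha) * edges[edge]
            else:
                edges[edge] = cost
        last_point[thread_id] = point
    return edges

trace = [
    (1, "main", 0.0), (1, "read", 4.0), (1, "write", 2.0),
    (2, "main", 0.0), (2, "read", 8.0),
]
print(build_blocking_graph(trace))
```

Threads that follow the same code path share edges, so the graph summarizes the application's stages without the programmer declaring them.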

  18. Approach: Resource-Aware Scheduling • Track resources used along blocking graph (BG) edges • Memory, file descriptors, CPU • Predict future from the past • Algorithm • Increase use when underutilized • Decrease use near saturation • Advantages • Operate near the knee without thrashing • Automatic admission control
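The "increase when underutilized, decrease near saturation" rule can be sketched as a choice among blocking-graph nodes that have runnable threads. This is a toy policy under stated assumptions, not Capriccio's scheduler: `pick_next`, the per-node `predicted_delta` table, and the 0.9 high-water mark are all hypothetical.

```python
def pick_next(ready, predicted_delta, usage, limit, high_water=0.9):
    """Choose which blocking-graph node to run a thread from.

    predicted_delta[node] is the expected change in resource use
    (for example, open file descriptors) after running from that node."""
    if usage >= high_water * limit:
        # Near saturation: favour nodes that release the resource.
        return min(ready, key=lambda node: predicted_delta[node])
    # Underutilized: favour nodes that take on more work.
    return max(ready, key=lambda node: predicted_delta[node])

ready = ["accept", "read", "close"]
predicted_delta = {"accept": 2, "read": 0, "close": -2}

print(pick_next(ready, predicted_delta, usage=50, limit=100))
print(pick_next(ready, predicted_delta, usage=95, limit=100))
```

In this toy setup, new connections are accepted while descriptors are plentiful and deferred once usage nears the limit, which is the admission-control effect the slide describes.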

  19. Experiment: Threading Microbenchmarks • SMP with two 2.4 GHz Xeon processors • 1 GB memory • Two 10K RPM SCSI Ultra II hard drives • Linux 2.5.70 • Compared Capriccio, LinuxThreads, and Native POSIX Threads for Linux (NPTL)

  20. Experiment: Thread Scalability • Producer-consumer microbenchmark • LinuxThreads begins to degrade after 20 threads • NPTL degrades after 100 threads • Capriccio scales to 32K producers and consumers (64K threads total)
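For readers unfamiliar with this kind of microbenchmark, here is a small sketch of its shape: equal numbers of producer and consumer threads share a bounded buffer and yield whenever they cannot make progress. It uses Python generators as cooperative threads and is not the benchmark code from the paper; the function names, the buffer capacity, and the per-thread item count are hypothetical.

```python
from collections import deque

def producer(queue, items, capacity):
    for item in items:
        while len(queue) >= capacity:
            yield                      # buffer full: let others run
        queue.append(item)
        yield

def consumer(queue, count, out):
    while len(out) < count:
        if queue:
            out.append(queue.popleft())
        yield                          # buffer empty or item taken: yield

def run_cooperative(threads):
    """Round-robin over generator threads; returns the number of switches."""
    ready = deque(threads)
    switches = 0
    while ready:
        thread = ready.popleft()
        try:
            next(thread)
            ready.append(thread)
            switches += 1
        except StopIteration:
            pass
    return switches

pairs, per_thread = 1000, 10
queue = deque()
outputs = [[] for _ in range(pairs)]
threads = [producer(queue, range(per_thread), capacity=64) for _ in range(pairs)]
threads += [consumer(queue, per_thread, out) for out in outputs]
switches = run_cooperative(threads)
print(sum(len(out) for out in outputs), switches)
```

Raising `pairs` while timing the run is the kind of measurement the slide's thread counts refer to.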

  21. Results: Thread Primitive - Latency

  22. Results: Thread Scalability

  23. Results: I/O Performance • Network performance • Token passing among pipes • Simulates the effect of slow client links • 10% overhead compared to epoll • Twice as fast as both LinuxThreads and NPTL with more than 1000 threads • Disk I/O comparable to kernel threads

  24. Results: Runtime Overhead • Tested Apache 2.0.44 • Stack linking • 73% slowdown for null call • 3-4% overall • Resource statistics • 2% (on all the time) • 0.1% (with sampling) • Stack traces • 8% overhead

  25. Results: Web Server Performance

  26. Related Work • Programming models for high concurrency • Event-based models are a result of poor thread implementations • User-level threads • Capriccio is unique • Kernel threads • NPTL • Application-specific optimization • SPIN & Exokernel • Burden on programmers • Portability • Asynchronous I/O • Stack management • Using the heap requires a garbage collector (Standard ML of New Jersey)

  27. Related Work (cont’d) • Resource-aware scheduling • Several systems are similar to Capriccio

  28. Future Work • Threading • Multi-CPU support • Kernel interface • (enabled) Compile-time techniques • Variations on linked stacks • Static blocking graph • Scheduling • More sophisticated prediction

  29. Conclusion • Capriccio simplifies high concurrency • Scalable & high performance • Control over concurrency model • Stack safety • Resource-aware scheduling • Enables compiler support, invariants • Issues • Additional burden on the programmer • Resource-controlled scheduling? What hysteresis?

  30. OTHER GRAPHS

  31. OTHER GRAPHS