CSE451 Section 7: Virtual memory
Table of contents • Real address space • Process creation optimization • Debugging: watch point
Real address space • [Figure: address space running from 0x00000000 to 0xffffffff, with the user address space at the bottom and kernel space starting at 0xc0000000 on x86] • From Understanding The Linux Virtual Memory Manager
Kernel address space • Kernel address space • Remains constant in every process's address space • Why do all processes have the same shape of address space? • Kernel can’t access user address space directly • Why? • Understanding The Linux Virtual Memory Manager • http://www.skynet.ie/~mel/projects/vm/
Optimization • What do you think about memory copy? • Inter-process communication • Network packet processing • Booting an embedded system • Is it really fast?
Process creation • fork() • Allocate a process control block • Copy the address space, including shared resources • Schedule the new process • Let’s analyze real cases
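The effect of "copy the address space" can be observed from user space. The following is a minimal sketch in Python (not part of the original slides), assuming a POSIX system where `os.fork()` is available; the helper name `fork_demo` is made up for illustration. The child changes a list, and the parent's copy stays untouched.

```python
import os

def fork_demo():
    """Fork a child, let it modify a list, return (parent_len, child_len)."""
    data = [1, 2, 3]                    # lives in the parent's address space
    read_fd, write_fd = os.pipe()       # pipe to report the child's view
    pid = os.fork()                     # child gets a logical copy of everything
    if pid == 0:                        # child process
        os.close(read_fd)
        data.append(4)                  # changes only the child's copy
        os.write(write_fd, str(len(data)).encode())
        os._exit(0)                     # exit immediately, skip parent cleanup
    os.close(write_fd)                  # parent process
    child_len = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)                  # reap the child
    return len(data), child_len
```

The parent still sees three elements while the child saw four, which is exactly the isolation that copying the address space provides.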
When do we use fork? • Web server • Why? • Usage? • What is shared? • What is not shared? • Shell • Why? • Usage? • What is shared? • What is not shared?
Optimize fork – Web server • Most of the address space is shared • Program code • Configuration of the web server • Heap • May contain useful data • What is not shared? • Stack & heap (the parts that get modified) • Do we need to copy the entire address space?
Copy-on-write • Principle of laziness • Act only when it is REALLY required • Copy only when the data is actually changed • Effect on fork? • Code: never copied • Heap & stack: copy only the changed portion • Much faster, eh? • Linux fork() implements copy-on-write
How is it implemented? • [Figure: page mappings of P1 and P2 onto CODE1, CODE2, HEAP1 to HEAP4, STACK1 and STACK2, shown in three steps: Fork, Call function A’, Write HEAP2. The copied pages STACK2’ and HEAP2’ appear in the later steps.]
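The copy-on-write idea can be modelled in a few lines. This is a toy simulation in Python, not how the Linux kernel is written: the classes `Page` and `AddressSpace` and the reference-count scheme are assumptions made for illustration. A fork shares every page, and a page is copied only on the first write to it while it is still shared.

```python
class Page:
    """A physical page frame with a count of address spaces mapping it."""
    def __init__(self, data):
        self.data = bytearray(data)     # private copy of the contents
        self.refs = 1

class AddressSpace:
    """Toy page table: a list of references to Page objects."""
    def __init__(self, pages):
        self.pages = pages
        self.copies = 0                 # how many pages this space had to copy

    @classmethod
    def from_bytes(cls, chunks):
        return cls([Page(chunk) for chunk in chunks])

    def fork(self):
        # Share every page instead of copying it; only bump the ref counts.
        for page in self.pages:
            page.refs += 1
        return AddressSpace(list(self.pages))

    def read(self, page_no, offset):
        return self.pages[page_no].data[offset]

    def write(self, page_no, offset, value):
        page = self.pages[page_no]
        if page.refs > 1:               # still shared: copy before writing
            page.refs -= 1
            page = Page(page.data)
            self.pages[page_no] = page
            self.copies += 1
        page.data[offset] = value
```

For example, after `child = parent.fork()` a single `child.write(1, 0, 72)` copies only page 1; the code page and the stack page stay shared between the two address spaces.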
More copy-on-write • Packet processing: Router • Can be seen as a chain of filters • http://pdos.csail.mit.edu/click/ • Copy a packet whenever it passes a filter? • Ethernet: 1500-byte frame size • 10Mbps ~ 1.25MB/s • Sample NAT/firewall-enabled configuration has 37 filters • Roughly 30MB of memory copies per second • 300MB at 100Mbps • 3GB at 1Gbps • This is just pure overhead! • Copy on write! • Comparison does not change the content • Other applications?
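The order of magnitude of these numbers can be checked with a back-of-the-envelope calculation. The sketch below (Python, with the hypothetical helper `copy_overhead_mb_per_s`) computes an upper bound under the assumption that every one of the 37 filters copies every byte at full line rate; real traffic rarely saturates the link, so the slide's rounded figures are somewhat lower.

```python
FILTERS = 37  # filters in the sample NAT/firewall configuration

def copy_overhead_mb_per_s(link_mbps, filters=FILTERS):
    """Upper bound on bytes copied per second, in MB/s (1 MB = 10**6 bytes)."""
    bytes_per_s = link_mbps * 1_000_000 / 8   # line rate in bytes per second
    return bytes_per_s * filters / 1_000_000  # one full copy per filter

for mbps in (10, 100, 1000):
    print(f"{mbps} Mbps -> up to {copy_overhead_mb_per_s(mbps):.2f} MB/s copied")
```

Either way the copy traffic grows linearly with both link speed and chain length, which is why avoiding the copy matters.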
Optimize fork – shell • What is shared? • NOTHING! • exec() will overwrite the entire address space • Copy? • Noooooo! • Do not copy the page table • Suspend the parent until the child calls exec() (or exits) • exec() builds a new address space from scratch • vfork() does this
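The shell pattern looks like this in outline. This is a sketch in Python with a made-up helper `run_command`; Python's `os` module has no `vfork()` wrapper, so plain `os.fork()` stands in for it here. The point is the shape of the code: the child touches almost nothing before `exec`, so copying the parent's address space for it would be wasted work.

```python
import os
import sys

def run_command(argv):
    """Run argv the way a simple shell would; return the exit status."""
    pid = os.fork()                      # a real shell could use vfork() here
    if pid == 0:                         # child: replace ourselves right away
        try:
            os.execvp(argv[0], argv)     # builds a brand-new address space
        except OSError:
            os._exit(127)                # conventional "command not found"
    _, status = os.waitpid(pid, 0)       # parent: wait for the child to finish
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)          # killed by a signal

if __name__ == "__main__":
    code = run_command([sys.executable, "-c", "print('hello from child')"])
    print("exit status:", code)
```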
Implementing threads? • Linux 2.6 thread implementation • NPTL: Native POSIX Thread Library • 1:1 threading model • Process/thread creation is integrated in clone() • NPTL design paper • http://people.redhat.com/drepper/nptl-design.pdf
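The contrast with fork() is that threads share one address space while each still being a separate kernel-scheduled entity. A small sketch in Python, assuming Python 3.8+ for `threading.get_native_id()`; the helper name `thread_demo` is hypothetical. On Linux, CPython threads are ordinary POSIX threads, so each one gets its own kernel thread ID under the 1:1 model.

```python
import threading

def thread_demo():
    """Return (shared_list, worker_tid, main_tid) after running one thread."""
    shared = []                          # visible to every thread in the process
    tids = []

    def worker():
        shared.append("written by worker")         # same address space as main
        tids.append(threading.get_native_id())     # kernel-level thread ID

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return shared, tids[0], threading.get_native_id()
```

Unlike the forked child, the worker's write is visible to the main thread, and the two kernel thread IDs differ.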
Debugging: watch point • Watch point • Stop execution whenever the value of an interesting variable changes • Naïve approach • The variable is in page P • Mark P as unmapped • Accessing P will cause a page fault • Intercept the page fault and temporarily map the page • Single-step over the faulting instruction, then unmap P again • If it touched the watched area, TRAP to the debugger
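The naïve approach can be walked through with a small simulation. This is a toy model in Python, not a real debugger: `Memory`, `Debugger`, and `PageFault` are invented names, memory is a byte array, and only store instructions are modelled. It shows why the approach is slow: every store to page P faults, even when it misses the watched bytes.

```python
PAGE_SIZE = 4096

class PageFault(Exception):
    def __init__(self, addr):
        super().__init__(f"page fault at {addr:#x}")
        self.addr = addr

class Memory:
    """Flat memory whose pages can be marked unmapped."""
    def __init__(self, size):
        self.data = bytearray(size)
        self.unmapped = set()            # page numbers that fault on access

    def store(self, addr, value):
        if addr // PAGE_SIZE in self.unmapped:
            raise PageFault(addr)
        self.data[addr] = value

class Debugger:
    def __init__(self, memory):
        self.memory = memory
        self.watch = range(0)            # watched byte addresses
        self.faults = 0                  # page faults intercepted
        self.hits = []                   # stores that trapped to the debugger

    def set_watchpoint(self, start, length):
        self.watch = range(start, start + length)
        first, last = start // PAGE_SIZE, (start + length - 1) // PAGE_SIZE
        self.memory.unmapped.update(range(first, last + 1))  # unmap P

    def store(self, addr, value):
        """Execute one store instruction on behalf of the debuggee."""
        try:
            self.memory.store(addr, value)
        except PageFault as fault:
            self.faults += 1
            page = fault.addr // PAGE_SIZE
            self.memory.unmapped.discard(page)   # temporarily map the page
            self.memory.store(addr, value)       # single-step the instruction
            self.memory.unmapped.add(page)       # unmap it again
            if fault.addr in self.watch:         # touched the watched area?
                self.hits.append((addr, value))  # trap to the debugger
```

With a 4-byte watch point at `PAGE_SIZE + 8`, a store to `PAGE_SIZE + 100` still faults but does not trap, while a store to `PAGE_SIZE + 9` faults and traps.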