
Virtual Memory & Address Translation

Virtual Memory & Address Translation. Vivek Pai, Princeton University. General Memory Problem: we have a limited (expensive) physical resource, main memory, and we want to use it as efficiently as possible. We also have an abundant, slower resource: disk. Lots of Variants.


Presentation Transcript


  1. Virtual Memory & Address Translation. Vivek Pai, Princeton University

  2. General Memory Problem
     • We have a limited (expensive) physical resource: main memory
     • We want to use it as efficiently as possible
     • We have an abundant, slower resource: disk

  3. Lots of Variants
     • Many programs, total size less than memory
       • Technically possible to pack them together
       • Will programs know about each other’s existence?
     • One program, using lots of memory
       • Can you only keep part of the program in memory?
     • Lots of programs, total size exceeds memory
       • What programs are in memory, and how to decide?

  4. History Versus Present
     • History
       • Each variant had its own solution
       • Solutions have different hardware requirements
       • Some solutions software/programmer visible
     • Present – general-purpose microprocessors
       • One mechanism used for all of these cases
     • Present – less capable microprocessors
       • May still use “historical” approaches

  5. Many Programs, Small Total Size
     • Observation: we can pack them into memory
     • Requirements by segments
       • Text: maybe contiguous
       • Data: keep contiguous, “relocate” at start
       • Stack: assume contiguous, fixed size
         • Just set pointer at start, reserve space
       • Heap: no need to make it contiguous

  6. Many Programs, Small Total Size
     • Software approach
       • Just find appropriate space for data & code segments
       • Adjust any pointers to globals/functions in the code
       • Heap, stack “automatically” adjustable
     • Hardware approach
       • Pointer to data segment
       • All accesses to globals indirected

  7. One Program, Lots of Memory
     • Observations: locality
       • Instructions in a function generally related
       • Stack accesses generally in current stack frame
       • Not all globals used all the time
     • Goal: keep recently-used portions in memory
       • Explicit: programmer/compiler reserves, controls part of memory space – “overlays”
     • Note: limited resource may be address space

  8. Many Programs, Lots of Memory
     • Software approach
       • Keep only subset of programs in memory
       • When loading a program, evict any programs that use the same memory regions
       • “Swap” programs in/out as needed
     • Hardware approach
       • Don’t permanently associate any address of any program to any part of physical memory
     • Note: doesn’t address problem of too few address bits

  9. Why Virtual Memory?
     • Use secondary storage ($)
       • Extend DRAM ($$$) with reasonable performance
     • Protection
       • Programs do not step over each other
       • Communications require explicit IPC operations
     • Convenience
       • Flat address space
       • Programs have the same view of the world

  10. How To Translate
     • Must have some “mapping” mechanism
     • Mapping must have some granularity
       • Granularity determines flexibility
       • Finer granularity requires more mapping info
     • Extremes:
       • Any byte to any byte: mapping equals program size
       • Map whole segments: larger segments problematic

  11. Translation Options
     • Granularity
       • Small # of big fixed/flexible regions – segments
       • Large # of fixed regions – pages
     • Visibility
       • Translation mechanism integral to instruction set – segments
       • Mechanism partly visible, external to processor – obsolete
       • Mechanism part of processor, visible to OS – pages

  12. Translation Overview
     • Actual translation is in hardware (MMU)
     • Controlled in software
     • CPU view
       • What program sees, virtual memory
     • Memory view
       • Physical memory
     [Figure: CPU issues a virtual address; Translation (MMU) turns it into a physical address; the physical address goes to physical memory and I/O devices]

  13. Goals of Translation
     • Implicit translation for each memory reference
     • A hit should be very fast
     • Trigger an exception on a miss
     • Protected from user’s faults
     [Figure: memory hierarchy with relative costs: Registers, Cache(s) (10x), DRAM (100x), Disk (10Mx), with paging between DRAM and disk]

  14. Base and Bound
     • Built in Cray-1
     • A program can only access physical memory in [base, base+bound]
     • On a context switch: save/restore base, bound registers
     • Pros: Simple
     • Cons: fragmentation, hard to share, and difficult to use disks
     [Figure: the virtual address is compared with bound (error if greater), then added to base to form the physical address]
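The base-and-bound check on this slide can be sketched in a few lines of Python. This is only an illustration, not the Cray-1 hardware: the names `translate_base_bound` and `AddressFault` are hypothetical, and the comparison follows the slide's inclusive range [base, base+bound].

```python
class AddressFault(Exception):
    """Hypothetical exception standing in for the hardware 'error' signal."""


def translate_base_bound(vaddr: int, base: int, bound: int) -> int:
    """Translate a virtual address using one (base, bound) register pair."""
    # The slide's diagram flags an error when the virtual address
    # is greater than bound, so bound itself is still a legal address.
    if vaddr < 0 or vaddr > bound:
        raise AddressFault(f"virtual address {vaddr:#x} is outside [0, {bound:#x}]")
    # Legal addresses are simply offset by the base register.
    return base + vaddr
```

On a context switch, the operating system would only need to swap the `base` and `bound` values passed in, which is what makes the scheme simple.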

  15. Segmentation
     • Have a table of (seg, size)
     • Protection: each entry has
       • (nil, read, write, exec)
     • On a context switch: save/restore the table or a pointer to the table in kernel memory
     • Pros: Efficient, easy to share
     • Cons: Complex management and fragmentation within a segment
     [Figure: virtual address = (segment, offset); the segment number selects a (seg, size) entry; offset is compared with size (error if greater), then added to seg to form the physical address]
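A minimal sketch of the segment-table lookup, assuming each entry holds a base address, a size in bytes, and a set of permitted access kinds. The names `Segment`, `SegmentFault`, and `translate_segment` are hypothetical, and treating `size` as a byte count (so that `offset == size` is already out of range) is an assumption rather than something the slide states.

```python
from dataclasses import dataclass


class SegmentFault(Exception):
    """Hypothetical exception standing in for the hardware 'error' signal."""


@dataclass(frozen=True)
class Segment:
    base: int          # physical start address of the segment
    size: int          # segment length in bytes (assumed meaning of "size")
    perms: frozenset   # subset of {"read", "write", "exec"}; empty means nil


def translate_segment(table, seg: int, offset: int, access: str) -> int:
    """Translate (segment, offset) and enforce the per-entry protection bits."""
    if not 0 <= seg < len(table):
        raise SegmentFault(f"no such segment {seg}")
    entry = table[seg]
    if not 0 <= offset < entry.size:
        raise SegmentFault(f"offset {offset:#x} exceeds segment size")
    if access not in entry.perms:
        raise SegmentFault(f"{access} not permitted on segment {seg}")
    return entry.base + offset
```

Sharing falls out naturally in this model: two processes can hold table entries with the same `base` and `size` but different `perms`.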

  16. Paging
     • Use a page table to translate
     • Various bits in each entry
     • Context switch: similar to the segmentation scheme
     • What should be the page size?
     • Pros: simple allocation, easy to share
     • Cons: big table & cannot deal with holes easily
     [Figure: virtual address = (VPage #, offset); VPage # is compared with the page table size (error if greater) and indexes the page table; the PPage # found there is combined with offset to form the physical address]
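The single-level lookup can be sketched as follows, assuming 4 KB pages (the size used on the next slide) and a page table modelled as a plain Python list in which `None` marks an unmapped page. `PageFault` and `translate_page` are hypothetical names; real PTEs would also carry the "various bits" the slide mentions.

```python
OFFSET_BITS = 12               # assumes 4 KB pages
PAGE_SIZE = 1 << OFFSET_BITS


class PageFault(Exception):
    """Hypothetical exception standing in for the hardware fault."""


def translate_page(page_table, vaddr: int) -> int:
    """Split vaddr into (VPage #, offset) and look the page up in a flat table."""
    vpn = vaddr >> OFFSET_BITS
    offset = vaddr & (PAGE_SIZE - 1)
    # The table-size check corresponds to the ">" comparison in the figure.
    if vpn >= len(page_table):
        raise PageFault(f"virtual page {vpn} is beyond the page table")
    ppn = page_table[vpn]
    if ppn is None:
        # A hole in the address space still costs a table slot.
        raise PageFault(f"virtual page {vpn} is not mapped")
    return (ppn << OFFSET_BITS) | offset
```

The `None` entries illustrate the "cannot deal with holes easily" drawback: the list must be as long as the highest mapped page, however sparse the mappings are.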

  17. How Many PTEs Do We Need?
     • Assume 4KB page
       • Equals “low order” 12 bits
     • Worst case for 32-bit address machine
       • # of processes × 2^20
     • What about 64-bit address machine?
       • # of processes × 2^52
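The arithmetic behind these counts is short enough to check directly. The helper below is a hypothetical sketch: the number of PTEs per process is 2 raised to (address bits minus offset bits). The 4-byte PTE size used in the usage lines is an assumption made only to give a feel for the table size.

```python
def ptes_per_process(address_bits: int, page_size: int) -> int:
    """Worst-case number of PTEs for one flat page table."""
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError("page_size must be a power of two")
    offset_bits = page_size.bit_length() - 1   # 4096 -> 12 low-order bits
    return 1 << (address_bits - offset_bits)


# Usage: the two cases from the slide.
ptes_32 = ptes_per_process(32, 4096)   # 2**20 entries
ptes_64 = ptes_per_process(64, 4096)   # 2**52 entries
# Assuming 4-byte PTEs, a full 32-bit table occupies 4 MiB per process.
table_bytes_32 = ptes_32 * 4
```

The 64-bit figure is why a flat table is impractical there, which motivates the multi-level and inverted designs on the following slides.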

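  18. Segmentation with Paging
     [Figure: virtual address = (Vseg #, VPage #, offset); Vseg # selects a (seg, size) entry that points to a page table; VPage # is compared with size (error if greater) and indexes that page table; the PPage # found there is combined with offset to form the physical address]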

  19. Multiple-Level Page Tables
     [Figure: virtual address = (dir, table, offset); dir indexes the Directory, which points to a page table; table indexes that page table to find the pte]
     • What does this buy us? Sparse address spaces and easier paging
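A two-level walk can be sketched like this. The 10/10/12 bit split is an assumption (it is a common layout for 32-bit addresses with 4 KB pages, not something the slide specifies), and nested dictionaries stand in for the directory and page tables so that unmapped regions cost nothing.

```python
DIR_BITS, TABLE_BITS, OFFSET_BITS = 10, 10, 12   # assumed 32-bit split


def walk_two_level(directory: dict, vaddr: int):
    """Return the physical address, or None if any level is unmapped."""
    dir_idx = (vaddr >> (TABLE_BITS + OFFSET_BITS)) & ((1 << DIR_BITS) - 1)
    tbl_idx = (vaddr >> OFFSET_BITS) & ((1 << TABLE_BITS) - 1)
    offset = vaddr & ((1 << OFFSET_BITS) - 1)

    # First memory access: the directory entry points to a page table.
    table = directory.get(dir_idx)
    if table is None:
        return None          # whole 4 MB region unmapped: no table allocated
    # Second memory access: the PTE inside that page table.
    ppn = table.get(tbl_idx)
    if ppn is None:
        return None
    return (ppn << OFFSET_BITS) | offset
```

A directory holding a single second-level table describes a sparse address space with one small structure, which is the saving the slide points to.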

  20. Inverted Page Tables
     • Main idea
       • One PTE for each physical page frame
       • Hash (Vpage, pid) to Ppage#
     • Pros
       • Small page table for large address space
     • Cons
       • Lookup is difficult
       • Overhead of managing hash chains, etc
     [Figure: virtual address = (pid, vpage, offset); the inverted page table has entries 0 to n-1, each holding (pid, vpage); the index k of the matching entry is combined with offset to form the physical address]
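The idea can be sketched with one slot per physical frame plus a hash index from (pid, vpage) to frame number. This is a simplified model: a Python `dict` is swapped in for the hand-managed hash chains the slide mentions, 4 KB pages are assumed, and the class and method names are hypothetical.

```python
OFFSET_BITS = 12   # assumes 4 KB pages


class InvertedPageTable:
    def __init__(self, num_frames: int):
        # frames[k] records which (pid, vpage) currently occupies frame k.
        self.frames = [None] * num_frames
        # Hash index: (pid, vpage) -> k. The dict hides the hash chains.
        self.index = {}

    def map(self, pid: int, vpage: int, frame: int) -> None:
        key = (pid, vpage)
        old = self.frames[frame]
        if old is not None:
            del self.index[old]            # evict the frame's previous owner
        prev = self.index.get(key)
        if prev is not None:
            self.frames[prev] = None       # key moves away from its old frame
        self.frames[frame] = key
        self.index[key] = frame

    def translate(self, pid: int, vaddr: int):
        vpage = vaddr >> OFFSET_BITS
        offset = vaddr & ((1 << OFFSET_BITS) - 1)
        frame = self.index.get((pid, vpage))
        if frame is None:
            return None                    # not resident: would be a page fault
        return (frame << OFFSET_BITS) | offset
```

The table size depends only on `num_frames`, not on how large or how many the virtual address spaces are.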

  21. Virtual-To-Physical Lookups
     • Programs only know virtual addresses
     • Each virtual address must be translated
       • May involve walking hierarchical page table
       • Page table stored in memory
       • So, each program memory access requires several actual memory accesses
     • Solution: cache “active” part of page table

  22. Translation Look-aside Buffer (TLB)
     [Figure: virtual address = (VPage #, offset); VPage # is matched against the TLB’s (VPage#, PPage#, ...) entries; on a hit, PPage # is combined with offset to form the physical address; on a miss, the real page table is consulted]
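The hit and miss paths in the figure can be modelled with a small cache in front of a page table. This is a sketch under stated assumptions: 4 KB pages, a dictionary as the "real page table", and FIFO eviction chosen purely for brevity (it is not a policy the slides name). The class name `TLB` and its counters are hypothetical.

```python
OFFSET_BITS = 12   # assumes 4 KB pages


class TLB:
    def __init__(self, capacity: int, page_table: dict):
        self.capacity = capacity
        self.page_table = page_table   # "real page table": vpn -> ppn
        self.entries = {}              # cached subset; dicts keep insertion order
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr: int) -> int:
        vpn = vaddr >> OFFSET_BITS
        offset = vaddr & ((1 << OFFSET_BITS) - 1)
        ppn = self.entries.get(vpn)
        if ppn is None:
            # Miss: consult the page table (a KeyError here plays the page fault).
            self.misses += 1
            ppn = self.page_table[vpn]
            if len(self.entries) >= self.capacity:
                oldest = next(iter(self.entries))   # FIFO victim (assumption)
                del self.entries[oldest]
            self.entries[vpn] = ppn
        else:
            self.hits += 1
        return (ppn << OFFSET_BITS) | offset
```

Repeated accesses to the same page hit in `entries` and skip the page-table access entirely, which is the point of caching the active part of the table.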

  23. Bits in A TLB Entry
     • Common (necessary) bits
       • Virtual page number: match with the virtual address
       • Physical page number: translated address
       • Valid
       • Access bits: kernel and user (nil, read, write)
     • Optional (useful) bits
       • Process tag
       • Reference
       • Modify
       • Cacheable

  24. Hardware-Controlled TLB
     • On a TLB miss
       • Hardware loads the PTE into the TLB
         • Need to write back if there is no free entry
       • Generate a fault if the page containing the PTE is invalid
       • VM software performs fault handling
       • Restart the CPU
     • On a TLB hit, hardware checks the valid bit
       • If valid, pointer to page frame in memory
       • If invalid, the hardware generates a page fault
         • Perform page fault handling
         • Restart the faulting instruction

  25. Software-Controlled TLB
     • On a miss in TLB
       • Write back if there is no free entry
       • Check if the page containing the PTE is in memory
       • If no, perform page fault handling
       • Load the PTE into the TLB
       • Restart the faulting instruction
     • On a hit in TLB, the hardware checks valid bit
       • If valid, pointer to page frame in memory
       • If invalid, the hardware generates a page fault
         • Perform page fault handling
         • Restart the faulting instruction

  26. Hardware vs. Software Controlled
     • Hardware approach
       • Efficient
       • Inflexible
       • Need more space for page table
     • Software approach
       • Flexible
       • Software can do mappings by hashing
         • PP# → (Pid, VP#)
         • (Pid, VP#) → PP#
       • Can deal with large virtual address space

  27. Cache vs. TLBs
     • Similarities
       • Both cache a portion of memory
       • Both write back on a miss
     • Differences
       • Associativity
         • TLB is usually fully associative
         • Cache can be direct-mapped
       • Consistency
         • TLB does not deal with consistency with memory
         • TLB can be controlled by software
     • Combine L1 cache with TLB
       • Virtually addressed cache
       • Why wouldn’t everyone use virtually addressed caches?

  28. Caches vs. TLBs
     • Similarities
       • Both cache a portion of memory
       • Both read from memory on misses
     • Differences
       • Associativity
         • TLBs generally fully associative
         • Caches can be direct-mapped
       • Consistency
         • No TLB/memory consistency
         • Some TLBs software-controlled
     • Combining L1 caches with TLBs
       • Virtually addressed caches
       • Not always used – what are their drawbacks?

  29. Issues
     • What TLB entry to be replaced?
       • Random
       • Pseudo LRU
     • What happens on a context switch?
       • Process tag: change TLB registers and process register
       • No process tag: Invalidate the entire TLB contents
     • What happens when changing a page table entry?
       • Change the entry in memory
       • Invalidate the TLB entry
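The two context-switch options can be compared with a small model. In this sketch a TLB built with `use_process_tag=True` keeps every process's entries and only changes the current tag, while an untagged TLB has to drop everything on a switch. The class `TaggedTLB` and its methods are hypothetical names, and replacement policy is left out to keep the focus on the switch.

```python
class TaggedTLB:
    def __init__(self, use_process_tag: bool):
        self.use_process_tag = use_process_tag
        self.current_pid = None
        self.entries = {}              # (tag, vpn) -> ppn

    def _tag(self):
        # Untagged TLBs store every entry under the same (absent) tag.
        return self.current_pid if self.use_process_tag else None

    def context_switch(self, pid: int) -> None:
        if not self.use_process_tag:
            self.entries.clear()       # no tag: invalidate the entire TLB
        self.current_pid = pid         # tag: just change the process register

    def insert(self, vpn: int, ppn: int) -> None:
        self.entries[(self._tag(), vpn)] = ppn

    def lookup(self, vpn: int):
        return self.entries.get((self._tag(), vpn))

    def invalidate(self, vpn: int) -> None:
        # Called after the PTE has been changed in memory.
        self.entries.pop((self._tag(), vpn), None)
```

With tags, a process that is switched back in may still find its translations cached; without them, it restarts with a cold TLB every time.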

  30. Consistency Issues
     • Snoopy cache protocols can maintain consistency with DRAM, even when DMA happens
     • No hardware maintains consistency between DRAM and TLBs: you need to flush related TLBs whenever changing a page table entry in memory
     • On multiprocessors, when you modify a page table entry, you need to do “TLB shoot-down” to flush all related TLB entries on all processors
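The ordering that a shoot-down has to enforce can be shown with a toy simulation in which each processor's TLB is a dictionary. This is only a model of the bookkeeping: a real kernel would signal the other processors (typically with inter-processor interrupts) and wait for them to acknowledge, which is not represented here. The function name `update_pte` is hypothetical.

```python
def update_pte(page_table: dict, tlbs: list, vpn: int, new_ppn: int) -> None:
    """Change a PTE in memory, then flush the stale entry from every TLB."""
    # Step 1: change the entry in memory.
    page_table[vpn] = new_ppn
    # Step 2: "shoot down" the cached translation on all processors,
    # since no hardware keeps TLBs consistent with the page table.
    for tlb in tlbs:
        tlb.pop(vpn, None)
```

After the call, the next access to that page on any processor misses in its TLB and reloads the updated entry from the page table.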

  31. Issues to Ponder
     • Everyone’s moving to hardware TLB management – why?
     • Segmentation was/is a way of maintaining backward compatibility – how?
     • For the hardware-inclined – what kind of hardware support is needed for everything we discussed today?
