
Computer Architecture Virtual Memory

Computer Architecture: Virtual Memory. Dr. Lihu Rappoport. Virtual memory provides the illusion of a large memory. Different machines have different amounts of physical memory, and virtual memory allows programs to run regardless of the actual physical memory size.


Presentation Transcript


  1. Computer Architecture Virtual Memory Dr. Lihu Rappoport

  2. Virtual Memory • Provides the illusion of a large memory • Different machines have different amounts of physical memory • Allows programs to run regardless of actual physical memory size • The amount of memory consumed by each process is dynamic • Allows adding memory as needed • Many processes can run on a single machine • Provides each process its own memory space • Prevents a process from accessing the memory of other processes running on the same machine • Allows the sum of the memory spaces of all processes to be larger than physical memory • Basic terminology • Virtual Address Space: address space used by the programmer • Physical Address Space: actual physical memory address space

  3. Virtual Memory: Basic Idea • Divide memory (virtual and physical) into fixed size blocks • Pages in Virtual space, Frames in Physical space • Page size = Frame size • Page size is a power of 2: page size = 2^k • All pages in the virtual address space are contiguous • Pages can be mapped into physical Frames in any order • Some of the pages are in main memory (DRAM), some of the pages are on disk • All programs are written using Virtual Memory Address Space • The hardware does on-the-fly translation between virtual and physical address spaces • Use a Page Table to translate between Virtual and Physical addresses

  4. Virtual Memory • Main memory can act as a cache for the secondary storage (disk) • Advantages: • Illusion of having more physical memory • Program relocation • Protection [Figure: virtual addresses pass through address translation to physical addresses or disk addresses]

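  5. Virtual to Physical Address Translation • Virtual address: bits [31:12] virtual page number, bits [11:0] page offset • The page table base register points to the page table • Each page table entry holds a valid bit (V), a dirty bit (D), access control bits (AC), and the frame number • Physical address: bits [29:12] physical frame number, bits [11:0] page offset • The page offset is copied unchanged from the virtual to the physical address • Page size: 2^12 bytes = 4K bytes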
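
The address split on this slide can be sketched in a few lines. This is a minimal illustration, assuming a 12-bit page offset as on the slide; the dictionary `page_table` and the function names are hypothetical stand-ins for the hardware page table, not part of the original deck.

```python
PAGE_OFFSET_BITS = 12                  # page size = 2^12 bytes = 4 KB
PAGE_SIZE = 1 << PAGE_OFFSET_BITS

def split_virtual_address(va):
    """Return (virtual page number, page offset) for a virtual address."""
    return va >> PAGE_OFFSET_BITS, va & (PAGE_SIZE - 1)

def translate(va, page_table):
    """Translate va using a dict {vpn: frame number} (hypothetical page table)."""
    vpn, offset = split_virtual_address(va)
    frame = page_table[vpn]            # look up the frame number for this page
    return (frame << PAGE_OFFSET_BITS) | offset   # offset is not translated

page_table = {0x12345: 0x00ABC}
print(hex(translate(0x12345678, page_table)))  # 0xabc678
```

Only the upper bits change; the low 12 bits (0x678 here) pass through untouched.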

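  6. Page Tables • The virtual page number selects an entry in the page table • Each entry holds a valid bit and a physical page or disk address • Valid = 1: the entry points to a page in physical memory • Valid = 0: the entry points to a page on disk [Figure: page table entries with valid bits pointing into physical memory or disk]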

  7. Address Mapping Algorithm • If V = 1 then the page is in main memory at the frame address stored in the table => fetch data • Else (page fault) the page must be fetched from disk => causes a trap, usually accompanied by a context switch: the current process is suspended while the page is fetched from disk • Access Control (R = read-only, R/W = read/write, X = execute only) • If the kind of access is not compatible with the specified access rights then protection_violation_fault => causes a trap to a hardware or software fault handler • A missing item is fetched from secondary memory only on the occurrence of a fault (demand load policy)
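
The mapping algorithm above can be sketched as follows. This is a sketch under stated assumptions: the `PTE` class, the exception names, the `ALLOWED` table, and the order of the two checks (valid bit first, then access rights) are illustrative choices, not taken from the slides.

```python
from dataclasses import dataclass

class PageFault(Exception):
    """V = 0: the page must be fetched from disk (demand load)."""

class ProtectionViolation(Exception):
    """The kind of access is not compatible with the access rights."""

@dataclass
class PTE:
    valid: bool
    frame: int
    rights: str        # "R", "R/W" or "X", as listed on the slide

# Assumed mapping from access rights to the access kinds they permit.
ALLOWED = {"R": {"read"}, "R/W": {"read", "write"}, "X": {"execute"}}

def map_address(page_table, vpn, offset, access, page_bits=12):
    pte = page_table[vpn]
    if not pte.valid:
        raise PageFault(vpn)               # trap: OS fetches the page from disk
    if access not in ALLOWED[pte.rights]:
        raise ProtectionViolation(vpn)     # trap to the fault handler
    return (pte.frame << page_bits) | offset

table = {0: PTE(True, 5, "R"), 1: PTE(False, 0, "R/W")}
print(hex(map_address(table, 0, 0x10, "read")))  # 0x5010
```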

  8. Page Replacement Algorithm • Not Recently Used (NRU) • Associated with each page is a reference flag such that ref flag = 1 if the page has been referenced in the recent past • If replacement is needed, choose any page frame whose reference bit is 0 • This is a page that has not been referenced in the recent past • Clock implementation of NRU: while (PT[LRP].NRU) { PT[LRP].NRU = 0; LRP = (LRP + 1) mod table size } • Possible optimization: search for a page that is both not recently referenced AND not dirty [Figure: page table entries, each with a ref bit]
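
The clock loop on this slide can be written out as a small runnable sketch. The list `ref_bits` stands in for the reference bits of the page table, and advancing the hand past the chosen victim is an assumption added here so that the next search starts at the following frame; the slide shows only the inner loop.

```python
def clock_select_victim(ref_bits, hand):
    """Return (victim index, new hand position).

    Clears the reference bit of every frame the hand passes, so a
    recently used page is spared once, not indefinitely.
    """
    n = len(ref_bits)
    while ref_bits[hand]:
        ref_bits[hand] = 0             # referenced recently: clear and move on
        hand = (hand + 1) % n
    victim = hand                      # first frame found with ref bit 0
    return victim, (hand + 1) % n      # assumed: hand moves past the victim

ref_bits = [1, 1, 0, 1]
victim, hand = clock_select_victim(ref_bits, 0)
print(victim, hand, ref_bits)  # 2 3 [0, 0, 0, 1]
```

The loop always terminates: if every bit is 1, one full sweep clears them all and the starting frame is selected.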

  9. Page Faults • Page fault: the data is not in memory => retrieve it from disk • The CPU must detect the situation • The CPU cannot remedy the situation (it has no knowledge of the disk) • The CPU must trap to the operating system so that it can remedy the situation • Pick a page to discard (possibly writing it to disk) • Load the page in from disk • Update the page table • Resume the program so the HW will retry and succeed! • A page fault incurs a huge miss penalty • Pages should be fairly large (e.g., 4KB) • Can handle the faults in software instead of hardware • A page fault causes a context switch • Using write-through is too expensive, so we use write-back

  10. Optimal Page Size • Minimize wasted storage • Small pages minimize internal fragmentation • Small pages increase the size of the page table • Minimize transfer time • Large pages (multiple disk sectors) amortize access cost • Sometimes transfer unnecessary info • Sometimes prefetch useful data • Sometimes discard useless data early • General trend toward larger pages because of • Big cheap RAM • Increasing memory / disk performance gap • Larger address spaces

  11. Translation Lookaside Buffer (TLB) • Page table resides in memory => each translation requires a memory access • TLB • Caches recently used PTEs • Speeds up translation • Typically 128 to 256 entries • Usually 4 to 8 way associative • TLB access time is comparable to L1 cache access time [Figure: the virtual address accesses the TLB; on a TLB hit the physical address is available, on a miss the page table is accessed]
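
A TLB behaves like a small set-associative cache of page table entries. The sketch below illustrates that idea only; the class name, the use of the low VPN bits as the set index, and LRU replacement within a set are assumptions made for the example, and `page_table` is a plain dictionary standing in for the in-memory page table.

```python
from collections import OrderedDict

class TLB:
    def __init__(self, num_sets=32, ways=4):       # 32 x 4 = 128 entries
        self.num_sets = num_sets
        self.ways = ways
        self.sets = [OrderedDict() for _ in range(num_sets)]  # tag -> frame
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn, page_table):
        index, tag = vpn % self.num_sets, vpn // self.num_sets
        entries = self.sets[index]
        if tag in entries:                 # TLB hit: no page table access
            self.hits += 1
            entries.move_to_end(tag)       # mark as most recently used
            return entries[tag]
        self.misses += 1
        frame = page_table[vpn]            # TLB miss: costs a memory access
        if len(entries) >= self.ways:
            entries.popitem(last=False)    # evict the least recently used way
        entries[tag] = frame
        return frame

tlb = TLB()
page_table = {7: 42}
tlb.lookup(7, page_table)
tlb.lookup(7, page_table)
print(tlb.hits, tlb.misses)  # 1 1
```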

  12. Making Address Translation Fast • TLB is a cache for recent address translations [Figure: TLB entries (valid, tag, physical page) sit in front of the page table (valid, physical page or disk address); the virtual page number is mapped to physical memory or to disk]

  13. TLB Access [Figure: set-associative TLB with ways 0 to 3. The virtual page number is split into Tag and Set#; the Set# selects a set, the Tag is compared against each way, and the Way MUX outputs the PTE and Hit/Miss. The Offset is not used in the lookup]

  14. Virtual Memory And Cache • TLB access is serial with cache access • Page table entries can be cached in L2 cache (as data) [Figure: flow chart. Virtual address => access TLB; on a TLB miss, access the page table (L2 cache, then memory); with the physical address, access the L1 cache, then the L2 cache, then memory => data]

  15. Overlapped TLB & Cache Access • Virtual memory view of a physical address: bits [29:12] physical page number, bits [11:0] page offset • Cache view of a physical address: bits [29:14] tag, bits [13:6] set, bits [5:0] disp • #Set is not contained within the Page Offset • The #Set is not known until the physical page number is known • Cache can be accessed only after address translation is done

  16. Overlapped TLB & Cache Access (cont) • Virtual memory view of a physical address: bits [29:12] physical page number, bits [11:0] page offset • Cache view of a physical address: bits [29:12] tag, bits [11:6] set, bits [5:0] disp • In the above example #Set is contained within the Page Offset • The #Set is known immediately • Cache can be accessed in parallel with address translation • Once translation is done, match upper bits with tags • Limitation: cache size ≤ (page size × associativity)
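
The limitation above follows from requiring that the set and disp bits together fit inside the untranslated page offset: the bytes covered by one way (cache size / associativity) must not exceed the page size. A minimal check, with a hypothetical function name and example sizes chosen for illustration:

```python
def can_overlap_tlb_and_cache(cache_bytes, ways, page_bytes):
    """True if the set index lies entirely within the page offset."""
    return cache_bytes <= page_bytes * ways

print(can_overlap_tlb_and_cache(16 * 1024, 4, 4096))  # True
print(can_overlap_tlb_and_cache(32 * 1024, 2, 4096))  # False
```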

  17. Overlapped TLB & Cache Access (cont) • Virtual memory view of a physical address: bits [29:12] physical page number, bits [11:0] page offset • Cache view: set in bits [13:6], disp in bits [5:0] • Assume 4K bytes per page => bits [11:0] are not translated • Assume the cache is 32K bytes, 2 way set-associative, 64 bytes/line • (2^15 / 2 ways) / (2^6 bytes/line) = 2^(15-1-6) = 2^8 = 256 sets • Physical_addr[13:12] may be different from virtual_addr[13:12] • The tag comprises bits [31:12] of the physical address • The tag may mismatch bits [13:12] of the physical address • Cache miss => allocate the missing line according to its virtual set address and physical tag
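
The arithmetic on this slide can be reproduced with a short helper. This is a sketch that assumes all sizes are powers of two; the function name and the returned tuple are illustrative.

```python
def cache_geometry(cache_bytes, ways, line_bytes, page_bytes):
    """Return (number of sets, list of set-index bits that are translated)."""
    sets = cache_bytes // ways // line_bytes
    disp_bits = line_bytes.bit_length() - 1      # log2(line size)
    set_bits = sets.bit_length() - 1             # log2(number of sets)
    page_bits = page_bytes.bit_length() - 1      # log2(page size)
    top_set_bit = disp_bits + set_bits - 1
    # Set-index bits at or above the page offset width go through translation.
    translated = list(range(page_bits, top_set_bit + 1))
    return sets, translated

print(cache_geometry(32 * 1024, 2, 64, 4096))  # (256, [12, 13])
```

For the slide's cache, bits 12 and 13 of the set index come from the virtual address, which is why the tag has to cover them as well.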

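  18. Overlapped TLB & Cache Access (cont) [Figure: the virtual page number (Tag, Set#) accesses the TLB while the set and disp bits of the page offset access the cache in parallel. The TLB Way MUX outputs the physical page number and Hit/Miss; the physical page number is then compared with the cache tags, and the cache Way MUX outputs Data and Hit/Miss]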

  19. More On Page Swap-out • DMA copies the page to the disk controller • Reads each byte: • Executes snoop-invalidate for each byte in the cache (both L1 and L2) • If the byte resides in the cache: • If it is modified, writes its line from the cache to memory • Invalidates the line • Writes the byte to the disk controller • This means that when a page is swapped out of memory • All data in the caches that belongs to that page is invalidated • The page on the disk is up-to-date • The TLB is snooped • If the TLB hits for the swapped-out page, the TLB entry is invalidated • In the page table • The valid bit in the PTE of the swapped-out page is set to 0 • All the rest of the bits in the PTE may be used by the operating system for keeping the location of the page on the disk

  20. Context Switch • Each process has its own address space • Each process has its own page table • The OS allocates frames in physical memory to each process and updates the page table of each process • A process cannot access physical memory allocated to another process • Unless the OS deliberately allocates the same physical frame to 2 processes (for memory sharing) • On a context switch • Save the current architectural state to memory • Architectural registers • Register that holds the page table base address in memory • Flush the TLB • Load the new architectural state from memory • Architectural registers • Register that holds the page table base address in memory

  21. VM in VAX: Address Format • Virtual address: bits [31:30] select the space, bits [29:9] virtual page number, bits [8:0] page offset • 00 - P0 process space (code and data) • 01 - P1 process space (stack) • 10 - S0 system space • 11 - S1 • Physical address: bits [29:9] physical frame number, bits [8:0] page offset • Page size: 2^9 bytes = 512 bytes

  22. VM in VAX: Virtual Address Spaces • P0: process code & global vars, starts at 0, grows upward • P1: process stack & local vars, ends at 7FFFFFFF, grows downward • S0: system space, starts at 80000000, grows upward, generally static [Figure: Process0 to Process3, each with P0 and P1 spaces, above the S0 system space]

  23. Page Table Entry (PTE) • Fields, from bit 31 down to bit 0: V (valid bit), PROT (4 protection bits), M (modified bit), Z (indicates if the line was cleaned (zero)), OWN (3 ownership bits), S, S, Physical Frame Number (bits [20:0]) • Valid bit = 1 if the page is mapped to main memory, otherwise page fault: • Page is in the disk swap area • Address indicates the page location on the disk

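  24. System Space Address Translation • Virtual address: bits [31:30] = 10, bits [29:9] VPN, bits [8:0] page offset • PTE physical address = SBR (system page table base physical address) + VPN with 00 appended (VPN × 4) • Get the PTE and take the PFN from it • Physical address: bits [29:9] PFN, bits [8:0] page offset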

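  25. System Space Address Translation [Figure: the VPN of a system space address (top bits 10) is multiplied by 4 and added to SBR to locate the PTE; the PFN read from the PTE is concatenated with the 9-bit offset to form the physical address]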
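
The system space translation can be sketched as below. Assumptions made for the example: physical memory is modeled as a dictionary `mem` mapping word addresses to 32-bit PTE values, the valid bit is bit 31 and the PFN is bits [20:0] as on the PTE slide, and protection checks are omitted. The function name and the sample addresses are hypothetical.

```python
PAGE_BITS = 9                          # 512-byte pages
OFFSET_MASK = (1 << PAGE_BITS) - 1
FIELD_MASK = (1 << 21) - 1             # 21-bit VPN and PFN fields

def translate_system(va, sbr, mem):
    """Translate an S0 virtual address to a physical address."""
    if (va >> 30) != 0b10:
        raise ValueError("not a system space (S0) address")
    vpn = (va >> PAGE_BITS) & FIELD_MASK
    pte = mem[sbr + 4 * vpn]           # PTE physical address = SBR + VPN*4
    if not (pte >> 31) & 1:
        raise LookupError("page fault: valid bit is 0")
    pfn = pte & FIELD_MASK
    return (pfn << PAGE_BITS) | (va & OFFSET_MASK)

sbr = 0x1000
mem = {sbr + 4 * 3: (1 << 31) | 0x55}  # PTE for VPN 3 maps to PFN 0x55
print(hex(translate_system(0x8000061F, sbr, mem)))  # 0xaa1f
```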

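  26. P0 Space Address Translation • Virtual address: bits [31:30] = 00, bits [29:9] VPN, bits [8:0] page offset • PTE S0 space virtual address = P0BR (P0 page table base virtual address) + VPN with 00 appended (VPN × 4) • Get the PTE using the system space translation algorithm • Take the PFN from the PTE • Physical address: bits [29:9] PFN, bits [8:0] page offset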

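  27. P0 Space Address Translation (cont) [Figure: P0BR + VPN*4 gives an S0 virtual address (top bits 10) made of VPN' and Offset'; SBR + VPN'*4 locates the system PTE holding PFN'; PFN' concatenated with Offset' is the physical address of the process PTE; the PFN from that PTE concatenated with the original offset is the final physical address]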
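
The two-step P0 translation can be sketched as follows. As a simplification, memory is a dictionary `mem` of 32-bit PTE values keyed by physical address, valid and protection bits are ignored, and all names and sample values are hypothetical.

```python
PAGE_BITS = 9
OFFSET_MASK = (1 << PAGE_BITS) - 1
FIELD_MASK = (1 << 21) - 1             # 21-bit VPN and PFN fields

def system_translate(va, sbr, mem):
    """S0 virtual address -> physical address via the system page table."""
    vpn = (va >> PAGE_BITS) & FIELD_MASK
    pfn = mem[sbr + 4 * vpn] & FIELD_MASK
    return (pfn << PAGE_BITS) | (va & OFFSET_MASK)

def p0_translate(va, p0br, sbr, mem):
    """P0 virtual address -> physical address (two page table accesses)."""
    vpn = (va >> PAGE_BITS) & FIELD_MASK
    pte_va = p0br + 4 * vpn                      # PTE address in S0 space
    pte_pa = system_translate(pte_va, sbr, mem)  # first translation
    pfn = mem[pte_pa] & FIELD_MASK               # read the process PTE
    return (pfn << PAGE_BITS) | (va & OFFSET_MASK)

sbr, p0br = 0x2000, 0x80000400
mem = {
    0x2008: (1 << 31) | 0x30,   # system PTE for VPN' = 2
    0x6014: (1 << 31) | 0x77,   # process PTE for VPN = 5
}
print(hex(p0_translate(0x00000A05, p0br, sbr, mem)))  # 0xee05
```

Without a TLB, every P0 access would cost two extra memory reads, which is what the next slide's flow chart avoids.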

  28. P0 Space Address Translation Using TLB • Access the process TLB with the VPN of the P0 address (top bits 00) • Process TLB hit: get the PTE of the requested page from the process TLB • Process TLB miss: calculate the PTE virtual address (in S0): P0BR+4*VPN, then access the system TLB • System TLB hit: get the PTE from the system TLB • System TLB miss: access the system page table at SBR+4*VPN(PTE) • With the resulting PFN, access memory to get the PTE of the requested page from the process page table • With the PFN of the requested page, calculate the physical address and access memory

  29. Paging in x86 • 2-level hierarchical mapping • Page directory and page tables • All pages and page tables are 4K • Linear address divided into: • Dir: 10 bits (bits [31:22]) • Table: 10 bits (bits [21:12]) • Offset: 12 bits (bits [11:0]) • Dir/Table serves as an index into a page table • Offset serves as a pointer into a data page • A page entry points to a page table or a page • Performance issues: TLB [Figure: CR3 points to the 4K page directory; DIR selects a directory entry, which points to a page table; TABLE selects a page table entry, which points to a 4K page frame; OFFSET selects the operand]
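
The two-level walk can be sketched as below. The sketch assumes 4-byte entries (1024 entries per 4K table), a present bit in bit 0, and a dictionary `mem` standing in for physical memory; the function names and sample values are hypothetical, and access rights are not checked.

```python
def split_linear(addr):
    """Return (dir index, table index, offset) of a 32-bit linear address."""
    return (addr >> 22) & 0x3FF, (addr >> 12) & 0x3FF, addr & 0xFFF

def walk(addr, cr3, mem):
    """Translate a linear address with a two-level page walk."""
    d, t, offset = split_linear(addr)
    pde = mem[(cr3 & ~0xFFF) + 4 * d]        # page directory entry
    if not pde & 1:
        raise LookupError("page fault: page table not present")
    pte = mem[(pde & ~0xFFF) + 4 * t]        # page table entry
    if not pte & 1:
        raise LookupError("page fault: page not present")
    return (pte & ~0xFFF) | offset

mem = {0x1004: 0x2000 | 1,     # directory entry 1 -> page table at 0x2000
       0x200C: 0x5000 | 1}     # table entry 3 -> page frame at 0x5000
print(hex(walk(0x00403025, 0x1000, mem)))  # 0x5025
```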

  30. x86 Page Translation Mechanism • CR3 points to the current page directory (may be changed per process) • Usually, a page directory entry (covers 4MB) points to a page table that covers data of the same type/usage • Can allocate different physical pages for the same linear address (e.g. 2 copies of the same code) • Sharing can alias pages from different processes to the same physical page (e.g., OS) [Figure: CR3 points to the page directory; DIR, TABLE and OFFSET select the directory entry, the page table entry and the byte within a 4K page; physical memory holds code, data, stack, OS, and the page tables]

  31. x86 Page Entry Format • Page Dir Entry: bits [31:12] Page Frame Address, bits [11:9] AVAIL (available for OS use), bit 8 = 0, bit 7 = Page Size (0: 4 Kbyte), bit 6 = 0, bit 5 = A (Accessed), bit 4 = PCD (Cache Disable), bit 3 = PWT (Write-Through), bit 2 = U (User), bit 1 = W (Writable), bit 0 = P (Present) • Page Table Entry: bits [31:12] Page Frame Address, bits [11:9] AVAIL (available for OS use), bits [8:7] = 0, bit 6 = D (Dirty), bit 5 = A (Accessed), bit 4 = PCD, bit 3 = PWT, bit 2 = U, bit 1 = W, bit 0 = P • Bits shown as 0 are reserved by Intel for future use (should be zero) • Figure 11-14. Format of Page Directory and Page Table Entries for 4K Pages • 20 bit pointer to a 4K aligned address • 12 bits of flags • Virtual memory • Present • Accessed, Dirty • Protection • Writable (R#/W) • User (U/S#) • 2 levels/type only • Caching • Page WT • Page Cache Disabled • 3 bits for OS usage
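
The page table entry layout above can be decoded with a few shifts and masks. This is an illustrative sketch; the `FLAGS` table and `decode_pte` are hypothetical names, and the bit positions are the ones listed on the slide.

```python
# Flag name -> bit position in a page table entry (4K pages).
FLAGS = {"P": 0, "W": 1, "U": 2, "PWT": 3, "PCD": 4, "A": 5, "D": 6}

def decode_pte(entry):
    """Split a 32-bit page table entry into its fields."""
    fields = {name: (entry >> bit) & 1 for name, bit in FLAGS.items()}
    fields["AVAIL"] = (entry >> 9) & 0x7             # 3 bits for OS use
    fields["frame_address"] = entry & 0xFFFFF000     # 4K aligned address
    return fields

pte = decode_pte(0x12345067)
print(hex(pte["frame_address"]), pte["P"], pte["W"], pte["A"], pte["D"])
# 0x12345000 1 1 1 1
```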

  32. x86 Paging – Virtual memory • A page can be • Not yet loaded • Loaded • On disk • A loaded page can be • Dirty • Clean • When a page is not loaded (P bit clear) => Page fault occurs • It may require throwing out a loaded page to insert the new one • The OS prioritizes throwing out by LRU and dirty/clean/avail bits • A dirty page should be written to disk; a clean one need not be • The new page is either loaded from disk or “initialized” • The CPU sets the page “accessed” flag when the page is accessed and “dirty” when it is written

  33. Virtually-Addressed Cache • Cache uses virtual addresses (tags are virtual) • Only requires address translation on a cache miss • TLB not in path to cache hit • Aliasing: 2 different virtual addresses mapped to the same physical address • Two different cache entries holding data for the same physical address • Must update all cache entries with the same physical address [Figure: the CPU sends the VA to the cache; a hit returns data directly, a miss goes through translation to a PA for main memory]

  34. Virtually-Addressed Cache (cont) • Cache must be flushed at task switch • Solution: include process ID (PID) in tag • How to share memory among processes • Permit multiple virtual pages to refer to the same physical frame • Problem: incoherence if they point to different physical pages • Solution: require sufficiently many common virtual LSBs • With a direct mapped cache, it is guaranteed that they all point to the same physical page

  35. Backup

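  36. Inverted Page Tables • IBM System 38 (AS400) implements 64-bit addresses • 48 bits translated • Start of object contains a 12-bit tag => TLBs or virtually addressed caches are critical [Figure: the virtual page is hashed to find a table entry holding (V.Page, P.Frame); the stored V.Page is compared with the requested virtual page]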

  37. Hardware / Software Boundary • What aspects of the Virtual → Physical Translation are determined in hardware? • TLB Format • Type of Page Table • Page Table Entry Format • Disk Placement • Paging Policy

  38. Why virtual memory? • Generality • ability to run programs larger than size of physical memory • Storage management • allocation/deallocation of variable sized blocks is costly and leads to (external) fragmentation • Protection • regions of the address space can be R/O, Ex, . . . • Flexibility • portions of a program can be placed anywhere, without relocation • Storage efficiency • retain only most important portions of the program in memory • Concurrent I/O • execute other processes while loading/dumping page • Expandability • can leave room in virtual address space for objects to grow. • Performance

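  39. Address Translation with a TLB [Figure: the virtual address is split into virtual page number (bits n–1 to p) and page offset (bits p–1 to 0); the TLB (valid, tag, physical page number) is searched by tag, giving a TLB hit and the physical address; the physical address is split into tag, index and byte offset; the cache (valid, tag, data) is indexed and its tag compared, giving a cache hit and the data]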
