
Lecture 20: Main Memory





  1. Lecture 20: Main Memory • Today: SRAM vs. DRAM • Technology Trends • Advanced DRAM organizations

  2. Typical Memory Technologies • Register File: integrated into the CPU, fast, many ports • Caches (SRAM): on-chip L1, off-chip L2, more bits, fewer ports • Main Memory (DRAM): very dense, slower to access, one port [figure: register file, SRAM cell, and DRAM cell with their port structures]

  3. SRAM Organization

  4. SRAM Timing [timing diagram: Address, CE, OE, Dout; Dout shows valid data after the access time]

  5. Basic DRAM Architecture

  6. DRAM Access Time [timing diagram: RAS then CAS; the address pins carry the row address, then the column address; Dout shows valid data after CAS]

  7. SRAM vs. DRAM • SRAM: 4 Mbit capacity, 64-bit data interface, 15-20 ns access time (50-80 MHz), storage cells are self-restoring, lower power, about 4x the cost per bit of DRAM • DRAM: 64 Mbit capacity, 16-bit interfaces, 60 ns read access time (16 MHz), reads destroy the data so it must be written back, needs periodic refresh, higher power, lower cost per bit
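A quick arithmetic check of the figures above (not part of the original slides): converting the quoted access times into peak random-access rates. The 15 ns and 60 ns values come from the slide; everything else is plain arithmetic.

```c
#include <stdio.h>

int main(void) {
    double sram_ns = 15.0;   /* fast end of the 15-20 ns range quoted above */
    double dram_ns = 60.0;   /* DRAM read access time quoted above */

    printf("SRAM: %.0f ns -> %.1f MHz peak random-access rate\n", sram_ns, 1000.0 / sram_ns);
    printf("DRAM: %.0f ns -> %.1f MHz peak random-access rate\n", dram_ns, 1000.0 / dram_ns);
    printf("DRAM is %.1fx slower per random access\n", dram_ns / sram_ns);
    return 0;
}
```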

  8. Technology Trends

  9. Improving External Memory System Performance • Bandwidth vs. Latency • Bandwidth = #bits transferred per cycle • Latency = time to access DRAM • Bandwidth • Memory bus width (16, 32, 64) • Address interleaving • Independent Memory Banks • Latency • Synchronous DRAM access modes • Faster interface (Rambus)
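To see why both numbers matter, here is a minimal sketch (illustrative figures, not from the slides) of how latency and bandwidth combine when fetching one cache line: total time is roughly the access latency plus the line size divided by the bus bandwidth.

```c
#include <stdio.h>

int main(void) {
    double latency_ns = 60.0;   /* DRAM access latency (illustrative) */
    double bus_bytes  = 8.0;    /* 64-bit memory bus */
    double bus_mhz    = 100.0;  /* bus clock (illustrative) */
    double line_bytes = 32.0;   /* cache line size (illustrative) */

    double transfer_ns = (line_bytes / bus_bytes) * (1000.0 / bus_mhz);
    printf("line fill = %.0f ns latency + %.0f ns transfer = %.0f ns\n",
           latency_ns, transfer_ns, latency_ns + transfer_ns);
    return 0;
}
```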

  10. Memory Bus Width • Depends on microprocessor implementation • 386 = 32 bit external data bus • 386SX = 16 bit external data bus • Today = 64 bit data busses common, 128 bit soon • Also interacts with external DRAM organization

  11. Main Memory Organization (widen data busses) • 64 Mbit DRAM (= 8 MB) • What if we want a 64 MB memory system? [figure: candidate organizations labeled 4M x 4 x 4, 2M x 8 x 4, and 1M x 16 x 4]
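A small sketch of the chip-count arithmetic, assuming the 8 MB (64 Mbit) parts are available in x4, x8, and x16 widths; the widths and the 64-bit bus are illustrative assumptions, not values from the slide.

```c
#include <stdio.h>

int main(void) {
    int system_mb = 64, chip_mb = 8, bus_bits = 64;
    int widths[] = {4, 8, 16};   /* assumed part widths */

    printf("capacity alone needs %d chips\n", system_mb / chip_mb);
    for (int i = 0; i < 3; i++)
        printf("x%-2d parts: %d chips side by side on a %d-bit bus\n",
               widths[i], bus_bits / widths[i], bus_bits);
    return 0;
}
```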

  12. Multiple Memory Banks (interleaving) • Distribute the memory address space across memory banks • Route requests to banks based on low block address bits • Allows memory accesses to proceed in parallel • Two key issues: how are replies matched up with requesters, and how do we avoid bank conflicts? [figure: address split into Bank / Word / Offset fields (two orderings shown); four banks taking addresses A0-A3 and returning D0-D3]
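As a concrete illustration of routing by low block address bits, here is a minimal sketch; the block size (32 bytes) and bank count (4) are assumptions for the example, not values from the slide.

```c
#include <stdio.h>
#include <stdint.h>

#define OFFSET_BITS 5   /* 32-byte blocks (assumed) */
#define BANK_BITS   2   /* 4 banks (assumed) */

/* Bank index = the address bits just above the within-block offset. */
static unsigned bank_of(uint32_t addr) {
    return (addr >> OFFSET_BITS) & ((1u << BANK_BITS) - 1);
}

int main(void) {
    for (uint32_t a = 0; a < 8 * 32; a += 32)   /* 8 consecutive blocks */
        printf("block at 0x%03x -> bank %u\n", (unsigned)a, bank_of(a));
    return 0;
}
```

Consecutive blocks rotate through banks 0-3, which is what lets back-to-back accesses proceed in parallel.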

  13. Interleaved Memory Organization [block diagram: CPU & cache, bank select logic, and a latch or queue in front of each memory bank]

  14. Bank Conflicts • Accesses may not reference banks evenly • Consider addresses 0,1,2,3,… vs. 0,8,16,24,… • Often caused by column access to a matrix; large block sizes cause problems too • Solutions: don't do that (make the number of columns in the matrix not a power of 2); use a prime number of banks (with b banks and stride s, the number of active banks is b / gcd(s, b)); hash the banks
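The b / gcd(s, b) rule is easy to check numerically. The sketch below compares 8 banks (a power of two) with 7 banks (prime); the specific strides are illustrative.

```c
#include <stdio.h>

static int gcd(int a, int b) { return b ? gcd(b, a % b) : a; }

int main(void) {
    int banks[]   = {8, 7};         /* power-of-two vs. prime bank count */
    int strides[] = {1, 2, 4, 8};   /* illustrative access strides */

    for (int bi = 0; bi < 2; bi++)
        for (int si = 0; si < 4; si++) {
            int b = banks[bi], s = strides[si];
            printf("b=%d, stride=%d -> %d active banks\n", b, s, b / gcd(s, b));
        }
    return 0;
}
```

With 8 banks a stride of 8 hits a single bank, while with 7 banks every stride shown keeps all 7 banks active.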

  15. Synchronous DRAM • Interface signals are clocked • Clock provided by the microprocessor • Why? Easier to design timed protocols, e.g. “Data available 8 cycles after CAS” • Add intelligence to the EMI (external memory interface) on the CPU
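Since latency is now counted in bus cycles, converting it back to time needs a clock rate. A tiny sketch, assuming a 100 MHz SDRAM clock (an assumption; only the "8 cycles after CAS" figure comes from the slide).

```c
#include <stdio.h>

int main(void) {
    double clock_mhz = 100.0;   /* assumed SDRAM bus clock */
    int cas_cycles   = 8;       /* "data available 8 cycles after CAS" */

    printf("CAS latency = %d cycles = %.0f ns at %.0f MHz\n",
           cas_cycles, cas_cycles * 1000.0 / clock_mhz, clock_mhz);
    return 0;
}
```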

  16. Burst Mode • Provide one address and a sequence of data comes out • Perfect for cache line reads and writes • Burst size is programmable [timing diagram: RAS, CAS; Address = Row then Column; Dout = D0, D1, D2, D3]
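A minimal software model of a burst read, useful for seeing why it matches cache line fills: one row/column address is presented, then BURST_LEN sequential words come out. The array dimensions and burst length are illustrative assumptions.

```c
#include <stdio.h>
#include <stdint.h>

#define BURST_LEN 4   /* assumed programmable burst size */

static uint32_t dram[16][16];   /* pretend DRAM array: rows x columns of words */

static void burst_read(int row, int col, uint32_t *dst) {
    for (int i = 0; i < BURST_LEN; i++)   /* column counter auto-increments */
        dst[i] = dram[row][col + i];      /* one word per clock after CAS */
}

int main(void) {
    for (int c = 0; c < 16; c++) dram[3][c] = 0x100 + c;

    uint32_t line[BURST_LEN];
    burst_read(3, 4, line);               /* e.g. filling one cache line */
    for (int i = 0; i < BURST_LEN; i++)
        printf("D%d = 0x%x\n", i, (unsigned)line[i]);
    return 0;
}
```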

  17. Page Mode Access • One RAS (gets the whole row) • Multiple CAS accesses (different parts of the row) • Exploits spatial locality (acts a bit like a cache inside the DRAM) [timing diagram: RAS held while CAS repeats; Address = Row, ColB, ColC; Dout returns B0-B3 and C0-C1]
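A sketch of the row-buffer behavior page mode exploits: accesses to the currently open row pay only a CAS, while a different row pays the full RAS + CAS. The 60 ns / 25 ns costs are illustrative, not from the slide.

```c
#include <stdio.h>

static int open_row   = -1;   /* row currently held in the row buffer */
static int total_cost = 0;    /* running time in ns (illustrative costs) */

static void dram_access(int row, int col) {
    if (row != open_row) { open_row = row; total_cost += 60; }  /* RAS + CAS */
    else                 { total_cost += 25; }                  /* CAS only  */
    printf("row %d col %d  (running cost %d ns)\n", row, col, total_cost);
}

int main(void) {
    dram_access(5, 0); dram_access(5, 1); dram_access(5, 2);  /* row hits  */
    dram_access(9, 0);                                        /* new row   */
    return 0;
}
```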

  18. Pipelined Mode Access • Interleave accesses to multiple internal banks • Lower latency for back-to-back accesses to different banks [timing diagram: RAS/CAS for banks B and C overlapped; Address = RowB, ColB, RowC, ColC; Dout interleaves B and C data]

  19. New DRAM Interfaces • Rambus • 800 MHz interface (18 bits gets you 14.4 Gb/sec) • Compare this to 100 MHz, 16-bit SDRAM = 1.6 Gb/sec • More complicated electrical interface on both DRAM and CPU • Restrictions on board-level design
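The bandwidth numbers on this slide are just interface width times clock rate; a two-line check using the slide's own figures:

```c
#include <stdio.h>

/* Peak bandwidth in Gb/sec = width (bits) x clock (MHz) / 1000. */
static double peak_gbits(double width_bits, double clock_mhz) {
    return width_bits * clock_mhz / 1000.0;
}

int main(void) {
    printf("Rambus: 18 bits @ 800 MHz = %.1f Gb/sec\n", peak_gbits(18, 800));   /* 14.4 */
    printf("SDRAM : 16 bits @ 100 MHz = %.1f Gb/sec\n", peak_gbits(16, 100));   /*  1.6 */
    return 0;
}
```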

  20. Next Time • Virtual Memory • Allow multiple users with protection • Enable relocation of data • Extend memory hierarchy automatically past DRAM
