
Advanced Micro Devices - Athlon

Buddy Guest, Mike Lewitt, Bill McCorkle. November 28, 2001.


Presentation Transcript


  1. Advanced Micro Devices - Athlon • Buddy Guest • Mike Lewitt • Bill McCorkle November 28, 2001

  2. Where is the Competition? What Have We Seen So Far? IA-64 IA-32 RISC

  3. Overview of Today’s Events • Company History • Differences in AMD Athlon Architecture • System Bus • Macro vs. Micro Operations • Floating Point Operations • Branch Prediction • Memory Management • Comparing Processor Performance

  4. AMD
     • May 1, 1969 – Founded (semiconductor company)
     • 1975 – 8080A and AM2900
     • 1976 – Signs cross-licensing agreement
     • 1987 – AMD & Intel go to court
     • 1991 – AM386 (breaks Intel monopoly)
     • 1992 – Court awards AMD full rights to produce the AM386 processor
     • 1993 – AM486
     • 1997 – AMD-K6
     • 1999 – Athlon, the first 7th-generation processor
     Intel
     • July 18, 1968 – Founded (semiconductor memory)
     • 1971 – 4004 introduced
     • 1972 – 8008 introduced
     • 1976 – Signs cross-licensing agreement
     • 1978 – 16-bit 8086
     • 1982 – 286 (on-board memory)
     • 1985 – 32-bit 386
     • 1989 – 486
     • 1993 – Pentium
     • 1997 – Pentium II
     • 1998 – Celeron

  5. Architecture Summary • AMD Approach • Balanced approach: optimize per-clock processor performance (IPC) while also improving the operating frequency. • Intel Approach • Increased pipeline depth to handle more instructions, which reduced per-clock processor performance (IPC). • Solution: compensate with a much higher frequency to stay competitive.

  6. Architecture Summary • Overall Improvement to Performance • Frequency Improvements • Smaller Geometries • Faster Transistors (“process shrinks”) • Deeper Pipelines • Fewer Gates Per Clock Cycle • Work Per Clock Improvements • Superscalar Architectures • Dynamic Instruction Schedulers • Larger On-Chip Caches • Advanced Branch Prediction

  7. Architecture Summary • Clock Speed / EV6 Bus • Designed with very high clock speeds in mind • K7 has very deep buffers to enable those high clock speeds, allowing up to 72 x86 instructions in flight. • Bus transfers data on both the rising and the falling clock edge • 100 MHz clock → 200 MHz effective processor bus • 133 MHz clock → 266 MHz effective processor bus • AMD vs. Intel comparing same clock

  8. Architecture Summary • EV6 Bus on AMD Athlon • Scalable up to a 200 MHz clock, yielding an effective frequency of 400 MHz • Multiprocessor support • Highest bus bandwidth (1.60 GB/s at 200 MHz effective) • Intel uses a 133 MHz bus (1.06 GB/s)
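
The bandwidth figures on this slide come from simple arithmetic: bytes per transfer multiplied by transfers per second. The following is a minimal sketch, assuming a 64-bit (8-byte) data path on both buses, which the slides do not state; the function name is illustrative only.

```python
def peak_bandwidth_gb_s(clock_mhz: float, transfers_per_clock: int,
                        bus_width_bits: int = 64) -> float:
    """Peak bus bandwidth in GB/s, with 1 GB taken as 10**9 bytes."""
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# EV6: data moves on both clock edges, so 2 transfers per clock (assumed 64-bit).
ev6_100 = peak_bandwidth_gb_s(100, 2)
# Single-data-rate 133 MHz front-side bus (assumed 64-bit).
fsb_133 = peak_bandwidth_gb_s(133, 1)

print(f"EV6 at 100 MHz clock: {ev6_100:.2f} GB/s")
print(f"FSB at 133 MHz clock: {fsb_133:.2f} GB/s")
```

Under these assumptions the 100 MHz EV6 bus reaches 1.60 GB/s and the 133 MHz bus roughly 1.06 GB/s; published figures can differ slightly depending on rounding and unit conventions.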

  9. AMD Athlon PIII

  10. Architecture Summary • Instruction Control Unit • Holds 72 MOps Before Assignment (MOp = x86 instruction, therefore Athlon can have 72 “in-flight” instructions) • P6 Only Holds 13 in-flight MOps

  11. Architecture Summary • Execution Ports • AMD has no fewer than 9 • Intel has 5 • 2 dedicated to memory stores • Enhanced parallelism inside Athlon

  12. Micro-OPs / Macro-OPs • Athlon has 3 parallel x86 instruction decoders that translate x86 instructions into Macro-Ops for the 72-entry ICU • Uses 2 decoding pipelines (Intel uses 1) • Direct path: decodes common instructions • Vector path: decodes complex x86 instructions • Integer scheduler holds a maximum of 15 Macro-Ops, representing up to 30 ops at a time • Feeds 3 parallel integer execution units

  13. Micro-OPs / Macro-OPs • Athlon Decoders 3-Way Instruction • Has 3 parallel decoding units • Can handle any combination of instructions with any of its decoders, all of which are “fully capable” • Handles Complex and Simple Instructions • Intel Decoders • Has 3 parallel decoding units • 1 Complex • 2 Simple • Handles Complex / Simple / Simple
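
The difference between three fully capable decoders and a complex/simple/simple arrangement can be illustrated with a toy model. This sketch takes the slide's description at face value and makes several assumptions that are not in the slides: decoding is strictly in program order, every instruction is either simple ("S") or complex ("C"), and each decoder handles one instruction per cycle. It is not a model of either real processor.

```python
from typing import List


def cycles_fully_capable(instrs: List[str], width: int = 3) -> int:
    """Every decoder accepts any instruction: up to `width` per cycle."""
    return -(-len(instrs) // width)  # ceiling division


def cycles_complex_simple_simple(instrs: List[str]) -> int:
    """In-order decode: slot 0 takes any instruction, slots 1 and 2 only simple ones."""
    cycles = 0
    i = 0
    while i < len(instrs):
        cycles += 1
        i += 1  # slot 0 accepts either "C" or "S"
        for _ in range(2):  # slots 1 and 2
            if i < len(instrs) and instrs[i] == "S":
                i += 1
            else:
                break  # a complex instruction has to wait for slot 0 next cycle
    return cycles


# Hypothetical instruction mix, chosen only for illustration.
stream = ["S", "C", "S", "C", "C", "S"]
print(cycles_fully_capable(stream))          # 2
print(cycles_complex_simple_simple(stream))  # 4
```

In this model the restricted arrangement loses throughput whenever complex instructions cluster together, while a stream of only simple instructions decodes equally fast on both.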

  14. 3DNOW! • MMX developed when FPUs were not as important • Every 4-wide Intel SSE instruction is actually 2 Athlon micro-ops • *AMD takes advantage of the rising edge as well as the falling edge • **SSE cannot be used with MMX registers

  15. 3DNOW! Each pipeline can do any instruction above. The second pipeline can do any instruction in any group except the group the first pipeline has chosen.

  16. 3DNOW! • Conclusion of 3DNOW! vs. SSE • Both have pairing restrictions • SSE is a separate unit → implementation more difficult → program with more freedom • MMX add and prefetch instructions slightly better for SSE • Final Conclusion: DRAW

  17. Full Architecture views AMD Athlon PIII

  18. Looking at the ALUs

  19. Floating Point Operations • Fully pipelined FPU • 3 parallel floating-point execution units, each on its own port • Pentium III also has 3, but they sit behind a single port • FPU can execute two 80-bit extended ops • Intel can currently only execute one

  20. Pipelining Differences • Determining the length • Execution rate of pipeline (ALU) • Degree of Parallelism (AMD-Athlon)

  21. Branch Prediction • Example: if (x > 0) { a = 0; b = 1; c = 2; } d = 3; • Diagram panels: “When x > 0”, “When x < 0”, “Predicting x < 0”
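
How a predictor learns the outcome of a test such as `x > 0` can be sketched with a two-bit saturating counter. This is a common textbook scheme used here purely for illustration; the slides do not say which scheme either processor uses, and the class name, the initial state, and the input values are assumptions.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: states 0-1 predict False, states 2-3 predict True."""

    def __init__(self, state: int = 1):
        self.state = state

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, outcome: bool) -> None:
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if outcome:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes) -> float:
    """Fraction of outcomes the predictor guessed correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for outcome in outcomes:
        if predictor.predict() == outcome:
            correct += 1
        predictor.update(outcome)
    return correct / len(outcomes)


# Hypothetical values of x over ten executions of the slide's `if (x > 0)`.
xs = [5, 3, 8, 1, -2, 4, 7, 9, 6, 2]
history = [x > 0 for x in xs]
print(accuracy(history))  # 0.8
```

The predictor misses the first execution (it starts out predicting `False`) and the single `x < 0` case, but the one exception does not flip its prediction for the following iterations, which is the point of using two bits rather than one.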

  22. Branch Prediction • AMD Athlon • Branch Target Buffer size of 2048 entries • Branch History Table can store 4096 entries • Intel Pentium III • Dynamic Branch Predictor can store 512 entries • Approximate Correct Branch Predictions • AMD Athlon: 95% • Intel Pentium III: 90-92%
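
A few percentage points of prediction accuracy matter because every misprediction costs a pipeline refill. The sketch below computes the expected number of cycles lost per branch as (1 - accuracy) times the misprediction penalty. The accuracies are the approximate figures from the slide; the 10-cycle penalty is an assumed value for illustration and does not come from the slides.

```python
def wasted_cycles_per_branch(accuracy: float, mispredict_penalty_cycles: float) -> float:
    """Expected cycles lost per branch due to mispredictions."""
    return (1.0 - accuracy) * mispredict_penalty_cycles


PENALTY_CYCLES = 10  # assumed penalty, illustrative only

for label, acc in [("Athlon (95%)", 0.95),
                   ("Pentium III (92%)", 0.92),
                   ("Pentium III (90%)", 0.90)]:
    lost = wasted_cycles_per_branch(acc, PENALTY_CYCLES)
    print(f"{label}: {lost:.2f} cycles lost per branch")
```

With the same assumed penalty, moving from 90% to 95% accuracy halves the cycles lost to mispredictions; a deeper pipeline with a larger penalty would widen the gap further.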

  23. Memory Management • Level 2 Cache • 512 KB to 8 MB • Runs at 1/3, 1/2, 2/3, or 1/1 of the core clock frequency • External to the CPU (weakness of Athlon) • Intel L2: 256 KB ‘on-die’ • Intel moving away from Slot 1 and back to socket • AMD will need to move to ‘on-die’ and socket connections to stay competitive • Main push towards 0.18 µm process • Level 1 Cache • 64 KB data and instruction caches (4x Pentium III) • Scalability
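
The L2 ratios on this slide translate directly into cache clock speeds. A small sketch of that arithmetic follows; the 600 MHz core clock is a hypothetical value chosen so that every ratio yields a whole number, and the names are illustrative.

```python
from fractions import Fraction

# L2-to-core clock ratios listed on the slide.
L2_RATIOS = [Fraction(1, 3), Fraction(1, 2), Fraction(2, 3), Fraction(1, 1)]


def l2_clock_mhz(core_mhz: int, ratio: Fraction) -> float:
    """L2 cache clock for a given core clock and L2-to-core ratio."""
    return float(core_mhz * ratio)


CORE_MHZ = 600  # hypothetical core clock

for ratio in L2_RATIOS:
    print(f"L2 at {ratio} of a {CORE_MHZ} MHz core: {l2_clock_mhz(CORE_MHZ, ratio):.0f} MHz")
```

An external L2 running at a fraction of the core clock is slower than an on-die L2 running at full speed, which is why the slide lists the external cache as a weakness.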

  24. Which One Is Better? • In the past (286, 386, 486) • Performance = Frequency • In Today’s World • Performance = IPC * Frequency • How else do we compare? • Benchmarking
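
The relation Performance = IPC * Frequency shows why clock speed alone no longer ranks processors. In the sketch below, both IPC values and both clock speeds are hypothetical numbers picked for illustration, not measurements of any real chip.

```python
def performance_mips(ipc: float, frequency_mhz: float) -> float:
    """Millions of instructions per second: instructions per clock times clocks per second."""
    return ipc * frequency_mhz


# Hypothetical design points.
higher_ipc = performance_mips(ipc=1.5, frequency_mhz=1000)
higher_clock = performance_mips(ipc=1.0, frequency_mhz=1500)

print(f"Higher-IPC design:   {higher_ipc:.0f} MIPS")
print(f"Higher-clock design: {higher_clock:.0f} MIPS")
```

The two hypothetical designs deliver the same throughput despite a 50% difference in clock frequency. Since IPC varies with the workload, comparisons in practice rely on benchmarks.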

  25. Benchmarking • Software that performs different tasks so that processors can be compared. • Problems: • Differing processor frequencies. • Other processes already running. • Types of programs • Some programs are written to take advantage of a particular architecture.
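
A minimal timing harness shows how a benchmark can reduce the effect of other processes running on the machine: repeat each task several times and report the median and best times rather than a single run. The two workloads are arbitrary stand-ins, not the applications benchmarked in this presentation, and all names are illustrative.

```python
import statistics
import time


def benchmark(task, repeats: int = 5):
    """Run `task` several times; return (median, best) wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples), min(samples)


def integer_workload():
    # Integer-heavy stand-in task.
    return sum(i * i for i in range(200_000))


def float_workload():
    # Floating-point-heavy stand-in task.
    return sum((i * 0.5) ** 0.5 for i in range(200_000))


for name, task in [("integer", integer_workload), ("float", float_workload)]:
    median_s, best_s = benchmark(task)
    print(f"{name}: median {median_s * 1e3:.1f} ms, best {best_s * 1e3:.1f} ms")
```

Running different kinds of workloads matters for the reason given on the slide: a processor can lead on one type of program and trail on another.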

  26. Photo Editing Software

  27. Animation Software

  28. 3D Graphics Editor

  29. 3D Gaming

  30. Various Benchmarks

  31. Summary • Over the past couple of years, AMD and Intel have taken different approaches. • We have gone over the main architectural differences. • We have shown how they compare. • It will be very interesting to see how the market plays out.

  32. Questions?

  33. References • http://www.amd.com • http://www.amdzone.com • http://www.intel.com • Gardner, Ryan. AMD employee, CPU Specialist. Email: ryan.gardner@amd.com • Hsieh, Paul. 7th Generation CPU Comparisons. http://www.azillionmonkeys.com/qed/cpujihad.shtml. 11/30/00 • Pabst, Thomas. The New Athlon Processor – AMD is Finally Overtaking Intel. http://www6.tomshardware.com/cpu/99q3/990809/index.html. 8/9/99 • Pabst, Thomas. AMD Processors vs. Intel Processors – Facts and Lies. http://www6.tomshardware.com/cpu/00q4/001017/athlon-02.html. 10/12/00 • Morgan, Rob. Power Mac G4 Dual 500 vs. Pentium 4 vs. Athlon. http://www.barefeats.com/pentium.html. 1/08/01
