
Yale Patt The University of Texas at Austin

Future Microprocessors: What must we do differently to effectively utilize multi-core and many-core chips? …and what are the implications re: education? Yale Patt, The University of Texas at Austin. World University Presidents’ Symposium, University of Belgrade, April 4, 2009.


Presentation Transcript


  1. Future Microprocessors: What must we do differently to effectively utilize multi-core and many-core chips? …and what are the implications re: education? Yale Patt The University of Texas at Austin World University Presidents’ Symposium University of Belgrade April 4, 2009

  2. What I want to try to do today: • First, explain multi-core chips • How we got to where we are • What we have to do differently moving forward • Second, discuss what that means with respect to education

  3. Problem Algorithm Program ISA (Instruction Set Arch) Microarchitecture Circuits Electronic Devices

  4. Problem Algorithm Program ISA (Instruction Set Arch) Microarchitecture Circuits Electronic Devices

  5. Problem Algorithm Program ISA (Instruction Set Arch) Microarchitecture Circuits Electronic Devices

  6. How many electronic devices? • The first microprocessor (Intel 4004), 1971 • 2300 transistors • 108 kHz • The Pentium chip, 1992 • 3.1 million transistors • 66 MHz • Today • more than one billion transistors • Frequencies in excess of 5 GHz
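The transistor counts on this slide imply a growth rate that can be checked with a few lines of arithmetic. The sketch below assumes "today" means 2009, the year of the talk, and treats "more than one billion" as exactly one billion, so the result is an upper bound on the doubling time; the variable names are illustrative only.

```python
from math import log2

# Figures quoted on the slide; "today" is assumed to be 2009 (the talk's date).
first_year, first_count = 1971, 2_300          # Intel 4004
today_year, today_count = 2009, 1_000_000_000  # "more than one billion"

# Number of times the transistor count doubled between the two chips.
doublings = log2(today_count / first_count)

# Average time per doubling over the whole period.
years_per_doubling = (today_year - first_year) / doublings

print(f"{doublings:.1f} doublings, one every {years_per_doubling:.1f} years")
```

Under these assumptions the count doubled roughly every two years across the whole period.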

  7. In the next few years: • Process technology: 50 billion transistors • Gelsinger says we can go down to 10 nanometers (I like to say 100 angstroms just to keep us focused) • Dreamers will use whatever we come up with • How will we design the microarchitecture? • How will we harness 50 billion transistors?

  8. How will we use 50 billion transistors? How have we used the transistors up to now?

  9. How have we used the available transistors?

  10. Intel Pentium M

  11. Intel Core 2 Duo • Penryn, 2007 • 45nm, 3MB L2

  12. Why Multi-core chips? • It is easier than designing a much better uni-core • It is embarrassing to continue making L2 bigger • It is the next obvious step • It is not the Holy Grail

  13. that is… • In the beginning: a better and better uniprocessor • improving performance on the hard problems • More recently: a uniprocessor with a bigger L2 cache • forsaking further improvement on the “hard” problems • poorly utilizing the chip area • and blaming the processor for not delivering performance • Today: dual core, quad core • Tomorrow: ???

  14. The Good News: Lots of cores on the chip The Bad News: Not much benefit.

  15. In my opinion the reason is: Our inability to effectively exploit: -- The transformation hierarchy -- Parallel programming

  16. Problem Algorithm Program ISA (Instruction Set Arch) Microarchitecture Circuits Electrons

  17. Up to now • Maintain the artificial walls between the layers • Keep the abstraction layers secure • Makes for a better comfort zone • (Mostly) Improving the Microarchitecture • Pipelining, Branch Prediction, Speculative Execution, Out-of-order Execution, Caches, Trace Cache • Today, we have too many transistors • BANDWIDTH and POWER are blocking improvement • We MUST change the paradigm

  18. …and that means (I think) breaking the layers: • Compiler, Microarchitecture • Multiple levels of cache • Block-structured ISA • Part by compiler, part by uarch • Fast track, slow track • Algorithm, Compiler, Microarchitecture • X + superscalar – the Refrigerator • Niagara X / Pentium Y • Microarchitecture, Circuits • Verification Hooks • Internal fault tolerance

  19. IF we break the layers: • 50 billion transistors naturally leads to • A large number of simple processors, AND • A few very heavyweight processors, AND • Enough “accelerators” for handling lots of special tasks • Heavyweight processors mean we can exploit ILP • i.e., Improve performance on hard, sequential problems • The death of ILP has been greatly exaggerated • There is still plenty of head room yet to be pursued • We need software that can utilize both • We need multiple interfaces

  20. that is: • IF we are willing to continue to pursue ILP • IF we are willing to break the layers • IF we are willing to embrace parallel programming

  21. The Microprocessor of 2019 • It WILL BE a Multi-core chip • But it will be Pentium X / Niagara Y • It will tackle off-chip bandwidth (better L2 miss handling) • It will tackle power consumption (ON/OFF switches) • It will tackle soft errors (internal fault tolerance) • It will tackle security • And it WILL CONTAIN a heavyweight ILP processor • With the levels of transformation integrated

  22. The Heavyweight ILP Processor: • Compiler/Microarchitecture Symbiosis • Multiple levels of cache • Fast track / Slow track • Part by compiler, part by microarchitecture • Block-structured ISA • Better Branch Prediction (e.g., indirect jumps) • Ample sprinkling of Refrigerators • SSMT (Also known as helper threads) • Power Awareness (more than ON/OFF switches) • Verification hooks (CAD a first class citizen) • Internal Fault tolerance (for soft errors) • Better security

  23. Unfortunately: • We train computer people to work within their layer • Too few understand anything outside their layer and, as to multiple cores: • People think sequential

  24. At least two problems

  25. Conventional Wisdom Problem 1: “Abstraction” is Misunderstood • Taxi to the airport • The Scheme Chip (Deeper understanding) • Sorting (choices) • Microsoft developers (Deeper understanding) • Wireless networks (Layers revisited)

  26. Conventional Wisdom Problem 2: Thinking in Parallel is Hard • Perhaps: Thinking is Hard • How do we get people to believe: Thinking in parallel is natural

  27. We have an Education Problem We have an Education Opportunity • Too many computer professionals don’t get it. • We can exploit all these transistors • IF we can understand each other’s layer • Thousands of cores, hundreds of accelerators • Ability to power on/off under program control • Algorithms, Compiler, Microarchitecture, Circuits all talking to each other … • Harnessing 50 billion transistor chips

  28. What does this mean for most countries? • One does NOT have to produce chips to be a major force in the multi-core, many-core era • One DOES have to understand multiple layers • which does not require huge capital investment • BUT does require a very large teacher investment • One DOES have to embrace parallel thinking

  29. How does one do it? • FIRST, Do not accept the premise: Parallel programming is hard • SECOND, Do not accept the premise: It is okay to know only one layer

  30. Parallel Programming is Hard? • What if we start teaching parallel thinking in the first course to freshmen • For example: • Factorial • Parallel search • Streaming
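As one way to picture the factorial example named on this slide, here is a minimal sketch of how a first-course student might think about it in parallel: split 1..n into independent chunks, multiply each chunk separately, then combine the partial products. The function names `range_product` and `parallel_factorial` and the `workers` parameter are illustrative assumptions, not taken from the talk. The thread pool is there to show that the pieces are independent; in standard CPython it will not necessarily make the multiplications run faster.

```python
from concurrent.futures import ThreadPoolExecutor
from math import prod


def range_product(lo, hi):
    """Multiply the integers lo..hi inclusive (an empty range gives 1)."""
    return prod(range(lo, hi + 1))


def parallel_factorial(n, workers=4):
    """Compute n! by multiplying independent chunks of 1..n, then combining."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Chunk length: ceiling of n / workers, but at least 1.
    size = max(1, -(-n // workers))
    # Each chunk is an inclusive (lo, hi) pair; no chunk depends on another.
    chunks = [(lo, min(lo + size - 1, n)) for lo in range(1, n + 1, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: range_product(*c), chunks))
    # Combine step: multiply the partial products together.
    return prod(partials)


if __name__ == "__main__":
    print(parallel_factorial(10))
```

The point of the sketch is the decomposition: the sequential loop "multiply by 1, then 2, then 3, …" hides the fact that multiplication is associative, so the work can be grouped in any way that is convenient for the hardware.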

  31. How does one do it? • FIRST, Do not accept the premise: Parallel programming is hard • SECOND, Do not accept the premise: It is okay to know only one layer

  32. Students can understand more than one layer • What if we get rid of “top-down” FIRST • Students do not get it – they have no underpinnings • Objects are too high a level of abstraction • So, students end up memorizing • Memorizing isn’t learning (and certainly not understanding) • What if we START with “motivated” bottom up • Students build on what they already know • Memorizing is replaced with real learning • Continually raise the level of abstraction • The student sees the layers from the start • The student makes the connections • The student understands what is going on

  33. If students understand, they can fix their own bugs • …and, there is no substitute for Design it wrong, Debug it yourself, Fix it yourself, AND see the working result.

  34. And, while I am at it: • Students in every country are smart • No country has an advantage in that domain • Don’t be afraid to work them very, very hard • Students can master very difficult material • And, students will not complain if they are learning • But they will complain if they are being fed “tedium.” • That means teachers who know the difference

  35. Again: what does this mean for most countries? • One does NOT have to produce chips to be a major force in the multi-core, many-core era • One DOES have to understand multiple layers • which does not require huge capital investment • BUT does require a very large teacher investment • One DOES have to embrace parallel thinking

  36. The answer to both is education.

  37. Thank you!
