
Advanced Computer Architecture Introduction

Advanced Computer Architecture Introduction. A.R. Hurson 128 EECH Building, Missouri S&T hurson@mst.edu. CpE6110 Advanced Computer Architecture Instructor: A. R. Hurson hurson@mst.edu Office: EECH 128 341-6201 Office Hours: By appointment


Presentation Transcript


  1. Advanced Computer Architecture Introduction A.R. Hurson 128 EECH Building, Missouri S&T hurson@mst.edu

  2. CpE6110 • Advanced Computer Architecture • Instructor: A. R. Hurson hurson@mst.edu • Office: EECH 128 341-6201 • Office Hours: By appointment • Class notes and reading materials are available at: https://hurson.weebly.com/cpe-6110-advanced-computer-architecture.html • Reference: Computer Architecture: A Quantitative Approach, Hennessy and Patterson • Outline: • 1. Performance and Cost Analysis • 2. Instruction Set Analysis • 3. Scheduling/Load balancing • 4. Memory hierarchies (revisit) • 5. Advanced Caching • a) Internet caching • b) Cooperative caching • 6. Concurrency: Classifications • a) Vector processors • b) SIMD array architectures • c) MIMD architectures • 7. Systolic design • 8. VLIW/Super-scalar/Super-pipeline • Data-flow processing • Multithreaded architecture • Transactional Memory • Multicore architecture • Administrative: • Homework Assignments 10% • Quizzes 20% • Project (Term Paper) 20% • Midterm Exam 20% • Final Exam 30%

  3. Advanced Computer Architecture • Race to the top, 2011 [Figure]

  4. Advanced Computer Architecture • Race to the top, 2011 [Figure; series labels: #1, Sum, #500]

  5. Advanced Computer Architecture • Race to the top, 2012 [Figure]

  6. Advanced Computer Architecture • Race to the top: If the projection holds, we would expect an Exaflops system around 2019. • In 2012 the Top500 list was dominated by IBM Blue Gene/Q, with four systems in the top 10. The largest, at Lawrence Livermore National Laboratory, delivered more than 16 Petaflops of sustained performance. It replaced the K computer from Japan (the first 10 Petaflops machine) as the No. 1 system.

  7. Advanced Computer Architecture • Tianhe-2 (MilkyWay-2) has been the No. 1 system since 2013. • It is used for simulation, analysis, and government security applications. • It is a collection of 16,000 computer nodes with a total of 3,120,000 cores. • Each of the 16,000 nodes possesses 88 gigabytes of memory. The total CPU plus coprocessor memory is 1,375 TiB (approximately 1.34 PiB).
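The memory total quoted for Tianhe-2 can be checked with a few lines of arithmetic. This is a minimal sketch using only the node count and per-node memory from the slide; reading "gigabytes" as GiB is an assumption, made because it reproduces the quoted TiB figure.

```python
# Figures from the slide; treating "gigabytes" as GiB is an assumption.
NODES = 16_000
MEM_PER_NODE_GIB = 88

total_gib = NODES * MEM_PER_NODE_GIB   # memory summed over all nodes
total_tib = total_gib / 1024           # GiB -> TiB
total_pib = total_tib / 1024           # TiB -> PiB

print(f"{total_tib:,.0f} TiB, about {total_pib:.2f} PiB")
```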

  8. Advanced Computer Architecture • Four major challenges have been recognized for an Exaflops machine: • Energy and power, • Memory and storage, • Concurrency and locality, • Resiliency

  9. Advanced Computer Architecture • It is estimated that an Exaflops machine would consume about 20 Megawatts of power, which is equivalent to 50 Gflops/W.

  10. Advanced Computer Architecture • Computer Architecture refers to the attributes of a system visible to a programmer, i.e., attributes that have a direct impact on the logical execution of a program. • Architectural Attributes include the instruction set, the number of bits used to represent various data types, I/O mechanisms, and techniques for addressing memory.

  11. Advanced Computer Architecture • Computer Organization refers to the operational units and their interconnections that realize the architectural specifications. • Organizational Attributes include those hardware details transparent to the programmer, such as control signals, interfaces between the computer and peripherals, and the memory technology used.

  12. Advanced Computer Architecture • Computer Hardware refers to the detailed hardware design: the logic design and the implementation (packaging, power, cooling, ...).

  13. Advanced Computer Architecture • For a good design, architecture (instruction set design), organization, and hardware as well as software (compiler and operating system) issues must be considered.

  14. Advanced Computer Architecture • Common Performance Metrics • Execution time, • Bandwidth, • Throughput, • User CPU Time, • MIPS • MFLOPS • Speed up • Efficiency
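The slide lists these metrics without defining them. As a minimal sketch, the textbook definitions of MIPS, MFLOPS, speed up, and efficiency can be written as below; the function and parameter names are illustrative choices, not taken from the slides.

```python
def mips(instruction_count: float, exec_time_s: float) -> float:
    """Millions of instructions executed per second."""
    return instruction_count / (exec_time_s * 1e6)


def mflops(fp_operations: float, exec_time_s: float) -> float:
    """Millions of floating-point operations executed per second."""
    return fp_operations / (exec_time_s * 1e6)


def speedup(t_serial: float, t_parallel: float) -> float:
    """Ratio of the one-processor time to the n-processor time."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, n_processors: int) -> float:
    """Speed up per processor; 1.0 means ideal (linear) speed up."""
    return speedup(t_serial, t_parallel) / n_processors


# Hypothetical workload: 200 million instructions in 0.5 s,
# and a job that drops from 100 s to 20 s on 8 processors.
print(mips(200e6, 0.5))
print(speedup(100.0, 20.0), efficiency(100.0, 20.0, 8))
```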

  15. Advanced Computer Architecture • Scaleup: Handling larger tasks by increasing the degree of parallelism. It is the ability to process larger tasks in the same amount of time by providing more resources. • Power Consumption: Becomes an important performance metric when we use mobile wireless access devices. • Network Connectivity: Becomes of interest when connectivity is through a wireless medium. • Data reliability and integrity: Become factors of interest in the presence of mobility and wireless communication.

  16. Advanced Computer Architecture • Summary • Computer architecture/Computer organization • Role of computer architect • Some performance Metrics • How to improve performance? • Hardware technology, • Innovative architectural features, and • Efficient resource management.

  17. Advanced Computer Architecture • [Figure; labels: Advances in Technology, Architectural Advances, Better Resource Management, Program behavior, Performance improvement]

  18. Advanced Computer Architecture • Questions • Network Latency • Memory Latency • How to manage resources efficiently?

  19. Advanced Computer Architecture • Grosch’s Law • In the 1940s Grosch studied the relationship between the power (computational speed) ($P$) and cost ($C$) of a computer. He postulated that: $P = k \cdot C^{s}$, where $k$ and $s$ are positive constants. • He also argued that $s$ is close to 2.

  20. Advanced Computer Architecture • Grosch’s Law • According to this law, in order to sell a computer for twice as much, it must be four times as fast. • With the advances in technology, it is easy to see that Grosch’s law is no longer valid.
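The "twice the price, four times the speed" statement follows directly from the power-law form with s = 2. The sketch below illustrates this; the function name and the default k = 1 are arbitrary choices for the example.

```python
def grosch_power(cost: float, k: float = 1.0, s: float = 2.0) -> float:
    """Computational power predicted by Grosch's law, P = k * C**s."""
    return k * cost ** s


# Doubling the cost with s = 2 quadruples the predicted power,
# independent of the constant k.
ratio = grosch_power(2.0) / grosch_power(1.0)
print(ratio)
```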

  21. Advanced Computer Architecture • Performance Measures • Amdahl's law— The performance improvement gained by improving some portion of an architecture is limited by the fraction of the time the improved portion is used — a small number of sequential operations can effectively limit the speed up of a parallel algorithm.

  22. Advanced Computer Architecture • Performance Measures • Amdahl's law allows a quick way to calculate the speed up based on two factors: • The fraction of the computation time in the original task that is affected by the enhancement, and • The improvement gained by the enhanced execution mode (speed up of the enhanced portion).

  23. Advanced Computer Architecture • Performance Measures —Amdahl's law
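In its usual textbook form, Amdahl's law combines the two factors as overall speed up = 1 / ((1 - F) + F / S), where F is the fraction of time affected and S is the speed up of the enhanced portion. The symbols F and S and the function below are naming choices for this sketch, not taken from the slides.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speed up when only part of the execution time is improved."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must lie in [0, 1]")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    unaffected = 1.0 - fraction_enhanced
    return 1.0 / (unaffected + fraction_enhanced / speedup_enhanced)


# Hypothetical case: 40% of the run time is made 10 times faster.
overall = amdahl_speedup(0.4, 10.0)

# Even with an infinitely fast enhancement, the unaffected 60% caps the gain.
ceiling = 1.0 / (1.0 - 0.4)
print(f"overall = {overall:.4f}, upper bound = {ceiling:.3f}")
```

The gap between the two printed values shows why a small sequential portion limits the achievable speed up.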

  24. Advanced Computer Architecture • Gustafson-Barsis’s Law • Parallel architectures comprised of hundreds of processors can be built with substantial improvement in performance. • They argued that in practice, the problem size scales up with the number of processors (n).

  25. Advanced Computer Architecture • Gustafson-Barsis’s Law • If s and p are the serial and parallel times spent on a parallel system, then s + p * n represents the execution time of the same work on a single processor. • They introduced a new factor, the scaled speed up factor (SS(n)): SS(n) = (s + p * n) / (s + p)

  26. Advanced Computer Architecture • Gustafson-Barsis’s Law • Speed up should be measured by scaling the problem to the number of processors, not by fixing the problem size.
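The contrast between scaled and fixed-size speed up can be seen numerically. The sketch below implements SS(n) as given above and, for comparison, the fixed-size speed up (s + p) / (s + p / n) in the sense of Amdahl's law; the values s = 0.1, p = 0.9, n = 100 are hypothetical.

```python
def scaled_speedup(s: float, p: float, n: int) -> float:
    """Gustafson-Barsis scaled speed up: the parallel part grows with n."""
    return (s + p * n) / (s + p)


def fixed_size_speedup(s: float, p: float, n: int) -> float:
    """Fixed-size (Amdahl-style) speed up: the parallel part is split n ways."""
    return (s + p) / (s + p / n)


s, p, n = 0.1, 0.9, 100  # hypothetical serial/parallel shares and processor count
print(f"scaled:     {scaled_speedup(s, p, n):.2f}")
print(f"fixed size: {fixed_size_speedup(s, p, n):.2f}")
```

With the same serial share, the scaled measure grows almost linearly in n, while the fixed-size measure saturates near 1 / s.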

  27. Advanced Computer Architecture • Performance Measures • Maximum Concurrency: For any computer there is a maximum number of bits or bit pairs, the maximum concurrency ($C_m$), that can be processed concurrently, whether it is under single-instruction or multiple-instruction control.

  28. Advanced Computer Architecture • Performance Measures • Average Concurrency: The maximum concurrency is an indication of the computer's processing capability. The actual utilization of this capability is indicated by the average concurrency, defined as: $C_a = \frac{\sum_{i} C_i \, \Delta t_i}{\sum_{i} \Delta t_i}$, where $C_i$ is the concurrency during the time interval $\Delta t_i$.

  29. Advanced Computer Architecture • Performance Measures • Average Concurrency: If $\Delta t_i$ is set to one, then the average concurrency over a period of $T$ time units is: $C_a = \frac{1}{T} \sum_{i=1}^{T} C_i$

  30. Advanced Computer Architecture • Performance Measures • Hardware Utilization: The average hardware utilization is defined as: $\mu = \frac{C_a}{C_m} = \frac{1}{T} \sum_{i=1}^{T} \rho_i$, where $\rho_i = C_i / C_m$ is the hardware utilization at time $i$.

  31. Advanced Computer Architecture • Performance Measures • $C_m$ is determined by the hardware design, while $C_a$ (or $\mu$) is highly dependent on the software and applications. • A general-purpose computer should achieve a high $\mu$ for as many applications as possible. • A special-purpose computer would yield a high $\mu$ for at least the intended applications. • In either case, maximizing the value of $\mu$ for a computer design is important.
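The concurrency measures can be illustrated with a short calculation. This sketch takes the average concurrency as the time-weighted mean of the per-interval concurrency values and the hardware utilization as the ratio of average to maximum concurrency; both readings are assumptions about the intended definitions, and the sample trace is hypothetical.

```python
from typing import Optional, Sequence


def average_concurrency(concurrency: Sequence[float],
                        durations: Optional[Sequence[float]] = None) -> float:
    """Time-weighted mean of the concurrency observed in each interval."""
    if durations is None:
        durations = [1.0] * len(concurrency)  # unit intervals by default
    if len(durations) != len(concurrency) or not concurrency:
        raise ValueError("need one positive-length interval per concurrency value")
    weighted = sum(c * dt for c, dt in zip(concurrency, durations))
    return weighted / sum(durations)


def hardware_utilization(concurrency: Sequence[float],
                         max_concurrency: float,
                         durations: Optional[Sequence[float]] = None) -> float:
    """Average concurrency as a fraction of the hardware maximum."""
    return average_concurrency(concurrency, durations) / max_concurrency


# Hypothetical trace: a 64-bit-wide machine observed over four unit intervals.
trace = [64, 32, 16, 16]
print(average_concurrency(trace))        # bits processed concurrently, on average
print(hardware_utilization(trace, 64))   # fraction of the maximum in use
```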

  32. Advanced Computer Architecture • Performance Measures • Parallel Systems: For a parallel processor the average parallelism is defined as: $P_a = \frac{1}{T} \sum_{i=1}^{T} P_i$ for $T$ time units, where $P_i$ is the parallelism at time $i$.

  33. Advanced Computer Architecture • Performance Measures • Parallel Systems: Similarly, the average hardware utilization is defined as: $\mu = \frac{1}{T} \sum_{i=1}^{T} \rho_i$, where $\rho_i$ is the hardware utilization for the parallel processor at time $i$.

  34. Advanced Computer Architecture • Performance Measures • Parallel Systems • If $\tilde{P}_a$ is the effective parallelism over a period of $T$, and • $\tilde{\mu}$, $\tilde{P}_i$, and $\tilde{\rho}_i$ are the corresponding effective values, then • the effective hardware utilization is: $\tilde{\mu} = \frac{1}{T} \sum_{i=1}^{T} \tilde{\rho}_i$

  35. Advanced Computer Architecture • Performance Measures • A successful parallel processor design should yield a high hardware utilization, as well as the required throughput for, at least, the intended application(s). This involves not only proper hardware and software design, but also the development of efficient parallel algorithms for these applications.

  36. Advanced Computer Architecture • Summary • Amdahl's law • More performance metrics • CPU Time • Concurrency • Hardware Utilization

  37. Advanced Computer Architecture • Performance Measures • Pipeline Systems • Latency (L) is defined as the number of time units separating two successive initiations of events. • Naturally, the lower the latency the higher the performance. Latency could be any integer value including zero.

  38. Advanced Computer Architecture • Performance Measures • Pipeline Systems • The average latency is defined as the average number of time units between two initiations. The initiation rate ($I$) is the average number of initiations per clock unit: $I = 1 / L_{avg}$, where $L_{avg}$ denotes the average latency.

  39. Advanced Computer Architecture • Performance Measures • Pipeline Systems • For stage $S_i$, the stage utilization ($U_{S_i}$) indicates on average how often $S_i$ has been used: $U_{S_i} = I \cdot n_i$, where $n_i$ represents the number of times $S_i$ is used in one initiation.

  40. Advanced Computer Architecture • Performance Measures • Pipeline Systems • For a linear pipe, if $d_i$ denotes the execution time of stage $S_i$, then: $U_{S_i} = \frac{1}{\mathrm{MAX}(d_i)}$
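The pipeline measures fit together as follows. This is a minimal sketch that assumes the initiation rate is the reciprocal of the average latency and that, for a linear pipe, each stage is used once per initiation with the slowest stage setting the rate; the function names and sample numbers are hypothetical.

```python
from typing import Sequence


def initiation_rate(latencies: Sequence[float]) -> float:
    """Average initiations per clock unit: reciprocal of the average latency."""
    if not latencies or sum(latencies) == 0:
        raise ValueError("need at least one non-zero latency")
    average_latency = sum(latencies) / len(latencies)
    return 1.0 / average_latency


def stage_utilization(rate: float, uses_per_initiation: int) -> float:
    """U_Si = I * n_i for a stage used n_i times per initiation."""
    return rate * uses_per_initiation


def linear_pipe_utilization(stage_times: Sequence[float]) -> float:
    """Linear pipe: 1 / MAX(d_i), set by the slowest stage."""
    return 1.0 / max(stage_times)


# Hypothetical latency sequence between successive initiations.
rate = initiation_rate([1, 3, 2, 2])          # average latency of 2 time units
print(rate, stage_utilization(rate, 1))

# Hypothetical three-stage linear pipe with stage times 2, 4, and 3.
print(linear_pipe_utilization([2, 4, 3]))
```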

  41. Advanced Computer Architecture • Performance Measures • Means to evaluate a system • Application programs — Workload. • Real Programs — A collection of programs that are run often by the user. • Kernels — Small, key pieces from real programs. • Benchmarks — A set of familiar, small, and well behaved programs known to the user. • Synthetic benchmarks — An artificial set of small programs that are intended to match the average frequency of operations and operands of a large set of programs.

  42. Advanced Computer Architecture • Is one number enough? • In our discussion so far, performance has been the major design constraint. However, power is becoming a problem. • Power consumption first became an issue with the growth of wireless technology and mobile devices. It is now also a concern for supercomputers, since feeding several Megawatts of power to a supercomputer is not a trivial task and requires a great amount of supporting infrastructure.

  43. Advanced Computer Architecture • Is one number enough? • It is estimated that each Megawatt of power consumption adds about $1 million to the electricity cost each year. In addition to the cost, environmental impact has become an issue as well: data centers now have a significant share in global CO2 emissions.

  44. Advanced Computer Architecture • Is one number enough?

  45. Advanced Computer Architecture • Is one number enough?

  46. Advanced Computer Architecture • Is one number enough? • Alternatively, the so-called Green500 list, based on Flops/W efficiency, was developed.

  47. Advanced Computer Architecture • Is one number enough?

  48. Advanced Computer Architecture • Is one number enough? • On the Green500 list, as of June 2012, the top 21 spots are held by IBM Blue Gene/Q systems with an efficiency of over 2.1 GFlops/Watt (a huge gap with the top non-Blue Gene/Q system).

  49. Advanced Computer Architecture • As noted before, an Exaflops machine would consume about 20 Megawatts of power, which is equivalent to 50 Gflops/W. • Relative to Blue Gene/Q, the power efficiency then needs to be improved by a factor of about 25.
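The power figures can be checked with simple arithmetic. The sketch below uses only numbers quoted in the slides (1 Exaflops, 20 Megawatts, 2.1 GFlops/W for Blue Gene/Q, and about $1 million per Megawatt per year); the variable names are illustrative.

```python
EXAFLOPS = 1e18                     # target performance, flops
POWER_BUDGET_W = 20e6               # estimated power draw, watts
BLUE_GENE_Q_GFLOPS_PER_W = 2.1      # Green500 efficiency quoted for June 2012
COST_PER_MW_YEAR_USD = 1e6          # electricity cost estimate per Megawatt-year

required_gflops_per_w = EXAFLOPS / POWER_BUDGET_W / 1e9
improvement_factor = required_gflops_per_w / BLUE_GENE_Q_GFLOPS_PER_W
annual_cost_usd = (POWER_BUDGET_W / 1e6) * COST_PER_MW_YEAR_USD

print(f"required efficiency: {required_gflops_per_w:.0f} Gflops/W")
print(f"improvement over Blue Gene/Q: about {improvement_factor:.1f}x")
print(f"electricity cost: about ${annual_cost_usd / 1e6:.0f} million per year")
```

The computed improvement factor comes out slightly under 24, consistent with the rounded factor of about 25 given above.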

  50. Advanced Computer Architecture • Power and Energy • Power is an important factor for a data center, since it needs to be fed. • Energy is of more concern from an application point of view.
