
CS1104: Computer Organisation http://www.comp.nus.edu.sg/~cs1104



Presentation Transcript


  1. CS1104: Computer Organisation http://www.comp.nus.edu.sg/~cs1104 School of Computing National University of Singapore

  2. PII Lecture 3: Benchmarking • Definitions. • SPEC ’95. • Amdahl’s Law. • Reading: Chapter 2 of Patterson’s book: 2.4 – 2.7. Benchmarking

  3. Benchmarking
     • Benchmarking: Choosing programs to evaluate performance.
     • Measure the performance of a machine using a set of programs which will hopefully emulate the workload generated by the user’s programs.
     • Benchmarks: programs designed to measure performance.

  4. Benchmarks: pros and cons
     Benchmarks                   | Pros                                                 | Cons
     Actual Target Workload       | representative                                       | very specific; non-portable; difficult to run or measure; hard to identify cause
     Full Application Benchmarks  | portable; widely used; improvements useful in reality | less representative
     Small “Kernel” Benchmarks    | easy to run, early in design cycle                   | easy to “fool”
     Microbenchmarks              | identify peak capability and potential bottlenecks   | “peak” may be a long way from application performance

  5. SPEC ’95
     • SPEC (System Performance Evaluation Cooperative)
     • Companies have agreed on a set of real programs and inputs
     • Eighteen application benchmarks (with inputs) reflecting a technical computing workload
         • Eight integer: go, m88ksim, gcc, compress, li, ijpeg, perl, vortex
         • Ten floating-point intensive: tomcatv, swim, su2cor, hydro2d, mgrid, applu, turb3d, apsi, fpppp, wave5
     • Must run with standard compiler flags
         • Eliminate special undocumented incantations that may not even generate working code for real programs
     • Can still be abused (Intel’s “other” bug)
     • Valuable indicator of performance (and compiler technology)

  6. SPEC ’95 (2)

  7. SPEC ’95 (3)
     • For a given ISA, increases in CPU performance can come from three sources:
         • Increase in clock rate
         • Improvements in processor organization that lower the CPI
         • Compiler enhancements that lower the instruction count or generate instructions with a lower average CPI (e.g., by using simpler instructions)
     • Next slide shows the SPECint95 and SPECfp95 measurements for a series of Intel Pentium processors and Pentium Pro processors.
         • Does doubling the clock rate double performance?
         • Can a machine with a slower clock rate have better performance?
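The three sources listed on this slide correspond to the three terms of the standard CPU time equation (CPU time = instruction count × CPI / clock rate). The sketch below is only an illustration: the instruction counts, CPI values and clock rates are hypothetical numbers chosen for easy arithmetic, not measurements of any real processor.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz

# Hypothetical baseline machine: 1e9 instructions, CPI of 2.0, 100 MHz clock.
base = cpu_time(1e9, 2.0, 100e6)

# Each of the three sources of improvement, applied on its own.
faster_clock = cpu_time(1e9, 2.0, 200e6)    # higher clock rate
lower_cpi = cpu_time(1e9, 1.5, 100e6)       # better organization
fewer_instr = cpu_time(0.8e9, 2.0, 100e6)   # better compiler

# A hypothetical machine with a slower clock (150 MHz) but a CPI of 1.0
# still beats the 200 MHz machine with a CPI of 2.0.
slow_clock_low_cpi = cpu_time(1e9, 1.0, 150e6)

print(base, faster_clock, lower_cpi, fewer_instr, slow_clock_low_cpi)
```

Under these assumed numbers, the last case shows how a lower clock rate can coexist with a shorter execution time, which is the situation the second question on the slide asks about.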

  8. SPEC ’95 (4)
     • At the same clock rate, the Pentium Pro is 1.4 to 1.5 times faster (for SPECint95) and 1.7 to 1.8 times faster (for SPECfp95). The improvements come from organizational enhancements (pipelining, memory system) to the Pentium Pro.
     • Performance increases at a slower rate than the increase in clock rate: the memory system is the bottleneck, and Amdahl’s law is at play here.
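One way to see why performance grows more slowly than clock rate is to split execution time into a part that scales with the clock and a memory part that does not. The sketch below makes that simplifying assumption explicit; the 8-second and 2-second figures are hypothetical and are not taken from the SPEC measurements.

```python
def run_time(cpu_seconds_at_base, memory_seconds, clock_scale):
    """Execution time when only the CPU-bound part shrinks with the clock.

    Assumption: memory stall time stays constant in seconds when the
    processor clock rate is multiplied by clock_scale.
    """
    return cpu_seconds_at_base / clock_scale + memory_seconds

# Hypothetical program: 8 s of CPU-bound work plus 2 s waiting on memory.
base = run_time(8.0, 2.0, 1.0)
doubled = run_time(8.0, 2.0, 2.0)

# Doubling the clock rate gives less than a twofold speedup.
speedup = base / doubled
print(base, doubled, speedup)
```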

  9. Amdahl’s Law
     • Pitfall: Expecting the improvement of one aspect of a machine to increase performance by an amount proportional to the size of the improvement.
     • Example: Suppose a program runs in 100 seconds on a machine, with multiply operations responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster?
         100 (total time) = 80 (for multiply) + UA (unaffected), so UA = 20
         100/4 (new total time) = 80/Speedup (for multiply) + UA
         25 = 80/Speedup + 20, so 80/Speedup = 5
         Speedup = 80/5 = 16 (meaning multiply now takes only 5 seconds)

  10. Amdahl’s Law (2)
     • Example (continued): How about making it 5 times faster?
         100 (total time) = 80 (for multiply) + UA (unaffected), so UA = 20
         100/5 (new total time) = 80/Speedup (for multiply) + UA
         20 = 80/Speedup + 20, so 80/Speedup = 0
         Speedup = 80/0 = ??? (impossible!)
     • There is no way we can enhance multiply to achieve a fivefold increase in performance if multiply accounts for only 80% of the workload.
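The two multiply examples can be checked with a short calculation. This is a minimal sketch; the function name `required_speedup` and the choice to return `None` for an unreachable target are illustrative, not part of the lecture.

```python
def required_speedup(total, affected, target_speedup):
    """Speedup needed on the affected part to reach an overall target.

    Returns None when the target cannot be reached, i.e. when the
    unaffected time alone already uses up the new time budget.
    """
    unaffected = total - affected
    new_total = total / target_speedup
    budget = new_total - unaffected  # time left for the affected part
    if budget <= 0:
        return None
    return affected / budget

four_times = required_speedup(100, 80, 4)  # multiply must run 16x faster
five_times = required_speedup(100, 80, 5)  # no time left for multiply

print(four_times, five_times)
```

Returning `None` mirrors the division by zero on the slide: the 20 unaffected seconds already equal the whole 100/5 = 20 second budget.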

  11. Amdahl’s Law (3)
     • This concept is Amdahl’s law: the performance improvement is limited by the portion of the program that is not sped up.
     • Execution time after improvement = Execution time of unaffected part + (Execution time of affected part / Speedup)
     • Corollary of Amdahl’s law: Make the common case fast.
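The formula on this slide translates directly into code. The sketch below reuses the 20-second unaffected and 80-second multiply split from the earlier example and tries increasingly large enhancement factors; the function names are illustrative.

```python
def time_after_improvement(unaffected, affected, speedup):
    """Execution time after speeding up only the affected part."""
    return unaffected + affected / speedup

def overall_speedup(unaffected, affected, speedup):
    """Overall speedup = time before / time after."""
    before = unaffected + affected
    return before / time_after_improvement(unaffected, affected, speedup)

# However large the enhancement, the overall speedup stays below
# (unaffected + affected) / unaffected = 100 / 20 = 5.
for s in (2, 10, 100, 1000):
    print(s, round(overall_speedup(20, 80, s), 3))
```

The printed values approach 5 but never reach it, which is the limit the slide describes.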

  12. Example 1
     • Suppose we enhance a machine, making all floating-point instructions run five times faster. If the execution time of some benchmark before the floating-point enhancement is 12 seconds, what will the speedup be if half of the 12 seconds is spent executing floating-point instructions?
         Time = 6 (UA) + 6 (fl-pt) / 5 = 7.2 seconds
         Speedup = 12/7.2 = 1.67

  13. Example 2
     • We are looking for a benchmark to show off the new floating-point unit described in the previous example, and we want the overall benchmark to show a speedup of 3. One benchmark we are considering runs for 100 seconds with the old floating-point hardware. How much of the execution time would floating-point instructions have to account for in this program in order to yield our desired speedup on this benchmark?
         Speedup = 3 = 100 / (Time_FP / 5 + 100 – Time_FP)
         Time_FP / 5 + 100 – Time_FP = 100/3 = 33.33
         0.8 × Time_FP = 66.67
         Time_FP = 83.33 seconds
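Example 2 runs Amdahl’s law in the other direction: given the enhancement factor and the target speedup, solve for the time the affected part must account for. Rearranging the formula gives Time_FP = Total × (1 – 1/Target) / (1 – 1/Enhancement). The sketch below applies that rearrangement; the function name and the `None` return for unreachable targets are illustrative choices.

```python
def required_affected_time(total, enhancement, target_speedup):
    """Time the enhanced part must occupy to reach the target speedup.

    Derived from: total / target = (total - t) + t / enhancement.
    Returns None if the answer would exceed the total run time.
    """
    needed = total * (1 - 1 / target_speedup) / (1 - 1 / enhancement)
    return needed if needed <= total else None

time_fp = required_affected_time(100, 5, 3)

# Plug the answer back into the slide's equation as a check.
check = 100 / (time_fp / 5 + 100 - time_fp)

print(round(time_fp, 2), round(check, 2))
```

With a fivefold floating-point enhancement, a target above 5 can never be met, so the function returns `None` in that case.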

  14. Sample Question (1)
     • Which of the following is/are true for SPEC benchmarking?
         (a) A higher SPEC number implies that the clock speed must be faster.
         (b) The SPEC number is an indication of the performance of the processor hardware only.
         (c) The SPEC benchmark is a single program used to measure and compare computer system performance.
         (d) All of the above.
         (e) None of the above.

  15. Sample Question (2)
     • Which of the following is/are true for SPEC benchmarking?
         (a) The SPEC benchmark consists of 18 actual client-target workload programs that every computer system should optimize for.
         (b) The overall SPEC number of a system is affected by the quality of the compiler.
         (c) A system that has a higher clock speed must always have a higher SPEC number.
         (d) A system that has a lower overall CPI for SPEC benchmarks must always have a higher SPEC number.
         (e) None of the above.

  16. Sample Question (3)
     • Suppose a program runs in 8642 seconds on a machine, with the “rotate” operation responsible for 4321 seconds. How much do we have to improve the speed of the “rotate” operation if we want the program to run 4 times faster?
         (a) Insufficient data to determine.
         (b) The speed of the “rotate” operation is improved by a factor of 4.
         (c) The speed of the “rotate” operation is improved by a factor of 8.
         (d) It is impossible to achieve the proposed speedup.
         (e) None of the above.

  17. Sample Question (4)
     • Which of the following is true for benchmarking?
         (a) Benchmarking is a mechanism to compare the relative performance of computer systems.
         (b) A small kernel is always the best choice for benchmarking the actual performance of a computer system because it is easy to run.
         (c) The actual workload of a targeted application is always the best choice for benchmarking the performance of a future system to be designed.
         (d) All of (a), (b), (c).
         (e) None of the above.

  18. End of file
