October 9, 2003


  1. October 9, 2003

  2. Acknowledgements • The team would like to acknowledge the technical assistance of Dr. Tyagi and Sriram Nadathur.

  3. Definitions • Clock cycle time - The time that it takes to complete one clock period. Commonly expressed through its reciprocal, the clock frequency. • Functional units - Individual blocks of logic in the processor. • Hazards - Situations in the processor where more than one instruction is trying to access the same memory at the same time. • IPC - Instructions per clock cycle, a measure of the performance of a processor. • Issue buffer - A memory which determines what instructions can be executed in parallel. • Pipeline - An architectural scheme where specific tasks are performed in stages on a processor. • Pipeline latch - A memory device between the pipeline stages. • Rename space - Temporary storage space inside the processor. • Superscalar - A computer architecture where multiple instructions are executed in one clock cycle. • Stalls - Periods when the processor cannot keep issuing instructions, for example because the rename space is full.
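
The clock cycle time and IPC definitions combine into a simple throughput estimate. The following is a minimal sketch; the function names are hypothetical and any numbers passed in are illustrative, not measurements from the project.

```python
def cycle_time_ns(frequency_ghz: float) -> float:
    """Clock cycle time in nanoseconds is the reciprocal of the frequency in GHz."""
    return 1.0 / frequency_ghz


def instructions_per_second(ipc: float, frequency_ghz: float) -> float:
    """Throughput = IPC * clock frequency (in cycles per second)."""
    return ipc * frequency_ghz * 1e9


def execution_time_s(instruction_count: int, ipc: float, frequency_ghz: float) -> float:
    """Run time = cycles needed * time per cycle."""
    cycles = instruction_count / ipc
    return cycles * cycle_time_ns(frequency_ghz) * 1e-9
```

Because run time depends on both IPC and cycle time, a change that raises IPC but lengthens the clock period can still be a net loss.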

  4. Superscalar Processors Superscalar processors have a pipeline which is capable of issuing multiple instructions per cycle. The resulting control complexity is managed by using a reorder buffer so that instructions commit in program order. The pipeline is still forced to stall when the reorder buffer is full. • The purple instruction is waiting on I/O in the reserve unit. It is at the front of the reorder buffer. • The blue instructions in the commit unit have been processed, but have to wait to commit until the purple instruction is completed. • The processor must stall because the reorder buffer is full: no new instructions can be dispatched, even though two reserve units are sitting idle.
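
The stall described on this slide can be reproduced with a toy model. This is a sketch under simplifying assumptions (a fixed latency per instruction, no dependencies, and a hypothetical `simulate_rob` helper); it is not the SimpleScalar model used by the team.

```python
from collections import deque


def simulate_rob(latencies, rob_size, width=2):
    """Return (total_cycles, stall_cycles) for a toy in-order-commit reorder buffer.

    latencies[i] is the number of cycles instruction i needs after dispatch.
    """
    rob = deque()      # completion cycle of each in-flight instruction, oldest first
    next_instr = 0
    cycle = 0
    stall_cycles = 0
    while next_instr < len(latencies) or rob:
        # Commit in program order: only the head may leave, and only once it is done.
        committed = 0
        while rob and rob[0] <= cycle and committed < width:
            rob.popleft()
            committed += 1
        # Dispatch new instructions while there is room in the reorder buffer.
        dispatched = 0
        while next_instr < len(latencies) and len(rob) < rob_size and dispatched < width:
            rob.append(cycle + latencies[next_instr])
            next_instr += 1
            dispatched += 1
        # Stall: instructions are waiting but none fit because the buffer is full.
        if next_instr < len(latencies) and dispatched == 0:
            stall_cycles += 1
        cycle += 1
    return cycle, stall_cycles
```

Feeding it one slow instruction followed by many fast ones, for example `simulate_rob([20] + [1] * 15, rob_size=4)`, shows the fast instructions finishing early but being unable to commit, so dispatch stalls until the slow head completes. A larger `rob_size` removes those stall cycles in this toy case.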

  5. Problem Statement • Achieve a net gain in superscalar processor performance by adaptively changing the rename space size

  6. General Solution Use the idle functional units in stalled pipelines as additional rename space when the reorder buffer is full.

  7. Approach • Determine if the possible performance enhancement from such a scheme outweighs the extended time per clock cycle with simulations on an Alpha processor model. • Design and implement a control algorithm to use the additional rename space. • Verify the correctness of the control logic • Implement the control algorithm in SPICE • Quantify the architectural performance gains using the SPEC2000 benchmark
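
The first step of the approach is a break-even question: does the IPC gain outweigh the longer clock cycle? A minimal sketch of that comparison follows, assuming performance is proportional to IPC divided by cycle time; the function names are hypothetical.

```python
def net_speedup(ipc_base, ipc_new, cycle_time_base, cycle_time_new):
    """Ratio of new to baseline throughput; a value above 1.0 is a net gain."""
    return (ipc_new / ipc_base) * (cycle_time_base / cycle_time_new)


def max_tolerable_cycle_time_increase(ipc_base, ipc_new):
    """Largest fractional cycle-time increase at which the change still breaks even."""
    return ipc_new / ipc_base - 1.0
```

For instance, a 10% IPC improvement is a net gain only if the cycle time grows by less than 10%.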

  8. Operating Environment • The design will be tested using processor simulations and hardware models. • Software simulations will be done in SimpleScalar. • The modified processor will not actually be fabricated, but the basic environment is that of a typical superscalar processor.

  9. Intended Users • Dr. Tyagi and his research assistants • Microprocessor companies • Other researchers in the field

  10. Intended Uses • Dr. Tyagi’s research in computer architecture performance • Improving performance of sequentially executed programs • Providing research into increasing superscalar processor performance

  11. Assumptions • There will be a performance gain by using pipeline latches for rename space • When rename space is full, there are functional units that cannot be utilized • Any control strategy that would yield gains is feasible in CMOS technology

  12. Limitations • Using pipeline latches for rename space will increase capacitance and extend time per clock cycle • There are hazards that increasing the rename space size will not fix • There will be a limited number of pipeline latches available • Any implementation of control strategy would be processor dependent

  13. End Product/Deliverables • A research paper detailing the team’s results • Modified SimpleScalar code that simulates the new control algorithm. The code will be documented and maintainable so further work can be done if necessary. • SPICE simulations and results quantifying the effect on processor performance

  14. Approaches Considered 1/3 • Determine how performance is affected by rename space stalls Selected Approach: Simulate using SimpleScalar • Advantages: • SimpleScalar is familiar to the client • SimpleScalar is open source and easily modified • Disadvantages: none

  15. Approaches Considered 2/3 • Find an optimal size for the rename space that will decrease cycle time Approach 1 – Run many simulations varying the rename space size • Advantages: Gives detailed picture of how rename space size relates to performance • Disadvantages: Requires running a large number of simulations • Doesn’t reveal at what size rename space is used most efficiently Approach 2 – Run simulations and determine what rename space is filled to its capacity the largest percentage of the time • Advantages: Gives a detailed picture of how rename space fills up • Disadvantages: Doesn’t reveal what size yields best performance to size ratio Selected: Approach 1 and 2 – to get as much information as possible

  16. Approaches Considered 3/3 • Develop an algorithm to adaptively increase rename space using functional units Approach 1 – Use standard functional units to store instructions or data • Advantages: Doesn’t involve changing the functional units • Disadvantages: May be a significant capacitance increase Approach 2 – Use specially designed functional units • Advantages: May decrease capacitance compared with approach 1 • Disadvantages: Would take a lot of work that might not be worth the gain Selected: Approach 1. Approach 2 is too large a risk without being able to quantify potential gains. If approach 1 proves infeasible, we will switch to approach 2.

  17. Research Activities 1/3 • Research performance results of different rename space sizes
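
Results from simulations at different rename space sizes can be summarized in the two ways the team considered: how often the space is filled to capacity, and which size gives nearly the best performance. The sketch below assumes the simulator output has already been collected into a per-cycle occupancy list and a size-to-IPC mapping; both helpers and their inputs are hypothetical.

```python
def fraction_of_cycles_full(occupancy, capacity):
    """Share of cycles in which the rename space is filled to capacity."""
    if not occupancy:
        return 0.0
    return sum(1 for used in occupancy if used >= capacity) / len(occupancy)


def smallest_size_within(ipc_by_size, tolerance=0.02):
    """Smallest rename space size whose IPC is within `tolerance` of the best IPC.

    ipc_by_size maps a rename space size to the IPC measured at that size
    and must contain at least one entry.
    """
    best = max(ipc_by_size.values())
    good_enough = [size for size, ipc in ipc_by_size.items()
                   if ipc >= best * (1.0 - tolerance)]
    return min(good_enough)
```

The first helper corresponds to measuring how the rename space fills up, the second to locating the point past which extra entries stop paying off.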

  18. Research Activities 2/3 • Can functional units be used for additional rename space? • Find out which functional units are available when stalls happen • Find out how long functional units are available when stalls happen

  19. Research Activities 3/3 • Research relationship between rename space size, clock speed and performance • Decide under what conditions additional rename space should be used • Determine how much adaptive rename space is optimal

  20. Present Accomplishments • Determined that the less rename space there is, the less capacitance there is in the chip and the faster the clock can be set. • Determined that beyond a certain size, the benefit of increasing rename space diminishes sharply. • Determined that using a two-cycle access to adaptive rename space allows us to keep the clock cycle time gain obtained by decreasing the traditional rename space. • Determined that integer ALU and integer multiplier functional units are often available while the processor is stalled on a full rename space. • Determined that additional rename space should be issued in blocks of 8 and for at least 10 cycles. • Determined that using both functional units and dedicated memory for adaptive rename space is the best approach.
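
The block size and minimum hold time reported here suggest a simple allocation policy. What follows is a hedged sketch of one possible controller, not the team's algorithm: the class name, the `max_blocks` limit, and the rule of granting one block per call are assumptions, and the two-cycle access latency is not modeled.

```python
BLOCK_ENTRIES = 8      # from the slide: extra rename space is issued in blocks of 8
MIN_HOLD_CYCLES = 10   # from the slide: each block is kept for at least 10 cycles


class AdaptiveRenameController:
    """Grants and releases extra rename space on top of a fixed base size."""

    def __init__(self, base_entries, max_blocks):
        self.base_entries = base_entries
        self.max_blocks = max_blocks   # assumed cap on borrowed blocks
        self.blocks = []               # allocation cycle of each granted block

    def capacity(self):
        return self.base_entries + BLOCK_ENTRIES * len(self.blocks)

    def step(self, cycle, occupancy, idle_units):
        """Update the extra blocks for this cycle and return the new capacity."""
        if occupancy >= self.capacity():
            # Rename space is full: borrow a block if an idle unit can host it.
            if idle_units > 0 and len(self.blocks) < self.max_blocks:
                self.blocks.append(cycle)
        elif self.blocks:
            # Release the newest block only after its minimum hold time, and
            # only if the remaining capacity still covers what is in use.
            newest = self.blocks[-1]
            fits = occupancy <= self.capacity() - BLOCK_ENTRIES
            if cycle - newest >= MIN_HOLD_CYCLES and fits:
                self.blocks.pop()
        return self.capacity()
```

With a base of 40 entries, a full rename space and one idle unit raise the capacity to 48. The block is then held for at least 10 cycles before it can be returned.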

  21. Design Activities • Designed test cases and simulations with varying rename space sizes • Developed rudimentary control algorithm as a proof of concept

  22. Implementation Activities • Coding of control strategy in SimpleScalar • Evaluation of the clock speed increase from reducing the rename space size from 64 to 40

  23. Future Required Activities • Develop more advanced control strategy to increase gains. • Design physical implementation for fabrication. • Write a paper discussing rename space implementation and control strategy.

  24. Resources Personnel • Hentzel – 55 • Brandt – 86 • Thompson – 65 • Taylor – 52 • Total Hours: 258 Other Resources

  25. Schedules

  26. Project Evaluation Milestones Successfully Completed • Determine how processor performance is affected by rename space. • Determine what functional units can be used to increase rename space. • Find an optimal size for the rename space that will decrease the cycle time of the processor. Milestones in Progress • Develop an algorithm using functional units to adaptively increase rename space size. • Use SPICE simulations to determine the effects of changes on capacitance and cycle time. Milestones Not Yet Begun • Quantify the increase in performance. • Write a research paper detailing the results. The project will be a success! The team is on schedule to complete all milestones and the thorough preparation for the implementation stage has yielded a viable solution.

  27. Commercialization • This project may have future commercial considerations, but our interest is in the academic research.

  28. Recommendations for Further Work • Algorithm could be further optimized and ported to other processor architectures. • Instruction fetch buffer could be examined to find new optimal points with the new architecture.

  29. Lessons Learned • Details of superscalar processors. • Computer architecture research and design flow. • Group motivation and task management of complex and simple tasks.

  30. Risk Management 1/2 • Anticipated • Team motivation • Handled by continually checking group members’ attitudes. • Members falling behind on knowledge and understanding • Handled by weekly meetings where members were asked questions. • Loss of Member or Graduate Advisor • Handled by working with a new graduate student with background in a similar area.

  31. Risk Management 2/2 • Unanticipated • Time requirements of class and meeting time difficulties. • Handled by distributing projects between team members and meeting to review each other’s sections. • Little gain from increasing the rename space size • Handled by making a smaller issue stage with a smaller traditional rename space, so that the clock rate could be increased while adaptive rename space supplies extra capacity when needed.

  32. Closing Summary • The goal was to come up with a viable strategy to enhance processor performance via implementing an adaptive rename space. The solution is on track to be a success. • This project may lead to more efficient processor designs being produced.
