1 / 83

Digital Integrated Circuits A Design Perspective

Digital Integrated Circuits A Design Perspective. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikoli ć. Timing Issues. January 2003. Synchronous Timing. Timing Definitions. D. Q. Clk. Latch Parameters. the register maximum propagation delay t c-q is clock to output and

nolcha
Download Presentation

Digital Integrated Circuits A Design Perspective

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Digital Integrated CircuitsA Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolić Timing Issues January 2003

  2. Synchronous Timing

  3. Timing Definitions

  4. D Q Clk Latch Parameters the register maximum propagation delay tc-q is clock to output and td-q is data to output delay data D must be stable to be properly registered in the latch (no unintended changes when the latch is transparent) T Clk PWm tsu D thold td-q tc-q Q Intended change must come before the latch closesby at least tsu Delays can be different for rising and falling data transitions

  5. Register Parameters D Q Data must be stable before the rising edge of the clock and held sufficiently long to be processed by the register Clk T Clk thold D tsu tc-q Q Delays can be different for rising and falling data transitions

  6. Clock Uncertainties Sources of clock uncertainty

  7. Clock Nonidealities • Clock skew (constant delay) • Spatial variation in temporally equivalent clock edges; deterministic + random, tSK • Clock jitter (random variations) • Temporal variations in consecutive edges of the clock signal; modulation + random noise • Cycle-to-cycle (short-term) tJS • Long term tJL • Variation of the pulse width • Important for level sensitive clocking

  8. Clock Skew and Jitter Clk • Both skew and jitter affect the effective cycle time • Only skew affects the race margin tSK Clk tJS

  9. Clock Skew # of registers Earliest occurrenceof Clk edge Nominal – /2 Latest occurrenceof Clk edge Nominal + /2 Clk delay Insertion delay Max Clk skew 

  10. Positive and Negative Skew

  11. Positive Skew Launching register clock edge arrives before the receiving register clock edge

  12. Negative Skew Receiving register clockedge arrives before the launching register clockedge

  13. Timing Constraints Were cd stands for a contamination or a minimum delay both in register propagation time and combinational logic delay Minimum clock cycle time: T -  = tc-q + tsu + tlogic Worst case is when receiving edge arrives early (negative ) thus a negative clock skew reduces the clock frequency

  14. Timing Constraints Hold time constraint: t(c-q, cd) + t(logic, cd) > thold +  Worst case is when receiving edge arrives late (positive skew) Race between data and clock is more likely for a positive clock skew

  15. Impact of Jitter Since jitter is a random delay it increases the minimum clock period and increases likelihood for race between clock and data

  16. Longest Logic Path in Edge-Triggered Systems TJI + d TSU Clk TClk-Q TLM T Latest point of launching Earliest arrivalof next cycle TLM - the maximum logic delay

  17. Clock Constraints in Edge-Triggered Systems If the launching edge is late and the receiving edge is early, the data will not be too late if: Tc-q + TLM + TSU < T – TJI,1 – TJI,2 - d Minimum cycle time is determined by the maximum delays through the logic Tc-q + TLM + TSU + d + 2 TJI < T Skew can be either positive or negative

  18. Shortest Path Shortest path effects feedback connections that typically have a negative clock skew Earliest point of launching Clk TClk-Q TLm TLm - the minimum logic delay Clk TH Data must not arrivebefore this time Nominalclock edge

  19. Clock Constraints in Edge-Triggered Systems If launching edge is early and receiving edge is late: Tc-q + TLm – TJI,1 > TH + TJI,2 + d Minimum logic delay Tc-q + TLm > TH + 2TJI+ d For clock skew only we had: t(c-q, cd) + t(logic, cd) > thold + 

  20. How to counter Clock Skew?

  21. Flip-Flop – Based Timing Logic propagation must finish before the next clock’s rising edge Skew Flip-flop delay Logic delay f TSU TClk-Q Flip-flop f = 1 f = 0 Logic Clock cycle Representation after M. Horowitz, VLSI Circuits 1996.

  22. Flip-Flops and Dynamic Logic Logic delay TSU TSU TClk-Q TClk-Q f = 1 f = 0 f = 1 f = 0 Logic delay Precharge Evaluate Precharge Evaluate In dynamic logic gates logic propagation must finish before the clock’s falling edge Dual relation holds for the PUN controlled by inverted clocks Flip-flops are used only with static logic

  23. Latch timing When data arrives to a transparent latch tD-Q Latch is a ‘soft’ barrier D Q Clk tClk-Q When data arrives to closed latch Data has to be ‘re-launched’

  24. Single-Phase Clock with Latches f Latch Logic Tskl Tskl Tskt Tskt Clk PW P

  25. Latch-Based Design L1 latch is transparentwhen f = 0 L2 latch is transparent when f = 1 f L1 L2 Logic Latch Latch Logic

  26. L 1 L1 L 2 In CLB_B CLB_A Q Q D D D Q t t a b c d e pd,A pd,B CLK1 CLK2 CLK1 T CLK j k l m CLK1 CLK2 slack passed to next stage shortening the clock period requirement t t t t pd,B pd,A DQ DQ e valid valid a valid valid b c valid d Slack-borrowing

  27. Clock Distribution H-tree balances the clock skew Clock is distributed in a tree-like fashion

  28. More realistic H-tree [Restle98]

  29. The Grid Clock Distribution • Does not require • rc-matching • Large power dissipation • Easier to satisfy metal • density requirement • in fabrication • Good thermal distribution

  30. Example: DEC Alpha 21164 Clock Frequency: 300 MHz - 9.3 Million Transistors Total Clock Load:3.75 nF Power in Clock Distribution network : 20 W (out of 50) Uses Two Level Clock Distribution: • Single 6-stage driver at center of chip • Secondary buffers drive left and right side clock grid in Metal3 and Metal4 Total driver size: 58 cm!

  31. final drivers pre-driver 21164 Clocking tcycle= 3.3ns • 2 phase single wire clock, distributed globally • 2 distributed driver channels • Reduced RC delay/skew • Improved thermal distribution • 3.75nF clock load • 58 cm final driver width • Local inverters for latching • Conditional clocks in caches to reduce power • More complex race checking • Device variation effects symmetry tskew = 150ps trise = 0.35ns Clock waveform Location of clock driver on die

  32. Clock Skew in Alpha Processor Clock skew

  33. tcycle= 1.67ns trise = 0.35ns tskew = 50ps EV6 (Alpha 21264) Clocking 600 MHz – 0.35 micron CMOS • 2 Phase, with multiple conditional buffered clocks • 2.8 nF clock load • 40 cm final driver width • Local clocks can be gated “off” to save power • Reduced load/skew • Reduced thermal issues • Multiple clocks complicate race checking Global clock waveform

  34. 21264 Clocking

  35. ps 5 10 15 20 25 30 35 40 45 50 ps 300 305 310 315 320 325 330 335 340 345 EV6 Clock Results GCLK Skew (at Vdd/2 Crossings) GCLK Rise Times (20% to 80% Extrapolated to 0% to 100%)

  36. EV7 Clock Hierarchy Active Skew Management and Multiple Clock Domains + widely dispersed drivers + DLLs compensate static and low-frequency variation + divides design and verification effort - DLL design and verification is added work + tailored clocks

  37. Self-timed and Asynchronous Design Functions of clock in synchronous design 1) Acts as completion signal 2) Ensures the correct ordering of events Truly asynchronous design 1) Completion is ensured by careful timing analysis 2) Ordering of events is implicit in logic Self-timed design 1) Completion ensured by completion signal 2) Ordering imposed by handshaking protocol

  38. Synchronous Pipelined Datapath • Make sure that the clock period T is larger than the max delay • T > max(tpd1,tpd2,tpd3 )+tpd,reg • Problems: • Clock skew and jitter • Strong clock currents, induces noise due to package inductance • Power dissipation • Uneven stage delay could be used to support faster processing

  39. Self-Timed Pipelined Datapath Necessary for self-timed logic is a completion signal

  40. LOGIC In Out NETWORK DELAY MODULE Critical path replica Start Done Using Delay Element (e.g. in memories) Completion Signal Generation • Completion signal can be generated by: • Replica delay • Dual-rail coding

  41. Completion Signal Generation Completion signal generation by dual-rail coding requires a redundancy in data representation Below two bits B0 and B1 represent a single bit value B value B

  42. Completion Signal in DCVSL V V DD DD B 0 Start Done B 1 B 0 B 1 In 1 In 1 PDN PDN In 2 In 2 Generation of a completion signal in DCVSL Start

  43. Self-Timed Adder Done signal generated after all carry signals are stable

  44. Completion Signal Using Current Sensing Current sensor outputs a low value when no current flows through the logic and a high value when logic is switching

  45. Hand-Shaking Protocol Two Phase Handshake Sender can cannot change its data once it sends the request signal which finishes its active cycle Receiver reads the data and produces acknowledge signal, this will start a new cycle and sender can process new data Req and Ack signals can be generated in both high-low and low-high transitions

  46. A B F n +1 A 0 0 0 F C 0 1 F n 1 0 F B n 1 1 1 (a) Schematic (b) Truth table Event Logic – The Muller-C Element Implementations of Muller-C element

  47. Data Sender Receiver logic logic Data Ready Data Accepted Req C Ack Handshake logic 2-phase Handshake Protocol Initially Req, Ack, & Data Ready are 0 With Data Ready = 1 Req goes high and Data is transmitted Once this is finished Ack goes high and control is passed to the sender

  48. Out In R 1 R 2 R 3 En Done Req Req i 0 C C C Ack Ack o i Example: Self-timed FIFO Data transferred on positive and negative transmission of En Done is a delayed En signal Examine operation of FIFO by plotting signals

  49. 2-Phase Protocol

More Related