
Synchronization Ideas

Synchronization Ideas. Charles E. Dike, Intel Corporation. Introduction: a tutorial that shares some ideas about synchronization and metastability and introduces a new, improved theory on metastability. Charles Dike (cdike@ichips.intel.com).


Presentation Transcript


  1. Synchronization Ideas Charles E. Dike Intel Corporation

  2. Introduction • Tutorial • Share some ideas about synchronization and metastability • Introduce NEW, IMPROVED theory on metastability • Charles Dike (cdike@ichips.intel.com)

  3. Why and where synchronize? • Reduce latency between independent clock domains. • Asynchronous domain to synchronous clock. • Synchronous clock to an independent synchronous clock. • Benefit: higher performance in critical circuits. [Figure: clock domains labeled Synchronous Clock at 1.5 GHz, Synchronous Clock at 3.0 GHz, Asynchronous Circuit, Pausable Clock at 1.8 GHz]

  4. Design Direction • 80s: towards 100 MHz • 90s: towards 1 GHz • 00s: multi-GHz [Figure: replicated MEM, FPU, and ALU blocks; label VALUE ADDED]

  5. Chip Area Networks • Late 00s: multi-GHz

  6. I believe…. • We must be able to synchronize all domains to a PLL controlled clock • Interconnect on chip will be asynchronous (GALS) • We need to minimize latency • There will be two basic synchronizer uses - near neighbor and the chip net

  7. Topics of Discussion • Generic synchronizer of the type used in the TeraFlops computer • Simple synchronizer of the type used in StrongArm • The Myrinet pipeline synchronization scheme • Latest understanding of metastability

  8. Generic Synchronizer • Handles self timed to synchronous interfaces and vice-versa • Supports synchronous to synchronous interfaces • Can handle streaming data • Adaptable to any speed range • Possibly used over the chip network

  9. Two flop synch [Figure: two D flip-flops, #1 and #2; labels VALID, CLK]

  10. Single latch synch [Figure: D flip-flops and an S/R latch; labels LATCH OUTPUT, SENDER CLOCK, RECEIVER CLOCK, ACK, REQ, CLK1, CLK2, Write Valid, Read Valid]

  11. Multi latch synch [Figure: D flip-flops and S/R latches; labels ACK, REQ, CLK1, CLK2, Write Valid, Read Valid]

  12. General Case [Figure: WRITE POINTER, STATUS REGISTER, and READ POINTER bit fields with SYNCHRONIZERS (SYNC) between them; labels EMPTY, FULL, PADDING, LATENCY, EN, Write Clock, Read Clock, Write Enable]

  13. empty case [Figure: flip-flops with reset (R) and enable (EN) inputs forming WRITE POINTER, READ POINTER, STATUS REGISTER, and SYNCHRONIZER; labels EMPTY, Write Pointer a, Write Pointer b, Read Pointer a, Read Pointer b, Write Enable, Write Clock, Read Clock]

  14. General Case [Figure: WRITE POINTER, STATUS REGISTER, and READ POINTER bit fields with SYNCHRONIZERS (SYNC) between them; labels EMPTY, FULL, PADDING, LATENCY, EN, Write Clock, Read Clock, Write Enable]

  15. Topics of Discussion • Generic synchronizer of the type used in the TeraFlops computer • Simple synchronizer of the type used in StrongArm microprocessor • The Myrinet pipeline synchronization scheme • Latest understanding of metastability

  16. Simple Synchronizer • Constrained by frequency ratio • Supports synchronous to synchronous interfaces • Does it support asynch to synch? Yes, with restrictions. • Possibly used in local neighbor synchronizers

  17. Simple Synchronizer [Figure: four D flip-flops; signals A, A1, A2, A3, SYNC; nodes w, x, y, z; SLOW CLK, FAST CLK, Divide by 2, MI*] MI* = Metastable Immune

  18. timing1 [Figure: simple synchronizer schematic (A, A1, A2, A3, SYNC, SLOW, FAST, MI*, Divide by 2) with a timing diagram over fast clock cycles 1 to 6; traces FAST CLOCK, SLOW CLOCK, A, A1, A2, A3, SYNC]

  19. timing2 [Figure: simple synchronizer schematic (A, A1, A2, A3, SYNC, SLOW, FAST, MI*, Divide by 2) with a timing diagram over fast clock cycles 1 to 6; traces FAST CLOCK, SLOW CLOCK, SYNC, CHEATER CLOCK]

  20. timing3 [Figure: simple synchronizer schematic (A, A1, A2, A3, SYNC, SLOW, FAST, MI*, Divide by 2) with a timing diagram over fast clock cycles 1 to 6; traces FAST CLOCK, SLOW CLOCK, SYNC, CHEATER CLOCK]

  21. timing4 [Figure: two synchronizer schematics (A, A1, A2, A3, SYNC, SLOW, FAST, MI*, Divide by 2) with a timing diagram over fast clock cycles 1 to 6; traces FAST CLOCK, SLOW CLOCK, SYNC, SLOW CLOCK#, SYNC]

  22. transfers [Figure: timing diagram over fast clock cycles 1 to 6 with traces FAST CLOCK, SLOW CLOCK, SYNC, CHEATER CLOCK; two flip-flop pairs labeled SLOW TO FAST TRANSFER and FAST TO SLOW TRANSFER, each with SYNC, FAST CLOCK, and SLOW CLOCK inputs]

  23. Topics of Discussion • Generic synchronizer of the type used in the TeraFlops computer • Simple synchronizer of the type used in StrongArm • The Myrinet pipeline synchronization scheme • Latest understanding of metastability

  24. Pipeline Synchronizer • Supports synchronous to synchronous interfaces • Supports asynch to synch and vice-versa • Possibly used in local neighbor synchronizers • Essentially a distributed fifo and synchronizer

  25. Pipeline Synchronizer [Figure: three stages, each with an S element and ports Ri, Ro, Di, Do, Ai, Ao; clock phases f0, f1, f0]

  26. ME element [Figure: S block built around an ME element; labels R0, A0, R1, A1, f0, X, REQ]

  27. Fifo element [Figure: two C elements; ports Ri, Ro, Di, Do, Ai, Ao; Data]

  28. Async to sync [Figure: three pipeline synchronizer stages (S elements; ports Ri, Ro, Di, Do, Ai, Ao; clock phases f0, f1) between an Asynchronous side and a Synchronous side]

  29. Sync to async [Figure: three pipeline synchronizer stages (S elements; ports Ri, Ro, Di, Do, Ai, Ao; clock phases f0, f1) between an Asynchronous side and a Synchronous side]

  30. Points to ponder #1 • All synchronizing interfaces have one thing in common - a latching element that holds data while metastabilities are being resolved. • There is no way to avoid the latency which is required to resolve metastabilities. • To minimize latency the latching element characteristics can be improved. • We will be required to understand and use this knowledge. This is the future of digital design.

  31. Topics of Discussion • Generic synchronizer of the type used in the TeraFlops computer • Simple synchronizer of the type used in StrongArm • The Myrinet pipeline synchronization scheme • Latest understanding of metastability

  32. Role of the Synchronizing Flop • Reorients incoming information to a clock edge • Its performance determines system failure rate or latency

  33. Real Life • There is no magic bullet • There is a lot of misinformation on metastability around • To date many circuits have been over designed through planning and luck • Whenever a circuit fails based on too high of a frequency ultimately the cause of failure is metastability • There is no way to synchronize a signal faster than about the time it takes to pass a signal through six static gates

  34. Metastability is.... [Figure: labels NODE A, NODE B, OUT, SET, RESET]

  35. Technical terms • Tw (window size): likelihood of entering a metastable state, in units of time • Tau (τ): rate at which metastability resolves, in units of time • MTBF (Mean Time Between Failures): $\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$ • Thermal noise: $\langle V_n^2 \rangle = 4kT/C$
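To make the MTBF expression on this slide concrete, here is a minimal Python sketch that evaluates MTBF = e^(t/τ) / (Tw·fd·fc). It reads fd as the data transition rate and fc as the clock frequency, which is the usual interpretation but is not spelled out on the slide. The function name `mtbf_seconds` is made up for this sketch. The 20 ps τ and the 1.5 GHz clock appear elsewhere in the deck, while the Tw value, the data rate, and the resolution times are hypothetical numbers chosen only for illustration.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def mtbf_seconds(t_resolve, tau, t_w, f_data, f_clock):
    """Basic synchronizer MTBF: exp(t / tau) / (Tw * fd * fc), in seconds."""
    return math.exp(t_resolve / tau) / (t_w * f_data * f_clock)


# Illustrative values only; T_W and F_DATA are assumptions, not from the slides.
TAU = 20e-12      # resolution time constant, 20 ps
T_W = 10e-12      # metastability window, 10 ps (assumed)
F_CLOCK = 1.5e9   # clock frequency, 1.5 GHz
F_DATA = 150e6    # data transition rate (assumed)

for t_resolve in (200e-12, 400e-12, 600e-12):
    mtbf = mtbf_seconds(t_resolve, TAU, T_W, F_DATA, F_CLOCK)
    print(f"t = {t_resolve * 1e12:5.0f} ps  "
          f"MTBF = {mtbf:.3e} s ({mtbf / SECONDS_PER_YEAR:.3e} years)")
```

Because the resolution time sits in the exponent, each additional τ of waiting multiplies the MTBF by e, which is why small changes in τ dominate the result.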

  36. Simple jamb latch [Figure: plot of propagation delay versus Δ time of data after clock; latch schematic with labels NODE A, NODE B, OUT, DATA, CLOCK, RESET]

  37. Simple jamb latch [Figure: plot of propagation delay versus Δ time of data after clock; latch schematic with labels NODE A, NODE B, OUT, DATA, CLOCK, RESET; annotation: ~RC time constant]

  38. Rough Histogram [Figure: propagation delay versus Δ time of data after clock, on a linear scale and on a log scale; Tw marked; annotation: the slope is the τ] $\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$

  39. Why is the theory a problem? $\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$ • It assumes a uniform distribution of data about the clock • What happens when data always violates the setup/hold window? • It is not detailed enough • Doesn’t consider a deterministic region • Doesn’t account for thermal noise • People tend to extrapolate the theory improperly

  40. Overview of refined theory • Not everything past a normal propagation is a metastable event • The Tw window can’t be improved by input edge rates • Tw has a complex relationship to τ based on load • The MTBF formula needs to be modified due to non-uniform distribution of data about the clock input

  41. Schematic

  42. Simulation of a typical latching device

  43. Test case [Figure: PULSE GENERATOR #1 and PULSE GENERATOR #2 with DELAY elements driving a D/Q latch (labels PC, R); TEK 11801-B OSCILLOSCOPE with TRIGGER and INPUT]

  44. Measuring real data [Figure: annotation: advancing time]

  45. Histogram [Figure: inflection point marked; annotation: 0.6 mV / 0.1 ps; axis: time]

  46. Histogram [Figure: inflection point marked; annotation: 0.6 mV / 0.1 ps; axis: time]

  47. Measured versus Basic [Figure: two plots of propagation delay versus Δ time of data after clock (log scale); Tw marked; annotations: the slope is the τ, 0.6 mV / 0.1 ps] $\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$

  48. τ Simulated.... [Figure: Battery, Voltage Controlled Switch; R1 = 100 Ω, R1 = 100 MΩ]

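  49. Tau Simulated 2 • Latch outputs at nodes 1 and 2 [Figure: volts, 0.0 to 1.5, versus time, 1.0 to 1.4 ns] • Semilog difference between latch outputs [Figure: volts, $10^{-6}$ to $10^{0}$ on a log scale, versus time, 1.0 to 1.4 ns, with $t_1$ and $t_2$ marked] • $\tau = \dfrac{|t_1 - t_2|}{\ln(V_2 / V_1)}$, where $V_1$ = voltage at time $t_1$ and $V_2$ = voltage at time $t_2$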
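The two-point τ extraction on this slide can be sketched in Python as follows. This is a minimal illustration, not the author's simulation flow: the helper name `tau_from_two_points` is hypothetical, the waveform is a synthetic pure exponential generated with an assumed τ of 20 ps, and the absolute value on the logarithm is an addition so the two samples can be given in either order.

```python
import math


def tau_from_two_points(t1, v1, t2, v2):
    """Estimate tau from two samples of the latch output difference.

    On a semilog plot the difference grows as a straight line, so
    tau = |t1 - t2| / |ln(v2 / v1)|.
    """
    return abs(t1 - t2) / abs(math.log(v2 / v1))


# Synthetic data (assumed): the difference starts at 1 microvolt at 1.0 ns
# and grows as exp(t / tau) with tau = 20 ps.
TRUE_TAU = 20e-12
T_START = 1.0e-9
V_START = 1e-6


def difference_voltage(t):
    """Exponentially diverging voltage between the two latch nodes."""
    return V_START * math.exp((t - T_START) / TRUE_TAU)


t1, t2 = 1.0e-9, 1.2e-9
estimate = tau_from_two_points(t1, difference_voltage(t1),
                               t2, difference_voltage(t2))
print(f"estimated tau = {estimate * 1e12:.2f} ps")
```

In a real simulation the two samples would be taken from the straight-line region of the semilog plot, before the outputs saturate toward the rails.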

  50. • $k = 1.38 \times 10^{-23}$ J/K • $\tau = 20$ picoseconds • $B = 1/\tau = 5 \times 10^{10}$ Hz • $R \approx 400\ \Omega$ • $T = 300$ K • $\langle V_n^2 \rangle = 4kT/C = 4kTBR$ • $V_n \approx 0.6$ mV
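The arithmetic on this slide can be checked with a few lines of Python. The sketch below plugs the slide's values into ⟨Vn²⟩ = 4kTBR. The capacitance is not given on the slide; it is derived here under the assumption that τ = RC, which is what makes 4kT/C and 4kTBR equal when B = 1/τ.

```python
import math

k = 1.38e-23   # Boltzmann constant, J/K
T = 300.0      # temperature, K
tau = 20e-12   # resolution time constant, s
R = 400.0      # resistance, ohms (approximate value from the slide)

B = 1.0 / tau                  # bandwidth, Hz
vn_squared = 4.0 * k * T * B * R
vn = math.sqrt(vn_squared)     # RMS thermal noise voltage, V

# Assumption: tau = R * C, so the implied node capacitance is tau / R.
C = tau / R
vn_squared_from_c = 4.0 * k * T / C

print(f"B  = {B:.2e} Hz")
print(f"Vn = {vn * 1e3:.3f} mV")
print(f"implied C = {C * 1e15:.1f} fF")
```

The result lands just under 0.6 mV, in line with the rounded figure on the slide.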
