
Reversible Computing Theory I: Reversible Logic Models



Presentation Transcript


  1. Reversible Computing Theory I: Reversible Logic Models

  2. Reversible Logic Models • It is useful to have a logic-level (Boolean) model of adiabatic circuits. • We can model all logic using maximally pipelined logic elements, which consume their inputs. • A.k.a. “input-consuming” gates. • Warning: In such models, memory requires recirculation! Thus this approach is not necessarily more energy-efficient in practice, for all problems, than retractile (non-input-consuming) approaches! • There is a need for more flexible logic models. • If inputs are consumed, then the input→output logic function must be invertible.

  3. Input-consuming inverter

  Before:         After:
  in   out        in   out
  0    -          -    1
  1    -          -    0

  • E.g., SCRL implementation. The input arrow in the gate symbol indicates that the input data is consumed by the element. (An alternate symbol shows simply in → out.) • Invertible! (Symmetric.)

  4. An Irreversible Consuming Gate • Input-consuming NAND gate:

  Before:          After:
  A  B  out        A  B  out
  0  0  -          -  -  1
  0  1  -          -  -  1
  1  0  -          -  -  1
  1  1  -          -  -  0

  • 4 possible inputs, but only 2 possible outputs: at least 2 of the 4 possible input cases must lead to dissipation! • Because it’s irreversible, it has no implementation in SCRL (or any fully adiabatic style) as a stand-alone, pipelined logic element!

  5. NAND w. 1 input copied? • Add a delay buffer that passes through a copy A' of input A. Still not invertible:

  Before:              After:
  A  B  A'  out        A  B  A'  out
  0  0  -   -          -  -  0   1
  0  1  -   -          -  -  0   1
  1  0  -   -          -  -  1   1
  1  1  -   -          -  -  1   0

  • The two before-states (A=0, B=0) and (A=0, B=1) collide: at least 1 of the 2 transitions to the A'=0, out=1 final state must involve energy dissipation of order kB·T. How much, exactly? See exercise.

  6. NAND w. 2 inputs copied? • Finally, invertible!

  Before:                  After:
  A  B  A'  B'  out        A  B  A'  B'  out
  0  0  -   -   -          -  -  0   0   1
  0  1  -   -   -          -  -  0   1   1
  1  0  -   -   -          -  -  1   0   1
  1  1  -   -   -          -  -  1   1   0

  • Any function can be made invertible by simply preserving copies of all inputs in extra outputs. • Note: Not all output combinations here are legal! • Note there are more outputs than inputs; we call this an expanding operation. • But copied inputs can be shared by many gates.
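To make the expansion concrete, here is a minimal Python sketch (not from the original slides) of the expanding NAND: because both inputs are preserved in extra outputs, the map is injective on its four legal inputs, hence reversible.

```python
# Sketch: embedding irreversible NAND into a reversible "expanding" gate
# by preserving copies of both inputs in extra outputs.
def nand_expanding(a: int, b: int) -> tuple[int, int, int]:
    """Map (A, B) -> (A', B', out) with A' = A, B' = B, out = NAND(A, B)."""
    return a, b, 1 - (a & b)

def nand_expanding_inverse(a: int, b: int, out: int) -> tuple[int, int]:
    """Recover (A, B); only 4 of the 8 output combinations are legal."""
    assert out == 1 - (a & b), "illegal (unreachable) output combination"
    return a, b

# The map is injective on its 4 legal inputs, hence reversible:
images = {nand_expanding(a, b) for a in (0, 1) for b in (0, 1)}
assert len(images) == 4
```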

  7. SCRL Pipelined NAND • (Figure: SCRL gate computing out = NAND(A, B), followed by inverter stages that restore A and B.) • Including inverters: 23 transistors. • Not including inverters: 7 transistors. • The inverters are only needed to restore A and B, and can be shared with other gates that take A and B as inputs.

  8. Non-Expanding Gates • Controlled-NOT (CNOT), or input-consuming XOR: A' = A, C = A ⊕ B. Can be implemented with a dyadic gate in SCRL.

  A  B     A'  C
  0  0     0   0
  0  1     0   1
  1  0     1   1
  1  1     1   0

  • Not universal for classical reversible computing. (Even together w. all other 1- and 2-output reversible gates.) • However, if we add 1-input, 1-output quantum gates, the resulting gate set is universal! • More on quantum computing in a couple of weeks.

  9. Toffoli Gate (CCNOT) • Passes A and B through unchanged, and XORs AB into C: A' = A, B' = B, C' = C ⊕ AB.

  A  B  C     A'  B'  C'
  0  0  0     0   0   0
  0  0  1     0   0   1
  0  1  0     0   1   0
  0  1  1     0   1   1
  1  0  0     1   0   0
  1  0  1     1   0   1
  1  1  0     1   1   1
  1  1  1     1   1   0

  • Subsumes AND, NAND, XOR, NOT, FAN-OUT, … • Note that this gate is its own inverse. • Our first universal reversible gate!

  10. Fredkin Gate • The first universal reversible logic gate to be discovered. (Ed Fredkin, mid-’70s.) • B and C are swapped if A = 1, else passed unchanged:

  A  B  C     A'  B'  C'
  0  0  0     0   0   0
  0  0  1     0   0   1
  0  1  0     0   1   0
  0  1  1     0   1   1
  1  0  0     1   0   0
  1  0  1     1   1   0
  1  1  0     1   0   1
  1  1  1     1   1   1

  • It is also conservative: it conserves the number of 1s and 0s. • Thus in theory it requires no separate power input, even if 1 and 0 have different energy levels!
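A small Python sketch (assumed bit-tuple encoding; not part of the lecture) that checks the key properties claimed on the last three slides: CNOT is a bijection, Toffoli and Fredkin are their own inverses, Fredkin is conservative, and Toffoli subsumes AND.

```python
from itertools import product

def cnot(a, b):               # A' = A, C = A XOR B
    return a, a ^ b

def toffoli(a, b, c):         # CCNOT: C' = C XOR (A AND B)
    return a, b, c ^ (a & b)

def fredkin(a, b, c):         # CSWAP: swap B and C iff A = 1
    return (a, c, b) if a else (a, b, c)

# CNOT is invertible: its 4 outputs are all distinct.
assert len({cnot(a, b) for a, b in product((0, 1), repeat=2)}) == 4

for x in product((0, 1), repeat=3):
    assert toffoli(*toffoli(*x)) == x        # Toffoli is its own inverse
    assert fredkin(*fredkin(*x)) == x        # ...and so is Fredkin,
    assert sum(fredkin(*x)) == sum(x)        # which is also conservative

# Toffoli subsumes AND: with C = 0, the third output is A AND B.
assert all(toffoli(a, b, 0)[2] == (a & b) for a, b in product((0, 1), repeat=2))
```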

  11. Reversible Computing Theory II: Emulating Irreversible Machines

  12. Motivation for this study • We want to know how to carry out any arbitrary computation in a way that is reversible to an arbitrarily high degree. • Up to limits set by leakage, power supply, etc. • We want to do this as efficiently as possible: • Using as few “device ticks” as possible (spacetime) • Minimizes HW cost & leakage losses • Using as few adiabatic transitions as possible (ops) • Minimizes frictional losses • But, a desired computation may originally be specified in terms of irreversible primitives.

  13. General-Case vs. Special-Case • We’d like to know two kinds of things: • For arbitrary general-purpose computations, • How to automatically emulate them in a fairly efficient reversible way, • w/o needing new intelligent/creative design work in each case? • Topic of today’s lecture • For various specific computations of interest, • What are the most efficient reversible algorithms? • Or at least, the most efficient that we can find? • Note: These may not necessarily look anything like the most efficient irreversible algorithms! • More on this point later

  14. The Landauer embedding • The obvious embedding of irreversible ops into “expanding” reversible ones leads to a linear increase in space through time. (Landauer ’61) • Or, an increase in the width of an input-consuming circuit. • (Figure: the input feeds a circuit of “expanding” operations (e.g., AND); as circuit depth, or time, increases, the desired output is accompanied by an ever-wider stream of “garbage” bits.)

  15. Lecerf Reversal • Lecerf (’63) was interested in the group-theory question of whether an iterated permutation of items would eventually return to the initial item. • Proved undecidable by reducing Turing’s halting problem to this question, w. a reversible TM. • The reversible TM reverses direction instead of halting. • It returns to its initial state iff the irreversible TM would halt. • Only problem with this: No useful output data! • (Figure: run f forward from the input to the desired output plus garbage, then run f⁻¹ backward, ending with only a copy of the input.)

  16. The Bennett Trick • Bennett (’73) pointed out that you could simply fan-out (reversibly copy) the desired output before reversing. • Note: O(T) storage is still temporarily needed! • (Figure: run f forward from the input to the output plus garbage, copy the output, then run f⁻¹ to uncompute the garbage, leaving the input plus a copy of the output.)
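Here is a minimal Python sketch of the trick on a toy machine (the step/unstep pair is hypothetical, chosen only so the inverse is easy to write): compute forward while saving garbage, copy the answer, then run backward to uncompute.

```python
def step(x):
    """Toy irreversible update x -> x // 2; the dropped low bit is the garbage."""
    return x // 2, x & 1

def unstep(x, garbage_bit):
    """Exact inverse of step, given the saved garbage bit."""
    return 2 * x + garbage_bit

def bennett_run(x0, t):
    history, x = [], x0
    for _ in range(t):            # forward pass: O(T) garbage accumulates
        x, g = step(x)
        history.append(g)
    output = x                    # reversible fan-out: copy the result
    while history:                # reverse pass: uncompute all the garbage
        x = unstep(x, history.pop())
    return output, x              # x is restored to the original input x0

assert bennett_run(1234, 5) == (1234 >> 5, 1234)
```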

  17. Improving Spacetime Efficiency • Bennett ’73 transforms a computation taking spacetime S·T into one taking Θ(S·T^2) spacetime in the worst case. • Can we do better? • Bennett ’89: Described a technique that takes spacetime Θ(S·T^1.58) (this is the k = 2 case of the analysis later in this lecture). • Actually, one can generalize slightly and arrange for the exponent on T to be 1+ε, where ε→0 (very slowly). • Lange, McKenzie, Tapp ’97: Space Θ(S) is even possible, if you use time exp(Θ(S)). • Not any more spacetime-efficient than Bennett.

  18. Reversible “Climbing Game” • Suppose a guy armed with a hammer, N spikes, and a rope is trying to climb a cliff, while obeying the following rules. • Question: How high can he climb? • Rules: • Standing on the ground or on a spike, he can insert and remove a spike 1 meter higher up. • He can raise and lower himself between spikes and the ground using his rope. • He can’t insert or remove a spike while dangling from a higher spike! • Maybe not enough leverage/stability?

  19. Analogy w. Emulation Problem • Height on the cliff represents: how many steps of progress we have made through the irreversible computation. • Number of spikes represents: available memory of the reversible machine. • A spike in the cliff at height H represents: using a unit of memory to record the state of the irreversible machine after H steps. • Adding/removing a spike at height H+1, when there is a spike at height H, represents: computing/uncomputing the state at H+1 steps given the state at H steps.

  20. Let’s Climb! 0. Standing on ground. 1. Insert spike @ 1. 2. Insert spike @ 2. 3. Remove spike @ 1. 4. Insert spike @ 3. 5. Insert spike @ 4. 6. Remove spike @ 3. 7. Insert spike @ 1. 8. Remove spike @ 2. 9. Remove spike @ 1. 10. Can use remaining 3 spikes to climb up another 4 if desired!

  21. How high can we climb? • Using only N spikes, and the strategy illustrated, we can climb to height 2^N − 1 (wow!). • Li & Vitanyi: (Theorem) This is the optimal strategy for this game. • Open question: Are there more efficient general reversiblization techniques that are not based on this game model?
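The illustrated strategy generalizes recursively. Below is a Python sketch (my own encoding of the game, not from the lecture): pin(n, base) pins one spike at base + 2^(n−1) using n spikes and recovers the other n − 1; chaining pins of decreasing order reaches 2^N − 1. For n = 3, pin reproduces exactly the nine moves listed on slide 20.

```python
def pin(n, base, moves, sign=+1):
    """Pin (sign=+1) or unpin (sign=-1) a spike at height base + 2**(n-1),
    starting from a footing at `base`, using n spikes; all but the pinned
    spike are recovered."""
    if n == 1:
        moves.append(("insert" if sign > 0 else "remove", base + 1))
        return
    half = base + 2 ** (n - 2)
    pin(n - 1, base, moves)            # scaffold a spike at the halfway point
    pin(n - 1, half, moves, sign)      # pin/unpin the target from there
    pin(n - 1, base, moves, -1)        # uncompute the scaffold spike

def climb(n_spikes):
    """Reach height 2**n - 1 by pinning at +2**(m-1) for m = n, n-1, ..., 1."""
    moves, height = [], 0
    for m in range(n_spikes, 0, -1):
        pin(m, height, moves)
        height += 2 ** (m - 1)
    return height, moves

moves = []
pin(3, 0, moves)                       # the 9 moves shown on slide 20
assert climb(4)[0] == 2 ** 4 - 1       # N = 4 spikes reach height 15
```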

  22. “Pebble Game” Representation

  23. Triangle representation • (Figure: triangle diagrams of the recursion, for k = 2, n = 3 and for k = 3, n = 2.)

  24. Analysis of Bennett Algorithm • n = # of recursive levels of the algorithm • k = # of lower-level iterations to go forward 1 higher-level step • Tr = # of reversible lowest-level steps executed = 2(2k−1)^n • Ti = # of irreversible steps emulated = k^n • So n = log_k Ti, and therefore Tr = 2(2k−1)^(log Ti/log k) = 2e^(log(2k−1)·log(Ti)/log k) = 2·Ti^(log(2k−1)/log k) • (Uses n+1 spikes.)
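These counts are easy to sanity-check numerically; a quick Python sketch (the example values of k and n are arbitrary):

```python
from math import isclose, log

def T_rev(k, n):    # reversible lowest-level steps: 2 * (2k - 1)^n
    return 2 * (2 * k - 1) ** n

def T_irr(k, n):    # irreversible steps emulated: k^n
    return k ** n

k, n = 3, 5
Ti, Tr = T_irr(k, n), T_rev(k, n)
# Closed form from the slide: Tr = 2 * Ti^(log(2k-1) / log k)
assert isclose(Tr, 2 * Ti ** (log(2 * k - 1) / log(k)))
```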

  25. Linear-Space Emulation (Lange, McKenzie, Tapp ’97) • Unfortunately, the tree may have 2^S nodes!

  26. Can we do better? • Bennett ’73 takes order-T time; LMT ’97 takes order-S space. • Can some technique achieve both simultaneously? • Theorem (Frank & Ammer ’97): The problem of iterating a black-box function cannot be done in both time O(T) and space O(S) on a reversible machine. • The proof really does cover all possible algorithms! • The paper also proves loose lower bounds on the extra space required by a linear-time simulation. • The results might also be extended to the problem of iterating a cryptographic one-way function. • It’s not yet clear if this can be made to work.

  27. One-Way Functions • …are invertible functions f such that f is easy to compute (e.g., takes polynomial time) but f⁻¹ is hard to compute (e.g., takes exponential time). • A simple example: consider f(p, q) = p·q with p, q prime. • Multiplication of integers is easy. • Factoring is hard (except using quantum computers). • The “one-way-ness” of this function is essential to the security of the RSA public-key cryptosystem. • No function has yet been proven to be one-way. • However, certain kinds of one-way functions are known to exist if P ≠ NP.
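A toy-scale Python sketch of the asymmetry (the particular primes are arbitrary examples): the forward direction is a single multiplication, while the naive inverse, trial division, already takes on the order of √(pq) steps.

```python
def f(p, q):                     # easy direction: one multiplication
    return p * q

def f_inverse(n):                # hard direction: brute-force factoring
    d = 3
    while d * d <= n:            # ~sqrt(n) trial divisions in the worst case
        if n % d == 0:
            return d, n // d
        d += 2
    raise ValueError("no nontrivial odd factor found")

n = f(100_003, 100_019)          # two 6-digit primes
assert f_inverse(n) == (100_003, 100_019)   # already ~50,000 iterations
```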

  28. Elements of Frank-Ammer Proof • Consider a chain of bit-strings (size S each) that is incompressible by a certain compressor. • Such a chain is easily proven to exist. (See next slide.) • The machine’s job is to follow this chain from one node to the next by using a black-box function. • The compressor can run a reversible machine backwards, to reconstruct earlier nodes in the chain from later machine configurations. • If the reversible machine only uses order-S space in its configurations, then the chain is compressible! • This contradicts the choice of an incompressible chain; QED.

  29. Existence of Incompressibles • A decompressor or description system s: {0,1}* → {0,1}* maps any bit-string description d to the described string x. • (Notation f: D means a unary operator on D, i.e., f: D → D.) • x is compressible in s iff ∃d: s(d) = x with |d| < |x|. • (Notation |b| means the length of bit-string b in bits.) • Theorem: Every decompressor has an incompressible input of any given length ℓ. • Proof: There are 2^ℓ length-ℓ bit-strings, but only 2^ℓ − 1 shorter descriptions. ■

  30. Cost-Efficiency Analysis • Cost Efficiency • Cost Measures in Computing • Generalized Amdahl’s Law

  31. Cost-Efficiency • The cost-efficiency of anything is %$ = $min/$: • the fraction of the actual cost $ that really needed to be spent to get the thing, using the best possible method. • Measures the relative number of instances of the thing that can be accomplished per unit cost, • compared to the maximum number possible. • Inversely proportional to cost $. • Maximizing %$ means minimizing $, • regardless of what $min actually is. • In computing, the “thing” is a computational task that we wish to carry out.

  32. Components of Cost • The cost $ of a computation may generally be a sum of terms for many different components: • Time-proportional (or related) costs: • Cost to user of having to wait for results • E.g., missing deadlines, incurring penalties. • May increase nonlinearly with time for long times. • Spacetime-proportional (or related) costs: • Cost of raw physical spacetime occupied by computation. • Cost to rent the space. • Cost of hardware (amortized over its lifetime) • Cost of raw mass-energy, particles, atoms. • Cost of materials, parts. • Cost of assembly. • Cost of parts/labor for operation & maintenance. • Cost of SW licenses

  33. More cost components • Continued... • Area-time proportional (or related) costs: • Cost to rent a portion of an enclosing convex hull for getting things in & out of the system • Energy, heat, information, people, materials, entropy. • Some examples incurring area-time proportional costs: • Chip area, power level, cooling capacity, I/O bandwidth, desktop footprint, floor space, real estate, planetary surface • Note that area-time costs also scale with the maximum number of items that can be sent/received. • Energy expenditure proportional (or related) costs: • Cost of raw free energy expenditure (entropy generation). • Cost of energy-delivery system. (Amortized.) • Cost of cooling system. (Amortized.)

  34. General Cost Measures • The most comprehensive cost measure includes terms for all of these potential kinds of costs: $comprehensive = $Time + $SpaceTime + $AreaTime + $FreeEnergy • $Time is most generally a non-decreasing function f of the elapsed time t_start→end; simple model: $Time ∝ t_start→end. • $FreeEnergy is most generally an accumulation of free-energy losses over the run; simple model: $FreeEnergy ∝ S_generated. • $SpaceTime and $AreaTime are most generally time-integrals of the space (resp. enclosing area) occupied; simple models: $SpaceTime ∝ Space × Time (scales with the max # of ops that could be done), and $AreaTime ∝ Area × Time (scales with the max # of items that could be I/O’d).

  35. Generalized Amdahl’s Law • Given any cost that is a sum of components, $tot = $1 + … + $n, • there are diminishing proportional returns to be gained from reducing any single cost component (or subset of components) to much less than the sum of the remaining components. • ∴ Design-optimization effort should concentrate on those cost components that dominate the total cost for the application of interest. • At a “design equilibrium,” all cost components will be roughly equal (unless externally driven).
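A tiny worked example of the diminishing-returns effect (numbers invented for illustration): with two equal cost components, speeding up one of them alone can never better than halve the total cost.

```python
time_cost, spacetime_cost = 50.0, 50.0       # two equal cost components

for speedup in (2, 10, 100):
    total = time_cost / speedup + spacetime_cost
    print(f"{speedup:>4}x speedup of one component -> total cost {total}")
# 2x -> 75.0, 10x -> 55.0, 100x -> 50.5: the overall gain saturates below 2x.
```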

  36. Reversible vs. Irreversible • Want to compare their cost-efficiency under various cost measures: • Time • Entropy • Area-time • Spacetime • Or, for some applications, one quantity might be minimized while another one (space, time, area) is constrained by some hard limit. • Note that space (volume, mass, etc.) by itself as a cost measure is only significant if either: • (a) the computer isn’t reusable, so the cost to build it dominates operating costs, or • (b) I/O latency ∝ V^(1/3) affects other costs.

  37. Time Cost Comparison • For computations with unlimited power/cooling capacity, and no communication requirements: • Reversible is worse than irreversible by a factor of ~s > 1 (the adiabatic slowdown factor), times perhaps a small constant depending on the logic style used: $r,Time ≈ $i,Time · s

  38. Time Cost Comparison, cont. • For parallelizable, power-limited applications: • With nonzero leakage: $r,Time ≈ $i,Time / Ron/off^g • Worst-case computations: g ≈ 0.4 • Best-case computations: g = 0.5. • For parallelizable, area-limited, entropy-flux-limited, best-case applications: • With leakage → 0: $r,Time ≈ $i,Time / d^(1/2), • where d is the system’s physical diameter. • (See transparency.)

  39. Time cost comparison, cont. • For entropy-flux-limited, parallel, heavily communication-limited, best-case applications: • With leakage approaching 0: $r,Time ≈ $i,Time^(3/4), • where $i,Time scales up with the space requirement V as $i,Time ∝ V^(2/9), • so the reversible speedup scales with the 1/18 power of system size. • Not super-impressive! (Details later.)

  40. Bennett ’89 alg. is not optimal • (Figure: the triangle diagrams for k = 2, n = 3 and k = 3, n = 2 again.) • Just look at all the spacetime it wastes!!!

  41. Parallel “Frank02” algorithm • We can simply scrunch the triangles closer together to eliminate the wasted spacetime! • The resulting algorithm is linear-time for all n and k, and dominates Ben89 for time, spacetime, & energy! • (Figure: packed triangles for k = 2, n = 3; k = 3, n = 2; and k = 4, n = 1, with emulated time plotted against real time.)

  42. Setup for Analysis • For the energy-dominated limit, let cost “$” equal energy. • c$ = energy coefficient; r$ = r$(min) = leakage power. • $i = energy dissipation per irreversible state-change. • Let the on/off ratio be Ron/off = r$(max)/r$(min) = Pmax/Pmin. • Note that c$ ≈ $i·tmin = $i·($i / r$(max)), so r$(max) ≈ $i^2/c$. • So Ron/off ≈ $i^2 / (c$·r$(min)) = $i^2 / (c$·r$).

  43. Time Taken • There are n levels of recursion. • Each multiplies the width of the base of the triangle by k. • Lowest-level triangles take time c·t_op. • Total time is thus c·t_op·k^n. • (Figure: a k = 4, n = 1 triangle whose base is 4 sub-units wide.)

  44. Number of Adiabatic Ops • Each triangle contains k + (k − 1) = 2k − 1 immediate sub-triangles. • There are n levels of recursion. • Thus the number of adiabatic ops is c·(2k − 1)^n. • (Figure: for k = 3, n = 2, there are 5^2 = 25 little triangles, i.e., adiabatic operations.)

  45. Spacetime Usage • Each triangle includes the spacetime usage of all 2k − 1 of its immediate sub-triangles, • plus (k−1)(k−2)/2 = (k^2−3k+2)/2 additional spacetime units, each consisting of 1 storage unit (1 state of the irreversible machine being stored) held for time t_op·k^(n−1). • (Figure: for k = 5, n = 1, the stored states contribute 1 + 2 + 3 = 6 such units.) • Resulting recurrence relation: ST(k, 0) = 1 (or c); ST(k, n) = (2k−1)·ST(k, n−1) + ((k^2−3k+2)/2)·k^(n−1).
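The recurrence transcribes directly into code; a small sketch with the lowest level normalized to ST(k, 0) = 1:

```python
def ST(k, n):
    """Spacetime of the level-n triangle: (2k-1) sub-triangles plus
    (k-1)(k-2)/2 storage units held for k^(n-1) lowest-level step times."""
    if n == 0:
        return 1
    return (2 * k - 1) * ST(k, n - 1) + (k * k - 3 * k + 2) * k ** (n - 1) // 2

# Sanity check against the k = 5, n = 1 figure: 9 leaf triangles plus
# the 1 + 2 + 3 = 6 stored-state units.
assert ST(5, 1) == 9 + 6
```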

  46. Reversible Cost • Adiabatic cost plus spacetime cost: $r = $adia + $st = (2k−1)^n·c$/t + ST(k,n)·r$·t. • Minimizing over t gives: $r = 2[(2k−1)^n·ST(k,n)·c$·r$]^(1/2). • But in the energy-dominated limit, c$·r$ ≈ $i^2 / Ron/off, • so: $r = 2$i·[(2k−1)^n·ST(k,n) / Ron/off]^(1/2).

  47. Tot. Cost, Orig. Cost, Advantage • Total cost, with $i for the irreversible operation performed at the end of the algorithm plus the reversible cost: $tot = $i·{1 + 2[(2k−1)^n·ST(k,n) / Ron/off]^(1/2)}. • The original irreversible machine performing k^n ops would use cost $orig = $i·k^n. • So the advantage ratio between irreversible & reversible cost is R$(i/r) = $orig/$tot = k^n / (1 + 2[(2k−1)^n·ST(k,n) / Ron/off]^(1/2)).

  48. Optimization Algorithm • For any given value of Ron/off: • Scan the possible values of n (up to some limit); • for each of those, scan the possible values of k, • until the maximum R$(i/r) for that n is found • (the function only has a single local maximum); • and return the max R$(i/r) over all n tried.
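A Python sketch of this scan, assembling the formulas from the preceding three slides (n_max and the stopping rule are my own choices; the k-scan relies on the single-local-maximum property noted above):

```python
from math import sqrt

def ST(k, n):                     # spacetime recurrence from slide 45
    if n == 0:
        return 1
    return (2 * k - 1) * ST(k, n - 1) + (k * k - 3 * k + 2) * k ** (n - 1) // 2

def advantage(k, n, r_on_off):    # R$(i/r) = $orig / $tot, from slide 47
    return k ** n / (1 + 2 * sqrt((2 * k - 1) ** n * ST(k, n) / r_on_off))

def optimize(r_on_off, n_max=30):
    best = (1.0, 1, 0)            # (ratio, k, n); k = 1, n = 0 means no emulation
    for n in range(1, n_max + 1):
        k, prev = 2, 0.0
        while True:               # climb the single peak in k, then stop
            r = advantage(k, n, r_on_off)
            if r < prev:
                break
            prev, k = r, k + 1
        best = max(best, (prev, k - 1, n))
    return best

print(optimize(1e9))              # best (R$(i/r), k, n) for Ron/off = 10^9
```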

  49. (Figure: plots of the energy saved and the spacetime blowup as functions of k and n.)

  50. Asymptotic Scaling • The potential energy-savings factor scales as R$(i/r) ∝ Ron/off^~0.4, • while the spacetime overhead goes only as R$(i/r)^~0.45, or Ron/off^~0.18. • E.g., with an Ron/off of 10^9, you can do worst-case computation in an adiabatic circuit with: • An energy savings of up to a factor of 1,200×! • But this operating point is 700,000× less hardware-efficient, if the Frank02 algorithm is used for the emulation.
