Carry-Lookahead Addition

Ripple-Carry Adder
• The current design uses a “ripple-carry” adder technique
• Cout of each full adder (FA) propagates into the Cin of the next adder
• What is the associated electrical delay for this scheme?
• Assume each gate (AND/OR only) has a delay of T units
• With a two-level logic implementation of a single FA, the delay to compute Cout is 2T

Carry-Lookahead Adder
• A 16-bit ripple-carry adder has a total delay of 15 · 2T + T = 31T to compute the sum!
• The delay grows linearly with the size of the adder
• Is there a faster way to add? Yes.
• A faster design uses a “carry-lookahead” adder technique
• Real ALUs use this style
• The idea is to compute the carry-in needed at each bit position with only a very small delay (smaller than in the ripple-carry case)
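
The linear growth can be sketched in Python. This is only an illustration: the names `ripple_add` and `ripple_delay` are made up here, and the delay formula simply follows the slide's accounting (2T per rippled carry, plus 1T for the last sum bit).

```python
def ripple_add(a, b, n=16, c0=0):
    """Add two n-bit integers one bit at a time, rippling the carry upward."""
    carry = c0
    total = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        total |= (ai ^ bi ^ carry) << i                   # sum bit i
        carry = (ai & bi) | (ai & carry) | (bi & carry)   # carry-out of bit i
    return total, carry


def ripple_delay(n):
    """Delay in units of T, per the slide: (n - 1) carries at 2T each, plus 1T."""
    return (n - 1) * 2 + 1


print(ripple_add(0b0110, 0b0011, n=4))  # (9, 0)
print(ripple_delay(16))                 # 31
```

Each loop iteration needs the carry from the previous one, which is exactly why the hardware delay grows with the number of bits.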

Generating a Carry
• An adder will “generate” a carry-out from the sum of the bits a_i and b_i if a_i · b_i = 1 (i.e. a_i and b_i are both 1)
• Define g_i = a_i · b_i (generate)
• Hence: cout_i = cin_{i+1} = 1 if g_i = 1
• Let c_i = “carry-in to position i”
• Note c_{i+1} = carry-in to position i+1 = carry-out from position i
• Delay to compute each g = 1T

Propagating a Carry
• An adder will “propagate” a carry-in (c_i) through the sum of the bits a_i and b_i if c_i = 1 and a_i + b_i = 1, where + is logical OR (i.e. the carry-in is 1 and at least one of a_i or b_i is 1)
• Define p_i = a_i + b_i (propagate)
• Hence: cout_i = c_{i+1} = g_i + p_i · c_i
• A carry-out occurs from position i if it is either
  • generated by position i, or
  • a carry-in propagated by position i
• Delay to compute each p = 1T
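
The generate and propagate definitions and the recurrence c_{i+1} = g_i + p_i · c_i translate directly into bit operations. A minimal sketch follows; the function name `carries` is hypothetical and chosen only for this example.

```python
def carries(a, b, n=4, c0=0):
    """Return [c0, c1, ..., cn], where c[i+1] = g[i] OR (p[i] AND c[i])."""
    c = [c0]
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        g = ai & bi          # generate: both input bits are 1
        p = ai | bi          # propagate: at least one input bit is 1
        c.append(g | (p & c[i]))
    return c


# 0101 + 0011: bit 0 generates a carry, bits 1 and 2 propagate it, bit 3 stops it.
print(carries(0b0101, 0b0011))  # [0, 1, 1, 1, 0]
```

Note that this version still computes the carries one after another; it only restates the full-adder carry in terms of g and p.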

Propagate / Generate
• Ex: using 4 bits
  c0 = initial carry-in
  c1 = g0 + p0 c0
  c2 = g1 + p1 c1
     = g1 + p1 (g0 + p0 c0)
     = g1 + p1 g0 + p1 p0 c0
  c3 = g2 + p2 c2
     = g2 + p2 (g1 + p1 g0 + p1 p0 c0)
     = g2 + p2 g1 + p2 p1 g0 + p2 p1 p0 c0
• Delay to compute each c = 2T: a fixed delay!
• This assumes that g and p are pre-computed
• They take a total of 1T to pre-compute
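
One way to convince yourself that the flattened equations are equivalent to the step-by-step recurrence is an exhaustive check over all values of g, p, and c0. The sketch below does that; the helper names `lookahead_carries` and `recurrence_carries` are invented for this illustration.

```python
from itertools import product


def lookahead_carries(g, p, c0):
    """Flattened two-level equations; g and p are lists indexed [0], [1], [2]."""
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    return c1, c2, c3


def recurrence_carries(g, p, c0):
    """Same carries, computed one after another with c[i+1] = g[i] | (p[i] & c[i])."""
    out, c = [], c0
    for gi, pi in zip(g, p):
        c = gi | (pi & c)
        out.append(c)
    return tuple(out)


# Try every combination of g0..g2, p0..p2 and c0 (2**7 = 128 cases).
for bits in product((0, 1), repeat=7):
    g, p, c0 = list(bits[0:3]), list(bits[3:6]), bits[6]
    assert lookahead_carries(g, p, c0) == recurrence_carries(g, p, c0)
print("flattened equations match the recurrence")
```

In the flattened form every carry depends only on g, p, and c0, so in hardware all of them can be computed at the same time.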

An Abstraction of Propagate / Generate
• These equations require large gate “fan-in” to implement in 2T delay, so stop the expansion at 4 bits, as above
• Delay to compute each c = 2T: a fixed delay!
  • pre-computed: 1T for each p and g (in parallel!)
  • 1T for the ANDs that create the subgroups (product terms)
  • 1T for the OR of all subgroups
• Each sum bit S_i can now be computed in 3T delay

4-Bit Carry-Lookahead Adder
• Combine these ideas to design a 4-bit adder with 3T delay for the entire 4-bit sum (assuming p and g are pre-computed)
• What are P0 and G0?
  • P0 = p3 p2 p1 p0 (super-propagate)
  • G0 = g3 + p3 g2 + p3 p2 g1 + p3 p2 p1 g0 (super-generate)
• Note: the device has no “carry-out”, only P0 and G0

Super-Generate / Super-Propagate
P0 = p3 p2 p1 p0 (super-propagate)
G0 = g3 + p3 g2 + p3 p2 g1 + p3 p2 p1 g0 (super-generate)
• P0 represents the propagate for the entire 4-bit unit
• P0 takes 1T to compute
• G0 represents the generate for the entire 4-bit unit
• G0 takes 2T to compute
• P0 and G0 represent a higher level of hardware abstraction of propagation and generation
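
The two group signals can be written as a small Python function. This is a sketch only: `group_pg` is a hypothetical name, and the bit lists are assumed to be ordered from index 0 (least significant) to index 3.

```python
def group_pg(g, p):
    """Return (P, G) for one 4-bit unit; g and p are lists indexed [0]..[3]."""
    P = p[3] & p[2] & p[1] & p[0]
    G = (g[3]
         | (p[3] & g[2])
         | (p[3] & p[2] & g[1])
         | (p[3] & p[2] & p[1] & g[0]))
    return P, G


# g = 0001 and p = 0111 (written g3..g0 and p3..p0): the carry generated
# at bit 0 is blocked at bit 3, so the unit neither generates nor propagates.
print(group_pg([1, 0, 0, 0], [1, 1, 1, 0]))  # (0, 0)
```

Although the unit has no carry-out pin, the carry that would leave it can be recovered as G0 + P0 · c0, which is what the next level of lookahead logic uses.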

16-bit C.L. Adder
• Carry-lookahead logic implements:
  • Cin(0) = c0 (initial carry-in)
  • Cin(1) = G0 + P0 c0
  • Cin(2) = G1 + P1 G0 + P1 P0 c0
  • Cin(3) = G2 + P2 G1 + P2 P1 G0 + P2 P1 P0 c0
• Delay for Cin is (2T + 2T) = 4T
• Delay for Sum = 4T + 3T (per unit) + 1T (for pre-computation of p and g) = 8T
• Compare to 31T for the ripple-carry adder
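
Putting the pieces together, the two-level scheme can be modeled in software. The sketch below is a functional model only (it checks the logic, not the gate delays), and `cla16` is a name chosen for this example rather than anything from the slides.

```python
def cla16(a, b, c0=0):
    """16-bit sum built from four 4-bit lookahead units plus unit-level lookahead."""
    g = [(a >> i) & (b >> i) & 1 for i in range(16)]      # generate bits
    p = [((a >> i) | (b >> i)) & 1 for i in range(16)]    # propagate bits

    # Super-propagate / super-generate for each 4-bit unit.
    P, G = [], []
    for k in range(4):
        g0, g1, g2, g3 = g[4 * k:4 * k + 4]
        p0, p1, p2, p3 = p[4 * k:4 * k + 4]
        P.append(p3 & p2 & p1 & p0)
        G.append(g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & g0))

    # Carry-lookahead logic: carry-in to each unit, Cin(0)..Cin(3).
    cin = [
        c0,
        G[0] | (P[0] & c0),
        G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0),
        G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0),
    ]

    total = 0
    for k in range(4):
        g0, g1, g2, _ = g[4 * k:4 * k + 4]
        p0, p1, p2, _ = p[4 * k:4 * k + 4]
        ck = cin[k]
        # Carries inside the unit, again from flattened equations.
        c = [
            ck,
            g0 | (p0 & ck),
            g1 | (p1 & g0) | (p1 & p0 & ck),
            g2 | (p2 & g1) | (p2 & p1 & g0) | (p2 & p1 & p0 & ck),
        ]
        for j in range(4):
            i = 4 * k + j
            total |= (((a >> i) ^ (b >> i) ^ c[j]) & 1) << i   # sum bit i
    return total


print(hex(cla16(0x1234, 0x0FCD)))  # 0x2201
```

No carry is ever passed bit by bit: every carry is a two-level AND/OR expression of signals that are already available, which is where the fixed 8T figure comes from.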

16-bit C.L. Adder Example (1)
A:  0110 0011 1101 0101
B:  1110 1101 1000 0011
g:  0110 0001 1000 0001                  (1T)
p:  1110 1111 1101 0111
P0: 0·1·1·1 = 0                          (1T, 2T total)
P1: 1·1·0·1 = 0
P2: 1·1·1·1 = 1
P3: 1·1·1·0 = 0
G0: 0 + 0·0 + 0·1·0 + 0·1·1·1 = 0        (2T, 3T total)
G1: 1 + 1·0 + 1·1·0 + 1·1·0·0 = 1
G2: 0 + 1·0 + 1·1·0 + 1·1·1·1 = 1
G3: 0 + 1·1 + 1·1·1 + 1·1·1·0 = 1

16-bit C.L. Adder Example (2)
• Computing the actual sum (bit 6 only, shown in red on the original slide):
A: 0110 0011 1101 0101
B: 1110 1101 1000 0011
g: 0110 0001 1000 0001
p: 1110 1111 1101 0111
P: 0100
G: 1110
S6 = a6 ⊕ b6 ⊕ c6
   = 1 ⊕ 0 ⊕ (g5 + p5 g4 + p5 p4 Cin(1))
   = 1 ⊕ 0 ⊕ (0 + 0·0 + 0·1·(G0 + P0 c0))
   = 1 ⊕ 0 ⊕ (0 + 0·0 + 0·1·(0 + 0·0))
   = 1
Delay to compute S6 = 1T + (2T + 2T) + (2T + 1T) = 8T
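
The worked example can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic; the helper `bit` and the variable names are made up here, and c0 = 0 is an assumption taken from the zeros substituted in the slide's last step.

```python
A = 0b0110_0011_1101_0101
B = 0b1110_1101_1000_0011
c0 = 0  # assumption: the example uses an initial carry-in of 0


def bit(x, i):
    """Bit i of x (bit 0 is the least significant)."""
    return (x >> i) & 1


g = A & B   # generate bits, all 16 at once
p = A | B   # propagate bits, all 16 at once

# Super-propagate / super-generate for units 0..3 (unit 0 = bits 3..0).
P, G = [], []
for k in range(4):
    g0, g1, g2, g3 = (bit(g, 4 * k + j) for j in range(4))
    p0, p1, p2, p3 = (bit(p, 4 * k + j) for j in range(4))
    P.append(p3 & p2 & p1 & p0)
    G.append(g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & g0))

cin1 = G[0] | (P[0] & c0)   # carry-in to unit 1 (bits 7..4)
c6 = bit(g, 5) | (bit(p, 5) & bit(g, 4)) | (bit(p, 5) & bit(p, 4) & cin1)
s6 = bit(A, 6) ^ bit(B, 6) ^ c6

print(f"g = {g:016b}")          # g = 0110000110000001
print(f"p = {p:016b}")          # p = 1110111111010111
print("P3..P0 =", P[::-1])      # P3..P0 = [0, 1, 0, 0]
print("G3..G0 =", G[::-1])      # G3..G0 = [1, 1, 1, 0]
print("S6 =", s6)               # S6 = 1
```

As a cross-check, bit 6 of the ordinary integer sum `A + B` is also 1.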

Test Yourself
• Compute sum bit S10 (red bits only):
A: 0110 1001 1001 0101
B: 1011 0101 1000 1011
g:
p:
P:
G:
S10 = a10 ⊕ b10 ⊕ c10 =