Image Compression System Megan Fuller and Ezzeldin Hamed
Transforms of Images • Original image • Magnitude of the DFT of (image - 128); otherwise the DC component = ~8e6 • Image reconstructed from 25% of the DFT coefficients
The 2D Discrete Fourier Transform Where This can be computed separably by rearranging:
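The formulas for this slide are not reproduced in the transcript. As a sketch, assuming the standard definition of the 2D DFT of an N x N image x(n1, n2) (the symbol names are illustrative and may differ from the authors' notation), the transform and its separable rearrangement can be written as:

```latex
% Assumed standard definition for an N x N image x(n_1, n_2).
X(k_1, k_2) = \sum_{n_1=0}^{N-1} \sum_{n_2=0}^{N-1}
              x(n_1, n_2)\, W_N^{n_1 k_1}\, W_N^{n_2 k_2},
\qquad W_N = e^{-j 2\pi / N}

% Separable form: the bracketed inner sum is a 1D DFT of row n_1,
% and the outer sum is a 1D DFT along each column of those results.
X(k_1, k_2) = \sum_{n_1=0}^{N-1} W_N^{n_1 k_1}
              \left[ \sum_{n_2=0}^{N-1} x(n_1, n_2)\, W_N^{n_2 k_2} \right]
```

Written this way, the 2D transform reduces to 1D transforms of the rows followed by 1D transforms of the columns.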
The 2D Discrete Cosine Transform • Computed separably • Computed as a DFT + 1 multiply • Generally gives better energy compaction than DFT
High Level Architecture • Diagram blocks: Input; Memory; Separable, in-place 2D DFT/DCT; Coefficient > Threshold?; Output Module (sending data to PC) • The choice between DFT and DCT is provided at compile time • Threshold is provided by the user at run time
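The "Coefficient > Threshold?" stage can be illustrated with a minimal software sketch. It assumes the comparison is made on the coefficient magnitude, which the slide does not state; the function names `threshold_coefficients` and `fraction_kept` are hypothetical.

```python
def threshold_coefficients(coeffs, threshold):
    """Keep only coefficients whose magnitude exceeds the threshold.

    coeffs is a 2D list of (possibly complex) transform coefficients.
    Returns a dict mapping (row, col) to the kept coefficient.
    Comparing on magnitude is an assumption, not taken from the slides.
    """
    kept = {}
    for r, row in enumerate(coeffs):
        for c, value in enumerate(row):
            if abs(value) > threshold:
                kept[(r, c)] = value
    return kept


def fraction_kept(coeffs, threshold):
    """Fraction of coefficients that survive the threshold."""
    total = sum(len(row) for row in coeffs)
    return len(threshold_coefficients(coeffs, threshold)) / total
```

Raising the threshold keeps fewer coefficients, trading reconstruction quality for a higher compression ratio.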
What’s Interesting? • Reducing the computation required • Sharing resources in the DCT case • Some memory organization tricks • Reducing bit width
Number of FFTs • Using the FFT to calculate the 1D-DFT • We need 2N FFTs (N rows plus N columns) to calculate the 2D-DFT of an N x N image • Can we reduce the number of FFTs?
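As a rough illustration of the baseline cost, the sketch below computes a 2D DFT by the row-column method and counts the 1D transforms it uses. A naive O(N^2) `dft` stands in for the FFT to keep the example self-contained; all function names are hypothetical.

```python
import cmath


def dft(seq):
    """Naive 1D DFT, standing in for an FFT."""
    n = len(seq)
    return [sum(seq[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def dft2_direct(image):
    """2D DFT straight from the double-sum definition (square image)."""
    n = len(image)
    return [[sum(image[n1][n2] * cmath.exp(-2j * cmath.pi * (n1 * k1 + n2 * k2) / n)
                 for n1 in range(n) for n2 in range(n))
             for k2 in range(n)]
            for k1 in range(n)]


def dft2_separable(image):
    """Row-column 2D DFT; also returns how many 1D transforms were run."""
    n = len(image)
    calls = 0
    rows = []
    for row in image:                      # one 1D DFT per row
        rows.append(dft(row))
        calls += 1
    cols = []
    for c in range(n):                     # one 1D DFT per column
        cols.append(dft([rows[r][c] for r in range(n)]))
        calls += 1
    result = [[cols[c][r] for c in range(n)] for r in range(n)]
    return result, calls
```

For an N x N input the counter comes out to 2N, which is the figure the reductions on the following slides aim to halve.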
Reduction for the DFT case • Using the DFT properties • Input is real • Output is conjugate-symmetric • Combining rows • Even/Odd decomposition • N/2 FFTs of the rows, followed by Even/Odd decomposition • Output is symmetric (discard half the columns) • N/2 FFTs of the columns • Total of N FFT computations • (Figure: 4x4 array of samples S00 to S33, with Real and Imag labels)
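The row-combining step relies on a standard identity: two real sequences can share one complex transform and be separated afterwards by an even/odd (conjugate-symmetric/antisymmetric) decomposition. A minimal sketch follows, with a naive `dft` standing in for the FFT and the hypothetical name `two_real_dfts`; it is not the authors' hardware implementation.

```python
import cmath


def dft(seq):
    """Naive 1D DFT, standing in for an FFT."""
    n = len(seq)
    return [sum(seq[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def two_real_dfts(a, b):
    """DFTs of two real sequences a and b using a single complex DFT.

    Pack a into the real part and b into the imaginary part, transform once,
    then split the result using Z(k) and conj(Z(N - k)).
    """
    n = len(a)
    z = [complex(p, q) for p, q in zip(a, b)]
    Z = dft(z)
    A, B = [], []
    for k in range(n):
        zc = Z[(n - k) % n].conjugate()
        A.append((Z[k] + zc) / 2)      # conjugate-symmetric part -> DFT of a
        B.append((Z[k] - zc) / 2j)     # conjugate-antisymmetric part -> DFT of b
    return A, B
```

Applied to pairs of image rows, this turns N row transforms into N/2.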
Reduction in the DCT case • Again combining the rows in the same way as in the DFT case (N/2 FFTs) • Even/Odd decomposition, then an extra multiplication to calculate the DCT • Results are not symmetric • But the DCT is real • We can combine the columns the same way we combined the rows (N/2 FFTs) • The same multiplier inside the FFT is used • Another Even/Odd decomposition is required here, with an extra complex multiplier • Total of N FFT computations + a few extra multiplications • (Figure: samples S00 to S33 arranged in pairs, with Real and Imag labels)
In-Place Radix-4 FFT • Critical path • Fixed point arithmetic • Bit Width? • Quantization noise • Rounding instead of Truncation • Avoid any overflow • additions • Needs extra bits • Can we do better?
Static Scaling Vs. Dynamic Scaling Static scaling: • Shift when you expect an overflow • Shift after each addition • The location of the fraction point is fixed at each computation step • Almost no overhead compared to fixed point • Higher effective bit width only in the first computation steps • No effect on the critical path Dynamic scaling: • Shift only when overflow occurs • Track overflows and account for them • The location of the fraction point is the same for each 1D-FFT frame • Needs simple circuitry to track the overflow and shift when required • Effective bit width depends on the data • No effect on the critical path
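The difference between the two schemes can be sketched in software. The example below is an illustration only: `add_stage` is a hypothetical stand-in for one stage of FFT additions, the word size `bits` is a parameter, and the shared per-frame exponent mirrors the slide's "same fraction point for each 1D-FFT frame". The actual hardware may detect and apply shifts differently.

```python
def add_stage(values):
    """Hypothetical stand-in for one FFT stage: pairwise sums and differences."""
    half = len(values) // 2
    sums = [values[i] + values[i + half] for i in range(half)]
    diffs = [values[i] - values[i + half] for i in range(half)]
    return sums + diffs


def fits(value, bits):
    """True if value is representable as a signed integer of the given width."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


def run_static(values, stages):
    """Static scaling: always shift right by one after every addition stage."""
    exponent = 0
    for _ in range(stages):
        values = [(v + 1) >> 1 for v in add_stage(values)]  # rounded shift
        exponent += 1
    return values, exponent


def run_dynamic(values, stages, bits):
    """Dynamic scaling: shift the whole frame only when a value overflows."""
    exponent = 0
    for _ in range(stages):
        values = add_stage(values)
        while not all(fits(v, bits) for v in values):
            values = [(v + 1) >> 1 for v in values]          # rounded shift
            exponent += 1                                    # track the shift
    return values, exponent
```

In both cases the true result is approximately `value * 2**exponent`. With small inputs the dynamic version never shifts and keeps every bit, while the static version discards low-order bits at each stage whether or not an overflow was possible.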
Design Space Explored • Dynamic scaling (yes/no) x transform (DFT/DCT) x bit width (8, 12, 16) • 8 bits with dynamic scaling considered later • 8 bits without dynamic scaling (and 12 for DCT) perform too poorly to be considered • 12 bits does as well as 16 bits with dynamic scaling in the DFT
Dynamic Scaling of DFT • 50% of the coefficients are sufficient for perfect reconstruction because of the symmetry of the DFT • 16 bits without dynamic scaling does as well as floating point • 12 bits with dynamic scaling also does nearly as well as floating point
Dynamic Scaling of DFT (continued) • The improvement in performance from dynamic scaling more than makes up for the reduced compression caused by having to save the scaling bits • 12 bits with dynamic scaling does nearly as well as 16 bits
DCT Vs. DFT • All cases are using dynamic scaling • DCT provides better energy compaction • For DCT, 12 bits gives a lower MSE for a given compression ratio (this was not the case for the DFT).
8 Bits • Image reconstructed from 50% of the DFT coefficients, computed with 8 bits, using dynamic scaling. MSE = 452. • Image reconstructed from 6% of the DFT coefficients, computed with 16 bits. MSE = 129.
Physical Considerations • Critical path about the same for all designs, could probably be improved with tighter synthesis constraints • Resource usage increases with bitwidth, addition of dynamic scaling, and DCT, but overall doesn’t change much • DCT uses extra DSP blocks because of the extra multiplication
Future Work • Use of DRAM to allow compression of larger images • Support for color images • Support for rectangular images of arbitrary edge length • Combining the DCT and DFT into a single core that could compute either transform, as selected by the user at runtime
Relationship Between the DFT and the DCT The N-point DFT of a sequence gives the Fourier series coefficients of that sequence made periodic with period N.
Relationship Between the DFT and the DCT (continued) The N-point DCT of a sequence x(n) is a twiddle factor multiplied by the first N Fourier series coefficients of the 2N-point sequence y(n) made periodic with period 2N, where y(n) = x(n) + x(2N-1-n).
Relationship Between the DFT and the DCT (continued) The DCT can be computed from the DFT as follows: • Define the sequences y(n) = x(n) + x(2N-1-n) and v(n) = y(2n) • Compute the N-point DFT of v(n), V(k)
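The steps listed above can be sketched as follows. The final step (multiplying V(k) by a twiddle factor and taking the real part) is not in the transcript; it is filled in here from the standard N-point-FFT method for the DCT-II, and the unnormalized DCT-II definition in `dct_direct` is likewise an assumption. A naive `dft` stands in for the FFT, and all function names are hypothetical.

```python
import cmath
import math


def dft(seq):
    """Naive 1D DFT, standing in for an FFT."""
    n = len(seq)
    return [sum(seq[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def dct_direct(x):
    """Unnormalized DCT-II: C(k) = sum_n x(n) cos(pi k (2n + 1) / (2N))."""
    n = len(x)
    return [sum(x[m] * math.cos(math.pi * k * (2 * m + 1) / (2 * n)) for m in range(n))
            for k in range(n)]


def dct_via_dft(x):
    """DCT-II computed from one N-point DFT of the reordered sequence v(n)."""
    n = len(x)

    def y(i):
        # y(i) = x(i) + x(2N-1-i), with x taken as zero outside 0..N-1
        first = x[i] if i < n else 0.0
        j = 2 * n - 1 - i
        second = x[j] if j < n else 0.0
        return first + second

    v = [y(2 * i) for i in range(n)]          # v(n) = y(2n)
    V = dft(v)
    # Assumed final step: one complex multiply per coefficient, keep the real part.
    return [(cmath.exp(-1j * cmath.pi * k / (2 * n)) * V[k]).real for k in range(n)]
```

The single twiddle multiplication per output is consistent with the earlier slide's description of the DCT as "a DFT + 1 multiply".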
Rounding • Conclusion: never hurt, often helped. • Free in hardware (just a register initialization), so always use it. • All subsequent results use rounding.
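One common way rounding becomes "just a register initialization" is to preload the accumulator with half of the least significant bit that will remain after the shift, instead of zero. The sketch below assumes that interpretation; the names `truncate`, `round_shift`, and `mac` are hypothetical.

```python
def truncate(value, shift):
    """Drop the low `shift` bits (rounds toward negative infinity)."""
    return value >> shift


def round_shift(value, shift):
    """Drop the low `shift` bits after adding half an output LSB (round to nearest)."""
    return (value + (1 << (shift - 1))) >> shift


def mac(pairs, shift, rounding):
    """Multiply-accumulate followed by a right shift.

    With rounding enabled, the only change is the initial accumulator value,
    so no extra adder is needed on the datapath.
    """
    acc = (1 << (shift - 1)) if rounding else 0   # register initialization
    for a, b in pairs:
        acc += a * b
    return acc >> shift
```

For example, the products 3*5 and 2*7 sum to 29; shifting right by 3 gives 3 with truncation and 4 with rounding, against an exact value of 3.625.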
Limitations of MSE • Image reconstructed from 5.7% of the DCT coefficients, computed with dynamic scaling. MSE = 193 • Image reconstructed from 6.1% of the DCT coefficients, computed without dynamic scaling. MSE = 338
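For reference, a minimal sketch of the metric, assuming the slides use the usual per-pixel mean squared error between the original and reconstructed images (the exact definition is not given in the transcript; `mse` is a hypothetical name):

```python
def mse(original, reconstructed):
    """Mean squared error between two equally sized 2D pixel arrays."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(original, reconstructed):
        for p, q in zip(row_a, row_b):
            total += (p - q) ** 2
            count += 1
    return total / count
```

Because the metric averages squared pixel differences over the whole image, it does not distinguish a few large, visible artifacts from many small, invisible ones, which is one reason it can disagree with perceived quality.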
More Limitations of MSE • 8 bit DFT coefficients: (left) computed with rounding, compression ratio = 2.3, MSE = 869; (right) computed without rounding, compression ratio = 2.1, MSE = 664 • 8 bit DCT coefficients: (left) computed with rounding, compression ratio = 2.2, MSE = 517; (right) computed without rounding, compression ratio = 2.4, MSE = 563