
Wavelet-Based Speech Enhancement


Presentation Transcript


  1. Wavelet-Based Speech Enhancement Sharif University of Technology

  2. Presentation Outline • Motivation and Goals • Wavelet Transform - Overview • Basic Denoising in Wavelet Domain • Literature Survey • Implementation and Results • Conclusions and Future Work

  3. Motivation and Goals: Key Applications • Improving the perceptual quality of speech • Reducing listener fatigue • Hearing aids • Improving the performance of: • Speech coders • Voice recognition systems

  4. Motivation and Goals: Goals of SE in Wavelet Domain • Variable window size for different frequency components • Long time intervals → precise low-frequency information • Short time intervals → precise high-frequency information • Easy to implement • Fast WT computational complexity: O(n) • FFT computational complexity: O(n log2 n) • Denoising by simple thresholding • Real-time implementation

  5. Wavelet Transform - Overview • Motivation and Goals • Wavelet Transform - Overview (current section) • Basic Denoising in Wavelet Domain • Literature Survey • Implementation and Results • Conclusions and Future Work

  6. Wavelet Transform - Overview: History • Fourier (1807) • Haar (1910) • Math World

  7. Wavelet Transform - Overview • What kind of basis function could be useful? • Impulse function (Haar): best time resolution • Sinusoids (Fourier): best frequency resolution • We want the best of both resolutions • Heisenberg (1930) • Uncertainty Principle • There is a lower bound on the product of the time and frequency resolutions (an intuitive proof is given in [Mac91])

  8. Wavelet Transform - Overview • Gabor (1945) • Short Time Fourier Transform (STFT) • Disadvantage: Fixed window size

  9. Wavelet Transform - Overview • Constructing Wavelets • Daubechies (1988) • Compactly Supported Wavelets • Computation of WT Coefficients • Mallat (1989) • A fast algorithm using filter banks
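
As an illustration of the filter-bank idea, the following is a minimal sketch rather than the presenter's implementation. It assumes the Haar wavelet (the slides do not fix a wavelet here), and the function names are hypothetical. Each level passes the current approximation through a low-pass and a high-pass filter and downsamples by two, so the total work stays proportional to the signal length.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_analysis(x):
    """One filter-bank level: return (approximation, detail), each half as long."""
    if len(x) % 2:
        raise ValueError("signal length must be even")
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_synthesis(approx, detail):
    """Invert one level: interleave the reconstructed even and odd samples."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x


def haar_dwt(x, levels):
    """Multi-level DWT. Returns [A_L, D_L, ..., D_1] (coarsest first)."""
    coeffs = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_analysis(approx)
        coeffs.insert(0, detail)
    coeffs.insert(0, approx)
    return coeffs


def haar_idwt(coeffs):
    """Rebuild the signal from [A_L, D_L, ..., D_1]."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        approx = haar_synthesis(approx, detail)
    return approx


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coeffs = haar_dwt(signal, 3)
restored = haar_idwt(coeffs)
```

Because only the approximation is split again at each level, the subband lengths halve (4, 2, 1, 1 for the 8-sample example), which is where the linear cost comes from.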

  10. Wavelet Transform - Overview: Multiresolution Signal Representation • The coarse version (Approximation) is more useful than the Detail for: • Browsing image databases on the web • Signal transmission for communication • Denoising • Wavelet Tree Decomposition: • Wavelet Transform (WT) • Undecimated WT (UWT) • We may lose what is in the Detail

  11. Wavelet Transform - Overview: Full Tree Decomposition • Wavelet Packet Transform (WPT) • Undecimated WPT (UWPT) • S = A1+D1 or S = A1+AD2+DD2 or … • Which decomposition path could be the best choice? The answer leads us to the Best Basis.

  12. Wavelet Transform - Overview: Best Basis Selection Criteria • Cut if: • Entropy: Coifman, Meyer, Wickerhauser (1992) • Rate-Distortion: Vetterli (1995)
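
The cut condition itself is not preserved in this transcript, so the following sketch rests on assumptions: it uses the additive Shannon-type cost commonly associated with entropy-based best basis search, a Haar packet split, and the rule "keep the parent node when its cost does not exceed the combined best cost of its two children". The function names and the "A"/"D" path labels are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def shannon_cost(coeffs):
    """Additive entropy-like cost: -sum(c^2 * log(c^2)), skipping zeros."""
    return -sum(c * c * math.log(c * c) for c in coeffs if c != 0.0)


def haar_split(x):
    """Split a node into its low-pass (A) and high-pass (D) children."""
    low = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    return low, high


def best_basis(x, max_level, path=""):
    """Return (cost, leaves); leaves is a list of (path, coefficients)."""
    own_cost = shannon_cost(x)
    if max_level == 0 or len(x) < 2 or len(x) % 2:
        return own_cost, [(path, list(x))]
    low, high = haar_split(x)
    cost_a, leaves_a = best_basis(low, max_level - 1, path + "A")
    cost_d, leaves_d = best_basis(high, max_level - 1, path + "D")
    if own_cost <= cost_a + cost_d:
        # Assumed cut rule: the children do not lower the cost, so prune them.
        return own_cost, [(path, list(x))]
    return cost_a + cost_d, leaves_a + leaves_d


# A smooth (constant) unit-energy frame is cheaper to represent after splitting.
smooth_cost, smooth_leaves = best_basis([0.5, 0.5, 0.5, 0.5], max_level=2)
# A single impulse is already cheapest in the time domain, so the root is kept.
impulse_cost, impulse_leaves = best_basis([1.0, 0.0, 0.0, 0.0], max_level=2)
```

The search is bottom-up: each node compares its own cost against the best cost achievable below it, so the selected leaves always tile the whole frequency axis.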

  13. Basic Denoising in Wavelet Domain • Motivation and Goals • Wavelet Transform - Overview • Basic Denoising in Wavelet Domain (current section) • Literature Survey • Implementation and Results • Conclusions and Future Work

  14. Basic Denoising in Wavelet Domain: Principle • A few large coefficients in the lower bands are enough to approximate the main features of the clean signal. Hence, by setting the smaller coefficients to zero, we can eliminate noise nearly optimally while preserving the important information of the clean signal.

  15. Basic Denoising in Wavelet Domain: Notation • Clean signal • Noise signal • Noisy signal (each defined in both the time domain and the wavelet domain)

  16. Basic Denoising in Wavelet Domain: Algorithm • Frame the input noisy signal • Apply the forward WT to a frame • Threshold the (detail) wavelet coefficients • Apply the inverse WT • Keep the center part of the frame • Repeat for all frames
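
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the presenter's code: it uses a Haar DWT, soft thresholding with a fixed threshold, frames that overlap by half, and zero padding at the signal edges. The frame length, level count, and function names are all hypothetical choices.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(x, levels):
    """Forward Haar DWT; returns [A_L, D_L, ..., D_1]."""
    approx, details = list(x), []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.insert(0, [(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return [approx] + details


def haar_idwt(coeffs):
    """Inverse Haar DWT for the layout produced by haar_dwt."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        merged = []
        for a, d in zip(approx, detail):
            merged += [(a + d) / SQRT2, (a - d) / SQRT2]
        approx = merged
    return approx


def soft(c, t):
    """Soft thresholding: shrink the magnitude by t, zero below t."""
    return math.copysign(max(abs(c) - t, 0.0), c)


def denoise(noisy, frame_len=8, levels=2, threshold=0.5):
    """Frame -> forward WT -> threshold details -> inverse WT -> keep center."""
    if frame_len % 4 or frame_len % (2 ** levels):
        raise ValueError("frame_len must be divisible by 4 and by 2**levels")
    hop, margin = frame_len // 2, frame_len // 4
    # Zero padding so that every input sample lands in the center of a frame.
    padded = [0.0] * margin + list(noisy) + [0.0] * frame_len
    out = []
    for start in range(0, len(noisy), hop):
        frame = padded[start:start + frame_len]
        coeffs = haar_dwt(frame, levels)
        # Only the detail bands are thresholded; the approximation is kept.
        kept = [coeffs[0]] + [[soft(c, threshold) for c in band]
                              for band in coeffs[1:]]
        rebuilt = haar_idwt(kept)
        out.extend(rebuilt[margin:margin + hop])  # center half of the frame
    return out[:len(noisy)]


samples = [float(i % 5) for i in range(20)]
unchanged = denoise(samples, threshold=0.0)
smoothed = denoise(samples, threshold=1.0)
```

Keeping only the center of each frame discards the samples most affected by frame-boundary effects, at the cost of processing every sample twice.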

  17. Basic Denoising in Wavelet Domain: Threshold Value • VisuShrink [DonJ94b]: the threshold is computed from the estimated noise variance and the frame length • For Gaussian white noise, the noise level is estimated from the MAD • Another definition is used in wden.m • MAD: Median Absolute Deviation
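
The slide's formulas are not preserved in this transcript. The sketch below uses the widely published forms, which may differ in detail from what the slide showed: the universal threshold T = sigma * sqrt(2 ln N) for frame length N, and the robust estimate sigma = median(|d|) / 0.6745 computed from detail coefficients d under a zero-median Gaussian assumption. The function names are hypothetical.

```python
import math
import statistics


def mad_sigma(detail):
    """Robust noise standard deviation from detail coefficients.

    Assumes zero-median Gaussian noise, for which median(|d|) = 0.6745 * sigma.
    """
    return statistics.median(abs(c) for c in detail) / 0.6745


def universal_threshold(sigma, n):
    """Universal (VisuShrink-style) threshold for a frame of n samples."""
    return sigma * math.sqrt(2.0 * math.log(n))


detail = [-2.0, -1.0, 0.0, 1.0, 2.0]
sigma = mad_sigma(detail)
threshold = universal_threshold(sigma, 1024)
```

The median is used instead of the sample variance because a few large speech coefficients in the detail band would otherwise inflate the noise estimate.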

  18. Basic Denoising in Wavelet Domain: Threshold Value • Threshold in the WPT case • For correlated noise: use a level-dependent threshold (SureShrink [DonJ94b])

  19. Basic Denoising in Wavelet Domain: How to Threshold • Hard thresholding • Soft thresholding • Comparison: hard thresholding introduces a discontinuity at the threshold; soft thresholding alters the values of the retained coefficients
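
The two rules can be written in a few lines. This is a minimal sketch using the standard definitions; the function names are hypothetical.

```python
import math


def hard_threshold(c, t):
    """Keep the coefficient unchanged if |c| > t, otherwise set it to zero."""
    return c if abs(c) > t else 0.0


def soft_threshold(c, t):
    """Zero the coefficient if |c| <= t, otherwise shrink its magnitude by t."""
    return math.copysign(max(abs(c) - t, 0.0), c)


coeffs = [-3.0, -1.2, -0.4, 0.4, 1.2, 3.0]
hard = [hard_threshold(c, 1.0) for c in coeffs]
soft = [soft_threshold(c, 1.0) for c in coeffs]
```

At a coefficient just above the threshold, the hard rule jumps from 0 to roughly t (the discontinuity), while the soft rule stays continuous but reduces every surviving coefficient by t (the alteration of values).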

  20. Literature Survey • Motivation and Goals • Wavelet Transform - Overview • Basic Denoising in Wavelet Domain • Literature Survey (current section) • Implementation and Results • Conclusions and Future Work

  21. Literature Survey [SeoB97], Novelty • Title: Speech enhancement with reduction of noise components in the wavelet domain • Novelty: • Semisoft thresholding [GaoB95] • Classification of unvoiced regions in the wavelet domain • Different thresholding for unvoiced regions

  22. Literature Survey [SeoB97], Thresholding • Semisoft Thresholding [GaoB95]: • Less sensitivity to small perturbations in the data • Smaller bias • Figure panels: Hard, Soft, Semisoft (like [DonJ94b])
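
The exact expression used in [SeoB97] is not preserved in this transcript. A minimal sketch, assuming the commonly published two-threshold form of semisoft (also called firm) shrinkage: coefficients below the lower threshold t1 are zeroed, coefficients above the upper threshold t2 are kept, and those in between are scaled linearly. The function name is hypothetical.

```python
import math


def semisoft_threshold(c, t1, t2):
    """Semisoft shrinkage with lower threshold t1 and upper threshold t2."""
    if not 0.0 <= t1 < t2:
        raise ValueError("need 0 <= t1 < t2")
    mag = abs(c)
    if mag <= t1:
        return 0.0                     # behaves like hard/soft below t1
    if mag <= t2:
        # Linear ramp from 0 at t1 up to t2 at t2, so the rule is continuous.
        return math.copysign(t2 * (mag - t1) / (t2 - t1), c)
    return c                           # large coefficients are left unbiased


values = [semisoft_threshold(c, 1.0, 2.0) for c in (0.5, 1.5, -1.5, 2.0, 3.0)]
```

The rule is continuous like soft thresholding, which reduces sensitivity to small perturbations, and it leaves large coefficients untouched like hard thresholding, which reduces bias.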

  23. Literature Survey [SeoB97], Unvoiced Regions • Separation of unvoiced region • Use DWT for finding • Calculate average energy of each subband • Current speech segment is unvoiced if:

  24. Literature Survey [SeoB97], Implementations • If unvoiced, then threshold only the highest frequency band • Implementation results: • Additive white Gaussian noise • SNR: -10 dB to 10 dB • Test sentence: “Should we chase those cowboys?”

  25. Literature Survey [SooKY97], Novelty • Title: Wavelet for speech denoising • Novelty: • Evaluation of different wavelets and different orders (db1-10, coif1-5, sym2-8, bior1.3-6.8) • Spectral Subtraction in WD • Wiener Filtering in WD, using two methods for estimating the a priori SNR: • Maximum Likelihood approach • Decision Directed approach

  26. Literature Survey [SooKY97], Thresholding 1 • Use the DWT to compute L levels of decomposition • 1. Spectral Subtraction (SS) in WD: if then Use similar scheme for Denoised value else Denoised value • The expected value of the noise magnitude can be estimated from silence frames
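
The if/else rule on this slide is not preserved, so the sketch below is only one plausible reading: the noise magnitude is estimated as the mean absolute coefficient over silence frames, that estimate is subtracted from each coefficient's magnitude, and coefficients that fall below it are set to zero. The zero in the else-branch and all function names are assumptions.

```python
import math


def noise_magnitude(silence_frames):
    """Estimate E[|N|] as the mean |coefficient| over silence frames."""
    values = [abs(c) for frame in silence_frames for c in frame]
    return sum(values) / len(values)


def magnitude_subtract(coeffs, noise_mag):
    """Subtract the noise magnitude from each coefficient, keeping its sign."""
    out = []
    for y in coeffs:
        if abs(y) > noise_mag:
            out.append(math.copysign(abs(y) - noise_mag, y))
        else:
            out.append(0.0)  # assumption: the slide's else-branch is not preserved
    return out


silence = [[0.5, -1.5], [1.0, -1.0]]   # hypothetical subband coefficients
estimate = noise_magnitude(silence)
cleaned = magnitude_subtract([3.0, -2.5, 0.4], estimate)
```

In practice the estimate would be computed separately for each decomposition level, since the noise magnitude generally differs between subbands.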

  27. Literature Survey [SooKY97], Thresholding 2 • 2. Wiener Filtering in WD, where the filter depends on the a priori SNR • Estimating the a priori SNR: a. Maximum Likelihood b. Decision Directed, with a weighting factor in [0, 1], typically 0.9
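
The slide's expressions are not preserved. A minimal sketch, assuming the textbook forms: Wiener gain xi / (1 + xi) for a priori SNR xi, the maximum-likelihood estimate max(|Y|^2 / noise_var - 1, 0), and a decision-directed estimate that mixes the previous frame's cleaned coefficient with the current maximum-likelihood estimate using the weighting factor alpha. The function names and the frame layout are hypothetical.

```python
def ml_snr(y, noise_var):
    """Maximum-likelihood a priori SNR from a single noisy coefficient."""
    return max(y * y / noise_var - 1.0, 0.0)


def wiener_gain(xi):
    """Wiener gain for a priori SNR xi."""
    return xi / (1.0 + xi)


def wiener_decision_directed(frames, noise_var, alpha=0.9):
    """Filter successive frames of coefficients from one subband.

    frames[m][k] is coefficient k of frame m. The first frame falls back to
    the maximum-likelihood estimate because no previous output exists yet.
    """
    previous = None
    cleaned_frames = []
    for frame in frames:
        cleaned = []
        for k, y in enumerate(frame):
            xi = ml_snr(y, noise_var)
            if previous is not None:
                xi = alpha * previous[k] ** 2 / noise_var + (1.0 - alpha) * xi
            cleaned.append(wiener_gain(xi) * y)
        cleaned_frames.append(cleaned)
        previous = cleaned
    return cleaned_frames


noisy_frames = [[2.0, 0.5], [2.0, 0.5]]
dd = wiener_decision_directed(noisy_frames, noise_var=1.0, alpha=0.9)
ml = wiener_decision_directed(noisy_frames, noise_var=1.0, alpha=0.0)
```

With alpha = 0 the decision-directed estimate reduces to the maximum-likelihood one; values near 1 smooth the SNR estimate across frames.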

  28. Literature Survey [SooKY97], Implementations • Implementation results • White Gaussian noise • Both male and female voices • 10 levels of decomposition

  29. Literature Survey [SooKY97], Conclusions • The methods are not particularly sensitive to the wavelet type, with the exception of Bior3.1 • Wiener-filtered speech has better SNR values than magnitude subtraction • For Wiener filtering, the decision directed approach gives better SNR values than the maximum likelihood approach

  30. Literature Survey [KimYK01], Novelty • Title: Speech enhancement using adaptive wavelet shrinkage • Novelty: • Adaptive threshold value • The threshold value depends on the variance of the estimated clean signal (BayesShrink) • Classification of unvoiced regions using entropy • Applies a smaller threshold to unvoiced regions and calls the method “Adaptive BayesShrink”

  31. Literature Survey [KimYK01], Threshold Value • BayesShrink: an adaptive threshold value that minimizes the Bayesian risk • The estimated threshold value follows [ChaYV00a]
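
The slide's expressions are not preserved. The sketch below assumes the commonly published BayesShrink form: threshold = sigma_noise^2 / sigma_x, where sigma_x^2 = max(mean(y^2) - sigma_noise^2, 0) is the estimated clean-signal variance of the subband. When that variance estimate is zero, the threshold is set to the largest magnitude so that the whole subband is removed. The function name is hypothetical.

```python
import math


def bayes_shrink_threshold(coeffs, sigma_noise):
    """Subband-adaptive threshold from noisy coefficients and noise std."""
    var_noisy = sum(c * c for c in coeffs) / len(coeffs)
    sigma_x = math.sqrt(max(var_noisy - sigma_noise ** 2, 0.0))
    if sigma_x == 0.0:
        # Noise dominates the subband: threshold everything away.
        return max(abs(c) for c in coeffs)
    return sigma_noise ** 2 / sigma_x


strong = bayes_shrink_threshold([3.0, -3.0, 3.0, -3.0], sigma_noise=1.0)
weak = bayes_shrink_threshold([0.5, -0.5], sigma_noise=1.0)
```

A subband with a strong signal gets a small threshold and is barely altered, while a noise-dominated subband gets a large threshold, which is the adaptivity the slide refers to.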

  32. Literature Survey [KimYK01], Unvoiced Regions • Current region is unvoiced if • An unvoiced region has smaller energy, so a smaller threshold is applied; the parameters are selected by simulation • The paper does not state which type of entropy is used; it could be as:

  33. Literature Survey [KimYK01], Implementations • Implementation results: • Additive white Gaussian noise • SNR: 0 dB, 10 dB, and 20 dB

  34. Literature Survey [ChaKYK02], Novelty • Title: Speech enhancement for non-stationary noise environment by adaptive wavelet packet • Novelty: • Node-dependent thresholding for adaptation to colored or non-stationary noise • Noise estimation based on spectral entropy rather than MAD • Modified hard thresholding to alleviate time-frequency discontinuities

  35. Literature Survey [ChaKYK02], Threshold Value • Create the WPT and find the leaf nodes of the best basis tree • Node-dependent thresholding • Noise estimation could be like: or the following proposed method

  36. Literature Survey [ChaKYK02], Noise Estimation • Estimate the spectral pdf of the wavelet packet coefficients using a B-bin histogram • Calculate the normalized spectral entropy for each node in the adapted wavelet packet tree
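
A minimal sketch of the two steps, under assumptions not stated on the slide: the histogram is taken over coefficient magnitudes with B equal-width bins between zero and the largest magnitude, and the entropy is normalized by log(B) so that it lies in [0, 1]. The function name is hypothetical.

```python
import math


def normalized_entropy(coeffs, bins=8):
    """Normalized entropy of the magnitude histogram of one node."""
    mags = [abs(c) for c in coeffs]
    top = max(mags)
    if top == 0.0:
        return 0.0
    counts = [0] * bins
    for m in mags:
        # Map [0, top] onto bin indices 0..bins-1 (the maximum goes in the last bin).
        counts[min(int(m / top * bins), bins - 1)] += 1
    total = len(mags)
    entropy = -sum((n / total) * math.log(n / total) for n in counts if n)
    return entropy / math.log(bins)


spread_out = normalized_entropy([0.5, 1.5, 2.5, 3.5], bins=4)
concentrated = normalized_entropy([1.0, 1.0, 1.0, 1.0], bins=4)
```

A node whose magnitudes are spread evenly over the bins scores close to 1, while a node whose magnitudes fall into a single bin scores 0.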

  37. Literature Survey [ChaKYK02], Noise Estimation (cont.) • Estimate the spectral magnitude intensity by histogram • Define an auxiliary threshold • Estimate the standard deviation of the noise • Formula annotations: number of coefficients with magnitude equal to or greater than the bin’s amplitude; node_length; bins of coefficient magnitudes

  38. Literature Survey [ChaKYK02], Noise Estimation (cont.) • Greater disorder of wavelet coefficients (less voiced, more unvoiced) → More uniform spectral pdf → Bigger values for entropy (range 0 to 1) → Bigger value for alpha → Smaller number of bins bigger than alpha → Smaller estimate of the standard deviation of the noise

  39. Literature Survey [ChaKYK02], Thresholding • Modified Hard Thresholding

  40. Literature Survey [ChaKYK02], Implementations • Implementation results: • Pink noise, SNR: -5 dB to 15 dB • Subjective tests favored the level-dependent thresholding, but not every time. Nevertheless, the proposed method has better spectral performance (spectrogram).

  41. Literature Survey [ChaKYK02], Implementations (cont.) • SNR (dB) test for various noisy speech: “We like bleu cheese but Victor prefers swiss cheese.” (SNR = 10 dB)

  42. Literature Survey … • To be continued… Thank You.

  43. References (1 of 2)

  44. References (2 of 2)

  45. Wavelet-Based Speech Enhancement Course Project Presentation 1 Thank You FIND OUT MORE AT... 1. http://ce.sharif.edu/~m_amiri/ 2. http://www.aictct.com/dml/
