Speech Enhancement with Binaural Cues Derived from a Priori Codebook

Reporter: Nan Chen, Beijing University of Technology.


Presentation Transcript


  1. Speech Enhancement with Binaural Cues Derived from a Priori Codebook. Reporter: Nan Chen, Beijing University of Technology. http://www.bjut.edu.cn/sci/voice/index.htm

  2. Contents: 1 Introduction; 2 The Proposed Method; 3 Results and Conclusions.

  3. 1 Introduction

  4. Introduction. Noise types: street, car, babble, office.

  5. Introduction. The traditional method of speech enhancement.

  6. Introduction: Binaural Cue Coding (BCC) framework.
     Purpose: recover the perception of the original input signals.
     BCC analysis: extract the side information of the input signals.
     BCC synthesis: recover the input signals from the side information and the mono signal.
     Figure 1: Block diagram of analysis and synthesis for BCC.

  7. Introduction. Once the discrete Fourier transform (DFT) coefficients of the mono signal are known, the DFT coefficients of each output channel, S_{c,k}, can be calculated from Eqs. (1) to (3). These equations use the inter-channel level difference (ICLD) between channel 1 and channel c for the nth sub-band, together with a random variable that is controlled by the inter-channel coherence (ICC).
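
The level-difference part of BCC synthesis can be illustrated with a short worked form. This is only a sketch under stated assumptions, not the slide's Eqs. (1) to (3): it assumes two output channels, an ICLD $\Delta L(n)$ expressed in dB as the power ratio of channel 2 to channel 1, a power-preserving normalization of the gains, and it leaves out the ICC-controlled random variable. The symbols $a_c(n)$, $P_c(n)$ and $S_k$ (the mono DFT coefficient) are introduced here for illustration.

```latex
\begin{align}
\Delta L(n) &= 10\log_{10}\frac{P_2(n)}{P_1(n)}, \qquad a_1(n)^2 + a_2(n)^2 = 1, \\
a_1(n) &= \frac{1}{\sqrt{1 + 10^{\Delta L(n)/10}}}, \qquad a_2(n) = 10^{\Delta L(n)/20}\, a_1(n), \\
\hat{S}_{c,k} &= a_c(n)\, S_k \quad \text{for every bin } k \text{ in sub-band } n .
\end{align}
```

With these gains, $a_2(n)^2 / a_1(n)^2 = 10^{\Delta L(n)/10}$, so the synthesized channels reproduce the transmitted level difference while their powers sum to the mono power.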

  8. Introduction. BCC recovers the perception of the original input signals, whereas speech enhancement separates the clean signal from the noisy signal. The BCC principle is therefore introduced to estimate the clean signal: the noisy speech is enhanced by the BCC principle, where channel 1 is assumed to be the clean speech and channel 2 is regarded as the noise. [Diagram labels: noisy speech, clean speech, noise.]

  9. 2 The Proposed Method

  10. The Proposed Method: side information.
      The clean cue:
        speech and noise level difference (SNLD)
        speech and noise correlation (SNC)
      The pre-enhanced cue:
        pre-enhanced speech and noise level difference (PNLD)
        pre-enhanced speech and noise correlation (PNC)
        posterior SNR (PSNR)
        speech presence probability (SPP)
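
A level-difference cue and a correlation cue of this kind could be computed per sub-band roughly as follows. This is a minimal sketch, assuming definitions analogous to the ICLD and ICC of BCC (power ratio in dB and normalized cross-correlation); the slides do not give the exact formulas, and the function names, the `band_edges` layout and the `eps` regularizer are hypothetical.

```python
import numpy as np

def level_difference_db(a, b, eps=1e-12):
    """Power ratio of spectrum a to spectrum b in dB (assumed cue definition)."""
    pa = np.sum(np.abs(a) ** 2)
    pb = np.sum(np.abs(b) ** 2)
    return 10.0 * np.log10((pa + eps) / (pb + eps))

def normalized_correlation(a, b, eps=1e-12):
    """Magnitude of the normalized cross-correlation (assumed cue definition)."""
    num = np.abs(np.sum(a * np.conj(b)))
    den = np.sqrt(np.sum(np.abs(a) ** 2) * np.sum(np.abs(b) ** 2)) + eps
    return num / den

def subband_cues(speech_spec, noise_spec, band_edges):
    """Return one (level difference, correlation) pair per sub-band.

    band_edges is a hypothetical list of DFT-bin boundaries, e.g. [0, 2, 4].
    """
    cues = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        s, n = speech_spec[lo:hi], noise_spec[lo:hi]
        cues.append((level_difference_db(s, n), normalized_correlation(s, n)))
    return np.array(cues)
```

Passing clean speech and noise spectra would give SNLD/SNC-style values for training; passing a pre-enhanced speech spectrum and a noise estimate would give PNLD/PNC-style values.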

  11. The Proposed Method. Figure 2: Block diagram of the proposed monaural speech enhancement method.

  12. The Proposed Method: weighted codebook mapping algorithm. Figure 3: Block diagram of the weighted codebook mapping.

  13. The Proposed Method: estimation of the clean cue [Eqs. (4), (5)].
      1) Compare the Euclidean distance (ED) between the online pre-enhanced cue and each trained pre-enhanced cue, and choose the M code-vectors with relatively small ED from the trained codebook.
      2) Calculate the degree of membership ρ of the chosen code-vectors.
      3) Define the weight of each chosen code-vector.
      4) Obtain the online clean cue by weighting the trained clean cues stored in the chosen code-vectors.
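
The four steps above can be sketched as follows. This is a hedged illustration, not the author's exact algorithm: the membership and weight formulas of Eqs. (4) and (5) are not available here, so the sketch assumes an inverse-distance membership normalized to sum to one. The names `estimate_clean_cue`, `codebook_pre`, `codebook_clean` and the default `m=4` are hypothetical.

```python
import numpy as np

def estimate_clean_cue(pre_cue, codebook_pre, codebook_clean, m=4, eps=1e-12):
    """Estimate the online clean cue from a trained codebook.

    pre_cue:        online pre-enhanced cue, shape (D_pre,)
    codebook_pre:   trained pre-enhanced cues, shape (K, D_pre)
    codebook_clean: trained clean cues paired row by row, shape (K, D_clean)
    """
    # Step 1: Euclidean distance to every trained pre-enhanced cue,
    # then keep the m code-vectors with the smallest distance.
    dists = np.linalg.norm(codebook_pre - pre_cue, axis=1)
    nearest = np.argsort(dists)[:m]

    # Step 2: degree of membership (ASSUMPTION: inverse distance).
    rho = 1.0 / (dists[nearest] + eps)

    # Step 3: weights (ASSUMPTION: memberships normalized to sum to one).
    weights = rho / rho.sum()

    # Step 4: weighted combination of the stored clean cues.
    return weights @ codebook_clean[nearest]
```

An exact match in the codebook dominates the weights, so the estimate collapses to the stored clean cue of that code-vector; otherwise the result interpolates between the M nearest entries.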

  14. The Proposed Method: speech enhancement. According to the BCC principle, the speech and noise components are expressed through equations that include a random function with zero mean and constant variance. Finally, the noisy speech is enhanced using the estimated clean cue [Eqs. (6) to (8)].
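
One way to turn an estimated level-difference cue into a spectral gain is sketched below. It is an illustration under stated assumptions rather than the slide's Eqs. (6) to (8): it assumes the SNLD is the speech-to-noise power ratio in dB per sub-band, that speech and noise powers add up to the noisy power, and it ignores the correlation cue and the random function. The names `speech_gain_from_snld`, `enhance_frame` and `band_edges` are hypothetical.

```python
import numpy as np

def speech_gain_from_snld(snld_db):
    """Gain that scales the noisy magnitude to the speech magnitude.

    ASSUMPTION: snld_db = 10*log10(P_speech / P_noise) and
    P_noisy = P_speech + P_noise, so gain**2 = ratio / (1 + ratio).
    """
    ratio = 10.0 ** (np.asarray(snld_db, dtype=float) / 10.0)
    return np.sqrt(ratio / (1.0 + ratio))

def enhance_frame(noisy_spec, snld_db, band_edges):
    """Apply one gain per sub-band to the DFT coefficients of a noisy frame."""
    gains = speech_gain_from_snld(snld_db)
    out = np.array(noisy_spec, dtype=complex)
    for g, lo, hi in zip(gains, band_edges[:-1], band_edges[1:]):
        out[lo:hi] *= g
    return out
```

Under these assumptions the gain has the same form as the square root of a Wiener gain: a sub-band with 0 dB SNLD is attenuated by about 3 dB, while a sub-band dominated by speech passes almost unchanged.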

  15. 3 Results and Conclusions

  16. Results: segmental SNR (SSNR).

  17. Results: perceptual evaluation of speech quality (PESQ).

  18. Results: log-spectral distance (LSD).

  19. Results: 5 dB babble noise. Panels: clean, Ref. A, proposed, Ref. B.

  20. Results: 10 dB babble noise. Panels: clean, Ref. A, Ref. B, proposed.

  21. Conclusions
      • We enhance the noisy speech by modeling the spectral detail, which is why the method can reduce the noise between harmonics.
      • Noise classification is no longer needed, because the binaural cues introduced as a priori information are not correlated with the type of noise.

  22. Thank You!
