
Bayesian Enhancement of Speech Signals

Bayesian Enhancement of Speech Signals, by Jeremy Reed. Outline: speech model, Bayes application, MCMC algorithm, results. Speech model: predict the current speech sample from the p previous samples (AR process); justified by physics (lossless acoustic tubes, time for the vocal tract to change shape).


Presentation Transcript


  1. Bayesian Enhancement of Speech Signals Jeremy Reed

  2. Outline • Speech Model • Bayes application • MCMC algorithm • Results

  3. Speech Model • Predict the current speech sample from the p previous samples (AR process) • Justified by physics: the vocal tract is modeled as lossless acoustic tubes, and it takes time for the vocal tract to change shape • Use a window of T samples for short-time analysis
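The AR prediction on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the function names, the AR(2) coefficients, and the noise level are all hypothetical values chosen only so the example runs.

```python
import random

def ar_predict(history, a):
    """Predict the next sample from the last p samples.

    history[-1] is the most recent sample and a[0] multiplies it,
    so the prediction is sum_i a[i] * x[t-1-i].
    """
    p = len(a)
    return sum(a[i] * history[-1 - i] for i in range(p))

def simulate_ar(a, sigma_e, n, seed=0):
    """Generate n samples of an AR(p) process driven by Gaussian excitation."""
    rng = random.Random(seed)
    p = len(a)
    x = [0.0] * p  # zero initial conditions (an assumption)
    for _ in range(n):
        x.append(ar_predict(x, a) + rng.gauss(0.0, sigma_e))
    return x[p:]

# Hypothetical, stable AR(2) coefficients used only for illustration.
a = [1.5, -0.9]
x = simulate_ar(a, sigma_e=0.1, n=200)

# One-step prediction of the last sample from the p samples before it.
pred = ar_predict(x[:-1], a)
residual = x[-1] - pred  # this is the excitation e_t for that sample
```

With the true coefficients, the residual is exactly the excitation sample, which is why the model treats e as the only unpredictable part of the signal.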

  4. Speech Model • x1 are the corrupted or “bad” samples • Prior for the excitation: e ~ N(0, σ_e²) • Prior on the parameters: p(a, σ_e²) ~ IG(σ_e²; α_e, β_e) • α_e, β_e are chosen so that the prior is broad enough to incorporate a (approaching the Jeffreys prior) • The AR coefficients are normal, with the ML estimate as mean and a variance related to the error and the samples

  5. Speech Model • v_t is the channel noise • v_t ~ N(0, σ_v²) • Inverse Gamma prior on σ_v² • A different distribution can be used if prior knowledge of the channel’s characteristics is available

  6. Bayesian Speech Enhancement • x is the clean speech sequence • y is x plus additive noise, v • θ is a vector containing the parameters of the speech and noise
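Written out, the model described on the slides so far can be summarized as follows. The composition of θ as (a, σ_e², σ_v²) is an assumption inferred from the parameters the slides name; the factorization of the posterior is just Bayes' rule applied to the additive-noise model.

```latex
\begin{align}
  x_t &= \sum_{i=1}^{p} a_i \, x_{t-i} + e_t, & e_t &\sim \mathcal{N}(0, \sigma_e^2) \\
  y_t &= x_t + v_t, & v_t &\sim \mathcal{N}(0, \sigma_v^2) \\
  p(x, \theta \mid y) &\propto p(y \mid x, \theta)\, p(x \mid \theta)\, p(\theta),
  & \theta &= (a, \sigma_e^2, \sigma_v^2)
\end{align}
```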

  7. Algorithm • Window an audio segment of T samples, overlapping successive windows by p samples • Assign initial values to a, σ_v², and σ_e² by using values from the last p samples of previous windows • For the first window, inferences for these parameters are drawn from p(x, θ | y)
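The windowing step can be sketched as below. This is an illustrative helper, not the presenter's code: the name `split_windows` is hypothetical, and dropping a trailing partial window is an assumption the slide does not address.

```python
def split_windows(signal, T, p):
    """Split a signal into windows of T samples that overlap by p samples.

    Successive windows start T - p samples apart, so the last p samples
    of one window are the first p samples of the next. A trailing partial
    window is dropped (an assumption made for this sketch).
    """
    if not 0 <= p < T:
        raise ValueError("overlap p must satisfy 0 <= p < T")
    hop = T - p
    windows = []
    start = 0
    while start + T <= len(signal):
        windows.append(signal[start:start + T])
        start += hop
    return windows

# Toy signal: sample values equal their indices, to make the overlap visible.
signal = list(range(20))
windows = split_windows(signal, T=8, p=2)
```

The shared p samples are what allow the AR model of order p to be initialized in each new window from the end of the previous one.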

  8. Algorithm • Perform Gibbs sampling for unknown parameters:

  9. Algorithm • R_v is the covariance matrix for the corrupted samples and is assumed to be diag(σ_v²)
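The conditional distributions used on the Gibbs sampling slides are not preserved in this transcript, so the following is only a deliberately simplified sketch of the idea, not the presenter's sampler. It assumes an AR(1) model, a clean (noise-free) signal, a flat prior on a, and an IG(α_e, β_e) prior on σ_e²; it does not sample the corrupted samples or σ_v². All names and numeric values are hypothetical.

```python
import random

def gibbs_ar1(x, n_iter=2000, burn_in=500, alpha_e=1e-3, beta_e=1e-3, seed=0):
    """Gibbs sampler for (a, sigma_e^2) of an AR(1) model x_t = a x_{t-1} + e_t."""
    rng = random.Random(seed)
    prev, curr = x[:-1], x[1:]
    n = len(curr)
    sxx = sum(u * u for u in prev)
    sxy = sum(u * w for u, w in zip(prev, curr))
    syy = sum(w * w for w in curr)
    a_ml = sxy / sxx  # maximum-likelihood (least-squares) estimate of a
    sigma2 = 1.0      # arbitrary starting value
    a_draws, s2_draws = [], []
    for it in range(n_iter):
        # a | sigma2, x is normal with ML mean and variance sigma2 / sxx.
        a = rng.gauss(a_ml, (sigma2 / sxx) ** 0.5)
        # Sum of squared excitation residuals for the current a.
        sse = syy - 2.0 * a * sxy + a * a * sxx
        # sigma2 | a, x is inverse gamma; draw it as 1 / Gamma(shape, rate).
        shape = alpha_e + n / 2.0
        rate = beta_e + sse / 2.0
        sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        if it >= burn_in:
            a_draws.append(a)
            s2_draws.append(sigma2)
    return sum(a_draws) / len(a_draws), sum(s2_draws) / len(s2_draws)

# Synthetic test signal with hypothetical true values a = 0.8, sigma_e = 0.5.
rng = random.Random(1)
x = [0.0]
for _ in range(1000):
    x.append(0.8 * x[-1] + rng.gauss(0.0, 0.5))

a_hat, s2_hat = gibbs_ar1(x)
```

In the full problem the sampler would also draw the corrupted samples and σ_v² from their own conditionals in each sweep; the alternating structure of the loop stays the same.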

  10. Results – 440 Hz Sine Wave

  11. Results - Speech
