
Speech Processing

Speech Processing. Homomorphic Signal Processing. Outline. Principles of Homomorphic Signal Processing Details of Homomorphic Processing Variants of Homomorphic Processing Investigation of Homomorphic systems to speech analysis and synthesis. Principles of Homomorphic Processing.


Presentation Transcript


  1. Speech Processing Homomorphic Signal Processing

  2. Outline • Principles of Homomorphic Signal Processing • Details of Homomorphic Processing • Variants of Homomorphic Processing • Investigation of Homomorphic systems to speech analysis and synthesis Veton Këpuska

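  3. Principles of Homomorphic Processing • Superposition Property of Linear Systems: L(a1·x1[n] + a2·x2[n]) = a1·L(x1[n]) + a2·L(x2[n]) • [Figure: two equivalent block diagrams, one scaling and summing x1[n] and x2[n] before L, the other applying L to each input and then scaling and summing]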

  4. Principles of Homomorphic Processing • Example 6.1: • If signals fall in non-overlapping frequency bands then they are separable by linear filtering. • x[n]=x1[n]+x2[n] • X1(ω)=ℱ{x1[n]}, with X1(ω) confined to ω∈[0,π/2] • X2(ω)=ℱ{x2[n]}, with X2(ω) confined to ω∈[π/2,π] • Filter: H(ω)=0 for ω∈[0,π/2], H(ω)=1 for ω∈[π/2,π] • y[n] = h[n]*(x1[n]+x2[n]) = h[n]*x1[n] + h[n]*x2[n] = h[n]*x2[n] = x2[n]
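Example 6.1 can be tried numerically. The following is a minimal sketch, assuming two cosines at 0.2π and 0.8π and a frame length of 40 samples (values chosen only so that both tones fall exactly on FFT bins); the ideal filter is applied as a mask in the DFT domain, which corresponds to circular rather than linear convolution.

```python
import numpy as np

N = 40                                   # frame length chosen so both tones fall on FFT bins
n = np.arange(N)
x1 = np.cos(0.2 * np.pi * n)             # component in the low band [0, pi/2]
x2 = np.cos(0.8 * np.pi * n)             # component in the high band [pi/2, pi]
x = x1 + x2

# Ideal highpass response H(w): 0 below pi/2, 1 from pi/2 to pi (for both signs of w).
w = 2 * np.pi * np.fft.fftfreq(N)        # bin frequencies in radians/sample
H = (np.abs(w) >= np.pi / 2).astype(float)

# Filtering is linear, so H acts on x1 and x2 independently and only x2 survives.
y = np.fft.ifft(H * np.fft.fft(x)).real
```

Because the two components occupy disjoint bands, `y` matches `x2` up to rounding error.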

  5. Principles of Homomorphic Processing • Generalized Superposition • Concept that supports separation of nonlinearly combined signals. • Leads to the notion of Generalized Linear Filtering. • Properties: • H(x1[n]□x2[n])=H(x1[n])○H(x2[n]) • H(c:x[n])=c◈H(x[n]) • □ is the input rule and ○ the output rule for combining signals; : and ◈ are the corresponding rules for combining a signal with a scalar. • Systems that satisfy these two properties are referred to as homomorphic systems and are said to satisfy a generalized principle of superposition. • [Figure: system H() mapping x[n] to y[n], with input rules □ and : and output rules ○ and ◈]

  6. Principles of Homomorphic Processing • The importance of homomorphic systems for speech processing lies in their ability to transform nonlinearly combined signals into additively combined signals, so that linear filtering can be performed on them. • A homomorphic system H can be expressed as a cascade of three homomorphic sub-systems, referred to as the canonic representation: • [Figure: x[n] → D□ (I: □ to +) → L (II: + to +) → D○⁻¹ (III: + to ○) → y[n]]

  7. Canonic Representation of a Homomorphic System • I. The characteristic system D□: transforms □ into add “+”. • II. The linear system L: transforms “add” into “add”. • III. The inverse system D○⁻¹: transforms “add” into ○.

  8. Homomorphic Systems • Let the goal be removal of an undesired component of the signal (e.g., noise).

  9. Multiplicative Homomorphic Systems • Consider the multiplicative homomorphic system M[·], mapping x[n] to y[n], depicted below: • Use D□ to convert MULT into ADD. • Use D○⁻¹ to convert ADD into MULT. • Which rule (operation) transforms MULT into ADD? • [Figure: x[n] → D● (I: ● to +) → L (II: + to +) → D●⁻¹ (III: + to ●) → y[n]]

  10. Multiplicative Homomorphic Systems • If • x[n]=x1[n]●x2[n], and • x1[n]>0 & x2[n]>0 for all n • Then • log(x1[n]●x2[n])=log(x1[n])+log(x2[n]) • However, x[n] may not always be positive. • Generalization to complex signals: • x[n]=|x[n]|·e^{j·arg(x[n])}, which requires the definition of a complex log operator.
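For strictly positive signals the logarithm acts as the characteristic system. The sketch below uses made-up sample values and an arbitrary log-domain gain of 0.5 as the linear system; neither comes from the slides.

```python
import numpy as np

x1 = np.array([1.0, 2.0, 4.0, 8.0])     # illustrative positive sequences
x2 = np.array([0.5, 0.25, 3.0, 1.5])
x = x1 * x2                             # x[n] = x1[n] ● x2[n]

x_log = np.log(x)                       # characteristic system: ● becomes +
sum_of_logs = np.log(x1) + np.log(x2)

# A linear system in the log domain (here a gain of 0.5, an arbitrary choice),
# followed by exp, yields a product of the separately processed components.
gain = 0.5
y = np.exp(gain * x_log)
y_components = np.exp(gain * np.log(x1)) * np.exp(gain * np.log(x2))
```

With this particular gain the overall system is a square root, which distributes over the product as the generalized superposition property requires.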

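  11. Multiplicative Homomorphic Systems • An implementation of the multiplicative homomorphic system: • Definition: • Complex log: log(x[n]) = log|x[n]| + j·arg(x[n]) • Complex exp (inverse operation): exp(log|x[n]| + j·arg(x[n])) = |x[n]|·e^{j·arg(x[n])} = x[n] • [Figure: x[n] → Complex log (I: ● to +) → Linear System (II: + to +) → Complex exp (III: + to ●) → y[n]]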

  12. Homomorphic Systems for Convolution • Consider the homomorphic system for convolution C[·], mapping x[n] to y[n], depicted below: • Use D□ to convert “*” into ADD. • Use D○⁻¹ to convert ADD into “*”. • How to transform “*” into ADD? • [Figure: x[n] → D* (I: * to +) → L (II: + to +) → D*⁻¹ (III: + to *) → y[n]]

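  13. Homomorphic Systems for Convolution • Let x[n]=x1[n]*x2[n] • I. Characteristic system D*: x[n] → Z[·] → log[·] → Z⁻¹[·]. Z[·] maps * to ●, log[·] maps ● to +, and Z⁻¹[·] maps + to +, returning a “time” signal. • III. Inverse operation D*⁻¹: Z[·] → exp[·] → Z⁻¹[·] → y[n]. Z[·] maps + to +, exp[·] maps + to ●, and Z⁻¹[·] maps ● to *, returning a “time” signal.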

  14. Homomorphic Systems for Convolution • For x[n]=x1[n]*x2[n]: • X(z)=X1(z)X2(z) • Log(X(z))=Log(X1(z)X2(z))=Log(X1(z))+Log(X2(z)), where Log is the complex logarithm. • This operation requires special handling because: • The real logarithm requires X(z) > 0, which does not hold in general. • For complex X(z) the phase is not uniquely defined (i.e., only up to a multiple of 2π). • X(z) has to be defined on the unit circle (e.g., Z transform of a stable sequence). • In practice operate on the unit circle z=e^{jω}, i.e., with the Fourier Transform: X(ω) = Σ_n x[n]·e^{−jωn}

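  15. Homomorphic Systems for Convolution • Two cases are possible in computing the cepstrum: • Complex Cepstrum (CC): x̂[n] = (1/2π) ∫_{−π}^{π} log[X(ω)]·e^{jωn} dω, with log[X(ω)] = log|X(ω)| + j∠X(ω) • Real Cepstrum (RC): c[n] = (1/2π) ∫_{−π}^{π} log|X(ω)|·e^{jωn} dω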

  16. Homomorphic Systems for Convolution • Example 6.3: Consider a sequence x[n] consisting of a system impulse response h[n] convolved with an impulse train p[n]: x[n]=h[n]*p[n] • Goal is to estimate h[n]. • First form the canonical representation for convolution. • If D* is such that p̂[n] remains a train of pulses and ĥ[n] falls between the impulses, then separation is possible.

  17. Example 6.3 (cont.) • Let L denote such an operation (i.e., a rectangular window that would separate p̂[n] from ĥ[n]).

  18. Example 6.4 • a, b real and positive ⇒ log(ab) = log(a)+log(b) • a, b real but b<0 ⇒ log(ab) = log(a|b|e^{jπk}) = log(a)+log(|b|)+jπk, k=1,3,5,… • log(ab) is ambiguous. • This example indicates that special consideration must be given to defining the logarithm operator for complex X(z), so that the logarithm of a product is the sum of the logarithms.
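The ambiguity becomes visible as soon as a principal-value logarithm is used. The sketch below is a variation on Example 6.4 with both factors negative (values picked for illustration), using NumPy's `np.log`, which returns the principal value for complex input.

```python
import numpy as np

a = complex(-2.0, 0.0)   # illustrative negative real values
b = complex(-3.0, 0.0)

log_of_product = np.log(a * b)          # principal value: log(6) with zero phase
sum_of_logs = np.log(a) + np.log(b)     # log(2) + j*pi + log(3) + j*pi

# The two differ by j*2*pi, an integer multiple of 2*pi in the phase.
difference = sum_of_logs - log_of_product
```

The real parts agree, while the imaginary parts differ by exactly 2π, which is why the phase has to be defined more carefully than by its principal value.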

  19. Homomorphic Systems for Convolution: Complex Logarithm • Suppose that X(z) is evaluated on the unit circle (z=e^{jω}) • Let x[n]=x1[n]*x2[n] ⇒ X(ω)=X1(ω)X2(ω) • Consider then the complex log of X(ω): log[X(ω)] = log|X(ω)| + j∠X(ω) • Considering that X(ω)=X1(ω)X2(ω), then: log[X(ω)] = log|X1(ω)| + log|X2(ω)| + j[∠X1(ω) + ∠X2(ω)]

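  20. Homomorphic Systems for Convolution: Complex Logarithm • In the previous expression the following was assumed: log|X1(ω)X2(ω)| = log|X1(ω)| + log|X2(ω)| • Also: ∠[X1(ω)X2(ω)] = ∠X1(ω) + ∠X2(ω) • The second expression generally does not hold due to the ambiguity in the definition of phase: ∠X(ω) = PV[∠X(ω)] + 2πk, for any integer k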

  21. Homomorphic Systems for Convolution: Complex Logarithm • Note that: • PV denotes the principal value of the phase, which falls in the interval [−π,π]. • An arbitrary multiple of 2π can be added to the principal phase value. • Thus the additive property generally does not hold. • How to impose uniqueness? • Force continuity of phase: • Select k such that ∠X(ω)=PV[∠X(ω)]+2πk is a continuous function. Figure 6.5 (next slide). • Phase derivative approach: It can be shown that:

  22. Fourier Transform Phase Continuity • [Figure 6.5]

  23. Homomorphic Systems for Convolution • Relationship of the complex cepstrum x̂[n] to the real cepstrum c[n]: • If x[n] is real then: • |X(ω)| is real and even, and thus log[|X(ω)|] is real and even • ∠X(ω) is odd, and hence x̂[n] is real • x̂[n] is referred to as the complex cepstrum. • The even component of the complex cepstrum, c[n] = (x̂[n] + x̂[−n])/2, is referred to as the real cepstrum.

  24. Complex Cepstrum of Speech-Like Sequences • Sequences with Rational z-Transform: • The general form of this class of sequences is given below: X(z) = A·z^{−r} · [∏_{k=1}^{Mi} (1 − a_k z^{−1}) · ∏_{k=1}^{Mo} (1 − b_k z)] / [∏_{k=1}^{Ni} (1 − c_k z^{−1}) · ∏_{k=1}^{No} (1 − d_k z)] • Mi, Ni are the numbers of zeros and poles inside the unit circle. • Mo, No are the numbers of zeros and poles outside the unit circle. • |ak|, |bk|, |ck|, |dk| are all < 1 ⇒ there are no singularities on the unit circle. • A > 0.

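  25. Complex Cepstrum of Speech-Like Sequences • Applying the complex logarithm gives: X̂(z) = log(A) + log(z^{−r}) + Σ_{k=1}^{Mi} log(1 − a_k z^{−1}) + Σ_{k=1}^{Mo} log(1 − b_k z) − Σ_{k=1}^{Ni} log(1 − c_k z^{−1}) − Σ_{k=1}^{No} log(1 − d_k z) • X̂(z) is the z-transform of the sequence x̂[n]. • We want the inverse z-transform x̂[n] to be absolutely summable ⇒ the ROC of X̂(z) must include the unit circle, |z|=1. • This condition is equivalent to having all constituent terms of X̂(z) have ROCs that include the unit circle, |z|=1.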

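  26. Complex Cepstrum of Speech-Like Sequences • In order to obtain the ROC for expressions of the form: • log(1−αz^{−1}) • log(1−βz) they are expressed in a power series expansion: log(1−αz^{−1}) = −Σ_{n=1}^{∞} (α^n/n)·z^{−n} for |z|>|α|, and log(1−βz) = −Σ_{n=1}^{∞} (β^n/n)·z^{n} for |z|<1/|β| • [Figures: z-plane ROC for log(1−αz^{−1}), the region outside the circle of radius |α|; z-plane ROC for log(1−βz), the region inside the circle of radius 1/|β|]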

  27. Complex Cepstrum of Speech-Like Sequences • The ROC of X̂(z) is therefore given by an annulus defined by the poles and zeros of X(z) closest to the unit circle. • [Figure: z-plane ROC for a typical rational X(z)]

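  28. Complex Cepstrum of Speech-Like Sequences • The complex cepstrum associated with a rational X(z) can therefore be expressed as (omitting the shift term z^{−r}): x̂[n] = log(A) for n=0; x̂[n] = −Σ_{k=1}^{Mi} a_k^n/n + Σ_{k=1}^{Ni} c_k^n/n for n>0; x̂[n] = Σ_{k=1}^{Mo} b_k^{−n}/n − Σ_{k=1}^{No} d_k^{−n}/n for n<0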

  29. Example 6.5 • Let: where a, b, c are real and <1. • The ROC of X(z) includes the unit circle so that x[n] is stable. • A delay z^{−r} corresponds to a shift in the sequence. • Thus the complex cepstrum is given by:

  30. Example 6.5 (cont.) • The inverse z-transform of the shift term is given by: • The contribution of the z^{−r} term is significant. • On the unit circle, z^{−r}=e^{−jωr}=1∠(−ωr) contributes a linear ramp to the phase and thus, for a large shift r, dominates the phase representation and gives a large discontinuity at π and −π.

  31. Complex Cepstrum of Speech-Like Sequences • Relation of the complex cepstrum and the real cepstrum for x[n] with a rational z-transform that is minimum phase: • The complex cepstrum of a minimum-phase sequence with a rational z-transform is right-sided (x̂[n]=0 for n<0), so with c[n] = (x̂[n]+x̂[−n])/2: x̂[n] = c[n] for n=0 and x̂[n] = 2c[n] for n>0.
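The minimum-phase case can be checked on the simplest rational example. The sketch below assumes a single zero at a = 0.5 inside the unit circle, so X(z) = 1 − a·z^{−1}, and uses the power series of log(1 − a·z^{−1}), which gives x̂[n] = −a^n/n for n>0 and zero elsewhere. The zero location and the DFT length are illustrative choices.

```python
import numpy as np

a = 0.5                       # hypothetical zero location, |a| < 1 (minimum phase)
N = 1024                      # DFT length; large so cepstral aliasing is negligible

x = np.zeros(N)
x[0], x[1] = 1.0, -a          # x[n] = delta[n] - a*delta[n-1], X(z) = 1 - a z^-1

X = np.fft.fft(x)
# For this X the phase stays within (-pi/2, pi/2), so the principal value is already continuous.
log_X = np.log(np.abs(X)) + 1j * np.angle(X)

x_hat = np.fft.ifft(log_X).real          # complex cepstrum (real for real x[n])
c = np.fft.ifft(np.log(np.abs(X))).real  # real cepstrum

n = np.arange(1, 11)
expected = -(a ** n) / n                 # from the series of log(1 - a z^-1)
```

Here `x_hat[1:11]` follows `expected`, the samples at negative quefrency (stored at the end of the array) are numerically zero, and `x_hat[n]` equals `2 * c[n]` for n>0, consistent with the right-sided property.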

  32. Impulse Train Convolved with Rational z-Transform Sequences • The second class of sequences of interest in the speech context is the train of uniformly spaced unit samples with varying weights, and its interaction with the system: x[n]=h[n]*p[n] • [Figure: p[n] → h[n] → x[n]]

  33. Impulse Train Convolved with Rational z-Transform Sequences • If p[n] is minimum phase and |a_r(z^N)^{−1}|<1, i.e., the zeros are inside the unit circle, log[P(z)] can be expressed as: • Thus p̂[n] is an infinite right-sided sequence of impulses spaced N samples apart. • Note that in general, for non-minimum-phase sequences, the complex cepstrum is two-sided with uniformly spaced impulses.

  34. Example 6.6 • Consider a sequence x[n]=h[n]*p[n] where the z-transform of h[n] is given by: • b, b*, and c, c* are complex conjugate pairs. • Consider p[n] to be a train of periodic pulses, then: • [Figure: p[n] → h[n] → x[n]; z-plane plot with the unit circle and points labeled a, a*, b, b*]

  35. Example 6.6 (cont) • If β∈ℝ and |β|<1 then p[n] is a train of decaying exponentials: • The z-transform of p[n] is given by: • Then, as derived earlier: • [Figure: p[n] versus n, impulses starting at amplitude 1]
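The cepstrum of a decaying impulse train can be checked numerically. The sketch below assumes a decay factor `beta = 0.5` and a spacing `P = 64` (both illustrative, not taken from the slides), so that P(z) = 1/(1 − β·z^{−P}) and the series log P(z) = Σ_{k≥1} (β^k/k)·z^{−kP} predicts cepstral impulses of height β^k/k at n = kP.

```python
import numpy as np

N = 4096                      # DFT length (illustrative)
P = 64                        # impulse spacing in samples (illustrative)
beta = 0.5                    # assumed real decay factor, |beta| < 1

# p[n] = sum_k beta^k * delta[n - k*P]
p = np.zeros(N)
p[::P] = beta ** np.arange(N // P)

Pw = np.fft.fft(p)
# The phase of 1/(1 - beta*e^{-jwP}) stays inside (-pi/2, pi/2), so no unwrapping is needed here.
log_P = np.log(np.abs(Pw)) + 1j * np.angle(Pw)
p_hat = np.fft.ifft(log_P).real

# Expected cepstral impulses at n = P, 2P, 3P with heights beta^k / k.
k = np.arange(1, 4)
expected = beta ** k / k
measured = p_hat[k * P]

# Everything that is not at a multiple of P should be (numerically) zero.
off_grid = np.delete(p_hat, np.arange(0, N, P))
```

The result is a right-sided train of impulses spaced P samples apart, as stated for minimum-phase impulse trains.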

  36. Example 6.6 (cont) • [Figure: panels labeled h[n] and p[n]]

  37. Homomorphic Filtering • In the cepstral domain: • Pseudo-time ⇔ Quefrency • Low Quefrency ⇔ Slowly varying components. • High Quefrency ⇔ Fast varying components. • Removal of unwanted components (i.e., filtering) can be attempted in the cepstral domain (on the signal x̂[n]), in which case filtering is referred to as liftering. • When the complex cepstrum of h[n] resides in a quefrency interval shorter than a pitch period, the two components can be separated from each other.

  38. Homomorphic Filtering • If log[X(ω)] • is viewed as a “time signal” • consisting of low-frequency and high-frequency contributions, • then this signal can be separated with a high-pass/low-pass filter. • One implementation of the low-pass filter: • [Figure: x[n]=h[n]*p[n] → D* → multiplication by the lifter l[n] → D*⁻¹ → y[n]]
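A low-quefrency lifter can be sketched end to end. The example below is an illustration under stated assumptions, not the slides' own implementation: h[n] = 0.6^n (minimum phase), a decaying impulse train with period P = 64 and decay 0.5, a 4096-point DFT, and a rectangular lifter that keeps |n| < P/2. For these signals no linear-phase term has to be removed before taking the logarithm.

```python
import numpy as np

N, P = 4096, 64                      # DFT length and pitch period in samples (illustrative)
n = np.arange(N)

h = 0.6 ** n                         # assumed minimum-phase response: H(z) = 1/(1 - 0.6 z^-1)
p = np.zeros(N)
p[::P] = 0.5 ** np.arange(N // P)    # assumed decaying impulse train with period P
x = np.convolve(h, p)[:N]            # x[n] = h[n] * p[n]

# Characteristic system D*: Fourier transform, complex log, inverse transform.
X = np.fft.fft(x)
log_X = np.log(np.abs(X)) + 1j * np.unwrap(np.angle(X))
x_hat = np.fft.ifft(log_X).real      # complex cepstrum of x[n]

# Low-quefrency lifter l[n]: keep |n| < P/2 (negative quefrencies wrap to the end).
lifter = np.zeros(N)
lifter[: P // 2] = 1.0
lifter[-(P // 2 - 1):] = 1.0
y_hat = lifter * x_hat

# Inverse characteristic system: Fourier transform, complex exp, inverse transform.
y = np.fft.ifft(np.exp(np.fft.fft(y_hat))).real
```

Because the cepstrum of h[n] decays well before the first pitch impulse at quefrency P, the liftered output `y` closely reproduces `h`; swapping the lifter for its complement would instead retain the impulse-train component.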

  39. Homomorphic Filtering • Alternate view of the “liftering” operation: a filtering operation L(ω) applied in the log-spectral domain. • Interchange of the time and frequency domains by viewing the frequency-domain signal log[X(ω)] as a time signal to be filtered ⇒ • “Cepstrum” can be thought of as the spectrum of log[X(ω)] • The time axis of x̂[n] is referred to as “quefrency” • The filter l[n] is referred to as the “lifter”. • [Figure: x[n]=h[n]*p[n] → F → log → X̂(ω) → F⁻¹ → l[n] → F → Ŷ(ω) → exp → F⁻¹ → y[n], with F⁻¹, l[n], F enclosed in dotted lines and equivalent to L(ω)]

  40. Homomorphic Filtering • The three elements in the dotted lines of the previous figure can be replaced by L(ω), which can be viewed as a smoothing function: • [Figure: x[n]=h[n]*p[n] → F → log → X̂(ω) → L(ω) → Ŷ(ω) → exp → F⁻¹ → y[n]]

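  41. Practical Implementation Issues • Use FFT and IFFT for the Fourier Transformations. • X(ω) is computed by: X(k) = Σ_{n=0}^{N−1} x[n]·e^{−j(2π/N)kn} • log[X(ω)] is computed as: X̂(k) = log|X(k)| + j∠X(k) • And for x̂[n] use: x̂N[n] = (1/N) Σ_{k=0}^{N−1} X̂(k)·e^{j(2π/N)kn}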

  42. Practical Implementation Issues • The cepstrum x̂[n] is infinitely long, thus x̂N[n] is an aliased version of x̂[n]. That is: x̂N[n] = Σ_{r=−∞}^{∞} x̂[n + rN]. Thus it is necessary to use as large an N as possible. • The phase component j∠X(k) must be properly unwrapped to ensure phase continuity. The goal is to determine r[k] so that ∠X(k) = PV[∠X(k)] + 2π·r[k] is continuous.
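The effect of the DFT length on cepstral aliasing can be seen with a short sketch. The helper names `complex_cepstrum` and `real_cepstrum` are hypothetical, NumPy's `np.unwrap` stands in for the phase unwrapper, and removal of any linear-phase (delay) term is deliberately left out. The test sequence x[n] = δ[n] − 0.9·δ[n−1] is an illustrative choice whose true cepstrum is x̂[n] = −0.9^n/n for n>0.

```python
import numpy as np

def complex_cepstrum(x, n_fft):
    """N-point DFT approximation of the complex cepstrum (no linear-phase removal)."""
    X = np.fft.fft(x, n_fft)
    log_X = np.log(np.abs(X)) + 1j * np.unwrap(np.angle(X))
    return np.fft.ifft(log_X).real

def real_cepstrum(x, n_fft):
    """N-point DFT approximation of the real cepstrum."""
    X = np.fft.fft(x, n_fft)
    return np.fft.ifft(np.log(np.abs(X))).real

x = np.array([1.0, -0.9])          # illustrative sequence; true x_hat[1] = -0.9
cc_short = complex_cepstrum(x, 16)
cc_long = complex_cepstrum(x, 1024)

err_short = abs(cc_short[1] + 0.9)  # aliasing error with N = 16
err_long = abs(cc_long[1] + 0.9)    # aliasing error with N = 1024

# Real cepstrum as the even part of the complex cepstrum: c[n] = (x_hat[n] + x_hat[-n]) / 2.
rc_long = real_cepstrum(x, 1024)
even_part = 0.5 * (cc_long + np.roll(cc_long[::-1], 1))
```

With N = 16 the slowly decaying cepstrum folds back onto itself and the first sample is visibly biased, whereas with N = 1024 the error is at the level of rounding noise.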

  43. Modulo 2π Phase Unwrapper • Goal is to determine r[k] so that ∠X(k) is continuous. • [Figure: Phase Representation in Discrete Complex Spectrum, showing the principal value PV[∠X(ω)] and its samples PV[∠X(k)] between −π and π on a frequency grid with spacing 2π/N]

  44. Modulo 2π Phase Unwrapper • Algorithm: • If PV[∠X(k)]−PV[∠X(k−1)] > 2π−ε • r[k]=r[k−1]−1 # Subtract 2π • Else if PV[∠X(k)]−PV[∠X(k−1)] < −(2π−ε) • r[k]=r[k−1]+1 # Add 2π • Else • r[k]=r[k−1] # Do not change • End • Note: Even with a fine frequency grid (spacing 2π/N, determined by N), it is possible that the true phase changes by a large fraction of 2π between adjacent samples, so that a genuine change is confused with a wrap (case of poles/zeros close together).
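The unwrapping rule above translates almost line for line into code. In the sketch below the function name `unwrap_mod_2pi` and the default `eps = π` are assumptions (the slides do not give a value for ε); with that choice the jump threshold is π, which matches the default behaviour of NumPy's `np.unwrap`.

```python
import numpy as np

def unwrap_mod_2pi(pv, eps=np.pi):
    """Add integer multiples of 2*pi to principal-value phase samples pv[k].

    eps sets the jump-detection threshold 2*pi - eps; eps = pi is an assumed default.
    """
    r = np.zeros(len(pv), dtype=int)
    for k in range(1, len(pv)):
        jump = pv[k] - pv[k - 1]
        if jump > 2 * np.pi - eps:
            r[k] = r[k - 1] - 1        # subtract 2*pi
        elif jump < -(2 * np.pi - eps):
            r[k] = r[k - 1] + 1        # add 2*pi
        else:
            r[k] = r[k - 1]            # do not change
    return pv + 2 * np.pi * r

# Illustrative test signal: a linear phase of slope -5 sampled on [0, pi).
w = np.linspace(0, np.pi, 512, endpoint=False)
true_phase = -5.0 * w
pv = np.angle(np.exp(1j * true_phase))  # principal value in [-pi, pi]
unwrapped = unwrap_mod_2pi(pv)
```

On this smooth test phase the routine recovers the original ramp; it would fail in the situation described in the note, where the phase changes too quickly relative to the grid spacing.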

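  45. Phase Derivative-Based Phase Unwrapper • The phase derivative is uniquely defined by: d∠X(ω)/dω = [Xr(ω)·X′i(ω) − Xi(ω)·X′r(ω)] / |X(ω)|², where Xr(ω) and Xi(ω) are the real and imaginary parts of X(ω) and ′ denotes d/dω • Then: ∠X(ω) = ∫_0^ω [d∠X(β)/dβ] dβ • However, since only X(k) is available, the integral must be estimated from discrete values.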

  46. Phase Derivative-Based Phase Unwrapper • Re-state the problem: ∠X(k) = PV[∠X(k)] + 2π·q(k) • where q(k) is an integer-valued function. • Assuming that the phase has been correctly unwrapped up to k−1 with the value ∠X(k−1), then: • An approximation: • Select the value of q(k) such that E[k] is minimized over q(k).

  47. Example

  48. Short-Time Homomorphic Analysis of Periodic Sequences • Recall the source-system model of speech production: x[n]=h[n]*p[n] • For voiced speech p[n] is quasi-periodic. • For unvoiced speech p[n] is noise-like. • In practice a periodic waveform is windowed by a finite-length sequence w[n]: s[n]=w[n]x[n]=w[n](p[n]*h[n]) • Approximation to s[n]: • [Figure: p[n] → h[n] → x[n]=h[n]*p[n]]

  49. Short-Time Homomorphic Analysis of Periodic Sequences • If w[n] is smooth relative to h[n], that is, P is large enough so that the h[n−kP] do not substantially overlap, then: s[n] ≈ (w[n]p[n])*h[n] • Then the cepstrum of s[n] is the sum of ĥ[n] and the complex cepstrum of w[n]p[n]. • It can be shown that: • D[n] is a weighting function depending on w[n].

  50. Short-Time Homomorphic Analysis of Periodic Sequences: Cepstral Domain (Quefrency) Perspective • Under what conditions can we perform deconvolution? • Let x[n], a voiced speech signal, be produced by an infinite train of periodic impulses. • Thus the only samples in X(ω) and log[X(ω)] are defined at multiples of the fundamental frequency ωo=2π/P, i.e., ωk=(2π/P)k • X(ωk) = P(ωk)H(ωk) • log[X(ωk)] = log[P(ωk)] + log[H(ωk)]
