
Word Recognition Device


Presentation Transcript


  1. Word Recognition Device C.K. Liang & Oliver Tsai

  2. Why is speech recognition important? • Several real-world applications. • Dictation devices and software, e.g. Dragon NaturallySpeaking. • Voice-activated devices can dial telephone numbers, change preset buttons on a car stereo, change TV stations, and serve several other uses.

  3. How is this possible? • Linear Predictive Coding (LPC). • LPC models the waveform as the output of an Infinite Impulse Response (IIR) filter. • It uses feedback from past inputs and past outputs to predict future outputs.

  4. IIR Filter • a(1)*y(n) = b(1)*x(n) + b(2)*x(n-1) + ... + b(nb+1)*x(n-nb) - a(2)*y(n-1) - ... - a(na+1)*y(n-na)
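The difference equation on this slide can be evaluated directly, one output sample at a time. The following is a minimal sketch in Python, not the DSP56303 code from the project; the function name `iir_filter` and the example coefficients are illustrative assumptions.

```python
def iir_filter(b, a, x):
    """Direct evaluation of the slide's IIR difference equation.

    Lists are 0-indexed here, so b[0] is the slide's b(1) and a[0] is a(1).
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        # Feed-forward part: b(1)*x(n) + b(2)*x(n-1) + ...
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        # Feedback part: - a(2)*y(n-1) - a(3)*y(n-2) - ...
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc / a[0])
    return y


# Impulse response of y(n) = x(n) + 0.5*y(n-1) (assumed example coefficients)
impulse_response = iir_filter([1.0], [1.0, -0.5], [1.0, 0.0, 0.0, 0.0])
```

Each output depends on earlier outputs through the `a` terms, which is the feedback the previous slide refers to.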

  5. How do we use LPC for speech recognition? • Record human speech • Pre-emphasis • Convolve the pre-emphasis filter with the waveform

  6. Pre-emphasis Filter
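The slides do not give the pre-emphasis filter coefficients. A common choice is the first-order filter y(n) = x(n) - alpha*x(n-1); the sketch below assumes that form, and the default alpha = 0.95 is an assumed typical value rather than one taken from the presentation.

```python
def pre_emphasize(samples, alpha=0.95):
    """First-order pre-emphasis: y(n) = x(n) - alpha * x(n-1).

    alpha = 0.95 is an assumed, commonly used value, not from the slides.
    """
    out = []
    prev = 0.0  # x(-1) is taken as zero
    for s in samples:
        out.append(s - alpha * prev)
        prev = s
    return out


emphasized = pre_emphasize([1.0, 1.0, 1.0], alpha=0.5)
```

Convolving the waveform with the two-tap filter [1, -alpha] gives the same result; the loop form just avoids storing the filter separately.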

  7. Why are vowel sounds used?

  8. Hamming Window • Multiply the 240 samples of each frame point by point by the Hamming window • This reduces the amplitude at both ends of the window frame
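As an illustration of the point-by-point multiplication, here is a small sketch using the standard Hamming window formula w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)). The helper names and the constant placeholder frame are assumptions made for the example.

```python
import math


def hamming_window(length):
    """Standard Hamming window: w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def apply_window(frame):
    """Multiply a frame point by point by a Hamming window of equal length."""
    window = hamming_window(len(frame))
    return [s * w for s, w in zip(frame, window)]


frame = [1.0] * 240            # placeholder frame of constant samples
windowed = apply_window(frame)
```

With a constant input, the windowed frame is simply the window itself: about 0.08 at both ends and close to 1 in the middle, which is the tapering the slide describes.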

  9. Waveform of a consonant sound

  10. Sound analysis summary • Variance • LPC coefficients

  11. General Block Diagram • A/D converter (8000 samples/sec) → Pre-emphasis filter → Frame blocking (30 ms window framing) → Hamming window → Auto-correlation → Levinson-Durbin algorithm → SSD comparison → Output (4 digital bits)

  12. Implementation on Motorola DSP56303 • Train Device for vowel sound template • Recognition Device for vowels

  13. Training for sound template • Detect beginning of speech • Pre-emphasize 2000 input samples • Apply a Hamming window to a 240-sample frame • Calculate 10 LPC coefficients • Repeat 10 times and store 10 sets of LPC coefficients

  14. Recognition Device • Detect beginning of speech • Pre-emphasize 2000 input samples • Create window frames by shifting 80 samples at a time • Apply a Hamming window to each frame • Find 10 LPC coefficients for each frame • Compute the SSD (sum of squared differences) between these coefficients and those in the template
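The framing and comparison steps above can be sketched as follows. This assumes 240-sample frames (30 ms at 8000 samples/sec) advanced by 80 samples, and a template stored as one coefficient vector per vowel; the function names and the dictionary layout are hypothetical, since the slides do not describe how the templates are organized in memory.

```python
def split_frames(samples, frame_len=240, shift=80):
    """Cut overlapping frames, advancing by `shift` samples each time."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, shift)]


def ssd(coeffs_a, coeffs_b):
    """Sum of squared differences between two coefficient vectors."""
    return sum((p - q) ** 2 for p, q in zip(coeffs_a, coeffs_b))


def best_match(coeffs, templates):
    """Return the template label with the smallest SSD to `coeffs`."""
    return min(templates, key=lambda label: ssd(coeffs, templates[label]))


# 2000 samples, 240-sample frames, 80-sample shift: (2000 - 240) / 80 + 1 frames
frames = split_frames(list(range(2000)))

# Toy two-coefficient templates, for illustration only
toy_templates = {"a": [1.0, 0.0], "e": [0.0, 1.0]}
label = best_match([0.9, 0.1], toy_templates)
```

In a full recognizer each frame would be windowed and reduced to 10 LPC coefficients before `best_match` is called.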

  15. Output Hardware • Map the 4 output bits from the DSP board to 10 corresponding vowel LEDs, plus 1 volume-indicator LED, using NAND chips

  16. Difficulties encountered • Insufficient data memory • Indirect connection between microphone and the DSP board • Incompatible I/O core302 assembly file • Low volume for the sound input

  17. Further Expansion • Speech compression • Large-vocabulary continuous speech recognition with Hidden Markov Models

  18. Autocorrelation • $H(Z) = \dfrac{G}{1 + A_1 Z^{-1} + A_2 Z^{-2} + \dots + A_{10} Z^{-10}}$ • $R_i = \sum_{n=i}^{239} x(n)\,x(n-i)$ for $i = 1$ to $10$
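A minimal sketch of the autocorrelation sum for one 240-sample frame follows. It also computes the lag-0 term R_0, which the Levinson-Durbin slide uses; the function name and the `max_lag` parameter are illustrative.

```python
def autocorrelation(frame, max_lag=10):
    """R_i = sum over n = i .. N-1 of x(n) * x(n - i), for i = 0 .. max_lag."""
    n_samples = len(frame)
    return [sum(frame[n] * frame[n - lag] for n in range(lag, n_samples))
            for lag in range(max_lag + 1)]


# Tiny worked example: x = [1, 2, 3]
r = autocorrelation([1.0, 2.0, 3.0], max_lag=2)
```

For the three-sample example, R_0 = 1 + 4 + 9, R_1 = 2*1 + 3*2, and R_2 = 3*1.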

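  19. Levinson-Durbin Algorithm • Solve the Toeplitz system for $A_1, \dots, A_{10}$:
      $$\begin{bmatrix} R_0 & R_1 & R_2 & \cdots & R_9 \\ R_1 & R_0 & R_1 & \cdots & R_8 \\ R_2 & R_1 & R_0 & \cdots & R_7 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ R_9 & R_8 & R_7 & \cdots & R_0 \end{bmatrix} \begin{bmatrix} A_1 \\ A_2 \\ A_3 \\ \vdots \\ A_{10} \end{bmatrix} = - \begin{bmatrix} R_1 \\ R_2 \\ R_3 \\ \vdots \\ R_{10} \end{bmatrix}$$
      • Recursion: $A_n(i) = A_{n-1}(i) + K_n A_{n-1}(n-i)$ • $K_n = -\dfrac{1}{E_{n-1}} \sum_{i=0}^{n-1} A_{n-1}(i)\, R_{n-i}$ • $E_n = E_{n-1}\,(1 - K_n^2)$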
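The recursion on this slide can be written out as a short routine. This is a sketch under the usual conventions, which the slide does not state explicitly: A(0) = 1 at every order, E_0 = R_0, and A_n(n) = K_n. The function name and return shape are assumptions.

```python
def levinson_durbin(r, order=10):
    """Solve the Toeplitz normal equations for A_1 .. A_order.

    r is [R_0, R_1, ..., R_order]. The returned coefficient list starts
    with A(0) = 1, followed by A_1 .. A_order.
    """
    a = [1.0] + [0.0] * order
    error = r[0]                        # E_0 = R_0 (assumed initial condition)
    for n in range(1, order + 1):
        # K_n = (-1 / E_{n-1}) * sum_{i=0}^{n-1} A_{n-1}(i) * R_{n-i}
        k = -sum(a[i] * r[n - i] for i in range(n)) / error
        prev = a[:]
        # A_n(i) = A_{n-1}(i) + K_n * A_{n-1}(n-i); for i = n this gives K_n
        for i in range(1, n + 1):
            a[i] = prev[i] + k * prev[n - i]
        error *= 1.0 - k * k            # E_n = E_{n-1} * (1 - K_n^2)
    return a, error


# Autocorrelation of a first-order process: R_i = 0.5 ** i
coeffs, final_error = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For this input the second-order coefficient comes out as zero, as expected for data that a first-order predictor already explains. A silent frame with R_0 = 0 would divide by zero, so a real implementation would need to guard against that case.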
