
Speech Processing

Speech Processing. Applications of Images and Signals in High Schools. AEGIS RET All-Hands Meeting, University of Central Florida, June 22, 2012. Contributors: Dr. Veton Këpuska, Faculty Mentor, FIT; Jacob Zurasky, Graduate Student Mentor, FIT.


Presentation Transcript


  1. Speech Processing • Applications of Images and Signals in High Schools • AEGIS RET All-Hands Meeting • University of Central Florida • June 22, 2012

  2. Contributors • Dr. Veton Këpuska, Faculty Mentor, FIT • Jacob Zurasky, Graduate Student Mentor, FIT • Becky Dowell, RET Teacher, BPS Titusville High

  3. Motivation • Speech audio processing has become increasingly useful. • Applications • Siri on iPhone 4S • Automated telephone systems • Voice transcription (e.g., dictation software) • Hands-free computing (e.g., OnStar) • Video games (e.g., Xbox Kinect) • Military applications (e.g., aircraft control) • Healthcare applications

  4. Motivation • Speech recognition requires speech to first be characterized by a set of "features". • Features are used to determine which words were spoken. • Understanding how the features are computed is essential. • Our project will implement the feature extraction stage of a speech processing application.

  5. Work Completed • MATLAB fundamentals • Introduction of Signal Processing and Filtering • Beginning Project Implementation

  6. Speech Recognition • Pipeline: Speech (large amount of data, e.g., 256 samples) → Front End: Pre-processing → Features (reduced data size, e.g., 13 features) → Back End: Recognition → Recognized speech • Front End: reduces the amount of data for the back end while keeping enough to describe the signal accurately. Its output is a feature vector (256 samples → 13 features). • Back End: statistical models classify each feature vector as a particular speech sound.

  7. Discrete Time Signals • A computer is a discrete system with finite memory, so it requires a discrete representation of sound. • Sound is represented as a sequence of samples • time vs. amplitude • Amplitude = volume

  8. Discrete Time Signals

  9. Discrete Time Signals • Sampling rate (number of samples per second) • 8 kHz: telephone • 44.1 kHz: CD audio • 96 kHz: DVD audio
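
To make the sampling-rate idea concrete, here is a minimal Python sketch (the project itself uses MATLAB) that samples a sine wave at two of the rates listed above. The helper name `sample_sine` and the 440 Hz test tone are illustrative assumptions, not part of the original slides.

```python
import math


def sample_sine(freq_hz, fs_hz, duration_s, amplitude=1.0):
    """Return samples of a sine wave taken fs_hz times per second.

    Hypothetical helper for illustration: sample n is taken at time n / fs_hz.
    """
    n_samples = int(round(fs_hz * duration_s))
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / fs_hz)
            for n in range(n_samples)]


# One second of the same 440 Hz tone at telephone and CD sampling rates.
telephone = sample_sine(440.0, 8000, 1.0)
cd_audio = sample_sine(440.0, 44100, 1.0)

# One full cycle of a 2 kHz tone at 8 kHz is described by only four samples.
one_cycle = sample_sine(2000.0, 8000, 0.0005)

print(len(telephone), len(cd_audio), len(one_cycle))
```

A higher sampling rate stores more samples for the same second of sound, which is the memory trade-off a discrete system has to make.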

  10. Frequency Domain • We need to analyze signals in terms of frequency rather than time. • Sound is composed of many frequencies at the same time. • Frequency determines the pitch of the sound. • To recognize a sound, we need to know the frequencies that make it up.

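  11. Fast Fourier Transform (FFT) • Algorithm used to transform a signal from the time domain to the frequency domain. • MATLAB function: fft(x, N), where x is the discrete-time signal and N is the FFT size • Transform computed: $X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}$ • X[k]: frequency spectrum • k: frequency bin • N: FFT size • n: sample number • x[n]: input signal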
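
The FFT is a fast way of evaluating the discrete Fourier transform, $X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}$. The Python sketch below (not the project's MATLAB code) evaluates that sum directly so each symbol can be matched to a line of code. It is far slower than a real FFT and is meant only for small classroom examples; the function name `dft` and the 8-sample test signal are assumptions chosen for illustration.

```python
import cmath
import math


def dft(x):
    """Direct evaluation of X[k] = sum over n of x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)  # FFT size
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        for k in range(N)  # k is the frequency bin
    ]


# A cosine that completes exactly 2 cycles in 8 samples.
N = 8
x = [math.cos(2 * math.pi * 2 * n / N) for n in range(N)]
X = dft(x)
magnitudes = [abs(value) for value in X]
print([round(m, 6) for m in magnitudes])
```

For a real-valued input the energy of this cosine appears in bin 2 and in its mirror image, bin N - 2, which is why plots usually show only the first half of the spectrum.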

  12. Sine Wave Example • MATLAB function sine_sound • Generate 3 sine waves and a composite signal • Play sound and plot graphs • Compute and plot FFT of composite signal

  13. Sine Wave Example % plays a C major chord (C4, E4, G4) sine_sound(8000, 261.626, 329.628, 391.995, 1, 4096);
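
The source of `sine_sound` is not shown in the slides, so the following is not its implementation. It is a Python sketch of the same idea under stated assumptions: 8000 is taken to be the sampling rate in Hz, 4096 the FFT size, and the three middle arguments the tone frequencies. The recursive radix-2 `fft` helper is a textbook stand-in for MATLAB's built-in, and playback and plotting are omitted.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


FS = 8000                               # assumed sampling rate (Hz)
N_FFT = 4096                            # assumed FFT size
CHORD_HZ = (261.626, 329.628, 391.995)  # frequencies from the slide

# Composite signal: the sum of three sine waves.
composite = [sum(math.sin(2 * math.pi * f * n / FS) for f in CHORD_HZ)
             for n in range(N_FFT)]

# Magnitude spectrum; keep the first half because the input is real-valued.
magnitude = [abs(v) for v in fft(composite)[:N_FFT // 2]]

# The three largest local maxima mark the three tones.
local_peaks = [k for k in range(1, len(magnitude) - 1)
               if magnitude[k - 1] < magnitude[k] > magnitude[k + 1]]
peak_bins = sorted(sorted(local_peaks, key=lambda k: magnitude[k],
                          reverse=True)[:3])
peak_freqs = [k * FS / N_FFT for k in peak_bins]
print(peak_bins, peak_freqs)
```

Each bin is FS / N_FFT (about 1.95 Hz) wide, so the recovered frequencies match the generated ones only to within one bin.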

  14. Front-End Processing of Speech Recognizer • Pre-emphasis • Window • FFT • Mel-Scale • log • IFFT
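
The Mel-Scale stage warps the frequency axis so that it is spaced more like human pitch perception. The slides do not say which mel formula the project uses; the Python sketch below assumes the commonly used mapping mel = 2595 · log10(1 + f / 700), and the helper names are hypothetical.

```python
import math


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels (assumed formula, see lead-in)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_spaced_frequencies(count, f_max_hz):
    """Return `count` frequencies in Hz, equally spaced on the mel scale."""
    top = hz_to_mel(f_max_hz)
    return [mel_to_hz(top * i / (count - 1)) for i in range(count)]


# Ten mel-spaced points up to 4 kHz (half of an 8 kHz sampling rate).
centers = mel_spaced_frequencies(10, 4000.0)
gaps = [b - a for a, b in zip(centers, centers[1:])]
print([round(c, 1) for c in centers])
```

The gaps between neighbouring points grow with frequency, so low frequencies are resolved more finely than high ones. This also ties into the logarithmic-scale topic in the pre-calculus curriculum.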

  15. Work Completed: Project Implementation • Pre-emphasis • Windowing • FFT

  16. Pre-Emphasis • 1st-order FIR filter • High-pass filter • In human speech, higher frequencies carry less energy, so the filter compensates for this high-frequency roll-off.
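
A first-order FIR high-pass filter of this kind is usually written y[n] = x[n] - a·x[n-1]. The slides do not give the coefficient, so the value a = 0.97 in this Python sketch is an assumption (a typical textbook choice), as is treating the sample before the first one as zero.

```python
def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1], taking x[-1] as 0 (assumption)."""
    output = []
    previous = 0.0
    for sample in signal:
        output.append(sample - alpha * previous)
        previous = sample
    return output


# A constant signal (lowest possible frequency) is almost removed...
low = pre_emphasis([1.0, 1.0, 1.0, 1.0])

# ...while a signal that flips sign every sample (highest frequency) is boosted.
high = pre_emphasis([1.0, -1.0, 1.0, -1.0])

print(low, high)
```

Comparing the two outputs shows the high-pass behaviour directly: slow changes are attenuated and fast changes are amplified.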

  17. Windowing • Separates the speech signal into frames • Smooths the edges of each frame of the speech signal
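
The two windowing steps can be sketched in Python as follows. The slides do not name the window or the frame overlap, so the Hamming window, the 256-sample frame length (borrowed from the "256 samples" example earlier in the deck), and the 128-sample hop are all assumptions, and the helper names are hypothetical.

```python
import math


def hamming(length):
    """Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (length - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def frame_signal(signal, frame_len=256, hop=128):
    """Split the signal into overlapping frames and taper each frame."""
    window = hamming(frame_len)
    frames = []
    # Only full frames are kept; trailing samples that do not fill one are dropped.
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# A constant test signal makes the taper easy to see in each frame.
signal = [1.0] * 1024
frames = frame_signal(signal)
print(len(frames), frames[0][0], max(frames[0]))
```

Tapering each frame toward its edges reduces the abrupt jumps that cutting a signal into pieces would otherwise introduce into the FFT of each frame.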

  18. Connections to High School Mathematics Curriculum • Florida Math Standard (NGSSS) MA.912.T.1.8: • Solve real world problems involving applications of trigonometric functions using graphing technology when appropriate. • Pre-Calculus course • related topics include graphs of trigonometric functions, unit circle, logarithmic scale, complex numbers in trig form

  19. Timeline • Week 1 • MATLAB fundamentals • MATLAB Filter Design & Analysis Tool • Introduction to Signal Processing, FFT, Filtering • Identified topics connected to high school math curriculum • Week 2 • Continued tutorials on signal processing and filtering • Implementation of sample code for use in lesson plans • Implementation of Pre-emphasis, Windowing, FFT

  20. Timeline • Week 3 • Cepstral Transform • Implementation of Front-End Speech Processing • Week 4 • Implementation of Front-End Speech Processing • Week 5 • Implementation of Front-End Speech Processing • Work on deliverables. • Week 6 • Work on deliverables.

  21. References • Ingle, Vinay K., and John G. Proakis. Digital Signal Processing Using MATLAB. 2nd ed. Toronto, Ont.: Nelson, 2007. • Oppenheim, Alan V., and Ronald W. Schafer. Discrete-Time Signal Processing. 3rd ed. Upper Saddle River: Pearson, 2010. • Weeks, Michael. Digital Signal Processing Using MATLAB and Wavelets. Hingham, Mass.: Infinity Science Press, 2007.

  22. Thank you! Questions?
