
Speech Science II


christa

Presentation Transcript


  1. Speech Science II: Capturing and representing speech

  2. Topics
     • The empirical basis and the theoretical goal
     • Capturing speech events
     • Analysis and representation
     "Homework":
     a) Kent, Chap. 8, pp. 306-317
     b) Borden, Harris & Raphael (3rd edition), Chap. 7, pp. 234-260; Raphael, Borden & Harris (5th edition), Chap. 12-13, pp. 275-312
     German: c) Pompino-Marschall, Part I, pp. 1-4
     Exercise 2 (for 13 Nov): see exercise sheet (hand in by 12 Nov).

  3. The empirical basis
     [Figure: Observation domains within the speech chain. Labels: CNS (neural processes), neuromuscular process, articulation, acoustics, hearing, CNS (neural processes); stimulus, transform, system, measurement; speech signal as a function of time.]
     • Speech is open to controlled observation.
     • It consists of registerable events. They can be repeated in a verifiable way.
     • But what is the nature of the observable events? Answer: That depends on the domain you look at!

  4. The goal
     • Our goal is to explain how speech is produced and perceived (how we operate as speaking and listening individuals).
     • I.e., to derive models of speech production and perception from quantitative analysis of speech processes (physical) in relation to the resulting speech events and their communicative function (experienced).
     • Technically speaking, this is the relationship between the "phenomenal" and the "trans-phenomenal".
     • If we only work in the physical domain, we have no link to what speech actually is and does, namely something we experience, a "phenomenon".
     • If we only work in the auditory domain, we cannot escape from the phenomenal (the subjective experience).

  5. Two empirical domains?
     • Speech was an area of empirical study long before the present-day instrumental methods were established.
     • Linguists/phoneticians wrote down what they heard;
     • Physiologists/physicists registered articulatory/physical processes.
     So which is preferable?
     • Auditory observations are subjective and only record events that can be named (are linguistically defined).
     • Instrumental records are objective and record selected aspects of the processes that took place during the speech event.

  6. Interdependence (heard – measured)

  7. Levels of auditory analysis
     • The limits of our auditory processing abilities have been nicely summarized by Tillmann (1980, p. 39) as "the three prosodies":
     • The A-prosody is the melodic structure.
     • The B-prosody is the syllabic, rhythmic structure.
     • The C-prosody is the segmental structure.
     • We can only directly experience A and B.

  8. Reality vs. analytic construct
     • In our search for an explanation of how speech works, we work with observation to explain what we experience.
     • The possible structure of what we experience as speech communication is a hypothesis (theory) to be supported or falsified by interpretation of the observations.
     • Most hypotheses are formulated as linguistic statements:
       What are the sounds in the syllable, word …?
       What is the phonetic structure of the sound …?
       Is the syllable stressed, unstressed …?
       What are the tonal accents and where do they occur …?
     • The reality of these analytic units is almost always taken for granted!

  9. Capturing signals
     • Each observation domain requires its own method of capturing data.
     • The methods differ greatly in complexity, difficulty and expense:
     • Neural and neuro-physiological methods are VERY expensive and usually require medical supervision (or are only available in a medical research department).
     • Physiological/articulatory methods are complicated and often rather expensive.
     • Acoustic methods (and perceptual studies using acoustic stimuli) are relatively inexpensive and are readily available (nowadays).

  10. Sub-glottal air-pressure analysis
      Method 1: Balloon in the oesophagus
      [Figure labels: catheter, balloon]
      Disadvantages: indirect measurement; position of balloon critical (uncertain accuracy).
      Acceptability: LOW!

  11. Air-pressure analysis I
      [Figure labels: vocal folds; windpipe; oesophagus; air from the lungs; to the stomach; balloon, 2 cm long, filled with 2 ml of air; 2 mm catheter to measure the air pressure]
      When the subglottal pressure rises, the rear wall of the windpipe is pushed back against the balloon, reducing its volume and thereby increasing the pressure in the balloon (Boyle's Law).
      The air pressure in the balloon is thus proportional to the sub-glottal pressure.
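The Boyle's Law step on this slide can be illustrated numerically. The following is a minimal sketch, not the lecture's own method: it assumes an ideal gas at constant temperature, and the function name `balloon_pressure`, the resting pressure of 1013 hPa, and the compressed volumes are hypothetical values chosen for illustration. Only the 2 ml resting volume comes from the slide.

```python
def balloon_pressure(p_initial, v_initial_ml, v_compressed_ml):
    """Pressure of a fixed amount of gas after an isothermal volume change.

    Boyle's Law: p_initial * v_initial = p_final * v_final.
    The result is in the same unit as p_initial.
    """
    if v_initial_ml <= 0 or v_compressed_ml <= 0:
        raise ValueError("volumes must be positive")
    return p_initial * v_initial_ml / v_compressed_ml


# Assumed values, for illustration only.
P_REST_HPA = 1013.0  # hypothetical resting pressure in the balloon (hPa)
V_REST_ML = 2.0      # air volume in the balloon, as stated on the slide

for v_ml in (2.0, 1.99, 1.98):  # hypothetical compressed volumes (ml)
    p = balloon_pressure(P_REST_HPA, V_REST_ML, v_ml)
    print(f"volume {v_ml:.2f} ml -> pressure {p:.1f} hPa")
```

The sketch only shows the direction of the effect: a smaller balloon volume means a higher balloon pressure. It ignores tissue elasticity and the uncertain balloon position that the slides list as disadvantages of this method.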

  12. Air-pressure analysis II
      Method 2: Needle in the trachea!
      Advantage: direct measurement.
      Disadvantage: medical supervision necessary.
      Acceptability: LOW!

  13. Investigating glottal activity 1. Adduction and abduction: Transglottal illumination

  14. Investigating glottal activity 2. Voicing: Electroglottography (EGG)

  15. Investigating articulation
      • How accessible an articulator is determines the specific problems of recording its activity:
        Lips: very accessible; optical methods possible
        Jaw: also accessible to optical or direct mechanical methods
        Tongue: not directly accessible; very complex movements
        Velum: not accessible, but relatively simple movements

  16. Electropalatography
      EPG registers the amount of contact between the tongue and the palate.
      Advantage: It captures the changes of contact over time.
      Disadvantage: It only captures the actual contact, not the pressure and not the proximity to the palate. It disturbs the articulation slightly.
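As a rough illustration of "amount of contact over time", the sketch below treats each EPG frame as a grid of binary electrode states (1 = tongue contact, 0 = no contact) and computes the percentage of contacted electrodes per frame. The grid size, the frame data and the function names are assumptions made for this example; real EPG systems have their own electrode layouts and file formats.

```python
def contact_percentage(frame):
    """Percentage of electrodes in one frame that register tongue contact."""
    total = sum(len(row) for row in frame)
    if total == 0:
        raise ValueError("frame contains no electrodes")
    contacted = sum(sum(row) for row in frame)
    return 100.0 * contacted / total


def contact_profile(frames):
    """Contact percentage for each frame, in temporal order."""
    return [contact_percentage(frame) for frame in frames]


# Hypothetical 3 x 4 electrode grid sampled at three points in time.
example_frames = [
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],  # no contact
    [[1, 1, 1, 1], [1, 0, 0, 1], [0, 0, 0, 0]],  # partial contact
    [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]],  # full contact
]

print(contact_profile(example_frames))
```

Because each electrode only reports contact or no contact, such a profile says nothing about tongue pressure or about how close the tongue is to the palate, which matches the disadvantage noted on the slide.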
