PSYC 330: Perception SPEECH
SOME BASICS • Methods of Manipulation • PHONATION (air pushed across vocal cords) • Airflow • Mass and “tuning” of cords • Harmonics • ARTICULATION (changes in vocal tract – ah; ee) • Vocal tract (everything above the larynx) acts as resonator • Change in shape → change in resonance characteristics (shape increases/decreases energy at different frequencies) • Filter function; peaks in the wave are called FORMANTS • Lowest-frequency peak = F1, then F2, and so forth
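The source–filter idea above (phonation supplies a harmonic-rich source; the vocal tract filters it, boosting energy at the formants) can be sketched numerically. A minimal Python sketch, assuming illustrative formant frequencies and bandwidths for an /a/-like vowel — these numbers are placeholders, not measurements:

```python
import math

FS = 8000          # sample rate (Hz)
F0 = 100           # glottal pulse rate (the phonation "source")
FORMANTS = [(700, 80), (1200, 100)]   # (centre freq, bandwidth) in Hz; assumed /a/-like values

def resonator(signal, freq, bw, fs=FS):
    """Two-pole resonator: boosts energy near `freq`, modelling one formant."""
    r = math.exp(-math.pi * bw / fs)
    a1 = 2 * r * math.cos(2 * math.pi * freq / fs)
    a2 = -r * r
    y = [0.0, 0.0]
    for x in signal:
        y.append(x + a1 * y[-1] + a2 * y[-2])
    return y[2:]

# Source: impulse train at F0 (rich in harmonics of 100 Hz)
n = FS // 10  # 100 ms of signal
source = [1.0 if i % (FS // F0) == 0 else 0.0 for i in range(n)]

# Filter: cascade the formant resonators (the vocal tract's "filter function")
vowel = source
for freq, bw in FORMANTS:
    vowel = resonator(vowel, freq, bw)

print(len(vowel))  # 800 samples of a crude synthetic vowel
```

Changing the shape of the vocal tract corresponds here to changing the `FORMANTS` list: the same source run through different resonators yields different vowels.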
ARTICULATION AND SPEECH • Vowels • TONGUE • Up or down • Front or back • LIPS • Degree of rounding • EE, AH, OO
Consonants • PLACE of articulation • Bilabial (lips): b, p, m • Alveolar (ridge behind the teeth): d, t, n • Velar (soft palate): g, k, ng • MANNER of articulation • Stops: b, d, g, p, t, k • Fricatives: s, z, f, v, th, sh • Laterals and glides: l, r, w, y • Affricates: ch, j • Nasals: n, m, ng • VOICING • Voiced: b, m, z, l, r • Voiceless: p, s, ch
Additional Complications Co-articulation • Articulation of one speech sound overlaps with the next, because we talk so fast • Production of each sound is adjusted to the sounds preceding and following it • Context effect (say “moody” and “eedoom”) • Lack of physical invariance in the stimulus – doesn’t bother speech perception in practice, but a big problem for theory (and for AI) Categorical perception • Sharp labeling (one category OR the other) • Inability (or difficulty) discriminating within categories • Discrimination performance predicted by labeling McGurk effect – mismatched auditory and visual input: visual “gah” + auditory “bah” → perception “dah”
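The claim that discrimination performance is predicted by labeling can be made concrete with the classic covert-labeling account of ABX trials: the listener covertly labels A, B, and X, and can only discriminate when the labels differ, guessing otherwise. A minimal Monte Carlo sketch in Python — the logistic identification function, its boundary at step 5, and its slope are illustrative assumptions, not values from the lecture:

```python
import math
import random

random.seed(0)

def p_category1(step, boundary=5, slope=2.0):
    """Assumed identification function: probability of labelling a
    continuum step as category 1 (sharp change near the boundary)."""
    return 1 / (1 + math.exp(-slope * (step - boundary)))

def abx_accuracy(step_a, step_b, trials=20000):
    """Covert-labelling model: label A, B, and X; if A and B get the
    same label, guess; otherwise answer with whichever label X matched."""
    correct = 0
    for _ in range(trials):
        x_is_a = random.random() < 0.5
        step_x = step_a if x_is_a else step_b
        la = random.random() < p_category1(step_a)
        lb = random.random() < p_category1(step_b)
        lx = random.random() < p_category1(step_x)
        if la == lb:
            answer_a = random.random() < 0.5   # identical labels: pure guess
        else:
            answer_a = (lx == la)              # pick whichever label X matched
        correct += (answer_a == x_is_a)
    return correct / trials

# Same physical distance (2 continuum steps), but only the pair that
# straddles the category boundary is discriminated well above chance:
within = abx_accuracy(1, 3)    # both steps well inside one category
across = abx_accuracy(4, 6)    # straddles the boundary at step 5
print(round(within, 2), round(across, 2))
```

The within-category pair comes out near chance (0.5) while the across-boundary pair is well above it, reproducing the categorical-perception pattern purely from the labeling function.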
Speech Segmentation • Saffran et al. (1996) • 8-month-old infants exposed to a 2-minute stream of an artificial language • After this brief exposure, infants were already picking out the “words” • DV = listening time to presented “words” vs. “non-words” • Concluded that children pick up the covariance of sound combinations (statistical likelihood) • pri-tee (baby, good, far, nice) • bay-bee (girl, boy, good) • tee-bay (?)
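The statistical-likelihood idea — syllable pairs inside a word follow each other more predictably than pairs spanning a word boundary — is usually formalized as a transitional probability, TP(x→y) = freq(xy) / freq(x). A minimal Python sketch over a made-up syllable stream (the syllables and “words” here are illustrative, not Saffran et al.'s actual stimuli):

```python
from collections import Counter

# Made-up stream built from three two-syllable "words"
stream = []
for w in ("pri-tee bay-bee goo-lah bay-bee pri-tee goo-lah "
          "pri-tee bay-bee goo-lah pri-tee").split():
    stream.extend(w.split("-"))

syll_counts = Counter(stream)
pair_counts = Counter(zip(stream, stream[1:]))

def tp(x, y):
    """Transitional probability: how often y follows x, given x occurred."""
    return pair_counts[(x, y)] / syll_counts[x]

# Within-word transitions are perfectly predictable...
print(tp("pri", "tee"))   # 1.0 -- "tee" always follows "pri"
# ...while transitions across word boundaries are not
print(tp("tee", "bay"))   # 0.5 in this stream
```

An infant tracking these statistics could treat high-TP pairs as candidate words (pri-tee) and low-TP pairs as boundaries (tee-bay), which is exactly the contrast the listening-time DV tested.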
Language Development • Can you say 7,777 in Swedish? • Not just pronunciation, but hearing too • Preference methodology • In utero (heart-rate changes) • Familiarity effect (own language, own mother, own stories) • 6 months • Preference for own language’s vowels • 12 months • Preference for own language’s consonants • Huppi & Dubois (2013): brain scans on premature babies (up to 3 months early – brain not fully developed) • Found that they discriminated between male and female voices • Found that they discriminated between “ga” and “da” • Used the same regions of the brain as adults do to make these discriminations
Special Topic: Emotion Perception in Language Laukka (2005), Categorical Perception of Vocal Emotion Expression • Stimulus development: an actress says “It is now 11 o’clock” in tones reflecting anger, fear, happiness, and sadness • Physically “morph” the sounds from one emotion to the next → continuous variation
Method • Undergraduates were given a sequential ABX discrimination task: given two sounds A and B, is X equal to A or to B? • All combinations of the morphed sounds differing by 20% were compared • Participants were also asked to judge the emotion (in addition to discriminating it)
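The stimulus design above can be sketched in a few lines: a 0–100% morph continuum sampled in 20% steps, with the adjacent pairs forming the ABX comparisons. A minimal Python sketch — the slide gives the 20% spacing, but the trial structure (X randomly repeating A or B) is an assumed detail of ABX designs generally, not something stated in the slide:

```python
import random

random.seed(1)

# Morph continuum between two emotion portrayals (e.g. anger -> fear),
# sampled in 20% steps as on the slide
steps = list(range(0, 101, 20))          # [0, 20, 40, 60, 80, 100]

# Adjacent pairs differing by 20%
pairs = list(zip(steps, steps[1:]))      # [(0, 20), (20, 40), ...]

# One ABX trial per pair: X repeats either A or B, and the correct
# answer is whichever one it repeated
trials = []
for a, b in pairs:
    x = random.choice([a, b])
    trials.append({"A": a, "B": b, "X": x, "answer": "A" if x == a else "B"})

print(len(trials))   # 5 adjacent 20%-step comparisons per continuum
```

A categorical-perception result would then show up as higher accuracy on the one adjacent pair that straddles the emotion-category boundary than on the equally spaced pairs within a category.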
Brain Structures in Speech Comprehension • Old school: Broca’s and Wernicke’s areas; lateralization effects • A1; belt and parabelt; anterior temporal lobe • HOW? • Rosen et al. (2011) • How (on what basis) does language (as opposed to other complex sounds) become lateralized? • IV = intelligible vs. unintelligible sentences of equal auditory complexity (created by manipulating frequency and amplitude changes in the auditory signal) • DV = brain-scan data (PET) • Results: intelligible sentences were processed in the left temporal lobe; equally complex but unintelligible sentences were processed bilaterally
Attention and Speech “Mechanisms Underlying Selective Neuronal Tracking of Attended Speech at a ‘Cocktail Party’” (Zion Golumbic et al., 2013) • The cocktail-party phenomenon – how do we do it? • Direct electrical recording from the brains of epilepsy patients • Presented naturalistic “cocktail party” stimuli • Findings: brain regions in and near primary auditory cortex respond to both the attended and the unattended speaker; processing along subsequent pathways is selective • We use bottom-up processing (temporal/amplitude patterns) to “tune” the selection – selectivity “unfolds” (becomes more prominent) across a sentence
Green dots = brain regions that responded to both speakers (attended and ignored) Red dots = brain regions that responded selectively (only to the attended speaker)