Phonotactics and knowledge of relative similarity

Phonotactics and knowledge of relative similarity D.Steriade, MIT

Phonotactics? • System of segmental/prosodic contrasts • E.g., is there T ≠ D? • Their contextual distribution • E.g., is there T ≠ D in /_#?

The 2 phonotactic questions • Trigger: what factor causes loss of contrast? • E.g. *+voice? *+voice/_#? • Process: how does contrast loss come about? • E.g. what happened to (potential) D’s in /_#? Did they devoice, nasalize, turn to glides, delete, get a V after, merge with the preceding segment, or allow their voicing to float to better positions?

1st message today • Can’t understand triggers unless we understand processes • Can’t understand processes unless we understand similarity relations. • Can’t understand effect of similarity on grammar unless we understand the inhibiting effect of lexical knowledge

2nd message • Knowledge transfer: learners transfer knowledge from one domain (phonetics, perceptual similarity) to another (phonotactic process)

Basic issues • Is there knowledge of grammar? • Its precise nature? • Source of universal laws? • Relevance to study of competence? • Links between grammar and lexicon? • Learning

Knowledge of phonotactics • Perception is (mis)guided by phonotactic knowledge (Pitt 1998; Pitt & McQueen 1998; Moreton 2002; Dupoux et al. 1999). • Production is limited by L1 phonotactics (L2 lit; Eckman 1978; Broselow et al. 1995; …).

Form of phonotactic knowledge • A result from OT: phonotactic systems can be factored into general constraints, ready for cross-linguistic comparison, if the constraints are ranked and violable

The Nonfinality example (adapted from Prince & Smolensky 1993) • Latin: no final stress • Except that monosyllables are stressed. • Have stress >> Nonfinality • Cairene: no final stress • Except for monosyllables and extraheavy finals (CVVC, CVCC) • Have stress, *Stressless extraheavy >> Nonfinality • Gupta’s Hindi: no final stress • Except for monosyllables and the heaviest syllable of the word, if final. • Have stress, *Stressless Heavy >> Nonfinality
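The Latin case can be sketched as a tiny ranked-constraint evaluation. This is an illustrative sketch, not the analysis from Prince & Smolensky: the candidate encoding (one boolean per syllable), the function names, and the restriction to at most one stress per word are all assumptions made here. It only models Have stress and Nonfinality; the weight-sensitive constraints for Cairene and Hindi are left out.

```python
from typing import Callable, List, Sequence, Tuple

# A candidate is one boolean per syllable: True = stressed (assumed encoding).
Candidate = Tuple[bool, ...]
Constraint = Callable[[Candidate], int]


def have_stress(c: Candidate) -> int:
    """One violation if no syllable in the word is stressed."""
    return 0 if any(c) else 1


def nonfinality(c: Candidate) -> int:
    """One violation if the final syllable is stressed."""
    return 1 if c[-1] else 0


def candidates(n: int) -> List[Candidate]:
    """The stressless form plus every form with exactly one stress."""
    out: List[Candidate] = [tuple([False] * n)]
    for i in range(n):
        out.append(tuple(j == i for j in range(n)))
    return out


def winners(n: int, ranking: Sequence[Constraint]) -> List[Candidate]:
    """Candidates whose violation profile is lexicographically smallest."""
    cands = candidates(n)

    def profile(c: Candidate) -> Tuple[int, ...]:
        return tuple(con(c) for con in ranking)

    best = min(profile(c) for c in cands)
    return [c for c in cands if profile(c) == best]


LATIN = [have_stress, nonfinality]      # Have stress >> Nonfinality
REVERSED = [nonfinality, have_stress]   # hypothetical reverse ranking
```

Under the `LATIN` ranking a monosyllable surfaces stressed even though that violates Nonfinality, while a disyllable avoids final stress. Under the reversed ranking the monosyllable would surface stressless, which is the contrast the slide attributes to ranking.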

It’s an ecumenical result • Open Q: whether learners factor out their phonotactic knowledge into general, ranked and violable constraints. • Established result: this factorization yields a far clearer view of phonotactic typology than all previous ones. (cf. Archangeli and Pulleyblank 1987 for an attempt to characterize typology by breaking down rules into elementary operations)

Why these laws? • Right context laws: If T ≠ Tʰ /_ ¬V then T ≠ Tʰ /_ V; if p ≠ t ≠ k /_ ¬V then p ≠ t ≠ k /_ V • Left context law: If T ≠ ʰT / ¬V_ then T ≠ ʰT / V_; if ʈ ≠ t / ¬V_ then ʈ ≠ t / V_ (Steriade 1995, 1999)

Context affects perceptibility • Cues are context dependent. • And (sometimes) asymmetrically distributed: left context essential in T ≠ ʰT, ʈ ≠ t; right context essential in others. • Scale of optimal perceptibility for some contrast = implicational scale of licensing positions for that contrast [Crosswhite 1998, Flemming 1995, Hamilton 1994, Jun 1995, Kirchner 1999, Kochetov 1999-2002, Silverman 1995, Steriade 1994-1999, Zhang 2000, …] • Result: phonotactic laws have identifiable sources in speech perception and production. [general line of thought: Ohala 1990, Lindblom 1990, others]

Knowledge of the general laws? • Or just of the manifestations to which learners are overtly exposed? • Background: Jakobson 1941, Prince & Smolensky 1993 • Result (and burning issue): preference for unmarked (e.g. mp vs. np) before knowledge of language-specific phonotactics (Jusczyk, Smolensky, Allocco 2002)

Does the phonotactic grammar “emerge from the lexicon”? [Labphon 5, Coleman & Pierrehumbert 1998, Frisch, Large & Pisoni 2000, Bailey & Hahn 2001] • Knowledge of lexical patterns not attributable to general laws [Ernestus & Baayen 2002, Pierrehumbert 2002] • Knowledge of phonotactic preferences not reflected in lexical patterns [Moreton 2002. Also Shinohara 1997, Fleischhacker 2000, Davidson 2002, Shademan 2002.]

Learning (Tesar & Smolensky 2000, Prince & Tesar 1999, Hayes 1999) • Results: 1st learning models that (a) extend beyond systems of non-interactive parameters (Dresher & Kaye 1990) (b) do not depend on a fixed learning path planted with learning cues (Dresher 1999) • Models build on the assumption of violability and (re)-ranking

What I do: Intersection of 3 basic issues • Source of phonotactic knowledge: • hidden rankings of correspondence conditions [Davidson 2002] • Knowledge of grammar vs. knowledge of lexicon: • lexicon-based vs. hidden constraint hierarchies • Nature of phonotactic knowledge: • {context-sensitive Markedness; context-free Correspondence} • vs. {context-free M; context-sensitive C}?

Hidden rankings: intro • The phenomenon in general: loanword adapters (+ other phonotactic freelancers) converge on solutions to phonotactic violation, without prompting from the native sound system. • Significance: any choice of solution to phonotactic violation reveals implicit knowledge of a correspondence ranking.

Example (based on Cantonese, cf. Silverman 1992, Phonology; cf. also Mandarin, Broselow et al. 1995, SSLA) • What the lexicon tells the learner: no word-final D/Tʰ obstruent: *tab, *tapʰ • What it doesn’t: how to fix a deviant input • delete bad coda? ta (MAX C) • add V? tabi, tapʰi (DEP V) • relocate bad feature? dap, tʰap (Linearity) • remove coda voicing/asp! tap (Ident voice/asp) • But the learner knows this anyway. • Hidden ranking: MAX C, DEP V, Linearity >> Ident [±voice/±asp]
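The hidden ranking can be checked with a small tableau evaluator. This is a sketch under stated assumptions: the markedness constraint label `*D#` (no word-final voiced obstruent) and its placement in the top stratum are added here for illustration, since the slide lists only the correspondence constraints. Constraints the slide leaves unranked are grouped in one stratum, and violations are summed within a stratum, which is a simplification.

```python
from typing import Dict, List, Tuple

# Violation marks for each repair of the deviant input /tab/, as on the slide.
TABLEAU: Dict[str, Dict[str, int]] = {
    "tab":  {"*D#": 1},           # faithful candidate, violates the phonotactic
    "ta":   {"MAX-C": 1},         # delete bad coda
    "tabi": {"DEP-V": 1},         # add V
    "dap":  {"LINEARITY": 1},     # relocate bad feature
    "tap":  {"IDENT[voice]": 1},  # remove coda voicing
}

# MAX C, DEP V, Linearity >> Ident [voice]; "*D#" on top is an assumption.
HIDDEN_RANKING: List[List[str]] = [
    ["*D#", "MAX-C", "DEP-V", "LINEARITY"],
    ["IDENT[voice]"],
]


def profile(marks: Dict[str, int], strata: List[List[str]]) -> Tuple[int, ...]:
    """Violations per stratum, highest-ranked stratum first."""
    return tuple(sum(marks.get(c, 0) for c in stratum) for stratum in strata)


def optimal(tableau: Dict[str, Dict[str, int]],
            strata: List[List[str]]) -> List[str]:
    """Candidates with the lexicographically smallest violation profile."""
    best = min(profile(m, strata) for m in tableau.values())
    return [cand for cand, m in tableau.items() if profile(m, strata) == best]


# A hypothetical alternative grammar in which deletion is the cheapest repair.
DELETION_RANKING: List[List[str]] = [
    ["*D#", "DEP-V", "LINEARITY", "IDENT[voice]"],
    ["MAX-C"],
]
```

With `HIDDEN_RANKING` the evaluator selects `tap`, the devoicing repair. Demoting MAX C instead, as in `DELETION_RANKING`, selects `ta`. The lexicon alone is compatible with either grammar, which is the point of calling the ranking hidden.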

Hidden rankings in cluster resolution: part 1 • Language disallows CC onset/CC coda. Coda restrictions • Native system lacks relevant alternations: learner can’t tell the fate of bad syllables • Dual pattern of preservation: • Strident C’s preserved as such, in all contexts • Non-stridents lost or modified, depending on context

Cantonese (Silverman 1992) • Phonotactics: *CC onset/coda and *[fricative] in coda. • Phonotactic solutions to deviant inputs: • Post-V: all consonants preserved, some modified • Strident codas induce epenthesis: • /bus/ -> pasi, not *pat • Nonstrident fricative codas become stops: • /shaft/ -> sap, not *safi(t) • Ident (+strident) >> DEP >> Ident (+cont)

Cantonese (cont.) • Non-V-adjacent context (¬(//V)) • Stridents induce epenthesis: • /tips/ -> tʰipsi, not *tip • /stamp/ -> sitam, not *tam • Non-stridents deleted: /bend/ -> pen, not *penti • /post/ -> posi, not *posit • MAX strident (¬(//V)) >> DEP >> MAX C (¬(//V)) • MAX strident (¬(//V)) >> Contiguity >> MAX C (¬(//V))

Cantonese (end) • Next to vocoid (V, glide or liquid): all C’s preserved, phonotactics satisfied via epenthesis: /fluke/ -> fuluk, not *fuk, *luk (contrast /bend/ -> pen) • A further hidden ranking: MAX C (//Vocoid) >> DEP, Contig >> MAX C (¬(//Vocoid)) • fluk -> fuluk; bend -> pen

Similar dual pattern in loan adaptation into: • Hausa: Newman 2000 • Dschang: Bird 1999 • Selayarese: Broselow 1997 • Jahai: Burenhult 2001 • Sranan: Alber & Plag 1999 • others

Hidden rankings part 2: anaptyxis vs. prothesis • Fleischhacker (2000 UCLA MA, 2003 UCLA diss) [also Broselow (1992); Zuraw 2002] • Dual pattern of CC onset avoidance: • Stop-sonorant clusters: anaptyxis. Egyptian Arabic: plastic -> bilastik • Other CC clusters, esp. s-stop: prothesis. Egyptian Arabic: study -> istadi • s-stop-sonorant clusters: prothesis and anaptyxis. Egyptian Arabic: street -> /istirit

The law (degenerate version of Fleischhacker’s) • Anaptyxis in s-Stop implies anaptyxis in Stop-sonorant • Only anaptyxis: Japanese, Punjabi • Only prothesis: Iraqi (but different pattern in sCC) • Both sites, with anaptyxis limited to stop-sonorant: Egyptian, Amharic, Farsi, Kazakh, Sinhalese, Armenian, Wolof, …
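The implicational law is a material implication over two yes/no properties of a language, and the slide's language lists can be checked against it mechanically. The encoding below (a pair of booleans per language) is an assumption made for this sketch, and the Iraqi entry ignores the slide's caveat about sCC clusters.

```python
from typing import Dict, Tuple

# (anaptyxis in s-stop, anaptyxis in stop-sonorant), coded from the slide.
PATTERNS: Dict[str, Tuple[bool, bool]] = {
    "Japanese": (True, True),    # only anaptyxis
    "Punjabi": (True, True),     # only anaptyxis
    "Iraqi": (False, False),     # only prothesis (sCC caveat ignored here)
    "Egyptian": (False, True),   # prothesis in s-stop, anaptyxis in stop-sonorant
    "Amharic": (False, True),
    "Farsi": (False, True),
}


def obeys_law(s_stop_anaptyxis: bool, stop_son_anaptyxis: bool) -> bool:
    """Anaptyxis in s-stop implies anaptyxis in stop-sonorant."""
    return (not s_stop_anaptyxis) or stop_son_anaptyxis


counterexamples = [lang for lang, p in PATTERNS.items() if not obeys_law(*p)]
```

The only pattern the law rules out is `(True, False)`: anaptyxis in s-stop clusters combined with prothesis in stop-sonorant clusters. None of the listed languages has it, so `counterexamples` comes out empty.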

Not just phonotactically motivated V-insertion • Pierrehumbert & Nair’s (1995 Lg&Sp) language game: bæŋk -> bətæŋk • S-stop clusters preserved intact: skæb -> skətæb • Obstruent-liquid clusters tend to split: plæn -> pətlæn [cf. Fleischhacker 2001 for discussion and Zuraw 2002 on a parallel pattern in Tagalog]

Relativized contiguity • General solution: Contiguity s-stop >> Contiguity stop-sonorant • Anaptyxis ATB (svTV, TvRV): C/_V >> Contig. s-stop >> Contig. stop-sonorant • Prothesis ATB (vsTV, vTRV): … >> Contig. stop-sonorant >> C/_V • Anaptyxis in TvRV, prothesis in vsTV: Contig. s-stop >> C/_V >> Contig. stop-sonorant
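The three rankings above are exactly the rankings of the three constraints that keep Contiguity s-stop above Contiguity stop-sonorant, and a small factorial typology makes this visible. This is a sketch under assumptions: C/_V is read here as "every consonant is prevocalic", so prothesis (vsTV, vTRV) violates it and anaptyxis does not; the constraint labels and violation assignments are made up for illustration.

```python
from itertools import permutations
from typing import Dict, Set, Tuple

CONSTRAINTS = ("C/_V", "CONTIG-sT", "CONTIG-TR")

# Constraints violated by each repair of each cluster type (assumed marks).
VIOLATIONS: Dict[str, Dict[str, Set[str]]] = {
    "sT": {"anaptyxis": {"CONTIG-sT"}, "prothesis": {"C/_V"}},
    "TR": {"anaptyxis": {"CONTIG-TR"}, "prothesis": {"C/_V"}},
}


def repair(cluster: str, ranking: Tuple[str, ...]) -> str:
    """Repair with the lexicographically smallest violation profile."""
    def profile(rep: str) -> Tuple[int, ...]:
        return tuple(int(c in VIOLATIONS[cluster][rep]) for c in ranking)
    return min(VIOLATIONS[cluster], key=profile)


def typology(fixed: bool = True) -> Set[Tuple[str, str]]:
    """Predicted (s-stop repair, stop-sonorant repair) pairs.

    With fixed=True only rankings with CONTIG-sT above CONTIG-TR are used.
    """
    patterns: Set[Tuple[str, str]] = set()
    for ranking in permutations(CONSTRAINTS):
        if fixed and ranking.index("CONTIG-sT") > ranking.index("CONTIG-TR"):
            continue
        patterns.add((repair("sT", ranking), repair("TR", ranking)))
    return patterns
```

With the fixed ranking, `typology()` yields the three attested patterns. Calling `typology(fixed=False)` additionally yields anaptyxis in s-stop with prothesis in stop-sonorant, the pattern the law excludes.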

Source of the hidden rankings? • Relative similarity judgments: D (x-y) < D (z-w) • Choice of final devoicing (over C-deletion, epenthesis, …): D (T-D/_#) < D (C-Ø), D (V-Ø) (Steriade 2002, below) • Choice of anaptyxis over prothesis in stop-son.: D (TR-TvR) < D (TR-vTR) (Fleischhacker 2000) • Choice of prothesis over anaptyxis in s-stop: D (sT-vsT) < D (sT-svT) (Fleischhacker 2000) • Choice of C-preservation by context: D (C-Ø (¬(//V))) < D (C-Ø (//V))

P-map • Set of relative perceptual similarity judgments. • Rooted in “phonetic knowledge” (Kingston & Diehl Lg 1994) • Similarity rankings provide a tool for inferring: (a) the form of correspondence constraints; (b) their rankings • E.g., if the learner knows D (b-d) > D (m-n), he infers that (a) Ident place/oral C ≠ Ident place/nasal C (b) Ident place/oral C >> Ident place/nasal C • And conversely, if he believes D (b-d) = D (m-n), then he is free to posit a single constraint Ident place; or 2 constraints but fail to rank them • Wilson 2000: alternative way of building similarity relations into phonology
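The inference from similarity to ranking can be sketched as a projection from distances to strata: the larger the perceived difference a change introduces, the higher the correspondence constraint that penalizes it, and equal distances leave the constraints unranked. The numeric distances below are hypothetical placeholders; only their relative order is taken from the slides.

```python
from typing import Dict, List

# Hypothetical perceptual distances in arbitrary units (order is what matters).
FINAL_REPAIRS: Dict[str, float] = {
    "IDENT[voice]/_#": 1.0,  # D(T-D/_#)
    "MAX-C": 3.0,            # D(C-Ø)
    "DEP-V": 3.0,            # D(V-Ø)
}

PLACE: Dict[str, float] = {
    "IDENT[place]/oral": 2.0,   # D(b-d)
    "IDENT[place]/nasal": 1.0,  # D(m-n)
}


def strata_from_distances(distance: Dict[str, float]) -> List[List[str]]:
    """Group constraints into strata; a larger distance means a higher rank."""
    levels = sorted(set(distance.values()), reverse=True)
    return [sorted(c for c, d in distance.items() if d == level)
            for level in levels]
```

For `FINAL_REPAIRS` this gives MAX C and DEP V together in the top stratum and Ident [voice] in final position below them, matching the hidden ranking for final devoicing. For `PLACE` it ranks Ident place in oral C above Ident place in nasal C.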

Expectations of universality? • Some similarity rankings should be constant across languages: • if based on inherent asymmetries in cue distribution between contexts: e.g. C//V vs. C/¬(//V) • No reason to expect ATB universality: (a) VNT vs. VND in Romanian (gradient post-nasal voicing) vs. VNT [ṼT] vs. VND [VND] in English (b) [±stress] diff. in Spanish vs. French (Dupoux et al. 1999)

The real expectation • Judgments of relative similarity should correlate with choices of phonotactic repair. • And, if the similarity judgment is cross-linguistically constant, then choice of repair strategy should be too.

Sources of similarity data • Overt judgments (Mohr & Wang 1968; Singh 1970; Magen 1998; Fleischhacker 2000…) • Confusion, in noise and in quiet (Miller & Nicely 1956…) • Speeded discrimination tasks (Seo 2001…) • Similarity judgments implicit in choice of • half-rhymes (“time-nine”: Zwicky 1976 CLS, Steriade & Zhang 2001) • imperfect puns (“shrubs to gardener: eucalyptus!” Zwicky archive; Fleischhacker to appear)

Evidence for choice of repair? • Phonotactic systems: lexically manifest alternations • Phonotactic free-lancing: (on-line) adaptation • Correlate these choices with similarity rankings: • Results partially diverge: greater uniformity of choice in free-lancing; better fit with similarity ranking in free-lancing. • Phonology is unnatural. (Anderson LI 1989) • Lexically entrenched phonotactic systems are unnatural.

*NC̥: no nasal + voiceless (Pater 1995) • All but (g) are attested in phonotactic systems. • Only (a) is robustly attested in free-lancing.

Satisfying *NC̥ in Bantu: OshiKwanyama (Steinbergs 1983 SAL) (a) Native lexicon • Roots: Nasal followed by voiced C only: kombo ‘goat’, no *kompo • Prefix+Root: Merger: oku-pota ‘be rude’, on-pote -> [omote] ‘good for nothing’ (b) Loans • Roots: Postnasal voicing: stamp -> [sitamba], print -> [pelenda], ink -> [o-iŋga] • Prefix+Root: Postnasal voicing: papier (Afrikaans) -> [om-bapila], kɛrk (Afr.) -> [oŋ-geleka]

Similar split in • Lumasaaba (Brown 1968): • NC induces C deletion in old alternations • Young speakers substitute Post N voicing alternations, via dialect borrowing. • Cephalonian Greek: loans from Romance • Mazateco: loans from Spanish

Verify cross-process rankings • If correspondence rankings derive from the P-map, they should be the same across distinct phonological processes, in all languages (if the same similarity rankings obtain). • Ranking for final devoicing: MAX C >> Ident [±voice] • This ranking is contradicted by some *NC systems (e.g. C-deletion: Ident [±voice] >> MAX C) • But confirmed by all free-lance solutions to *NC: MAX C >> Ident [±voice]

One can infer, then • Free-lance solutions to *NC are based on fixed correspondence rankings with a constant source. • But systems of alternation are affected by additional forces. • Telescoped series of sound changes?

Historical source of *NC̥-induced merger • Lynch (1975, OcLx) reconstruction of Proto-Oceanic: • I Prefix (na-ka) • II V-loss (nka) • III Assimilation (ŋka) • IV Post-nasal voicing (ŋga) • V Extension of nasal phase (ŋa) • End result: ka - ŋa rather than ka - ga alternations • Perhaps same scenario in Bantu etc.

Burning question to Kie Zuraw • What interaction of grammar and lexicon can generate dissimilar alternants through successive sound changes?

Another burning question • How does the learner reconcile the conflicting correspondence hierarchies? • Established lexical stock: NC merger: Ident [voice] >> Uniformity • Loans: Post-N voicing: Uniformity >> Ident voice

Narrow lexical override • Technically: constraint indexing (Fukazawa 1998 ROA): Ident [voice] (I-native O (list)) >> Uniformity >> Ident voice (I-O) • Lexical evidence is narrowly construed: as bearing only on the phonology of lexical classes where the evidence originates • Similarity evidence is broadly construed: as bearing potentially on all phonological patterns
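The indexed hierarchy can be sketched as a single ranking in which the top constraint is simply vacuous for forms outside the listed native class. The candidate names, the violation assignments, and the assumption that both candidates already satisfy the ban on nasal + voiceless sequences are illustrative choices made here, not taken from Fukazawa's proposal.

```python
from typing import Dict, List, Tuple

# Ident [voice] (native list) >> Uniformity >> Ident voice (general)
RANKING: List[str] = ["IDENT[voice]-native", "UNIFORMITY", "IDENT[voice]"]
CANDIDATES: List[str] = ["merger", "postnasal_voicing"]


def violations(candidate: str, is_native: bool) -> Dict[str, int]:
    """Assumed violation marks for repairs of a nasal + voiceless input."""
    marks: Dict[str, int] = {}
    if candidate == "merger":              # e.g. on-pote -> [omote]
        marks["UNIFORMITY"] = 1
    elif candidate == "postnasal_voicing": # e.g. stamp -> [sitamba]
        marks["IDENT[voice]"] = 1
        if is_native:
            # The indexed constraint only sees items on the native list.
            marks["IDENT[voice]-native"] = 1
    return marks


def repair(is_native: bool) -> str:
    """Winning repair under the single indexed ranking."""
    def profile(cand: str) -> Tuple[int, ...]:
        marks = violations(cand, is_native)
        return tuple(marks.get(c, 0) for c in RANKING)
    return min(CANDIDATES, key=profile)
```

One ranking thus gives merger for the established lexical stock and post-nasal voicing for loans. The lexically supported pattern stays confined to the listed class, while the general portion of the hierarchy (Uniformity >> Ident voice) decides everything else.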

Half Rhymes (HR) as evidence for similarity ranking • Fact: some HR’s are more frequent than others (time-nine vs. fab-glad; raised-days vs. raised-raids) • H1: Frequent HR’s are closer to identity • H2: similarity judgments determining HR choice = those determining choice of repair strategy • (H3: HR choice is governed by a linguistic system of ranked correspondence constraints.)

Initial questions • Does the incidence of feature mismatch in the rhyme domain (RD) depend on context? • E.g. Cab-Cap vs. Cába-Cápa • Does it increase in contexts of reduced perceptibility?

Romanian HR corpus • A translation corpus: 6 rhymed translation texts, mostly from Russian, 1956-1971. • Totals: 693 SR’s / 9791 rhyming pairs. SR frequency: from a high of 18% to a low of .006% • A poetry corpus: 2+ native poets, 1950-1961. • Totals (for 2): 167 SR / 6050 rhyming pairs. SR frequencies: 0.58% and 10% respectively • A poet’s private rhyming dictionary: Mihai Eminescu (ca. 1880) Dicționar de rime

Uniformity of preference? • Do poets (if contemporary, same dialect) share a hierarchy of HR preferences? • Yes: certain HR types occur in all texts: NT-ND (skimb-timp), uNC-ɨNC (skund-rɨnd) • Sparse HR users concentrate on a shared core set. • Liberal HR users augment it with additional types. • Relative frequency of HR types in any given text mirrors, in general, their position on the shared hierarchy: TV-DV implies NT-ND in any text; TV-DV is less frequent than NT-ND
