





Presentation Transcript


  1. Predicting Relative Prominence in Noun-Noun Compounds

• Theories about prominence in noun-noun compounds
  • Structural theory (Bloomfield, 1933; Marchand, 1969; Heinz, 2004): NN compounds are regularly left-prominent; right-prominent NN combinations are syntactic phrases.
  • Analogical theory (Schmerling, 1971; Olsen, 2000): Prominence is assigned in analogy to similar compounds in the lexicon.
  • Semantic theory (Fudge, 1984; Liberman and Sproat, 1992): Relative prominence is decided by the semantic relationship between the two nouns.
  • Informativeness theory (Bolinger, 1972; Ladd, 1984): The relatively more informative and unexpected noun is given greater prominence.
  • Our paper: Compares the informativeness theory and the semantic composition theory using corpus-driven statistical methods in discourse-neutral contexts.

• Abstract
  There are several theories regarding what influences prominence assignment in English noun-noun compounds. We have developed corpus-driven models for automatically predicting prominence assignment in noun-noun compounds using feature sets based on two such theories: the informativeness theory and the semantic composition theory. The evaluation of the prediction models indicates that though both of these theories are relevant, they account for different types of variability in prominence assignment.

• Prosodic Prominence in Text-to-Speech Synthesis
  • Prosody plays a vital role in the intelligibility and naturalness of text-to-speech synthesis.
  • Prosody prediction involves predicting which words of the text are to be perceptually prominent.
  • Prominence of a word is acoustically realized by endowing the synthesis of the word with greater pitch, greater energy, and/or longer duration than the neighboring words.
  • Relative prominence prediction in noun-noun compounds remains a challenging problem.
• Prominence in Noun-Noun Compounds
  • Example noun-noun compounds and their discourse-neutral prominence structure: White House, cherry pie, parking lot, Madison Avenue, Wall Street, nail polish, French fries, computer programmer, dog catcher, silk tie, and self reliance.
  • In a discourse-neutral context, noun-noun compounds have a leftmost prominence structure – the left noun is more prominent than the right noun.
  • However, 25% of noun-noun compounds have a right prominence structure (Liberman and Sproat, 1992).
  • Different theories about relative prominence assignment in noun-noun compounds exist.

• Informativeness (INF) Measures
  We used the following five metrics to compare the individual and relative informativeness of the nouns n1 and n2 in each noun-noun compound:
  • Unigram Predictability (UP): Logarithm of the probability of a word given a text corpus: UP(w) = log p(w).
  • Bigram Predictability (BP): Logarithm of the conditional probability of the second noun given the first noun: BP = log p(n2 | n1).
  • Pointwise Mutual Information (PMI): Logarithm of the ratio of the joint probability of the two nouns to the product of their marginal probabilities: PMI = log [ p(n1, n2) / (p(n1) p(n2)) ].
  • Dice Coefficient (DC): A collocation measure defined as DC = 2 C(n1, n2) / (C(n1) + C(n2)), where C(·) denotes corpus counts.
  • Pointwise Kullback-Leibler Divergence (PKL): Relative entropy of the second noun given the first noun: PKL = p(n2 | n1) log [ p(n2 | n1) / p(n2) ].

• Semantic Relationship Modeling
  • Each of the two nouns in a noun-noun compound is assigned a semantic category vector.
  • Semantic category vector (SCV): 26 elements representing categories (such as food, event, act, location, artifact) assigned to nouns in WordNet.
  • The SCV of a noun is a 26-dimensional bit vector with an element set to 1 if the lemmatized noun is assigned the associated category by WordNet, and 0 otherwise.
  • The semantic relationship features (SRF) of two nouns are defined as the cross-product of their semantic category vectors.
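As an illustration, the INF measures and the SCV cross-product described above can be sketched in Python. This is a minimal sketch, not the paper's implementation: the count dictionaries stand in for statistics estimated from the LDC Gigaword corpus, and the category list reflects the 26 WordNet noun lexicographer categories.

```python
import math
import numpy as np

# The 26 WordNet noun lexicographer (supersense) categories.
CATEGORIES = [
    "Tops", "act", "animal", "artifact", "attribute", "body", "cognition",
    "communication", "event", "feeling", "food", "group", "location",
    "motive", "object", "person", "phenomenon", "plant", "possession",
    "process", "quantity", "relation", "shape", "state", "substance", "time",
]

def inf_measures(n1, n2, uni, bi, n_uni, n_bi):
    """Five INF measures from unigram/bigram counts (uni, bi) and
    corpus sizes (n_uni, n_bi) for the compound n1 n2."""
    p1, p2 = uni[n1] / n_uni, uni[n2] / n_uni
    p12 = bi[(n1, n2)] / n_bi
    p2g1 = bi[(n1, n2)] / uni[n1]          # p(n2 | n1), MLE approximation
    return {
        "UP1": math.log(p1),               # unigram predictability of n1
        "UP2": math.log(p2),               # unigram predictability of n2
        "BP": math.log(p2g1),              # bigram predictability
        "PMI": math.log(p12 / (p1 * p2)),  # pointwise mutual information
        "DC": 2 * bi[(n1, n2)] / (uni[n1] + uni[n2]),  # Dice coefficient
        "PKL": p2g1 * math.log(p2g1 / p2), # pointwise KL divergence
    }

def scv(noun_cats):
    """26-dim bit vector: 1 where WordNet assigns the category to the noun."""
    v = np.zeros(len(CATEGORIES), dtype=int)
    for c in noun_cats:
        v[CATEGORIES.index(c)] = 1
    return v

def srf(v1, v2):
    """Semantic relationship features: cross-product of the two SCVs,
    flattened to a 26 x 26 = 676-dimensional binary vector."""
    return np.outer(v1, v2).ravel()
```

The SRF cross-product pairs every category of the left noun with every category of the right noun, so the classifier can learn, for example, that artifact-substance pairings behave differently from person-location pairings.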
• Semantic Informativeness Features (SIF)
  We also maintain, for each noun:
  • The number of possible synsets associated with the noun
  • Left positional family size
  • Right positional family size
  • Positional family size is the number of unique noun-noun compounds that include the particular noun, either on the left or on the right (Bell and Plag, 2010).
  • Intuition:
    • The smaller the synset count, the more specific the meaning of a noun, and hence the greater its information content.
    • A larger positional family size indicates that the noun occurs in many possible compounds and is less likely to receive higher prominence.

• Experiments
  • Data description:
    • Corpus of 7767 noun-noun compounds randomly selected from the Associated Press newswire, hand-labeled for left or right prominence (Sproat, 1994).
    • Informativeness features for each word were computed using the LDC English Gigaword corpus.
    • Semantic category vectors for each noun were constructed using WordNet.
  • Using each of the three feature sets, we built a BoosTexter-based discriminative binary classifier (Freund and Schapire, 1996) to predict relative prominence.
  • Training data: 6835 samples; test data: 932 samples.
  • Evaluation: Average prominence prediction error using 5-fold cross-validation.
  • Baseline: Assign the majority class (left noun prominence) to all test samples.

• Results: Prominence prediction without lexical information
  • Each type of feature reduces the error rate over the baseline.
  • SRF and INF features appear to be more predictive than SIF features.
  • The overall reduction can be as large as 32% over the baseline error when all features are combined.

• Results: Prominence prediction with lexical information
  • Incorporating lexical information provides substantial improvement: more than 52% error reduction over the baseline error.
  • (Sproat, 1994): Relative error reduction over baseline using SIF: 46.6%.

• Summary
  • Presented a comparison of two theories of prominence in noun-noun compounds using data-driven methods.
  • Each theory accounts for different types of variability in prominence assignment.
  • Lexical information improves prominence prediction substantially over baseline models.
  • Non-lexical models have broader coverage and still provide significant error reduction.
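The classification setup above can be sketched end-to-end. BoosTexter itself is not available as a Python library, so this sketch substitutes scikit-learn's AdaBoost over decision stumps (the same boosting family); the feature matrix and labels are synthetic placeholders, not the paper's data.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)

# Placeholder features: in the paper each row would concatenate the INF,
# SRF, and SIF features for one noun-noun compound.
# Labels: 0 = left prominent, 1 = right prominent.
X = rng.normal(size=(1000, 10))
y = (X[:, 0] > 0.5).astype(int)   # synthetic, learnable labels

X_train, X_test = X[:800], X[800:]
y_train, y_test = y[:800], y[800:]

# Boosted decision stumps, a stand-in for the BoosTexter classifier.
clf = AdaBoostClassifier(n_estimators=100).fit(X_train, y_train)
error = 1.0 - clf.score(X_test, y_test)

# Majority-class baseline: always predict the more frequent class
# (left prominence in the paper's data).
majority = np.bincount(y_train).argmax()
baseline_error = np.mean(y_test != majority)
```

Comparing `error` against `baseline_error` mirrors the paper's evaluation: the reported error reductions (32% without lexical features, over 52% with them) are relative to this majority-class baseline.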
