HUMANITIES COMPUTING D: LEXICOGRAPHY AND COMPUTERS
Lexical semantics, thesauri, WordNet
LEXICAL SEMANTICS
• In lecture 2 we began discussing how contemporary dictionaries characterize the meaning of words
• In this lecture we look at these definitions in more detail, and at other kinds of dictionaries that try to characterize word meanings more precisely: thesauri and WordNet
TYPES OF DEFINITION IN A DICTIONARY
• GENUS AND DIFFERENTIA:
  • “stating the superordinate concept next to the definiendum together with at least one distinctive feature”
• SYNONYMY
• TYPICALITY
• USE
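GENUS AND DIFFERENTIA
horse noun 1 a solid-hoofed plant-eating domesticated mammal with a flowing mane and tail, used for riding, racing, and to carry and pull loads
New Oxford Dictionary of English
• GENUS: mammal
• DIFFERENTIAE: solid-hoofed, plant-eating, domesticated, with a flowing mane and tail, used for riding, racing, and to carry and pull loads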
LIMITS OF DEFINITION BY GENUS & DIFFERENTIA (lecture 2)
• Putnam:
  • ‘beech’ / ‘elm’
  • ‘diamond’ / ‘zircon’
• Jackson: happen vs occur vs befall vs transpire
• Everything Is Illuminated: ‘harmonize’ vs ‘agree’
TYPES OF DEFINITION IN A DICTIONARY
• GENUS AND DIFFERENTIA
• SYNONYMY
  • Many words, especially abstract ones, are hard to define analytically
  • In such cases synonyms are used instead
• TYPICALITY
• USE
DEFINITION BY SYNONYMY: CIRCULARITY
miserable 1 very unhappy, wretched 2 causing misery 3 squalid 4 mean
unhappy 1 sad or depressed 2 unfortunate or wretched
wretched 1 miserable or unhappy 2 worthless
Collins Pocket English Dictionary (2000)
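The circularity in these three entries can be made explicit by treating each entry as a node and each synonym it is defined by as an outgoing edge. The sketch below is only an illustration under that assumption: the dictionary `DEFINED_BY` and the function `find_cycle` are names invented here, and the data covers only the cross-references among the three headwords shown on the slide.

```python
# Toy data: which of the three headwords each entry uses in its definition.
# Only references among the headwords shown on the slide are recorded.
DEFINED_BY = {
    "miserable": ["unhappy", "wretched"],
    "unhappy": ["wretched"],
    "wretched": ["miserable", "unhappy"],
}


def find_cycle(start, graph):
    """Return a path that leaves `start` and comes back to it, or None."""
    stack = [(start, [start])]
    while stack:
        word, path = stack.pop()
        for nxt in graph.get(word, []):
            if nxt == start:
                # We are back at the headword: the definition is circular.
                return path + [start]
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return None


for headword in DEFINED_BY:
    print(headword, "->", find_cycle(headword, DEFINED_BY))
```

Every headword in this toy graph lies on a cycle, so a reader who knows none of the three words cannot break into the definitions from any of them.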
TYPES OF DEFINITION IN A DICTIONARY
• GENUS AND DIFFERENTIA
• SYNONYMY
• TYPICALITY
  • The definition specifies what is “typical” of the referent
• USE
DEFINITION BY TYPICALITY
day of rest: a day set aside from normal activity, typically Sunday on religious grounds
measles: an infectious viral disease causing fever and a red rash, typically occurring in childhood
Concise Oxford Dictionary
TYPES OF DEFINITION IN A DICTIONARY
• GENUS AND DIFFERENTIA
• SYNONYMY
• TYPICALITY
• USE
  • The definition explains how a word is used
  • Typical especially of function words (articles, prepositions, etc.)
MEANING RELATIONS
• Many of these definitions fix the meaning of a word through meaning relations with other words:
  • HYPONYMY: dog / animal
  • SYNONYMY: fool / idiot
  • ANTONYMY: right / wrong
  • MERONYMY: mane / horse
HYPONYMY
• HYPONYMY is the relation between a subclass and a superclass:
  • CAR and VEHICLE
  • DOG and ANIMAL
  • BUNGALOW and HOUSE
• Generally speaking, a hyponymy relation holds between X and Y whenever it is possible to substitute Y for X:
  • That is an X -> That is a Y
  • E.g., That is a CAR -> That is a VEHICLE
• HYPERNYMY is the inverse relation
SYNONYMY
• Two words are SYNONYMS if they have the same meaning in at least some contexts
  • E.g., PRICE and FARE; CHEAP and INEXPENSIVE; LAPTOP and NOTEBOOK; HOME and HOUSE
  • I’m looking for a CHEAP FLIGHT / INEXPENSIVE FLIGHT
• From Roget’s thesaurus:
  • OBLITERATION, erasure, cancellation, deletion
• But few words are truly synonymous in ALL contexts:
  • I wanna go HOME / ?? I wanna go HOUSE
  • The flight was CANCELLED / ?? OBLITERATED / ??? DELETED
ANTONYMY
• The antonymy relation links lemmas with opposite meanings:
  • right / wrong; small / large
• Sometimes also ‘extended’ antonymy:
  • left / right; dog / cat
ANTONYMY
artificial: not real
conventional: not spontaneous or sincere or original
vacant: not occupied
Concise Oxford Dictionary 9
MERONYMY
• The relation between parts and the whole:
  • mane / horse; wheel / car
MERONYMY IN DEFINITIONS
horse noun 1 a solid-hoofed plant-eating domesticated mammal with a flowing mane and tail, used for riding, racing, and to carry and pull loads
New Oxford Dictionary of English
• HYPERNYM: mammal
• PARTS: hoof (in “solid-hoofed”), mane, tail
HOW MANY SENSES?
horse noun
• 1 a solid-hoofed plant-eating domesticated mammal with a flowing mane and tail, used for riding, racing, and to carry and pull loads
  • Equus caballus, family Equidae (the horse family), descended from the wild Przewalski’s horse. The horse family also includes the asses and zebras.
  • an adult male horse; a stallion or gelding
  • a wild mammal of the horse family
• 2 a frame or structure on which something is mounted or supported, especially a sawhorse
• 3 [mass noun] informal heroin
• 4 informal a unit of horsepower: the huge 63-horse 701-cc engine
• 5 Mining an obstruction in a vein
New Oxford Dictionary of English
HOW MANY SENSES?
horse n
1 a domesticated perissodactyl mammal, Equus caballus, used for draught work and riding: family Equidae
2 the adult male of this species; stallion
3 wild horse. 3a a horse (Equus caballus) that has become feral. 3b another name for Przewalski’s horse
4a any other member of the family Equidae, such as the zebra or ass. 4b (as modifier): the horse family
5 (functioning as pl) horsemen, especially cavalry: a regiment of horse
6 Also called: buck. Gymnastics. a padded apparatus on legs, used for vaulting, etc
7 a narrow board supported by a pair of legs at each end, used as a frame for sawing or as a trestle, barrier, etc
8 a contrivance on which a person may ride and exercise
9 a slang word for heroin
10 Mining. a mass of rock within a vein of ore
11 Nautical. a rod, rope or cable, fixed at the ends, along which something may slide by means of a thimble, shackle, or other fitting; traveller
12 Chess. an informal name for knight
13 Informal. short for horsepower
14 (modifier) drawn by horse or horses: a horse cart
Collins English Dictionary 4
HOMONYMY AND POLYSEMY
• HOMONYMY: the senses are clearly distinct (e.g., different etymologies)
  • BANK
  • Italian SCANNARE as ‘to cut to pieces’ / as an Italianization of TO SCAN; Italian GRU as a bird (crane) / as a machine for lifting weights (crane)
• POLYSEMY: the senses are related
  • MOUTH
  • Italian VERDE (‘green’) as ‘having a certain colour’ and as ‘rich in vegetation’
HOW MANY SENSES?
“The ‘lumpers’ like to lump meanings together and leave the user to extract the nuance of meaning that corresponds to a particular context, whereas the ‘splitters’ prefer to enumerate differences of meaning in more detail; the distinction corresponds to that between summarizing and analysing.”
Allen, R. Lumping and splitting. English Today, 16(4), 61-3
CRITERIA?
• GRAMMATICAL
  • Nominal vs verbal senses
  • Transitive & intransitive uses (Hirst, 1987)
    • Ross KEPT staring at Nadia’s decolletage
    • Nadia KEPT calm and made a cutting remark
    • Ross wrote of his embarrassment in the diary that he KEPT
• COLLOCATIONS
  • isometric, from CED4:
    • (of a crystal or system of crystallization) having three mutually perpendicular equal axes
    • (of a method of projecting a drawing in three dimensions) having the three axes equally inclined and all lines drawn to scale
• ETYMOLOGY
PROBLEMS
• Already mentioned: sense distinctions are not always easy to draw
• Circularity
• Relations are not used consistently
SEMANTICS & LEXICON: A SUMMARY
[Diagram with three columns]
• WORD-FORMS: “eat”, “eats”, “ate”, “eaten”
• LEXEMES: EAT-LEX-1
• SENSES: eat0600, eat0700
THE ORGANIZATION OF THE LEXICON
[Diagram with three columns]
• WORD-FORMS: “stock”
• LEXEMES: STOCK-LEX-1, STOCK-LEX-2, STOCK-LEX-3
• SENSES: stock0100, stock0200, stock0600, stock0700, stock0900, stock1000
SYNONYMY
[Diagram with three columns]
• WORD-FORMS: “cheap”, “inexpensive”
• LEXEMES: CHEAP-LEX-1, CHEAP-LEX-2, INEXP-LEX-3
• SENSES: cheap0100, …, cheapXXXX, inexp0900, inexpYYYY
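The three-layer organization (word-forms, lexemes, senses) can be sketched as two lookup tables plus a set of synonymy links between senses. This is only one possible encoding, not the one used in the lecture. The names `FORM_TO_LEXEMES`, `LEXEME_TO_SENSES`, `SYNONYM_LINKS`, `senses_of` and `synonymous` are invented here, the assignment of senses to the CHEAP and INEXP lexemes is an assumption, and so is the link between the placeholder senses cheapXXXX and inexpYYYY.

```python
# Layer 1 -> layer 2: each word-form points to the lexemes it can realize.
FORM_TO_LEXEMES = {
    "eat": ["EAT-LEX-1"],
    "eats": ["EAT-LEX-1"],
    "ate": ["EAT-LEX-1"],
    "eaten": ["EAT-LEX-1"],
    "cheap": ["CHEAP-LEX-1", "CHEAP-LEX-2"],
    "inexpensive": ["INEXP-LEX-3"],
}

# Layer 2 -> layer 3: each lexeme points to its senses.
# ASSUMPTION: how the cheap/inexp senses split across lexemes is not
# recoverable from the slide; this split is purely illustrative.
LEXEME_TO_SENSES = {
    "EAT-LEX-1": ["eat0600", "eat0700"],
    "CHEAP-LEX-1": ["cheap0100"],
    "CHEAP-LEX-2": ["cheapXXXX"],
    "INEXP-LEX-3": ["inexp0900", "inexpYYYY"],
}

# ASSUMPTION: synonymy is recorded as a link between two senses.
SYNONYM_LINKS = {frozenset({"cheapXXXX", "inexpYYYY"})}


def senses_of(form):
    """Collect every sense reachable from a word-form via its lexemes."""
    return {
        sense
        for lexeme in FORM_TO_LEXEMES.get(form, [])
        for sense in LEXEME_TO_SENSES.get(lexeme, [])
    }


def synonymous(form_a, form_b):
    """True if some sense of form_a is linked to some sense of form_b."""
    return any(
        frozenset({a, b}) in SYNONYM_LINKS
        for a in senses_of(form_a)
        for b in senses_of(form_b)
    )


print(sorted(senses_of("ate")))
print(synonymous("cheap", "inexpensive"))
```

Recording synonymy between senses rather than between word-forms matches the earlier observation that few words are synonymous in all contexts: two forms count as synonyms as soon as one pair of their senses is linked.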
THESAURI
• Dictionaries organized by topic appeared at the same time as alphabetically organized ones (Ælfric: Glossary, ~1000)
• The most famous thematic dictionary: Peter Mark Roget, Thesaurus of English Words and Phrases, first published in 1852
ROGET’S THESAURUS: CLASSES
• ABSTRACT RELATIONS
  • Sections: Existence, relation, quantity, order, number, time, change, causation
• SPACE
• MATTER
• INTELLECT
• VOLITION
• AFFECTIONS
ROGET’S THESAURUS: SECTIONS & WORD SETS
• ABSTRACT RELATIONS
  • …. IV. ORDER
    • 1. GENERAL: 58 Order, 59 Disorder, 60 Arrangement, 61 Derangement
    • 2. CONSECUTIVE: 62 Precedence, 63 Sequence, 64 Precursor, 65 Sequel, 66 Beginning, 67 End, 68 Middle
WORDNET
• A lexical database created at Princeton
• Freely available for research from the Princeton site
  • http://www.cogsci.princeton.edu/~wn/
• Information about a variety of SEMANTIC RELATIONS
• Three sub-databases (a division supported by psychological research as early as Fillenbaum and Jones, 1965):
  • NOUNS
  • VERBS
  • ADJECTIVES and ADVERBS
• Each database is organized around SYNSETS
SYNSETS
• Senses (or ‘lexicalized concepts’) are represented in WordNet by the set of words that can be used in AT LEAST ONE CONTEXT to express that sense / lexicalized concept: the SYNSET
• E.g., {chump, fish, fool, gull, mark, patsy, fall guy, sucker, shlemiel, soft touch, mug} (gloss: person who is gullible and easy to take advantage of)
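A synset can be pictured as an unordered set of word forms plus a gloss, with an index from each word form to the synsets it belongs to. The sketch below is a toy model, not WordNet's actual storage format. The class `Synset`, the function `build_index` and the second synset for "fish" (including its gloss) are invented here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    words: frozenset  # word forms that can express this sense
    gloss: str        # informal definition of the sense


# The synset quoted on the slide.
GULLIBLE_PERSON = Synset(
    words=frozenset({
        "chump", "fish", "fool", "gull", "mark", "patsy",
        "fall guy", "sucker", "shlemiel", "soft touch", "mug",
    }),
    gloss="person who is gullible and easy to take advantage of",
)

# HYPOTHETICAL second synset, made up only to show that one word form
# can belong to several synsets (one per sense).
AQUATIC_ANIMAL = Synset(
    words=frozenset({"fish"}),
    gloss="(invented gloss) an animal that lives in water",
)


def build_index(synsets):
    """Map each word form to the list of synsets that contain it."""
    index = {}
    for synset in synsets:
        for word in synset.words:
            index.setdefault(word, []).append(synset)
    return index


INDEX = build_index([GULLIBLE_PERSON, AQUATIC_ANIMAL])
for synset in INDEX["fish"]:
    print("fish:", synset.gloss)
```

In this picture the number of senses of a word form is simply the number of synsets it appears in.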
THE NOUN DATABASE
• About 90,000 forms, 116,000 senses
• Relations:
HYPERNYMY
2 senses of robin
Sense 1
robin, redbreast, robin redbreast, Old World robin, Erithacus rubecola -- (small Old World songbird with a reddish breast)
  => thrush -- (songbirds characteristically having brownish upper plumage with a spotted breast)
    => oscine, oscine bird -- (passerine bird having specialized vocal apparatus)
      => passerine, passeriform bird -- (perching birds mostly small and living near the ground with feet having 4 toes arranged to allow for gripping the perch; most are songbirds; hatchlings are helpless)
        => bird -- (warm-blooded egg-laying vertebrates characterized by feathers and forelimbs modified as wings)
          => vertebrate, craniate -- (animals having a bony or cartilaginous skeleton with a segmented spinal column and a large brain enclosed in a skull or cranium)
            => chordate -- (any animal of the phylum Chordata having a notochord or spinal column)
              => animal, animate being, beast, brute, creature, fauna -- (a living organism characterized by voluntary movement)
                => organism, being -- (a living thing that has (or can develop) the ability to act or function independently)
                  => living thing, animate thing -- (a living (or once living) entity)
                    => object, physical object --
                      => entity, physical thing --
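The hypernym chain printed above can be followed mechanically: each synset points to its immediate hypernym, and hyponymy holds between X and Y whenever Y is reached by climbing from X. The sketch below hand-copies the chain for sense 1 of robin into a plain dictionary, using the first word of each synset as its label. The names `HYPERNYM`, `hypernym_chain` and `is_hyponym_of` are invented here, and the single-parent dictionary is a simplification.

```python
# Immediate hypernym of each synset in the robin chain shown above.
# Each synset is labelled by its first word only.
HYPERNYM = {
    "robin": "thrush",
    "thrush": "oscine",
    "oscine": "passerine",
    "passerine": "bird",
    "bird": "vertebrate",
    "vertebrate": "chordate",
    "chordate": "animal",
    "animal": "organism",
    "organism": "living thing",
    "living thing": "object",
    "object": "entity",
}


def hypernym_chain(word):
    """Return all hypernyms of `word`, from the nearest to the root."""
    chain = []
    while word in HYPERNYM:
        word = HYPERNYM[word]
        chain.append(word)
    return chain


def is_hyponym_of(x, y):
    """'That is an X' -> 'That is a Y' holds if Y is above X in the chain."""
    return y in hypernym_chain(x)


print(" => ".join(["robin"] + hypernym_chain("robin")))
print(is_hyponym_of("robin", "bird"))
print(is_hyponym_of("bird", "robin"))
```

The asymmetry of the last two calls mirrors the substitution test for hyponymy: a robin is a bird, but a bird need not be a robin.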
MERONYMY
wn beak -holon
Holonyms of noun beak
1 of 3 senses of beak
Sense 2
beak, bill, neb, nib
  PART OF: bird
VERBS
• About 10,000 forms, 20,000 senses
• Relations between verb meanings:
RELATIONS BETWEEN VERB MEANINGS
• V1 ENTAILS V2 when ‘Someone V1’ (logically) entails ‘Someone V2’
  • e.g., snore entails sleep
• V1 is a TROPONYM of V2 when ‘to do V1’ is ‘to do V2 in some manner’
  • e.g., limp is a troponym of walk
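The two verb relations can be sketched as small lookup tables over the slide's examples. The names `ENTAILS`, `TROPONYM_OF` and `entails` are invented here, and treating a troponym as also entailing its more general verb (limping entails walking) is an assumption of this sketch, not something stated on the slide.

```python
# Direct entailments between verbs: 'Someone V1' entails 'Someone V2'.
ENTAILS = {"snore": {"sleep"}}

# Troponymy: 'to do V1' is 'to do V2 in some manner'.
TROPONYM_OF = {"limp": "walk"}


def entails(v1, v2):
    """True if 'Someone V1' entails 'Someone V2' in the toy data."""
    if v2 in ENTAILS.get(v1, set()):
        return True
    # ASSUMPTION: doing V2 in a particular manner implies doing V2,
    # so a troponym entails the verb it is a troponym of.
    return TROPONYM_OF.get(v1) == v2


print(entails("snore", "sleep"))
print(entails("limp", "walk"))
print(entails("sleep", "snore"))
```

Both relations are directional: sleeping does not entail snoring, and walking does not entail limping.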
ADJECTIVES & ADVERBS
• About 20,000 adjective forms, 30,000 senses
• 4,000 adverbs, 5,600 senses
• Relations:
HOW TO USE IT
• Online: http://cogsci.princeton.edu/cgi-bin/webwn
• Download it, then from the command line:
  • Get synonyms:
    • wn -synsn bank
  • Get hypernyms:
    • wn -hypen robin
  • Get antonyms (also for adjectives and verbs):
    • wn -antsa right
THE LIMITS OF WORDNET
• Coverage
  • Words not in WordNet: crocidolite, spinoff (spin-off)
• Missing information: MERONYMY
• Context-dependent senses:
  • slump, crash, bust are all synonyms in the WSJ corpus
• The structure of WordNet
  • Some information is encoded in complex ways (room, wall, floor)
• But: MOVING TARGET!!
MERONYMY IN WORDNET: AN EXPERIMENT
• 100 bridging descriptions (BDs) in a mereological relation
• Ran a script trying to find a direct link in WordNet (1.7) between one of the senses of the BD and one of the senses of any of the previous NPs
• Results: in only 6 cases does WordNet contain a direct lexical relation between a BD and one of the CFs
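The kind of check the script performs can be sketched over toy data: for each bridging description, try every pairing of one of its senses with a sense of a preceding noun phrase and see whether a PART OF link connects them directly. This is not the original script. The names `SENSES`, `PART_OF` and `direct_meronymy_link` are invented here, the sense labels are hypothetical, and only the link from sense 2 of beak to bird is taken from the wn output shown earlier. The absence of a room / wall link in the toy data is an assumption chosen to illustrate a miss.

```python
# Toy sense inventory. Sense labels such as "bird#1" are hypothetical.
SENSES = {
    "beak": ["beak#1", "beak#2", "beak#3"],
    "bird": ["bird#1"],
    "wall": ["wall#1"],
    "room": ["room#1"],
}

# Direct PART OF links between senses (part, whole).
# Only beak sense 2 -> bird comes from the wn output; nothing else is listed.
PART_OF = {("beak#2", "bird#1")}


def direct_meronymy_link(bd, previous_nps):
    """Return the first (part sense, whole sense) pair directly linked, or None."""
    for np_head in previous_nps:
        for bd_sense in SENSES.get(bd, []):
            for np_sense in SENSES.get(np_head, []):
                if (bd_sense, np_sense) in PART_OF:
                    return (bd_sense, np_sense)
    return None


# Each case: head noun of the bridging description, heads of the previous NPs.
cases = [("beak", ["bird"]), ("wall", ["room"])]
hits = sum(direct_meronymy_link(bd, nps) is not None for bd, nps in cases)
print(f"{hits} of {len(cases)} bridging descriptions have a direct link")
```

A check of this kind only finds links stored directly between two senses, so relations that the database encodes indirectly count as misses.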
SOLUTION: LEXICAL ACQUISITION
• Partial (add information to WordNet, especially for specialist domains)
• Total (build a new lexicon from scratch)
READINGS
• Jackson, ch. 8
• C. Fellbaum. WordNet: An Electronic Lexical Database. MIT Press, 1998, ch. 1