An example of a good translation

Presentation Transcript


  1. An example of a good translation • En inbyggd oljepump levererar olja under tryck både till hydraulsystemet och växellådans oljesystem. • An integrated oil pump delivers pressurised fluid both to the hydraulic system and to the lubrication system of the gearbox. @ Anna Sågvall Hein 2005

  2. An example of a bad translation • Stackars Kalle var rädd. -> • Wretched cold each cautious.

  3. Fundamental problems in MT • lexical ambiguity in SL • translation ambiguity • grammatical differences between SL and TL

  4. Lexical ambiguity in SL • form • hus (sg/pl, basic/genitive case) • part-of-speech • var (verb, pronoun, adverb, noun) • polysemy • fil (milk, tool, traffic lane)

  5. Handling form and part-of-speech ambiguity in SL • Syntactic analysis of the input sentence • Han köpte ett nytt hus (sg, basic case). • Stackars Kalle var (verb) rädd.

  6. Handling polysemy • domain • context • rules based on grammatical analysis • anta en elev (obj) -> admit a student • anta, att (compl) -> suppose that • examples

  7. Handling translation ambiguity • rules based on grammatical analysis • bilen på gatan -> the car on the street (locative attribute) • taket på huset -> the roof of the house (part-whole attribute) • examples
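
The rules on slides 6 and 7 pick a translation from the grammatical relation that analysis assigns to a word. A minimal sketch of that idea follows, assuming the analysis has already produced a relation label. The table layout, the function name `choose_translation`, and the labels `locative` and `part-whole` are illustrative assumptions, not taken from any particular system.

```python
# Hypothetical rule tables: (SL word, grammatical relation) -> TL word.
# The relation labels are illustrative names, not an established tag set.
VERB_RULES = {
    ("anta", "obj"): "admit",      # anta en elev -> admit a student
    ("anta", "compl"): "suppose",  # anta, att ... -> suppose that ...
}
PREPOSITION_RULES = {
    ("på", "locative"): "on",      # bilen på gatan -> the car on the street
    ("på", "part-whole"): "of",    # taket på huset -> the roof of the house
}


def choose_translation(word, relation, rules):
    """Return the TL word for an SL word, given the relation found by analysis."""
    try:
        return rules[(word, relation)]
    except KeyError:
        # No rule covers this combination; a real system would need a fallback.
        raise LookupError(f"no rule for {word!r} with relation {relation!r}") from None


print(choose_translation("anta", "obj", VERB_RULES))
print(choose_translation("på", "part-whole", PREPOSITION_RULES))
```

The hard part in practice is the analysis that yields the relation label; the lookup itself is trivial once that label is available.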

  8. Grammatical differences • morphology • Hon köpte en liten hund. -> Sie hat einen kleinen Hund gekauft. • syntax • Genom att svänga till vänster hittar du huset. -> Turning left you will find the house. • word order • Sedan gick han hem. -> Then he went home.

  9. Handling grammatical differences • syntactic-semantic analysis + transfer rules • deep analysis (interlingua) + generation according to TL grammar • examples

  10. Re-use techniques • sentence alignment • linking source and target sentences pairwise • success rate close to 100 % • translation memories • basis for word alignment

  11. Sentence alignment • I oljefilterhållaren sitter en överströmningsventil. • The oil filter retainer has an overflow valve. • (sventscan3888 1-1) • Undvik hudkontakt med kylvätska. Hudkontakt kan medföra irritation. • Avoid contact with the skin as this may cause irritation. • (sventscan3200 2-1)

  12. Sentence alignment, cont. • Skruvarna sträcks vid varje åtdragning, därför får skruvarna i en del förband återanvändas endast ett visst antal gånger. • Bolts are stretched each time they are tightened. For this reason, the bolts in some joints should only be reused a certain number of times. • (sventscan783 1-2)

  13. Re-use techniques, cont. • word alignment • linking sub-sentence segments, typically source and target words and phrases, pairwise • co-occurrence, word similarity, dictionary • large-scale processing • success rate close to 80 % • translation dictionaries • bi- or multi-lingual term databases • data-driven machine translation

  14. A word alignment example • Jag tar mittplatsen, som jag inte tycker om. • I take the middle seat, which I dislike. • jag – I • tar – take • mittplatsen – the middle seat • som – which • jag – I • inte tycker om – dislike • (from Tiedemann 2003)
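
Slide 13 lists co-occurrence as one clue for word alignment. The sketch below shows one simple co-occurrence measure, the Dice coefficient, computed over a sentence-aligned corpus. The three sentence pairs are invented for illustration, and the function names are assumptions. Word aligners combine several clues, so this is not a description of the method behind the Tiedemann example.

```python
from collections import Counter
from itertools import product

# Toy sentence-aligned corpus, invented for illustration only.
CORPUS = [
    ("jag tar mittplatsen", "i take the middle seat"),
    ("jag tar boken", "i take the book"),
    ("jag läser boken", "i read the book"),
]


def dice_scores(pairs):
    """Dice coefficient 2*c(s,t) / (c(s) + c(t)), counted per sentence pair."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        joint.update(product(s_words, t_words))
    return {
        (s, t): 2 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in joint.items()
    }


def best_link(word, scores):
    """Return the target word with the highest Dice score for a source word."""
    candidates = {t: v for (s, t), v in scores.items() if s == word}
    return max(candidates, key=candidates.get)


scores = dice_scores(CORPUS)
print(best_link("tar", scores), best_link("boken", scores))
```

On a corpus this small, frequent words such as "jag" tie between several candidates ("i" and "the" both occur in every sentence), which is one reason co-occurrence is usually combined with the other clues on the slide, word similarity and dictionaries.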

  15. Evaluation of MT • human • adequacy • acceptance • automatic • comparison with a gold standard • n-gram technique: e.g. BLEU, NEVA • edit distance • See further http://stp.ling.uu.se/~evafo/gslt_eval.pdf (OH-presentation by Eva Forsbom)

  16. Automatic evaluation, ex. 1 • SL: Framställningsmetod och särskild beredningsmetod: En hög kvalitet på råvaran komjölk är viktig för tillverkningen. • MT: Manufacturing method and special manufacturing method: A high quality of the raw material cow's milk is important to the production. • Ref: Specific production or manufacturing method: High-quality cow's milk is important to production. • NEVA: 0.27

  17. Automatic evaluation, ex. 2 • SL: Mjölkråvaran som används för ystning pastöriseras till 72 ºC i 15 sekunder. • MT: The milk that is used for coagulation is pasteurised to 72 ºC for 15 seconds. • Ref: The milk used for coagulation is pasteurised at 72 ºC for 15 seconds. • NEVA: 0.59
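
The two automatic techniques named on slide 15, n-gram comparison and edit distance, can be sketched on the MT and Ref sentences of example 2. The code below computes clipped n-gram precision, the building block that BLEU-style metrics rest on, and a word-level Levenshtein distance. It is not an implementation of BLEU or NEVA and will not reproduce the NEVA scores on the slides. The tokenisation (lower-casing and whitespace splitting) is a simplifying assumption.

```python
from collections import Counter


def tokens(text):
    """Very rough tokenisation: lower-case, drop the final full stop, split on spaces."""
    return text.lower().rstrip(".").split()


def ngrams(words, n):
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_precision(candidate, reference, n):
    """Share of candidate n-grams also found in the reference (counts clipped)."""
    cand = Counter(ngrams(tokens(candidate), n))
    ref = Counter(ngrams(tokens(reference), n))
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return matched / total if total else 0.0


def word_edit_distance(candidate, reference):
    """Minimum number of word insertions, deletions and substitutions."""
    a, b = tokens(candidate), tokens(reference)
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        curr = [i]
        for j, wb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete wa
                            curr[j - 1] + 1,            # insert wb
                            prev[j - 1] + (wa != wb)))  # match or substitute
        prev = curr
    return prev[-1]


MT = "The milk that is used for coagulation is pasteurised to 72 ºC for 15 seconds."
REF = "The milk used for coagulation is pasteurised at 72 ºC for 15 seconds."

print(ngram_precision(MT, REF, 1), ngram_precision(MT, REF, 2))
print(word_edit_distance(MT, REF))
```

For this pair the word edit distance is 3: "that" and the first "is" are deleted, and "to" is replaced by "at".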

  18. Basic translation strategies • rule-based translation • direct translation • transfer-based translation • interlingua translation • data-driven translation • statistical translation • example-based translation • hybrids

  19. Direct translation • translation proceeds word by word, or phrase by phrase • no intermediary sentence structure • the most important language component is a translation dictionary • translation problems are handled more or less ad hoc by means of specific rules

  20. Simplistic direct approach • sentence splitting • tokenisation • handling capital letters • dictionary look-up and lexical substitution incl. heuristics for handling ambiguities • copying unknown words, digits, signs of punctuation etc. • formal editing
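
The steps of the simplistic direct approach can be sketched as a small pipeline. The four-entry dictionary and all function names are hypothetical, and ambiguity heuristics are left out entirely. The input is the word-order example from slide 8, which shows why pure word-for-word substitution falls short.

```python
import re

# Hypothetical toy translation dictionary (Swedish -> English).
DICTIONARY = {"sedan": "then", "gick": "went", "han": "he", "hem": "home"}


def split_sentences(text):
    """Sentence splitting on ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenise(sentence):
    """Separate words from punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def translate_sentence(sentence):
    out = []
    for i, tok in enumerate(tokenise(sentence)):
        # Dictionary look-up on the lower-cased form; unknown words,
        # digits and punctuation are copied unchanged.
        target = DICTIONARY.get(tok.lower(), tok)
        if i == 0:
            # Handling capital letters: restore the sentence-initial capital.
            target = target[:1].upper() + target[1:]
        out.append(target)
    # Formal editing: re-attach punctuation to the preceding word.
    return re.sub(r"\s+([.,!?;:])", r"\1", " ".join(out))


def translate(text):
    return " ".join(translate_sentence(s) for s in split_sentences(text))


print(translate("Sedan gick han hem."))
```

The output keeps the Swedish verb-second order ("Then went he home.") instead of "Then he went home.", because nothing in this pipeline knows about sentence structure.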

  21. Advanced direct approach (Tucker 1987) • source text dictionary look-up and morphological analysis • identification of homographs • identification of compound nouns • identification of noun and verb phrases • processing of idioms

  22. Advanced approach, cont. • processing of prepositions • subject-predicate identification • syntactic ambiguity identification • synthesis and morphological processing of TL • rearrangement of words and phrases in TL

  23. Feasibility of direct translation • quality • typically browsing quality • depends on • the quality of the translation dictionary • the coverage of the translation rules • editing quality may be achieved • problems with • ambiguity • inflection • word order • other structural differences

  24. SYSTRAN • SYStem TRANslation • advanced direct translation (moving towards transfer-based translation) • http://babelfish.altavista.com/ • http://www.systranet.com/systran/net

  25. EC Systran • 1,600,000 dictionary units • 20 domain dictionaries • daily use by EC translators, administrators of the European institutions

  26. Ex. 1: fairly good translation • "Enskilda företagare som inte bildat bolag klassificeras hit." • "Individual entrepreneurs that have not formed companies are classified here." • The system recognises bildat as a perfect form and correctly translates it as have formed, even though the auxiliary verb is omitted. The negation not is placed in the correct position.

  27. Ex. 2: word order problem / Systran sv-en • "När byarna kontaktades hade de inte ens utsatts för influensa." • "When the villages were contacted had they not even been exposed to flu." • The system fails to identify the subject and predicate and therefore produces the wrong word order.

  28. Ex. 3: ambiguity problem • "Vad kan vi lära av Arrawetestammen?" • "What can we faith of the Arawete?" • The system fails to find the connection between kan and lära and therefore does not see that lära is a verb.

  29. Ex. 4: ambiguity problem • "Extrapoleringen går till så här." • "The extrapolation goes to so here." • The system does not know the particle verb gå till and therefore translates incorrectly, word for word.

  30. Transfer-based translation • intermediary sentence structure • provides a basis for the systematic handling of grammatical problems and some types of lexical choices • basic processes • analysis • transfer • generation (synthesis)

  31. Transfer-based translation, cont. • knowledge-intensive • language modules • dictionary and grammar of SL • transfer dictionary and transfer rules • dictionary and grammar of TL

  32. Multra • transfer-based translation engine • transfer via grammatical relations • TL word order not inherited from SL • modular • unification-based • focus on restricted domains • developed at Uppsala University

  33. An example • Sv. I oljefilterhållaren sitter en överströmningsventil. • En. The oil filter retainer has an overflow valve. • (from the Scania corpus) • transfer rule: • sitter -> has, adv -> subj, subj -> obj
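
The transfer rule on this slide maps the predicate and re-labels grammatical relations, after which generation orders the constituents according to TL grammar. The sketch below illustrates that flow with plain dictionaries. It is not Multra's unification-based formalism. The structure of `ANALYSIS`, the toy transfer dictionary, and the fixed subject-predicate-object generation order are all assumptions, and the preposition "i" is assumed to have been absorbed into the adverbial relation during analysis.

```python
# Hypothetical output of SL analysis: predicate plus grammatical relations.
ANALYSIS = {
    "pred": "sitter",
    "adv": "oljefilterhållaren",
    "subj": "en överströmningsventil",
}

# The rule from the slide: sitter -> has, adv -> subj, subj -> obj.
RULE = {
    "source_pred": "sitter",
    "target_pred": "has",
    "relation_map": {"adv": "subj", "subj": "obj"},
}

# Toy transfer dictionary for the constituents (illustrative only).
TRANSFER_DICT = {
    "oljefilterhållaren": "the oil filter retainer",
    "en överströmningsventil": "an overflow valve",
}


def transfer(analysis, rule, lexicon):
    """Apply one transfer rule: map the predicate and re-label the relations."""
    if analysis["pred"] != rule["source_pred"]:
        raise ValueError("rule does not apply to this predicate")
    target = {"pred": rule["target_pred"]}
    for relation, phrase in analysis.items():
        if relation == "pred":
            continue
        new_relation = rule["relation_map"].get(relation, relation)
        target[new_relation] = lexicon.get(phrase, phrase)
    return target


def generate(structure):
    """Generation with TL word order: subject, predicate, object."""
    words = " ".join(structure[r] for r in ("subj", "pred", "obj") if r in structure)
    return words[:1].upper() + words[1:] + "."


print(generate(transfer(ANALYSIS, RULE, TRANSFER_DICT)))
```

Because generation works from relations and not from SL positions, the English word order does not depend on the Swedish adverbial-first order.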

  34. Interlingua translation • analysis of SL sentence into a language-independent meaning representation, an interlingua • ideally, no trace of the SL structure in the interlingua • generation of TL sentence from the interlingua

  35. Statistical machine translation • translation model based on word alignment • language model based on n-grams • decoding algorithm • selecting the most probable combination of alternatives in the translation model and the language model
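
The slide gives no formula, but the decoding step it describes is commonly written in the noisy-channel form below. The symbols are assumptions introduced here: f is the source sentence, e a candidate target sentence, P(f | e) the translation model and P(e) the n-gram language model.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f) \;=\; \arg\max_{e} \, P(f \mid e)\, P(e)
```

The second equality follows from Bayes' rule, since P(f) is constant for a given input sentence and does not affect the argmax.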

  36. Statistical MT on the market • Language Weaver • http://www.languageweaver.com/

  37. Example-based machine translation • non-trivial use of translation examples in the translation process • preliminary definition • alignment of texts • matching of input sentences against phrases (examples) • selection and extraction of equivalent TL phrases • adaptation and combination of TL phrases as acceptable output sentences (from Hutchins, J., Towards a definition of example-based machine translation. Proc. of Workshop: Example-Based Machine Translation. MT SUMMIT X. Phuket. Thailand. 2005)
