
Results of the STEVIN programme: STEVIN Final Event, Rotterdam, Nov 28 2011

This presentation summarises the results of the STEVIN programme, as presented at the programme's final event. It covers the programme objectives, the digital language infrastructure (creation, resource management, and IPR), strategic research, and LST community consolidation, and closes with various statistics.


Presentation Transcript


  1. Results of the STEVIN programme • STEVIN Final Event, Rotterdam, Nov 28 2011 • Jan Odijk

  2. Overview • STEVIN Objectives • Digital Language Infrastructure • Creation • Resource Management • IPR • Strategic Research • LST Community Consolidation • Various Statistics

  3. STEVIN Objectives • Digital Language Infrastructure (DLI) • Strategic Research (SR) • LST community consolidation (CC)

  4. Overview • STEVIN Objectives • Digital Language Infrastructure • Creation • Resource Management • IPR • Strategic Research • LST Community Consolidation • Various Statistics

  5. Digital Language Infrastructure • Creation • Resource Management • IPR

  6. Overview • STEVIN Objectives • Digital Language Infrastructure • Creation • Resource Management • IPR • Strategic Research • LST Community Consolidation • Various Statistics

  7. DLI: Creation • Priorities for written language: • A large corpus of written Dutch • An electronic lexicon • Parallel corpora

  8. Realisation: Written (1) • D-COI + SONAR: 500M word corpus (a) • LASSY: 1M word Treebank (a) • CORNETTO: 40k entry lexical semantic database (b) • DPC: 10M word parallel corpus D-E / D-F (c)

  9. Realisation: Written (2) • COREA: co-reference corpus (a) • IRME: 5k MWE lexical database (b) • DAESO: 1M word monolingual parallel corpus (c) • DAISY (a) • DUOMAN (a) • PACO-MT (a,c)

  10. Creation: Priorities Speech (1) • speech and multimodal corpora for CALL, NAW, CCQA applications; • multimodal corpora for broadcast news transcription or person identification; • text corpora for stochastic language models

  11. Creation: Priorities Speech (2) • tools and data for the development of • robust speech recognition; • automatic annotation of corpora; • speech synthesis

  12. Realisation: Speech (1) • Autonomata (a, NAW; e) • JASMIN-CGN (a, CALL) • D-COI + SONAR (c) • SPRAAK (d) • STEVINcanPRAAT (d)

  13. Realisation: Speech (2) • Missing: (b) multimodal corpora • But partially covered by other projects • EU: AMI, AMIDA (U Twente) • NL: IMIX

  14. Overview • STEVIN Objectives • Digital Language Infrastructure • Creation • Resource Management • IPR • Strategic Research • LST Community Consolidation • Various Statistics

  15. DLI: Resource Management • HLT Agency set up • See presentation by Remco van Veenendaal

  16. Overview • STEVIN Objectives • Digital Language Infrastructure • Creation • Resource Management • IPR • Strategic Research • LST Community Consolidation • Various Statistics

  17. DLI: IPR • Systematic attention to IPR and ethical issues from the start • Not easy, but the only way to ensure that the R&D community can use LRs in a legal manner • Specific regulation on how to deal with IPR in the STEVIN programme and projects

  18. Overview • STEVIN Objectives • Digital Language Infrastructure • Creation • Resource Management • IPR • Strategic Research • LST Community Consolidation • Various Statistics

  19. Strategic Research • Will be dealt with by Walter in his presentation • Work programme lists examples of applications • How do STEVIN projects contribute to such applications (directly or indirectly)?

  20. SR: Applications (1) • Information extraction from speech: • Rechtspraakherkenning, NEON, and SNRT • AUTONOMATA, JASMIN-CGN, SPRAAK, STEVINcanPRAAT, N-BEST, AUTONOMATA TOO and MIDAS • Detection of accent and identity of speakers: • JASMIN-CGN, SPRAAK, DISCO, Diademo, Rechtspraakherkenning

  21. SR: Applications (2) • Extraction of information from (monolingual or multilingual) text: • DAESO, DUOMAN, Gemeenteconnect and YourNews • COREA, IRME, D-COI, SONAR, DPC, LASSY, CORNETTO, and PACO-MT • Semantic web: • CORNETTO, D-COI and SONAR

  22. SR: Applications (3) • Dialogue systems and Q&A solutions • DAISY, DUOMAN, Gemeenteconnect, Web Assess • Automatic summarization and text generation • DAESO, Web Assess • D-COI and SONAR

  23. SR: Applications (4) • Automatic Translation • DPC, PACO-MT • D-COI, SONAR, LASSY, IRME, COREA, CORNETTO • Educational systems • DISCO, SpelSpiek, Primus, HATCI, WooDy, AAP • All resource creation projects

  24. Overview • STEVIN Objectives • Digital Language Infrastructure • Creation • Resource Management • IPR • Strategic Research • LST Community Consolidation • Various Statistics

  25. LST Community Consolidation • Create networks • consolidate LST activities • educate new experts • promote discussion • promote transfer of knowledge

  26. LST Community Consolidation • Set aside a specific budget and a dedicated WG • joint KI/SME and NL/FL projects preferred • 330 binary cooperation link occurrences • demonstration projects stimulated companies to participate

  27. LST Community Consolidation • Educational projects (3) • Master classes (2) • Networking events organized • brokerage events, “Taal in Bedrijf” (‘language@work’), STEVIN programme meetings, etc. • Networking events supported • e.g. CLIN, InterSpeech2007, ICT-Delta

  28. Overview • STEVIN Objectives • Digital Language Infrastructure • Creation • Resource Management • IPR • Strategic Research • LST Community Consolidation • Various Statistics

  29. Money Distribution • R&D (76.0%) • Demonstration (8.5%) • Supporting Activities (6.0%) • HLT Agency (2.5%) • STEVIN Management (6.5%)

  30. Strata Coverage • Basic resources for LST (51.1%) • Basic Research (23.3%) • Application-oriented Res. (15.4%) • Demonstration projects (10.2%)
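
The shares on slides 29 and 30 can be sanity-checked by summing them. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the figures are copied from the transcript as-is. As transcribed, the money distribution adds up to 99.5% and the strata coverage to 100.0%; the 0.5-point gap may simply be a rounding or transcription effect.

```python
# Budget shares (in percent) as transcribed from slides 29 and 30.
# The variable and function names are hypothetical, chosen for illustration.
money_distribution = {
    "R&D": 76.0,
    "Demonstration": 8.5,
    "Supporting Activities": 6.0,
    "HLT Agency": 2.5,
    "STEVIN Management": 6.5,
}

strata_coverage = {
    "Basic resources for LST": 51.1,
    "Basic Research": 23.3,
    "Application-oriented Research": 15.4,
    "Demonstration projects": 10.2,
}


def total_share(shares):
    """Return the sum of all shares, rounded to one decimal place."""
    return round(sum(shares.values()), 1)


def unaccounted_share(shares):
    """Return how far the shares fall short of 100%, to one decimal place."""
    return round(100.0 - total_share(shares), 1)


if __name__ == "__main__":
    for label, shares in [
        ("Money distribution", money_distribution),
        ("Strata coverage", strata_coverage),
    ]:
        print(f"{label}: total {total_share(shares)}%, "
              f"unaccounted {unaccounted_share(shares)}%")
```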

  31. NL / FL Proportion • R&D Projects 63%:37% • Demonstrator projects 66%:34% • Overall 64%:36% • Educational projects (3) 68%:32% • Master classes (2) 100%:0%

  32. KI / SME Proportion • Money 83%:17% • R&D projects by project 19:13 • R&D projects by #participations 80%:20% • Demonstration projects 15%:85% • Master classes 0%:100% • Education activities 83%:17%

  33. Language / Speech • Money: 53.1%:46.9%

  34. Funded v. Submitted • R&D count 1 19/52 (36.5%) • R&D count 2 19/68 (27.9%) • Demonstration 14/41 (30.0%) • Educational 3/5 (60%) • Master Classes 2/3 (66.7%) • Most proposals were very good • So much more could and should be done
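
The funding rates on slide 34 are plain ratios of funded to submitted proposals, so they can be recomputed from the counts. The following is a minimal sketch with hypothetical names, using the counts exactly as they appear in the transcript. Note that 14/41 works out to about 34.1%, which does not match the 30.0% printed for demonstration projects; either the count or the percentage may have been garbled, or the original slide may have used a different basis.

```python
# (funded, submitted) counts as transcribed from slide 34.
# The names below are hypothetical and used only for illustration.
funded_vs_submitted = {
    "R&D count 1": (19, 52),
    "R&D count 2": (19, 68),
    "Demonstration": (14, 41),
    "Educational": (3, 5),
    "Master Classes": (2, 3),
}


def funding_rate(funded, submitted):
    """Return the share of submitted proposals that were funded, in percent."""
    return round(100.0 * funded / submitted, 1)


if __name__ == "__main__":
    for category, (funded, submitted) in funded_vs_submitted.items():
        rate = funding_rate(funded, submitted)
        print(f"{category}: {funded}/{submitted} = {rate}%")
```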

  35. Thanks for your Attention!

  36. DO NOT GO BEYOND THIS SLIDE!

  37. Strategic Research • Priorities written language: • semantic analysis (tagging, integration with syntax and morphology) • text pre-processing (tokenization, spelling correction, named entity recognition, ...) • morphological analysis (compounding and derivation) • syntactic analysis: a robust parser for Dutch

  38. SR: Realisation Written (1) • COREA: co-reference resolution (a) • IRME: MWE identification + lexical representation (d, a) • LASSY: parser (d) • DAESO: semantic relations and text-to-text generation (a)

  39. SR: Realisation Written (2) • DAISY: automatic summarization (a) • DUOMAN: attitude detection (a) • PACO-MT: machine translation (d, a) • D-COI / SONAR (a, b) • Lacking: (c) morphological analysis for derivation and compounding

  40. SR: Priorities Speech • robustness of speech recognition; • output treatment (inverse text normalization); • confidence measures; • adaptation; • lattices.

  41. SR: Realisation Speech • AUTONOMATA (a) • MIDAS (a) • N-BEST (a) • SPRAAK (a, b, c, d, e) • DISCO (a + CALL priority) • AUTONOMATA TOO (a)
