MITRE Dialog Management Workshop – a review

Presentation Transcript


  1. MITRE Dialog Management Workshop – a review
     Dan Bohus
     Dialogs on Dialogs reading group, CMU, November 2003

  2. The Workshop
     • MITRE Dialog Workshop
       • @ MITRE, Bedford/Boston
       • October 27-28, 2003
     • Idea
       • Bring together researchers working on dialog management
       • Give them homework
         • Adapt your dialog manager to a medical diagnosis domain (details in a sec)
       • Discuss, compare, learn
     MITRE Dialog Management Workshop workshop: godis : ravenclaw : collagen : themes

  3. The Homework
     • Implement a dialog system for the medical diagnosis domain
       • Task left open-ended (diagnosis, tutoring, etc.)
       • No speech, just text in and out
     • Backend provided (backend.doc)
       • Java version and web-based interface version
       • 3 diseases: malaria, coccidioidomycosis, another one
       • List of symptoms: headache, nausea, muscle pain, etc.
       • Decision tree involving symptoms and tests (fever, blood tests, travel patterns, etc.)
     • Small enough to presumably not be lots of work, but large enough to illustrate functionality and provide a skeleton for the discussions…

  4. Participants
     • MITRE (Carl Burke et al.): MiDiKi
     • Gothenburg (Staffan Larsson): GoDiS (TRINDIKit)
     • USC ICT (David Traum): ICT Dialogue Manager
     • NTT/CMU (Matthias Denecke): Ariadne
     • CMU (Dan, Alex): RavenClaw
     • Ames (Beth-Ann Hockey): NASA Dialogue Manager
     • DFKI (Norbert Reithinger): DFKI Dialogue Manager
     • MERL (Candy Sidner, Charles Rich): COLLAGEN
     … and others invited but not present

  5. GoDiS

  6. GoDiS
     • TRINDIKit – information state update dialogue management toolkit
       • Information state
         • Private: dialog plan, beliefs, agenda (short-term goals)
         • Shared: established facts, QUD (questions under discussion), last utterance information
       • Dialog moves
       • Update rules
     • GoDiS: dialog management system implemented in TRINDIKit, handling:
       • information-oriented dialogue
       • action-oriented dialogue
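As a rough illustration of the information-state idea on this slide, the following Python sketch splits the state into private and shared parts and applies a single update rule. It is not TRINDIKit or GoDiS code; the class names, the fields, and the `integrate_answer` rule are hypothetical and chosen only to mirror the bullets above.

```python
from dataclasses import dataclass, field


@dataclass
class Private:
    plan: list = field(default_factory=list)     # dialog plan
    beliefs: set = field(default_factory=set)
    agenda: list = field(default_factory=list)   # short-term goals


@dataclass
class Shared:
    facts: set = field(default_factory=set)      # established facts
    qud: list = field(default_factory=list)      # questions under discussion, top is last
    last_utterance: dict = field(default_factory=dict)


@dataclass
class InfoState:
    private: Private = field(default_factory=Private)
    shared: Shared = field(default_factory=Shared)


def integrate_answer(state: InfoState, move: dict) -> bool:
    """Hypothetical update rule: if the move answers the topmost question
    under discussion, record the fact and pop the question."""
    if move.get("type") != "answer" or not state.shared.qud:
        return False                              # precondition not met
    question = state.shared.qud[-1]
    if move["question"] != question:
        return False
    state.shared.facts.add((question, move["value"]))
    state.shared.qud.pop()
    state.shared.last_utterance = move
    return True


state = InfoState()
state.shared.qud.append("have_fever")
applied = integrate_answer(
    state, {"type": "answer", "question": "have_fever", "value": "yes"}
)
```

The rule has a precondition part (the early returns) and an effect part (the state changes), which is the general shape of an update rule in an information-state approach; a real system would have many such rules and a strategy for choosing among them.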

  7. TRINDIKit / GoDiS architecture (diagram)
     • Control
     • DME (dialogue move engine)
     • Modules: input, interpret, update, select, generate, output
     • TIS (total information state)
     • Resource boxes: DEVICES, LEXICON, DOMAIN, labeled backend interface, lexicon, domain knowledge
     • Annotations: dialog plans, ontology; connection to Java backend

  8. GoDiS: Task Representation
     • Plans; propositional logic
     • Dialogue plans for dealing with diagnosis (issues opened at dialogue start)
       • ?x.disease(x): “Which disease is diagnosed?”
       • ?confirmed_by_interview: “Is the diagnosis confirmed by additional information?”
       • ?confirmed_by_tests: “Is the diagnosis confirmed by medical tests?”
     • Additional plans
       • ?x.info(x): “What information is there about a given disease?”
       • ?x.treatment(x): “What treatment is there for a given disease?”

  9. GoDiS: Alternate Tasks
     • User-driven dialogue (implemented)
       • Issues are not loaded on reset; the user has to raise all issues
       • User can ask system to
         • Provide a diagnosis
         • Confirm whether the user has a given disease
     • Decision trees as dialogue plans
       • Move backend knowledge into dialogue plans
       • Information conversion could be done automatically
     • Separate genre: expert system dialogue
       • Add special-purpose update rules
       • Dynamic dialogue planning by expert

  10. GoDiS: Highlights / Lowlights
      • Highlights
        • Reuse, you get for free:
          • Grounding
          • Accommodation / plan recognition
          • Multiple simultaneous issues & info sharing
        • High-level abstraction for dialog plans
        • Rapid prototyping
      • Lowlights
        • Not used in this type of domain so far, so not entirely straightforward (update rule changes)
        • Dynamic dialog plans (backend decides)

  11. RavenClaw

  12. RavenClaw
      • Dialog Task (Specification)
        • Captures all domain-specific dialog (task) logic with a hierarchical description
        • The authoring effort is focused entirely here
      • Domain-independent Dialog Engine
        • Manages dialog by executing the dialog task specification
        • Provides domain-independent conversational strategies

  13-19. RavenClaw Architecture (animated diagram, built up over seven slides)
      • System name: Madeleine
      • Dialog task tree nodes, as labeled: Madeleine; I:Welcome, E:LoadSymptoms, GeneralFeel, Diagnose; R:HowAreYou?, I:Glad, I:Sorry; Fever, Travel; R:AskFever, E:MeasureTemp, I:InformFever
      • Concepts: have_fever, general_feeling, diagnostic, chart
      • Other panels: Dialog Stack, Expectation Agenda
      • Build sequence
        • 13: empty dialog stack
        • 14: Madeleine on the dialog stack
        • 15: Welcome on the stack above Madeleine
        • 16: system prompt “Hi, this is Madeleine, the automated…”; stack back to Madeleine
        • 17: LoadSymptoms on the stack; a headache concept and R:Headache plus further unlabeled R: nodes appear
        • 18: stack back to Madeleine
        • 19: GeneralFeel on the stack above Madeleine

  20. RavenClaw Architecture (diagram, continued)
      • Dialog stack: HowAreYou, GeneralFeel, Madeleine
      • Dialog so far
        • System: “Hi, this is Madeleine, the automated…”
        • System: “How are you feeling today?”
        • User: “Not so good, I think I have a fever”
      • Expectation agenda
        • general_feeling: [good], [bad], [soso]
        • have_fever: [fever], ![yes], ![no]
        • headache: [headache], ![yes], ![no]
        • cough: [cough], ![yes], ![no]
        • …
      • Parsed input: [soso](not so good) [fever](I think I have a fever)
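The binding step on this slide can be sketched in a few lines of Python. This is an illustration only, not RavenClaw code: the `agenda` layout and the `bind` function are hypothetical, and the sketch assumes that a leading "!" marks a slot that binds only while the agent owning the concept is in focus, which the slide does not spell out.

```python
# Each agenda level pairs a concept with the grammar slots it accepts.
# Assumption: a leading "!" means "bind only when the owning agent is in focus".
agenda = [
    # (concept, expected slots, owner currently in focus?)
    ("general_feeling", ["[good]", "[bad]", "[soso]"], True),
    ("have_fever", ["[fever]", "![yes]", "![no]"], False),
    ("headache", ["[headache]", "![yes]", "![no]"], False),
]


def bind(agenda, parsed_input):
    """Walk the agenda top-down and bind each parsed slot to the first
    unbound concept that currently expects it. Returns {concept: slot}."""
    bindings = {}
    for slot in parsed_input:
        for concept, expected, in_focus in agenda:
            open_slots = {
                s.lstrip("!") for s in expected if in_focus or not s.startswith("!")
            }
            if slot in open_slots and concept not in bindings:
                bindings[concept] = slot
                break
    return bindings


# "Not so good, I think I have a fever" parsed as [soso] and [fever], as on the slide.
bindings = bind(agenda, ["[soso]", "[fever]"])
```

With this input the user's over-answer fills both `general_feeling` and `have_fever` in one turn, even though only the first was asked about. A bare `[yes]` would bind nothing here, because neither symptom agent is in focus.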

  21. Illustrated Features
      • Dynamic generation of dialog task structure
        • Symptoms loaded from backend; appropriate structures to “talk about them” created on the fly
        • New symptoms require no DM changes
      • Dynamic dialog control policy
        • The order in which symptoms are addressed is controlled by the backend
      • Conversational skills

  23. Dynamic Dialog Control (diagram)
      • Same task tree, dialog stack, and expectation agenda as on the RavenClaw Architecture slides, plus a Backend Decision Tree
      • Dialog stack: Diagnose, Madeleine
      • Dialog
        • System: “Hi, this is Madeleine, the automated…”
        • System: “How are you today?”
        • User: “Not so good, I think I have a headache”
        • System: “Sorry to hear you’re not feeling so good. Tell me more about your symptoms…”
        • System: “Do you have abdominal pain?”
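The two dynamic features (task structure generated from the backend's symptom list, and question order chosen by the backend) can be sketched as follows. Everything here is hypothetical: `FakeBackend` stands in for the workshop backend, whose real interface is not described in these slides, and a request agent is reduced to a prompt string.

```python
class FakeBackend:
    """Stand-in for the workshop backend; its methods are invented for this sketch."""

    def __init__(self, symptoms):
        self._symptoms = list(symptoms)

    def list_symptoms(self):
        return list(self._symptoms)

    def next_symptom(self, answers):
        # Trivial policy: first symptom not yet answered. A real backend
        # would consult its decision tree here.
        for symptom in self._symptoms:
            if symptom not in answers:
                return symptom
        return None


def build_request_agents(backend):
    """Create one request agent (here just a prompt string) per symptom,
    so that adding a symptom to the backend needs no dialog manager change."""
    return {
        s: f"Do you have {s.replace('_', ' ')}?" for s in backend.list_symptoms()
    }


backend = FakeBackend(["headache", "abdominal_pain", "fever"])
agents = build_request_agents(backend)

answers = {"headache": True}          # the user already mentioned a headache
to_ask = backend.next_symptom(answers)
prompt = agents[to_ask]
```

The dialog manager only asks the backend what to address next and looks up the matching agent, so the control policy lives entirely on the backend side.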

  25. Conversational Skills
      • Corresponding agencies added automatically to the dialog task tree
        • Help
        • What Can I Say?
        • Repeat
        • Suspend / Resume
        • Start Over
        • Timeout handling (not illustrated)
      • Still need all the language generation prompts and grammar, but some of those are develop-once, too

  26. RavenClaw Conclusion
      • Highlights
        • Set task posed no challenges to the framework
        • Easy to implement
        • Dynamic dialog structure and control
        • Automatic use of domain-independent conversational skills
      • Lowlights?
        • Toolkit perspective: how easy would it be for someone else to build it?
        • Asynchronous behaviors? (timing)
        • Couple of bugs / fixes (or is that a highlight?)

  27. Collagen

  28. COLLAGEN: Collaborative Interface Agent (diagram)
      • Collagen box containing a focus stack and a plan tree
      • Arrow labels: communicate, observe (twice), interact (twice)

  29. COLLAGEN Systems
      • air travel planning
      • email reading and responding (with IBM/Lotus)
      • GUI design tool operation
      • car navigation system operation
      • airport landing path planning (with MITRE)
      • gas turbine operator training (with USC/ISI)
      • personal video recorder operation
      • programmable thermostat operation (with Delft U.)
      • multi-modal web-based form-filling

  30. Collagen: Theory and Implementation
      • SharedPlan Discourse Theory (Grosz, Sidner, Kraus, Lochbaum 1974-1998)
        • Intentional: purposes, contributes
        • Attentional: focus spaces, focus stack
        • Linguistic: segments, lexical items
      • Java Implementation: purpose tree, focus stack

  31. Collagen: Discourse Segments and Purposes (Grosz, 1974)
      (fixing an air compressor, E = expert, A = apprentice)
      E: Replace the pump and belt please.
      A: Ok, I found a belt in the back.
      A: Is that where it should be?
      A: [removes belt]
      A: It’s done.
      E: Now remove the pump.
      …
      E: First you have to remove the flywheel.
      …
      E: Now take the pump off the base plate.
      A: Already did.
      Segment purposes marked on the dialog: replace pump and belt, replace belt, replace pump

  32. Discourse state representation (Grosz & Sidner, 1986)
      • Dialog so far
        E: Replace the pump and belt please.
        A: Ok, I found a belt in the back.
        A: Is that where it should be?
        A: [removes belt]
        A: It’s done.
      • Focus Stack: replace belt (current focus space) on top of replace pump and belt
      • Purpose Tree: replace pump and belt, with replace belt and replace pump below it

  33. Discourse interpretation algorithm (Lochbaum, 1998)
      • The current (communication or manipulation) act either:
        • starts a new segment/focus space (push)
        • ends the current segment/focus space (pop)
        • continues (contributes to) the current segment/... (add)
      • An act contributes to the purpose of a segment if it:
        • directly achieves the purpose
        • is a step in the plan for the purpose *
        • identifies the recipe used to achieve the purpose
        • identifies who should perform the purpose or a step in the plan
        • identifies a parameter of the purpose or a step in the plan
      * does not include recursive plan recognition (see later topic)
      (Diagram labels: focus stack, purpose tree)
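The push / pop / add cases can be sketched as a small focus-stack routine in Python. This is not Collagen code (Collagen is implemented in Java), and the sketch leaves out the hard part: deciding which case applies, which in the theory comes from plan recognition. Here the caller supplies that decision through the hypothetical `starts` and `ends` flags.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    purpose: str
    acts: list = field(default_factory=list)


def interpret(stack, act, purpose=None, *, starts=False, ends=False):
    """Apply one act to the focus stack: push a new segment, pop the
    current one, or add the act to the current segment."""
    if starts:
        stack.append(Segment(purpose, [act]))    # push
    elif ends:
        stack[-1].acts.append(act)
        stack.pop()                              # pop
    else:
        stack[-1].acts.append(act)               # add (contributes)
    return stack


# The air compressor example from the earlier slides.
stack = []
interpret(stack, "E: Replace the pump and belt please.",
          "replace pump and belt", starts=True)
interpret(stack, "A: Ok, I found a belt in the back.", "replace belt", starts=True)
interpret(stack, "A: Is that where it should be?")
interpret(stack, "A: [removes belt]")
depth_during_belt = len(stack)
interpret(stack, "A: It's done.", ends=True)
current_purpose = stack[-1].purpose
```

While the belt is being replaced the stack holds two segments, matching the focus stack drawn on the discourse state slide; once "It's done" closes that segment, focus returns to replacing the pump and belt.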

  34. COLLAGEN … my take
      • Separation of task from dialog/discourse engine
      • Recipes / Domain plans / Task tree
        • Full-blown HTN (hierarchical task network)
          • Hierarchical
          • Preconditions (constraints)
          • Effects
          • Completion / failure
          • Live nodes
      • Stack to keep track of focus and discourse structure
      • Tree explicitly contains agent and user nodes
      • Formalized / descriptive recipe specs (actually Java underneath), with procedure overwrites…

  35. Themes …

  36. Themes: Task Representation
      • Task representation
        • Separation of task representation from dialog engine
        • High-level representations of task
        • Descriptive rather than procedural
          • Procedural will be unavoidable for complex tasks
        • Expressive power
      • GoDiS, RavenClaw, Collagen: plan-based representations of task

  37. Themes: Task/Domain/Genre
      • The notion of dialog genre
        • Tutoring
        • Diagnosis
        • Information Access
      • Where to fold it in a dialog manager?
        • GoDiS: update/select rules
        • Ariadne: plugins
        • RavenClaw: collapsed with task
      • How clear is that separation: task vs. genre?

  38. Themes: Development time
      • Systems took on the order of 3-5 days to develop
      • Significant effort in the backend connection
        • Some sites shortcut it
      • Significant effort in grammar/language generation development
        • Some sites shortcut it
      • Everyone who had an implementation: “fixed a couple of bugs, but no major changes required”

  39. Themes: Development tools
      • Regression testing (GoDiS)
        • Systems are complex. If you change something in a dialog management framework, can you prove that it did not screw up things that used to work?
        • System-wise, very intractable
        • Component-wise, maybe: i.e. DM with DM inputs/outputs
      • System diagnosis / log visualization tools (Collagen)
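The component-wise option (test the DM alone against logged DM inputs and outputs) can be sketched as a replay harness. This is an assumption about what such a test might look like, not a description of the GoDiS tooling; `toy_dm`, the log format, and the action strings are all invented for the example.

```python
def toy_dm(semantic_input, state):
    """Toy dialog manager: maps a semantic input to a system action."""
    if semantic_input == "[fever]":
        state["have_fever"] = True
        return "request(headache)"
    if semantic_input in ("[yes]", "[no]"):
        state["headache"] = semantic_input == "[yes]"
        return "inform(diagnosis)"
    return "request(general_feeling)"


def replay(dm, logged_session):
    """Feed logged DM inputs back in and collect mismatches against the
    logged outputs as (turn, expected, actual) tuples."""
    state, mismatches = {}, []
    for turn, (dm_input, expected_output) in enumerate(logged_session):
        actual = dm(dm_input, state)
        if actual != expected_output:
            mismatches.append((turn, expected_output, actual))
    return mismatches


# A session recorded from a version of the DM that was known to work.
logged_session = [
    ("[hello]", "request(general_feeling)"),
    ("[fever]", "request(headache)"),
    ("[no]", "inform(diagnosis)"),
]
mismatches = replay(toy_dm, logged_session)
```

An empty mismatch list means the changed DM still reproduces the recorded behavior on that session. Because the harness works at the level of semantic inputs and system actions, it sidesteps parsing and generation, which is what makes the component-wise test more tractable than a system-wide one.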

  40. Themes: Timing
      • (Micro)timing
        • unaddressed
      • Turn-taking models
        • in general, very rudimentary
      • Asynchronous behaviors
        • Could be accomplished, but no one seemed to have it
      • Multi-party conversation
        • unaddressed

  41. Themes: the important problems
      • Different people have different views of what those are:
        • Plan / Intention recognition
        • Reference resolution
        • Backup in complex systems
        • Tense problems
        • Negations
        • Grounding; error prevention / recovery

  42. Themes: Reasoning
      • Dialog Managers vs Backends
        • Where to draw the line?
        • Who does the reasoning?
        • Can we avoid duplicating it?
        • How rich is the interaction between them?
      • Dialog systems use language to act in a domain, so they are generally strongly tied
      • Basic set of conversational skills can be identified
      • Drawing that line is still an “art”; no general agreement or solutions exist

  43. Themes: Science of Dialog?
      • How much science do we have?
        • Theory vs. experiment
        • Interesting Collagen / RavenClaw similarities
      • Representation or not?
      • GUI analogy
        • Do we have the checkboxes and radio-buttons?
