
Probabilistic Temporal Planning with Uncertain Durations

This presentation discusses the challenges of planning in real-world domains with uncertain action durations and presents several algorithms for efficient and optimal planning.

Presentation Transcript


  1. Probabilistic Temporal Planning with Uncertain Durations. Mausam, joint work with Daniel S. Weld. University of Washington, Seattle.

  2. Motivation • Three features of real-world planning domains: • Concurrency (calibrate while the rover moves) • Uncertain effects ('grip a rock' may fail) • Uncertain durative actions (wheels spin, so speed is uncertain)

  3. Contributions • Novel challenges • Large number of decision epochs: results to manage this blowup in different cases • Large branching factors: approximation algorithms • Five planning algorithms • DUR_prun: optimal • DUR_samp: near-optimal • DUR_hyb: anytime with user-defined error • DUR_exp: super-fast • DUR_arch: balance between speed and quality • Identify fundamental issues for future research

  4. Outline of the talk • Background • Theory • Algorithms and Experiments • Summary and Future Work

  5. Outline of the talk • Background • MDP • Decision Epochs: happenings, pivots • Theory • Algorithms and Experiments • Summary and Future Work

  6. Markov Decision Process (unit-duration actions) • S: a set of states, factored into Boolean variables • A: a set of actions • Pr: S × A × S → [0,1], the transition model • C: A → ℝ, the cost model • s0: the start state • G: a set of absorbing goals
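
A minimal sketch of this tuple as a Python container (the field names are illustrative, not from the paper):

    from dataclasses import dataclass

    # Minimal MDP container for the slide's (S, A, Pr, C, s0, G) tuple.
    # A factored state can be represented as a frozenset of true Boolean variables.
    @dataclass
    class MDP:
        states: set          # S
        actions: set         # A
        transitions: dict    # Pr: (s, a) -> list of (next_state, probability)
        cost: dict           # C: a -> real-valued cost
        start: object        # s0
        goals: set           # G, absorbing goal states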

  7. Goal of an MDP • Find a policy π: S → A that minimises the expected cost of reaching a goal, for a fully observable Markov decision process, when the agent executes over an indefinite horizon • Algorithms: value iteration, real-time dynamic programming (RTDP), and other iterative dynamic programming algorithms
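
A bare-bones value iteration for this cost-minimisation setting, as a sketch assuming the MDP container above and that every non-goal state has at least one applicable action (an illustration, not the authors' implementation):

    def value_iteration(mdp, epsilon=1e-4):
        """Bellman backups until the largest value change falls below epsilon."""
        V = {s: 0.0 for s in mdp.states}
        while True:
            residual = 0.0
            for s in mdp.states:
                if s in mdp.goals:
                    continue  # absorbing goals incur no further cost
                # Q(s, a) = C(a) + sum over s' of Pr(s' | s, a) * V(s')
                best = min(mdp.cost[a] + sum(p * V[t] for t, p in mdp.transitions[(s, a)])
                           for a in mdp.actions if (s, a) in mdp.transitions)
                residual = max(residual, abs(best - V[s]))
                V[s] = best
            if residual < epsilon:
                return V

A greedy policy is then read off by choosing, in each state, an action that minimises C(a) plus the expected value of its successors.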

  8. Definitions (Durative Actions) • Assumption: (probabilistic) TGP action model • Preconditions must hold until the end of the action • Effects become usable only at the end of the action • Decision epoch: a time point when a new action may be started • Happening: a time point when an action finishes • Pivot: a time point when an action could finish
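
To make the happening/pivot distinction concrete, a small sketch (the representation is illustrative): given the start times of the executing actions and the supports of their duration distributions, the pivots are all the time points at which some action could finish.

    def pivots(running):
        """running: list of (start_time, set_of_possible_durations) pairs.
        Returns the sorted time points where some action *could* finish.
        With deterministic durations this is exactly the set of happenings."""
        return sorted({start + d for start, durations in running for d in durations})

    # a starts at t=0 with duration 1 or 2; b starts at t=1 with fixed duration 3.
    print(pivots([(0, {1, 2}), (1, {3})]))  # -> [1, 2, 4]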

  9. Outline of the talk • Background • Theory • Explosion of Decision Epochs • Algorithms and Experiments • Summary and Future Work

  10. Decision Epochs (TGP Action Model) • Deterministic durations [Mausam & Weld '05]: Decision Epochs = set of happenings • Uncertain durations: non-termination carries information! • Theorem: Decision Epochs = set of pivots

  11. Illustration: A Bimodal Distribution [Figure: the duration distribution of action a and its expected completion time]

  12. Conjecture: if all actions have (1) duration distributions independent of their effects and (2) unimodal duration distributions, then Decision Epochs = set of happenings

  13. Outline of the talk • Background • Theory • Algorithms and Experiments • Expected Durations Planner • Archetypal Durations Planner • Summary and Future Work

  14. Planning with Durative Actions • Formulate an MDP in an augmented state space: each state is ⟨world state, set of (action, time remaining) pairs for the currently executing actions⟩ • [Figure: timeline 0-6 in which b has been applied to X, yielding X1, while a and c each have 4 time units remaining: ⟨X1, {(a,4), (c,4)}⟩; the initial state is ⟨X, ∅⟩]
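
A sketch of this augmented state as a hashable Python value (the type alias and names are illustrative):

    from typing import FrozenSet, Tuple

    # Augmented state: world state plus the executing actions, each paired
    # with its remaining time, e.g. <X1, {(a,4), (c,4)}> from the slide.
    AugState = Tuple[str, FrozenSet[Tuple[str, int]]]

    initial: AugState = ("X", frozenset())                      # <X, {}>
    later: AugState = ("X1", frozenset({("a", 4), ("c", 4)}))   # <X1, {(a,4), (c,4)}>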

  15. Uncertain Durations: Transition Function • action a: duration uniform over {1, 2}; action b: duration uniform over {1, 2} • [Figure: starting a and b in ⟨X, ∅⟩, at the first pivot each action independently finishes with probability 0.5, giving four branches of probability 0.25 each: ⟨Xa, {(b,1)}⟩, ⟨Xb, {(a,1)}⟩, ⟨Xab, ∅⟩, and a no-termination branch that completes to ⟨Xab, ∅⟩ at the next pivot]
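
One way to see where the four 0.25 branches come from: at a pivot, each running action independently either finishes or keeps going. A small sketch enumerating these joint outcomes (the interface is illustrative):

    from itertools import product

    def pivot_outcomes(finish_probs):
        """finish_probs: {action: probability it finishes at this pivot}.
        Returns a list of (frozenset_of_finished_actions, probability)."""
        actions = list(finish_probs)
        results = []
        for combo in product([True, False], repeat=len(actions)):
            prob = 1.0
            finished = set()
            for action, done in zip(actions, combo):
                prob *= finish_probs[action] if done else 1 - finish_probs[action]
                if done:
                    finished.add(action)
            results.append((frozenset(finished), prob))
        return results

    # a and b each finish at the first pivot with probability 0.5:
    for finished, prob in pivot_outcomes({"a": 0.5, "b": 0.5}):
        print(sorted(finished), prob)  # four outcomes, probability 0.25 each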

  16. Branching Factor • If there are n concurrent actions, m possible durations per action, and r probabilistic effects per action, then the number of potential successors is (m−1)[(r+1)^n − r^n − 1] + r^n
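
The count is straightforward to evaluate; a one-function sketch using the slide's parameter names:

    def potential_successors(n, m, r):
        """Successor bound from the slide: n concurrent actions, each with
        m possible durations and r probabilistic effects."""
        return (m - 1) * ((r + 1) ** n - r ** n - 1) + r ** n

    print(potential_successors(2, 2, 1))   # -> 3
    print(potential_successors(3, 4, 2))   # -> 62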

  17. Algorithms • Five planning algorithms • DUR_prun: optimal • DUR_samp: near-optimal • DUR_hyb: anytime with user-defined error • DUR_exp: super-fast • DUR_arch: balance between speed and quality

  18. Expected Durations Planner (DUR_exp) • Assign each action a deterministic duration equal to the expected value of its distribution • Build a deterministic-duration policy for this domain • Repeat: execute this policy and wait for an interrupt: (a) action terminated as expected: do nothing (b) action terminated early: replan from this state (c) action terminated late: revise that action's deterministic duration and replan for this domain • Until the goal is reached
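
A schematic rendering of this replan-on-interrupt loop. This is a sketch: plan, execute, and the interrupt object are assumed interfaces, not the paper's; plan(durations, state) returns a policy for the deterministic-duration relaxation, and execute runs it until some action terminates.

    def dur_exp(start_state, goal_test, expected_duration, plan, execute):
        """Expected Durations Planner (DUR_exp) control loop (sketch)."""
        durations = dict(expected_duration)   # each action's mean duration
        state = start_state
        policy = plan(durations, state)
        while not goal_test(state):
            interrupt = execute(policy, state)      # run until an action ends
            state = interrupt.state
            if interrupt.kind == "early":           # (b) finished early: replan
                policy = plan(durations, state)
            elif interrupt.kind == "late":          # (c) finished late: revise, replan
                durations[interrupt.action] = interrupt.elapsed
                policy = plan(durations, state)
            # (a) finished as expected: keep following the current policy
        return state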

  19. Planning Time [figure]

  20. Multi-modal Distributions • Recall: the conjecture holds only for unimodal distributions • Decision epochs = happenings if unimodal; pivots if multi-modal

  21. Multi-modal Durations: Transition Function • action a: duration uniform over {1, 2}; action b: duration 1 with probability 50%, 3 with probability 50% • [Figure: four branches from ⟨X, ∅⟩ with probability 0.25 each: ⟨Xa, {(b,1)}⟩, ⟨Xb, {(a,1)}⟩, ⟨Xab, ∅⟩, and ⟨X, {(a,1), (b,1)}⟩]

  22. Multi-modal Distributions • Expected Durations Planner (DUR_exp): one deterministic duration per action; a big approximation for multi-modal distributions • Archetypal Durations Planner (DUR_arch): limited uncertainty in durations; one duration per mode of the distribution
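
As an illustration of the archetypal idea, a sketch that collapses a discrete duration distribution to one representative duration per mode; the local-maximum rule for finding modes is this sketch's choice, not necessarily the paper's:

    def archetypal_durations(dist):
        """dist: {duration: probability}. Keep one duration per mode
        (local maximum of the probability function) and renormalise."""
        ds = sorted(dist)
        modes = [d for i, d in enumerate(ds)
                 if (i == 0 or dist[d] >= dist[ds[i - 1]])
                 and (i == len(ds) - 1 or dist[d] >= dist[ds[i + 1]])]
        total = sum(dist[d] for d in modes)
        return {d: dist[d] / total for d in modes}

    print(archetypal_durations({1: 0.5, 3: 0.5}))          # bimodal: both kept
    print(archetypal_durations({1: 0.2, 2: 0.6, 3: 0.2}))  # unimodal -> {2: 1.0}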

  23. Planning Time (multi-modal) [figure]

  24. Expected Make-span (multi-modal) [figure]

  25. Outline of the talk • Background • Theory • Algorithms and Experiments • Summary and Future Work • Observations on Concurrency

  26. Summary • Large number of Decision Epochs • Results to manage explosion in specific cases • Large branching factors • Expected Durations Planner • Archetypal Durations Planner (multi-modal)

  27. Handling Complex Action Models • So far: probabilistic TGP • Preconditions hold over-all • Effects usable only at end • What about probabilistic PDDL2.1? • Preconditions at-start, over-all, at-end • Effects at-start, at-end • Decision epochs must be arbitrary points

  28. Ramifications • [Figure: a counterexample with two actions a and b, whose preconditions p and q and effects involving G, ¬q, and ¬p interact so that reaching the goal G requires starting an action at an arbitrary intermediate time point] • Result is independent of uncertainty! • Existing decision-epoch planners are incomplete • SAPA, Prottle, etc. • All IPC winners

  29. Related Work • Tempastic (Younes and Simmons '04): generate, test and debug • Prottle (Little, Aberdeen and Thiébaux '05): planning-graph-based heuristics • Uncertain durations without concurrency: Foss and Onder '05; Boyan and Littman '00; Bresina et al. '02; Dearden et al. '03
