
Heuristics for Planning

Heuristics for Planning. Initial project advisors assigned—please contact them Project 1 ready; due 2/14 Due to underwhelming interest, note-taking has been suspended. Before, planning algorithms could synthesize about 6 – 10 action plans in minutes





Presentation Transcript


  1. Heuristics for Planning
  • Initial project advisors assigned; please contact them
  • Project 1 ready; due 2/14
  • Due to underwhelming interest, note-taking has been suspended

  2. Scalability of Planning
  • Before, planning algorithms could synthesize plans of about 6-10 actions in minutes
  • Significant scale-up in the last 6-7 years
  • Now we can synthesize plans of 100 actions in seconds; realistic encodings of the Munich airport!
  • The problem is search control!
  • The primary revolution in planning in recent years has been domain-independent heuristics to scale up plan synthesis
  IJCAI'07 Tutorial T12

  3. Motivation
  • Ways to improve planner scalability:
    • Problem formulation
    • Search space
    • Reachability heuristics
  • Reachability heuristics are:
    • Domain (formulation) independent
    • Work for many search spaces
    • Flexible: work with most domain features
    • Overall, complement other scalability techniques
    • Effective!

  4. Rover Domain

  5. Classical Planning
  • Relaxed reachability analysis
  • Types of heuristics:
    • Level-based
    • Relaxed plans
    • Mutexes
  • Heuristic search:
    • Progression
    • Regression
    • Plan space
  • Exploiting heuristics

  6. Planning Graph and Search Tree
  • Envelope of progression tree (relaxed progression)
  • Proposition lists: union of states at the kth level
  • Lower-bound reachability information
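The relaxed-progression envelope above can be sketched directly: ignoring delete effects makes each proposition layer the union of everything reachable so far. A minimal sketch, assuming STRIPS-style actions given as (name, preconditions, add effects); the rover-style names are illustrative, not from the tutorial domain.

```python
# Sketch of a relaxed planning graph: delete effects are simply ignored,
# so each proposition layer is a superset of the previous one (a lower
# bound on true reachability).

def build_relaxed_pg(init, actions):
    """Return proposition layers P0, P1, ... until the graph levels off."""
    layers = [frozenset(init)]
    while True:
        prev = layers[-1]
        nxt = set(prev)                   # no-ops carry propositions forward
        for _name, pre, add in actions:
            if pre <= prev:               # applicable in the relaxed problem
                nxt |= add                # deletes are ignored (relaxation)
        if nxt == prev:                   # leveled off: reachability fixpoint
            return layers
        layers.append(frozenset(nxt))

# Toy rover-style domain (illustrative names only)
ACTS = [
    ("drive",  frozenset({"at_alpha"}),              frozenset({"at_beta"})),
    ("sample", frozenset({"at_beta", "avail_soil"}), frozenset({"have_soil"})),
    ("commun", frozenset({"have_soil"}),             frozenset({"comm_soil"})),
]
LAYERS = build_relaxed_pg({"at_alpha", "avail_soil"}, ACTS)
```

Because each layer only grows, the index at which a proposition first appears is exactly the level information the heuristics below read off.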

  7. Level-Based Heuristics
  • The distance of a proposition is the index of the first proposition layer in which it appears
    • Proposition distance changes when we propagate cost functions (described later)
  • What is the distance of a set of propositions?
    • Set-Level: index of the first proposition layer where all goal propositions appear
      • Admissible
      • Gets better with mutexes; otherwise same as Max
    • Max: maximum proposition distance
    • Sum: summation of proposition distances

  8. Example of Level-Based Heuristics
  set-level(sI, G) = 3
  max(sI, G) = max(2, 3, 3) = 3
  sum(sI, G) = 2 + 3 + 3 = 8
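All three heuristics can be read directly off the graph's proposition layers. A sketch with hand-built layers chosen to reproduce the slide's numbers (g1 enters at level 2, g2 and g3 at level 3); the proposition names are placeholders. Note that without mutexes, set-level coincides with max, as here.

```python
# Level-based heuristics over the proposition layers of a planning graph.

def level(p, layers):
    """Index of the first proposition layer containing p (None if never)."""
    return next((i for i, lay in enumerate(layers) if p in lay), None)

def set_level(goals, layers):
    """Set-Level: first layer where all goal propositions appear."""
    return next((i for i, lay in enumerate(layers) if goals <= lay), None)

def h_max(goals, layers):
    return max(level(g, layers) for g in goals)

def h_sum(goals, layers):
    return sum(level(g, layers) for g in goals)

# Hand-built layers matching the example: levels 2, 3, 3 for g1, g2, g3.
LAYERS = [{"p"}, {"p", "q"}, {"p", "q", "g1"}, {"p", "q", "g1", "g2", "g3"}]
GOALS = {"g1", "g2", "g3"}
```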

  9. How Do Level-Based Heuristics Break?
  [Figure: a planning graph over layers P0, A0, P1, A1, P2 in which actions B1...B100 each support one of p1...p100, and a single action A with preconditions p1...p100 supports g.]
  The goal g is reached at level 2, but requires 101 actions to support it.

  10. Relaxed Plan Heuristics
  • When level does not reflect distance well, we can find a relaxed plan.
  • A relaxed plan is a subgraph of the planning graph, where:
    • Every goal proposition is in the relaxed plan at the last level
    • Every proposition in the relaxed plan has a supporting action in the relaxed plan
    • Every action in the relaxed plan has its preconditions supported
  • Relaxed plans are not admissible, but are generally effective.
  • Finding the optimal relaxed plan is NP-hard, but finding a greedy one is easy. Later we will see how "greedy" can change.
  • Later, in over-subscription planning, we'll see that the effort of finding an optimal relaxed plan is worthwhile.
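The greedy extraction the slide mentions can be sketched as a forward pass that records each proposition's first level and one supporting action, followed by a backward sweep that collects supporters for the goals and, recursively, their preconditions. A minimal sketch, assuming every goal is relaxed-reachable; all names are illustrative.

```python
# Greedy relaxed-plan extraction sketch. Actions are (name, pre, adds).

def relaxed_plan(init, actions, goals):
    lev, supporter = {p: 0 for p in init}, {}
    changed = True
    while changed:                         # forward: ignore-deletes graph
        changed = False
        for name, pre, add in actions:
            if all(p in lev for p in pre):
                alev = 1 + max((lev[p] for p in pre), default=0)
                for q in add:
                    if q not in lev:       # first (greedy) supporter wins
                        lev[q], supporter[q] = alev, (name, pre)
                        changed = True
    plan, open_goals = set(), set(goals)   # backward: support each subgoal
    while open_goals:
        g = open_goals.pop()
        if lev[g] == 0:
            continue                       # initially true: no support needed
        name, pre = supporter[g]
        if name not in plan:
            plan.add(name)
            open_goals |= set(pre)
    return plan                            # h_RP = len(plan)

PLAN = relaxed_plan(
    {"at_alpha", "avail_soil"},
    [("drive",  {"at_alpha"},              {"at_beta"}),
     ("sample", {"at_beta", "avail_soil"}, {"have_soil"}),
     ("commun", {"have_soil"},             {"comm_soil"})],
    {"comm_soil"})
```

Which supporter gets recorded is exactly the "greedy" knob: cost-propagated values, seen later in the tutorial, give a more informed choice than first-come-first-served.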

  11. Example of Relaxed Plan Heuristic
  [Figure: identify the goal propositions, support each goal proposition individually, then count the actions.]
  RP(sI, G) = 8

  12. Results
  [Charts comparing relaxed-plan and level-based heuristics.]

  13. Optimizations in Heuristic Computation
  • Taming space/time costs:
    • Bi-level planning graph representation
    • Partial expansion of the PG (stop before level-off)
    • It is FINE to cut corners when using the PG for heuristics (instead of search)!
  • Branching factor can still be quite high:
    • Use actions appearing in the PG (complete)
    • Select actions in lev(S) vs. levels-off (incomplete)
    • Consider actions appearing in the RP (incomplete)

  14. Adjusting for Negative Interactions
  • Until now we assumed actions only positively interact. What about negative interactions?
  • Mutexes help us capture some negative interactions
  • Types:
    • Actions: interference / competing needs
    • Propositions: inconsistent support
  • Binary mutexes are the most common and practical
  • (|A| + 2|P|)-ary mutexes would allow us to solve the planning problem with a backtrack-free GraphPlan search
    • An action layer may have |A| actions and 2|P| noops
  • A serial planning graph assumes all non-noop actions are mutex
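The two action-mutex conditions and the inconsistent-support condition for propositions come down to simple set tests. A sketch under the assumption that actions carry explicit delete lists, with illustrative names echoing the slide-15 rover example (sampling needs the rover's location, driving away negates it):

```python
# Binary mutex tests at one planning-graph layer (sketch). Actions are
# (name, pre, add, dele); prop_mutex holds proposition-mutex pairs from
# the previous proposition layer, each as a frozenset.

def actions_mutex(a1, a2, prop_mutex):
    _, pre1, add1, del1 = a1
    _, pre2, add2, del2 = a2
    # Interference: one action deletes a precondition or effect of the other.
    if del1 & (pre2 | add2) or del2 & (pre1 | add1):
        return True
    # Competing needs: some pair of preconditions was mutex a layer earlier.
    return any(frozenset({p, q}) in prop_mutex for p in pre1 for q in pre2)

def props_mutex(p, q, supporters, act_mutex):
    # Inconsistent support: every pair of supporters conflicts (a single
    # action supporting both propositions counts as compatible).
    return all(a != b and frozenset({a, b}) in act_mutex
               for a in supporters[p] for b in supporters[q])

SAMPLE = ("sample", {"at_beta", "avail_soil"}, {"have_soil"}, set())
DRIVE  = ("drive",  {"at_beta"},               {"at_alpha"}, {"at_beta"})
SUP = {"have_soil": {"sample"}, "at_alpha": {"drive"}}
```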

  15. Binary Mutexes
  [Figure: planning graph for the rover domain over layers P0, A0, P1, A1, P2; the location arguments of at(·), drive(·,·), etc. were lost in transcription.]
  • Interference: sample needs at(·), while drive negates at(·)
  • Inconsistent support: at level 1, the only supporter of have(soil) is mutex with the only supporter of at(·)
  • At level 2, have(soil) has a supporter that is not mutex with a supporter of at(·), so the two are feasible together
  Set-Level(sI, {at(·), have(soil)}) = 2
  Max(sI, {at(·), have(soil)}) = 1

  16. Distance of a Set of Literals
  • lev(p): index of the first level at which p comes into the planning graph
  • lev(S): index of the first level where all propositions in S appear non-mutexed
    • If there is no such level: ∞ if the graph is grown to level-off, else k+1 (where k is the current length of the graph)
  • Sum: h(S) = Σ_{p ∈ S} lev({p})
  • Set-Level: h(S) = lev(S)
    • Admissible; related variants: Set-Level with memos, Partition-k, Adjusted Sum, Combo

  17. Adjusting the Relaxed Plans [AAAI 2000]
  • Start with the RP heuristic and adjust it to take subgoal interactions into account:
    • Negative interactions in terms of "degree of interaction"
    • Positive interactions in terms of co-achievement links
    • Ignore negative interactions when accounting for positive interactions (and vice versa)
  • It is NP-hard to find a plan (when there are mutexes), and even NP-hard to find an optimal relaxed plan. It is easier to add a penalty to the heuristic for ignored mutexes:
  HAdjSum2M(S) = length(RelaxedPlan(S)) + max_{p,q ∈ S} δ(p, q)
  where δ(p, q) = lev({p, q}) − max{lev(p), lev(q)}  /* degree of negative interaction */
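The penalty term is cheap once set-levels are available. A sketch of HAdjSum2M under the assumption that lev is given as a table from (frozen) proposition sets to set-levels; the table entries here are made up for illustration.

```python
# Sketch of HAdjSum2M: relaxed-plan length plus the worst pairwise degree
# of negative interaction among the subgoals.

from itertools import combinations

def h_adj_sum_2m(relaxed_plan_len, subgoals, lev):
    def delta(p, q):   # degree of negative interaction between p and q
        return lev[frozenset({p, q})] - max(lev[frozenset({p})],
                                            lev[frozenset({q})])
    penalty = max((delta(p, q) for p, q in combinations(subgoals, 2)),
                  default=0)
    return relaxed_plan_len + penalty

# Made-up set-levels: p and q each enter at level 1 but only co-occur
# non-mutexed at level 3, so the pair carries a penalty of 2.
LEV = {frozenset({"p"}): 1, frozenset({"q"}): 1, frozenset({"p", "q"}): 3}
```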

  18. Anatomy of a State-Space Regression Planner
  Problem: given a set of subgoals (a regressed state), estimate how far it is from the initial state.
  [AAAI 2000; AIPS 2000; AIJ 2002; JAIR 2003]

  19. Rover Example in Regression
  [Figure: regression from sG through states s1-s4 via commun(rock), commun(image), sample(rock, ·), sample(image, ·); location arguments were lost in transcription.]
  Sum(sI, s4) = 0 + 0 + 1 + 1 + 2 = 4
  Sum(sI, s3) = 0 + 1 + 2 + 2 = 5
  Sum(sI, s4) should be ∞, since s4 is inconsistent. How do we improve the heuristic?

  20. AltAlt Performance
  [Charts: level-based vs. adjusted RP heuristics on Logistics and Scheduling; problem sets from IPC 2000.]

  21. Plan Space Search
  In the beginning it was all POP. Then it was cruelly UnPOPped. The good times return with Re(vived)POP.

  22. POP Algorithm
  1. Plan selection: select a plan P from the search queue
  2. Flaw selection: choose a flaw f (open condition or unsafe link)
  3. Flaw resolution:
    • If f is an open condition, choose an action S that achieves f
    • If f is an unsafe link, choose promotion or demotion
    • Update P; return NULL if no resolution exists
  4. If there is no flaw left, return P
  [Figure: initial plan S0 → Sinf with open conditions g1, g2; refinement adds steps S1, S2, S3 with causal links p, q1 and a threat via ~p.]
  • Choice points:
    • Flaw selection (open condition? unsafe link? non-backtracking choice)
    • Flaw resolution / plan selection (how to select (rank) a partial plan?)
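The four steps above can be organized as a best-first loop over partial plans. This is a control-flow skeleton only: the flaw, resolver, and ranking machinery are abstracted behind callbacks, and the tiny instance at the bottom (where a "plan" is just the set of achieved open conditions) is a hypothetical stand-in for real partial plans.

```python
# Skeleton of the POP refinement loop; flaws/resolvers/rank are callbacks,
# so this shows the control flow, not a full partial-order planner.

import heapq

def pop_search(initial_plan, flaws, resolvers, rank):
    queue, tie = [(rank(initial_plan), 0, initial_plan)], 1
    while queue:
        _, _, plan = heapq.heappop(queue)   # 1. plan selection
        fs = flaws(plan)
        if not fs:
            return plan                     # 4. no flaw left: done
        f = fs[0]                           # 2. flaw selection (no backtrack)
        for child in resolvers(f, plan):    # 3. flaw resolution
            heapq.heappush(queue, (rank(child), tie, child))
            tie += 1
    return None                             # no resolution exists

# Degenerate instance: resolving an open condition just marks it achieved.
GOALS = {"g1", "g2"}
RESULT = pop_search(
    frozenset(),
    flaws=lambda p: sorted(GOALS - p),
    resolvers=lambda f, p: [p | {f}],
    rank=lambda p: len(GOALS - p))
```

The `rank` callback is where the distance heuristics of the next slide plug in: ranking partial plans by an estimate over their open conditions.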

  23. Heuristics for Partial Order Planning
  • Distance heuristics estimate the cost of partially ordered plans (and help select flaws)
    • If we ignore negative interactions, the set of open conditions can be seen as a regression state
  • Mutexes are used to detect indirect conflicts in partial plans
    • A step threatens a link if there is a mutex between the link condition and the step's effect or precondition
    • Post disjunctive precedences and use propagation to simplify

  24. Regression and Plan Space
  [Figure: the regression trace from slide 19 (sG through s1-s4 via commun(rock), commun(image), sample(rock, ·), sample(image, ·)) shown side by side with the corresponding partial plans over steps S0-S4; the open conditions of each partial plan match a regression state. Location arguments were lost in transcription.]

  25. RePOP's Performance [IJCAI 2001]
  • RePOP implemented on top of UCPOP
  • Dramatically better than any other partial order planner before it
  • Competitive with Graphplan and AltAlt
  • VHPOP carried the torch at IPC 2002
  • Written in Lisp; runs on Linux, 500 MHz, 250 MB
  • You see, pop, it is possible to Re-use all the old POP work!

  26. Exploiting Planning Graphs
  • Restricting action choice; use actions from:
    • Last level before level-off (complete)
    • Last level before goals (incomplete)
    • First level of the relaxed plan (incomplete): FF's helpful actions
    • Only action sequences in the relaxed plan (incomplete): YAHSP
  • Reducing state representation:
    • Remove static propositions: those whose truth value never changes, as read off the last proposition layer

  27. Classical Planning Conclusions
  • Many heuristics: Set-Level, Max, Sum, relaxed plans
  • Heuristics can be improved by adjustments and mutexes
  • Useful for many types of search: progression, regression, POCL

  28. Cost-Based Planning

  29. Cost-Based Planning
  • Propagating cost functions
  • Cost-based heuristics:
    • Generalized level-based heuristics
    • Relaxed plan heuristics

  30. Rover Cost Model

  31. Cost Propagation
  [Figure: cost propagation over layers P0, A0, P1, A1, P2 of the rover planning graph; each proposition is annotated with the cheapest cost of achieving it. The cost of a proposition can reduce at a later level because a different (cheaper) supporter becomes available.]

  32. Cost Propagation (cont'd): 1-lookahead
  [Figure: cost propagation continued over layers P2, A2, P3, A3, P4; costs of several propositions keep dropping after the goals are first reached.]

  33. Terminating Cost Propagation
  • Stop when:
    • goals are reached (no-lookahead)
    • costs stop changing (∞-lookahead)
    • k levels after goals are reached (k-lookahead)
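The three rules differ only in when the propagation loop stops. A sketch using sum-propagation of costs, with actions as (name, preconditions, adds, execution cost); the drive actions and their costs are made up, arranged so that a cheaper supporter for the goal only appears after the goal is first reached, which is exactly where the termination rules diverge.

```python
# Cost propagation with lookahead-based termination (sketch). k=0 stops as
# soon as the goals are costed (no-lookahead); k=None runs until costs stop
# changing (infinity-lookahead); otherwise k more passes after the goals.

def propagate_costs(init, actions, goals, k=None):
    cost = {p: 0 for p in init}
    extra_passes = 0
    while True:
        changed = False
        for _name, pre, adds, c in actions:
            if all(p in cost for p in pre):
                newc = sum(cost[p] for p in pre) + c   # sum-propagation
                for q in adds:
                    if newc < cost.get(q, float("inf")):
                        cost[q] = newc
                        changed = True
        if not changed:
            return cost                    # fixpoint: infinity-lookahead
        if k is not None and all(g in cost for g in goals):
            if extra_passes >= k:
                return cost
            extra_passes += 1

# Driving via gamma (10 + 10) is cheaper than driving direct (30), but is
# only discovered one pass after at_beta is first reached.
ACTS = [("drive_gb", {"at_gamma"}, {"at_beta"}, 10),
        ("drive_ab", {"at_alpha"}, {"at_beta"}, 30),
        ("drive_ag", {"at_alpha"}, {"at_gamma"}, 10)]
```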

  34. Guiding Relaxed Plans with Costs
  [Figure: relaxed-plan extraction guided by propagated costs; start extraction at the last level, where each goal proposition is cheapest.]

  35. Cost-Based Planning Conclusions
  • Cost functions:
    • Remove the false assumption that level is correlated with cost
    • Improve planning with non-uniform cost actions
    • Are cheap to compute (constant overhead)

  36. Partial Satisfaction (Over-Subscription) Planning

  37. Partial Satisfaction Planning
  • Selecting goal sets:
    • Estimating goal benefit
    • Anytime goal set selection
    • Adjusting for negative interactions between goals

  38. Partial Satisfaction (Over-Subscription) Planning
  • In many real-world planning tasks, the agent has more goals than it has resources to accomplish
    • Example: rover mission planning (MER)
  • Need automated support for over-subscription / partial satisfaction planning
  • Actions have execution costs, goals have utilities, and the objective is to find the plan with the highest net benefit

  39. Adapting PG Heuristics for PSP
  • Challenges:
    • Need to propagate costs on the planning graph
    • The exact set of goals to pursue is not clear
    • Interactions between goals
    • The obvious approach of considering all 2^n goal subsets is infeasible
  • Idea: select a subset of the top-level goals upfront
    • Challenge: goal interactions
    • Approach: estimate the net benefit of each goal as its utility minus the cost of its relaxed plan
    • Bias the relaxed-plan extraction to (re)use the actions already chosen for other goals
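The upfront selection can be sketched as a greedy loop that keeps adding the goal with the best marginal net benefit. The biased relaxed-plan cost of a goal set is abstracted here as a lookup table; the utilities and costs are taken from the rover numbers on the next slide, so the loop reproduces its selection.

```python
# Greedy upfront goal-set selection for PSP (sketch). A real planner would
# extract a biased relaxed plan per candidate set instead of a table lookup.

UTIL = {"comm_soil": 20, "comm_rock": 60, "comm_image": 50}
RP_COST = {
    frozenset({"comm_soil"}): 35, frozenset({"comm_rock"}): 40,
    frozenset({"comm_image"}): 25,
    frozenset({"comm_image", "comm_rock"}): 65,
    frozenset({"comm_image", "comm_soil"}): 60,
    frozenset({"comm_image", "comm_rock", "comm_soil"}): 100,
}

def select_goal_set(goals, util, rp_cost):
    chosen, best_net = frozenset(), 0
    while True:
        # Net benefit of adding each remaining goal to the chosen set
        nets = [(sum(util[x] for x in chosen | {g}) - rp_cost[chosen | {g}], g)
                for g in set(goals) - chosen]
        if not nets or max(nets)[0] <= best_net:
            return chosen, best_net        # no goal improves net benefit
        net, g = max(nets)
        chosen, best_net = chosen | {g}, net

BEST, NET = select_goal_set(UTIL, UTIL, RP_COST)
```

Sharing the table across candidate sets is what makes the bias matter: a goal looks cheaper when its relaxed plan can reuse actions already paid for.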

  40. Goal Set Selection in the Rover Problem
  [Figure: greedy goal set selection, with net benefits found by cost propagation and by (biased) relaxed plans:]
  • comm(soil): 20 − 35 = −15; comm(rock): 60 − 40 = 20; comm(image): 50 − 25 = 25; pick comm(image)
  • comm(image) + comm(rock): 110 − 65 = 45; comm(image) + comm(soil): 70 − 60 = 20; add comm(rock)
  • All three goals: 130 − 100 = 30 < 45; stop with {comm(image), comm(rock)}

  41. SapaPS (Anytime Goal Selection)
  • A* progression search
    • g-value: net benefit of the plan so far
    • h-value: relaxed-plan estimate of the best goal set
      • Relaxed plan found for all goals
      • Iterative goal removal, until net benefit does not increase
  • Returns plans with increasing g-values
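The iterative goal removal used for the h-value can be sketched in the opposite direction from upfront selection: start from a relaxed plan for all goals and drop any goal whose removal raises net benefit. The cost table and numbers below are made up for illustration.

```python
# SapaPS-style iterative goal removal (sketch): relaxed-plan costs of goal
# subsets are abstracted as a table; utilities and costs are made up.

def prune_goal_set(goals, util, rp_cost):
    current = frozenset(goals)
    net = sum(util[g] for g in current) - rp_cost[current]
    improved = True
    while improved and current:
        improved = False
        for g in sorted(current):
            cand = current - {g}
            n = sum(util[x] for x in cand) - rp_cost[cand]
            if n > net:                    # dropping g pays off
                current, net, improved = cand, n, True
                break
    return current, net

# g2's utility (10) does not cover what it adds to the relaxed plan's cost.
UTIL = {"g1": 100, "g2": 10}
RP_COST = {frozenset({"g1", "g2"}): 60, frozenset({"g1"}): 30,
           frozenset({"g2"}): 40, frozenset(): 0}
KEPT, NET = prune_goal_set({"g1", "g2"}, UTIL, RP_COST)
```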

  42. Some Empirical Results for AltAltps [AAAI 2004]
  [Charts: exact algorithms based on MDPs don't scale at all.]

  43. Adjusting for Negative Interactions (AltWlt)
  • Problem:
    • What if the a priori goal set is not achievable because of negative interactions?
    • What if the greedy algorithm gets stuck in a bad local optimum?
  • Solution:
    • Do not consider mutex goals
    • Add a penalty for goals whose relaxed plan has mutexes
      • Use an interaction factor to adjust the cost, similar to the adjusted sum heuristic: max_{g1,g2 ∈ G} {lev({g1, g2}) − max(lev(g1), lev(g2))}
    • Find the best goal set for each goal

  44. Goal Utility Dependencies
  • One shoe: cost 50, utility 0. Two shoes: cost 100, utility 500.
  • GOOD: high-res photo utility 150, soil sample utility 100; achieving both goals gives an additional 200
  • BAD: high-res photo utility 150, low-res photo utility 100; achieving both goals removes 80

  45. Dependencies
  • Goal interactions exist as two distinct types:
    • Cost dependencies: goals share actions in the plan trajectory; defined by the plan. Exist in classical planning, but are a larger issue in PSP. Handled by all PSP planners (AltWlt, Optiplan, SapaPS).
    • Utility dependencies: goals may interact in the utility they give; explicitly defined by the user. No need to consider this in classical planning. Handled by SPUDS, our planner based on SapaPS.
  • See talk Thurs. (11th) in session G4 (Planning and Scheduling II)

  46. PSP Conclusions
  • Goal set selection:
    • A priori for regression search
    • Anytime for progression search
  • Both types of search use greedy goal insertion/removal to optimize the net benefit of relaxed plans

  47. Overall Conclusions
  • Relaxed reachability analysis:
    • Concentrates on positive interactions and independence by ignoring negative interactions
    • Estimates improve as more negative interactions are brought back in
  • Heuristics can estimate and aggregate the costs of goals, or find relaxed plans
  • Propagate numeric information to adjust estimates: cost, resources, probability, time
  • Solving hybrid problems is hard
  • Extra approximations: phased relaxation, adjustments/penalties

  48. Why Do We Love PG Heuristics?
  • They work!
  • They are "forgiving":
    • You don't like doing mutexes? Okay.
    • You don't like growing the graph all the way? Okay.
  • They allow propagation of many types of information: level, subgoal interaction, time, cost, world support, probability
  • They support phased relaxation, e.g. ignore mutexes and resources and bring them back later
  • The graph structure supports other synergistic uses, e.g. action selection
  • Versatility...

  49. Versatility of PG Heuristics
  • PG variations: serial, parallel, temporal, labelled
  • Propagation methods: level, mutex, cost, label
  • Planning problems: classical, resource/temporal, conformant
  • Planners: regression, progression, partial order, Graphplan-style
