Structured Models for Multi-Agent Interactions

Structured Models for Multi-Agent Interactions. Daphne Koller, Stanford University. Joint work with Brian Milch, U.C. Berkeley. Scaling Up. Question: Modeling and solving small games is already hard; how can we scale up to larger ones? Answer: Real-world situations have a lot of structure.

Presentation Transcript


  1. Structured Models for Multi-Agent Interactions. Daphne Koller, Stanford University. Joint work with Brian Milch, U.C. Berkeley

  2. Scaling Up • Question: • Modeling and solving small games is already hard • How can we scale up to larger ones? • Answer: • Real-world situations have a lot of structure • Otherwise people wouldn’t be able to handle them • Goal: construct • languages based on structured representations, allowing compact models of complex situations • algorithms that exploit this structure to support effective reasoning

  3. Representations of Games • Normal form • basic units: strategies • game representation loses all structure • matrix size exponentially larger than game tree • Extensive form • basic units: events • game structure explicitly encodes time, information • game tree size can still be very large [Figure: payoff matrix indexed by strategies of player I and strategies of player II]

  4. Representation & Inference • Minimax linear program for two-player zero-sum games [Romanovskii, 1962; Koller, Megiddo & von Stengel, 1994] • Applied to abstract 2-player poker [Koller + Pfeffer] [Plot: solution time (sec), 0 to 30, against size of tree, 0 to 20000, for normal form and sequence form]

  5. MAID Representation • MAID form • basic units: variables & dependencies between them • game structure explicitly encodes time, information, independence • can be exponentially smaller than game tree • game structure supports new forms of decomposition & backward induction • solving can be exponentially more efficient than extensive form [Figure: MAID with nodes Cost, Resource Allocation, A Sales Strategy, B Sales Strategy, Commission (×2), Sales-A, Sales-B, Revenue]

  6. Outline • Probabilistic Reasoning: Bayesian networks [Pearl, Jensen, …] • Influence Diagrams • Strategic Relevance • Exploiting Structure for Solving Games

  7. Probability Distributions • Probabilistic model (e.g., à la Savage): • set of possible states in which the world can be; • probability distribution over this space. • State: assignment of values to variables • diseases, symptoms, predisposing factors, … • Problem: • n variables $\Rightarrow$ $2^n$ states (or more); • representing the joint distribution is infeasible.

  8. Bayesian Network • nodes = random variables • edges = direct probabilistic influence • Nodes: Burglary, Earthquake, Alarm, Newscast, PhoneCall • P(A | B, E): a function $Val(B, E) \to \Delta(Val(A))$ • CPT for P(A | B, E), rows (B, E), columns (a, ¬a): (b, e): 0.8, 0.2; (b, ¬e): 0.6, 0.4; (¬b, e): 0.2, 0.8; (¬b, ¬e): 0.01, 0.99 • Network structure encodes conditional independencies: PhoneCall is independent of Burglary given Alarm

  9. BN Semantics: Probability Model • qualitative BN structure + local probability models = full joint distribution over domain • Compact & natural representation: • nodes have $\le k$ parents $\Rightarrow$ $2^k n$ vs. $2^n$ parameters • parameters natural and easy to elicit [Figure: network over B, E, A, C, N]
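
The parameter counts and the "structure + local models = joint" idea on this slide can be sketched in Python. This is only an illustration: the priors for Burglary and Earthquake are hypothetical (the transcript gives none), the alarm numbers follow an assumed reading of the table on slide 8, and the function names are made up for this sketch.

```python
from itertools import product

def full_joint_params(n):
    """Entries in an explicit joint table over n binary variables."""
    return 2 ** n

def bn_params(n, k):
    """Slide-style count when each of n binary nodes has at most k parents."""
    return n * 2 ** k

# Hypothetical priors (not from the slides).
P_B = {True: 0.01, False: 0.99}
P_E = {True: 0.02, False: 0.98}
# P(A = true | B, E); assumed reading of the slide-8 table.
P_A = {(True, True): 0.8, (True, False): 0.6,
       (False, True): 0.2, (False, False): 0.01}

def joint(b, e, a):
    """Chain rule for the fragment B -> A <- E: P(b) * P(e) * P(a | b, e)."""
    p_a = P_A[(b, e)] if a else 1.0 - P_A[(b, e)]
    return P_B[b] * P_E[e] * p_a

# The local models multiply out to a proper joint distribution.
total = sum(joint(b, e, a) for b, e, a in product([True, False], repeat=3))

print(bn_params(5, 2), full_joint_params(5))
print(round(total, 10))
```

For five binary variables with at most two parents each, the structured count is 20 against 32 for the explicit table; the gap widens exponentially as n grows with k fixed.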

  10. BN Semantics: Independencies • The graph structure of the BN implies a set of conditional independence assumptions • satisfied by every distribution over this graph • Burglary and Earthquake independent • Burglary and Call independent given Alarm • Newscast and Alarm independent given Earthquake [Figure: network over B, E, A, C, N, one copy per statement]

  11. BN Semantics: Dependencies • BN structure also specifies potential dependencies • those that might hold for some distribution over graph • Burglary and Earthquake dependent given Alarm [Figure: network over B, E, A, C, N]

  12. Active paths • Probabilistic influence “flows” along “active” paths • “d-separation” if there is no active path • B, C can be dependent • B, C are independent given A • B, C can be dependent given A, D • Simple linear-time algorithm for testing conditional independence using only graphical structure: • Sound: d-separation $\Rightarrow$ independence for all P • Complete: no d-separation $\Rightarrow$ dependence for almost all P [Figure: network over A, B, C, D]
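
A graph-only independence test of this kind can be sketched as follows. Note the swap: this is the moralized-ancestral-graph test, chosen here for brevity, not the linear-time algorithm the slide refers to. The edge list is the usual alarm network and is an assumption, since the transcript lists only node names.

```python
def ancestors(parents, nodes):
    """Return `nodes` together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen

def d_separated(parents, x, y, given):
    """True if x and y are d-separated by `given` in the DAG `parents`."""
    given = set(given)
    keep = ancestors(parents, {x, y} | given)
    # Moralize the ancestral subgraph: link child-parent and co-parents.
    adj = {v: set() for v in keep}
    for v in keep:
        ps = list(parents.get(v, ()))
        for i, p in enumerate(ps):
            adj[v].add(p)
            adj[p].add(v)
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # Drop observed nodes, then test undirected reachability from x to y.
    seen, stack = set(), [x]
    while stack:
        v = stack.pop()
        if v == y:
            return False
        if v in seen or v in given:
            continue
        seen.add(v)
        stack.extend(adj[v])
    return True

# Assumed edges: Burglary -> Alarm <- Earthquake -> Newscast, Alarm -> PhoneCall.
alarm = {"Alarm": ["Burglary", "Earthquake"],
         "Newscast": ["Earthquake"],
         "PhoneCall": ["Alarm"]}

print(d_separated(alarm, "PhoneCall", "Burglary", {"Alarm"}))
print(d_separated(alarm, "Burglary", "Earthquake", set()))
print(d_separated(alarm, "Burglary", "Earthquake", {"Alarm"}))
```

Under the assumed edges, the three calls correspond to the statements on slides 10 and 11: the first two pairs are separated, while observing Alarm activates the path between Burglary and Earthquake.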

  13. Bayesian network system diagnoses better than doctor who designed it • CPCS: $2^{1000}$ states

  14. Bayesian Networks • Explicit representation of domain structure • Cognitively intuitive compact models of complex domains • Same model allows relevant probabilities to be computed in any evidence state • Algorithms that exploit structure for effective inference even in very large models

  15. Outline • Probabilistic Reasoning: Bayesian networks • Influence Diagrams [Howard, Shachter, Jensen, …] • Strategic Relevance • Exploiting Structure for Solving Games

  16. Example: The Tree Killer • Alice wants a patio, but the benefit outweighs the cost only if she gets an ocean view • Bob’s tree blocks her view • Alice chooses whether to poison the tree • Tree may become sick • Bob chooses whether to call a tree doctor • Alice can see whether tree doctor comes • Alice chooses whether to build her patio • Tree may die when winter comes

  17. Standard Representation: Game Tree • Levels: Poison Tree? Tree Sick? Call Tree Doctor? Build Patio? Tree Dead? • 5 levels; $2^5 = 32$ terminal nodes

  18. Multi-Agent Influence Diagrams (MAIDs) • Influence diagram representation easily extended to multiple agents • “Tree killer” example [Figure: MAID with nodes Spike Tree, TreeSick, Tree Doctor, Build Patio, TreeDead, Cost, View, Tree]

  19. Decision Nodes • Incoming edges are information edges • variables whose values the agent knows when deciding • agent’s strategy can depend on values of parents • Each parent instantiation $u \in Val(Parents(D))$ is an information set • Perfect recall: if $D_1$ precedes $D_2$ • at $D_2$ agent remembers: • his decision at $D_1$ • everything he knew at $D_1$ • formally: $\{D_1, Parents(D_1)\} \subseteq Parents(D_2)$ • usually perfect recall edges are implicit, not drawn [Figure: nodes Spike Tree, TreeSick, Tree Doctor, Build Patio]

  20. Strategies • Strategy $\sigma$ at D: • A pure (deterministic) strategy specifies an action at D for every information set u • A behavior strategy specifies a distribution over actions for every u • Strategy $\sigma$ specifies distribution P(D | Parents(D)) • turns a decision node into a chance node • information parents play exactly the same role as parents of chance node

  21. MAID Semantics • MAID M defines a set of possible strategy profiles • M plus any strategy profile $\sigma$ defines a BN $M[\sigma]$ • Each decision node D becomes a chance node, with $\sigma[D]$ as its CPD • $M[\sigma]$ defines a probability distribution, from which we can derive an expected utility $EU_a(\sigma)$ for each agent a • Thus, a MAID defines a mapping from strategy profiles to expected utility vectors
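
The mapping from a strategy profile to an expected utility vector can be sketched by brute-force enumeration. Everything in this example is hypothetical: a three-node MAID loosely inspired by the Tree Killer story, with made-up probabilities and payoffs, and helper names invented for the sketch.

```python
from itertools import product

# Hypothetical MAID: chance node Sick, Bob's decision Doctor (observes Sick),
# Alice's decision Build (observes Doctor). All numbers are illustrative.
ORDER = ["Sick", "Doctor", "Build"]
VALUES = {"Sick": [0, 1], "Doctor": [0, 1], "Build": [0, 1]}
PARENTS = {"Sick": (), "Doctor": ("Sick",), "Build": ("Doctor",)}
CHANCE = {"Sick": lambda pa: {0: 0.7, 1: 0.3}}

def utilities(w):
    """Utility of each agent in world w (a dict of variable values)."""
    alice = 3 * w["Build"] * w["Sick"] - w["Build"]
    bob = -2 * w["Sick"] * (1 - w["Doctor"]) - w["Doctor"]
    return {"Alice": alice, "Bob": bob}

def expected_utilities(profile):
    """Use each decision rule as a CPD, then sum utilities over the induced BN."""
    cpds = {**CHANCE, **profile}
    eu = {"Alice": 0.0, "Bob": 0.0}
    for vals in product(*(VALUES[v] for v in ORDER)):
        w = dict(zip(ORDER, vals))
        p = 1.0
        for v in ORDER:
            pa = tuple(w[u] for u in PARENTS[v])
            p *= cpds[v](pa).get(w[v], 0.0)
        for agent, u in utilities(w).items():
            eu[agent] += p * u
    return eu

# A pure strategy profile: Bob calls the doctor iff the tree is sick;
# Alice builds iff no doctor came.
profile = {
    "Doctor": lambda pa: {pa[0]: 1.0},
    "Build": lambda pa: {1 - pa[0]: 1.0},
}
eu = expected_utilities(profile)
print(eu)
```

Enumeration is exponential in the number of variables; it is used here only to make the semantics concrete, not as an inference method.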

  22. Readability [Figure: MAID with nodes P1 Hand, P2 Hand, Flop Cards, Card 4, Bet (×9), U]

  23. Compactness: “Road” example [Figure: MAID with nodes Suitability, Building, and Util for each of 1W, 1E, 2W, 2E, 3W, 3E]

  24. Compactness • Assume all variables have three values • Each decision node observes three variables • Number of information sets per agent: $3^3 = 27$ • Size of MAID: $54n$ • n chance nodes of “size” 3 • n decision nodes of “size” 27·3 • Size of game tree: $3^{2n}$ • 2n splits, each over three values • Size of normal (matrix) form: $(3^{27})^n$ • n players, each with $3^{27}$ pure strategies
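
To get a feel for how quickly the three representations diverge, the size formulas from this slide can be evaluated directly. The function names are made up for this sketch, and the MAID figure simply reuses the 54n quoted on the slide.

```python
def maid_size(n):
    """MAID size as quoted on the slide: 54 per agent."""
    return 54 * n

def tree_size(n):
    """Game tree: 2n splits, each over three values."""
    return 3 ** (2 * n)

def matrix_size(n):
    """Normal form: n players, each with 3**27 pure strategies."""
    return (3 ** 27) ** n

for n in (1, 2, 4):
    print(n, maid_size(n), tree_size(n), matrix_size(n))
```

The MAID grows linearly in n, while the tree and the matrix grow exponentially, which is the point of the slide.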

  25. Outline • Probabilistic Reasoning: Bayesian networks • Influence Diagrams • Strategic Relevance • Exploiting Structure for Solving Games

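  26. Optimality and Equilibrium • Let E be a subset of $D_a$, the decisions of agent a, and let $\sigma_E$ be a partial strategy over E • Is $\sigma_E$ the best partial strategy for agent a to adopt? • Depends on decision rules for other decision nodes • $\sigma_E$ is optimal for a strategy profile $\sigma$ if for all partial strategies $\sigma'_E$ over E: $EU_a((\sigma_{-E}, \sigma_E)) \ge EU_a((\sigma_{-E}, \sigma'_E))$, where $EU_a$ is agent a's expected utility and $\sigma_{-E}$ is $\sigma$ restricted to the decisions outside E • A strategy profile $\sigma$ is a Nash equilibrium if for every agent a, $\sigma_{D_a}$ is optimal for $\sigma$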

  27. MAIDs and Games • A MAID is equivalent to a game tree: it defines a mapping from strategy profiles to payoff vectors • Finding equilibria in the MAID is equivalent to finding equilibria in the game tree • One way to find equilibrium in MAID: • construct the game tree • solve the game • Incurs exponential blowup in representation size • Question: can we find equilibria in a MAID directly?

  28. Local Optimization • Consider finding a decision rule for a single decision node D that is optimal for $\sigma$ • For each instantiation pa of Pa(D), must find the distribution P* over D that maximizes the agent's expected utility given pa • Some decision rules in $\sigma$ may not affect this maximization problem

  29. Strategic Relevance • Intuitively, D relies on D’ if we need to know the decision rule for D’ in order to determine the optimal decision rule for D. • We define a relevance graph, with: • a node for each decision • an edge from D to D’ if D relies on D’ [Figure: edge from D to D’]

  30. Examples I: Information [Figure: four panels over decisions D, D’ and utility nodes U, titled “don’t care”, “simultaneous move”, “perfect info”, “perfect enough”]

  31. Examples II: Simple Card Game • Bet2 relies on Bet1 even though Bet2 observes Bet1 • Bet2 can depend on Deal • Deal influences U • Need probability model of Bet2 to derive posterior on Deal and compute expectation over U • Decision D can require D’ even if D’ is observed at D! [Figure: MAID over Deal, Bet1, Bet2, U, with relevance graph over Bet1, Bet2]

  32. Examples III: Decoupled Utilities • Bet2 relies on Bet1 even without influence on utility • Bet2 can depend on Deal • Deal influences U • Need probability model of Bet2 to derive posterior on Deal and compute expectation over U [Figure: MAID over Deal, Bet1, Bet2 with two utility nodes U, and relevance graph over Bet1, Bet2]

  33. Examples IV: Tree Killer [Figure: MAID with nodes Poison Tree, TreeSick, Tree Doctor, Build Patio, TreeDead, Cost, View, Tree, and relevance graph over Poison Tree, Tree Doctor, Build Patio]

  34. s-Reachability • Given: D relies on D’ (D’ relevant to D) • Then there exists a utility node U such that the CPD of D’ influences P(U | D, Pa(D)) • D’ is s-reachable from D if there is some utility node U among the descendants of D such that, if a new parent $\hat{D}'$ were added to D’, there would be an active path from $\hat{D}'$ to U given D and Pa(D).

  35. s-Reachability • Nodes that D relies on are the nodes that are s-reachable from D. • Theorem: s-reachability is sound & complete for strategic relevance • Sound: no s-reachability $\Rightarrow$ strategic irrelevance $\forall$ P, U • Complete: s-reachability $\Rightarrow$ relevance for some P, U • Theorem: Can build the relevance graph in quadratic time using only structure of MAID
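
A structure-only construction of the relevance graph can be sketched as below. This is a naive version, not the quadratic-time procedure of the theorem: it adds a hypothetical extra parent (named `_new_parent` here) to D’ and runs a moralized-ancestral-graph d-separation test. The card-game edges and the `utils_of` map (utility nodes owned by the agent deciding at each node) are assumptions made for illustration.

```python
def ancestors(parents, nodes):
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen

def d_separated(parents, x, y, given):
    """Moralized-ancestral-graph test for d-separation of x and y."""
    keep = ancestors(parents, {x, y} | set(given))
    adj = {v: set() for v in keep}
    for v in keep:
        ps = list(parents.get(v, ()))
        for i, p in enumerate(ps):
            adj[v].add(p)
            adj[p].add(v)
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    seen, stack = set(), [x]
    while stack:
        v = stack.pop()
        if v == y:
            return False
        if v not in seen and v not in given:
            seen.add(v)
            stack.extend(adj[v])
    return True

def descendants(parents, d):
    children = {}
    for v, ps in parents.items():
        for p in ps:
            children.setdefault(p, []).append(v)
    seen, stack = set(), [d]
    while stack:
        for c in children.get(stack.pop(), ()):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def s_reachable(parents, utils_of, d, d_prime):
    """True if d_prime is s-reachable from d."""
    g = {v: list(ps) for v, ps in parents.items()}
    g[d_prime] = g.get(d_prime, []) + ["_new_parent"]
    given = {d} | set(parents.get(d, ()))
    targets = utils_of[d] & descendants(parents, d)
    return any(not d_separated(g, "_new_parent", u, given) for u in targets)

def relevance_graph(parents, decisions, utils_of):
    """Edge d -> e means d relies on e."""
    return {d: {e for e in decisions
                if e != d and s_reachable(parents, utils_of, d, e)}
            for d in decisions}

utils_of = {"Bet1": {"U"}, "Bet2": {"U"}}
# Assumed card game: Deal -> Bet1 -> Bet2, and Deal, Bet1, Bet2 -> U.
card = {"Bet1": ["Deal"], "Bet2": ["Bet1"], "U": ["Deal", "Bet1", "Bet2"]}
# Same game without the hidden Deal (perfect information).
perfect = {"Bet2": ["Bet1"], "U": ["Bet1", "Bet2"]}

print(relevance_graph(card, ["Bet1", "Bet2"], utils_of))
print(relevance_graph(perfect, ["Bet1", "Bet2"], utils_of))
```

Under these assumed edges the card game yields a cycle (each bet relies on the other), while removing the hidden Deal leaves only the edge from Bet1 to Bet2.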

  36. Outline • Probabilistic Reasoning: Bayesian networks • Influence Diagrams • Strategic Relevance • Exploiting Structure for Solving Games

  37. Intuition: Backward Induction • Case 1: D’ observes D • Can optimize decision rule at D’ without knowing decision rule at D • Having optimized D’, can optimize D • Case 2: D doesn’t care about D’ • Can optimize decision rule at D without knowing decision rule at D’ • Having optimized D, can optimize D’ [Figure: the two MAIDs over D, D’, U with their relevance graphs]

  38. Generalized Backward Induction • Idea: Solve decisions by order of relevance graph • Generalized Backward Induction: • Choose decision node D that relies on no other • Find optimal strategy for D by maximizing its local expected utility • Replace D by chance node [Figure: MAIDs over D, D’, U with their relevance graphs]

  39. Finding Equilibria: Acyclic Relevance Graphs • Choose any strategy profile $\sigma$ for $D_1, \dots, D_{n-1}$ • Derive decision rule $\delta$ for $D_n$ that is optimal for $\sigma$ • Node $D_n$ does not rely on preceding ones • $\delta$ is optimal for any other strategy profile as well! • We can now set $\delta$ as CPD for $D_n$ • And continue by optimizing $D_{n-1}$ [Figure: relevance graph over $D_1$, $D_2$, …, $D_{n-1}$, $D_n$]

  40. Generalized Backward Induction • Given topological sort $D_1, \dots, D_n$ of relevance graph: • Begin with arbitrary fully mixed strategy profile $\sigma$ • For i = n down to 1: • Find decision rule $\delta$ for $D_i$ that is optimal for $\sigma$ • Decision rules at previous decisions fixed earlier • Decision rules at subsequent decisions irrelevant • Let $\sigma(D_i) = \delta$ • Theorem: If the relevance graph of a MAID is acyclic, it can be solved by generalized backward induction, and the result is a pure-strategy Nash equilibrium
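
The loop on this slide can be sketched on a toy example. The MAID below is hypothetical (made-up probabilities and payoffs, loosely modeled on the Tree Killer story), the expectation is computed by brute-force enumeration, and the ordering passed in is assumed to come from an acyclic relevance graph in which Build relies on Doctor and Doctor relies on nothing.

```python
from itertools import product

ORDER = ["Sick", "Doctor", "Build"]
VALUES = {v: [0, 1] for v in ORDER}
PARENTS = {"Sick": (), "Doctor": ("Sick",), "Build": ("Doctor",)}
AGENT = {"Doctor": "Bob", "Build": "Alice"}  # who decides at each decision node

def p_sick(pa):
    return {0: 0.7, 1: 0.3}  # hypothetical prior

def utility(agent, w):
    if agent == "Alice":  # building pays off only if the tree is sick and untreated
        return w["Build"] * (3 * w["Sick"] * (1 - w["Doctor"]) - 1)
    return -2 * w["Sick"] * (1 - w["Doctor"]) - w["Doctor"]  # Bob

def score(cpds, d, pa, action):
    """Unnormalized expected utility, for d's agent, of `action` at information set pa."""
    total = 0.0
    for vals in product(*(VALUES[v] for v in ORDER)):
        w = dict(zip(ORDER, vals))
        if w[d] != action or tuple(w[u] for u in PARENTS[d]) != pa:
            continue
        p = 1.0
        for v in ORDER:
            if v != d:  # d's own rule is what we are optimizing
                p *= cpds[v](tuple(w[u] for u in PARENTS[v])).get(w[v], 0.0)
        total += p * utility(AGENT[d], w)
    return total

def backward_induction(relevance_order):
    """relevance_order lists decisions so that each relies only on later ones."""
    uniform = lambda pa: {0: 0.5, 1: 0.5}  # arbitrary fully mixed starting rules
    cpds = {"Sick": p_sick, "Doctor": uniform, "Build": uniform}
    rules = {}
    for d in reversed(relevance_order):
        table = {}
        for pa in product(*(VALUES[u] for u in PARENTS[d])):
            table[pa] = max(VALUES[d], key=lambda a: score(cpds, d, pa, a))
        rules[d] = table
        # The optimized decision node now behaves like a chance node.
        cpds[d] = lambda pa, t=table: {t[pa]: 1.0}
    return rules

rules = backward_induction(["Build", "Doctor"])
print(rules)
```

In this toy game Bob treats the tree exactly when it is sick, so Alice, anticipating that rule, never builds; each rule is a pure best response to the rules fixed before it.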

  41. When is the Relevance Graph Acyclic? • Single-agent influence diagrams with perfect recall • Multi-agent games with perfect information • Some games with imperfect information • e.g., Tree Killer example But in many MAIDs the relevance graph has cycles…

  42. Cyclic Relevance Graphs • Question: What if the relevance graph is cyclic? • Strongly connected component (SCC): • maximal subgraph s.t. $\exists$ directed path between every pair of nodes • The decisions in each SCC require each other • They must be optimized together • Different SCCs can be solved separately
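
Finding the SCCs of a relevance graph is a standard graph computation; a sketch using Tarjan's algorithm follows. The four-decision graph is hypothetical (an edge D -> D' means D relies on D'). Tarjan's algorithm emits a component only after every component it points to, so with this edge direction the output order is a valid order in which to solve the components.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; components come out sinks-first."""
    index, low = {}, {}
    stack, on_stack, sccs = [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return sccs

# Hypothetical relevance graph with two cycles linked by one edge.
relevance = {"D1": ["D2"], "D2": ["D1", "D3"], "D3": ["D4"], "D4": ["D3"]}
components = strongly_connected_components(relevance)
print(components)
```

Here D3 and D4 form a component that relies on nothing outside itself, so it is solved first; D1 and D2 are then solved with the rules for D3 and D4 already fixed. The recursive form is fine for small graphs; very large ones would need an iterative variant.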

  43. Generalized Backward Induction • Given topological sort $C_1, \dots, C_m$ of SCCs in relevance graph: • Begin with arbitrary fully mixed strategy profile $\sigma$ • For i = m down to 1: • Construct reduced MAID $M[\sigma_{-C_i}]$ • Strategies for previous SCCs selected before • Strategies for subsequent SCCs irrelevant • Create game tree for $M[\sigma_{-C_i}]$ • Use game solver to find equilibrium strategy profile $\tau$ for $C_i$ in this reduced game • Let $\sigma(C_i) = \tau$ • Theorem: If we find an equilibrium for each SCC, the result is an equilibrium for the whole game

  44. “Road” Relevance Graph [Figure: relevance graph over decisions 1W, 1E, 2W, 2E, 3W, 3E] • Note: Reduced games over SCCs are not subgames!

  45. Experiment: “Road” Example • Reminder, for n = 4: • Tree size: 6561 nodes • Matrix size: $4.7 \times 10^{27}$ • For n = 40: • Tree size: $1.47 \times 10^{38}$ nodes

  46. Cutting Cycles • Idea: enumerate possible values d for some decision D • Once we determine D, residual MAID has acyclic relevance graph • Solve residual MAID using generalized backward induction • Check whether combined strategy with d is an equilibrium • May need to instantiate several decision nodes to cut cycle • Can deal with each SCC separately • Theorem: Can find all pure strategy equilibria in time linear in # of SCCs, exponential in max # of decisions required to cut all loops in component

  47. Irrelevant Information • What if B can observe A’s decision? • completely irrelevant to him • We can automatically • analyze relevance based on graph structure • eliminate irrelevant information edges • In associated tree, safe merging of information sets • Leads to exponential decrease in # of decisions to optimize in influence diagram! [Figure: MAID with nodes Cost, Resource, A Sales, B Sales, Commission (×2), Sales-A, Sales-B, Revenue]

  48. Related Work • Suryadi and Gmytrasiewicz (1999) use multi-agent influence diagrams, but with recursive modeling • Milch and Koller (2000) use the MAID representation described here, but have no algorithm for finding equilibria • Nilsson and Lauritzen (2000) discuss limited memory influence diagrams (LIMIDs) and derive s-reachability, but do not apply it to multi-agent case • La Mura (2000) proposes game networks, with an undirected notion of strategic dependence

  49. Future Work • Take advantage of structure within SCCs • Represent asymmetric scenarios compactly • Detect irrelevant observations

  50. Computational Game Theory • Game theory: Past • Simplified examples • small enough to be analyzed by hand • Expert analysis of: • “Prototypical” examples that highlight key issues • Abstracted problems for big organizations • Game theory: Future • Complex problems: • many relevant variables • interacting decisions • Autonomous agents interacting economically • Decision support systems for consumers • Goals: Make game theory • a broadly usable tool even for lay people • a formal basis for interacting autonomous agents • by allowing real-world games to be easily represented • and solved.
