Multiagent Systems

Multiagent Systems Extensive Form Games

Extensive Form Games • Normal form games cannot represent • the sequentiality of the agents’ decisions • multiple sequential decisions by one agent • the temporal structure of multiagent decisions • Extensive form games provide • an explicit representation of the temporal structure (protocol) of decisions • an explicit representation of multiple sequential decisions by an agent

Extensive Form Games • To capture the different amounts of information available in different scenarios, there are two main variants of extensive form games • Perfect Information Games • Each player knows the current state in the decision making sequence and is aware of all decisions that the other agents have made • Imperfect Information Games • Some parts of the game look identical to an agent, which cannot tell which of them it is in • In extensive form games the decision making sequence is represented as a decision tree

Perfect Information Games • A perfect information game in extensive form is defined as a tuple (N, A, H, Z, χ, ρ, σ, u): • N is the set of n agents • A is the set of actions • H is the set of non-terminal choice (decision) nodes • Z is the set of terminal nodes • χ: H → 2^A indicates the actions available in each node • ρ: H → N indicates which agent makes the decision in a given node • σ: H × A → H ∪ Z is the successor function indicating the next node in the game • u = (u_1, …, u_n) is the vector of utility functions u_i: Z → ℝ, one per agent

Perfect Information Games • The sharing game in extensive form • Two siblings receive two presents • One sibling decides how to share them • The second sibling decides whether to accept the shares or to decline the presents
[Figure: game tree of the sharing game. Agent 1 chooses the split 2-0, 1-1 or 0-2; agent 2 then answers yes or no. Leaf payoffs: (2,0), (1,1) and (0,2) after yes, (0,0) after no.]
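The tuple definition and the sharing game can be sketched in code as follows. This is only an illustrative encoding: the node names (`root`, `after 2-0`, ...) and the helper `play` are assumptions introduced here, not part of the slides.

```python
# Hypothetical encoding of the sharing game; node names are illustrative.
SPLITS = ("2-0", "1-1", "0-2")

N = (1, 2)                                      # agents
A = set(SPLITS) | {"yes", "no"}                 # actions
H = {"root"} | {f"after {s}" for s in SPLITS}   # decision nodes

# chi: actions available in each decision node
chi = {"root": set(SPLITS)}
chi.update({f"after {s}": {"yes", "no"} for s in SPLITS})

# rho: the agent who decides in each node
rho = {"root": 1}
rho.update({f"after {s}": 2 for s in SPLITS})

# sigma: successor function; terminal nodes are (split, answer) pairs
sigma = {("root", s): f"after {s}" for s in SPLITS}
for s in SPLITS:
    for answer in ("yes", "no"):
        sigma[(f"after {s}", answer)] = (s, answer)

Z = set(sigma.values()) - H                     # terminal nodes


def u(z):
    """Utility vector (u_1, u_2) at terminal node z."""
    split, answer = z
    if answer == "no":
        return (0, 0)
    first, second = split.split("-")
    return (int(first), int(second))


def play(actions):
    """Follow sigma from the root along a sequence of actions."""
    node = "root"
    for action in actions:
        assert action in chi[node], f"{action} is not available in {node}"
        node = sigma[(node, action)]
    return node


print(u(play(["1-1", "yes"])))
```

Storing χ, ρ and σ as dictionaries keeps the encoding close to the formal definition; a tree of objects would work equally well.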

Pure Strategies in Perfect Information Games • A pure strategy for agent i in a perfect information game is a complete specification of the (deterministic) actions the agent will take in each decision node associated with the agent • Strategies have to include action choices even for nodes that can not be encountered under the strategy

Pure Strategies in Perfect Information Games • Pure strategies for agent 1: (2-0), (1-1), (0-2) • Pure strategies for agent 2 (one answer per possible split): (yes,yes,yes), (yes,yes,no), (yes,no,yes), (yes,no,no), (no,yes,yes), (no,yes,no), (no,no,yes), (no,no,no)
[Figure: game tree of the sharing game. Agent 1 chooses the split 2-0, 1-1 or 0-2; agent 2 then answers yes or no.]
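Because a pure strategy fixes one action per decision node, the strategy sets above are Cartesian products of the action sets of an agent's nodes. A minimal sketch of that enumeration, with illustrative node names that are assumptions of this example:

```python
from itertools import product

# Decision nodes of each agent with the actions available there
# (node names are illustrative, not from the slides).
nodes = {
    1: {"root": ["2-0", "1-1", "0-2"]},
    2: {
        "after 2-0": ["yes", "no"],
        "after 1-1": ["yes", "no"],
        "after 0-2": ["yes", "no"],
    },
}


def pure_strategies(agent):
    """All complete assignments of one action to each node of the agent."""
    names = list(nodes[agent])
    action_lists = [nodes[agent][name] for name in names]
    return [dict(zip(names, choice)) for choice in product(*action_lists)]


print(len(pure_strategies(1)), len(pure_strategies(2)))
```

The counts are 3 for agent 1 and 2 × 2 × 2 = 8 for agent 2, matching the lists on the slide.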

Pure Strategies in Perfect Information Games • Pure strategies for agent 1: (A,H), (A,I), (B,H), (B,I) • Note: (A,H) and (A,I) are pure strategies even though the decision between H and I after A never has to be taken • Pure strategies for agent 2: (C,E,G), (C,F,G), (D,E,G), (D,F,G)
[Figure: game tree with agent 1 actions A, B, H, I, agent 2 actions C, D, E, F, G, and leaf payoffs (1,1), (5,2), (3,2), (2,1), (1,0).]

Strategies and Equilibria • Solution strategies can be defined as in normal form games: • Mixed strategies are defined by a probability distribution over pure strategies • Best responses for agent i are strategies that lead to optimal utilities in the context of the strategies of the other agents • A Nash equilibrium is a strategy profile in which each agent’s strategy is a best response to the other agents’ strategies in the profile

Nash Equilibria in Perfect Information Extensive Form Games • Every perfect information game in extensive form has a pure strategy Nash equilibrium • Since the agents decide sequentially and are aware of all prior decisions, randomizing cannot hide a decision from the other agents, so nothing is gained over a deterministic action choice • Every perfect information game in extensive form can be converted into normal form • The reverse is not true: a normal form game has simultaneous moves, whereas a perfect information extensive form game requires each agent to know all prior, sequential decisions

Induced Normal Form • Pure strategies for agent 1: (A,H), (A,I), (B,H), (B,I) • Note: (A,H) and (A,I) are pure strategies even though the decision between H and I after A never has to be taken • Pure strategies for agent 2: (C,E,G), (C,F,G), (D,E,G), (D,F,G)
[Figure: game tree with agent 1 actions A, B, H, I, agent 2 actions C, D, E, F, G, and leaf payoffs (1,1), (5,2), (3,2), (2,1), (1,0).]

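Induced Normal Form • Pure strategy Nash equilibria: • (A,G),(C,F) • (A,H),(C,F) • (B,H),(C,E)
[Figure: game tree. Agent 1 chooses A or B. After A, agent 2 chooses C with payoff (3,8) or D with payoff (8,3). After B, agent 2 chooses E with payoff (5,5) or F, after which agent 1 chooses G with payoff (2,10) or H with payoff (1,0).]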
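The equilibria listed on this slide can be checked by building the induced normal form and testing every cell for profitable deviations. The sketch below assumes the tree structure read off the flattened figure (A leads to agent 2 choosing C or D; B leads to agent 2 choosing E or F; F leads to agent 1 choosing G or H); the node names and the tuple encoding are illustrative.

```python
from itertools import product

# Assumed tree: a decision node is (agent, name, {action: child});
# a leaf is a payoff tuple (u_1, u_2).
TREE = (1, "root", {
    "A": (2, "after A", {"C": (3, 8), "D": (8, 3)}),
    "B": (2, "after B", {
        "E": (5, 5),
        "F": (1, "after F", {"G": (2, 10), "H": (1, 0)}),
    }),
})


def is_decision(node):
    return isinstance(node[-1], dict)


def decision_nodes(node, acc=None):
    """Collect {agent: {node name: [actions]}} for the whole tree."""
    acc = {1: {}, 2: {}} if acc is None else acc
    if is_decision(node):
        agent, name, children = node
        acc[agent][name] = list(children)
        for child in children.values():
            decision_nodes(child, acc)
    return acc


def pure_strategies(agent_nodes):
    names = list(agent_nodes)
    return [dict(zip(names, c)) for c in product(*agent_nodes.values())]


def outcome(node, profile):
    """Payoff reached when both agents follow their pure strategies."""
    while is_decision(node):
        agent, name, children = node
        node = children[profile[agent][name]]
    return node


nodes = decision_nodes(TREE)
S1, S2 = pure_strategies(nodes[1]), pure_strategies(nodes[2])

# Induced normal form: one payoff cell per pair of pure strategies.
payoff = {(i, j): outcome(TREE, {1: s1, 2: s2})
          for i, s1 in enumerate(S1) for j, s2 in enumerate(S2)}


def is_nash(i, j):
    u1, u2 = payoff[i, j]
    return (all(payoff[k, j][0] <= u1 for k in range(len(S1)))
            and all(payoff[i, k][1] <= u2 for k in range(len(S2))))


equilibria = sorted(("".join(S1[i].values()), "".join(S2[j].values()))
                    for (i, j) in payoff if is_nash(i, j))
print(equilibria)
```

The five leaves of the tree become 4 × 4 = 16 cells in the induced normal form, since every pair of pure strategies needs its own entry.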

Induced Normal Form • Using the induced normal form, all techniques from normal form games can be used • The extensive form is more compact than the induced normal form • The induced normal form has to represent more utility values • Some of the Nash equilibria are counterintuitive • E.g. (B,H),(C,E): why would agent 1 ever play H? • H is a threat that deters agent 2 from playing F • Is this threat credible?

Subgames and Subgame Perfect Equilibria • A subgame of a game in extensive form is defined by a subtree rooted in a node in H • A subgame perfect equilibrium is a Nash equilibrium whose restriction to every subgame is also a Nash equilibrium of that subgame • Nash equilibria with non-credible threats are not subgame perfect • Every perfect information game in extensive form has at least one subgame perfect Nash equilibrium

Computing Subgame Perfect Equilibria • Backward induction can be used to compute a subgame perfect equilibrium for n-player general-sum games • Starting with the smallest subgames, propagate to the root of each subtree the utility vector that maximizes the utility of the agent deciding at that node • For the equilibrium strategy the agents take the best action (the one that links to the maximum value) at each node • In zero-sum games this is the common minimax algorithm
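A minimal recursive sketch of backward induction is shown below. The example tree is the one assumed from the earlier slide with equilibria (A,G),(C,F) etc.; its structure, the node names and the tuple encoding are assumptions of this example, and ties are broken in favor of the first action listed.

```python
# Assumed tree: a decision node is (agent, name, {action: child});
# a leaf is a payoff tuple with one entry per agent.
TREE = (1, "root", {
    "A": (2, "after A", {"C": (3, 8), "D": (8, 3)}),
    "B": (2, "after B", {
        "E": (5, 5),
        "F": (1, "after F", {"G": (2, 10), "H": (1, 0)}),
    }),
})


def backward_induction(node, strategy=None):
    """Return (utility vector, {(agent, node name): chosen action})."""
    strategy = {} if strategy is None else strategy
    if not isinstance(node[-1], dict):          # leaf: nothing to decide
        return node, strategy
    agent, name, children = node
    best_action, best_value = None, None
    for action, child in children.items():
        value, _ = backward_induction(child, strategy)
        # Keep the action whose propagated vector is best for the
        # agent deciding here (agents are numbered from 1).
        if best_value is None or value[agent - 1] > best_value[agent - 1]:
            best_action, best_value = action, value
    strategy[(agent, name)] = best_action
    return best_value, strategy


value, strategy = backward_induction(TREE)
print(value, strategy)
```

On this tree the procedure selects G after F, so the threat H does not survive, and the resulting profile is (A,G),(C,F).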

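The Centipede Problem • Subgame perfect equilibrium: (E,E,E),(E,E) • The outcome of this strategy profile is Pareto dominated by all but one other outcome
[Figure: centipede game. Agents 1, 2, 1, 2, 1 move in turn, each choosing C or E. Playing E ends the game with payoff (1,0), (0,2), (3,1), (2,4) or (4,3) at the first to fifth node; if every agent plays C the payoff is (3,5).]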
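Both claims on this slide can be checked with a short loop. The sketch assumes the node order 1, 2, 1, 2, 1 and the payoff assignment read off the flattened figure; the list encoding is specific to this example.

```python
# Assumed from the flattened figure: agents move in the order 1, 2, 1, 2, 1.
movers = [1, 2, 1, 2, 1]
# Payoff if the mover plays E at the first, second, ... node.
exit_payoffs = [(1, 0), (0, 2), (3, 1), (2, 4), (4, 3)]
# Payoff if every agent plays C.
final_payoff = (3, 5)

# Backward induction from the last node to the first.
value = final_payoff
choices = []
for mover, exit_payoff in zip(reversed(movers), reversed(exit_payoffs)):
    if exit_payoff[mover - 1] > value[mover - 1]:
        choices.append("E")
        value = exit_payoff
    else:
        choices.append("C")
choices.reverse()


def dominates(x, y):
    """True if outcome x Pareto dominates outcome y."""
    return (all(a >= b for a, b in zip(x, y))
            and any(a > b for a, b in zip(x, y)))


others = [p for p in exit_payoffs + [final_payoff] if p != value]
dominating = [p for p in others if dominates(p, value)]
print(choices, value, dominating)
```

Every node chooses E, the equilibrium outcome is (1,0), and four of the five other outcomes Pareto dominate it; only (0,2) does not.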

Imperfect Information Games • Imperfect information games handle situations where agents do not have complete knowledge of the stage of the game or of the decisions the other agents are taking • Imperfect information is modeled by grouping nodes of the decision tree into information sets • Different nodes in the same information set cannot be distinguished • Unknown actions of other agents or incomplete knowledge of the stage of the game lead to non-distinguishable nodes

Imperfect Information Game • An imperfect information game in extensive form is defined as a tuple (N, A, H, Z, χ, ρ, σ, u, I): • (N, A, H, Z, χ, ρ, σ, u) define a perfect information game • I = (I_1, …, I_n) is the vector of the information sets I_i of agent i, defining the sets of indistinguishable nodes for this agent • I_i = (I_i,1, …, I_i,k_i) is a partition of the nodes assigned to agent i, where nodes in the same element of the partition (equivalence class) are indistinguishable for agent i and therefore offer the same actions

Pure Strategies in Imperfect Information Games • A pure strategy for agent i in an imperfect information game is a complete specification of the (deterministic) actions the agent will take in each information class • Strategies have to include action choices even for information classes (and thus nodes) that can not be encountered under the strategy

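Imperfect Information Game • Prisoners’ Dilemma • Pure strategies for agent 1: (C), (S) • Pure strategies for agent 2: (C), (S)
[Figure: game tree. Agent 1 chooses S or C; agent 2, whose two nodes form a single information set, chooses S or C. Leaf payoffs: (-1,-10), (-10,-1), (-5,-5), (-3,-3).]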
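The effect of information sets on the strategy space can be made concrete: a pure strategy picks one action per information set, not per node. A minimal sketch, in which the information set names and node names are assumptions of this example:

```python
from itertools import product

# agent -> {information set: (nodes in the set, actions available there)}
# Names are illustrative; agent 2's two nodes share one information set.
info_sets = {
    1: {"I_1,1": (["root"], ["S", "C"])},
    2: {"I_2,1": (["after S", "after C"], ["S", "C"])},
}


def pure_strategies(agent):
    """One action per information set of the agent."""
    names = list(info_sets[agent])
    action_lists = [info_sets[agent][name][1] for name in names]
    return [dict(zip(names, choice)) for choice in product(*action_lists)]


def count_if_perfect_information(agent):
    """Strategy count if every node were its own information set."""
    count = 1
    for nodes, actions in info_sets[agent].values():
        count *= len(actions) ** len(nodes)
    return count


print(len(pure_strategies(2)), count_if_perfect_information(2))
```

Agent 2 has 2 pure strategies here, whereas it would have 4 if it could observe agent 1's move.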

Strategies and Equilibria • All solution strategies can be defined as in perfect information games • As in perfect information games, every imperfect information game can be converted into a normal form game • Every normal form game can be converted into an imperfect information game • For two players: let agent 1 move first and put all nodes of agent 2 into the same information class

Randomized Strategies • In imperfect information games we can define a second way to generate randomized strategies • Mixed strategies: randomization over pure strategies • Behavioral strategies: strategies containing independent randomization over the actions in each information set

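Mixed and Behavioral Strategies • Mixed strategy example for agent 1: • (0.6:(A,G); 0.4:(B,H)) • Behavioral strategy example for agent 1: • ([0.5:A;0.5:B],[0.3:G;0.7:H])
[Figure: game tree. Agent 1 chooses A or B. After A, agent 2 chooses C with payoff (3,8) or D with payoff (8,3). After B, agent 2 chooses E with payoff (5,5) or F, after which agent 1 chooses G with payoff (2,10) or H with payoff (1,0).]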
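A behavioral strategy randomizes independently at each node, so it induces a product distribution over pure strategies. The sketch below computes that distribution for the behavioral example on this slide; the node names `root` and `after F` are assumptions of this example.

```python
from itertools import product

# Behavioral strategy of agent 1 from the slide: one distribution per node.
behavioral = {
    "root": {"A": 0.5, "B": 0.5},
    "after F": {"G": 0.3, "H": 0.7},
}


def to_mixed(behavioral):
    """Product distribution over pure strategies induced by independent
    randomization at each node."""
    nodes = list(behavioral)
    mixed = {}
    for combo in product(*(behavioral[n].items() for n in nodes)):
        actions = tuple(action for action, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        mixed[actions] = prob
    return mixed


for strategy, prob in to_mixed(behavioral).items():
    print(strategy, round(prob, 2))
```

The mixed strategy on the slide, (0.6:(A,G); 0.4:(B,H)), is not of this product form: it correlates the choice at the root with the choice between G and H.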

Randomized Strategies • In general, the expressive power of mixed and behavioral strategies is not comparable • In some games there are outcomes that can be achieved using mixed strategies but not using behavioral strategies • In some games there are outcomes that can be achieved using behavioral strategies but not using mixed strategies

Behavioral Strategy Example • Pure strategies: • Agent 1: (L), (R); Agent 2: (U), (D) • Mixed strategy equilibrium: • R,D • Behavioral strategy equilibrium: • [98/198:L;100/198:R],D
[Figure: game tree. Agent 1 chooses L or R at two nodes that share one information set; agent 2 chooses U or D. Leaf payoffs: (100,100), (5,1), (1,0), (2,2).]
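The probability 98/198 can be reproduced under an assumed reading of the flattened figure: agent 1 moves at the root and again after L without remembering its first move, L then L yields (1,0), L then R yields (100,100), and after R agent 2 chooses U for (5,1) or D for (2,2). With agent 2 playing D, agent 1's expected utility is a quadratic in p = P(L):

```python
from fractions import Fraction


def u1(p):
    """Expected utility of agent 1 when it plays L with probability p at
    both of its indistinguishable nodes and agent 2 plays D.
    Assumed payoffs for agent 1: L,L -> 1; L,R -> 100; R,D -> 2."""
    return p * (p * 1 + (1 - p) * 100) + (1 - p) * 2


# u1(p) = -99 p^2 + 98 p + 2; the derivative -198 p + 98 vanishes at 98/198.
p_star = Fraction(98, 198)

# Exact grid search as a sanity check on the closed form.
grid = [Fraction(k, 1980) for k in range(1981)]
best = max(grid, key=u1)

print(best, u1(best), u1(Fraction(0)), u1(Fraction(1)))
```

Under these assumptions the randomized behavioral strategy gives agent 1 an expected utility of 2599/99, about 26.25, compared with 2 for the pure strategy R and 1 for the pure strategy L.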

Perfect Recall • A player in an imperfect information game has perfect recall if it never forgets anything it knew about the moves made so far • Any two paths that lead to nodes in the same information set of player i must pass through the same sequence of information classes, and the actions taken by agent i along the two paths must be the same

Perfect Recall • Formally: for any two nodes h, h’ in the same information class of player i, and for every pair of paths h_0, a_0, …, h_n, a_n, h and h_0, a’_0, …, h’_m, a’_m, h’: • m = n • for all j, h_j and h’_j are in the same information class for player i • for all j, ρ(h_j) = i → a_j = a’_j • A game of perfect recall is an imperfect information game in which every agent has perfect recall
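A simplified check of perfect recall is sketched below. It uses a common reformulation rather than the exact path condition above: for each agent, all nodes in one information set must be reached after the same sequence of that agent's own information sets and actions. The two example games, their node names and the dictionary encoding are assumptions of this example.

```python
# node -> (deciding agent, information set, parent node, action at parent)
forgetful_game = {          # assumed shape of the behavioral strategy example
    "root":    (1, "I1", None, None),
    "after L": (1, "I1", "root", "L"),
    "after R": (2, "I2", "root", "R"),
}
prisoners_dilemma = {
    "root":    (1, "I1", None, None),
    "after S": (2, "I2", "root", "S"),
    "after C": (2, "I2", "root", "C"),
}


def experience(game, node, agent):
    """Sequence of (information set, action) pairs that the agent itself
    went through on the path from the root to the node."""
    seq = []
    while game[node][2] is not None:
        _, _, parent, action = game[node]
        if game[parent][0] == agent:
            seq.append((game[parent][1], action))
        node = parent
    return seq[::-1]


def has_perfect_recall(game, agent):
    seen = {}
    for node, (owner, info_set, _, _) in game.items():
        if owner != agent:
            continue
        exp = experience(game, node, agent)
        # All nodes of one information set must share the same experience.
        if seen.setdefault(info_set, exp) != exp:
            return False
    return True


print(has_perfect_recall(forgetful_game, 1),
      has_perfect_recall(prisoners_dilemma, 2))
```

In the first game agent 1 fails the check because one node of its information set is reached after it has already played L and the other is not; in the Prisoners’ Dilemma agent 2 has made no earlier move, so the check passes.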

Games of Perfect Recall • (Kuhn, 1953): In a game of perfect recall, any mixed strategy of a given agent can be replaced by an equivalent behavioral strategy, and any behavioral strategy can be replaced by an equivalent mixed strategy. • In games of perfect recall, Nash equilibria can be found in the form of behavioral strategies

Equilibria for Games of Perfect Recall • One option is to convert the game to normal form and solve the normal form game • The normal form can be exponentially larger than the extensive form • In games of perfect recall we can use the sequence form to accelerate the solution by avoiding this increase in size • Instead of strategies, use the action sequences of the agents on the path to a terminal node and realization probabilities (representing the probabilities of reaching the terminal nodes under the strategy) • Zero-sum games can be solved in time polynomial in the size of the extensive form game • General-sum games can be solved in time exponential in the size of the extensive form game
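The realization probabilities mentioned above can be illustrated for a single agent. The sketch derives them from a behavioral strategy: the empty sequence has probability 1, and extending a sequence by an action multiplies by that action's probability. The information sets, their names and the probabilities are assumptions taken from the earlier A/B, G/H example, not a full sequence form solver.

```python
# Agent 1's information sets:
# name -> (agent 1's own action sequence leading there, behavioral distribution)
info_sets = {
    "root":    ((), {"A": 0.5, "B": 0.5}),
    "after F": (("B",), {"G": 0.3, "H": 0.7}),
}


def realization_plan(info_sets):
    """Map each action sequence of the agent to its realization probability."""
    plan = {(): 1.0}
    # Shorter sequences first, so each parent sequence is already in the plan.
    for seq, dist in sorted(info_sets.values(), key=lambda item: len(item[0])):
        for action, prob in dist.items():
            plan[seq + (action,)] = plan[seq] * prob
    return plan


plan = realization_plan(info_sets)
for seq, prob in plan.items():
    print(seq, round(prob, 2))
```

The plan has one entry per sequence (five here) and satisfies linear constraints such as r(A) + r(B) = r(∅) and r(B,G) + r(B,H) = r(B), which is what lets the sequence form work with linear programs over sequences instead of enumerating pure strategies.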
