Game Theory Sequential bargaining and Repeated Games

Univ. Prof.dr. M.C.W. Janssen, University of Vienna, Winter semester 2010-11, Week 46 (November 14-15)

Sequential Bargaining
The ultimatum game is a sequential bargaining game with one round; its SPE is already known.
Now consider a sequential bargaining game with two rounds and alternating offers, in which players discount future pay-offs with δ.
SPE pay-offs are (1-δ, δ).
Player 2 can propose to keep everything in the last round, and this will be accepted. Thus, by refusing in the first round he can guarantee himself δ.
Player 1 must therefore offer him at least δ in the first round for 2 to accept, so player 1 can get at most 1-δ.

Alternating offers (Rubinstein, Stahl)
Same bargaining game, but with offers alternating over an infinite horizon. What are the equilibrium pay-offs?
Define v (v*) as the lowest (highest) SPE pay-off a player can get when making an offer.
Because of the infinite horizon and equal discount factors, the period 1 analysis is the same as the period 2 analysis.
v ≥ 1 - δv*: the lowest pay-off player 1 can guarantee himself is what remains after the highest discounted pay-off player 2 can guarantee himself in the next round.
v* ≤ 1 - δv: the highest pay-off player 1 can guarantee himself is what remains after the lowest discounted pay-off player 2 can guarantee himself in the next round.
Substituting one inequality into the other gives v ≥ 1/(1+δ) and v* ≤ 1/(1+δ). Since v ≤ v*, equalities have to hold: v = v* = 1/(1+δ).
Player 1 is better off because he makes the first proposal, but the advantage disappears as δ gets close to 1. Intuitive.
The first offer is such that it is immediately accepted! Why bother about the rest of the game? Because what happens after a refusal determines which first offer is accepted.
Unique subgame perfect equilibrium strategies.
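The value 1/(1+δ) can be checked numerically as the limit of finite-horizon games solved by backward induction. This is a minimal sketch, assuming a pie of size 1, a common discount factor, and a responder who accepts when indifferent; the function name `proposer_share` is illustrative and not part of the lecture.

```python
def proposer_share(rounds, delta):
    """Share kept by the first proposer in a game with `rounds` alternating offers."""
    share = 1.0  # last round: the proposer keeps everything (ultimatum game)
    for _ in range(rounds - 1):
        # One round earlier, the proposer must leave the responder delta * share,
        # the discounted value of being the proposer in the next round.
        share = 1.0 - delta * share
    return share


delta = 0.9
two_round = proposer_share(2, delta)       # two-round game: 1 - delta
long_horizon = proposer_share(500, delta)  # long horizon: close to 1 / (1 + delta)
print(two_round, long_horizon, 1 / (1 + delta))
```

With two rounds the sketch returns the pay-off 1-δ from the two-round slide, and as the number of rounds grows the first proposer's share settles at the infinite-horizon value.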

What if δ’s differ across players?
Period 1 analysis is similar to period 3 analysis, but no longer to period 2 analysis.
Define $v_i$ ($v_i^*$) as the lowest (highest) pay-off player $i$ can get if she makes an offer.
$v_1 \geq 1 - \delta_2 v_2^*$; by symmetry, the same thing holds for player 2: $v_2 \geq 1 - \delta_1 v_1^*$.
$v_1^* \leq 1 - \delta_2 v_2$; by symmetry, the same thing holds for player 2: $v_2^* \leq 1 - \delta_1 v_1$.
$v_1 \geq (1-\delta_2)/(1-\delta_1\delta_2)$ and $v_1^* \leq (1-\delta_2)/(1-\delta_1\delta_2)$.
Hence, equalities have to hold; there is an additional advantage for the player with the highest δ.
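The closed form can be illustrated with a short computation. This is a sketch under the slide's assumptions (pie of size 1, player 1 proposes first); the function name `rubinstein_shares` and the numerical discount factors are chosen only for illustration.

```python
def rubinstein_shares(delta1, delta2):
    """Equilibrium split (player 1, player 2) when player 1 makes the first offer."""
    v1 = (1 - delta2) / (1 - delta1 * delta2)
    return v1, 1 - v1


# Equal discount factors reproduce 1 / (1 + delta).
equal = rubinstein_shares(0.9, 0.9)

# The more patient player does better, whoever proposes first.
patient_proposer = rubinstein_shares(0.9, 0.5)
patient_responder = rubinstein_shares(0.5, 0.9)

# Consistency check: v1 = 1 - delta2 * v2, where v2 is what player 2
# would get as the proposer in the following round.
v2_as_proposer = (1 - 0.9) / (1 - 0.9 * 0.5)
print(equal, patient_proposer, patient_responder)
```

In the second call the proposer is the patient player and keeps roughly 0.91; in the third the responder is the patient one and the proposer keeps only about 0.18.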

Notation in repeated games
• Define the history of play as follows.
• Let $a^0 = (a^0_1, a^0_2, \dots, a^0_n)$ be the action profile that is played in stage 0, i.e., the actions played by all players
• History at the beginning of period 1: $h^1 = a^0$
• History at the beginning of stage $t+1$: $h^{t+1} = (a^0, \dots, a^t)$
• The set $H^t$ is the set of all possible histories $h^t$, $A_i(h^t)$ is the set of actions that player $i$ can choose after history $h^t$, and $A_i(H^t)$ is the union of this set over all possible histories
• Strategy $\sigma_i$ of player $i$ is a sequence of mappings $\{\sigma^k_i\}$ where each $\sigma^k_i$ maps $H^k$ to mixed actions.
• Note that you cannot condition on the random events

Subgame perfection and the one-stage deviation principle in finitely repeated games
One-stage deviation principle: no player can gain by deviating in a single period and then returning to the (equilibrium) strategy.
Formally: there is no player $i$ and strategy $s'_i$ that is equal to $s^*_i$ apart from the action in one period given one history $h$, such that $u_i(s'_i, s^*_{-i}) > u_i(s^*_i, s^*_{-i})$ given that history $h$.
Prop. In finite horizon games, a strategy combination $s^*$ is a SPE if, and only if, it satisfies the “one-stage deviation principle”.
Only if: clear, otherwise there is an immediate violation of the SPE definition.
If: suppose to the contrary that $s^*$ satisfies the principle but is not a SPE. Then there is a stage $t$ and a history $h^t$ such that at least one player $i$ has a strategy $s'_i$ with $s'_i(h^t) \neq s^*_i(h^t)$ that is a better response.
(Continued on the next slide.)

Proof of the one-stage deviation principle
Let $t'$ be the last period in which $s'_i(h^{t'}) \neq s^*_i(h^{t'})$.
Because of the one-stage deviation principle, $t' > t$.
Period $t'$ is defined such that for all $t'' > t'$, $s'_i(h^{t''}) = s^*_i(h^{t''})$.
Define then another strategy $s_i$ that coincides with $s'_i$ up to $t'$ and coincides with $s^*_i$ at $t'$ and afterwards.
Because of the one-stage deviation principle and since $s'_i(h^{t''}) = s_i(h^{t''})$ for all $t'' > t'$, $s_i$ is as good a response as $s'_i$ given history $h^t$.
If $t' = t+1$, then $s_i$ differs from $s^*_i$ in only one period, and therefore the one-stage deviation principle implies that $s_i$ cannot be strictly better, a contradiction.
If $t' > t+1$, a similar argument applies (details page 109).

Additional equilibria in repeated games
The main interest in repeated games is which equilibrium outcomes can be supported that cannot be supported in a static game.
Repetition of the static equilibrium is always an equilibrium of the repeated game; not so interesting.
Thus, what else? Consider an example.

A Static Game [figure: pay-off matrix]

Multiple Equilibria: Nash Equilibria [figure: pay-off matrix]

Can non-Nash outcomes of the static game be supported in equilibrium if the game is repeated twice?

Last period analysis
• In the last period they cannot choose (U,L),
• as both firms have an incentive to “cheat”: 16 is a higher pay-off than 12
• Punishment is not possible (as it is the last period)

First-period analysis
• But: in the first period they can choose (U,L)
• Strategy:
  - Choose “U (L)” in period 1
  - Choose “M (C)” in period 2 when the other chooses “L (U)” in period 1
  - Choose “B (R)” in period 2 when the other chooses something else in period 1
• Punishment is part of the strategy
• Is this an equilibrium? Is it a SPE?

Pay-offs in infinitely repeated games
• Overall pay-offs $u_i$; stage game pay-offs $g_i$; continuation pay-off from period $t$ onwards
• Want to have an expression where one can easily compare stage game pay-offs and repeated game pay-offs, i.e., normalisation: $u_i = (1-\delta)\sum_{t=0}^{\infty} \delta^t g_i(a^t)$
• Time averaging is sometimes used for the case of complete patience
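The point of the normalisation can be seen in a small computation. This sketch assumes the usual normalisation by the factor (1 - δ), truncates the infinite sum after a finite number of periods, and uses made-up stage pay-offs of 3 per period with a one-off pay-off of 5; none of these numbers come from the lecture.

```python
def normalised_payoff(stage_payoffs, delta):
    """(1 - delta) * sum_t delta^t * g_t, for a finite truncation of the stream."""
    return (1 - delta) * sum(delta**t * g for t, g in enumerate(stage_payoffs))


def time_average(stage_payoffs):
    """Plain average of the stage pay-offs (the 'complete patience' criterion)."""
    return sum(stage_payoffs) / len(stage_payoffs)


delta = 0.9
constant = [3.0] * 2000            # pay-off 3 in every period
one_off = [5.0] + [3.0] * 1999     # pay-off 5 once, then 3 forever after

constant_value = normalised_payoff(constant, delta)  # same scale as the stage pay-off
one_off_value = normalised_payoff(one_off, delta)    # gain of 2 enters with weight 1 - delta
one_off_average = time_average(one_off)              # gain almost vanishes in the average
print(constant_value, one_off_value, one_off_average)
```

A constant stream of 3 has normalised value 3, so repeated game pay-offs are on the same scale as stage game pay-offs, while a single period's gain of 2 only adds (1 - δ) times 2 under discounting and next to nothing under time averaging.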

Folk Theorem I
• If players are sufficiently patient, then any feasible, individually rational pay-offs can be enforced by an equilibrium
• Individually rational pay-offs: pay-offs above the minimax pay-off
• $\underline{v}_i = \min_{\alpha_{-i}} \max_{\alpha_i} g_i(\alpha_i, \alpha_{-i})$
• $m^i_j$ is the action player $j$ chooses to minimax player $i$
• Feasible pay-offs: the convex hull $V$ of the static game pay-offs, i.e., $V$ = convex hull $\{v \mid \text{there is an } a \in A \text{ such that } g(a) = v\}$
• Both terms need some explanation

Minimax pay-offs
• What are the Nash equilibria of this game?
• Denote by q the probability that player 2 chooses L
• In a mixed strategy eq. ⅓ ≤ q ≤ ⅔; pay-offs are 0 (player 1) and 1 (player 2)
• Minimax for player 1:
  - u(U) = -3q+1
  - u(M) = 3q-2
  - u(D) = 0
• Minimax is 0: for ⅓ ≤ q ≤ ⅔ neither U nor M yields more than 0
• Minimax for player 2 is also 0,
• by 1 choosing (½,½,0)
• Thus, minimax pay-offs can be lower than Nash eq. pay-offs
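Player 1's minimax pay-off can be verified directly from the three pay-off expressions on the slide. This is a sketch that searches over a grid of values for q with exact fractions; the grid size and the helper names are arbitrary choices made for illustration.

```python
from fractions import Fraction


def payoffs_player1(q):
    """Player 1's expected pay-offs from U, M and D when player 2 plays L with probability q."""
    return {"U": -3 * q + 1, "M": 3 * q - 2, "D": Fraction(0)}


def best_response_value(q):
    """What player 1 can secure against q by choosing his best action."""
    return max(payoffs_player1(q).values())


# Player 2 picks q to make player 1's best-response value as small as possible.
grid = [Fraction(i, 60) for i in range(61)]
minimax_value = min(best_response_value(q) for q in grid)
minimisers = [q for q in grid if best_response_value(q) == minimax_value]
print(minimax_value, minimisers[0], minimisers[-1])
```

The search returns a minimax value of 0, attained exactly for q between ⅓ and ⅔, which is the range stated on the slide.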

Feasible pay-offs
• Equilibrium pay-offs are (2,1), (1,2) and (⅔, ⅔)
• The convex hull of the eq. pay-offs is the triangle connecting the three points (it also contains e.g. (1½,1½))
• V connects (2,1), (1,2) and (0,0)
• But (1½,1½) cannot be obtained by independent mixing, only as a correlated eq.
• Correlated mixing can happen in a repeated setting by alternating between playing two equilibria (and time averaging pay-offs or δ close to 1)
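The last bullet can be illustrated numerically. The sketch below alternates between the (2,1) and (1,2) equilibria and compares time-averaged with discounted pay-offs; the horizon of 20000 periods, the value δ = 0.999 and the use of the normalisation factor (1 - δ) are assumptions made for the illustration.

```python
def normalised_payoff(stream, delta):
    """(1 - delta) * sum_t delta^t * g_t for a finite truncation of the stream."""
    return (1 - delta) * sum(delta**t * g for t, g in enumerate(stream))


periods = 20000  # even, so both equilibria are played equally often
# Stage pay-offs when play alternates between the (2,1) and (1,2) equilibria.
profile = [(2, 1) if t % 2 == 0 else (1, 2) for t in range(periods)]

time_average = tuple(sum(p[i] for p in profile) / periods for i in range(2))

delta = 0.999
discounted = tuple(
    normalised_payoff([p[i] for p in profile], delta) for i in range(2)
)
# Closed form for player 1 with an infinite horizon: (2 + delta) / (1 + delta).
closed_form_p1 = (2 + delta) / (1 + delta)
print(time_average, discounted, closed_form_p1)
```

Time averaging gives exactly (1½,1½); with discounting the player whose preferred equilibrium is played first gets slightly more, and the gap closes as δ approaches 1.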

Folk Theorem II
• Prop. For every feasible pay-off vector $v$ with $v_i > \underline{v}_i$ for all players $i$, there exists a $\underline{\delta} < 1$ such that for all $\delta \in (\underline{\delta}, 1)$ there exists a Nash equilibrium of the infinitely repeated game with pay-off $v$.
• Pay-offs in the repeated game can be not only larger, but also smaller than static Nash eq. pay-offs!!
• Basic idea: if players are sufficiently patient, then any finite gain from a one-period deviation is nothing compared to a small, but permanent loss in future pay-offs (punishment by minimaxing a player)

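“Proof” of the “Nash Folk Theorem”
• Consider a feasible pay-off $v$ and an action profile $a$ with $g(a) = v$
• If there is no action profile $a$ that yields $v$, you may choose a sequence of actions such that $v$ is (close to) the average (discounted) pay-off (or use a public randomization)
• Consider the strategy: start by playing $a_i$; play $a_i$ as long as the others do; if one player $j$ deviates, minimax him forever, i.e., choose $m^j_i$
• A deviation by player $i$ in period $t$ yields a normalised pay-off of at most $(1-\delta^t) v_i + \delta^t (1-\delta) \max_a g_i(a) + \delta^{t+1} \underline{v}_i$,
• which is smaller than $v_i$ if $\delta$ is larger than $\underline{\delta}_i$, where $\underline{\delta}_i$ solves $(1-\underline{\delta}_i) \max_a g_i(a) + \underline{\delta}_i \underline{v}_i = v_i$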
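The threshold discount factor in this argument can be computed for concrete numbers. The sketch below assumes the deviation pay-off is the sum of the target pay-off before the deviation, the best one-period deviation pay-off, and the minimax pay-off ever after, all normalised by (1 - δ); the numbers 5, 3 and 1 are hypothetical and do not come from the lecture's example.

```python
def critical_delta(best_deviation, target, minimax):
    """Solve (1 - d) * best_deviation + d * minimax = target for d."""
    return (best_deviation - target) / (best_deviation - minimax)


def deviation_payoff(delta, t, best_deviation, target, minimax):
    """Normalised pay-off from following the strategy until t, deviating at t,
    and being minimaxed from t + 1 onwards."""
    return (
        (1 - delta**t) * target
        + delta**t * (1 - delta) * best_deviation
        + delta ** (t + 1) * minimax
    )


# Hypothetical stage game numbers: deviating yields 5 once, cooperation yields 3,
# being minimaxed yields 1.
best_deviation, target, minimax = 5.0, 3.0, 1.0
threshold = critical_delta(best_deviation, target, minimax)

patient = deviation_payoff(0.6, 0, best_deviation, target, minimax)
impatient = deviation_payoff(0.4, 0, best_deviation, target, minimax)
print(threshold, patient, impatient)
```

With these numbers the threshold is 0.5: at δ = 0.6 deviating yields less than the target pay-off of 3, while at δ = 0.4 it yields more, so the strategy is only an equilibrium for sufficiently patient players.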

Is the threat of minimaxing credible?
• If we restrict the analysis to static “Nash threats”, then Friedman shows that only pay-offs larger than the static Nash equilibrium pay-offs can be supported
• Others show that in games where the minimax pay-offs are lower than the static equilibrium pay-offs, even worse outcomes can be compatible with a SPE of the infinitely repeated game.

Basic idea of SPE with minimax pay-offs: time averaging
• After a deviation, minimax the deviator for N periods, where N is chosen for all players such that the gain from a deviation is outweighed by the loss from N periods of punishment
• After N periods, return to the “cooperative” mode
• (finite) N ensures that no player has an incentive to deviate
• Cost of punishment is extremely small, as with time averaging pay-offs in a finite number of periods “do not make a difference”
• Average pay-off to player $j$ when $i$ is punished is $v_j$

Basic idea of SPE with minimax pay-offs: discounted pay-offs
• The previous strategies (for time-averaged pay-offs) do not work, as minimaxing another player may give a player a lower pay-off than his own minimax pay-off.
• Reward punishers, instead of punishing them if they don’t punish
• Choose a vector in the interior of V such that for each i it is still possible to give a higher pay-off.
• V needs to be of “full dimension”
• Play in three phases:
  - Initial cooperative phase
  - Punishment phase where players minimax the deviator j for N(j) periods (as before); switch to the punishment phase for player i if i deviates in one of the N(j) periods
  - Reward phase after the punishment phase is fully completed

Renegotiation proofness in repeated games
• Is SPE the best notion of a credible threat?
• Suppose you cooperate for some time in the PD and then someone defects, by chance. Should you go back immediately to always defect?
• Or should players “renegotiate”?
• It is in both players’ interest to revert to the cooperative outcome
• In any subgame the equilibrium played must not be Pareto-dominated.
• Pareto-optimality as an assumption and the critique that is possible (risk dominance and Pareto-dominance)
• Deviations are accidents and unlikely to be repeated? “Bygones are bygones”

Pareto perfection only applies in two-player games
[figure: pay-off matrices A and B]
• Two Nash equilibria in pure strategies: (U,L,A) and (D,R,B)
• (U,L,A) is Pareto-efficient
• Natural candidate?
• Suppose players 1 and 2 expect the matrix chooser to choose A. Then they can renegotiate and gain by playing (D,R)

Definition of Pareto perfect equilibrium
• Fix stage game g and play it for T periods; call the resulting game G(T).
• Let P(T) be the set of pay-offs of pure strategy SPE of G(T)
• R(t) is the set of strongly efficient points of P(t), i.e., the set of points such that there is no other pay-off point where no player is worse off and some player is better off.
• Set Q(1) = P(1)
• For any t, let Q(t) be the set of pay-offs of pure strategy SPE that can be enforced with continuation pay-offs in R(t-1)
• A SPE is Pareto perfect if for every possible history and in every time period t, the continuation pay-offs are in R(T-t)
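The "strongly efficient points" operator in this definition is easy to state as code. This sketch takes a finite list of pay-off vectors and keeps those that no other vector weakly improves on for everyone and strictly for someone; the function names are illustrative, and the extra point (2,2) is added only to show a dominated vector being dropped.

```python
def dominates(p, q):
    """True if no player is worse off under p than under q and some player is better off."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def strongly_efficient(points):
    """Points of the list that are not dominated by any other point in the list."""
    return [q for q in points if not any(dominates(p, q) for p in points)]


# The three stage game pay-off vectors used in the lecture's example slide.
stage_payoffs = [(4, 1), (1, 4), (3, 3)]
efficient = strongly_efficient(stage_payoffs)

# Adding a dominated vector: (2, 2) is worse for both players than (3, 3).
with_dominated = strongly_efficient(stage_payoffs + [(2, 2)])
print(efficient, with_dominated)
```

None of (4,1), (1,4) and (3,3) dominates another, so all three survive, whereas (2,2) is removed because (3,3) makes both players better off.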

Pareto perfection restricts threats
• Some efficient equilibria cannot be supported anymore under Pareto perfection
• It restricts the set of threats, and thereby makes it more difficult to keep players on the equilibrium path
• Example

Example Pareto perfection
• Three pure-strategy equilibria in G(1) with pay-offs (4,1), (1,4) and (3,3)
• In G(2) without discounting, a pay-off of 8 is possible. Unique element in R(2)
• Without the restriction to Pareto perfection, a pay-off of 13 is possible in G(3)
• With Pareto perfection, no threat is possible in the first period of G(3); one has to play a stage game equilibrium
• Equilibrium play alternates between odd and even periods under Pareto perfection
