Games, Times, and Probabilities: Value Iteration in Verification and Control. Krishnendu Chatterjee, Tom Henzinger
Graph Models of Systems: vertices = states; edges = transitions; paths = behaviors
Extended Graph Models
• OBJECTIVE: $\omega$-automaton; $\omega$-regular game
• CONTROL: game graph; stochastic game graph
• PROBABILITIES: Markov decision process; stochastic hybrid system
• CLOCKS: timed automaton
Graphs vs. Games [figure: diagrams with edges labeled a and b]
Games model Open Systems
• Two players: environment / controller / input vs. system / plant / output
• Multiple players: processes / components / agents
• Stochastic players: nature / randomized algorithms
Example
P1: init x := 0
    loop
      choice
      | x := x+1 mod 2
      | x := 0
      end choice
    end loop
$\varphi_1$: $(x = y)$
P2: init y := 0
    loop
      choice
      | y := x
      | y := x+1 mod 2
      end choice
    end loop
$\varphi_2$: $(y = 0)$
Graph Questions (CTL): $\forall\,(x = y)$; $\exists\,(x = y)$. [figure: graph over states 00, 01, 10, 11; one answer marked X]
Zero-Sum Game Questions (ATL [Alur/H/Kupferman]): $\langle\!\langle P_1 \rangle\!\rangle\,(x = y)$; $\langle\!\langle P_2 \rangle\!\rangle\,(y = 0)$. [figure: game graph over states 00, 01, 10, 11; one answer marked X]
Nonzero-Sum Game Questions (secure equilibria [Chatterjee/H/Jurdzinski]): $\langle\!\langle P_1 \rangle\!\rangle\,(x = y)$; $\langle\!\langle P_2 \rangle\!\rangle\,(y = 0)$. [figure: game graph over states 00, 01, 10, 11]
Strategies
• Strategies $x, y: Q^* \to Q$.
• From a state $q$, a pair $(x, y)$ of a player-1 strategy $x \in \Sigma_1$ and a player-2 strategy $y \in \Sigma_2$ gives a unique infinite path $\mathrm{Outcome}_{x,y}(q) \in Q^\omega$.
• $\langle\!\langle P_1 \rangle\!\rangle \varphi_1 = (\exists x \in \Sigma_1)(\forall y \in \Sigma_2)\; \varphi_1(x, y)$, short for: $q \models \langle\!\langle P_1 \rangle\!\rangle \varphi_1$ iff $(\exists x \in \Sigma_1)(\forall y \in \Sigma_2)\,(\mathrm{Outcome}_{x,y}(q) \models \varphi_1)$.
• $\langle\!\langle P_1 \rangle\!\rangle \varphi_1 \;\, \langle\!\langle P_2 \rangle\!\rangle \varphi_2 = (\exists x \in \Sigma_1)(\exists y \in \Sigma_2)\,[\,(\varphi_1 \wedge \varphi_2)(x, y) \;\wedge\; (\forall y' \in \Sigma_2)\,(\varphi_2 \to \varphi_1)(x, y') \;\wedge\; (\forall x' \in \Sigma_1)\,(\varphi_1 \to \varphi_2)(x', y)\,]$
Objectives $\varphi_1$ and $\varphi_2$
• Qualitative: reachability; Büchi; parity ($\omega$-regular)
• Quantitative: max; lim sup; lim avg
Normal Forms of $\omega$-Regular Sets
• Borel-1: Reachability $\Diamond a$; Safety $\Box a = \neg\Diamond\neg a$
• Borel-2: Büchi $\Box\Diamond a$; coBüchi $\Diamond\Box a = \neg\Box\Diamond\neg a$
• Borel-2.5: Streett $\bigwedge (\Box\Diamond a \to \Box\Diamond b) = \bigwedge (\Diamond\Box\neg a \vee \Box\Diamond b)$; Rabin $\bigvee (\Diamond\Box a \wedge \Box\Diamond b)$; Parity: complement-closed subset of Streett/Rabin
Büchi Game G [figure: game graph over states q0, q1, q2, q3, q4, with a label B]
• Secure equilibrium $(x, y)$ at $q_0$:
• $x$: if $q_1 \to q_0$, then $q_2$ else $q_4$. $y$: if $q_3 \to q_1$, then $q_0$ else $q_4$.
• Strategies require memory.
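Zero-Sum Games: Determinacy
• $\varphi_1 = \neg\varphi_2$
• $W_1$: $\langle\!\langle P_1 \rangle\!\rangle \varphi_1$; $W_2$: $\langle\!\langle P_2 \rangle\!\rangle \varphi_2$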
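Nonzero-sum Games
• $W_{00}$
• $W_{01}$: $\langle\!\langle P_2 \rangle\!\rangle (\varphi_2 \wedge \neg\varphi_1)$
• $W_{10}$: $\langle\!\langle P_1 \rangle\!\rangle (\varphi_1 \wedge \neg\varphi_2)$
• $W_{11}$: $\langle\!\langle P_1 \rangle\!\rangle \varphi_1 \;\, \langle\!\langle P_2 \rangle\!\rangle \varphi_2$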
Objectives
• Qualitative: reachability; Büchi; parity ($\omega$-regular)
• Quantitative: max (Borel-1); lim sup (Borel-2); lim avg (Borel-3)
Quantitative Games
• $\langle\!\langle P_1 \rangle\!\rangle\, \limsup = 3$
• $\langle\!\langle P_1 \rangle\!\rangle\, \operatorname{lim\,avg} = 1$
• [figure: game graph with weights 2, 4, 0, 0, 2, 3, 2, 0, 4]
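For a play that is ultimately periodic (a finite prefix followed by a cycle repeated forever), both quantitative objectives can be read off the cycle alone. The following is a minimal Python sketch under that assumption; the helper names `lim_sup` and `lim_avg` and the weights are hypothetical, and the sketch does not reproduce the game values 3 and 1 above, which depend on the edges of the figure's game graph.

```python
from fractions import Fraction
from typing import Sequence

def lim_sup(prefix: Sequence[int], cycle: Sequence[int]) -> int:
    """lim sup of prefix . cycle^omega: only weights seen infinitely often count."""
    return max(cycle)

def lim_avg(prefix: Sequence[int], cycle: Sequence[int]) -> Fraction:
    """Long-run average of prefix . cycle^omega: the finite prefix washes out."""
    return Fraction(sum(cycle), len(cycle))

# Hypothetical play: weights 2, 4 once, then 0, 2 repeated forever.
prefix, cycle = [2, 4], [0, 2]
print(lim_sup(prefix, cycle))  # 2
print(lim_avg(prefix, cycle))  # 1
```

In a game, each player picks a strategy to push this value up or down; the sketch only evaluates a single resulting play.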
Solving Games by Value Iteration
• Generalization of the $\mu$-calculus: computing fixpoints of transfer functions (pre; post).
• Generalization of dynamic programming: iterative optimization.
• Region $R: Q \to V$; update $R(q) := \mathrm{pre}(R(q'))$. [figure: edge from q to q′, with q′ labeled R(q′)]
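Graph
• $Q$: states
• $\Lambda$: transition labels
• $\delta: Q \times \Lambda \to Q$: transition function
• regions $[\,Q \to \{0,1\}\,]$ with $V = \mathbb{B}$
• $\exists\mathrm{pre}$: $q \in \exists\mathrm{pre}(R)$ iff $(\exists \lambda \in \Lambda)\; \delta(q, \lambda) \in R$
• $\forall\mathrm{pre}$: $q \in \forall\mathrm{pre}(R)$ iff $(\forall \lambda \in \Lambda)\; \delta(q, \lambda) \in R$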
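The two predecessor operators can be sketched directly from their definitions. This is an illustrative Python sketch, not the authors' implementation: the function names, the dictionary encoding of the transition function, and the three-state graph are all assumptions, and the transition function is assumed to be total.

```python
from typing import Dict, Set, Tuple

State = str
Label = str
Delta = Dict[Tuple[State, Label], State]  # total: one entry per (state, label)

def exists_pre(delta: Delta, labels: Set[Label], region: Set[State]) -> Set[State]:
    """States with SOME label whose transition lands in `region`."""
    states = {q for (q, _) in delta}
    return {q for q in states if any(delta[(q, l)] in region for l in labels)}

def forall_pre(delta: Delta, labels: Set[Label], region: Set[State]) -> Set[State]:
    """States for which EVERY label's transition lands in `region`."""
    states = {q for (q, _) in delta}
    return {q for q in states if all(delta[(q, l)] in region for l in labels)}

# Hypothetical graph over states a, b, c with labels l and r.
LABELS = {"l", "r"}
DELTA = {
    ("a", "l"): "a", ("a", "r"): "b",
    ("b", "l"): "c", ("b", "r"): "c",
    ("c", "l"): "c", ("c", "r"): "c",
}

print(sorted(exists_pre(DELTA, LABELS, {"b", "c"})))  # ['a', 'b', 'c']
print(sorted(forall_pre(DELTA, LABELS, {"b", "c"})))  # ['b', 'c']
```

State a is in the existential predecessor because label r leads to b, but not in the universal one because label l loops back to a.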
Graph [figure: graph over states a, b, c]
• $\exists\Diamond c = (\mu X)\,(c \vee \exists\mathrm{pre}(X))$
• $\forall\Diamond c = (\mu X)\,(c \vee \forall\mathrm{pre}(X))$
Graph Reachability
• Given $R \subseteq Q$, find the states from which some path leads to $R$.
• $\exists\Diamond R = (\mu X)\,(R \vee \exists\mathrm{pre}(X))$
• Iterates: $R$; $R \cup \mathrm{pre}(R)$; $R \cup \mathrm{pre}(R) \cup \mathrm{pre}^2(R)$; ...
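Graph Reachability
• Given $R \subseteq Q$, find the states from which all paths lead to $R$.
• $\forall\Diamond R = (\mu X)\,(R \vee \forall\mathrm{pre}(X))$
• Iterates: $R$; $R \cup \mathrm{pre}(R)$; $R \cup \mathrm{pre}(R) \cup \mathrm{pre}^2(R)$; ...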
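Both reachability fixpoints can be evaluated by iterating from the empty set until nothing changes. The sketch below is a hedged illustration in Python: the names `lfp` and `reach`, the successor-set encoding of the graph, and the four-state example are assumptions, and every state is assumed to have at least one successor.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Set

State = str

def lfp(step: Callable[[FrozenSet[State]], Iterable[State]]) -> FrozenSet[State]:
    """Iterate a monotone step function from the empty set until it stabilizes."""
    x: FrozenSet[State] = frozenset()
    while True:
        nxt = frozenset(step(x))
        if nxt == x:
            return x
        x = nxt

def reach(succ: Dict[State, Set[State]], target: Set[State],
          some_path: bool) -> FrozenSet[State]:
    """(mu X)(target or pre(X)), with pre existential or universal."""
    quant = any if some_path else all

    def step(x: FrozenSet[State]) -> Set[State]:
        pre = {q for q, ss in succ.items() if quant(s in x for s in ss)}
        return target | pre

    return lfp(step)

# Hypothetical graph: a can loop or move on to b; d is a sink outside the target.
SUCC = {"a": {"a", "b"}, "b": {"c"}, "c": {"c"}, "d": {"d"}}

print(sorted(reach(SUCC, {"c"}, some_path=True)))   # ['a', 'b', 'c']
print(sorted(reach(SUCC, {"c"}, some_path=False)))  # ['b', 'c']
```

State a reaches c along some path but not along all paths, since it may loop on itself forever. On a finite graph the loop terminates because each iterate contains the previous one.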
Value Iteration Algorithms consist of
• LOCAL PART: $\exists\mathrm{pre}$ and $\forall\mathrm{pre}$ computation
• GLOBAL PART: evaluation of a fixpoint expression
We need to generalize both parts to solve games.
Turn-based Game
• $Q_1, Q_2$: states ($Q = Q_1 \cup Q_2$)
• $\Lambda$: transition labels
• $\delta: Q \times \Lambda \to Q$: transition function
• regions $[\,Q \to \{0,1\}\,]$ with $V = \mathbb{B}$
• $1\mathrm{pre}$: $q \in 1\mathrm{pre}(R)$ iff $q \in Q_1 \wedge (\exists \lambda \in \Lambda)\; \delta(q, \lambda) \in R$, or $q \in Q_2 \wedge (\forall \lambda \in \Lambda)\; \delta(q, \lambda) \in R$
• $2\mathrm{pre}$: $q \in 2\mathrm{pre}(R)$ iff $q \in Q_1 \wedge (\forall \lambda \in \Lambda)\; \delta(q, \lambda) \in R$, or $q \in Q_2 \wedge (\exists \lambda \in \Lambda)\; \delta(q, \lambda) \in R$
Turn-based Game [figure: game graph over states a, b, c]
• $\langle\!\langle P_1 \rangle\!\rangle \Diamond c = (\mu X)\,(c \vee 1\mathrm{pre}(X))$
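The same iteration solves turn-based reachability games once the graph predecessor is replaced by the controllable predecessor. A minimal Python sketch, assuming a hypothetical encoding (an `owner` map from states to player 1 or 2, successor sets instead of labeled transitions, every state with at least one successor) and a made-up six-state game rather than the one in the figure:

```python
from typing import Dict, Set

State = str

def pre1(owner: Dict[State, int], succ: Dict[State, Set[State]],
         region: Set[State]) -> Set[State]:
    """Player-1 states with SOME move into region; player-2 states with ALL moves."""
    return {q for q, ss in succ.items()
            if (owner[q] == 1 and any(s in region for s in ss))
            or (owner[q] == 2 and all(s in region for s in ss))}

def pre2(owner: Dict[State, int], succ: Dict[State, Set[State]],
         region: Set[State]) -> Set[State]:
    """Dual operator: the roles of the two players are swapped."""
    return {q for q, ss in succ.items()
            if (owner[q] == 2 and any(s in region for s in ss))
            or (owner[q] == 1 and all(s in region for s in ss))}

def win_reach_p1(owner: Dict[State, int], succ: Dict[State, Set[State]],
                 target: Set[State]) -> Set[State]:
    """Evaluate (mu X)(target or 1pre(X)) by iterating from the empty set."""
    x: Set[State] = set()
    while True:
        nxt = target | pre1(owner, succ, x)
        if nxt == x:
            return x
        x = nxt

# Hypothetical game: player 1 owns a and c, player 2 owns b, d, e, f.
OWNER = {"a": 1, "b": 2, "c": 1, "d": 2, "e": 2, "f": 2}
SUCC = {"a": {"b", "e"}, "b": {"a", "c"}, "c": {"c"},
        "d": {"d"}, "e": {"c"}, "f": {"c", "d"}}

print(sorted(win_reach_p1(OWNER, SUCC, {"c"})))  # ['a', 'b', 'c', 'e']
```

Player 1 wins from a by moving to e, where player 2 has no choice but to enter c; from f, player 2 escapes to the sink d. In this encoding `pre2` of the complement of a region equals the complement of `pre1` of the region.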