Markov Chains
Markov Chains (1)
A Markov chain is a mathematical model for stochastic systems whose states, discrete or continuous, are governed by transition probabilities. Suppose the random variables $X_0, X_1, X_2, \dots$ take values in a state space $\Omega$ that is a countable set. A Markov chain is a process that can be pictured as a network whose nodes are the states and whose directed edges carry the transition probabilities.
Markov Chains (2)
The next state of a Markov chain depends only on the current state, not on the earlier history (the Markov property). The transition probability is
$$P(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \dots, X_0 = i_0) = P(X_{n+1} = j \mid X_n = i) = p_{ij},$$
where $p_{ij} \ge 0$ and $\sum_j p_{ij} = 1$ for every state $i$.
http://en.wikipedia.org/wiki/Markov_chain
http://civs.stat.ucla.edu/MCMC/MCMC_tutorial/Lect1_MCMC_Intro.pdf
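As a concrete illustration (not from the original slides), the following minimal Python sketch simulates a trajectory of a finite Markov chain: each step draws the next state from the row of the transition matrix indexed by the current state. The matrix `P` and the chain length are illustrative assumptions.

```python
import numpy as np

def simulate_chain(P, x0, n_steps, rng=None):
    """Simulate a finite Markov chain with transition matrix P, starting from state x0."""
    P = np.asarray(P)
    rng = np.random.default_rng() if rng is None else rng
    states = [x0]
    for _ in range(n_steps):
        # The next state is drawn from the row of P indexed by the current state.
        states.append(rng.choice(len(P), p=P[states[-1]]))
    return states

# Illustrative 3-state transition matrix; each row sums to 1.
P = np.array([[0.7, 0.3, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.3, 0.7]])
print(simulate_chain(P, x0=0, n_steps=10))
```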
An Example of Markov Chains
(Figure: a chain on the five states 1, 2, 3, 4, 5, drawn as a directed graph.) Here $X_0$ is the initial state, $X_1$ the state after one step, and so on; $P = (p_{ij})$ is the transition matrix.
Definition (1)
Define the probability of going from state $i$ to state $j$ in $n$ time steps as
$$p_{ij}^{(n)} = P(X_n = j \mid X_0 = i).$$
A state $j$ is accessible from state $i$ if there is an $n \ge 0$ such that $p_{ij}^{(n)} > 0$. A state $i$ is said to communicate with state $j$ (denoted $i \leftrightarrow j$) if it is true that both $i$ is accessible from $j$ and $j$ is accessible from $i$.
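Since the $n$-step probabilities are exactly the entries of the matrix power $P^n$, accessibility can be checked numerically. A minimal sketch (the helper name and the step bound are assumptions; for reachability on $k$ states, checking up to $n = k$ steps suffices):

```python
import numpy as np

def is_accessible(P, i, j, n_max=None):
    """Check whether state j is accessible from state i, i.e. (P^n)[i, j] > 0 for some n >= 0."""
    P = np.asarray(P)
    n_max = len(P) if n_max is None else n_max
    Pn = np.eye(len(P))  # P^0: in 0 steps, j is accessible from i iff i == j
    for _ in range(n_max + 1):
        if Pn[i, j] > 0:
            return True
        Pn = Pn @ P  # advance to the next matrix power
    return False
```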
Definition (2)
A state $i$ has period $d(i)$ if any return to state $i$ must occur in multiples of $d(i)$ time steps. Formally, the period of a state is defined as
$$d(i) = \gcd\{\, n \ge 1 : p_{ii}^{(n)} > 0 \,\}.$$
If $d(i) = 1$, then the state is said to be aperiodic; otherwise ($d(i) > 1$), the state is said to be periodic with period $d(i)$.
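The period can be approximated directly from this definition by taking the gcd of the return times observed up to some horizon; a sketch under the assumption that the horizon `n_max` is large enough to expose every possible return length:

```python
import numpy as np
from math import gcd
from functools import reduce

def period(P, i, n_max=50):
    """gcd of all return lengths n <= n_max with (P^n)[i, i] > 0 (0 if no return is seen)."""
    P = np.asarray(P)
    return_times = []
    Pn = P.copy()  # start at P^1
    for n in range(1, n_max + 1):
        if Pn[i, i] > 0:
            return_times.append(n)
        Pn = Pn @ P
    return reduce(gcd, return_times, 0)
```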
Definition (3)
A set of states $C$ is a communicating class if every pair of states in $C$ communicates with each other. Every state in a communicating class must have the same period. Example: (figure omitted)
Definition (4)
A finite Markov chain is said to be irreducible if its state space $\Omega$ is a single communicating class; this means that, in an irreducible Markov chain, it is possible to get to any state from any state. Example: (figure omitted)
Definition (5)
A finite-state irreducible Markov chain is said to be ergodic if its states are aperiodic. Example: (figure omitted)
Definition (6)
A state $i$ is said to be transient if, given that we start in state $i$, there is a non-zero probability that we will never return to $i$. Formally, let the random variable $T_i$ be the next return time to state $i$ (the "hitting time"):
$$T_i = \inf\{\, n \ge 1 : X_n = i \mid X_0 = i \,\}.$$
Then, state $i$ is transient iff
$$P(T_i < \infty) < 1,$$
i.e. iff $P(T_i = \infty) > 0$.
Definition (7)
A state $i$ is said to be recurrent or persistent iff return is certain:
$$P(T_i < \infty) = 1.$$
The mean recurrence time is $M_i = E[T_i]$. State $i$ is positive recurrent if $M_i$ is finite; otherwise, state $i$ is null recurrent. A state $i$ is said to be ergodic if it is aperiodic and positive recurrent. If all states in a Markov chain are ergodic, then the chain is said to be ergodic.
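Both quantities can be estimated by simulation: run many trajectories from state $i$ and record whether and when they first return. A minimal sketch (the cutoff `n_max` truncates trajectories, so it only approximates $P(T_i < \infty)$ and $E[T_i]$; all parameter values are illustrative):

```python
import numpy as np

def estimate_return(P, i, n_trials=10_000, n_max=1_000, seed=0):
    """Estimate P(T_i < infinity) and E[T_i] for state i by simulating first returns."""
    P = np.asarray(P)
    rng = np.random.default_rng(seed)
    return_times = []
    for _ in range(n_trials):
        state = i
        for n in range(1, n_max + 1):
            state = rng.choice(len(P), p=P[state])
            if state == i:
                return_times.append(n)  # first return time T_i = n
                break
    p_return = len(return_times) / n_trials
    mean_time = np.mean(return_times) if return_times else float("inf")
    return p_return, mean_time
```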
Stationary Distributions
Theorem: If a Markov chain is irreducible and aperiodic, then the limits
$$\lim_{n \to \infty} p_{ij}^{(n)} = \pi_j$$
exist and do not depend on the starting state $i$.
Theorem: If a Markov chain is irreducible and aperiodic, then
$$\pi_j = \sum_i \pi_i \, p_{ij} \quad \text{and} \quad \sum_j \pi_j = 1,$$
where $\pi = (\pi_1, \pi_2, \dots)$ is the stationary distribution.
Definition (8)
A Markov chain is said to be reversible if there is a stationary distribution $\pi$ such that
$$\pi_i \, p_{ij} = \pi_j \, p_{ji} \quad \text{for all } i, j$$
(the detailed balance condition).
Theorem: If a Markov chain is reversible, then
$$\pi_j = \sum_i \pi_i \, p_{ij},$$
i.e. $\pi$ is a stationary distribution.
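Detailed balance is easy to verify numerically; the following sketch checks it entrywise for a candidate distribution `pi` (the helper name and the tolerance are illustrative assumptions):

```python
import numpy as np

def is_reversible(P, pi, tol=1e-12):
    """Check detailed balance: pi_i * p_ij == pi_j * p_ji for all i, j."""
    flow = np.asarray(pi)[:, None] * np.asarray(P)  # flow[i, j] = pi_i * p_ij
    return np.allclose(flow, flow.T, atol=tol)
```

If the check passes, summing the detailed balance identity over $i$ reproduces the balance equation above, so $\pi$ is stationary.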
An Example of Stationary Distributions
A Markov chain on the three states 1, 2, 3 (figure: states 1 and 3 each have a self-loop with probability 0.7 and move to state 2 with probability 0.3; state 2 has a self-loop with probability 0.4 and moves to state 1 or state 3 with probability 0.3 each), i.e.
$$P = \begin{pmatrix} 0.7 & 0.3 & 0 \\ 0.3 & 0.4 & 0.3 \\ 0 & 0.3 & 0.7 \end{pmatrix}.$$
The stationary distribution is $\pi = (1/3, 1/3, 1/3)$.
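The stationary distribution can be recovered numerically as the left eigenvector of $P$ with eigenvalue 1, normalized to sum to 1. A minimal sketch using the matrix above:

```python
import numpy as np

P = np.array([[0.7, 0.3, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.3, 0.7]])

# pi P = pi means pi is a left eigenvector of P with eigenvalue 1,
# i.e. a right eigenvector of P transposed.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
pi /= pi.sum()  # normalize so the entries sum to 1
print(pi)       # approximately [1/3, 1/3, 1/3]
```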
Properties of Stationary Distributions
Regardless of the starting point, an irreducible and aperiodic Markov chain will converge to its stationary distribution. The rate of convergence depends on properties of the transition matrix (for example, its spectral gap).
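Convergence can be observed directly by propagating an arbitrary initial distribution through the chain; a short sketch using the example matrix from the previous slide:

```python
import numpy as np

P = np.array([[0.7, 0.3, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.3, 0.7]])

mu = np.array([1.0, 0.0, 0.0])  # start deterministically in state 1
for n in range(50):
    mu = mu @ P  # distribution after one more step
print(mu)        # approaches the stationary distribution [1/3, 1/3, 1/3]
```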
Markov Chain Monte Carlo
Markov chain Monte Carlo (MCMC) methods are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its stationary distribution. The state of the chain after a large number of steps is then used as a sample from the desired distribution.
http://en.wikipedia.org/wiki/MCMC
Metropolis-Hastings Algorithm (1)
• The Metropolis-Hastings algorithm can draw samples from any probability distribution $\pi(x)$, requiring only that a function proportional to the density can be calculated at $x$.
• Process in three steps:
  • Set up a Markov chain;
  • Run the chain until it is stationary;
  • Estimate $E[f(X)]$ with Monte Carlo methods.
• http://en.wikipedia.org/wiki/Metropolis-Hastings_algorithm
Metropolis-Hastings Algorithm (2)
• Let $\pi(x)$ be a probability density (or mass) function (pdf or pmf).
• $f(x)$ is any function and we want to estimate
$$E[f(X)] = \sum_x f(x)\,\pi(x).$$
• Construct the transition matrix $P$ of an irreducible Markov chain with states $1, 2, \dots, N$, where
$$p_{ij} = P(X_{t+1} = j \mid X_t = i)$$
and $\pi$ is its unique stationary distribution.
Metropolis-Hastings Algorithm (3)
• Run this Markov chain for $t = 1, \dots, N$ steps and calculate the Monte Carlo sum
$$\bar{f} = \frac{1}{N} \sum_{t=1}^{N} f(X_t);$$
then
$$\bar{f} \to E[f(X)] \quad \text{as } N \to \infty.$$
• Sheldon M. Ross (1997). Proposition 4.3. Introduction to Probability Models, 7th ed.
• http://nlp.stanford.edu/local/talks/mcmc_2004_07_01.ppt
Metropolis-Hastings Algorithm (4)
• In order to perform this method for a given distribution $\pi$, we must construct a Markov chain transition matrix $P$ with $\pi$ as its stationary distribution, i.e. $\pi P = \pi$.
• Consider a matrix $P$ made to satisfy the reversibility condition
$$\pi_i \, p_{ij} = \pi_j \, p_{ji}$$
for all $i$ and $j$.
• This property ensures that
$$\sum_i \pi_i \, p_{ij} = \pi_j$$
for all $j$, and hence $\pi$ is a stationary distribution for $P$.
Metropolis-Hastings Algorithm (5)
(Diagram: states sampled from $Q = (q_{ij})$ do not follow $\pi$; after tweaking, states sampled from $P = (p_{ij})$ do follow $\pi$.)
• Let a proposal matrix $Q = (q_{ij})$ be irreducible, where $q_{ij} = P(X_{t+1} = j \mid X_t = i)$, and the range of $Q$ is equal to the range of $\pi$.
• But $\pi$ does not have to be a stationary distribution of $Q$.
• Process: tweak $Q$ to yield $P$, whose stationary distribution is $\pi$.
Metropolis-Hastings Algorithm (6)
• We assume that $P$ has the form
$$p_{ij} = q_{ij}\,\alpha_{ij} \quad (i \ne j), \qquad p_{ii} = 1 - \sum_{j \ne i} p_{ij},$$
where $\alpha_{ij}$ is called the acceptance probability, i.e. given $X_t = i$ and a proposed state $j$, take
$$X_{t+1} = \begin{cases} j & \text{with probability } \alpha_{ij} \text{ (accept)}, \\ i & \text{otherwise (reject).} \end{cases}$$
Metropolis-Hastings Algorithm (7)
• For $P$ to satisfy reversibility, we need
$$\pi_i \, q_{ij}\, \alpha_{ij} = \pi_j \, q_{ji}\, \alpha_{ji}. \quad (*)$$
• WLOG, assume $\pi_i \, q_{ij} > \pi_j \, q_{ji}$ for some $i, j$.
• In order to achieve equality (*), one can introduce a probability $\alpha_{ij} < 1$ on the left-hand side and set $\alpha_{ji} = 1$ on the right-hand side.
Metropolis-Hastings Algorithm (8)
• Then
$$\pi_i \, q_{ij}\, \alpha_{ij} = \pi_j \, q_{ji} \implies \alpha_{ij} = \frac{\pi_j \, q_{ji}}{\pi_i \, q_{ij}}.$$
• These arguments imply that the acceptance probability must be
$$\alpha_{ij} = \min\!\left(1,\; \frac{\pi_j \, q_{ji}}{\pi_i \, q_{ij}}\right).$$
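The derived rule can be checked by explicitly building $P$ from a proposal $Q$ and the acceptance probabilities, then verifying detailed balance. A minimal sketch in which the target `pi` and the uniform proposal `Q` are illustrative assumptions (note `Q` must be strictly positive here to avoid division by zero):

```python
import numpy as np

pi = np.array([0.2, 0.5, 0.3])   # illustrative target distribution
Q = np.full((3, 3), 1.0 / 3.0)   # illustrative proposal: uniform over the states

# alpha[i, j] = min(1, pi_j q_ji / (pi_i q_ij)); off-diagonal p_ij = q_ij * alpha_ij.
alpha = np.minimum(1.0, (pi[None, :] * Q.T) / (pi[:, None] * Q))
P = Q * alpha
np.fill_diagonal(P, 0.0)
np.fill_diagonal(P, 1.0 - P.sum(axis=1))  # p_ii absorbs the rejected probability mass

flow = pi[:, None] * P
print(np.allclose(flow, flow.T))  # detailed balance holds
print(np.allclose(pi @ P, pi))    # hence pi is stationary for P
```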
Metropolis-Hastings Algorithm (9)
• M-H Algorithm:
Step 1: Choose an irreducible Markov chain transition matrix $Q$ with transition probability $q_{ij}$.
Step 2: Let $t = 0$ and initialize $X_0$ from the states in $\Omega$.
Step 3 (Proposal Step): Given $X_t = i$, sample $j$ from the $i$-th row of $Q$, i.e. with probability $q_{ij}$.
Metropolis-Hastings Algorithm (10)
• M-H Algorithm (cont.):
Step 4 (Acceptance Step): Generate a random number $U$ from $\mathrm{Uniform}(0, 1)$. If $U \le \alpha_{ij}$, set $X_{t+1} = j$; else set $X_{t+1} = X_t = i$.
Step 5: Set $t = t + 1$; repeat Steps 3-5 until convergence.
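Putting Steps 1-5 together, here is a minimal Python sketch of the sampler for a finite state space. The target `pi`, the uniform proposal, the chain length, and the burn-in are all illustrative assumptions (discarding early, pre-stationary samples is a common practical convention rather than part of the slides):

```python
import numpy as np

def metropolis_hastings(pi, Q, n_steps, x0=0, seed=0):
    """Sample from pi on {0, ..., len(pi)-1} via Metropolis-Hastings with proposal Q."""
    rng = np.random.default_rng(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        j = rng.choice(len(pi), p=Q[x])  # Step 3: propose j from the x-th row of Q
        alpha = min(1.0, (pi[j] * Q[j, x]) / (pi[x] * Q[x, j]))  # acceptance probability
        if rng.random() <= alpha:        # Step 4: accept the proposal, else stay put
            x = j
        samples.append(x)                # Step 5: advance the chain
    return np.array(samples)

pi = np.array([0.2, 0.5, 0.3])           # illustrative target
Q = np.full((3, 3), 1.0 / 3.0)           # illustrative uniform proposal
samples = metropolis_hastings(pi, Q, n_steps=100_000)
burn_in = 1_000                          # discard early samples before stationarity
print(np.bincount(samples[burn_in:]) / len(samples[burn_in:]))  # approx [0.2, 0.5, 0.3]
```

The Monte Carlo sum from slide (3) is then simply `np.mean(f(samples[burn_in:]))` for whatever function `f` is being estimated.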