1 / 5

Lecture 15

Lecture 15. Sampling. Approximate Inference. Motivation comes from EM, where in the E-step we need to do inference which is often intractable.

benjamin
Download Presentation

Lecture 15

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 15 Sampling

  2. Approximate Inference • Motivation comes from EM, where in the E-step we need to do inference which is often intractable. • Approximate techniques include: Laplace approx. (Gaussian around the mode), mean field (factorized ansatz), loopy belief propagation – all are biased but are efficient. • Sampling is asymptotically unbiased but is typically slow.

  3. Sampling • Ancestral sampling: top-down sampling from Bayes’ net. • Sampling by transformation: try to find a transformation f(.) such that x = f(y) and y is simple to sample from. • Rejection sampling: find a curve f(x) >= p(x) and sample from f(x). Then accept a fraction p(x)/f(x). • Many more methods that we don’t have time for. Most methods however break down in high dimensions: curse of dimensionality.

  4. MCMC & Gibbs • We give up that all samples have to be independent. • Sample a new sample given the previous one according to some proposal distribution S(x’|x). • We only accept the new sample x’ with a certain probability A(x,x’). • T(x,x’) = S(x’|x) X A(x,x’) is chosen such that the samples eventually come from the desired distribution p(x). • p(x) should be the invariant distribution of the kernel T, which is unique of the chain is ergodic (aperiodic & irreducible). • Detailed balance is a condition that is sufficient and is easier to satisfy.

  5. MH & Gibbs • Metropis Hastings Markov Chain Monte Carlo sampler always accepts of the prob. of x’ is larger and reject with a fraction p(x’)/p(x) (in case of symmetric S). • Gibbs sampling is MH method that always accepts and uses the conditional distributions to sample: X, Y ~P(X,Y)  iterate X~P(X|Y) & Y~P(Y|X). • MCMC has problems if the random variables are highly dependent and if there are multiple modes. In both cases it takes very long to explore all of the space that has significant probability. This is called slow mixing.

More Related