
§❺ Metropolis-Hastings sampling and general MCMC approaches for GLMM




  1. §❺ Metropolis-Hastings sampling and general MCMC approaches for GLMM Robert J. Tempelman

  2. Genetic linkage example…again Recall the plant genetic linkage analysis problem, with cell probabilities (2+θ)/4, (1−θ)/4, (1−θ)/4, θ/4 for counts y1, y2, y3, y4. Suppose a flat constant prior (p(θ) ∝ 1) was used. Then the posterior is proportional to the likelihood: p(θ|y) ∝ (2+θ)^y1 (1−θ)^(y2+y3) θ^y4.

  3. Suppose the posterior density is not recognizable • Additionally, suppose there is no clear data augmentation strategy • Several solutions exist, e.g. adaptive rejection sampling (not discussed here) • One recourse is the Metropolis-Hastings algorithm, in which one generates a MCMC chain of random variates using a candidate (or proposal) density function q(θ′, θ″) • θ′: where you're at now, i.e., the value at the current MCMC cycle • θ″: proposed value for the next MCMC cycle

  4. Metropolis-Hastings • Say the MCMC chain is currently at value θ[t−1] from cycle t−1. • Draw a proposed value θ* from the candidate density for cycle t. • Accept the move from θ[t−1] to θ[t] = θ* with probability α(θ[t−1], θ*) = min(1, [p(θ*|y) q(θ*, θ[t−1])] / [p(θ[t−1]|y) q(θ[t−1], θ*)]) • Otherwise set θ[t] = θ[t−1]. Good readable reference? Chib and Greenberg (1995)

  5. How to compute this ratio “safely” • Always use logarithms whenever evaluating ratios of densities: compute log p(θ*|y) − log p(θ[t−1]|y) plus the log candidate-density terms • Once you compute this log ratio, back-transform (exponentiate) only at the end.
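To make the log-scale computation concrete, here is a minimal sketch in Python (a stand-in for the course's SAS/IML code, with function and argument names of my own choosing): the accept/reject decision is made entirely on the log scale, and only the final capped log ratio is back-transformed.

```python
import math
import random

def mh_accept(log_post_prop, log_post_curr,
              log_q_curr=0.0, log_q_prop=0.0, rng=random):
    """One MH accept/reject decision computed on the log scale.

    log_post_*: log posterior at the proposed / current value
    log_q_*:    log candidate density at the current / proposed value
                (defaults of 0 correspond to a symmetric proposal,
                where these terms cancel anyway)
    """
    # log of the MH ratio; ratios of tiny densities never underflow here
    log_ratio = (log_post_prop - log_post_curr) + (log_q_curr - log_q_prop)
    log_alpha = min(0.0, log_ratio)   # cap so that alpha <= 1
    if log_alpha >= 0.0:              # alpha = 1: accept without drawing U
        return True
    # back-transform only at the very end
    return rng.random() < math.exp(log_alpha)
```

Note that `math.exp` of a very negative log ratio simply underflows to 0.0 (a guaranteed rejection), whereas exponentiating the two posteriors separately before dividing could underflow both to 0 and make the ratio undefined.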

  6. Back to the plant genetics example Recall y1=1997, y2=906, y3=904, y4=32. Let’s use q(θ[t−1], θ*) = N(μ = 0.0357, σ² = 3.6338 × 10⁻⁵) as the candidate-generating density (based on a likelihood approximation): 1. Determine a starting value (i.e., 0th cycle) θ[0] 2. For t = 1, …, m (number of MCMC cycles): a) Generate θ* from q(θ[t−1], θ*) = N(0.0357, 3.6338 × 10⁻⁵) b) Generate U from a Uniform(0,1) distribution c) If U < α(θ[t−1], θ*) then set θ[t] = θ*, else set θ[t] = θ[t−1] • Note that this is an independence chain algorithm: q(θ[t−1], θ*) = q(θ*)

  7. Independence chain Metropolis • The candidate does not depend on θ[t−1], i.e. q(θ[t−1], θ*) = q(θ*) • However, in spite of this “independence” label, there is still serial autocorrelation between the samples. • IML code online. Generate output for 9000 draws after 1000 burn-in samples, saving every 10th.
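As an illustration, the independence-chain sampler above can be sketched in Python (the course uses SAS PROC IML; this is a hedged stand-in with my own function names). The log posterior under the flat prior is y1·log(2+θ) + (y2+y3)·log(1−θ) + y4·log(θ), and the candidate is the N(0.0357, 3.6338 × 10⁻⁵) likelihood approximation.

```python
import math
import random

y1, y2, y3, y4 = 1997, 906, 904, 32     # linkage counts from the slide
MU, VAR = 0.0357, 3.6338e-5             # likelihood-based normal candidate

def log_post(theta):
    """Log posterior under the flat prior p(theta) ~ constant, theta in (0,1)."""
    if not 0.0 < theta < 1.0:
        return float("-inf")
    return (y1 * math.log(2.0 + theta)
            + (y2 + y3) * math.log(1.0 - theta)
            + y4 * math.log(theta))

def log_q(theta):
    """Log of the N(MU, VAR) candidate density."""
    return -0.5 * math.log(2.0 * math.pi * VAR) - (theta - MU) ** 2 / (2.0 * VAR)

def independence_mh(n_cycles, theta0, seed=8723):
    rng = random.Random(seed)
    theta, draws = theta0, []
    for _ in range(n_cycles):
        prop = rng.gauss(MU, math.sqrt(VAR))   # candidate ignores theta[t-1]
        log_ratio = (log_post(prop) - log_post(theta)) + (log_q(theta) - log_q(prop))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = prop                        # accept; else keep theta[t-1]
        draws.append(theta)
    return draws

draws = independence_mh(10000, theta0=0.0357)   # keep 9000 after 1000 burn-in
post_mean = sum(draws[1000:]) / 9000.0
```

Because the candidate is a good approximation to the posterior, most proposals are accepted; the serial autocorrelation mentioned above comes from the occasional rejections, which repeat the previous value.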

  8. Key plots and summaries

  9. Monitoring MH acceptance rates over cycles for the genetic linkage example • Average MH acceptance rates (for every 10 cycles). Many acceptance rates are close to 1! Is this good? NO. Intermediate acceptance rates (0.25-0.5) are optimal for MH mixing.

  10. How to optimize Metropolis acceptance rates • Recall q(θ[t−1], θ*) = N(μ, σ²) with μ = 0.0357, σ² = 3.6338 × 10⁻⁵ • Suggestion: use q(θ[t−1], θ*) = N(μ, cσ²) and modify c (during burn-in) so that MH acceptance rates are intermediate • Increasing c decreases acceptance rates • Decreasing c increases acceptance rates.

  11. “Tuning” the MH sampler: my strategy • Every 10 MH cycles for the first half of burn-in, assess the following: • if the average acceptance rate > 0.80, set c = 1.2c • if the average acceptance rate < 0.20, set c = 0.7c • otherwise leave c unchanged. • SAS PROC MCMC has a somewhat different strategy. • Let’s rerun the same PROC IML code again but with this modification.
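The tuning rule above is simple enough to state as code; a minimal sketch (in Python rather than the course's IML):

```python
def tune_scale(c, avg_accept_rate):
    """One tuning step for the proposal-variance multiplier c,
    applied every 10 MH cycles during the first half of burn-in."""
    if avg_accept_rate > 0.80:
        return 1.2 * c   # accepting too often: widen the proposal
    if avg_accept_rate < 0.20:
        return 0.7 * c   # rejecting too often: narrow the proposal
    return c             # intermediate rate: leave c alone
```

The directions follow from the previous slide: widening the proposal (larger c) lowers the acceptance rate, and narrowing it raises the acceptance rate.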

  12. Average acceptance ratio versus cycle (during 400 burn-in cycles); a companion plot shows c versus cycle. One should finish the tuning process not much later than halfway through burn-in.

  13. Monitoring MH acceptance rates over cycles • Average MH acceptance rates (every 10 cycles) post burn-in (16000 cycles)

  14. Posterior density of θ.

  15. Random walk Metropolis sampling • More common than independence chain Metropolis (especially when proposals based on the likelihood function are not plausible). • The proposal density is chosen to be symmetric in θ* and θ[t−1], i.e. q(θ[t−1], θ*) = q(θ*, θ[t−1]) • Example: generate a random variate d from N(0, cσ²) and add it to the previous cycle's value θ[t−1] to generate θ* = θ[t−1] + d; this is the same as sampling θ* from N(θ[t−1], cσ²).

  16. Random walk Metropolis (cont’d) • Because of the symmetry of q(θ[t−1], θ*) in θ[t−1] and θ*, the MH acceptance ratio simplifies to α(θ[t−1], θ*) = min(1, p(θ*|y) / p(θ[t−1]|y)) • i.e., because q(θ[t−1], θ*) = q(θ*, θ[t−1]), the candidate-density terms cancel.

  17. Back to the example • Start again with σ = 0.00602 (i.e., σ² = 3.6338 × 10⁻⁵) and a starting value θ[0]. • Generate the proposed value θ* = θ[t−1] + d, where d is drawn from N(0, cσ²), and accept with probability min(1, p(θ*|y)/p(θ[t−1]|y)) • Tune c for intermediate acceptance rates during burn-in.
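A Python sketch of the random walk sampler for the same linkage posterior (again a stand-in for the course's IML code; `step_sd` is set to the likelihood-based standard deviation 0.00602, i.e., c = 1):

```python
import math
import random

y1, y2, y3, y4 = 1997, 906, 904, 32    # linkage counts from the slides

def log_post(theta):
    # Flat prior: log posterior = log likelihood up to an additive constant
    if not 0.0 < theta < 1.0:
        return float("-inf")
    return (y1 * math.log(2.0 + theta)
            + (y2 + y3) * math.log(1.0 - theta)
            + y4 * math.log(theta))

def random_walk_mh(n_cycles, theta0, step_sd, seed=8723):
    rng = random.Random(seed)
    theta, draws, n_accepted = theta0, [], 0
    for _ in range(n_cycles):
        prop = theta + rng.gauss(0.0, step_sd)        # theta* = theta[t-1] + d
        # symmetric proposal: the candidate densities cancel in the MH ratio
        if rng.random() < math.exp(min(0.0, log_post(prop) - log_post(theta))):
            theta, n_accepted = prop, n_accepted + 1
        draws.append(theta)
    return draws, n_accepted / n_cycles

draws, accept_rate = random_walk_mh(10000, theta0=0.0357, step_sd=0.00602)
```

In practice `step_sd` would be multiplied by sqrt(c) and c tuned during burn-in exactly as in the tuning strategy described earlier.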

  18. Summary

  19. What about “canned” software? • WinBUGS • AD Model Builder • Various R packages (e.g. MCMCglmm) • SAS PROC MCMC • Will demonstrate shortly; it functions a bit like PROC NLMIXED (no CLASS statement) • They all work fine. • But sometimes they don’t recognize conjugacy in priors • i.e., they can’t distinguish between conjugate (Gibbs) and non-conjugate (Metropolis) sampling • So they often default to Metropolis (PROC MCMC: random walk Metropolis).

  20. Recall the old split plot in time example • Recall the “bunny” example from earlier. • We used PROC GLIMMIX and MCMC (SAS PROC IML) to analyze the data. • Our MCMC implementation involved recognizable full conditional densities (FCD) • Split plot in time assumption. • Other alternatives? • Marginal versus conditional specifications on CS • AR(1) • Others? • Some FCD are not recognizable • Metropolis updates are then necessary. • Let’s use SAS PROC MCMC.

  21. First create the dummy variables using PROC TRANSREG, since PROC MCMC does not have a CLASS statement (dataset called ‘recodedsplit’). This gives part of a full-rank X matrix; the names of the created dummy variables are stored in the macro variable &_trgind.

  22. SAS PROC MCMC (“conditional” specification)

  data _null_;
    call symputx('seed',8723);
    call symputx('nvar',12);
  run;

  proc mcmc data=recodedsplit outpost=ksu.postsplit propcov=quanew
            seed=&seed nmc=400000 thin=10
            monitor=(beta1-beta&nvar sigmae sigmag);
    array covar[&nvar] intercept &_trgind;  /* fixed effects dummy variables */
    array beta[&nvar];                      /* fixed effects */
    parms sige 1;                           /* residual sd; PARMS gives starting values */
    parms sigg 1;                           /* random effect sd */
    parms (beta1-beta&nvar) 1;
    prior beta: ~ normal(0, var=1e6);       /* b ~ N(0, 10^6); prior beta: ~ general(0); could also do this too */
    prior sige ~ general(0, lower=0);       /* Gelman prior: p(sig_e) ~ constant */
    prior sigg ~ general(0, lower=0);       /* Gelman prior: p(sig_u) ~ constant */

  Annotations: outpost= is where the MCMC samples are saved; propcov=quanew is the Metropolis implementation strategy; thin=10 controls how often samples are saved; nmc= is the total number of samples after burn-in; NBI=1000 is the default number of burn-in cycles.

  23. SAS PROC MCMC (conditional specification, continued)

    beginnodata;
      sigmae = sige*sige;
      sigmau = sigg*sigg;
    endnodata;
    call mult(covar, beta, mu);
    random u ~ normal(0, var=sigmau) subject=trtrabbit;
    model y ~ normal(mu + u, var=sigmae);
  run;

  24. PROC MCMC output

  25. Compare to conditional model results from § 82,84

  26. LSMEANS USING PROC MIXED

  27. “Least-squares means” using output from PROC MCMC • Cell means • Marginal means • Compare to Gibbs sampling results from § 85

  28. Posterior densities of σ²u and σ²e • Both are bounded below by 0, by definition.

  29. The marginal model specification (TYPE=CS) • SAS PROC MIXED code:

  title "Marginal Model: Compound Symmetry using PROC MIXED";
  proc mixed data=ear;
    class trt time rabbit;
    model temp = trt time trt*time / solution;
    repeated time / subject=rabbit(trt) type=cs rcorr;
    lsmeans trt*time;
  run;

  30. Now consider the marginal specification directly on the within-rabbit covariance matrix R (compound symmetry). • To ensure R is p.s.d., the CS correlation must satisfy ρ ≥ −1/(nt − 1) • nt: number of repeated measures per rabbit (here nt = 4, so the lower bound is −1/3).
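The p.s.d. constraint can be checked directly: a compound-symmetry block σ²[(1−ρ)I + ρJ] has eigenvalue σ²(1 + (nt−1)ρ) once and σ²(1−ρ) with multiplicity nt−1, and it is the first eigenvalue that forces ρ ≥ −1/(nt−1). A small illustrative Python check (not course code):

```python
def cs_eigenvalues(nt, sigma2, rho):
    """Eigenvalues of the nt x nt compound-symmetry covariance block
    sigma2 * [(1 - rho) * I + rho * J]: diagonal sigma2, off-diagonal sigma2*rho."""
    return [sigma2 * (1.0 + (nt - 1) * rho)] + [sigma2 * (1.0 - rho)] * (nt - 1)

def cs_is_psd(nt, rho, sigma2=1.0):
    """True when every eigenvalue is nonnegative, i.e. rho >= -1/(nt-1)."""
    return all(ev >= 0.0 for ev in cs_eigenvalues(nt, sigma2, rho))
```

With nt = 4 repeated measures the bound is −1/3, which matches the &lbound1 value used later as the lower bound of the prior on the intraclass correlation in PROC MCMC.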

  31. Need to format the data differently (dataset: recodedsplit1).

  32. I’ll keep the covariates in a different file too (dataset: covariates).

  33. PROC MCMC

  data a; run;   /* this data step is a little silly but it is required */

  /* PROC MCMC WITH COMPOUND SYMMETRY ASSUMPTION */
  title1 "Bayesian inference on compound symmetry";
  proc mcmc jointmodel data=a outpost=ksu.postcs propcov=quanew
            seed=&seed nmc=400000 thin=10;
    array covar[1] / nosymbols;
    array data[1] / nosymbols;
    array first1[1] / nosymbols;
    array last1[1] / nosymbols;
    array beta[&nvar];
    array mu[&nrec];
    array ytemp[&nrep];
    array mutemp[&nrep];
    array VCV[&nrep,&nrep];

  The JOINTMODEL option implies that each observation's contribution to the likelihood function is NOT conditionally independent.

  34.
    begincnst;
      rc = read_array("recodedsplit1", data, "y");
      rc = read_array("recodedsplit1", first1, "first");
      rc = read_array("recodedsplit1", last1, "last");
      rc = read_array("covariates", covar);
    endcnst;
    parms sige .25;     * residual sd;
    parms intrcl .3;    * intraclass correlation;
    parms (beta1-beta&nvar) 1;

  35.
    beginnodata;
      prior beta: ~ normal(0, var=1e6);
      prior sige ~ general(0, lower=0);   /* Gelman prior */
      prior intrcl ~ general(0, lower=&lbound1, upper=.999);
      sigmae = sige*sige;
      sigmag = intrcl*sigmae;
      call fillmatrix(VCV, sigmag);
      do i = 1 to &nrep;
        VCV[i,i] = sigmae;
      end;
      call mult(covar, beta, mu);
    endnodata;
    ljointpdf = 0;

  • &lbound1 = −1/3 (lower bound on the CS correlation when block size = 4)

  36.
    do irec = 1 to &nrec;
      if (first1[irec] = 1) then counter = 0;
      counter = counter + 1;
      ytemp[counter] = data[irec];
      mutemp[counter] = mu[irec];
      if (last1[irec] = 1) then do;
        ljointpdf = ljointpdf + lpdfmvn(ytemp, mutemp, VCV);
      end;
    end;
    model general(ljointpdf);
  run;

  37. PROC MCMC

  38. PROC MIXED vs. PROC MCMC (side-by-side output)

  39. Posterior marginal densities for σ²u and σ²e under the marginal model • Notice how much of the posterior density of σ²u is concentrated to the left of 0! • Potential “ripple effect” on inferences on K′b relative to the conditional specification? (Stroup and Littell, 2002)

  40. First-order autoregressive model (TYPE=AR(1)) • SAS PROC MIXED code:

  title "Marginal Model: AR(1) using PROC MIXED";
  proc mixed data=ear;
    class trt time rabbit;
    model temp = trt time trt*time / solution;
    repeated time / subject=rabbit(trt) type=AR(1) rcorr;
    lsmeans trt*time;
  run;

  CORRECTION!

  41. Specifying the VCV for AR(1) • Note: the (i,j) element of the within-subject block R_k(i) is σ² ρ^|i−j| • It might be easier to specify the (tridiagonal) inverse R_k(i)⁻¹ directly, especially for large R_k(i). Example MCMC code provided online.
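To make the AR(1) structure concrete, here is an illustrative Python sketch (function names are mine, not from the course code) that builds the σ²ρ^|i−j| block and its closed-form tridiagonal inverse, which is the cheap form to specify for large blocks:

```python
def ar1_vcv(n, sigma2, rho):
    """n x n AR(1) covariance block: element (i, j) is sigma2 * rho**|i-j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n)] for i in range(n)]

def ar1_precision(n, sigma2, rho):
    """Closed-form inverse of ar1_vcv: tridiagonal, so it is cheap to
    write down even for large n (no matrix inversion needed)."""
    scale = 1.0 / (sigma2 * (1.0 - rho * rho))
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # interior diagonal entries are (1 + rho^2); the two ends are 1
        P[i][i] = scale * (1.0 + rho * rho if 0 < i < n - 1 else 1.0)
        if i + 1 < n:
            P[i][i + 1] = P[i + 1][i] = -scale * rho
    return P
```

Multiplying the two matrices recovers the identity, which is a quick way to convince yourself the closed form is right.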

  42. Variance Component Inference MCMC PROC MIXED

  43. An example of a “sticky” situation • Consider a Poisson (count data) example. • Simulated data from a split plot design. • 4 whole plots for each of 3 levels of a whole-plot factor. • 3 subplots per whole plot, i.e., 3 levels of a subplot factor (B). • Whole-plot variance: σ²w = 0.50 • Overdispersion (G-side) B*wholeplot variance: σ²e = 1.00
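As a hedged sketch of how such data could be simulated (a Python stand-in; the course's simulation code is not shown, and the intercept `mu` and seed here are my own choices, with fixed treatment effects omitted for brevity):

```python
import math
import random

def rpois(lam, rng):
    """Poisson draw by CDF inversion (fine for the modest means here)."""
    y, p = 0, math.exp(-lam)
    cdf, u = p, rng.random()
    while u > cdf:
        y += 1
        p *= lam / y
        cdf += p
    return y

def simulate_split_plot_counts(mu=0.0, var_w=0.50, var_e=1.00,
                               n_a=3, n_wp=4, n_b=3, seed=1234):
    """Poisson split plot: 3 levels of A, 4 whole plots per level,
    3 subplots (levels of B) per whole plot.
    log-mean = mu + w (whole-plot, var 0.50) + e (G-side overdispersion, var 1.00).
    """
    rng = random.Random(seed)
    records = []
    for a in range(n_a):
        for wp in range(n_wp):
            w = rng.gauss(0.0, math.sqrt(var_w))       # whole-plot effect
            for b in range(n_b):
                e = rng.gauss(0.0, math.sqrt(var_e))   # subplot overdispersion
                records.append((a, wp, b, rpois(math.exp(mu + w + e), rng)))
    return records

data = simulate_split_plot_counts()   # 3 * 4 * 3 = 36 counts
```

The G-side overdispersion term `e` is exactly the extra per-observation normal effect on the log scale that makes the conditional Poisson model overdispersed marginally.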

  44. GLIMMIX code:

  proc glimmix data=splitplot method=laplace;
    class A B wholeplot subject;
    model y = A|B / dist=poisson solution;
    random wholeplot(A) B*wholeplot(A);
    lsmeans A B A*B / e ilink;
  run;

  45. Inferences on variance components: • PROC GLIMMIX

  46. Using PROC MCMC

  proc mcmc data=recodedsplit outpost=postout propcov=quanew
            seed=9548 nmc=400000 thin=10;
    array covar[&nvar] intercept &_trgind;
    array beta[&nvar];
    parms sigmau .5;
    parms sigmae .5;
    parms (beta1-beta&nvar) 1;
    prior beta: ~ normal(0, var=10E6);
    prior sigmae ~ igamma(shape=.1, scale=.1);
    prior sigmau ~ igamma(shape=.1, scale=.1);
    call mult(covar, beta, mu);
    random u ~ normal(0, var=sigmau) subject=plot;
    random e ~ normal(0, var=sigmae) subject=subject;
    lambda = exp(mu + u + e);
    model y ~ poisson(lambda);
  run;

  47. Some output • In the same ballpark as the PROC GLIMMIX solutions/VC estimates… but there is a PROBLEM (next slide).

  48. Pretty slow mixing

  (Trace plots for beta1, beta2, sigmag, and sigmae.)
