Practical Model Selection and Multi-model Inference using R

Practical Model Selection and Multi-model Inference using R. Modified from a presentation by Eric Stolen and Dan Hunt.

Presentation Transcript


  1. Practical Model Selection and Multi-model Inference using R • Modified from a presentation by Eric Stolen and Dan Hunt

  2. Theory • This is the link with science, which is about understanding how the world works

  3. Indigo Snake Habitat Selection • David R. Breininger, M. Rebecca Bolt, Michael L. Legare, John H. Drese, and Eric D. Stolen. Source: Journal of Herpetology, 45(4):484-490. 2011. • Animal perception • Evolutionary Biology • Population Demography http://www.seaworld.org/animal-info/animal-bytes/spooky-safari/eastern-indigo-snake.htm

  4. Hypotheses • To use the Information-theoretic toolbox, we must be able to state a hypothesis as a statistical model (or more precisely an equation which allows us to calculate the maximum likelihood of the hypothesis) http://www.seaworld.org/animal-info/animal-bytes/spooky-safari/eastern-indigo-snake.htm

  5. Multiple Working Hypotheses • We operate with a set of multiple alternative hypotheses (models) • The many advantages include safeguarding objectivity and allowing rigorous inference • Multiple Working Hypotheses: Chamberlin (1890) • Strong Inference: Platt (1964) • Bold Conjectures: Karl Popper (ca. 1960)

  6. Deriving the model set • This is the tough part (but also the creative part) • much thought needed, so don’t rush • collaborate, seek outside advice, read the literature, go to meetings… • How and When hypotheses are better than What hypotheses (strive to predict rather than describe)

  7. Models – Indigo Snake example • David R. Breininger, M. Rebecca Bolt, Michael L. Legare, John H. Drese, and Eric D. Stolen. Source: Journal of Herpetology, 45(4):484-490. 2011. • Study of indigo snake habitat use • Response variable: home range size, ln(ha) • SEX • Land cover, 2-3 levels (lc2) • weeks = effort/exposure • Science question: “Is there a seasonal difference in habitat use between sexes?”

  8-9. Models – Indigo Snake example • Candidate model set: • SEX • lc2 (land cover type) • weeks • SEX + lc2 • SEX + weeks • lc2 + weeks • SEX + lc2 + weeks • SEX + lc2 + SEX * lc2 • SEX + lc2 + weeks + SEX * lc2 http://www.herpnation.com/hn-blog/indigo-snake-survival-demographics/?simple_nav_category=john-c-murphy

  10. Modeling • Trade-off between precision and bias • Trying to derive knowledge / advance learning; not “fit the data” • Relationship between data (quantity and quality) and sophistication of the model

  11-13. Precision-Bias Trade-off • [Figure: squared bias (Bias²) and variance plotted against model complexity (increasing number of parameters)]

  14. Kullback-Leibler Information • Basic concept from Information theory • The information lost when a model is used to represent full reality • Can also think of it as the distance between a model and full reality

  15-18. Kullback-Leibler Information • [Figure: truth / reality and the candidate models G1 (best model in set), G2, and G3] • The relative difference between models is constant
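The claim that relative differences between models do not depend on knowing the truth can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the "truth" distribution and the three candidate models are hypothetical discrete distributions chosen only for illustration. It shows that the difference in K-L information between two models equals the difference in their expected log scores, so the term that involves only the truth cancels.

```python
import math

def kl_information(f, g):
    """K-L information I(f, g) = sum of f * log(f / g) for discrete distributions."""
    return sum(fi * math.log(fi / gi) for fi, gi in zip(f, g) if fi > 0)

def expected_log_score(f, g):
    """E_f[log g]: the only part of I(f, g) that depends on the model g."""
    return sum(fi * math.log(gi) for fi, gi in zip(f, g) if fi > 0)

# Hypothetical "truth" and three hypothetical models over four outcomes.
truth = [0.40, 0.30, 0.20, 0.10]
models = {
    "G1": [0.35, 0.30, 0.20, 0.15],
    "G2": [0.25, 0.25, 0.25, 0.25],
    "G3": [0.10, 0.20, 0.30, 0.40],
}

kl = {name: kl_information(truth, g) for name, g in models.items()}
score = {name: expected_log_score(truth, g) for name, g in models.items()}

# I(f, G2) - I(f, G1) equals E_f[log G1] - E_f[log G2]: the term that
# depends only on the truth appears in both and cancels.
diff_kl = kl["G2"] - kl["G1"]
diff_score = score["G1"] - score["G2"]

for name in models:
    print(name, round(kl[name], 4))
print(diff_kl, diff_score)
```

In practice the truth is unknown, which is why only these relative differences (and not the absolute distances) can be estimated from data.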

  19. Akaike’s Contributions • Figured out how to estimate the relative Kullback-Leibler distance between models in a set of models • Figured out how to link maximum likelihood estimation theory with expected K-L information • An Information Criterion (Akaike’s Information Criterion, AIC) • $\mathrm{AIC} = -2\log_e\big(\mathcal{L}(\text{model}_i \mid \text{data})\big) + 2K$

  20. $\mathrm{AICc}_i = -2\log_e\big(\mathcal{L}(\text{model}_i \mid \text{data})\big) + 2K\,\frac{n}{n-K-1}$, or equivalently $\mathrm{AICc}_i = \mathrm{AIC}_i + \frac{2K(K+1)}{n-K-1}$ (where $K$ = the number of parameters estimated and $n$ = the sample size)
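The two forms of AICc are algebraically identical, which a short calculation can confirm. This is a minimal Python sketch, not taken from the slides. The log-likelihood and K are those reported for the model ha.ln ~ SEX + lc2 in the model comparison table later in the transcript; the sample size n = 45 is an assumption, back-calculated from that table rather than stated anywhere in the slides.

```python
def aic(log_lik, k):
    """AIC = -2 * log-likelihood + 2K."""
    return -2.0 * log_lik + 2.0 * k

def aicc_direct(log_lik, k, n):
    """AICc written with the n / (n - K - 1) multiplier on the penalty."""
    return -2.0 * log_lik + 2.0 * k * n / (n - k - 1)

def aicc_via_aic(log_lik, k, n):
    """AICc written as AIC plus the small-sample correction term."""
    return aic(log_lik, k) + 2.0 * k * (k + 1) / (n - k - 1)

log_lik = -51.99  # LL reported for ha.ln ~ SEX + lc2
k = 4             # K reported for the same model
n = 45            # ASSUMPTION: sample size inferred from the table, not given

a = aicc_direct(log_lik, k, n)
b = aicc_via_aic(log_lik, k, n)
print(round(a, 2), round(b, 2))
```

As n grows relative to K, the correction term shrinks toward zero and AICc converges to AIC.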

  21. $\mathrm{AICc}_{\min}$ = AICc for the model with the lowest AICc value • $\Delta_i = \mathrm{AICc}_i - \mathrm{AICc}_{\min}$

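  22. Model Probability (Akaike weight) • $w_i = \mathrm{Prob}\{g_i \mid \text{data}\} = \frac{\exp(-\Delta_i/2)}{\sum_{r=1}^{R}\exp(-\Delta_r/2)}$, where $R$ is the number of models in the set • Evidence ratio of model $i$ to model $j$ = $w_i / w_j$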
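The AICc differences, model probabilities, and evidence ratio can be computed directly from a list of AICc values. This is a minimal Python sketch rather than the R workflow the slides rely on; the AICc values are those reported in the model comparison table later in the transcript, and the variable names are illustrative.

```python
import math

# AICc values as reported in the model comparison table.
aicc_values = {
    "mod4": 112.98, "mod7": 114.89, "mod1": 121.52, "mod5": 122.27,
    "mod2": 125.93, "mod6": 128.34, "mod3": 141.26,
}

# Delta_i = AICc_i - AICc_min
aicc_min = min(aicc_values.values())
delta = {m: v - aicc_min for m, v in aicc_values.items()}

# w_i = exp(-Delta_i / 2) / sum over all models of exp(-Delta_r / 2)
rel_lik = {m: math.exp(-0.5 * d) for m, d in delta.items()}
total = sum(rel_lik.values())
weights = {m: r / total for m, r in rel_lik.items()}

# Evidence ratio of the best model to the second-best model.
evidence_ratio = weights["mod4"] / weights["mod7"]

for m in aicc_values:
    print(m, round(delta[m], 2), round(weights[m], 2))
print("evidence ratio mod4 / mod7:", round(evidence_ratio, 2))
```

The evidence ratio depends only on the difference in AICc between the two models, so it does not change if other models are added to or dropped from the set.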

  23. Least Squares Regression • $\mathrm{AICc} = n\log_e(\hat{\sigma}^2) + 2K\,\frac{n}{n-K-1}$, where $\hat{\sigma}^2 = \mathrm{RSS}/n$

  24. Counting Parameters: K = number of parameters estimated • Least Squares Regression: K = number of predictor coefficients + 2 (for the intercept and $\sigma^2$)
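The least-squares form and the parameter count can be tied together in a few lines. This is a minimal Python sketch under stated assumptions: the residual sums of squares and the sample size below are hypothetical numbers, and the function names are illustrative. It also shows that the RSS-based criterion differs from the full Gaussian log-likelihood version only by a constant that depends on n, so the differences between models are unchanged.

```python
import math

def count_k(n_coefficients):
    """K for least squares: predictor coefficients + intercept + sigma^2."""
    return n_coefficients + 2

def aicc_from_rss(rss, n, k):
    """AICc = n * log(RSS / n) + 2K * n / (n - K - 1)."""
    sigma2 = rss / n
    return n * math.log(sigma2) + 2.0 * k * n / (n - k - 1)

def aicc_from_log_lik(rss, n, k):
    """Same criterion using the full maximized Gaussian log-likelihood."""
    sigma2 = rss / n
    log_lik = -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)
    return -2.0 * log_lik + 2.0 * k * n / (n - k - 1)

n = 45  # HYPOTHETICAL sample size
# HYPOTHETICAL fits: (residual sum of squares, number of predictor coefficients)
fits = {"one predictor": (30.0, 1), "two predictors": (25.0, 2)}

rss_based = {}
lik_based = {}
for name, (rss, n_coef) in fits.items():
    k = count_k(n_coef)
    rss_based[name] = aicc_from_rss(rss, n, k)
    lik_based[name] = aicc_from_log_lik(rss, n, k)

# The two versions differ by n * (log(2 * pi) + 1) for every model.
offset = n * (math.log(2.0 * math.pi) + 1.0)
delta_rss = rss_based["two predictors"] - rss_based["one predictor"]
delta_lik = lik_based["two predictors"] - lik_based["one predictor"]
print(round(delta_rss, 4), round(delta_lik, 4))
```

Because only differences matter for ranking, either version can be used as long as the same one is used for every model in the set.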

  25. Counting Parameters: K = number of parameters estimated • Logistic Regression: K = number of predictor coefficients + 1 (for the intercept)

  26. Comparing Models • Model selection based on AICc:

      Model  K  AICc    Delta_AICc  AICcWt  Cum.Wt  LL
      mod4   4  112.98   0.00       0.71    0.71    -51.99
      mod7   5  114.89   1.91       0.27    0.98    -51.67
      mod1   3  121.52   8.54       0.01    0.99    -57.47
      mod5   4  122.27   9.29       0.01    1.00    -56.64
      mod2   3  125.93  12.95       0.00    1.00    -59.67
      mod6   4  128.34  15.36       0.00    1.00    -59.67
      mod3   3  141.26  28.28       0.00    1.00    -67.34

      mod1 = "ha.ln ~ SEX", mod2 = "ha.ln ~ lc2", mod3 = "ha.ln ~ weeks", mod4 = "ha.ln ~ SEX + lc2", mod5 = "ha.ln ~ SEX + weeks", mod6 = "ha.ln ~ lc2 + weeks", mod7 = "ha.ln ~ SEX + lc2 + weeks"

  27-30. Model Averaging Predictions • Model-averaged prediction: $\hat{\bar{Y}} = \sum_{i=1}^{R} w_i \hat{Y}_i$, where $\hat{Y}_i$ is the prediction from model $i$ and $w_i$ is the weight of model $i$
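A model-averaged prediction is a weighted sum of the per-model predictions. The following is a minimal Python sketch, not from the slides: the weights are the rounded AICcWt values of the four best models in the comparison table, while the predicted values of ln(ha) are hypothetical numbers invented for illustration.

```python
def model_averaged_prediction(weights, predictions):
    """Weighted sum of per-model predictions, with weights rescaled to sum to 1."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total

# Rounded AICcWt for mod4, mod7, mod1, mod5 (the remaining models round to 0).
weights = [0.71, 0.27, 0.01, 0.01]

# HYPOTHETICAL predictions of ln(ha) for one new snake from each model.
predictions = [2.9, 3.1, 2.5, 2.6]

averaged = model_averaged_prediction(weights, predictions)
print(round(averaged, 3))
```

The result always lies between the smallest and largest single-model prediction, and it is pulled most strongly toward the models with the largest weights.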

  31. Model Averaging Parameters • Model-averaged parameter estimate: $\hat{\bar{\theta}} = \sum_{i=1}^{R} w_i \hat{\theta}_i$

  32-33. Unconditional Variance Estimator
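The transcript does not preserve the estimator's formula, so the following minimal Python sketch assumes one commonly cited form, in which the unconditional standard error is the weighted sum of sqrt(se_i^2 + (estimate_i - averaged estimate)^2). That choice of form is an assumption, and the coefficient estimates and standard errors below are hypothetical values, not results from the indigo snake study.

```python
import math

def model_averaged_estimate(weights, estimates):
    """Weighted mean of per-model estimates; weights are rescaled to sum to 1."""
    total = sum(weights)
    scaled = [w / total for w in weights]
    theta_bar = sum(w * t for w, t in zip(scaled, estimates))
    return theta_bar, scaled

def unconditional_se(weights, estimates, std_errors):
    """ASSUMED form: sum of w_i * sqrt(se_i^2 + (theta_i - theta_bar)^2)."""
    theta_bar, scaled = model_averaged_estimate(weights, estimates)
    return sum(
        w * math.sqrt(se ** 2 + (t - theta_bar) ** 2)
        for w, t, se in zip(scaled, estimates, std_errors)
    )

# HYPOTHETICAL per-model estimates of one coefficient and their standard errors.
weights = [0.71, 0.27, 0.01, 0.01]
estimates = [0.80, 0.75, 0.95, 0.90]
std_errors = [0.20, 0.21, 0.25, 0.24]

theta_bar, scaled = model_averaged_estimate(weights, estimates)
se_uncond = unconditional_se(weights, estimates, std_errors)
se_weighted = sum(w * se for w, se in zip(scaled, std_errors))
print(round(theta_bar, 4), round(se_uncond, 4), round(se_weighted, 4))
```

The unconditional standard error is never smaller than the weighted average of the conditional standard errors, because it adds a term for disagreement among the models' estimates, which is the model selection uncertainty.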