[R-sig-ME] MCMC model selection reference

Sun Apr 1 20:30:03 CEST 2012

Hi,

My understanding of DIC (and information criteria generally) is
woeful, but here are my thoughts on DIC, which I hope others will
correct if they disagree.

Does DIC work in principle? Yes. Could it work in practice?
Sometimes. Does it work in practice? Rarely (for hierarchical models).

DIC needs to be "focused". Imagine you have single Gaussian  
observations (y) on children within schools.  We have fixed effects b,  
random effects u, and variance parameters Vs (between school variance)  
and Ve (within school variance). We also have the fixed-effect design  
matrix X and random-effect design matrix Z.  We could calculate the  
deviance using two likelihoods:

a) dmvnorm(y, X%*%b+Z%*%u, I*Ve)
b) dmvnorm(y, X%*%b, Z%*%t(Z)*Vs+I*Ve)

In a) we condition on the school effects; in b) we marginalise
them. The focus in a) is of the form "can we predict new observations
in *these* schools" and in b) "can we predict new observations in
*new* schools".
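
To make the two focuses concrete, here is a minimal sketch in base R
that evaluates both deviances at one set of parameter values. The
simulated data, the parameter values, and the helper ldmvnorm (a
stand-in for a log-scale dmvnorm, written out so no package is needed)
are all assumptions for illustration. In an MCMC run the deviance would
be evaluated at each posterior draw rather than at known values.

```r
set.seed(1)

# Hypothetical data: 5 schools with 4 children each
n_schools <- 5
n_per     <- 4
school    <- rep(seq_len(n_schools), each = n_per)
n         <- length(school)

X <- cbind(1, rnorm(n))                   # fixed-effect design matrix
Z <- model.matrix(~ factor(school) - 1)   # random-effect design matrix

b  <- c(2, 0.5)                           # fixed effects (assumed values)
Vs <- 1                                   # between-school variance
Ve <- 0.5                                 # within-school variance
u  <- rnorm(n_schools, 0, sqrt(Vs))       # school effects
y  <- as.vector(X %*% b + Z %*% u + rnorm(n, 0, sqrt(Ve)))

# Log density of a multivariate normal (illustrative helper)
ldmvnorm <- function(y, mu, V) {
  R <- chol(V)                                            # V = t(R) %*% R
  z <- backsolve(R, as.vector(y - mu), transpose = TRUE)  # solves t(R) z = y - mu
  -0.5 * (length(y) * log(2 * pi) + 2 * sum(log(diag(R))) + sum(z^2))
}

I <- diag(n)

# a) condition on the school effects u
dev_a <- -2 * ldmvnorm(y, X %*% b + Z %*% u, I * Ve)

# b) marginalise the school effects
dev_b <- -2 * ldmvnorm(y, X %*% b, Z %*% t(Z) * Vs + I * Ve)

c(conditional = dev_a, marginal = dev_b)
```

The only difference between the two calls is where the school effects
enter: in the mean for a), in the covariance for b).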

As a parent you're probably interested in a); as a scientist you're
probably interested in b).

MCMCglmm (and I believe WinBUGS, depending on how the model is
parameterised) focuses at the highest level, a). The reason is that
MCMCglmm Gibbs samples u and then Gibbs samples Vs conditional on u,
without needing to calculate b), which is expensive (if DIC=TRUE, a)
is calculated, and this is easy). Presumably WinBUGS could
calculate a) or b) depending on how it is set up, but I think a) is
more usual (?) because of performance issues.

With over-dispersed non-Gaussian data the case for DIC (as  
implemented) is very bad, because the highest level is the latent  
variable (linear predictor).  Let's imagine our observations on
children were how many times they missed the bus and we treated them  
as log-normal Poisson. DIC would be focused at "can we predict how  
many times *these* children miss the bus".
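
A small sketch of what that focus means, again in base R with
simulated data (all values are assumptions for illustration): each
child has its own latent variable l, and the deviance is the Poisson
deviance conditional on l, so there are as many "highest level"
effects as there are observations.

```r
set.seed(2)

n_schools <- 5
n_per     <- 4
school    <- rep(seq_len(n_schools), each = n_per)
n         <- length(school)

mu <- 0.5                              # intercept on the log scale (assumed)
Vs <- 0.3                              # between-school variance
Ve <- 0.4                              # over-dispersion (residual) variance
u  <- rnorm(n_schools, 0, sqrt(Vs))    # school effects
e  <- rnorm(n, 0, sqrt(Ve))            # one residual per child

l <- mu + u[school] + e                # latent variable (linear predictor)
y <- rpois(n, exp(l))                  # times each child missed the bus

# Deviance focused at the latent variable: conditions on every child's l
dev_latent <- -2 * sum(dpois(y, exp(l), log = TRUE))

length(l) == length(y)                 # one latent variable per observation
dev_latent
```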

Modelling over-dispersion using a two-parameter distribution (without  
observational-level effects), perhaps a negative binomial in our  
example, may get us back to "can we predict how many times children
from *these* schools miss the bus" but getting down to b) may be more
difficult because with non-Gaussian data the random effects cannot be
marginalised analytically.
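
As an illustration of why marginalising is harder here, the following
base R sketch integrates the school effect out numerically for a
single school. The counts, linear predictor, and parameter values are
made up, and one-dimensional quadrature via integrate() is just one
possible approach (an assumption, not something MCMCglmm does). In
practice this would need repeating for every school at every MCMC
iteration.

```r
# Hypothetical counts for the children in one school
y_j   <- c(0, 2, 1, 4)
eta_j <- rep(0.3, length(y_j))   # fixed-effect part of the linear predictor
size  <- 2                       # negative binomial dispersion (assumed)
Vs    <- 0.5                     # between-school variance (assumed)

# Likelihood of the school's data given u, times the density of u
integrand <- function(u) {
  sapply(u, function(ui) {
    prod(dnbinom(y_j, mu = exp(eta_j + ui), size = size)) *
      dnorm(ui, 0, sqrt(Vs))
  })
}

# Marginal likelihood for this school: integrate over the school effect.
# A wide finite range is used; mass beyond 10 SDs of u is negligible.
lim <- 10 * sqrt(Vs)
L_j <- integrate(integrand, -lim, lim)$value

# Contribution of this school to the marginal deviance
dev_j <- -2 * log(L_j)
dev_j
```

With a Gaussian response the same integral has the closed form used
in b) above; here it has to be approximated.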

For non-Gaussian data I never use DIC, and have seriously considered  
removing it from MCMCglmm.

Cheers,

Jarrod

Quoting "Steven J. Pierce" <pierces1 at msu.edu> on Sun, 1 Apr 2012  
09:47:03 -0400:

> Here are a couple references on DIC that I happen to have handy:
>
> Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & van der Linde, A.
> (2002). Bayesian measures of model complexity and fit. Journal of the Royal
> Statistical Society: Series B (Statistical Methodology), 64(4), 583-639.
> doi: 10.1111/1467-9868.00353  http://www.jstor.org/stable/3088806
>
> Barnett, A. G., Koper, N., Dobson, A. J., Schmiegelow, F., & Manseau, M.
> (2010). Using information criteria to select the correct variance-covariance
> structure for longitudinal data in ecology. Methods in Ecology and
> Evolution, 1(1), 15-24. doi: 10.1111/j.2041-210X.2009.00009.x
> http://dx.doi.org/10.1111/j.2041-210X.2009.00009.x
>
>
> Steven J. Pierce, Ph.D.
> Associate Director
> Center for Statistical Training & Consulting (CSTAT)
> Michigan State University
> E-mail: pierces1 at msu.edu
> Web: http://www.cstat.msu.edu
>
> -----Original Message-----
> From: Ray Danner [mailto:danner.ray at gmail.com]
> Sent: Saturday, March 31, 2012 2:24 PM
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] MCMC model selection reference
>
> Dear list,
>
> I'm looking for guidance on model selection using DIC values.  I'm
> particularly interested in comparing mixed models created with the
> package MCMCglmm.  I currently use AIC for my models built with lme
> and (g)lmer and like the ability to calculate evidence ratios and
> model average predictions, which are very easy for readers to
> conceptualize.  AICcmodavg is great for these things.
>
> Can anyone recommend a resource that describes the appropriate use of
> DIC for model selection (and its limitations)?  I'm mainly an
> ecologist, so a less-technical treatment would be ideal.
>
> My main questions are:
> 1. Can DIC be used to select among mixed models?
> Kery and Schaub (2012 p. 42) raise concerns about counting the correct
> number of parameters and state that WinBUGS does not calculate them
> appropriately, though Millar (2009) provides a method that is
> appropriate for hierarchical models.  On the other hand, Saveliev et
> al. (2009) use DIC to compare models with random effects built with
> the BRugs package.  Hadfield's MCMCglmm Tutorial says that lower DIC
> is better, but doesn't give details about use.
>
> 2. Any rules of thumb on what constitutes sufficiently large deltaDIC
> values?  Are evidence ratios acceptable?
>
> 3. Can DIC be used to calculate model average predictions?
>
> Thanks in advance and please forgive me if I missed your publication.
> Ray
>
>
> Refs
> Kery and Schaub. 2012. Bayesian Population Analysis Using WinBUGS: A
> Hierarchical Perspective.
> Millar. 2009. Comparison of hierarchical Bayesian models for
> overdispersed count data using DIC and Bayes' Factors. Biometrics
> 65:962-969.
> Saveliev et al. 2009. Ch. 23 in Zuur, Mixed Effects Models and
> Extensions in Ecology with R.
