[R-sig-ME] WAIC calculation in MCMCglmm

Jed Macdonald jedimacdonald at gmail.com
Thu Nov 30 11:12:35 CET 2017


Hi Jarrod,

Thanks very much for your reply. I have both Gaussian and binomial
responses in these models, as well as some out-of-sample data. So, as you
suggested, I'll try 'refocusing' for the Gaussian case, in addition to
running some validation tests with the hold-out set for both the Gaussian
and binomial models.
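
For the hold-out validation, I'm planning something along these lines. It's
a rough, untested sketch: the data split, model formula and column names are
just placeholders, and I'm assuming a version of MCMCglmm whose predict()
method accepts newdata.

library(MCMCglmm)

# Placeholder split into training and hold-out sets
train <- dat[dat$year <  2015, ]
test  <- dat[dat$year >= 2015, ]

# Fit to the training data only (placeholder formula, default priors)
m_train <- MCMCglmm(y ~ x1 + x2, random = ~ site, data = train,
                    family = "gaussian", nitt = 130000, burnin = 30000,
                    thin = 100, verbose = FALSE)

# Predict the hold-out observations (marginalising over the random effects,
# which is the default for predict.MCMCglmm)
pred <- predict(m_train, newdata = test)

# A simple out-of-sample summary for the Gaussian model
sqrt(mean((test$y - pred)^2))   # RMSE on the hold-out set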

For those keen on understanding the 'focus' of the deviance in multilevel
models, I found a previous thread on this list (answered by Jarrod)
entitled '[R-sig-ME] MCMC model selection reference' pretty helpful.
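
On the refocusing side, here's roughly what I understand it to mean for a
simple Gaussian random-intercept model -- very happy to be corrected.
Everything below is a placeholder sketch: the model, the 'site' and 'units'
variance-component names and the data objects are illustrative, and I'm
assuming the design matrices were stored (the saveX/saveZ defaults), so
that m$X and m$Z are available.

# For a model fitted as, say,
#   m <- MCMCglmm(y ~ x1 + x2, random = ~ site, data = dat, family = "gaussian"),
# evaluate, for each retained draw, the likelihood with the random effects
# integrated out, i.e. y ~ N(X beta, sigma2_site * Z Z' + sigma2_units * I).
library(mvtnorm)

X <- as.matrix(m$X)                 # fixed-effect design matrix
Z <- as.matrix(m$Z)                 # random-effect (site) design matrix
y <- dat$y                          # response from the original data frame
n <- length(y)
S <- nrow(m$Sol)                    # number of retained draws

loglik_marg <- numeric(S)
for (s in 1:S) {
  beta <- m$Sol[s, 1:ncol(X)]       # fixed effects for draw s
  V    <- m$VCV[s, "site"] * tcrossprod(Z) + m$VCV[s, "units"] * diag(n)
  loglik_marg[s] <- dmvnorm(y, mean = as.vector(X %*% beta),
                            sigma = V, log = TRUE)
}

dev_refocused <- -2 * loglik_marg   # deviance focused at the data level
# Note: V is n x n, so this is only practical for moderate n.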

Cheers,
Jed



On Wed, Nov 29, 2017 at 5:44 PM, Jarrod Hadfield <j.hadfield at ed.ac.uk>
wrote:

> Hi Jed,
>
> The problem with all of these information criteria is that the deviance
> is 'focussed' at the highest level in MCMCglmm. For most scientific
> inference the focus should probably be at the lowest level. The problem is
> that the deviance cannot be calculated at the lowest level for a GLMM
> (except in the Gaussian case) - that is why MCMC is being used. If the
> response is Gaussian you could refocus the deviance after the chain has
> run. If not, brute-force cross-validation is probably the way to go, but
> of course in an MCMC context this can be costly in terms of computing time.
>
> Cheers,
>
> Jarrod
>
>
> On 28/11/2017 01:06, Jed Macdonald wrote:
>
>> Dear list,
>>
>> I’ve fitted a series of univariate mixed models of varying complexity in
>> the 'MCMCglmm' package, and would like to compute WAIC for model selection
>> purposes, for comparison with DIC, and with AICc returned for equivalent
>> models fitted in 'lme4'. As I understand it, a first step in the WAIC
>> calculation is to compute the log pointwise predictive density (i.e.
>> pointwise log-likelihood), which is evaluated using draws from the retained
>> posterior simulations (after burn-in). For the number of data points *N*
>> and the number of retained draws *S*, we can then get an *N* x *S*
>> log-likelihood matrix, which can be used to estimate pointwise
>> out-of-sample prediction accuracy (e.g. using WAIC or LOO cross-validation
>> in the ‘loo’ package) (see Gelman et al. 2014, Vehtari et al. 2016 for an
>> overview).
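>>
>> (For concreteness, assuming one already had that matrix of pointwise
>> log-likelihoods -- here a placeholder object log_lik with the *S* retained
>> draws in rows and the *N* data points in columns, which is the orientation
>> the 'loo' package expects -- the final step would simply be:)
>>
>> library(loo)
>> waic(log_lik)   # WAIC
>> loo(log_lik)    # PSIS-LOO cross-validation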
>>
>> MCMCglmm doesn’t return the pointwise log-likelihood directly, so my
>> thinking was to use the deviance (D), given by D = -2 * log-likelihood in
>> MCMCglmm, which is returned for each chain iteration. My questions are: do
>> these values reflect the mean deviance across all *N* data points for a
>> given iteration? And if so, is there a way to decompose this into pointwise
>> deviance (and hence pointwise log-likelihood) values in an MCMCglmm model?
>>
>> Any advice would be much appreciated!
>>
>> Best regards,
>> Jed
>>
>> Gelman, A., Hwang, J. and Vehtari, A. (2014) Understanding predictive
>> information criteria for Bayesian models. Stat Comput 24, 997-1016.
>> Vehtari, A., Gelman, A. and Gabry, J. (2016) Practical Bayesian model
>> evaluation using leave-one-out cross-validation and WAIC.
>> arXiv:1507.04544.
>>
>>
>>
>>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>


-- 
Jed Macdonald
PhD candidate
MARICE
Faculty of Life and Environmental Sciences
University of Iceland
(currently visiting the School of BioSciences, The University of Melbourne)
e: jedimacdonald at gmail.com
t: +61 428 242 066
