[R-sig-ME] glmer Z-test with individual random effects

Jarrod Hadfield j.hadfield at ed.ac.uk
Fri Nov 12 12:51:29 CET 2010

Hi John,

I'm not sure which part of the MCMCglmm routine is unclear but in a  
nut shell:

The data are distributed according to the specified distribution (e.g.  
Poisson ) with the dsitribution parameter equal to the linear  
predictor (nu) on the inverse link scale:

y ~ Pois(exp(nu))

A linear (mixed) model is proposed for the linear predictor:

nu ~ N(Xb+Zu, I*Ve)

In all models a "units" term is included which is the same as an  
observational-level random effect.  This is the residual term in the  
linear part of the model.

Priors are passed as  a list of the form:


where B is the prior for the fixed effects, R is the prior for Ve (or  
possibly a residual covariance matrix depending on the model) and the  
elements of G are the priors for each of the random effect  

The prior for b is multivariate normal Pr(b) ~ N(B$mu, B$V)

The prior for Ve is inverse-Wishart Pr(Ve) ~ IW(R$nu, R$V*R$nu) where  
the first term is the degree of belief and R$V*R$nu is the scale matrix

The prior for the random effects variances (Vg) are set in the same  
way as Ve, although its possible to also fit parameter expanded priors.

When the variance term is scalar (rather than a covariance matrix)  
then the inverse-Wishart and inverse-gamma are the same and IW(R$nu, R 
$V*R$nu) = IG(R$nu/2, R$V*R$nu/2) where the two parameters are the  
shape and scale. Your specification R=list(V=1, nu=0.002)  is  
therefore  IG(0.001, 0.001) which used to be used a lot (and probably  
still is) as a "weak" prior by users of WinBUGS.  The REML estimator  
for Ve should be equal to the posterior mode in MCMCglmm when the  
improper prior R=list(V=0, nu=-2)  is used. 0 will not be accepted by  
MCMCglmm so you would have to use something small (e.g. 1e-6). The  
prior is improper so you have to be careful when using it, and  
although the point estimate may have some useful properties (the same  
as the REML estimate) the full posterior distribution may not be ideal  
with small sample sizes.

The sampling scheme is:

update nu using MH updates or slice sampling
Gibbs sample b and u together
Gibbs sample (co)variance components sequentially.

For other models  such as ordinal regression or simultaneous/recursive  
models there are other schemes also, but these are quite specialised.



On 11 Nov 2010, at 23:45, John Maindonald wrote:

> The Wald tests (as represented in glm and glmer output of z-statistics
> and p-values) are approximate for both glm models and glmm models
> with non-identity links. The approximation can fail badly if the  
> link is
> highly non-linear over a region of the response that is relevant for a
> parameter of interest.  The Hauck-Donner phenomenon, where the
> z-statistic decreases as the effect estimate increases, is an extreme
> example.
> (This happens, e.g., as one of the levels being compared gives a
> fitted value that moves close to a binomial proportion of 0 or 1, or
> close to a Poisson estimate of 0.)
> The additional complication for a glmm is that the SE may have
> two components -- e.g., a poisson or binomial error, and a random
> normal error that is added on the scale of the linear predictor.  This
> random normal error somewhat alleviates variance change effects
> (including Hauck-Donner) that result from non-linearity in the link,
> while adding uncertainty to the SE estimate.  If the contribution from
> the glm family error term is largish relative to contribution from the
> random normal error (> 3 or 4  times as large?), treating the z- 
> statistic
> as normal may not in many circumstances be too unreasonable,
> even if the relevant degrees of freedom are as small as maybe 4
> (e.g., where the test is for consistency across 5 locations).
> On data where I seem to have this situation, fairly consistently
> across a number of responses, I initially used MCMCglmm().
> I was concerned about the contribution from the random normal
> error (including a contribution from an observation level random
> effect term).  I found I hat I had to choose a somewhat informative
> prior (inverse Wishart with V=1, nu=0.002) to consistently get
> convergence.  With this prior, the MCMCglmm credible intervals
> were remarkably close to glmer confidence intervals, treating the
> z-statistics as normal.
> No doubt the effect of the chosen prior is to insist that the true
> random normal error variance is fairly close to the variance as
> estimated by glmer().  A frustration (for me, at least) with using
> MCMCglmm is that I do not know just what MCMCglmm() is doing
> in this respect, short of doing some careful investigative exploration
> of the MCMC simulation results (which is an after-the-event check,
> where I'd like to know before the event).  Comments, Jarrod?
> All this is to emphasise that we are not in the arena, unless the
> relevant degrees of freedom are large, of precise science.
> John Maindonald             email: john.maindonald at anu.edu.au
> phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
> Centre for Mathematics & Its Applications, Room 1194,
> John Dedman Mathematical Sciences Building (Building 27)
> Australian National University, Canberra ACT 0200.
> http://www.maths.anu.edu.au/~johnm
> On 12/11/2010, at 2:18 AM, Ben Bolker wrote:
>> On 11/11/2010 09:58 AM, Jens Åström wrote:
>>> Dear list,
>>> As I have read (Bolker et al. 2009 TREE), the Wald Z test is only
>>> appropriate for GLMMs in cases without overdispersion.
>>> Assuming we use family=poisson with lmer and tackle overdispersion  
>>> by
>>> incorporating an individual random effect AND this adequately  
>>> "reduces"
>>> the overdispersion, is it then OK to use the Wald z test as  
>>> reported by
>>> lmer?
>>> In other words, are the p-values reported by lmer in those cases
>>> useful/"correct"? Or do they suffer from the usual problems with
>>> figuring out the number of parameters used by the random effects?
>> They are equivalent to assuming an infinite/large 'denominator  
>> degrees
>> of freedom'.  If you have a large sample size (both a large number of
>> total samples relative to the number of parameters, and a large  
>> number
>> of random-effects levels/blocks) then this should be reasonable -- if
>> not, then yes, the 'usual problems with figuring out the number of
>> parameters' is relevant.  On the other hand, if you're willing to  
>> assume
>> that the sample size is large, then likelihood ratio rests
>> (anova(model1,model2)) are probably better than the Wald tests  
>> anyway.
>>> Secondly, is it good practice to judge lmer's capability of  
>>> "reducing"
>>> the overdispersion by summing the squared residuals (pearson) and
>>> compare this to a chi square distribution (with N-1 degrees of  
>>> freedom)?
>>  I would say this is reasonable, although again it's a rough guide
>> because the true degrees of freedom are a bit fuzzy -- it should
>> probably be at most N-(fixed effect degrees of freedom)?
>>  Would be happy to hear any conflicting opinions.
>> Ben Bolker
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

More information about the R-sig-mixed-models mailing list