[R-sig-ME] Interpreting the output of summary() of a glmer-object

Tue Sep 25 04:04:19 CEST 2012

Hans Ekbrand <hans at ...> writes:

> First, I have a very simple question. In the summary output of a
> glmer-object, What does the "Variance" and "Std.Dev" mean for the
> Random effects? What is the scale for these measures?

  It's a little hard to say this in a way that doesn't seem
redundant ... "Variance" is the estimated variance of the random
effects, and "Std.Dev" is the corresponding standard deviation,
i.e. the square root of the variance.  The two quantities carry
the same information.  Seeing the variance can be useful because
variances are additive and because mixed models are traditionally
presented in terms of variance decomposition; the standard
deviation can be useful because it is on the same scale as the
estimated fixed-effect coefficients, i.e. the scale of the linear
predictor (the variance is on the square of that scale).

  For example, for a Poisson GLMER

glmer(y~x+(1|grp),family=poisson,...)

  the underlying statistical model is

  Y_{ij} ~ Poisson(lambda_{ij})
  log(lambda_{ij}) = b_0 + b_1*x_{ij} + eps_j
  eps_j ~ Normal(0,sigma^2_g)

  "Variance" is the estimate of sigma^2_g, and "Std.Dev" is the
estimate of sigma_g.

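
  As a rough sketch of the Poisson model above, here is a small
simulation.  Everything in it is made up for illustration (the data
frame d, the true value sigma_g = 0.5, the coefficients 1 and 0.5),
and it assumes a reasonably recent lme4 that provides the
as.data.frame() method for VarCorr().

```r
library(lme4)

set.seed(101)
n_grp <- 30                      # number of groups (hypothetical)
n_per <- 50                      # observations per group (hypothetical)
sigma_g <- 0.5                   # true among-group SD on the log scale

d <- data.frame(grp = factor(rep(seq_len(n_grp), each = n_per)),
                x   = runif(n_grp * n_per))

# one random intercept per group: eps_j ~ Normal(0, sigma_g^2)
eps <- rnorm(n_grp, mean = 0, sd = sigma_g)

# log(lambda_ij) = b_0 + b_1 * x_ij + eps_j, with b_0 = 1, b_1 = 0.5
d$y <- rpois(nrow(d),
             lambda = exp(1 + 0.5 * d$x + eps[as.integer(d$grp)]))

fit <- glmer(y ~ x + (1 | grp), family = poisson, data = d)

# "vcov" is the Variance column of summary(), "sdcor" is Std.Dev.
vc <- as.data.frame(VarCorr(fit))
vc[, c("grp", "vcov", "sdcor")]
```

  The sdcor value should land somewhere near the sigma_g used in the
simulation, and both it and the fixed-effect estimates are on the log
(linear predictor) scale.
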
> 
> load(url("http://sociologi.cjb.net/temp/a.strange.df.RData"))
> my.fit.1 <- glmer(MV744A ~ (1|MV024), 
>    data = a.strange.df, family = "binomial")
> summary(my.fit.1)
> 
> Generalized linear mixed model fit by the Laplace approximation 
> Formula: MV744A ~ (1 | MV024) 
>    Data: a.strange.df 
>    AIC   BIC logLik deviance
>  76209 76227 -38102    76205
> Random effects:
>  Groups Name        Variance Std.Dev.
>  MV024  (Intercept) 0.40558  0.63685 
> Number of obs: 73601, groups: MV024, 29
> 
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)    
> (Intercept)  -1.4187     0.1191  -11.91   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
> 
> I think that I understand that if the Variance term, here 0.40558 is
> low relative to the Std.Dev, 

  This is not really a meaningful statement.  The Variance reported
is always the (Std.Dev)^2.
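
  A quick arithmetic check with the numbers from the two summaries
in this thread (the variable names are just mine):

```r
# values copied from the printed summaries of my.fit.1 and my.fit.5
sd1 <- 0.63685; var1 <- 0.40558
sd5 <- 0.6845;  var5 <- 0.46855

# squaring the printed Std.Dev. recovers the printed Variance,
# up to the rounding of the printed values
round(sd1^2, 5)
abs(sd5^2 - var5) < 5e-5
```
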

> there is not much variation caught by the
> random term (in this case where the random term represents "Regions",
> few Regions would then significantly differ from the grand mean). 

  It's probably easiest to compare the standard deviation to
the fixed-effect coefficients.  It's a little hard to know whether
it's "small" because there's not really anything else in this model
to compare it to.  You could compare it to the intercept, but the
intercept only measures how far the baseline probability is from
0.5 (which corresponds to 0 on the logit scale), so that comparison
is probably not meaningful ...
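
  One possible way to get a feel for the size of the standard
deviation (a sketch, not the only way to do it) is to back-transform
from the logit scale to the probability scale.  The numbers are taken
from summary(my.fit.1) above; the variable names and the choice of a
+/- 1.96 SD range are my own assumptions.

```r
b0   <- -1.4187     # fixed intercept, logit scale
sd_g <- 0.63685     # among-region Std.Dev., logit scale

# probability for a region whose random effect is exactly zero
p_typical <- plogis(b0)

# probabilities for regions 1.96 SDs below and above that
p_range <- plogis(b0 + c(-1.96, 1.96) * sd_g)

round(c(typical = p_typical, lower = p_range[1], upper = p_range[2]), 3)
```

  This gives a probability of about 0.19 for a typical region, with
the middle 95% of regions running from roughly 0.06 to 0.46 under the
fitted Normal distribution of random effects.
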

> Here
> we have a big underlying n, which might explain that most Regions did
> significantly differ from the mean.

  Not clear to me what this means.

> dotplot(ranef(my.fit.1, postVar = TRUE))
> 
> Secondly, after adding several fixed terms, each with a substantial effect, I
> would (given my vague understanding of what the "Variance" term means)
> expect the "Variance" of the random effect to decrease, but on the
> contrary it increased:
> 
> summary(my.fit.5 <- glmer(MV744A ~ (1|MV024) + MV025 + 
> MV106 + MV012 + MV130, data = a.strange.df, family = "binomial"))
> 
> Generalized linear mixed model fit by the Laplace approximation 
> Formula: MV744A ~ (1 | MV024) + MV025 + MV106 + MV012 + MV130 
>    Data: a.strange.df
>    AIC   BIC logLik deviance
>  73327 73483 -36646    73293
> Random effects:
>  Groups Name        Variance Std.Dev.
>  MV024  (Intercept) 0.46855  0.6845  
> Number of obs: 73560, groups: MV024, 29
> 

 [snip]

> Sure, the Std.Dev of the random effect also increased (from 0.63685 to
> 0.6845) but still, isn't the increase of the variance of random effect
> (from 0.40558 to 0.46855) rather odd here?

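  Not necessarily.  We're not necessarily talking about "explained
variance" here.  In a GLMM the among-group variance is measured on
the scale of the linear predictor (here the logit scale), and there
is no separately estimated residual variance for new fixed effects
to absorb, so adding predictors can leave the among-group variance
unchanged or even increase it.  Note also that the two fits use
slightly different numbers of observations (73601 vs. 73560).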

> The caterpillar plot for my.fit.5, shows all regions except 3 of them
> differ significantly from the mean, even when controlling for the fixed
> terms.

  In the context of a mixed model it probably doesn't make a lot
of sense to discuss regions "differing significantly from the mean" --
if you want to do hypothesis tests like that, you should treat
the region as a fixed effect ...
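
  A minimal sketch of what treating region as a fixed effect could
look like, using simulated data rather than the data set from the
question (the object names, the 10 regions, and the true values are
all invented for illustration).  With sum-to-zero contrasts each
coefficient is a region's deviation from the unweighted mean of the
region logits, so the Wald tests in the coefficient table are tests
of "this region differs from the mean".

```r
set.seed(1)
n_reg <- 10                      # number of regions (hypothetical)
n_per <- 200                     # observations per region (hypothetical)
region <- factor(rep(seq_len(n_reg), each = n_per))

# region-specific logits scattered around -1.4
true_logit <- -1.4 + rnorm(n_reg, sd = 0.6)
y <- rbinom(n_reg * n_per, size = 1,
            prob = plogis(true_logit[as.integer(region)]))

# region as a fixed effect with sum-to-zero contrasts; the last
# region's deviation is minus the sum of the others
fit_fixed <- glm(y ~ region, family = binomial,
                 contrasts = list(region = "contr.sum"))

coef(summary(fit_fixed))         # estimates, SEs, z values, p values
```
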