# [R-sig-ME] Between and within variance in a GLMM

Douglas Bates bates at stat.wisc.edu
Wed Nov 26 15:29:55 CET 2008

```
On Wed, Nov 26, 2008 at 5:11 AM, Renwick, A. R. <a.renwick at abdn.ac.uk> wrote:

> I have a wee query regarding the variation explained by the random effect in a GLMM using lme4 package.  In the model output I was under the assumption that the value given as Std Dev is the VARIANCE BETWEEN the random factor and the VARIANCE WITHIN could be calculated using the following equation:

> Variance WITHIN = ST value x Variance BETWEEN.
> Therefore, Variance WITHIN = ST value x Std Dev value in model output.

> However, the ST value is always equal to the Std Dev given in model output so can someone clarify how to calculate the within and between variance of the random effect?

> EXAMPLE:

> Generalized linear mixed model fit by the Laplace approximation
> Formula: fleapresence ~ sex + width + sess + Nhat + alt + width:sess +      (1 | LocTran)
>   Data: flea
>  AIC  BIC logLik deviance
>  1797 1863 -886.7     1773
> Random effects:
>  Groups  Name        Variance Std.Dev.
>  LocTran (Intercept) 0.10512  0.32422
> Number of obs: 1697, groups: LocTran, 14

>  ..@ ST      :List of 1
>  .. ..$ : num [1, 1] 0.324

Is fleapresence a binary response?

Your question doesn't have an answer for models fit to a binary
response (Bernoulli or binomial conditional distribution) or to a
count response (Poisson conditional distribution).

The way that the GLMM models fit by glmer are defined, the
unconditional distribution of the random effects is always a
multivariate Gaussian distribution with mean zero and a parameterized
variance-covariance matrix.  Thus there are parameters in the model
that represent the variances of the random effects. The conditional
distribution of the response given a value of the random effects can
be Gaussian or Bernoulli or Poisson or gamma or ...  We do require that
the components of the response be conditionally independent and that
the scalar conditional distributions be completely determined by the
conditional mean and, at most, one additional parameter that is common
to all components.  If this parameter exists then a "within" sum of squares
has an interpretation.  However, the Bernoulli or Poisson conditional
distributions are completely determined by the conditional mean so
there isn't any separate scale parameter.  (Those who would claim that
the quasibinomial or quasipoisson families allow for this should bear
in mind that these families do not correspond to actual probability
distributions, which is why it is difficult to define a likelihood for
such models.  These families are artificial constructs with, at best,
questionable mathematical justification.)

When you leave the world of Gaussian distributions and start dabbling
in other conditional distributions you must give up many convenient
properties.  In the Gaussian distribution the mean and
variance/covariance are orthogonal to one another.  You can change the
mean without affecting the variance/covariance and vice versa.  With
other distributions you can't do that.

This is all to say that a "within sum of squares" doesn't correspond
to any parameter or property of the distributions in the model.  One
could create a number that kind-of, sort-of represents something like
the estimate of the scalar conditional variance in a Gaussian mixed
model but it has no meaning for the model.  It is strictly an
artificial construct.

This is not a criticism of your question.  To me the fact that so many
concepts in the realm of analysis of variance are deeply misunderstood
is a symptom of the way that statisticians have taught the subject.
R. A. Fisher had brilliant geometric insight and was able to determine
how comparisons between certain nested models could be conveniently
summarized by partitioning the variability of the response, as
expressed by the sum of squares of the deviations about the mean, into
orthogonal components.  Unfortunately we often present these beautiful
mathematical results by starting at the intermediate steps - sums of
squares, degrees of freedom, mean squares, expected mean squares - and
not mentioning that these are simply computational short-cuts that
only apply to certain, very highly structured models, and furthermore
they are unnecessary in modern computing environments.  The result,
unfortunately, is that we ascribe properties or interpretations to
models inappropriately.

Again, let me emphasize that I am not criticising your question.  If I
haven't explained myself adequately let me know and I will try again.

```
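
The argument above can be written out as a short worked sketch. The notation is illustrative and not taken from the original message: it assumes that fleapresence is binary (the reply only asks whether it is), that the default logit link was used, and it writes $b_j$ for the LocTran random intercept, $\sigma_b^2$ for its variance and $\sigma^2$ for the Gaussian residual variance. The last line checks the quoted model output, where the Std.Dev. column is the square root of the Variance column.

```latex
% Illustrative notation only; binary response and logit link are assumptions.
\begin{align*}
% Random effects: always Gaussian, with a variance parameter ("between").
b_j &\sim \mathcal{N}(0, \sigma_b^2), \qquad j = 1, \dots, 14 \\
% Gaussian mixed model: a separate common scale parameter exists ("within").
\text{Gaussian:}\quad y_{ij} \mid b_j &\sim \mathcal{N}(\mu_{ij}, \sigma^2),
  \qquad \operatorname{Var}(y_{ij} \mid b_j) = \sigma^2 \\
% Bernoulli GLMM: the conditional variance is a function of the mean alone.
\text{Bernoulli:}\quad y_{ij} \mid b_j &\sim \mathrm{Bernoulli}(\mu_{ij}),
  \qquad \operatorname{logit}(\mu_{ij}) = x_{ij}^{\top}\beta + b_j \\
\operatorname{Var}(y_{ij} \mid b_j) &= \mu_{ij}\,(1 - \mu_{ij}) \\
% Quoted output: Variance 0.10512, Std.Dev. 0.32422 for LocTran (Intercept).
\hat{\sigma}_b^2 &= 0.10512, \qquad
  \hat{\sigma}_b = \sqrt{0.10512} \approx 0.32422
\end{align*}
```

Read this way, the only variance parameter in the fitted model is $\sigma_b^2$. In the Bernoulli case there is no counterpart to $\sigma^2$, because the conditional variance changes with $\mu_{ij}$ from one observation to the next.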