[R-sig-ME] Sum of squares discrepancy in lmer

Wed Sep 18 20:04:06 CEST 2013

Perhaps this has been answered before, but if so I am not aware of it,
and I couldn't find an efficient way to search the archive.
Consider the following small fake dataset:

> foo <- expand.grid(A=c("a1","a2","a3"), B=c("b1","b2"), rep=1:3)
> cell.means <- c(20,23,23, 16,17,18)
> set.seed(1234567)
> foo$y <- rnorm(18, mean=rep(cell.means,3), sd=5)

I will consider A fixed and B random. Depending on whether they are
crossed or nested, we get the following standard analyses.

> library(lme4)
> foo.2way <- lmer(y ~ A + (1|B) + (1|A:B), data=foo)
> foo.nest <- lmer(y ~ A + (1|A:B), data=foo)

Now compare the ANOVA tables:

> anova(foo.2way)
Analysis of Variance Table
  Df Sum Sq Mean Sq F value
A  2 95.613  47.807  1.9603

> anova(foo.nest)
Analysis of Variance Table
  Df Sum Sq Mean Sq F value
A  2   68.5   34.25  1.4044

If I were doing the analysis the old-fashioned way, I'd have the same
SS(A) in both of these models; yet they are different here. Moreover,
the value of SS(A) from hand calculations is 136.547, which is different
from either ANOVA table above. Here are the calculations using lm:

> anova(lm(y ~ A*B, data = foo))
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq F value Pr(>F)
A          2 136.547  68.274  2.7996 0.1005
B          1  76.182  76.182  3.1238 0.1026
A:B        2  69.657  34.828  1.4281 0.2777
Residuals 12 292.647  24.387              

I can verify that for the crossed model, I get F = 68.274/34.828 = 1.96;
and for the nested model, I pool B and A:B together for a MS of 48.613
(3 df) and hence an F ratio of 68.274/48.613 = 1.40. So in the anova
output for lmer, the F ratios match up, but the sums of squares are
scaled somehow. Why is this?
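
For anyone who wants to check those two ratios, here is a minimal sketch
that recomputes them from the lm table. The numbers are copied by hand
from the output above, and the variable names (ss, df, ms.pooled, and so
on) are just illustrative choices, not anything lmer produces.

```r
# Sums of squares and df copied by hand from the anova(lm(y ~ A*B)) table
ss <- c(A = 136.547, B = 76.182, AB = 69.657)
df <- c(A = 2, B = 1, AB = 2)
ms <- ss / df                      # mean squares for A, B, and A:B

# Crossed model: test A against the A:B interaction mean square
F.crossed <- ms[["A"]] / ms[["AB"]]

# Nested model: pool B and A:B into one B-within-A term (1 + 2 = 3 df)
ms.pooled <- (ss[["B"]] + ss[["AB"]]) / (df[["B"]] + df[["AB"]])
F.nested  <- ms[["A"]] / ms.pooled

# Print the two hand-computed F ratios and the pooled mean square
round(c(crossed = F.crossed, nested = F.nested, pooled.MS = ms.pooled), 3)
```

To rounding, the two ratios agree with the F values that anova() reports
for foo.2way and foo.nest.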

Russ

Russell V. Lenth  -  Professor Emeritus
Department of Statistics and Actuarial Science   
The University of Iowa  -  Iowa City, IA 52242  USA   
Voice (319)335-0712 (Dept. office)  -  FAX (319)335-3017