[R] Balanced design, differences in results using anova and lmer/anova
Lars Kunert
lkunert at mpi-sb.mpg.de
Fri Feb 27 10:06:17 CET 2009
Hi, I am trying to do an analysis of variance for an unbalanced design.
As a toy example, I use a dataset presented by K. Hinkelmann and O.
Kempthorne in "Design and Anaylysis of Experiments" (p353-356).
This example is very similar to my own dataset, with one difference: it
is balanced.
Thus it is possible to do an anaylsis using both: (1) anova, and (2) lmer.
Furthermore, I can compare my results with the results presented in the
book (the book uses SAS).
In short:
> using anova, I can reproduce the results presented in the book.
> using lmer, I fail to reproduce the results
However, for my "real" analysis, I need lmer - what do I do wrong?
The example uses as randomized complete block desigh (RCBD) with a
nested blocking structure
and subsampling.
response:
height (of some trees)
covariates:
HSF (type of the trees)
nested covariates:
loc (location)
block (block is nested in location)
# the data (file: pine.txt) looks like this:
loc block HSF height
1 1 1 210
1 1 1 221
1 1 2 252
1 1 2 260
1 1 3 197
1 1 3 190
1 2 1 222
1 2 1 214
1 2 2 265
1 2 2 271
1 2 3 201
1 2 3 210
1 3 1 220
1 3 1 225
1 3 2 271
1 3 2 277
1 3 3 205
1 3 3 204
1 4 1 224
1 4 1 231
1 4 2 270
1 4 2 283
1 4 3 211
1 4 3 216
2 1 1 178
2 1 1 175
2 1 2 191
2 1 2 193
2 1 3 182
2 1 3 179
2 2 1 180
2 2 1 184
2 2 2 198
2 2 2 201
2 2 3 183
2 2 3 190
2 3 1 189
2 3 1 183
2 3 2 200
2 3 2 195
2 3 3 197
2 3 3 205
2 4 1 184
2 4 1 192
2 4 2 197
2 4 2 204
2 4 3 192
2 4 3 190
#
# then I load the data
#
read.data = function()
{
d = read.table( "pines.txt", header=TRUE )
d$loc = as.factor( d$loc )
d$block.tmp = as.factor( d$block )
d$block = ( d$loc:d$block.tmp )[drop=TRUE] # lme4 does not support
implicit nesting
d$HSF = as.factor( d$HSF )
return( d )
}
d = read.data()
#
# using anova.....
#
m.aov = aov( height ~ HSF*loc + Error(loc/block + HSF:loc/block), data=d )
summary( m.aov )
#
# I get:
#
Error: loc
Df Sum Sq Mean Sq
loc 1 20336 20336
Error: loc:block
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 6 1462.33 243.72
Error: loc:HSF
Df Sum Sq Mean Sq
HSF 2 12170.7 6085.3
HSF:loc 2 6511.2 3255.6
Error: loc:block:HSF
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 12 301.167 25.097
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 24 529.00 22.04
#
# which is, what I expected, however, using lmer....
#
m.lmer = lmer( height ~ HSF*loc + HSF*(loc|block), data=d )
anova( m.lmer )
#
# I get:
#
Analysis of Variance Table
Df Sum Sq Mean Sq
HSF 2 12170.7 6085.3
loc 1 1924.6 1924.6
HSF:loc 2 6511.2 3255.6
#
# what is, at least not what I expected...
#
Thanks for your help, Lars
More information about the R-help
mailing list