[R-sig-ME] Split-plot Design

Fri Mar 21 14:51:44 CET 2008

John Maindonald wrote:
> I do not think it quite true that the aov model that has an
> Error() term is a fixed effects model.  The use of the word
> "stratum" implies that a mixed effects model is lurking
> somewhere.  The F-tests surely assume such a model.

This is something that has been bothering me for a while.  Often it's 
argued that ANOVA is just regression; clearly this is not true when it's 
a repeated measures ANOVA, unless "regression" is interpreted broadly. 
I think Andrew Gelman argues this somewhere.  I don't see how to get aov 
to give me a formula, and lm doesn't fit stuff with an Error() term, but 
if it could, logically I would expect the formula to resemble closely 
the sort of thing you get with a mixed effects models.

The closest analogy I can find is that doing this...

 > aov1 = aov(yield ~  N+P+K + Error(block), npk)
 > summary(aov1)

Error: block
           Df Sum Sq Mean Sq F value Pr(>F)
Residuals  5    343      69

Error: Within
           Df Sum Sq Mean Sq F value Pr(>F)
N          1  189.3   189.3   11.82 0.0037 **
P          1    8.4     8.4    0.52 0.4800
K          1   95.2    95.2    5.95 0.0277 *
Residuals 15  240.2    16.0
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

[I believe the F's and p's here come from a linear construction of 
nested models (i.e., N versus intercept only; N+P versus N; plus N+P+K 
versus N+P).]

... is a bit like doing this with lmer...

 > lmer0 = lmer(yield ~  1 + (1|block), npk)
 > lmer1 = lmer(yield ~  N + (1|block), npk)
 > lmer2 = lmer(yield ~  N+P + (1|block), npk)
 > lmer3 = lmer(yield ~  N+P+K + (1|block), npk)
 > anova(lmer0,lmer1)
Data: npk
Models:
lmer0: yield ~ 1 + (1 | block)
lmer1: yield ~ N + (1 | block)
       Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer0  2 157.5 159.8  -76.7
lmer1  3 151.5 155.1  -72.8  7.93      1     0.0049 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 > anova(lmer1,lmer2)
Data: npk
Models:
lmer1: yield ~ N + (1 | block)
lmer2: yield ~ N + P + (1 | block)
       Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer1  3 151.5 155.1  -72.8
lmer2  4 153.0 157.8  -72.5  0.47      1       0.49
 > anova(lmer2,lmer3)
Data: npk
Models:
lmer2: yield ~ N + P + (1 | block)
lmer3: yield ~ N + P + K + (1 | block)
       Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer2  4 153.0 157.8  -72.5
lmer3  5 149.0 154.9  -69.5  6.02      1      0.014 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Namely, fitting the models, and making comparisons using likelihood 
ratio tests.  Okay, the LLR is asymptotically chisq distributed with df 
the difference in parameters whereas... well F is different - I always 
get confused with the df and what bits of the residuals plug in where.

But, is this vaguely right?  (I'm aware that order matters when the 
design isn't balanced, and have read much about the type I errors versus 
type III errors disagreements.)

Actually this does lead to one question: following through the analogy, 
are repeated measures ANOVAs (those fitted with aov) always random 
intercept models, i.e. with no random slopes?

Cheers,

Andy

-- 
Andy Fugard, Postgraduate Research Student
Psychology (Room F3), The University of Edinburgh,
   7 George Square, Edinburgh EH8 9JZ, UK
Mobile: +44 (0)78 123 87190   http://www.possibly.me.uk