[R-sig-ME] Split-plot Design

Fri Mar 21 15:22:33 CET 2008

The following (which uses the variable names in the original query)
shows that "lme" and "aov" produce the same results:

(mod <- lme(measure ~ nitro*year,random= ~ 1|bloc/nitro))
anova(mod)
summary(aov(measure ~ nitro*year + Error(bloc/nitro)))

(In the text, Kuehl rounded off the mean squares before computing
the F statistics, so they don't match.)

This is the classical "split plot" analysis, and for balanced data
there is no controversy over the F statistics.  Of course as always
we must assume (I prefer "pretend") that the random effects, including
the "errors", have normal distributions with constant variances.

The trouble arises with unbalanced data and/or more complex models.
Hopefully, reliable and convenient inference methods for those cases
will be developed, but until then those of us who continue to use
"nlme" (or SAS, or ...) must understand that we do so at our own risk.

Regards,   Rob Kushler

Andy Fugard wrote:
> John Maindonald wrote:
>> I do not think it quite true that the aov model that has an
>> Error() term is a fixed effects model.  The use of the word
>> "stratum" implies that a mixed effects model is lurking
>> somewhere.  The F-tests surely assume such a model.
> 
> This is something that has been bothering me for a while.  Often it's 
> argued that ANOVA is just regression; clearly this is not true when it's 
> a repeated measures ANOVA, unless "regression" is interpreted broadly. 
> I think Andrew Gelman argues this somewhere.  I don't see how to get aov 
> to give me a formula, and lm doesn't fit stuff with an Error() term, but 
> if it could, logically I would expect the formula to resemble closely 
> the sort of thing you get with a mixed effects models.
> 
> The closest analogy I can find is that doing this...
> 
> 
>  > aov1 = aov(yield ~  N+P+K + Error(block), npk)
>  > summary(aov1)
> 
> Error: block
>            Df Sum Sq Mean Sq F value Pr(>F)
> Residuals  5    343      69
> 
> Error: Within
>            Df Sum Sq Mean Sq F value Pr(>F)
> N          1  189.3   189.3   11.82 0.0037 **
> P          1    8.4     8.4    0.52 0.4800
> K          1   95.2    95.2    5.95 0.0277 *
> Residuals 15  240.2    16.0
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> 
> [I believe the F's and p's here come from a linear construction of 
> nested models (i.e., N versus intercept only; N+P versus N; plus N+P+K 
> versus N+P).]
> 
> ... is a bit like doing this with lmer...
> 
> 
>  > lmer0 = lmer(yield ~  1 + (1|block), npk)
>  > lmer1 = lmer(yield ~  N + (1|block), npk)
>  > lmer2 = lmer(yield ~  N+P + (1|block), npk)
>  > lmer3 = lmer(yield ~  N+P+K + (1|block), npk)
>  > anova(lmer0,lmer1)
> Data: npk
> Models:
> lmer0: yield ~ 1 + (1 | block)
> lmer1: yield ~ N + (1 | block)
>        Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
> lmer0  2 157.5 159.8  -76.7
> lmer1  3 151.5 155.1  -72.8  7.93      1     0.0049 **
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>  > anova(lmer1,lmer2)
> Data: npk
> Models:
> lmer1: yield ~ N + (1 | block)
> lmer2: yield ~ N + P + (1 | block)
>        Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
> lmer1  3 151.5 155.1  -72.8
> lmer2  4 153.0 157.8  -72.5  0.47      1       0.49
>  > anova(lmer2,lmer3)
> Data: npk
> Models:
> lmer2: yield ~ N + P + (1 | block)
> lmer3: yield ~ N + P + K + (1 | block)
>        Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
> lmer2  4 153.0 157.8  -72.5
> lmer3  5 149.0 154.9  -69.5  6.02      1      0.014 *
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> 
> Namely, fitting the models, and making comparisons using likelihood 
> ratio tests.  Okay, the LLR is asymptotically chisq distributed with df 
> the difference in parameters whereas... well F is different - I always 
> get confused with the df and what bits of the residuals plug in where.
> 
> But, is this vaguely right?  (I'm aware that order matters when the 
> design isn't balanced, and have read much about the type I errors versus 
> type III errors disagreements.)
> 
> Actually this does lead to one question: following through the analogy, 
> are repeated measures ANOVAs (those fitted with aov) always random 
> intercept models, i.e. with no random slopes?
> 
> 
> Cheers,
> 
> Andy
> 
>