[R] 2^2 problem revisited

Thu Nov 13 23:31:10 CET 2008

Edna Bell wrote:
> Dear R gurus:
> 
> Here is the following from Montgomery's Design and Analysis of
> Experiments, 5th edition.
> 
>> str(rout1.df)
> 'data.frame':   16 obs. of  3 variables:
>  $ resp: num  18.2 18.9 12.9 14.4 27.2 24 22.4 22.5 15.9 14.5 ...
>  $ A   : Factor w/ 2 levels "-1","1": 1 1 1 1 2 2 2 2 1 1 ...
>  $ B   : Factor w/ 2 levels "-1","1": 1 1 1 1 1 1 1 1 2 2 ...
>> rout1.df
>    resp  A  B
> 1  18.2 -1 -1
> 2  18.9 -1 -1
> 3  12.9 -1 -1
> 4  14.4 -1 -1
> 5  27.2  1 -1
> 6  24.0  1 -1
> 7  22.4  1 -1
> 8  22.5  1 -1
> 9  15.9 -1  1
> 10 14.5 -1  1
> 11 15.1 -1  1
> 12 14.2 -1  1
> 13 41.0  1  1
> 14 43.9  1  1
> 15 36.3  1  1
> 16 39.9  1  1
>> rout1.aov <- aov(resp~A*B,data=rout1.df)
>> summary(rout1.aov)
>             Df  Sum Sq Mean Sq F value    Pr(>F)
> A            1 1107.23 1107.23 185.252 1.175e-08 ***
> B            1  227.26  227.26  38.023 4.826e-05 ***
> A:B          1  303.63  303.63  50.801 1.201e-05 ***
> Residuals   12   71.72    5.98
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> When you test on interaction, you reject (of course).
> 
> Now, I thought that you could not test on the main effects, A and B.
> Is that true, please?

Well, you _can_; the question is whether you should. And if you do, what 
does it mean?

It is quite instructive in this case to see what happens if you do the 
same thing with lm() and treatment contrasts:

                       Estimate Std. Error t value Pr(>|t|)
(Intercept)             16.100      1.222  13.171 1.70e-08 ***
factor(A)1               7.925      1.729   4.584 0.000628 ***
factor(B)1              -1.175      1.729  -0.680 0.509595
factor(A)1:factor(B)1   17.425      2.445   7.127 1.20e-05 ***

Notice how the significance of B has disappeared completely.
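
For anyone who wants to reproduce the table above, here is a minimal 
sketch. The call itself is not shown in this message, so the data frame 
name `df`, the numeric coding of A and B, and the factor(A)*factor(B) 
formula are assumptions inferred from the coefficient labels; the data 
are those from the original post.

```r
# Data from the original post. A and B are stored as numbers here (an
# assumption) so that factor() in the formula gives the labels seen above.
df <- data.frame(
  resp = c(18.2, 18.9, 12.9, 14.4, 27.2, 24.0, 22.4, 22.5,
           15.9, 14.5, 15.1, 14.2, 41.0, 43.9, 36.3, 39.9),
  A = rep(rep(c(-1, 1), each = 4), times = 2),
  B = rep(c(-1, 1), each = 8)
)

# Treatment contrasts are R's default for unordered factors, so no
# contrasts argument is needed.
fit <- lm(resp ~ factor(A) * factor(B), data = df)

# Coefficient table: estimates, standard errors, t values, p-values
coef(summary(fit))
```

With this coding the intercept is the mean of the A=-1, B=-1 cell, and 
each main-effect coefficient is a difference taken at the reference 
level of the other factor.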

Things become clearer if you produce the actual table of means:

 > with(df,tapply(resp,list(A=A,B=B),mean))
     B
A        -1      1
   -1 16.100 14.925
   1  24.025 40.275

The point is that lm() is testing the effect of B at A=-1, i.e. 16.100 
vs. 14.925, whereas the aov test for B is based on the means averaged 
over A:

 > with(df,tapply(resp,list(B=B),mean))
B
      -1       1
20.0625 27.6000
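
One way to check that the aov test for B really concerns these marginal 
means is to refit with sum-to-zero contrasts, where the B coefficient is 
a deviation from the grand mean. This is only a sketch: the object names 
`fit_sum`, `t_B` and `F_B` are made up for illustration, and the equality 
of t^2 and F relies on the design being balanced, as it is here.

```r
# Data from the original post, with A and B as factors
rout1.df <- data.frame(
  resp = c(18.2, 18.9, 12.9, 14.4, 27.2, 24.0, 22.4, 22.5,
           15.9, 14.5, 15.1, 14.2, 41.0, 43.9, 36.3, 39.9),
  A = factor(rep(rep(c(-1, 1), each = 4), times = 2)),
  B = factor(rep(c(-1, 1), each = 8))
)

# Sum-to-zero contrasts: main effects become deviations of the marginal
# means from the grand mean, instead of differences at a reference level.
fit_sum <- lm(resp ~ A * B, data = rout1.df,
              contrasts = list(A = "contr.sum", B = "contr.sum"))

# t statistic for the B main effect under sum contrasts
t_B <- coef(summary(fit_sum))["B1", "t value"]

# F statistic for B from the sequential ANOVA table of the same fit
F_B <- anova(fit_sum)["B", "F value"]

# In this balanced design the two agree: t^2 equals F
c(t_squared = t_B^2, F = F_B)
```

The B1 coefficient here is the B=-1 marginal mean minus the grand mean, 
so its test is a test on the averages over A rather than on a single 
row of the table of means.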

However, looking at the full table, the real story seems to be that B 
has no effect at A=-1 but a substantial one at A=1 (24.025 vs. 40.275) 
or, put differently, that the effect of A is larger when B=1 than when 
B=-1.
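
The effect of B within each level of A can also be estimated directly 
with a nested formula. This is a sketch of one possible approach, not 
something done in the analysis above; the names `df` and `nested` and 
the numeric coding of A and B are assumptions.

```r
# Data from the original post, A and B stored as numbers (an assumption)
df <- data.frame(
  resp = c(18.2, 18.9, 12.9, 14.4, 27.2, 24.0, 22.4, 22.5,
           15.9, 14.5, 15.1, 14.2, 41.0, 43.9, 36.3, 39.9),
  A = rep(rep(c(-1, 1), each = 4), times = 2),
  B = rep(c(-1, 1), each = 8)
)

# A/B expands to A + A:B, i.e. a separate B effect within each level of A.
# It is the same fitted model as A*B, just parametrized differently.
nested <- lm(resp ~ factor(A) / factor(B), data = df)

# The last two rows are the B effect at A=-1 and the B effect at A=1
coef(summary(nested))
```

The two B-within-A estimates are the row-wise differences in the table 
of means (14.925 - 16.100 and 40.275 - 24.025), each with its own t 
test.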

-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907