[R] Anova - interpretation of the interaction term

Sat Apr 23 17:55:38 CEST 2005

> From: Ted.Harding at nessie.mcc.ac.uk
> 
> On 23-Apr-05 Bill.Venables at csiro.au wrote:
[snip]

> [technical setup snipped]
> 
> >> contrasts(dat$vac_inf) <- ginv(m)
> >> gm <- aov(y ~ vac_inf, dat)
> >> summary(gm)
> >             Df  Sum Sq Mean Sq F value  Pr(>F)
> > vac_inf      3 12.1294  4.0431   7.348 0.04190
> > Residuals    4  2.2009  0.5502
> > 
> > This doesn't tell us too much other than there are differences,
> > probably.  Now to specify the partition:
> >                 
> >> summary(gm, 
> >       split = list(vac_inf = list("- vs +|N" = 1, 
> >                                   "- vs +|Y" = 2)))
> >                     Df  Sum Sq Mean Sq F value  Pr(>F)
> > vac_inf              3 12.1294  4.0431  7.3480 0.04190
> >   vac_inf: - vs +|N  1  7.9928  7.9928 14.5262 0.01892
> >   vac_inf: - vs +|Y  1  3.7863  3.7863  6.8813 0.05860
> > Residuals            4  2.2009  0.5502
> 
> Wow, Bill! Dazzling. This is like watching a rabbit hop
> into a hat, and fly out as a dove. I must study this syntax.
> But where can I find out about the "split" argument to "summary"?
> I've found the *function* split, whose effect is similar, but
> I've wandered around the "summary", "summary.lm" etc. forest
> for a while without locating the *argument*.

gm is fitted with aov(), so wouldn't it make sense to look at ?summary.aov?
It says:

split an optional named list, with names corresponding 
      to terms in the model. Each component is itself a 
      list with integer components giving contrasts 
      whose contributions are to be summed. 

and even has an example showing how it's used.

Andy

> My naive ("cop-out") approach would have been on the lines
> of (without setting up the contrast matrix):
> 
>   summary(aov(y~vac*inf,data=dat))
>               Df  Sum Sq Mean Sq F value  Pr(>F)  
>   vac          1  0.3502  0.3502  0.6364 0.46968  
>   inf          1 11.3908 11.3908 20.7017 0.01042 *
>   vac:inf      1  0.3884  0.3884  0.7058 0.44812  
>   Residuals    4  2.2009  0.5502                  
> 
> so we get the 2.2009 on 4 df SS for redisuals with mean SS 0.5502.
> 
> Then I would do:
> 
>   mNp<-mean(y[(vac=="N")&(inf=="+")])
>   mNm<-mean(y[(vac=="N")&(inf=="-")])
>   mYp<-mean(y[(vac=="Y")&(inf=="+")])
>   mYm<-mean(y[(vac=="Y")&(inf=="-")])
> 
>   c(     mYp,       mYm,       mNp,      mNm       )
>   ##[1]  2.4990492  0.5532018  2.5212655 -0.3058972
> 
>   c(    mYp-mYm,   mNp-mNm )
>   ##[1] 1.945847   2.827163
> 
> 
> after which:
> 
>   1-pt(((mYp-mYm)/sqrt(0.5502)),4)
>   ##[1] 0.02929801
> 
>   1-pt(((mNp-mNm)/sqrt(0.5502)),4)
>   ##[1] 0.009458266
> 
> give you 1-sided t-tests, and
> 
>   1-pf(((mYp-mYm)/sqrt(0.5502))^2,1,4)
>   ##[1] 0.05859602
> 
>   1-pf(((mNp-mNm)/sqrt(0.5502))^2,1,4)
>   ##[1] 0.01891653
> 
> give you F-tests (equivalent to 2-sided t-tests) which agree
> with Bill's results above.
> 
> And, in this case, presenting the results as mean differences
> shows that the effect of infection appears to differ between
> vaccinated and unvaccinated groups; but a simple test shows
> this not to be significant:
> 
>   1-pf( (sqrt(1/2)*((mYp-mYm)-(mNp-mNm))/sqrt(0.5502))^2, 1,4)
>   ##[1] 0.4481097
> 
> As I said above, this would be my naive approach to this
> particular case (and to any like it), and I would expect
> to be able to explain all this in simple terms to a colleague
> who was asking the sort of questions you have reported.
> Or is it the case that offering an anova table is needed,
> in order to evoke consent to the results by virtue of the
> familiar format?
> 
> > As expected, infection changes the mean for both vaccinated and
> > unvaccinated, as we arranged when we generated the data.
> 
> The challenge to generalise is interesting. However, as implied
> above, I'm already impressed by Bill's footwork in R for this
> simple case, and it might be some time before I'm fluent
> enough in that sort of thing to deal with more complicated
> cases, let alone the general one.
> 
> For users like myself, a syntax whose terms are closer to
> ordinary language would be more approachable. Something
> on the lines of
> 
>   summary(aov(y ~ vac*inf), by=inf, within=vac)
> 
> which would present a table similar to Bill's above (by inf
> within the different levels of vac).
> 
> Intriguing. The challenge is tempting ... !
> 
> Best wishes,
> Ted.
> 
> 
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 23-Apr-05                                       Time: 14:33:00
> ------------------------------ XFMail ------------------------------
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
> 
>