[R] Anova - interpretation of the interaction term
Bill.Venables@csiro.au
Bill.Venables at csiro.au
Sat Apr 23 04:57:38 CEST 2005
: -----Original Message-----
: From: r-help-bounces at stat.math.ethz.ch
: [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of
: michael watson (IAH-C)
: Sent: Friday, 22 April 2005 7:47 PM
: To: r-help at stat.math.ethz.ch
: Subject: [R] Anova - interpretation of the interaction term
:
:
: Hi
:
: So carrying on my use of analysis of variance to check for the effects
: of two factors. It's made simpler by the fact that both my factors have
: only two levels each, creating four unique groups.
:
: I have a highly significant interaction term. In the context of the
: experiment, this makes sense. I can visualise the data graphically, and
: sure enough I can see that both factors have different effects on the
: data DEPENDING on what the value of the other factor is.
:
: I explain this all to my colleague - and she asks "but which ones are
: different?" This is best illustrated with an example. We have either
: infected | uninfected, and vaccinated | unvaccinated (the two factors).
: We're measuring expression of a gene. Graphically, in the infected
: group, vaccination makes expression go up. In the uninfected group,
: vaccination makes expression go down. In both the vaccinated and
: unvaccinated groups, infection makes expression go down, but it goes
: down further in unvaccinated than it does in vaccinated.
:
: So from a statistical point of view, I can see exactly why the
: interaction term is significant, but what my colleague wants to know is:
: WITHIN the vaccinated group, does infection decrease expression
: significantly? And within the unvaccinated group, does infection
: decrease expression significantly? And so on. Can I get this
: information from the output of the ANOVA, or do I carry out a separate
: test on e.g. just the vaccinated group? (seems a cop out to me)
No, you can't get this kind of specific information out of the anova
table and yes, anova tables *are* a bit of a cop out. (I sometimes
think they should only be allowed between consenting adults in private.)
What you are asking for is a non-standard, but perfectly reasonable
partition of the degrees of freedom between the classes of a single
factor with four levels got by pairing up the levels of vaccination and
infection. Of course you can get this information, but you have to
do a bit of work for it.
Before I give the example which I don't expect too many people to read
entirely, let me issue a little challenge, namely to write tools to
automate a generalized version of the procedure below.
Here is the example (drawing on the explanation given in a certain
book, to wit chapter 6):
> dat <- expand.grid(vac = c("N", "Y"), inf = c("-", "+"))
> dat <- rbind(dat, dat) # to get a bit of replication
Now we make a 4-level factor from vaccination and infection and
generate a bit of data with an infection effect built into it:
> dat <- transform(dat, vac_inf = vac:inf,
                   y = as.numeric(inf) + rnorm(8))
> dat
   vac inf vac_inf          y
1    N   -     N:-  0.2285096
2    Y   -     Y:-  1.3504610
3    N   +     N:+  2.5581254
4    Y   +     Y:+  2.9208313
11   N   -     N:- -0.8403039
21   Y   -     Y:- -0.2440574
31   N   +     N:+  2.4844055
41   Y   +     Y:+  2.0772671
Now give the joint factor contrasts reflecting the partition
we want to effect:
> levels(dat$vac_inf)
[1] "N:-" "N:+" "Y:-" "Y:+"
> m <- matrix(scan(), ncol = 4, byrow = TRUE)
1: -1 1 0 0
5: 0 0 -1 1
9: 1 1 -1 -1
13:
Read 12 items
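Since scan() reads from the keyboard, the step above only works in an
interactive session. A minimal non-interactive sketch of the same matrix
follows; the row and column labels are my own additions for readability
(the third label, "N vs Y", in particular is an assumed name, not one
used above).

```r
# Each row is one contrast over the four cell means, with the columns
# in the order of levels(dat$vac_inf): "N:-", "N:+", "Y:-", "Y:+".
m <- rbind("- vs +|N" = c(-1,  1,  0,  0),   # infection effect, unvaccinated
           "- vs +|Y" = c( 0,  0, -1,  1),   # infection effect, vaccinated
           "N vs Y"   = c( 1,  1, -1, -1))   # assumed label: N total minus Y total
colnames(m) <- c("N:-", "N:+", "Y:-", "Y:+")
m
```

Every row sums to zero, which is what makes each one a contrast.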
> library(MASS)       ## for ginv() and fractions()
> fractions(ginv(m))  ## just to see what it looks like
     [,1] [,2] [,3]
[1,] -1/2    0  1/4
[2,]  1/2    0  1/4
[3,]    0 -1/2 -1/4
[4,]    0  1/2 -1/4
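Why the generalized inverse? A small check, offered as a sketch rather
than a proof: if C = ginv(m) is used as the contrast matrix, then
m %*% C is the identity, so the k-th coefficient of the fitted model
estimates the k-th row of m applied to the four cell means. The name C
is mine, chosen for illustration.

```r
library(MASS)  # ginv()

m <- matrix(c(-1, 1,  0,  0,
               0, 0, -1,  1,
               1, 1, -1, -1), ncol = 4, byrow = TRUE)
C <- ginv(m)

# Should print the 3 x 3 identity (up to rounding error)
round(m %*% C, 10)

# For this particular m, each column of C is the matching row of m
# divided by that row's sum of squares (2, 2 and 4)
all.equal(C, t(m) %*% diag(1 / rowSums(m^2)))
```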
Note that we could have simply used t(m) here, since the rows of m
are mutually orthogonal, but this is not always possible. Associate
these contrasts, fit and analyse:
> contrasts(dat$vac_inf) <- ginv(m)
> gm <- aov(y ~ vac_inf, dat)
> summary(gm)
            Df  Sum Sq Mean Sq F value  Pr(>F)
vac_inf      3 12.1294  4.0431   7.348 0.04190
Residuals    4  2.2009  0.5502
This doesn't tell us too much other than there are differences,
probably. Now to specify the partition:
> summary(gm,
          split = list(vac_inf = list("- vs +|N" = 1,
                                      "- vs +|Y" = 2)))
                    Df  Sum Sq Mean Sq F value  Pr(>F)
vac_inf              3 12.1294  4.0431  7.3480 0.04190
  vac_inf: - vs +|N  1  7.9928  7.9928 14.5262 0.01892
  vac_inf: - vs +|Y  1  3.7863  3.7863  6.8813 0.05860
Residuals            4  2.2009  0.5502
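As expected, infection changes the mean for both unvaccinated
(p = 0.019) and vaccinated (p = 0.059), as we arranged when we
generated the data, although with only eight observations the evidence
within the vaccinated group is marginal.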
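The split table gives sums of squares but not the size or direction of
each effect. A self-contained sketch of one way to read those off is to
look at the coefficient table of the same fit. The seed is arbitrary (so
the numbers will differ from those above), and the object names fit, est
and cell, along with the "N vs Y" label, are my own choices for
illustration.

```r
library(MASS)  # ginv()

set.seed(1)    # arbitrary seed; results will not match the run above
dat <- expand.grid(vac = c("N", "Y"), inf = c("-", "+"),
                   stringsAsFactors = TRUE)
dat <- rbind(dat, dat)                       # two replicates per cell
dat$vac_inf <- dat$vac:dat$inf               # levels N:-, N:+, Y:-, Y:+
dat$y <- as.numeric(dat$inf) + rnorm(nrow(dat))

m <- rbind("- vs +|N" = c(-1, 1,  0,  0),
           "- vs +|Y" = c( 0, 0, -1,  1),
           "N vs Y"   = c( 1, 1, -1, -1))    # assumed label
C <- ginv(m)
colnames(C) <- rownames(m)                   # label the coefficients
contrasts(dat$vac_inf) <- C

fit <- lm(y ~ vac_inf, dat)
est <- coef(summary(fit))   # columns: Estimate, Std. Error, t value, Pr(>|t|)
est

# Sanity check: the first contrast estimate is the difference between
# the two unvaccinated cell means (infected minus uninfected)
cell <- tapply(dat$y, dat$vac_inf, mean)
all.equal(as.numeric(est[2, "Estimate"]),
          as.numeric(cell["N:+"] - cell["N:-"]))
```

In a balanced layout like this one, with orthogonal contrast columns, the
t tests in this table should agree with the single degree of freedom
F tests from the split (F is the square of t).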
:
: Many thanks, and sorry, but it's Friday.
:
: Mick
: