[R] anova subhypotheses
Peter Dalgaard BSA
p.dalgaard at biostat.ku.dk
Thu Mar 6 12:30:01 CET 2003
"rob foxall (IFR)" <rob.foxall at BBSRC.AC.UK> writes:
> Hello all,
>
> A really noddy question for you all: I'm trying without success to do some subhypothesis testing. Using simple anova model, with a toy dataset from a book. I have four factors A,B,C,D, and wish to test mu_C = mu_D. This is what I have tried:
>
>
>
> > contrasts(infants$group,how.many=1) <- c(0,0,1,-1)
>
> > contrasts(infants$group)
>
> [,1]
>
> A 0
>
> B 0
>
> C 1
>
> D -1
>
> > fit <- aov(age~group,data=infants)
>
> > summary(fit)
>
> Df Sum Sq Mean Sq F value Pr(>F)
>
> group 1 0.740 0.740 0.2693 0.6092
>
> Residuals 21 57.727 2.749
>
>
>
> Now I know from the book, hand calculations and SPSS that for "group", Sum Sq = Mean Sq = 1.12, not 0.740. Also from the standard anova:
>
>
>
> > contrasts(infants$group) <- "contr.treatment"
>
> > fit <- aov(age~group,data=infants)
>
> > summary(fit)
>
> Df Sum Sq Mean Sq F value Pr(>F)
>
> group 3 14.778 4.926 2.1422 0.1285
>
> Residuals 19 43.690 2.299
>
>
>
> So I'd like to have the (correct) Mean Sq value divided by 2.299 and
> not 2.749, with 19 and not 21 df. Any advice on how to correctly use
> contrasts for subhypothesis testing, including where to find it in
> the manuals, would be much appreciated.
(The pervasive Zelazo dataset, eh? That's not a toy, it's a Science
paper. With methodological errors, even.)
You're not comparing the same models. With a contrast specification
like that you're describing a model where group A and B are equal to
the intercept and C is as much above the intercept as D is below.
What you need to do is to compare two models, one in which C and D are
equal (and A and B are arbitrary) and one in which they are not. E.g.,
I have
> gr
[1] active active active active active active passive passive passive
[10] passive passive passive none none none none none none
[19] ctr.8w ctr.8w ctr.8w ctr.8w ctr.8w
Levels: active passive none ctr.8w
> gr2 <- gr ; levels(gr2) <- c("active","passive","ctr/non","ctr/non")
# (note that you need to be very careful that levels are specified in
# correct order there...)
> model1 <- lm(age~gr)
> model2 <- lm(age~gr2)
> anova(model2,model1)
Analysis of Variance Table
Model 1: age ~ gr2
Model 2: age ~ gr
Res.Df RSS Df Sum of Sq F Pr(>F)
1 20 44.812
2 19 43.690 1 1.123 0.4883 0.4931
Or, a little more sneaky, use the fact that the anova table for a
model is cumulative (Type I, if you want to speak SAS-ish):
> anova(lm(age~gr2+gr))
Analysis of Variance Table
Response: age
Df Sum Sq Mean Sq F value Pr(>F)
gr2 2 13.655 6.827 2.9692 0.0755 .
gr 1 1.123 1.123 0.4883 0.4931
Residuals 19 43.690 2.299
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
More information about the R-help
mailing list