[R] two way ANOVA with unequal sample sizes
Peter Dalgaard BSA
p.dalgaard at biostat.ku.dk
Tue Oct 16 22:44:40 CEST 2001
julien claude <claude at isem.univ-montp2.fr> writes:
> Hi,
>
> I am trying a two way anova with unequal sample sizes but results are not
> as expected:
>
> I take the example from Applied Linear Statistical Models (Neter et al.
> pp889-897, 1996)
>
> growth rate   gender   bone development
> 1.4           1        1
> 2.4           1        1
> 2.2           1        1
> 2.4           1        2
> 2.1           2        1
> 1.7           2        1
> 2.5           2        2
> 1.8           2        2
> 2             2        2
> 0.7           3        1
> 1.1           3        1
> 0.5           3        2
> 0.9           3        2
> 1.3           3        2
>
> expected results are
>
> source of variation   SS      df  MS      F
> gender                0.12    1   0.12    0.74
> bone development      4.1897  2   2.0949  12.89**
> interaction           0.0754  2   0.0377  0.23
> Error                 1.3     8   0.1625
>
> # I use
> aov (growrate ~ gender * bonedevelopment)->m
> summary(m)
>
>                                  Df Sum Sq Mean Sq F value   Pr(>F)
> as.factor(gender)                 2 4.3063  2.1531 13.2501 0.002891 **
> as.factor(bonedevlopment)         1 0.0926  0.0926  0.5697 0.472022
> as.factor(gender:bonedevlopment)  2 0.0754  0.0377  0.2321 0.798034
> Residuals                         8 1.3000  0.1625
Ahem. Tab damage detected... and your command and output don't match
up.
The as.factor(gender:bonedevlopment) is playing with fire... On numeric
vectors, ":" is the sequence operator, not an interaction, so you should
apply factor() to each variable separately. However, it would seem that
you already did manage to convert things to factors, or you would have
gotten something to this effect:
> evalq(as.factor(gender:bone.development),d)
[1] 1
Levels: 1
Warning messages:
1: Numerical expression has 14 elements: only the first used in:
gender:bone.development
2: Numerical expression has 14 elements: only the first used in:
gender:bone.development
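
A minimal sketch of the safer route, assuming the posted numbers are typed
into a data frame (the names d, growrate, gender and bonedev are made up
here, not taken from the original session): convert each variable with
factor() first and let the model formula build the interaction.

```r
# Data as posted; the data frame and column names are illustrative only.
d <- data.frame(
  growrate = c(1.4, 2.4, 2.2, 2.4, 2.1, 1.7, 2.5, 1.8, 2.0,
               0.7, 1.1, 0.5, 0.9, 1.3),
  gender   = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3),
  bonedev  = c(1, 1, 1, 2, 1, 1, 2, 2, 2, 1, 1, 2, 2, 2)
)

# Convert each variable on its own; never wrap gender:bonedev in as.factor().
d$gender  <- factor(d$gender)
d$bonedev <- factor(d$bonedev)

# gender * bonedev expands to gender + bonedev + gender:bonedev,
# and with factors the ":" term is a genuine interaction.
m <- aov(growrate ~ gender * bonedev, data = d)
summary(m)
```

With both variables stored as factors, the formula can be written without
any as.factor() calls, and the term labels in the output stay readable.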
>
> #if I change the order of factors, results are different
> aov (growrate ~ bonedevelopment * gender)->m
> summary(m)
>
>                                  Df Sum Sq Mean Sq F value   Pr(>F)
> as.factor(bonedevlopment)         1 0.0029  0.0029  0.0176 0.897785
> as.factor(gender)                 2 4.3960  2.1980 13.5262 0.002713 **
> as.factor(gender:bonedevlopment)  2 0.0754  0.0377  0.2321 0.798034
> Residuals                         8 1.3000  0.1625
>
> # In both cases, results for the main effects differ from those expected in
> Neter et al.
> However, the interaction and residuals are estimated correctly.
> Can anyone help? Either I am wrong in the formula, or there is some
> other problem. Is there an easy way to conduct the test as it is done in
> Neter et al.?
> The same problem occurs with anova(lm(....))?
I don't think we're the ones with the problem... There are various
boneheaded ways in which people try to assign some kind of SumSq to
main effects in the presence of interaction, and they are all
wrong - although maybe not very wrong if the imbalance is slight.
The tests *should* depend on the order of the terms, as is most clearly
seen if the predictors are highly collinear.
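
To see the order dependence on the posted data, one can fit both orderings
and compare the sequential sums of squares. This is only a sketch: the data
frame d and its column names are assumptions, and showing one line as an
explicit comparison of two nested lm() fits is just one way to read the
table.

```r
# Data as posted; the data frame and column names are illustrative only.
d <- data.frame(
  growrate = c(1.4, 2.4, 2.2, 2.4, 2.1, 1.7, 2.5, 1.8, 2.0,
               0.7, 1.1, 0.5, 0.9, 1.3),
  gender   = factor(c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3)),
  bonedev  = factor(c(1, 1, 1, 2, 1, 1, 2, 2, 2, 1, 1, 2, 2, 2))
)

# Same model, two term orders.
fit_gb <- aov(growrate ~ gender * bonedev, data = d)
fit_bg <- aov(growrate ~ bonedev * gender, data = d)

# Sequential sums of squares, in the order the terms were entered.
ss_gb <- summary(fit_gb)[[1]][["Sum Sq"]]  # gender, bonedev, interaction, residuals
ss_bg <- summary(fit_bg)[[1]][["Sum Sq"]]  # bonedev, gender, interaction, residuals
print(round(ss_gb, 4))
print(round(ss_bg, 4))

# Each line tests a term given the terms listed before it. The bonedev
# line of fit_gb is the drop in residual SS when bonedev is added to a
# model that already contains gender.
cmp <- anova(lm(growrate ~ gender, data = d),
             lm(growrate ~ gender + bonedev, data = d))
print(cmp)
```

The main-effect lines differ between the two fits because the cell counts
are unequal, while the interaction and residual lines, which are entered
last in both, are the same.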
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907