[R] ANOVA contrast matrix vs. TukeyHSD?

John Fox jfox at mcmaster.ca
Thu Sep 18 01:30:31 CEST 2008


Dear Sam,

The "basis" matrix for your contrasts is not orthogonal, and so the tests
are almost surely not what you intended. Moreover, the "basis" matrix is not
even of rank 5, as you can see from the message produced by lm(),
"Coefficients: (1 not defined because of singularities)," and the missing
coefficient.

Regards,
 John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On
> Behalf Of Sam Yeaman
> Sent: September-17-08 4:58 PM
> To: r-help at r-project.org
> Subject: [R] ANOVA contrast matrix vs. TukeyHSD?
> 
> Dear Help List,
> 
> Thanks in advance for reading...I hope my questions are not too ignorant.
> 
> I have an experiment looking at evolution of wing size [centroid] in
> fruitflies and the effect of 6 different experimental treatments
> [treatment]. I have five replicate populations [replic] in each
> treatment and have reared the flies in two different temperatures [cond]
> to assay the wing size, making measurements on males and females
> [gender]. My design can be summarized as follows:
> 
> This is my model (I think it's right, ignoring interaction terms for
> simplicity):
> 
>  > lm1 ~ aov (centroid ~ gender + cond + treatment/replic, data = parents)
> 
> The treatments are:
> 
>  > levels (parents$treatment)
> [1] "c"  "h"  "mc" "mh" "s"  "t"
> 
> I only care about a few of the pairwise comparisons between the levels
> of "treatment", as only certain contrasts are scientifically interesting:
> 
> c vs. h
> mh vs. mc
> (c + h) vs. s   [I would like to compare the mean of c and h (my
> controls) to s and t)
> (c + h) vs. t
> s vs. t
> h vs. mh
> c vs. mc
> 
> These are two more than I can specify using "contrasts()" and they are
> not orthogonal. I can use the TukeyHSD and only look at the comparisons
> I care about, but I think this should give me much less power than
> specifying a few a priori contrasts (?). Also, I don't know how to
> combine my controls (c + h) into a single comparison using TukeyHSD.
> 
> 
> My first problem is that when I specify the matrix shown below (the
> first 5 comparisons from above), I get a much higher p-value on some of
> the planned contrasts than I do on the TukeyHSD:
> 
> contrasts (parents$treatment) <- cbind
> (c(-1,1,0,0,0,0),c(-1,-1,0,0,2,0),c(-1,-1,0,0,0,2),c(0,0,-
> 1,1,0,0),c(0,0,0,0,1,-1))
> 
> 
>  > contrasts(parents$treatment)
>   [,1] [,2] [,3] [,4] [,5]
> c    -1   -1   -1    0    0
> h     1   -1   -1    0    0
> mc    0    0    0   -1    0
> mh    0    0    0    1    0
> s     0    2    0    0    1
> t     0    0    2    0   -1
> 
> 
> #### THE OUTPUT (truncated) ####
> 
> Call:
> lm(formula = centroid ~ gender + cond + treatment/replic, data = parents)
> 
> Residuals:
>      Min        1Q    Median        3Q       Max
> -81.58846  -4.53540   0.00803   4.76568  39.84177
> 
> Coefficients: (1 not defined because of singularities)
>                     Estimate Std. Error  t value Pr(>|t|)
> (Intercept)         328.73096    0.26303 1249.770  < 2e-16 ***
> genders             -37.39069    0.19661 -190.179  < 2e-16 ***
> condu               -37.47740    0.19693 -190.308  < 2e-16 ***
> treatment1            0.51026    0.40084    1.273 0.203079
> treatment2           -0.17333    0.23175   -0.748 0.454541
> treatment3            0.07761    0.22535    0.344 0.730566
> treatment4           -1.96020    0.38524   -5.088 3.73e-07 ***
> treatment5                 NA         NA       NA       NA
> 
> ###### The TukeyHSD output (truncated) #####
> 
>  Tukey multiple comparisons of means
>    95% family-wise confidence level
> 
> Fit: aov(formula = centroid ~ gender + cond + treatment/replic, data =
> parents)
> *
> *
> *
> $treatment
>             diff          lwr        upr     p adj
> h-c   -1.38085941 -2.382732615 -0.3789862 0.0012123
> mc-c  -2.22026936 -3.198423972 -1.2421147 0.0000000
> mh-c  -2.27157901 -3.268013478 -1.2751445 0.0000000
> s-c   -1.19540471 -2.170272952 -0.2205365 0.0063382
> t-c   -0.39899955 -1.374954044  0.5769549 0.8533107
> mc-h  -0.83940995 -1.813993060  0.1351732 0.1378366
> mh-h  -0.89071960 -1.883648319  0.1022091 0.1081954
> s-h    0.18545470 -0.785829956  1.1567394 0.9943136
> t-h    0.98185986  0.009484949  1.9542348 0.0462121
> mh-mc -0.05130965 -1.020300865  0.9176816 0.9999892
> s-mc   1.02486465  0.078064558  1.9716647 0.0249356
> t-mc   1.82126980  0.873351301  2.7691883 0.0000007
> s-mh   1.07617430  0.110500644  2.0418480 0.0187007
> t-mh   1.87257946  0.905809220  2.8393497 0.0000005
> t-s    0.79640515 -0.148121782  1.7409321 0.1550137
> 
> 
> When I specify the c vs. h comparison, I am getting a p-value of
> 0.203079, but the TukeyHSD gives the same contrast a p-value of
> 0.0012123. Also, the fifth comparison gives "NA"; I assume this is due
> to it being non-orthogonal? I feel like I am either misunderstanding the
> point of contrasts() completely or I have done something wrong, so I
> would really appreciate any help.
> 
> My other question is related...just wondering why I need to limit myself
> to only orthogonal comparisons using contrasts()? This eliminates
> comparisons of scientific interest, for example if c vs. h, mc vs. c,
> and mh vs. h are all different, I have no way of knowing if mc vs. mh is
> significantly different by examining the other contrasts.
> 
> Sorry if these questions are ignorant...I have spent a long time trying
> to figure it out and haven't found the answer in either the available
> books or the help list.
> 
> Many thanks,
> Sam Yeaman
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list