[R] Is using glht with "Tukey" for lme post-hoc comparisons an appropriate substitute to TukeyHSD?

Wed Jan 18 16:26:01 CET 2012

Hello Richard,
Thank you for all of your feedback and for introducing me to the  
interaction_average argument. I realize that it is probably quite  
simple, but after some research, I am still having difficulty  
understanding how the interaction_average argument changes the  
calculation of multiple comparisons.  If you have a 2-way (A-2 levels,  
B-2 levels) ANOVA with a non-significant interaction term, then does  
interaction_average=FALSE mean that you are comparing A1 pooled over  
both levels of B to A2 pooled over both levels of B?  When  
interaction_average=TRUE, does this mean that you are doing the same  
pooling as above, but that you are also subtracting the contribution  
of the interaction term from the levels of A that you are interested  
in comparing?

I am trying to decide if I should be using interaction_average=TRUE  
for post-hoc testing.  Is it always appropriate to use  
interaction_average=TRUE when non-significant interaction terms are  
present in the model, or does it depend on your specific research  
questions?  Alternatively, should you just remove non-significant  
interaction terms from the model?

Thank you,
Anne

Quoting "Richard M. Heiberger" <rmh at temple.edu>:

> Anne,
>
> Thank you for writing back, and for including your data.
>
> I have two things here.  First, I ran an a analysis of your data and have
> my observations
> on interpretation.  Second, I answer your general question about glht and
> TukeyHSD when there are interactions.
> I illustrate how to get the same answer from glht with an example from your
> data.
>
>
> ## install.packages("HH")  ## if you do not already have it
> library(HH)
>
> Active <- read.table(textConnection("
> Treatment Habitat    pActive
>       1G       E 0.18541667
>       1G       E 0.02500000
>       1G       E 0.04208333
>       1G       E 0.14847222
>       1G       E 0.08055556
>       1G       E 0.16777778
>       1G       S 0.05111111
>       1G       S 0.19083333
>       1G       S 0.12333333
>       1G       S 0.35722222
>       1G       S 0.43750000
>       1G       S 0.02638889
>       1R       E 0.38736111
>       1R       E 0.51180556
>       1R       E 0.14916667
>       1R       E 0.61041667
>       1R       E 0.36013889
>       1R       E 0.11347222
>       1R       S 0.10805556
>       1R       S 0.18722222
>       1R       S 0.27625000
>       1R       S 0.25236111
>       1R       S 0.18208333
>       1R       S 0.16152778
>       2G       E 0.25916667
>       2G       E 0.37194444
>       2G       E 0.02263889
>       2G       E 0.18402778
>       2G       E 0.45750000
>       2G       E 0.02250000
>       2G       S 0.02958333
>       2G       S 0.10069444
>       2G       S 0.12875000
>       2G       S 0.11361111
>       2G       S 0.13680556
>       2G       S 0.07875000
>       2R       E 0.57513889
>       2R       E 0.12888889
>       2R       E 0.32000000
>       2R       E 0.55736111
>       2R       E 0.78888889
>       2R       E 0.65055556
>       2R       S 0.35527778
>       2R       S 0.48361111
>       2R       S 0.21361111
>       2R       S 0.35277778
>       2R       S 0.52611111
>       2R       S 0.29416667
>      R+G       E 0.37027778
>      R+G       E 0.20263889
>      R+G       E 0.07194444
>      R+G       E 0.49041667
>      R+G       E 0.21847222
>      R+G       E 0.13555556
>      R+G       S 0.20861111
>      R+G       S 0.23986111
>      R+G       S 0.02180556
>      R+G       S 0.23250000
>      R+G       S 0.28916667
>      R+G       S 0.50208333
> "), header=TRUE)
> ## close.connection()
> closeAllConnections()
>
> logitAct <- logit(Active$pActive)
> model3 <- aov(logitAct ~ Treatment*Habitat, data=Active)
> summary(model3)
> summary(glht(model3, focus="Treatment", linfct=mcp(Treatment="Tukey")))
> TukeyHSD(model3)
> with(Active, table(Treatment, Habitat))
>
> par(omd=c(0, .95, 0, 1))
> model3.mmc <- glht.mmc(model3, linfct=mcp(Treatment="Tukey"))
> plot(model3.mmc, ry=c(-2.75, 0.25), x.offset=.8)
> plot.matchMMC(model3.mmc$mca)
>
> ##             1G  1R  2G  2R R+G
> R.linear <- c(-1,  0,  -1,  2,  0 )
> names(R.linear) <- c('1G', '1R', '2G', '2R', 'R+G')
> lmat.R <- orthog.complete(as.matrix(R.linear))
> dimnames(lmat.R)[[2]][1] <- "R.linear"
>
> model3.mmc <- glht.mmc(model3, linfct=mcp(Treatment="Tukey"),
> focus.lmat=lmat.R)
> plot(model3.mmc, ry=c(-2.75, 0.25), x.offset=.8)
> plot.matchMMC(model3.mmc$lmat, col.signif="blue")
>
> Active$Treatment <- factor(Active$Treatment, levels=c('1G', '2G', 'R+G',
> '1R', '2R'))
> contrasts(Active$Treatment) <- lmat.R[c('1G', '2G', 'R+G', '1R', '2R'),]
> model3g <- aov(logitAct ~ Treatment*Habitat, data=Active)
> summary(model3g, split=list(Treatment=list(R.linear=1, rest=2:4)),
> expand.split=FALSE)
>
> position(Active$Habitat) <- c(2, 4)
> interaction2wt(logitAct ~ Treatment*Habitat, data=Active)
>
> ## from the above tables and graphs, it looks like the only thing
> ## going on in the data you posted is the linear effect of R.
>
>
> ## On your questions on glht and TukeyHSD.
> ## Without interactions or covariates they give the same answer.
> ## With interactions or covariates the default for glht is to give
> different answers.
> ## The output from glht includes a warning to that effect.
> ##
> ## From ?glht
> ## Warning message:
> ## In mcp2matrix(model, linfct = linfct) :
> ##   covariate interactions found -- default contrast might be inappropriate
> ##
> ## From ?glht, read about the interaction_average argument.
> ## With the glht argument interaction_average=TRUE, they give the same
> answer
> ## With your example,
> summary(glht(model3, linfct=mcp(Treatment="Tukey",
> interaction_average=TRUE)))
> TukeyHSD(model3, which="Treatment")
>
> Rich
> On Tue, Jan 3, 2012 at 7:26 PM, Anne Aubut <an438516 at dal.ca> wrote:
>
>> Hello Richard,
>> Thank you so much for getting back to me.  In the ?glht example, the
>> confidence intervals are the same and the p-values are very similar.  I ran
>> a 2-way ANOVA and compared the results for the glht code with "Tukey" and
>> TukeyHSD for "Treatment", which was a significant main effect (output is
>> below).  I found that the p-values for glht and TukeyHSD differed quite a
>> bit.  If glht with "Tukey" is just another method to run Tukey HSD, I don't
>> understand why the two methods yeilded different results.  If they are not
>> equivalent, how is glht calculating the p-values?  I also ran my 2-way
>> ANOVA without the Treatment*Habitat interaction and I found that the glht
>> and TukeyHSD methods did provide the same p-values (I did not include this
>> output). Does this mean that glht is only equivalent to TukeyHSD when
>> non-significant interactions are removed? Should I be removing all of my
>> non-significant interaction terms prior to running post-hoc testing with
>> glht?
>>
>> Treatment Habitat    pActive
>>       1G       E 0.18541667
>>       1G       E 0.02500000
>>       1G       E 0.04208333
>>       1G       E 0.14847222
>>       1G       E 0.08055556
>>       1G       E 0.16777778
>>       1G       S 0.05111111
>>       1G       S 0.19083333
>>       1G       S 0.12333333
>>       1G       S 0.35722222
>>       1G       S 0.43750000
>>       1G       S 0.02638889
>>       1R       E 0.38736111
>>       1R       E 0.51180556
>>       1R       E 0.14916667
>>       1R       E 0.61041667
>>       1R       E 0.36013889
>>       1R       E 0.11347222
>>       1R       S 0.10805556
>>       1R       S 0.18722222
>>       1R       S 0.27625000
>>       1R       S 0.25236111
>>       1R       S 0.18208333
>>       1R       S 0.16152778
>>       2G       E 0.25916667
>>       2G       E 0.37194444
>>       2G       E 0.02263889
>>       2G       E 0.18402778
>>       2G       E 0.45750000
>>       2G       E 0.02250000
>>       2G       S 0.02958333
>>       2G       S 0.10069444
>>       2G       S 0.12875000
>>       2G       S 0.11361111
>>       2G       S 0.13680556
>>       2G       S 0.07875000
>>       2R       E 0.57513889
>>       2R       E 0.12888889
>>       2R       E 0.32000000
>>       2R       E 0.55736111
>>       2R       E 0.78888889
>>       2R       E 0.65055556
>>       2R       S 0.35527778
>>       2R       S 0.48361111
>>       2R       S 0.21361111
>>       2R       S 0.35277778
>>       2R       S 0.52611111
>>       2R       S 0.29416667
>>      R+G       E 0.37027778
>>      R+G       E 0.20263889
>>      R+G       E 0.07194444
>>      R+G       E 0.49041667
>>      R+G       E 0.21847222
>>      R+G       E 0.13555556
>>      R+G       S 0.20861111
>>      R+G       S 0.23986111
>>      R+G       S 0.02180556
>>      R+G       S 0.23250000
>>      R+G       S 0.28916667
>>      R+G       S 0.50208333
>>
>> logitpAct<-logit(Active$**pActive)
>> model3<-aov(logitAct~**Treatment*Habitat,data=Active)
>> summary(glht(model3, linfct=mcp(Treatment="Tukey"))**)
>>
>>  Simultaneous Tests for General Linear Hypotheses
>>
>> Multiple Comparisons of Means: Tukey Contrasts
>>
>>
>> Fit: aov(formula = logitAct ~ Treatment * Habitat, data = Active)
>>
>> Linear Hypotheses:
>>              Estimate Std. Error t value Pr(>|t|)
>> 1R - 1G == 0    1.6196     0.5936   2.728  0.06369 .
>> 2G - 1G == 0    0.5468     0.5936   0.921  0.88736
>> 2R - 1G == 0    2.3100     0.5936   3.892  0.00264 **
>> R+G - 1G == 0   1.0713     0.5936   1.805  0.38235
>> 2G - 1R == 0   -1.0728     0.5936  -1.807  0.38095
>> 2R - 1R == 0    0.6904     0.5936   1.163  0.77204
>> R+G - 1R == 0  -0.5483     0.5936  -0.924  0.88639
>> 2R - 2G == 0    1.7632     0.5936   2.970  0.03516 *
>> R+G - 2G == 0   0.5245     0.5936   0.884  0.90164
>> R+G - 2R == 0  -1.2387     0.5936  -2.087 0.24185 <2.087%20%200.24185>
>> ---
>> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>> (Adjusted p values reported -- single-step method)
>>
>> Warning message:
>> In mcp2matrix(model, linfct = linfct) :
>>  covariate interactions found -- default contrast might be inappropriate
>>
>>
>> TukeyHSD(model3)
>>
>>  Tukey multiple comparisons of means
>>    95% family-wise confidence level
>>
>> Fit: aov(formula = logitAct ~ Treatment * Habitat, data = Active)
>>
>> $Treatment
>>               diff        lwr       upr     p adj
>> 1R-1G   0.976208536 -0.2115769 2.1639939 0.1538496
>> 2G-1G   0.008919932 -1.1788655 1.1967053 1.0000000
>> 2R-1G   1.774309645  0.5865243 2.9620950 0.0009174
>> R+G-1G  0.735518351 -0.4522670 1.9233037 0.4123535
>> 2G-1R  -0.967288603 -2.1550740 0.2204968 0.1605251
>> 2R-1R   0.798101110 -0.3896843 1.9858865 0.3299881
>> R+G-1R -0.240690185 -1.4284756 0.9470952 0.9783561
>> 2R-2G   1.765389713  0.5776043 2.9531751 0.0009817
>> R+G-2G  0.726598419 -0.4611870 1.9143838 0.4247935
>> R+G-2R -1.038791294 -2.2265767 0.1489941 0.1128708
>>
>>
>>
>> Quoting "Richard M. Heiberger" <rmh at temple.edu>:
>>
>>  glht is probably what you should be using.  Both TukeyHSD and glht give
>>> essesntially identical confidence intervals for
>>> the example in ?glht.  What aren't you satisfied with?
>>>
>>> amod <- aov(breaks ~ tension, data = warpbreaks)
>>> confint(glht(amod, linfct = mcp(tension = "Tukey")))
>>> TukeyHSD(amod)
>>> On Mon, Jan 2, 2012 at 6:19 PM, Anne Aubut <an438516 at dal.ca> wrote:
>>>
>>>  Hello,
>>>>
>>>> I am trying to determine the most appropriate way to run post-hoc
>>>> comparisons on my lme model.  I had originally planned to use Tukey HSD
>>>> method as I am interested in all possible comparisons between my
>>>> treatment
>>>> levels.  TukeyHSD, however, does not work with lme.  The only other code
>>>> that I was able to find, and which also seems to be widely used, is glht
>>>> specified with Tukey:
>>>>
>>>> summary(glht(model, linfct=mcp(Treatment="Tukey"))****)
>>>>
>>>>
>>>> Out of curiosity, I ran TukeyHSD and the glht code for a simple ANOVA and
>>>> found that they had quite different p-values.  If the glht code is not
>>>> running TukeyHSD, what does the "Tukey" in the code specify?  Is using
>>>> glht
>>>> code appropriate if I am interested in a substitute for TukeyHSD?  Are
>>>> there any other options for multiple comparisons for lme models?  I am
>>>> really interested in knowing if the Tukey contrasts generated from the
>>>> glht
>>>> code is providing me with appropriate p-values for my post-hoc
>>>> comparisons.
>>>>
>>>> I feel like I have reached a dead end and am not sure where else to look
>>>> for information on this issue. I would be grateful for any feedback on
>>>> this
>>>> matter.
>>>>
>>>>
>>>> Anne Cheverie
>>>> M.Sc. Candidate
>>>> Dalhousie University
>>>>
>>>> ______________________________****________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/****listinfo/r-help<https://stat.ethz.ch/mailman/**listinfo/r-help>
>>>> <https://stat.**ethz.ch/mailman/listinfo/r-**help<https://stat.ethz.ch/mailman/listinfo/r-help>
>>>> >
>>>> PLEASE do read the posting guide  
>>>> http://www.R-project.org/**<http://www.r-project.org/**>
>>>> posting-guide.html  
>>>> <http://www.r-project.org/**posting-guide.html<http://www.r-project.org/posting-guide.html>>
>>>>
>>>>
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>
>>
>>
>>
>