[R-sig-ME] Comparing results from glmer and glht

Colin Wahl biowahl at gmail.com
Mon Jan 16 21:54:48 CET 2012


Thierry,
Thank you for the suggestion. I think I understand, though creating
user defined contrasts will be difficult.

I create a combined variable:
wshrip <-c bind(wsh, rip)

Resulting in a new model
modelEPT<-glmer(EPT ~ wshrip + (1|stream) + (1|stream:rip) + (1|obs),
data=ept, family=binomial(link="logit"))

The user defined contrasts I tried previously were:
wsh <- rbind("C vs. D" = c(1,0,0,0,0,0,0,0),
            "C vs. F" = c(0,1,0,0,0,0,0,0),
            "C vs. G" = c(0,0,1,0,0,0,0,0),
            "D vs. F" = c(-1,1,0,0,0,0,0,0),
            "D vs. G" = c(-1,0,1,0,0,0,0,1),
            "F vs. G" = c(0,-1,1,0,0,0,0,0))

This did not give me reasonable results, prompting me to use built-in
contrasts:

summary(glht(modelEPT, linfct=mcp(wsh="Tukey")))

For riparian contrasts I used:
rip <- rbind("C: Forested vs. Non-Forested" = c(0,0,0,0,1,0,0,0),
            "D: Forested vs. Non-Forested" = c(0,0,0,0,1,1,0,0),
            "F: Forested vs. Non-Forested" = c(0,0,0,0,1,0,1,0),
            "G: Forested vs. Non-Forested" = c(0,0,0,0,1,0,0,1))

I dont understand how these contrasts are defined, but based them on
contrasts on an almost identical design found here:
http://thebiobucket.blogspot.com/2011/06/glmm-with-custom-multiple-comparisons.html#more

Could you please explain how to use user-defined for the model with a
wshrip combined variable? I cant find a clear example of how to do
this. The parameter length is different from the original model now
that there is a combined variable, correct?

Thank you very much, I'd be lost in the dark without this mailing list.

Colin






On Mon, Jan 16, 2012 at 12:58 AM, ONKELINX, Thierry
<Thierry.ONKELINX at inbo.be> wrote:
> Dear Colin,
>
> I don't think your Tukey tests are very informative. Because you are testing main effects how have an interaction with another variable. So you are not testing the overall effect of a variable but the effect when all other variable are 0 (continuous) or at the reference level (factor).
>
> A better way of doing this could bet o create a new variable which is the interaction between wsh and rip and use that in your model instead of wsh * rip. Then you can use glht() with user defined contrasts so that the contrasts test what you want to test.
>
> Best regards,
>
> Thierry
>
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
> + 32 2 525 02 51
> + 32 54 43 61 85
> Thierry.Onkelinx at inbo.be
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
> ~ Sir Ronald Aylmer Fisher
>
> The plural of anecdote is not data.
> ~ Roger Brinner
>
> The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> -----Oorspronkelijk bericht-----
> Van: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] Namens Colin Wahl
> Verzonden: zondag 15 januari 2012 1:23
> Aan: r-sig-mixed-models at r-project.org
> Onderwerp: [R-sig-ME] Comparing results from glmer and glht
>
> I will try to make this concise.
>
> Background: I am testing the effects of land use and forested riparian buffers on stream invertebrates and in-stream variables. There are 4 watershed types (defined by 4 types of land use) and two riparian types (forested and non). Percent EPT (relative abundance) was my main response variable. I also measured a variety of in-stream variables like temperature, nutrients, and toxicity. There are 72 observations for invertebrates, and 24 for in-stream variables.
>
> I am curious of how acceptable p values are from pairwise comparisons using glht() from the multcomp package
>
> I used glmer with a binomial error structure and an observation-level random effect (to account for overdispersion), to model invertebrates:
>
> modelEPT<-glmer(EPT ~ wsh*rip + (1|stream) + (1|stream:rip) + (1|obs), data=ept, family=binomial(link="logit"))
>
>   AIC   BIC logLik deviance
>  284.4 309.5 -131.2    262.4
> Random effects:
>  Groups     Name        Variance Std.Dev.
>  obs        (Intercept) 0.30186  0.54942
>  stream:rip (Intercept) 0.40229  0.63427
>  stream     (Intercept) 0.12788  0.35760
> Number of obs: 72, groups: obs, 72; stream:rip, 24; stream, 12
>
> Fixed effects:
>                 Estimate Std. Error z value Pr(>|z|)
> (Intercept)   -4.2906     0.4935   -8.694  < 2e-16 ***
> wshd           -2.0557     0.7601  -2.705 0.00684 **
> wshf            3.3575     0.6339   5.297  1.18e-07 ***
> wshg           3.3923     0.7486    4.531  5.86e-06 ***
> ripN             0.1425     0.6323   0.225  0.82165
> wshd:ripN     0.3708     0.9682   0.383  0.70170
> wshf:ripN    -0.8665     0.8087   -1.071  0.28400
> wshg:ripN    -3.1530     0.9601  -3.284  0.00102 **
> ---
>
> Correlation of Fixed Effects:
>                 (Intr)  wshd   wshf   wshg   ripN   wshd:N wshf:N
> wshd        -0.649
> wshf        -0.779  0.505
> wshg        -0.659  0.428  0.513
> ripN         -0.644  0.418  0.501  0.424
> wshd:ripN  0.421 -0.672 -0.327 -0.277 -0.653 wshf:ripN  0.503 -0.327 -0.638 -0.332 -0.782  0.511 wshg:ripN  0.424 -0.275 -0.330 -0.632 -0.659  0.430  0.515
>
>
> I then used this model to do Tukey's HSD contrasts between watershed types:
>
> summary(glht(modelEPT, linfct=mcp(wsh="Tukey"))) Linear Hypotheses:
>
>                Estimate Std. Error z value Pr(>|z|)
> d - c == 0 -2.05573    0.76010  -2.705   0.0341 *
> f - c == 0  3.35753    0.63386   5.297   <0.001 ***
> g - c == 0  3.39231    0.74862   4.531   <0.001 ***
> f - d == 0  5.41326    0.70176   7.714   <0.001 ***
> g - d == 0  5.44804    0.80692   6.752   <0.001 ***
> g - f == 0  0.03479    0.68931   0.050   1.0000
>
> and riparian types:
>
>                                                          Estimate Std. Error z value Pr(>|z|)
> C: Forested vs. Non-Forested == 0         0.1425     0.6323   0.225  0.99999
> D: Forested vs. Non-Forested == 0         0.5134     0.7332   0.700  0.98659
> F: Forested vs. Non-Forested == 0        -0.7239     0.5042  -1.436  0.69625
> G: Forested vs. Non-Forested == 0        -3.0105     0.7225  -4.167  < 0.001 ***
>
> Are these p values accurate? Or is that a personal judgement I have to make based on the clarity of the patterns they reflect?
>
> I've shown these results in my figures and explained them in my results. I've basically explained that though these p values reasonably reflect patterns in my data, effects sizes, and variances, that they are inexact and potentially anti-conservative due to the issues with degrees of freedom in mixed models.
>
> >From what I understand from my research in the last year is that
> Douglas Bates and others advocate something of a paradigm shift away from the petagogically reinforced reliance on cryptic p values toward more in depth discussions of effects sizes and variances. The use of MCMC sampling and HPD intervals are suggested, but these are not available for generalized models.
>
> I am interested in publishing these results as an ecologist, not a statistician (pardon the somewhat artificial distinction), and, I am very interested in what kind of a discussion the statisticians and ecologists of the r-sig-mixed-models mailing list would like to see as potential reviewers.
>
> Thank you,
>
> Colin Wahl
>
> M.S. candidate,
> Dept. of Biology
> Western Washington University
> Bellingham, WA
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models




More information about the R-sig-mixed-models mailing list