[R-sig-ME] Comparing results from glmer and glht

Mon Jan 16 12:06:17 CET 2012

Colin,

A simple graph of means for your eight combinations of 4 watershed types 
and two riparian types shows there are significant differences between 
watershed types c, d and f, and that for those types of watershed, 
riparian type makes no difference to the response. For watershed g, 
response is similar to watershed f for forested streams, but 
significantly lower at non-forested streams (if I have interpreted your 
factor labels correctly). These conclusions are based on inspection of a 
very simple graph with say a white dot for forested stream, black dot 
for unforested stream, and different locations along x axis for the 4 
watershed types. Somewhere on the graph there needs to be a bar whose 
length represents a rough estimate of the average standard error of a 
mean (or difference between two means). Note that neither p-values nor 
multiple comparisons have been mentioned.

Colin Wahl wrote:
> I will try to make this concise.
> 
> Background: I am testing the effects of land use and forested riparian
> buffers on stream invertebrates and in-stream variables. There are 4
> watershed types (defined by 4 types of land use) and two riparian
> types (forested and non). Percent EPT (relative abundance) was my main
> response variable. I also measured a variety of in-stream variables
> like temperature, nutrients, and toxicity. There are 72 observations
> for invertebrates, and 24 for in-stream variables.
> 
> I am curious of how acceptable p values are from pairwise comparisons
> using glht() from the multcomp package
> 
> I used glmer with a binomial error structure and an observation-level
> random effect (to account for overdispersion), to model invertebrates:
> 
> modelEPT<-glmer(EPT ~ wsh*rip + (1|stream) + (1|stream:rip) + (1|obs),
> data=ept, family=binomial(link="logit"))
> 
>    AIC   BIC logLik deviance
>  284.4 309.5 -131.2    262.4
> Random effects:
>  Groups     Name        Variance Std.Dev.
>  obs        (Intercept) 0.30186  0.54942
>  stream:rip (Intercept) 0.40229  0.63427
>  stream     (Intercept) 0.12788  0.35760
> Number of obs: 72, groups: obs, 72; stream:rip, 24; stream, 12
> 
> Fixed effects:
>                  Estimate Std. Error z value Pr(>|z|)
> (Intercept)   -4.2906     0.4935   -8.694  < 2e-16 ***
> wshd           -2.0557     0.7601  -2.705  0.00684 **
> wshf            3.3575     0.6339   5.297  1.18e-07 ***
> wshg           3.3923     0.7486    4.531  5.86e-06 ***
> ripN             0.1425     0.6323   0.225  0.82165
> wshd:ripN     0.3708     0.9682   0.383  0.70170
> wshf:ripN    -0.8665     0.8087   -1.071  0.28400
> wshg:ripN    -3.1530     0.9601  -3.284  0.00102 **
> ---
> 
> Correlation of Fixed Effects:
>                  (Intr)  wshd   wshf   wshg   ripN   wshd:N wshf:N
> wshd        -0.649
> wshf        -0.779  0.505
> wshg        -0.659  0.428  0.513
> ripN         -0.644  0.418  0.501  0.424
> wshd:ripN  0.421 -0.672 -0.327 -0.277 -0.653
> wshf:ripN  0.503 -0.327 -0.638 -0.332 -0.782  0.511
> wshg:ripN  0.424 -0.275 -0.330 -0.632 -0.659  0.430  0.515
> 
> 
> I then used this model to do Tukey's HSD contrasts between watershed types:
> 
> summary(glht(modelEPT, linfct=mcp(wsh="Tukey")))
> Linear Hypotheses:
> 
>                 Estimate Std. Error z value Pr(>|z|)
> d - c == 0 -2.05573    0.76010  -2.705   0.0341 *
> f - c == 0  3.35753    0.63386   5.297   <0.001 ***
> g - c == 0  3.39231    0.74862   4.531   <0.001 ***
> f - d == 0  5.41326    0.70176   7.714   <0.001 ***
> g - d == 0  5.44804    0.80692   6.752   <0.001 ***
> g - f == 0  0.03479    0.68931   0.050   1.0000
> 
> and riparian types:
> 
>                                                           Estimate
> Std. Error z value Pr(>|z|)
> C: Forested vs. Non-Forested == 0         0.1425     0.6323   0.225  0.99999
> D: Forested vs. Non-Forested == 0         0.5134     0.7332   0.700  0.98659
> F: Forested vs. Non-Forested == 0        -0.7239     0.5042  -1.436  0.69625
> G: Forested vs. Non-Forested == 0        -3.0105     0.7225  -4.167  < 0.001 ***
> 
> Are these p values accurate? Or is that a personal judgement I have to
> make based on the clarity of the patterns they reflect?
> 
> I've shown these results in my figures and explained them in my
> results. I've basically explained that though these p values
> reasonably reflect patterns in my data, effects sizes, and variances,
> that they are inexact and potentially anti-conservative due to the
> issues with degrees of freedom in mixed models.
> 
>>From what I understand from my research in the last year is that
> Douglas Bates and others advocate something of a paradigm shift away
> from the petagogically reinforced reliance on cryptic p values toward
> more in depth discussions of effects sizes and variances. The use of
> MCMC sampling and HPD intervals are suggested, but these are not
> available for generalized models.
> 
> I am interested in publishing these results as an ecologist, not a
> statistician (pardon the somewhat artificial distinction), and, I am
> very interested in what kind of a discussion the statisticians and
> ecologists of the r-sig-mixed-models mailing list would like to see as
> potential reviewers.
> 
> Thank you,
> 
> Colin Wahl
> 
> M.S. candidate,
> Dept. of Biology
> Western Washington University
> Bellingham, WA
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.