[R] dispcrepancy between aov F test and tukey contrasts results with mixed effects model

Peter Dalgaard p.dalgaard at biostat.ku.dk
Mon Mar 16 01:14:47 CET 2009


lbaril at montana.edu wrote:
> Thanks Peter for the advice and quick response.  I just want to clarify
> what you suggest.  I should average values within a site then do a one-way
> anova to test for differnces between sites based on the 2 to 3 new samples
> per stand type -- and not use random effects for site?  Or, because I've
> reduced the data I've 'corrected' the problem with the glht multiple
> comparisons and can use the p-values from that summary if I include site
> as a random effect?   Thanks again for your advice.

This is tricky to say in a few lines, but the basic idea of a random 
effects model is that the site averages vary more than they should 
according to within-site variability. In the balanced case (equal 
number of observations per site), it turns out that the mixed-effects 
analysis is _equivalent_ to modeling the site averages. This is not 
ignoring the random effects of site; rather, it is coalescing it with 
the residual since the variance of a site average is v_site + 1/k v_res 
where k is the number of within-site observations.

In the unbalanced case it is not strictly correct to analyze averages, 
because thy have different variances. However, the differences can be 
slight (when the k's are similar or v_site dominates in the above formula).

A side effect of looking at averages is that you are fitting a plain lm 
model rather than lme and that glht in that case knows how to handle the 
degrees of freedom adjustment. (Assuming that the averages are normally 
distributed, which is as always dubious; but presumably, it is better 
than not correcting at all.)


> 
> 
>> lbaril at montana.edu wrote:
>>> Hello,
>>> I have some conflicting output from an aov summary and tukey contrasts
> with a mixed effects model I was hoping someone could clarify.  I am
> comparing the abundance of a species across three willow stand types.
> Since I have 2 or 3 sites within a habitat I have included site as a
> random effect in the lme model.  My confusion is that the F test given by
>>> aov(model) indicates there is no difference between habitats, but the
> tukey contrasts using the multcomp package shows that one pair of
> habits
>>> is significantly different from each other.  Why is there a
> discrepancy?
>>> Have I specified my model correctly?  I included the code and output
> below.  Thank you.
>> Looks like glht() is ignoring degrees of freedom. So what it does is
> wrong but it is not easy to do it right (whatever "right" is in these
> cases). If I understand correctly, what you have is that "stand" is
> strictly coarser than "site", presumably the stands representing each 2,
> 2, and 3 sites, with a varying number of replications within each site.
> Since the between-site variation is considered random, you end up with a
> comparison of stands based on essentially only 7 pieces of information.
> (The latter statement requires some qualification, but let's not go there
> to day.)
>> If you have roughly equal replications within each site, I'd be strongly
> tempted to reduce the analysis to a simple 1-way ANOVA of the site averages.
>>>> co.lme=lme(coye~stand,data=t,random=~1|site)
>>>> summary (co.lme)
>>> Linear mixed-effects model fit by REML
>>>  Data: R
>>>        AIC      BIC    logLik
>>>   53.76606 64.56047 -21.88303
>>> Random effects:
>>>  Formula: ~1 | site
>>>         (Intercept)  Residual
>>> StdDev:   0.3122146 0.2944667
>>> Fixed effects: coye ~ stand
>>>                  Value Std.Error DF    t-value p-value
>>> (Intercept)  0.4936837 0.2305072 60  2.1417277  0.0363
>>> stand2       0.4853222 0.3003745  4  1.6157240  0.1815
>>> stand3      -0.3159230 0.3251201  4 -0.9717117  0.3862
>>>  Correlation:
>>>        (Intr) stand2
>>> stand2 -0.767
>>> stand3 -0.709  0.544
>>> Standardized Within-Group Residuals:
>>>        Min         Q1        Med         Q3        Max
>>> -2.4545673 -0.5495609 -0.3148274  0.7527378  2.5151476
>>> Number of Observations: 67
>>> Number of Groups: 7
>>>> anova(co.lme)
>>>             numDF denDF   F-value p-value
>>> (Intercept)     1    60 23.552098  <.0001
>>> stand           2     4  3.738199  0.1215
>>>> summary(glht(co.lme,linfct=mcp(stand="Tukey")))
>>>          Simultaneous Tests for General Linear Hypotheses
>>> Multiple Comparisons of Means: Tukey Contrasts
>>> Fit: lme.formula(fixed = coye ~ stand, data = R, random = ~1 | site)
> Linear Hypotheses:
>>>            Estimate Std. Error z value Pr(>|z|)
>>> 2 - 1 == 0   0.4853     0.3004   1.616   0.2385
>>> 3 - 1 == 0  -0.3159     0.3251  -0.972   0.5943
>>> 3 - 2 == 0  -0.8012     0.2994  -2.676   0.0202 *
>>> ---
>>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>> (Adjusted p values reported -- single-step method)
>>> Lisa Baril
>>> Masters Candidate
>>> Department of Ecology
>>> Montana State University - Bozeman
>>> 406.994.2670
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> --
>>     O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>>    c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>>   (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45)
> 35327918
>> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
> 
> 
> Lisa Baril
> Masters Candidate
> Department of Ecology
> Montana State University - Bozeman
> 406.994.2670
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907




More information about the R-help mailing list