[R-sig-ME] lmer model specification

Wed Dec 7 22:31:52 CET 2011

Thanks for the very helpful reply.  I have a couple of follow up
questions.  I originally described my study as follows

>> In my study, we compared invertebrate biomass in paired organic and
>> conventional fields.  I had 9 conventional and 9 organic fields, for a
>> total of 18. Each field was sampled in May and in July.  At each
>> sampling time, two transects, each of 4 traps, were placed in each
>> field.
>>

It was suggested that since my design is balanced, and I could use lme
to analyse the results.  However, this being ecology, not all my
insect traps survived, so that I actually have between 1 and 4 traps
within each transect (4 traps in about 2/3 of transects, most of the
rest of the transect have 3 traps, a few have 2 or 1) so that although
I have equal numbers of fields, sampling months and transects, the
numbers of traps are not completely balanced - does this mean I should
be using lmer, or is the design balanced enough to proceed with lme as
previously suggested?

Secondly, I have a question about df in lme.  When I run my analysis
in lme I get

> model.edge<-lme(biomass~management*month, random=~1|pair/month/transect, data=edge)
> summary(model.edge)
Linear mixed-effects model fit by REML
 Data: edge
      AIC      BIC   logLik
  538.154 565.5182 -261.077

Random effects:
 Formula: ~1 | pair
        (Intercept)
StdDev:   0.1340750

 Formula: ~1 | month %in% pair
         (Intercept)
StdDev: 7.078553e-05

 Formula: ~1 | transect %in% month %in% pair
        (Intercept)  Residual
StdDev:   0.2681423 0.7008559

Fixed effects: biomass ~ management*month
                              Value Std.Error  DF    t-value p-value
(Intercept)              -1.7974245 0.1198274 192 -15.000114  0.0000
managementorg  0.1903121 0.1343578 192   1.416458  0.1583
monthR2          0.3627624 0.1585973   8   2.287317  0.0515
managementorg:monthR2  0.1982992 0.1889491 192   1.049485  0.2953
 Correlation:
                         (Intr) typerg rtt.R2
managementorg     -0.526
monthR2               -0.650  0.397
managementorg:monthR2  0.373 -0.709 -0.578

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-3.10692917 -0.65352204  0.06339836  0.63018992  2.52049694

Number of Observations: 230
Number of Groups:
                                  pair              month %in% pair
transect %in% month %in% pair
                                     9
    18                                     36
> anova(model.edge)
                    numDF denDF  F-value p-value
(Intercept)             1   192 350.9113  <.0001
management          1   192   9.8925  0.0019
month                  1     8  12.5699  0.0076
management:month     1   192   1.1014  0.2953

My question is do I really have df=192 for the test of management, and
the interaction test, and if not how do I calculate the df?

Thanks again!
Jude Girard

> jude girard <jude.girard at ...> writes:
>
>  [snip]
>
>> In my study, we compared invertebrate biomass in paired organic and
>> conventional fields.  I had 9 conventional and 9 organic fields, for a
>> total of 18. Each field was sampled in May and in July.  At each
>> sampling time, two transects, each of 4 traps, were placed in each
>> field.
>>
>> My fixed effects are management type and month (I expect invertebrate
>> abundance to be higher in July than in May), and the interaction
>> between them.  At first I specified my random effects as
>> pair/month/transect.  However, when I run this model
>>
>> model.edge<-lmer(biomass~management*month+(1|pair/month/transect), data=edge)
>> summary(model.edge)
>>
>> Linear mixed model fit by REML
>> Formula: biomass ~ management + month + (1 | pair/month/transect)
>>    Data: edge
>>    AIC   BIC logLik deviance REMLdev
>>  535.8 559.8 -260.9    513.4   521.8
>> Random effects:
>>  Groups                Name        Variance Std.Dev.
>>  transect:(month:pair) (Intercept) 0.071400 0.26721
>>  month:pair            (Intercept) 0.000000 0.00000
>>  pair                  (Intercept) 0.018794 0.13709
>>  Residual                          0.491406 0.70100
>> Number of obs: 230, groups: transect:(month:pair), 36;
>> month:pair, 18; pair, 9
>>
>> Fixed effects:
>>               Estimate Std. Error t value
>> (Intercept)   -1.84429    0.11147 -16.545
>> managementorg  0.29042    0.09474   3.065
>> monthR2        0.45889    0.12924   3.551
>
>  [snip]
>
>> the variance for month:pair is 0, which I think might be because the
>> month is already specified in the fixed effects?
>
>  No: month is specified in the fixed effects only as its
> main effect, whereas the variance specifies month:pair interaction
> only.  You're getting an estimate of zero variance because there
> is essentially not any detectable among-(month:pair) variation
> over and above that expected from transect:month:pair variation.
> This is essentially as expected when trying to fit several levels
> of nesting to a moderate-sized data set.
>
>> So, now I am
>> wondering if a better way to run the model might be
>> model2.edge<-lmer(biomass~management*month+(1|pair/transect.unique),
>> data=edge)  where is transect is treated as a unique level,
>> irrespective of which month it was run in?
>
>   I believe that this would then leave out the effect
> of "month within pair" (still trying to figure out why 0.1
> of the variance comes out of the residual variance and is
> counted within the transect:unique.pair variance instead here).
>
>> When I run the second model I get
>>
>> summary(model2.edge)
>> Linear mixed model fit by REML
>> Formula: biomass ~ management * month + (1 | pair/transect.unique)
>>    Data: edge
>>    AIC   BIC logLik deviance REMLdev
>>  524.2 548.3 -255.1    501.7   510.2
>> Random effects:
>>  Groups               Name        Variance Std.Dev.
>>  transect.unique:pair (Intercept) 0.17567  0.41913
>>  pair                 (Intercept) 0.01887  0.13737
>>  Residual                         0.39778  0.63070
>> Number of obs: 230, groups: transect.unique:pair, 70; pair, 9
>>
>> Fixed effects:
>>                       Estimate Std. Error t value
>> (Intercept)            -1.8028     0.1374 -13.122
>> managementorg           0.1665     0.1866   0.892
>> monthR2                 0.3745     0.1842   2.033
>> managementorg:monthR2   0.1866     0.2645   0.705
>>
>
>  The bigger difference in this model is that by putting the
> interaction (month:management) in you are allowing for
> month-to-month variation in the effect of management, which
> wasn't in the previous model.  These two effects (whether
> you include a pair:month variance term, and whether you
> include a month:management fixed effect) are more or less
> orthogonal, I believe.  (Note that adding the interaction
> term screws up your power to detect the management effect ...)
>
>  I would also consider running this in lme -- it should
> give you the same answers, and it will give you denominator
> df and p-values (which should be reasonably well justified
> in this classical, balanced linear mixed model).
>
>