[R-sig-ME] package nlme

Ben Bolker bbolker at gmail.com
Tue Apr 12 20:40:27 CEST 2011

On 04/07/2011 06:33 AM, Anna Renwick wrote:
> Dear All
> I am currently analysing ecological data looking at the effect of
> agricultural intensity on a range of biodiversity measures. I have data from
> 26 sites grouped into 9 clusters. I am therefore running a linear mixed
> model using the nlme package. I am aware that there is issues in mixed
> models of determing the number of degrees of freedom and hence the p values
> but as these are reported by nlme I have a query I was hoping I might be
> able to resolve.
> The model I have run is below:
> beerichmix1<-lme(MeanBeeSppRich~cultivate, data=bees, random=~1|cluster,
>  method="ML")
> Linear mixed-effects model fit by maximum likelihood
> Data: bees
>       AIC      BIC    logLik
>  165.0555 170.0879 -78.52774
> Random effects:
> Formula: ~1 | cluster
>        (Intercept) Residual
> StdDev:    1.899494 4.636802
> Fixed effects: MeanBeeSppRich ~ cultivate
>               Value Std.Error DF   t-value p-value
> (Intercept) 38.19658  4.267833 16  8.949878  0.0000
> cultivate   -0.22248  0.059514 16 -3.738322  0.0018
> Correlation:
>          (Intr)
> cultivate -0.962
> Standardized Within-Group Residuals:
>       Min         Q1        Med         Q3        Max
> -1.8576001 -0.5935824  0.0859121  0.7933137  1.9665311
> Number of Observations: 26
> Number of Groups: 9
> I can see that the df has been calculated as 26 sites - 9 clusters - 1 = 16.
> However, this is clearly a misrepresentation of the data structure in the
> model and hence so is the p value. Using some sort of correction (i.e.
> Kenward-Roger correction), even if not precise would offer a slightly better
> solution in my circumstance than not correcting at all there fore I was
> wondering if anybody has addressed this in R or if people are just moving
> more and more away from reporting df and p values.

  It's not entirely clear to me what the structure of your data is.
Does cultivation status vary within clusters? (Presumably, since it's a
continuous variable.)  On the basis of classic experimental design/ANOVA
rules, what do you think the df should be, or what is a reasonable
  You can read about the rules that lme uses to guess denominator df
("inner-outer method" in PB 2000 (page 91 at

 if cultivate varies within cluster, its value is "inner" to cluster
"If a term is inner to all Q grouping factors in a model, it is
estimated at the level of the within-group errors, which we denote as
the Q+1st level" -- so cultivate would be estimated at level 2.

  Letting m_i denote the number of groups in level i (with the
convention that m_0=1 when the fixed effect model includes an intercept
and 0 otherwise, and m_{Q+1}=N) and p_i the sum of the degrees of
freedom corresponding to the terms estimated at level i, the i^th level
denominator df is defined as

denDF_i = m_i-(m_{i-1}+p_i), i = 1 ... Q+1

  So in this case

 m_0 = 1
 denDF_2 = m_2 - (m_1 + p_1) = N - (# clusters + # parameters) =
26 - (9 + 1)

 exactly as you have suggested above.

 Can you explain further why this is wrong? (I'm not saying it isn't,
just that I don't understand the problem ...)

  Ben Bolker

More information about the R-sig-mixed-models mailing list