[R-sig-ME] package nlme
Ben Bolker
bbolker at gmail.com
Tue Apr 12 20:40:27 CEST 2011
On 04/07/2011 06:33 AM, Anna Renwick wrote:
> Dear All
>
> I am currently analysing ecological data looking at the effect of
> agricultural intensity on a range of biodiversity measures. I have data from
> 26 sites grouped into 9 clusters. I am therefore running a linear mixed
> model using the nlme package. I am aware that there is issues in mixed
> models of determing the number of degrees of freedom and hence the p values
> but as these are reported by nlme I have a query I was hoping I might be
> able to resolve.
>
> The model I have run is below:
>
> beerichmix1<-lme(MeanBeeSppRich~cultivate, data=bees, random=~1|cluster,
> method="ML")
>
> Linear mixed-effects model fit by maximum likelihood
> Data: bees
> AIC BIC logLik
> 165.0555 170.0879 -78.52774
>
> Random effects:
> Formula: ~1 | cluster
> (Intercept) Residual
> StdDev: 1.899494 4.636802
>
> Fixed effects: MeanBeeSppRich ~ cultivate
> Value Std.Error DF t-value p-value
> (Intercept) 38.19658 4.267833 16 8.949878 0.0000
> cultivate -0.22248 0.059514 16 -3.738322 0.0018
> Correlation:
> (Intr)
> cultivate -0.962
>
> Standardized Within-Group Residuals:
> Min Q1 Med Q3 Max
> -1.8576001 -0.5935824 0.0859121 0.7933137 1.9665311
>
> Number of Observations: 26
> Number of Groups: 9
>
> I can see that the df has been calculated as 26 sites - 9 clusters - 1 = 16.
> However, this is clearly a misrepresentation of the data structure in the
> model and hence so is the p value. Using some sort of correction (i.e.
> Kenward-Roger correction), even if not precise would offer a slightly better
> solution in my circumstance than not correcting at all there fore I was
> wondering if anybody has addressed this in R or if people are just moving
> more and more away from reporting df and p values.
>
It's not entirely clear to me what the structure of your data is.
Does cultivation status vary within clusters? (Presumably, since it's a
continuous variable.) On the basis of classic experimental design/ANOVA
rules, what do you think the df should be, or what is a reasonable
approximation?
You can read about the rules that lme uses to guess denominator df
("inner-outer method" in PB 2000 (page 91 at
<http://books.google.com/books?id=y54QDUTmvDcC&printsec=frontcover&dq=pinheiro+bates&hl=en&src=bmrr&ei=_ZqkTaaOKMeB0QGWhcnBBQ&sa=X&oi=book_result&ct=result&resnum=1&ved=0CCgQ6AEwAA#v=onepage&q=inner&f=false>):
if cultivate varies within cluster, its value is "inner" to cluster
"If a term is inner to all Q grouping factors in a model, it is
estimated at the level of the within-group errors, which we denote as
the Q+1st level" -- so cultivate would be estimated at level 2.
Letting m_i denote the number of groups in level i (with the
convention that m_0=1 when the fixed effect model includes an intercept
and 0 otherwise, and m_{Q+1}=N) and p_i the sum of the degrees of
freedom corresponding to the terms estimated at level i, the i^th level
denominator df is defined as
denDF_i = m_i-(m_{i-1}+p_i), i = 1 ... Q+1
So in this case
m_0 = 1
denDF_2 = m_2 - (m_1 + p_1) = N - (# clusters + # parameters) =
26 - (9 + 1)
exactly as you have suggested above.
Can you explain further why this is wrong? (I'm not saying it isn't,
just that I don't understand the problem ...)
Ben Bolker
More information about the R-sig-mixed-models
mailing list