[R-sig-ME] DF in lme

Wed Mar 16 08:52:29 CET 2011

Hi, I'm using lme and lmer in my dissertation and it's the first time 
I've used these methods. Taking into account replies to my previous 
query, I decided to go through with a model simplification, and then try 
to validate the models in various ways and come up with the best one to 
include in my work, be it a linear mixed effects model or general linear 
effects model, with log() data or not, etc. Interestingly, it does not 
seem that transformations and the like make much difference so far, 
judging by changes in diagnostic plots and AIC.

Anyway, I simplified down to a model using lme (I've pasted it at the 
bottom). Looking at the anova output, the numDF looks right. However, 
I'm concerned about the 342 in the denDF column of anova() and in the DF 
column of summary(). It seems too high to me, because it is close to the 
number of observations, which are pseudoreplicated: 4 readings per disk, 
3 disks per plate (dish), 3 plates per lineage, 5 lineages per group, 
2 groups, so 4*3*3*5*2=360. At disk level that is 3*3*5*2=90, and at 
dish level it's 3*5*2=30, which is where I would expect the degrees of 
freedom for error to come from. Either dish or disk (there are arguments 
for both) is the level at which one independent data point is obtained, 
most probably dish. So I'm wondering whether I've done something wrong, 
or whether I'm not understanding how df are presented and used in mixed 
models. It's not really explained in my texts, and my lecturer told me 
I'm working at the edge of his personal/professional experience.
I've also used lmer with the pvals.fnc() function in languageR to 
extract p-values, and it doesn't mention df at all. If the lmer method 
with pvals.fnc() means I don't have to worry about these df, then in a 
way it makes my issue a bit redundant. But it is playing on my mind, so 
I felt I should ask.
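
As a sketch of the arithmetic in the first question, the counts at each 
level of the design can be written out in base R. The variable names are 
made up for illustration. The last lines rest on an assumption: that lme 
takes the denominator df for terms it treats as varying within the 
innermost group as observations minus innermost groups minus the 
non-intercept fixed-effect df. Under that assumption it is the 9 
"Disk %in% Dish" groups reported at the bottom of the summary() output, 
not the 90 disks in the design, that lead to 342.

```r
# Design as described in the post (names are illustrative only)
readings_per_disk  <- 4
disks_per_dish     <- 3
dishes_per_lineage <- 3
lineages_per_group <- 5
n_groups           <- 2

n_dish <- dishes_per_lineage * lineages_per_group * n_groups  # dishes in the design
n_disk <- disks_per_dish * n_dish                             # disks in the design
n_obs  <- readings_per_disk * n_disk                          # individual readings

# Number of "Disk %in% Dish" groups that summary(lme14) reports
reported_disk_groups <- 9

# Non-intercept fixed-effect df: Group (1) + Lineage (4) + Group:Lineage (4)
fixed_df <- 1 + 4 + 4

# ASSUMPTION: innermost-level rule, observations - groups - fixed-effect df
den_df <- n_obs - reported_disk_groups - fixed_df

print(c(n_dish = n_dish, n_disk = n_disk, n_obs = n_obs, den_df = den_df))
```

If the assumed rule is right, the mismatch between 9 reported groups and 
90 designed disks is the thing to look at, rather than the df formula 
itself.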

My second question is about the equivalent model in lmer: 
"lmer(Diameter~Group*Lineage+(1|Dish)+(1|Disk), data=Dataset)", which 
I'm sure fits the same model because all my plots of residuals against 
fitted values and so on are the same. If I define it with the poisson 
family, which uses a log link, I get a much lower AIC of about 45, 
compared to over 1000 without a family defined, which I think defaults 
to gaussian/normal. My diagnostic plots still show all the same 
patterns, just looking a bit different because of the family specified. 
I then fitted a model with a logged response, log(Diameter); again I 
get the same diagnostic plot patterns, on a different scale, and an AIC 
of -795.6. Normally I'd go for the model with the lowest AIC. However, 
I've never observed this behaviour before, and I can't help but think 
that the shift from a positive 1000+ AIC to a negative one is due to 
the fact that the data have been logged, rather than the model fitted 
to the log data being genuinely better.
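
On the log(Diameter) point: a likelihood computed for log(y) is on a 
different scale from one computed for y, so the two AIC values are not 
directly comparable. A commonly used adjustment adds 2*sum(log(y)), the 
Jacobian term of the transformation, to the AIC of the logged model. The 
sketch below uses simulated data and plain lm() rather than the Diameter 
data or a mixed model, so the numbers only illustrate the adjustment.

```r
set.seed(1)

# Simulated, strictly positive response (NOT the Diameter data)
x <- runif(200)
y <- exp(1 + 0.5 * x + rnorm(200, sd = 0.2))

fit_raw <- lm(y ~ x)        # model for y
fit_log <- lm(log(y) ~ x)   # model for log(y)

aic_raw <- AIC(fit_raw)
aic_log <- AIC(fit_log)     # likelihood of log(y): different scale from aic_raw

# Jacobian adjustment: density of y = density of log(y) * (1 / y),
# so -2 * logLik picks up an extra 2 * sum(log(y))
aic_log_adj <- aic_log + 2 * sum(log(y))

print(c(raw = aic_raw, log_unadjusted = aic_log, log_adjusted = aic_log_adj))
```

The unadjusted and adjusted values for the logged model differ by a 
constant that depends only on the data, which is why a large drop in AIC 
after logging says nothing by itself about fit.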

Finally, I saw in a text, an example of using lmer but "Recoding Factor 
Levels" like:
lineage<-Group:Lineage
dish<-Group:Lineage:Dish
disk<-Group:Lineage:Dish:Disk
model<-lmer(Diameter~Group+(1|lineage)+(1|dish)+(1|disk))

However, I don't see why this needs to be done, considering the study 
was hierarchical, just like all the other examples in that chapter. The 
text does not give a reason, but says it does the same job as a nested 
anova, which I thought mixed models did anyway.
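
One possible reason for the recoding, offered as an illustration rather 
than as what the text intended: if the dish and disk labels restart at 1 
within every lineage (an assumption about the data), a grouping factor 
built from the raw labels cannot tell dish 1 of one lineage from dish 1 
of another. The base R sketch below builds a made-up design with the 
same layout as this study and counts the levels before and after 
recoding. The level names G1, G2 and L1 to L5 are hypothetical.

```r
# Made-up layout: 2 groups x 5 lineages x 3 dishes x 3 disks x 4 readings,
# with Dish and Disk labelled 1..3 inside every lineage (an assumption)
d <- expand.grid(
  Reading = 1:4,
  Disk    = factor(1:3),
  Dish    = factor(1:3),
  Lineage = factor(paste0("L", 1:5)),
  Group   = factor(c("G1", "G2"))
)

# Recoding as in the quoted text: ':' on factors builds one level
# for every combination of the factors involved
d$lineage <- with(d, Group:Lineage)
d$dish    <- with(d, Group:Lineage:Dish)
d$disk    <- with(d, Group:Lineage:Dish:Disk)

# Number of distinct levels before (Dish, Disk) and after (dish, disk) recoding
print(sapply(d[c("Dish", "Disk", "lineage", "dish", "disk")], nlevels))
```

With labels made unique this way, a term such as (1|dish) refers to 30 
distinct dishes rather than 3.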

Hopefully somebody can shed light on my concerns. In terms of my work and 
university, I could include what I've done here, be as transparent as 
possible, and discuss these issues, because taking log() of the data or 
defining a distribution in the model leads to the same plots and 
conclusions. But I'd like to make sure I come to terms with what's 
actually happening here.

A million thanks,
Ben W.

lme14 <- lme(Diameter~Group*Lineage,random=~1|Dish/Disk, data=Dataset, 
method="REML")

 > anova(lme14)

              numDF denDF   F-value p-value
(Intercept)       1   342 16538.253 <.0001
Group             1   342   260.793 <.0001
Lineage           4   342     8.226 <.0001
Group:Lineage     4   342     9.473 <.0001

 > summary(lme14)
Linear mixed-effects model fit by REML
  Data: Dataset
        AIC      BIC    logLik
   1148.317 1198.470 -561.1587

Random effects:
  Formula: ~1 | Dish
         (Intercept)
StdDev:   0.1887527

  Formula: ~1 | Disk %in% Dish
          (Intercept) Residual
StdDev: 6.303059e-05 1.137701

Fixed effects: Diameter ~ Group * Lineage
                                         Value Std.Error  DF  t-value p-value
(Intercept)                         15.049722 0.2187016 342 68.81396  0.0000
Group[T.NEDettol]                    0.980556 0.2681586 342  3.65662  0.0003
Lineage[T.First]                    -0.116389 0.2681586 342 -0.43403  0.6645
Lineage[T.Fourth]                   -0.038056 0.2681586 342 -0.14191  0.8872
Lineage[T.Second]                   -0.177500 0.2681586 342 -0.66192  0.5085
Lineage[T.Third]                     0.221111 0.2681586 342  0.82455  0.4102
Group[T.NEDettol]:Lineage[T.First]   2.275000 0.3792336 342  5.99894  0.0000
Group[T.NEDettol]:Lineage[T.Fourth]  0.955556 0.3792336 342  2.51970  0.0122
Group[T.NEDettol]:Lineage[T.Second]  0.828333 0.3792336 342  2.18423  0.0296
Group[T.NEDettol]:Lineage[T.Third]   0.721667 0.3792336 342  1.90296  0.0579
  Correlation:
                                     (Intr) Gr[T.NED] Lng[T.Frs] Lng[T.Frt]
Group[T.NEDettol]                   -0.613
Lineage[T.First]                    -0.613  0.500
Lineage[T.Fourth]                   -0.613  0.500     0.500
Lineage[T.Second]                   -0.613  0.500     0.500      0.500
Lineage[T.Third]                    -0.613  0.500     0.500      0.500
Group[T.NEDettol]:Lineage[T.First]   0.434 -0.707    -0.707     -0.354
Group[T.NEDettol]:Lineage[T.Fourth]  0.434 -0.707    -0.354     -0.707
Group[T.NEDettol]:Lineage[T.Second]  0.434 -0.707    -0.354     -0.354
Group[T.NEDettol]:Lineage[T.Third]   0.434 -0.707    -0.354     -0.354
                                     L[T.S] L[T.T] Grp[T.NEDttl]:Lng[T.Frs]
Group[T.NEDettol]
Lineage[T.First]
Lineage[T.Fourth]
Lineage[T.Second]
Lineage[T.Third]                     0.500
Group[T.NEDettol]:Lineage[T.First]  -0.354 -0.354
Group[T.NEDettol]:Lineage[T.Fourth] -0.354 -0.354  0.500
Group[T.NEDettol]:Lineage[T.Second] -0.707 -0.354  0.500
Group[T.NEDettol]:Lineage[T.Third]  -0.354 -0.707  0.500
                                     Grp[T.NEDttl]:Lng[T.Frt] G[T.NED]:L[T.S
Group[T.NEDettol]
Lineage[T.First]
Lineage[T.Fourth]
Lineage[T.Second]
Lineage[T.Third]
Group[T.NEDettol]:Lineage[T.First]
Group[T.NEDettol]:Lineage[T.Fourth]
Group[T.NEDettol]:Lineage[T.Second]  0.500
Group[T.NEDettol]:Lineage[T.Third]   0.500                    0.500

Standardized Within-Group Residuals:
         Min          Q1         Med          Q3         Max
-2.47467771 -0.75133489  0.06697157  0.67851126  3.27449064

Number of Observations: 360
Number of Groups:
           Dish Disk %in% Dish
              3              9