[R-sig-ME] DF in lme
Ben Ward
benjamin.ward at bathspa.org
Wed Mar 16 08:52:29 CET 2011
Hi, I'm using lme and lmer in my dissertation and it's the first time
I've used these methods. Taking into account replies from my previous
query I decided to go through with a model simplification, and then try
to validate the models in various ways and come up with the best one to
include in my work, be it a linear mixed effects model or general linear
effects model, with log() data or not etc - interestingly it does not
seems like doing transofrmations and such makes much difference so far,
looking at changes in diagnostic plots and AIC.
Anywho, I simplified to the model using lme (I've pasted it at the
bottom). And looking at the anova output the numDF looks right. However
I'm concerned about the 342 df in the denDF in anova() and in the
summary() output, as it seems to high to me, because at the observation
level is too high and pseudoreplicated; 4 readings per disk, 3 disks,
per plate, 3 plates per lineage, 5 lineages per group, 2 groups so:
4*3*3*5*2=360. If I take this to disk level 3*3*5*2=90, and at dish
level it's 3*5*2=30 degrees of freedom for error. And either dish or
disk (arguments for both) is the level at which one independant point of
datum is obtained, most probably Dish. So I'm wondering if either I'de
done something wrong, or I'm not understanding how df are presented and
used in mixed models. It's not really explained in my texts, and my
lecturer told me I'm working at the edge of his personal/professional
experience.
I've used lmer and the function in languageR to extract p-values without
it even mentioning df. Now if the lmer method with pvals.fnc() makes it
so as I don't have to worry about these df then in a way it makes my
issue a bit redundant. But it is playing on my mind a bit so felt I
should ask.
My second question is about when I do the equivalent model using lmer:
"lmer(Diameter~Group*Lineage+(1|Dish)+(1|Disk), data=Dataset)" - which
I'm sure does the same because all my plots of residuals against fitted
and such are the same, if I define it with the poisson family, which
uses log, then I get a much lower AIC of about 45, compared to over 1000
without family defined, which I think defaults to gaussian/normal. And
my diagnostic plots still give me all the same patters, but just looking
a bit different because of the family distribution specified. I then did
a model logging the response variable by using log(Diameter), again, I
get the same diagnostic plot patterns, but on a different scale, and I
get an AIC of - 795.6. Now normally I'd go for the model with the lowest
AIC, however, I've never observed this beahviour before, and can't help
but think thhat the shift from a posotive 1000+ AIC to a negative one is
due to the fact the data has been logged, rather than that the model
fitted to log data in this way is genuinley better.
Finally, I saw in a text, an example of using lmer but "Recoding Factor
Levels" like:
lineage<-Group:Lineage
dish<-Group:Lineage:Dish
disk<-Group:Lineage:Dish:Disk
model<-lmer(Diameter~Group+(1|lineage)+(1|dish)+(1|disk)
However I don't see why this should need to be done, considering, the
study was hieracheal, just like all other examples in that chapter, and
it does not give a reason why, but says it does the same job as a nested
anova, which I though mixed models did anyway.
Hopefully sombody can shed light on my concerns. In terms of my work and
university, I could include what I've done here and be as transparrant
as possible and discuss these issues, because log() of the data or
defining a distribution in the model is leading to the same plots and
conclusions. But I'd like to make sure I come to term with what's
actually happening here.
A million thanks,
Ben W.
lme14 <- lme(Diameter~Group*Lineage,random=~1|Dish/Disk, data=Dataset,
method="REML")
>anova(lme14):
numDF denDF F-value p-value
(Intercept) 1 342 16538.253 <.0001
Group 1 342 260.793 <.0001
Lineage 4 342 8.226 <.0001
Group:Lineage 4 342 9.473 <.0001
> summary(lme14)
Linear mixed-effects model fit by REML
Data: Dataset
AIC BIC logLik
1148.317 1198.470 -561.1587
Random effects:
Formula: ~1 | Dish
(Intercept)
StdDev: 0.1887527
Formula: ~1 | Disk %in% Dish
(Intercept) Residual
StdDev: 6.303059e-05 1.137701
Fixed effects: Diameter ~ Group * Lineage
Value Std.Error DF t-value
p-value
(Intercept) 15.049722 0.2187016 342 68.81396 0.0000
Group[T.NEDettol] 0.980556 0.2681586 342 3.65662 0.0003
Lineage[T.First] -0.116389 0.2681586 342 -0.43403 0.6645
Lineage[T.Fourth] -0.038056 0.2681586 342 -0.14191 0.8872
Lineage[T.Second] -0.177500 0.2681586 342 -0.66192 0.5085
Lineage[T.Third] 0.221111 0.2681586 342 0.82455 0.4102
Group[T.NEDettol]:Lineage[T.First] 2.275000 0.3792336 342 5.99894 0.0000
Group[T.NEDettol]:Lineage[T.Fourth] 0.955556 0.3792336 342 2.51970 0.0122
Group[T.NEDettol]:Lineage[T.Second] 0.828333 0.3792336 342 2.18423 0.0296
Group[T.NEDettol]:Lineage[T.Third] 0.721667 0.3792336 342 1.90296 0.0579
Correlation:
(Intr) Gr[T.NED] Lng[T.Frs] Lng[T.Frt]
Group[T.NEDettol] -0.613
Lineage[T.First] -0.613 0.500
Lineage[T.Fourth] -0.613 0.500 0.500
Lineage[T.Second] -0.613 0.500 0.500 0.500
Lineage[T.Third] -0.613 0.500 0.500 0.500
Group[T.NEDettol]:Lineage[T.First] 0.434 -0.707 -0.707 -0.354
Group[T.NEDettol]:Lineage[T.Fourth] 0.434 -0.707 -0.354 -0.707
Group[T.NEDettol]:Lineage[T.Second] 0.434 -0.707 -0.354 -0.354
Group[T.NEDettol]:Lineage[T.Third] 0.434 -0.707 -0.354 -0.354
L[T.S] L[T.T] Grp[T.NEDttl]:Lng[T.Frs]
Group[T.NEDettol]
Lineage[T.First]
Lineage[T.Fourth]
Lineage[T.Second]
Lineage[T.Third] 0.500
Group[T.NEDettol]:Lineage[T.First] -0.354 -0.354
Group[T.NEDettol]:Lineage[T.Fourth] -0.354 -0.354 0.500
Group[T.NEDettol]:Lineage[T.Second] -0.707 -0.354 0.500
Group[T.NEDettol]:Lineage[T.Third] -0.354 -0.707 0.500
Grp[T.NEDttl]:Lng[T.Frt] G[T.NED]:L[T.S
Group[T.NEDettol]
Lineage[T.First]
Lineage[T.Fourth]
Lineage[T.Second]
Lineage[T.Third]
Group[T.NEDettol]:Lineage[T.First]
Group[T.NEDettol]:Lineage[T.Fourth]
Group[T.NEDettol]:Lineage[T.Second] 0.500
Group[T.NEDettol]:Lineage[T.Third] 0.500 0.500
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.47467771 -0.75133489 0.06697157 0.67851126 3.27449064
Number of Observations: 360
Number of Groups:
Dish Disk %in% Dish
3 9
More information about the R-sig-mixed-models
mailing list