[R] nested mixed-effect model: variance components
Spencer Graves
spencer.graves at pdf.com
Sat Jun 10 18:16:54 CEST 2006
I have seen no reply to this, so I will offer a couple of comments in
spite of the fact that I know very little about "aov" other than it is
old and has largely been superceded by "lme" in the "nlme" package.
I've replied to many posts on random and mixed effects over the past few
years, and I only remember one case where 'aov' returned an answer that
could not have been obtained easily from 'lme'. That one application
involved estimating a saturated model from perfectly balanced
experimental data. 'lme' refused to return an answer, at least as I
specified the model, and 'aov' returned numbers from which anyone
familiar with the theory for that special case could compute the desired
F ratios and p-values. In all other cases that I remember, 'aov' would
either return the same answer, or much more commonly, a substantively
inferior answer -- if it were feasible to use 'aov' at all.
I mention this, because the simplest way I can think of to check your
answer is to suggest you try to work it using 'lme'. To learn how to do
this, I strongly encourage you to consult Pinheiro and Bates (2000)
Mixed-Effects Models in S and S-PLUS (Springer). Bates has been for
many years one of the leading contributors in this area and is the
primary author of the 'nlme', 'lme4' and 'Matrix" packages to support
working with these kinds of models. This book discusses both how to fit
these kinds of models as well as how to produce plots of various kinds
that can be very helpful for explaining the experimental results to
others as well as diagnosing potential problem. Moreover, the R scripts
discussed in the book are available in files called "ch01.R", "ch02.R",
..., "ch06.R", "ch08.R" in a subfolder '~library\nlme\scripts' of the
directory in which R is installed on your hard drive. This makes it
much easier to work through the examples in the book one line at a time,
experimenting at with modifications. In addition, there is one point I
know where the R syntax differs from that in the book: S-Plus will
accept x^2 in a formula as meaning the square of a numeric variable; R
will not. To specify a square in R, you need something like I(x^2).
When I copied the commands out of the book, I had serious trouble
getting answers like those in the book until I identified and corrected
this difference in syntax. If you use the script files, they provide
the R syntax.
I'm not certain, but I believe the following should estimate the
model you wrote below:
fit <- lme(fixed=COVER ~ HABITAT,
random = ~1|LAGOON/HABITAT,
data=cov).
For comparison, I refer you to Pinheiro and Bates, p. 47, where I
find the following:
fm1Oats <- lme( yield ~ ordered(nitro) * Variety, data = Oats,
random = ~ 1 | Block/Variety )
There are 3 Varieties and 6 Blocks in this Oats data.frame. The
fixed effect of Variety has therefore 2 degrees of freedom. However,
the random effect of Variety occurs in 18 levels, being all the 6*3
Block:Variety combinations. You can see this by examining 'str(fm1Oats)'.
If you want to know "how much variation is due to lagoons? habitats?
lagoons*habitat? transects?", this model will NOT tell you that, and I
don't know how to answer that question using 'lme'. I was able to
answer a question like that using 'lmer' associated with the 'lme4' and
'Matrix' packages. Unfortunately, these packages have some names
conflicts with 'nlme', and the simplest way I know to change from one to
the other is to "q()" and restart R Before I did, however, I made a
local copy of the "Oats" data.frame. After I did that, I seemed to get
sensible numbers from the following:
library(lme4)
fm1Oats4 <- lmer(yield~ ordered(nitro) * Variety
+(1|Block)+(1|Variety)+(1|Block:Variety),
data=Oats)
For both "lme" and "lmer", the default "method" is "REML" (restricted
maximum likelihood). This is what you want for estimation. For testing
random effects, you still want "REML", but you should adjust the degrees
of freedom of the reference "F" distribution as discussed in section 2.4
of Pinheiro and Bates; this also applies to confidence intervals for
the random effects. For testing fixed effects, you should use "method =
'ML'".
"lme4" is newer than "nlme" and does not currently have available the
complete set of helper functions for plotting, etc. Thus, you will
likely want to use both. For documentation on 'lmer', you should still
start with Pinheiro and Bates for the general theory, then refer to the
vignettes associated with the "mlmRev" and "lme4" packages; if you
don't know about vignettes RSiteSearch("graves vignette") will lead you
to "http://finzi.psych.upenn.edu/R/Rhelp02a/archive/67006.html" and
other replies to r-help where I've described how to use them. You will
also want the relevant article by Doug Bates in R News. To find that,
go "www.r-project.org" -> Documentation: Newsletter -> Download:
"Table of Contents", then search for Bates until you find "Douglas
Bates. Fitting linear mixed models in R. R News, 5(1):27-30, May 2005."
That says you want vol. 5(1).
Hope this helps,
Spencer Graves
Eric Pante wrote:
> Dear listers,
>
> I am trying to assess variance components for a nested, mixed-effects
> model. I think I got an answer that make sense from R, but I have a
> warning message and I wanted to check that what I am looking at is
> actually what I need:
>
> my data are organized as transects within stations, stations within
> habitats, habitats within lagoons.
> lagoons: random, habitats: fixed
> the question is: how much variation is due to lagoons? habitats?
> lagoons*habitat? transects?
>
> Here is my code:
>
> res <- aov(COVER ~ HABITAT + Error(HABITAT+LAGOON+LAGOON/HABITAT),
> data=cov)
> summary(res)
>
> and I get Sum Sq for each to calculate variance components:
>
> Error: STRATE
> Df Sum Sq Mean Sq
> STRATE 5 4493.1 898.6
>
> Error: ATOLL
> Df Sum Sq Mean Sq F value Pr(>F)
> Residuals 5 3340.5 668.1
>
> Error: STRATE:ATOLL
> Df Sum Sq Mean Sq F value Pr(>F)
> Residuals 18 2442.71 135.71
>
> Error: Within
> Df Sum Sq Mean Sq F value Pr(>F)
> Residuals 145 6422.0 44.3
>
> My error message seems to come from the LAGOON/HABITAT, the Error is
> computed.
> Warning message: Error() model is singular in: aov(COVER ~ HABITAT +
> Error(HABITAT+LAGOON+LAGOON/HABITAT), data=cov),
>
> THANKS !!!
> eric
>
>
>
> Eric Pante
> ----------------------------------------------------------------
> College of Charleston, Grice Marine Laboratory
> 205 Fort Johnson Road, Charleston SC 29412
> Phone: 843-953-9190 (lab) -9200 (main office)
> ----------------------------------------------------------------
>
> "On ne force pas la curiosite, on l'eveille ..."
> Daniel Pennac
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
More information about the R-help
mailing list