[R] Need help with lmer model specification syntax for nested mixed model

Sun Oct 31 17:52:54 CET 2010

On Sun, Oct 31, 2010 at 2:35 AM, Carabiniero <jason at troutnut.com> wrote:
>
> I haven't been able to fully make sense of the conflicting online information
> about whether and how to specify nesting structure for a nested, mixed
> model.  I'll describe my experiment and hopefully somebody who knows lme4
> well can help.
>
> We're measuring the fluorescence intensity of brain slices from frogs that
> have undergone various treatments.  We want to use multcomp to look for
> differences between treatments, while accounting for the variance introduced
> by the random effects of brain and slice.  There are a few measurements per
> slice, several slices per brain, and several brains per treatment.  In the
> data file, the numbering for slices starts over from 1 for each brain, and
> the numbering for brains starts over from 1 for each treatment.

This is what I call "implicit nesting" in the definition of the
variables.  My general recommendation is to create new variables that
reflect the actual structure of the data, as in

mydata <- within(mydata, {
    # Treatment, Brain and Slice must be factors for ":" to form interactions
    ubrain <- factor(Treatment:Brain)
    uslice <- factor(Treatment:Brain:Slice)
})

then define the model in terms of these factors, ubrain and uslice,
which have the desirable property that each distinct brain and each
distinct slice has a distinct label.
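As a minimal sketch of what the relabelling does, the following builds a
toy data frame with the layout described in the question. The sizes
(3 treatments, 4 brains per treatment, 3 slices per brain, 2 readings
per slice) and the name `d` are assumptions made up for illustration;
they are not the poster's data.

```r
# Hypothetical toy layout: 3 treatments x 4 brains x 3 slices x 2 readings.
# Brain and Slice numbering restarts within each parent, as in the question.
d <- expand.grid(Rep = 1:2, Slice = 1:3, Brain = 1:4,
                 Treatment = c("A", "B", "C"))

d <- within(d, {
    Treatment <- factor(Treatment)
    Brain     <- factor(Brain)
    Slice     <- factor(Slice)
    # Explicitly nested labels: one level per physical brain / slice
    ubrain    <- factor(Treatment:Brain)
    uslice    <- factor(Treatment:Brain:Slice)
})

nlevels(d$Brain)    # 4 labels shared by 12 different brains
nlevels(d$ubrain)   # 12, one per brain
nlevels(d$uslice)   # 36, one per slice
head(levels(d$ubrain))
```

The implicitly nested Brain has only 4 levels even though the design
contains 12 brains, which is exactly the information the data alone
cannot supply.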

> In other words:  Treatment is a fixed effect, brain is a random effect
> nested in treatment, and slice is a random effect nested in brain.
>
> As I understood the documentation, this is the correct specification:
>
> log(Intensity) ~ Treatment + (1|Brain) + (1|Slice)

That specification will work once ubrain and uslice replace the
implicitly nested Brain and Slice, that is,

log(Intensity) ~ Treatment + (1|ubrain) + (1|uslice)
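A hedged sketch of fitting that model follows. The data are simulated,
and every number in the simulation (effect sizes, standard deviations,
group counts) is an assumption chosen only so that the example runs;
the names `d`, `form` and `fit` are likewise illustrative. The fit is
attempted only if the lme4 package is installed.

```r
set.seed(1)

# Hypothetical design: 3 treatments x 4 brains x 3 slices x 2 readings
d <- expand.grid(Rep = 1:2, Slice = factor(1:3), Brain = factor(1:4),
                 Treatment = factor(c("A", "B", "C")))
d <- within(d, {
    ubrain <- factor(Treatment:Brain)
    uslice <- factor(Treatment:Brain:Slice)
})

# Simulated response: treatment shift + brain effect + slice effect + noise
b_eff <- rnorm(nlevels(d$ubrain), sd = 0.5)
s_eff <- rnorm(nlevels(d$uslice), sd = 0.3)
d$Intensity <- exp(2 + 0.4 * (as.integer(d$Treatment) - 1) +
                   b_eff[as.integer(d$ubrain)] +
                   s_eff[as.integer(d$uslice)] +
                   rnorm(nrow(d), sd = 0.2))

# Treatment is fixed; each brain and each slice gets its own random intercept
form <- log(Intensity) ~ Treatment + (1 | ubrain) + (1 | uslice)

if (requireNamespace("lme4", quietly = TRUE)) {
    fit <- lme4::lmer(form, data = d)
    print(lme4::ngrps(fit))   # number of levels per grouping factor
}
```

The group counts reported for the fit should match the number of
physical brains and slices in the design. For the treatment comparisons
mentioned in the question, a call along the lines of
multcomp::glht(fit, linfct = multcomp::mcp(Treatment = "Tukey")) is a
common next step, assuming the multcomp package is available.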

> However, I don't see how lmer understands the correct nesting structure from
> that.  How does it know brain isn't crossed with treatment?

lmer can determine the crossed or nested structure from the data
whenever the data reflect the structure.  Implicitly nested factors
don't reflect the structure of the data and rely on external
information to augment the data given.

The computational methods used in lmer don't depend on whether the
grouping factors for the random effects are nested or not.  However,
they do require that the grouping factors be well defined, meaning
that each level refers to exactly one brain or one slice.
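Whether the data themselves reflect the nesting can be checked directly.
The helper below, `is_nested`, is a hypothetical name written for this
illustration in base R; it treats a factor as nested in another when
each of its levels occurs with only one level of the other. The toy
data frame `d` is again an assumed layout, not the poster's data.

```r
# Hypothetical design: 3 treatments x 4 brains x 3 slices x 2 readings
d <- expand.grid(Rep = 1:2, Slice = factor(1:3), Brain = factor(1:4),
                 Treatment = factor(c("A", "B", "C")))
d <- within(d, {
    ubrain <- factor(Treatment:Brain)
    uslice <- factor(Treatment:Brain:Slice)
})

# TRUE when every level of `inner` appears with at most one level of `outer`
is_nested <- function(inner, outer) {
    seen <- table(inner, outer) > 0
    all(rowSums(seen) <= 1)
}

is_nested(d$Brain,  d$Treatment)  # FALSE: brain "1" appears in every treatment
is_nested(d$ubrain, d$Treatment)  # TRUE
is_nested(d$Slice,  d$Brain)      # FALSE
is_nested(d$uslice, d$ubrain)     # TRUE
```

With the original labels the data look crossed, because brain 1 shows up
under every treatment; with ubrain and uslice the nesting is visible in
the data themselves.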

> Here are two other things I tried, and each gave different results:
>
> log(Intensity) ~ Treatment + (1|Slice/Brain/Treatment)
> log(Intensity) ~ Treatment + (1|Brain/Treatment) + (1|Slice/Brain)
>
> I'm not sure why these things give different results, or which one (if any)
> is right.  Can anyone help?
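One way to see why those two formulas differ: in lme4 a term (1|a/b) is
shorthand for (1|a) + (1|a:b).  The first formula therefore expands to
(1|Slice) + (1|Slice:Brain) + (1|Slice:Brain:Treatment), and the second
to (1|Brain) + (1|Brain:Treatment) + (1|Slice) + (1|Slice:Brain).  The
sketch below counts the groups each term would define on a toy layout;
the sizes and the names `d` and `groups` are assumptions for
illustration only.

```r
# Hypothetical design: 3 treatments x 4 brains x 3 slices x 2 readings
d <- expand.grid(Rep = 1:2, Slice = factor(1:3), Brain = factor(1:4),
                 Treatment = factor(c("A", "B", "C")))

# Number of distinct groups defined by each expanded grouping term
groups <- with(d, c(
    "Slice"                 = nlevels(factor(Slice)),
    "Slice:Brain"           = nlevels(factor(Slice:Brain)),
    "Slice:Brain:Treatment" = nlevels(factor(Slice:Brain:Treatment)),
    "Brain"                 = nlevels(factor(Brain)),
    "Brain:Treatment"       = nlevels(factor(Brain:Treatment))
))
print(groups)
```

In this layout only Brain:Treatment (12 groups) and
Slice:Brain:Treatment (36 groups) correspond to physical brains and
slices.  Terms such as (1|Slice) or (1|Slice:Brain) pool unrelated
slices that merely share a number, so the two formulas fit different
sets of random effects and neither matches the intended structure.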

I have taken the liberty of cc:ing the R-SIG-Mixed-Models mailing list
on this reply and suggest that any follow-ups be on that list.