[R-sig-ME] lmer formula specification
Gjalt-Jorn Peters
gjalt-jorn at behaviorchange.eu
Mon Nov 5 16:03:27 CET 2012
Dear all,
I posted a question last week, but I haven't received any replies - I'm
not sure why (because there were no replies :-)), so I'm trying
again, this time taking it one question at a time.
I haven't managed to find a clear explanation of the lmer model
specification syntax. I haven't been able to find a webpage explaining
this. I did find some archived posts of this list, and a variety of
webpages, but some resources seem outdated, and explanations often seem
to assume proficiency with mixed models and/or prior knowledge of some
parts of the specification. Which is a bit of an obstacle when you're
new to both R and lmer :-)
Did I just overlook a resource, or does nothing exist yet and is this
knowledge indeed kind of fragmented over the internet? I've been
struggling with this for a few weeks now (not full-time though :-)), but
perhaps I'm missing crucial search terms or something.
If no source like this exists yet, could somebody perhaps correct my
inferences? If one or more of you are willing to provide some feedback,
I can hopefully write a tutorial/webpage thing (similar to, e.g.
http://www.rensenieuwenhuis.nl/r-sessions-16-multilevel-model-specification-lme4/,
but a bit more complete, and similar to Douglas Bates' article at
http://cran.r-project.org/doc/Rnews/Rnews_2005-1.pdf, but a bit more
geared towards relative lay people). From a variety of sources, I pieced
together the following.
The basic form is 'criterion ~ formula', where formula specifies the
model you use to predict the criterion. This model consists of one or
more terms separated by plusses (+). A term can be:
-- 1 -> specifies that the intercept should be estimated. It is in fact
optional, as the intercept is estimated by default; writing 0 (or -1)
instead removes it;
-- a variable name -> specifies that the coefficient of that variable
should be estimated (i.e. its slope);
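-- an interaction term, consisting of two or more variable names
separated by colons -> specifies that the interaction between those
variables should be estimated. Only the interaction itself is added;
separating the names by asterisks (*) instead adds the interaction as
well as all lower-order terms, including the regular coefficients;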
-- a specification of random effects, which can take a number of forms:
---- (1 | variable name) -> for each level of variable name, random
intercepts are estimated
---- (variable name 1 | variable name 2) -> for each level of variable
name 2, random slopes are estimated for variable name 1, and random
intercepts are estimated (note: '1 + ' implicit, see first bullet);
---- (0 + variable name 1 | variable name 2) -> for each level of
variable name 2, random slopes are estimated for variable name 1, but
no random intercepts: only the single (fixed) intercept is estimated;
---- (variable name 1 + variable name 2 | variable name 3) -> for each
level of variable name 3, random slopes are estimated for variable name 1
and variable name 2, and random intercepts are estimated;
---- (variable name 1 | variable name 3 : variable name 2) -> for each
unique combination of levels of variable name 2 and variable name 3
(where variable name 3 is the higher level), random intercepts are
estimated, as well as random slopes for variable name 1;
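---- (variable name 1 | variable name 2 / variable name 3) -> shorthand
for (variable name 1 | variable name 2) + (variable name 1 | variable
name 2 : variable name 3), where variable name 3 is nested within
variable name 2: random intercepts and random slopes for variable name 1
are estimated for each level of variable name 2 and for each unique
combination of levels of variable name 2 and variable name 3.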
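The fixed-effect part of such a formula can be checked without fitting
anything. What follows is only a minimal sketch in base R: the variable
names y, a and b and the two helper functions are hypothetical, made up
for illustration. It asks R which terms a formula expands to and whether
an intercept is included.

```r
# Base R only; y, a and b are hypothetical variable names and no data
# set is needed, because terms() works on the formula itself.

# Helper (hypothetical): the term labels a formula expands to.
term_labels <- function(f) attr(terms(f), "term.labels")

# Helper (hypothetical): TRUE if the formula includes an intercept.
has_intercept <- function(f) attr(terms(f), "intercept") == 1

term_labels(y ~ a:b)      # the colon gives the interaction term only
term_labels(y ~ a * b)    # the asterisk gives a, b and a:b
has_intercept(y ~ a)      # intercept is included by default
has_intercept(y ~ 1 + a)  # writing '1 +' explicitly changes nothing
has_intercept(y ~ 0 + a)  # '0 +' removes the intercept
```

The same helpers can be pointed at any other formula to see how R reads
it before passing it on to lmer.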
The next step would be to generate a series of potential scenarios and
provide syntax for each scenario, as Rense Nieuwenhuis does at
http://www.rensenieuwenhuis.nl/r-sessions-16-multilevel-model-specification-lme4/.
I hope somebody either knows a resource that roughly does this, or
thinks it may be nice to help make something like this!
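To give an idea of what one such scenario could look like, here is a
rough sketch of the random-effects forms listed earlier. It assumes the
lme4 package is installed and uses the sleepstudy data that ships with
it (Reaction, Days, Subject). The names m_int, m_both, m_slope and
nested_bars are arbitrary, and y, x, g2 and g3 are hypothetical
variables used only to show how a nested term is parsed.

```r
library(lme4)  # assumption: lme4 is installed; sleepstudy comes with it

# Random intercepts only: one intercept deviation per Subject.
m_int <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Random intercepts and random slopes for Days (the '1 +' is implicit).
m_both <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Random slopes only: '0 +' drops the random intercept, while the fixed
# intercept is still estimated.
m_slope <- lmer(Reaction ~ Days + (0 + Days | Subject), data = sleepstudy)

# The columns of ranef() show which effects vary by Subject per model.
colnames(ranef(m_int)$Subject)
colnames(ranef(m_both)$Subject)
colnames(ranef(m_slope)$Subject)

# The fixed effects of the slopes-only model still include an intercept.
names(fixef(m_slope))

# A nested term is split into separate random-effects terms when parsed;
# findbars() returns them without needing any data.
nested_bars <- findbars(y ~ x + (x | g2 / g3))
nested_bars
```

Comparing the ranef() columns across the three models is a quick way to
confirm what each specification actually estimates.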
Kind regards, and thank you in advance,
Gjalt-Jorn Peters