[R-sig-ME] lmer formula specification

Gjalt-Jorn Peters gjalt-jorn at behaviorchange.eu
Mon Nov 5 16:03:27 CET 2012


Dear all,

I posted a question last week, but I haven't received any replies. I'm 
not sure why (because there were no replies :-)), so I'm trying again, 
this time taking it one question at a time.

I haven't managed to find a clear explanation of the lmer model 
specification syntax. I did find some archived posts from this list and 
a variety of webpages, but some resources seem outdated, and 
explanations often seem to assume proficiency with mixed models and/or 
prior knowledge of some parts of the specification, which is a bit of 
an obstacle when you're new to both R and lmer :-)

Did I just overlook a resource, or does no such resource exist yet, 
with this knowledge indeed fragmented across the internet? I've been 
struggling with this for a few weeks now (not full-time though :-)), but 
perhaps I'm missing crucial search terms or something.

If no source like this exists yet, could somebody perhaps correct my 
inferences? If one or more of you are willing to provide some feedback, 
I can hopefully write a tutorial/webpage thing (similar to, e.g. 
http://www.rensenieuwenhuis.nl/r-sessions-16-multilevel-model-specification-lme4/, 
but a bit more complete, and similar to Douglas Bates' article at 
http://cran.r-project.org/doc/Rnews/Rnews_2005-1.pdf, but a bit more 
geared towards relative lay people). From a variety of sources, I pieced 
together the following.

The basic form is 'criterion ~ terms', where the terms on the right-hand 
side specify the model you use to predict the criterion (in R, the 
whole expression is called the formula). This model consists of one or 
more terms separated by plus signs (+). A term can be:
-- 1 -> specifies that the intercept should be estimated. It is in fact 
optional, as the intercept is estimated by default; use 0 (or - 1) 
instead to suppress it;
-- a variable name -> specifies that the coefficient of that variable 
should be estimated (i.e. its slope);
-- an interaction term, consisting of two or more variable names 
separated by colons -> specifies that only the interaction between those 
variables should be estimated; to also estimate their regular 
coefficients, separate the names by asterisks instead (a * b is 
shorthand for a + b + a:b);
-- a specification of random effects, which can take a number of forms:
---- (1 | variable name) -> for each level of variable name, a random 
intercept is estimated;
---- (variable name 1 | variable name 2) -> for each level of variable 
name 2, random slopes are estimated for variable name 1, and random 
intercepts are estimated, along with the correlation between the two 
(note: '1 + ' is implicit, see first bullet);
---- (0 + variable name 1 | variable name 2) -> for each level of 
variable name 2, random slopes are estimated for variable name 1, but 
no random intercepts, so only the single overall (fixed) intercept is 
estimated;
---- (variable name 1 + variable name 2 | variable name 3) -> for each 
level of variable name 3, random slopes are estimated for variable name 
1 and variable name 2, and random intercepts are estimated;
---- (variable name 1 | variable name 3 : variable name 2) -> for each 
unique combination of levels of variable name 2 and variable name 3, 
random slopes are estimated for variable name 1, and random intercepts 
are estimated (the colon itself is symmetric and does not say which 
variable is the higher level);
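---- (variable name 1 | variable name 2 / variable name 3) -> variable 
name 3 is nested within variable name 2; this is shorthand for 
(variable name 1 | variable name 2) + (variable name 1 | variable name 
2 : variable name 3), so random slopes for variable name 1 and random 
intercepts are estimated for each level of variable name 2 and for each 
combination of levels of variable name 2 and variable name 3.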
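
As a minimal sketch of how these forms behave, something like the 
following could be run. It assumes the lme4 package is installed and 
uses the sleepstudy data that ships with it (Reaction, Days and Subject 
are columns of that data set); y, a, b, g2 and g3 are placeholder names 
used only for illustration. The terms() calls show how R expands the 
operators in an ordinary formula; as far as I understand, lme4 applies 
the same expansion of '/' to the grouping part of a random-effects term.

```r
library(lme4)  # assumption: lme4 is installed; sleepstudy ships with it

# Operator expansion in a formula (base R, no data needed).
# y, a, b, g2 and g3 are placeholder names.
fx_star  <- attr(terms(y ~ a * b), "term.labels")   # a, b and a:b
fx_colon <- attr(terms(y ~ a:b), "term.labels")     # only a:b
fx_slash <- attr(terms(y ~ g2 / g3), "term.labels") # g2 and g2:g3

# Random intercepts for each Subject.
m1 <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Random intercepts and random slopes for Days ('1 + ' is implicit),
# including their correlation.
m2 <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Random slopes for Days only; the intercept is fixed.
m3 <- lmer(Reaction ~ Days + (0 + Days | Subject), data = sleepstudy)

# Which random effects were estimated per Subject in each model.
re1 <- colnames(ranef(m1)$Subject)  # "(Intercept)"
re2 <- colnames(ranef(m2)$Subject)  # "(Intercept)" "Days"
re3 <- colnames(ranef(m3)$Subject)  # "Days"
print(list(fx_star, fx_colon, fx_slash, re1, re2, re3))
```

Comparing re1, re2 and re3 shows directly which random effects each 
specification adds or drops, which is easier to check than reading the 
full summary() output.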

The next step would be to generate a series of potential scenarios and 
provide syntax for each scenario, as Rense Nieuwenhuis does at 
http://www.rensenieuwenhuis.nl/r-sessions-16-multilevel-model-specification-lme4/.

I hope somebody either knows a resource that roughly does this, or 
thinks it may be nice to help make something like this!

Kind regards, and thank you in advance,

Gjalt-Jorn Peters


