[R] Nesting order for mixed models

Wed Mar 11 03:52:41 CET 2009

On Tue, Mar 10, 2009 at 2:47 PM, Jon Zadra <jrz9f at virginia.edu> wrote:
> Hello,

> I am confused about the order of nesting in mixed models using functions
> like aov(), lme(), lmer().

> I have the following data:
> n subjects in either condition A or B
> each subject tested at each of 3 numerical values ("distance" = 40,50,60),
> repeated 4 times for each of the 3 numerical values ("trial" = 1,2,3,4)

> Variable summary:
> Condition: 2 level factor
> Distance: numerical (but only 3 values) in the same units as "y"
> Trial: 4 level factor

I don't think Trial is necessary.  If I understand correctly, it is not
really an experimental or observational factor, in that you don't
expect trial 1 for one subject/distance combination to be
related to trial 1 for another combination.

> I expect the subjects' data to differ due to condition and distance, and am
> doing repeated measurements to reduce any variability due to measurement
> error.
>
> Currently I'm using this model:
>
> lme(y ~ Condition + Distance, random = ...)
>
> the question is how do I organize the random statement?  Is it:
> random = ~1 | Subject

I think that is all you need.  In lmer the formula would be
y ~ Condition + Distance + (1|Subject).
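
As a rough sketch of that fit, here is a simulated data set with the same
layout as described in the question.  The data frame name `d`, the number of
subjects (20), and all effect sizes are made up for illustration; only the
model formula and the random statement come from the discussion above.

```r
library(nlme)  # recommended package, ships with standard R installations

set.seed(1)
n_subj <- 20   # hypothetical number of subjects

# 4 trials at each of 3 distances for every subject
d <- expand.grid(Trial    = factor(1:4),
                 Distance = c(40, 50, 60),
                 Subject  = factor(seq_len(n_subj)))

# Condition varies between subjects: first half in A, second half in B
d$Condition <- factor(ifelse(as.integer(d$Subject) <= n_subj / 2, "A", "B"))

# Made-up response: fixed effects + per-subject shift + per-observation noise
subj_eff <- rnorm(n_subj, sd = 3)
d$y <- 5 + 2 * (d$Condition == "B") + 0.9 * d$Distance +
  subj_eff[as.integer(d$Subject)] + rnorm(nrow(d), sd = 2)

# Random intercept for Subject only; Trial does not appear in the model
fit <- lme(y ~ Condition + Distance, random = ~ 1 | Subject, data = d)
fixef(fit)    # estimates for the intercept, Condition and Distance
VarCorr(fit)  # between-subject and residual variance components

# The lme4 equivalent, if that package is installed, would be
# lme4::lmer(y ~ Condition + Distance + (1 | Subject), data = d)
```

The four trials within each Subject/Distance cell enter only as replicate
observations, so their variability is absorbed by the residual term.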

> random = ~1 | Subject/Trial
> random = ~1 | Trial/Subject
> random = ~1 | Condition/Distance/Subject/Trial
> ...etc, or something else entirely?
>
> Mostly I'm unclear about whether the Trials should be grouped under subject
> because I expect the trials to be more similar within a subject than across
> subjects, or whether subjects should be grouped under trials because the
> trials are going to differ depending on the subject.  If trials should be
> grouped under subjects, then do the condition or distance belong as well,
> since the trials will be most similar within each distance within each
> subject?

In some ways of thinking about the model, Trial would be grouped under
the Subject:Distance combination, but then it becomes unnecessary
because it is just another way of labeling the observations: each
Trial within a Subject:Distance combination contains exactly one
observation.  A random effect for Trial within Subject:Distance is
therefore confounded with the "residual" or per-observation noise term.