[R-sig-ME] random effects specification

Sun Apr 6 17:04:34 CEST 2008

On Sat, Apr 5, 2008 at 10:14 PM, Ken Beath <kjbeath at kagi.com> wrote:
> On 05/04/2008, at 12:05 AM, Sebastian P. Luque wrote:
>  > On Fri, 4 Apr 2008 07:17:36 -0500,
>  > "Douglas Bates" <bates at stat.wisc.edu> wrote:
>  >
>  > [...]
>  >
>  >> I'm not sure that I understand what you mean by "treatment being
>  >> nested within community".  Does this mean that there are really 8
>  >> different treatments because treatment 1 in community A is different
>  >> from treatment 1 in community B?  If so, then it would make sense to
>  >> me to simply create a new factor that is the interaction of treatment
>  >> and community.
>  >
>  > I was not employing the term "nested" properly.  The number of levels
>  > for both community and treatment are 2 and 4, respectively, just as in
>  > the example.  The same 4 treatments were used in both communities, so
>  > in fact, treatment is crossed with community, not nested.  However,
>  > subjects are nested within communities because each subject belongs to
>  > one community only, yet received all 4 treatments.  Sorry for this
>  > confusion.
>  >
>
>  Once they are considered fixed effects, concepts of crossing and
>  nesting are irrelevant. They are simply covariates. So a model of the
>  form n ~ treatment + community + (1|id) or, if the treatment effect is
>  allowed to vary between communities, n ~ treatment * community + (1|id)
>  is appropriate. The main problem is that your subject ids are not unique.
>  You will need to define a new id.

I agree with everything up to here.

> The easiest way is to add a
>  different large number to id depending on community.

That approach contradicts your later advice to represent a factor
variable as a factor in R.  If id is a factor (as it should be) you
can't add a large number to it.

The specification (1|community:id) generates unique ids, because each
combination of community and id identifies a single subject.
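
Since the ids are reused across communities, another way to get unique
subject labels is to build the crossed factor explicitly before fitting.
The following is a minimal sketch under stated assumptions, not the
original poster's data: the data frame d, the simulated response y, the
column name subject, and the choice of 10 subjects per community are all
hypothetical. The model fit is attempted only if lme4 is installed.

```r
set.seed(1)

# Hypothetical layout: 2 communities, ids 1..10 reused in each, 4 treatments
d <- expand.grid(treatment = factor(1:4),
                 id        = factor(1:10),
                 community = factor(c("A", "B")))

# Cross community with id so that id 1 in A and id 1 in B are different subjects
d$subject <- interaction(d$community, d$id, drop = TRUE)

nlevels(d$id)       # 10 labels, each shared by two different subjects
nlevels(d$subject)  # 20 labels, one per subject

# Simulated response (assumption): a per-subject shift plus noise
subject_effect <- rnorm(nlevels(d$subject))
d$y <- subject_effect[as.integer(d$subject)] + rnorm(nrow(d))

# Fit with the unique subject factor, only if lme4 is available
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(y ~ treatment * community + (1 | subject), data = d)
  print(fit)
}
```

Because subject is already a factor with one level per person, no
arithmetic on the ids is needed.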

To me the convention that different experimental units should be given
the same level of 'id' is just another nonsensical aspect of the
traditional approaches to random-effects models using observed and
expected mean squares, for which it makes sense to index the
observations by group and by unit within group.

If we could manage to unlearn old habits and just give each subject a
unique id at the start it would make life easier.

>
>
>  >
>  >> Perhaps I am approaching the community factor incorrectly.  In your
>  >> data there are two communities so, even if it would be reasonable to
>  >> model community effects as random effects, that would be difficult.
>  >> With only two levels I think it is best modeled as a fixed effect,
>  >> which would mean that questions about treatment and community are
>  >> related to the fixed effects.
>  >
>  > Could you please show a formula for the case where each individual is
>  > seen at both communities (community and treatment still being fixed)?
>  > This would help me understand the syntax better.
>  >
>
>  Same model as previous, provided a subject only receives a treatment
>  once. If a subject receives the same treatment more than once then
>  there needs to be a random effect that models the correlation between
>  repeated measurements of the same treatment, so the model is
>  y ~ treatment + community + (1|id/treatment). One problem that may have
>  occurred in your original attempts is that id and treatment need to be
>  factors.

Yes, that is one way of expressing an interaction between a random
effect for id and a fixed effect for treatment.

It expands to two random effects terms (1|id) + (1|id:treatment).  The
first is the effect for each person, and the second allows different
individuals to have different responses to the levels of treatment.
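
The expansion of the "/" operator can be checked in base R, which applies
the same rule to ordinary formulas. This is only an illustration of the
operator: the random-effects terms are assembled as text here and no
model is fitted. The variable names labs and re_terms are made up for the
example.

```r
# Base R expands the nesting operator `/` into a main term plus an interaction
labs <- attr(terms(~ id/treatment), "term.labels")
print(labs)

# Write the same expansion as random-effects terms (strings only, not fitted)
re_terms <- paste0("(1|", labs, ")")
print(paste(re_terms, collapse = " + "))
```

The second print shows the two-term form, one term grouped by id and one
grouped by the id:treatment combination.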

A more general model (and consequently sometimes more difficult to
estimate) allows the random effects for the different levels of
treatment to be correlated within individual.  The term is written
(treatment|id).
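
To see why the more general term can be harder to estimate, it helps to
count variance-covariance parameters. The sketch below assumes 4
treatment levels, as in this thread, and an unstructured covariance
matrix for (treatment|id); the variable names are hypothetical.

```r
k <- 4  # number of treatment levels (from the example in this thread)

# (treatment|id): a k x k covariance matrix per grouping factor,
# i.e. k variances plus k*(k-1)/2 covariances
n_general <- k * (k + 1) / 2

# (1|id) + (1|id:treatment): one variance for each of the two terms
n_simple <- 2

print(c(general = n_general, simple = n_simple))
```

With 4 levels the general term has 10 parameters against 2 for the
two-term form, so it needs considerably more subjects to be estimated
well.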