[R-sig-ME] Is multi-level modelling applicable to crossed designs?
Douglas Bates
bates at stat.wisc.edu
Wed May 4 20:06:39 CEST 2016
"Multilevel model" and "hierarchical linear model" are terms that are
outmoded and, in my opinion, introduce more confusion than illumination. In
some ways you are better off ignoring such descriptions.
To begin with you should distinguish between experimental factors, like
Treatment, and blocking factors, like Clinic. Random effects are
associated with blocking factors, like Clinic, which show up on the right
hand side of the | in a random-effects term like (Treatment | Clinic).
Experimental factors occur in fixed-effects terms and on the left hand side
of the | in random-effects terms. In your case you have only one factor,
Clinic, for which there are random effects. The characterization of
"levels" or, as we term them, the "grouping factors" for the random-effects
terms, only applies to the blocking factors.
Treatment is an experimental factor with clearly defined, reproducible
levels. Thus it is modeled as a fixed-effects term. (Note that the name
"fixed-effects" is a misnomer - it is the levels of the factor that are
fixed, not the effects of these levels.) The Clinic factor is a blocking
factor - a known source of variability for which we must control. The
levels represent a sample from a population of clinics that could
participate in the trial. These levels could be different in another trial.
A term like (1 | Clinic) allows for an additive shift between clinics
whereas a term like (1 + Treatment | Clinic) allows for the additive shift
plus a shift in the effect of each level of treatment between clinics. You
should not use (1 | Clinic) + (Treatment | Clinic) because the latter term
expands to (1 + Treatment | Clinic) and you would have the shift by Clinic
of the intercept in the model twice.
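The expansion rule above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not how lme4 parses formulas; the helper names expand_term and duplicated_intercepts are hypothetical, and the sketch only understands "+"-separated effects with an optional "0" to suppress the intercept.

```python
# Illustrative only: a tiny expander for lme4-style random-effects terms.
# expand_term and duplicated_intercepts are hypothetical helper names.

def expand_term(term):
    """Return (effects, grouping factor) for a term like '(Treatment | Clinic)'.

    An intercept ('1') is implicit unless the left-hand side contains '0'.
    """
    lhs, grouping = term.strip().strip("()").split("|")
    effects = [p.strip() for p in lhs.split("+")]
    if "0" in effects:
        effects = [e for e in effects if e != "0"]  # intercept suppressed
    elif "1" not in effects:
        effects = ["1"] + effects                   # implicit intercept
    return effects, grouping.strip()


def duplicated_intercepts(terms):
    """Grouping factors whose random intercept appears in more than one term."""
    counts = {}
    for term in terms:
        effects, grouping = expand_term(term)
        if "1" in effects:
            counts[grouping] = counts.get(grouping, 0) + 1
    return sorted(g for g, n in counts.items() if n > 1)


print(expand_term("(Treatment | Clinic)"))
print(duplicated_intercepts(["(1 | Clinic)", "(Treatment | Clinic)"]))
print(duplicated_intercepts(["(1 + Treatment | Clinic)"]))
```

The second call flags Clinic because both terms carry a by-Clinic intercept; the single term (1 + Treatment | Clinic) does not.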
For this to make sense you must have different treatments applied at the
same clinic. In other words, it would be a mistake to try to use a term
like (1 + Treatment | Clinic) if Clinic were nested within Treatment. But
you should be able to figure that out - how can I measure an effect for the
change in Treatment within each Clinic if only one Treatment is used at
each clinic?
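One way to check this condition before fitting is to list the clinics at which only one treatment was observed. The following is a minimal sketch, assuming the data are available as (clinic, treatment) pairs; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def clinics_without_contrast(rows):
    """Return the clinics at which only one Treatment level was observed.

    rows is an iterable of (clinic, treatment) pairs, one per observation.
    A within-clinic Treatment effect cannot be estimated for these clinics.
    """
    levels_seen = defaultdict(set)
    for clinic, treatment in rows:
        levels_seen[clinic].add(treatment)
    return sorted(c for c, levels in levels_seen.items() if len(levels) < 2)


# Toy data (hypothetical): both treatments at every clinic ...
crossed = [("c1", "A"), ("c1", "B"), ("c2", "A"), ("c2", "B")]
# ... versus each clinic using a single treatment (Clinic nested in Treatment).
nested = [("c1", "A"), ("c1", "A"), ("c2", "B"), ("c2", "B")]

print(clinics_without_contrast(crossed))
print(clinics_without_contrast(nested))
```

An empty result means every clinic contributes information about the treatment contrast; if every clinic is listed, Clinic is nested within Treatment and a term like (1 + Treatment | Clinic) is not meaningful.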
In the language of multilevel models you have two levels of variability
here, the residual or per-observation random variability and the
variability between clinics. With only one grouping factor for random
effects (Clinic), that is, a two-level model, there is no concept of nested or
non-nested for the random-effects terms.
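The two sources of variability can be written out for the simplest case, the model with (1 | Clinic) and two treatments A and B. The notation here is an assumption introduced for illustration and does not come from the original message: y_{ij} is the response of subject j at clinic i, x_{ij} is 1 for treatment B and 0 for treatment A, b_i is the random shift for clinic i, and \varepsilon_{ij} is the per-observation noise, with b_i and \varepsilon_{ij} taken to be independent.

```latex
\begin{align*}
  y_{ij} &= \beta_0 + \beta_1 x_{ij} + b_i + \varepsilon_{ij}, \\
  b_i &\sim \mathcal{N}(0, \sigma_b^2), \qquad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2), \\
  \operatorname{Var}(y_{ij}) &= \sigma_b^2 + \sigma^2 .
\end{align*}
```

Here \sigma_b^2 is the between-clinic variance and \sigma^2 is the residual variance, the two "levels" of variability in the multilevel description.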
If you had another known source of variability, say Doctor, it could be
nested within Clinic (each Doctor practices at only one Clinic) or it could
be non-nested (at least one of the Doctors is seeing patients in the trial
at more than one of the clinics). The latter case is neither nested nor
completely crossed. It is a case of partially crossed factors. It is
perhaps easier to think of examples in Subject/Item types of experiments.
Experiments in Psychology often have completely crossed random effects
where each of a sample of subjects is exposed to each of a sample of items
and both the subjects and the items represent samples from populations. In
a rater experiment, say Netflix-like data where people rate movies, there
may be a large number of raters and a large number of movies but you don't
expect every rater to rate every movie.
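The three situations described above (nested, partially crossed, completely crossed) can be told apart from the observed combinations of the two grouping factors. The following is a rough sketch, assuming the data are given as (inner, outer) pairs such as (doctor, clinic) or (subject, item); the function name classify and the toy data are hypothetical.

```python
def classify(pairs):
    """Classify how two grouping factors relate, from observed (inner, outer) pairs.

    'nested'             : every inner level occurs with exactly one outer level
    'completely crossed' : every inner level occurs with every outer level
    'partially crossed'  : anything in between
    """
    observed = set(pairs)
    inner = {a for a, _ in observed}
    outer = {b for _, b in observed}
    outer_per_inner = {a: {b for x, b in observed if x == a} for a in inner}
    if all(len(levels) == 1 for levels in outer_per_inner.values()):
        return "nested"
    if len(observed) == len(inner) * len(outer):
        return "completely crossed"
    return "partially crossed"


# Each doctor practices at only one clinic.
doctors_nested = [("d1", "c1"), ("d2", "c1"), ("d3", "c2")]
# Doctor d1 sees patients at two clinics.
doctors_partial = [("d1", "c1"), ("d1", "c2"), ("d2", "c1"), ("d3", "c2")]
# Every subject is exposed to every item.
subjects_items = [(s, i) for s in ("s1", "s2", "s3") for i in ("i1", "i2")]

print(classify(doctors_nested))
print(classify(doctors_partial))
print(classify(subjects_items))
```

The nested check is made first, so the degenerate case with a single outer level is reported as nested.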
The point of the methods behind the lme4 package is that they can handle
nested or partially crossed or completely crossed factors. Earlier
software for multilevel models or hierarchical linear models depended upon
having a nested structure for the grouping factors in the random effects.
There are certain simplifications available with nested factors. (Again,
let me emphasize that this is for the blocking factors only). It is okay
to use such simplifications except that the descriptions tended to conflate
mixed-effects models with multilevel models, as in the passage you quote.
Multilevel models or hierarchical models are a subset of linear or
generalized linear mixed-effects models. Also, the descriptions of the
models emphasized the levels of the hierarchy for the grouping factors and
for the experimental factors.
I hope this helps.
On Wed, May 4, 2016 at 7:04 AM <bernhard.voelkl at vetsuisse.unibe.ch> wrote:
> Is multi-level modelling applicable to fully-crossed designs?
> Dear mailing list,
> Dear Ben,
>
> The more I read about multi-level models the more confused I get. What I
> have read now in several different sources (e.g. Moerbeek & Teerenstra
> 2016) are statements like this: (1) .. multilevel
> model is also known as the hierarchical model, mixed effects model, random
> coefficient model or variance component model. And (2) The multilevel model
> differs from the traditional (single level) model since it explicitly
> accounts for the nested (sic) data structure by including random effects
> at the group level.
>
> Now, here is my question: the design that I have is one that would be
> classically described as “crossed design” and the classical textbooks go at
> length emphasising the difference between “nested” and “crossed” (which
> also leads to different ways for calculating degrees of freedom and
> standard errors in the fixed-factor case). My question: can I use mixed
> models for analysing a crossed design? Does the distinction between nested
> and crossed make any sense in the hierarchical/multilevel modelling
> approach?
>
> In R-speak: does Y ~ Treatment + (1|Clinic) + (Treatment|Clinic) make
> sense if Clinic and Treatment are crossed (not nested) factors but
> Treatment is fixed while Clinic is random?
>
> To be more concrete here is my setup: I have a large pool of subjects
> which I can randomly distribute to a number of clinics. At each clinic
> subjects are (randomly) divided into two groups and get either treatment A
> or treatment B. Then I take one measure from each subject. Subjects are
> really randomly distributed to clinics and at all clinics exactly the same
> treatments are applied. This is a classical crossed design. Treatment is
> clearly a fixed factor but I would like to treat clinic as a random factor
> (as I have many clinics, they are a sample of all existing clinics and I
> want to make generalizations beyond the specific clinics).
>
> (Just to clarify: I searched both “Ecological Models and Data in R” and
> the Gelman/Hill book “Data analysis using regression and
> multilevel/hierarchical models” but both did not explicitly mention crossed
> designs, yet in a relatively recent points-of-significance in Nature
> Methods (2014, 11, 977-978) Krzywinski nicely explains the difference
> between nested and crossed, so it doesn’t seem to be an obsolete
> distinction.)
>
> Any help highly appreciated!
> Kind regards,
> Bernhard
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>