[R-sig-ME] (in plain text) Replication in nested structure, how much?

Ben Bolker bbolker at gmail.com
Tue Sep 3 23:11:09 CEST 2013


Stephen T <stwebvanuatu at ...> writes:

> Following classical ANOVA, I thought it important to have
> replication at each level. Maybe this is not essential for mixed
> models?

> Here's my model: Y~1+(1|SUBJECT/OCCASION). Each subject was tested on
> multiple occasions. I want to evaluate the variance within-subjects
> and variance within-occasions.  I have data for 105
> subjects. Occasions per subject ranges from 1 to 4. Repeated
> measurements of the response Y per occasion range from 1 to 5.
> Originally, I thought to restrict the modelling to subjects tested
> on at least 2 occasions and with at least 2 Y data per
> occasion. Here are the numbers of "levels" in the reduced dataset:

> > model=lmer(Y~1+(1|SUBJECT/OCCASION), data=reduced) 

  {57 subjects, 138 occasions, 353 observations}

> And here's what I get with the full dataset: 
>  
> > model=lmer(Y~1+(1|SUBJECT/OCCASION), data=full) 

  {105 subjects, 196 occasions, 471 observations}

> There are some potential issues in the full dataset affecting 48/105
> of the subjects:

> 1) No replication (i.e. subjects measured on 1 occasion and once).
> 2) No replication of occasions (i.e. subjects measured multiple
> times but on 1 occasion).
> 3) No replication of measurements on some occasions (i.e. subjects
> measured on multiple occasions but sometimes with only 1 measurement
> per occasion).

> I do not want to ignore potentially informative data and the
> precision for random effect results seems to improve with the full
> dataset.

  As far as I can see, all three of your issues are allowable in the
'modern'/(RE)ML mixed model framework; the within-subject and
within-occasion variances should still be identifiable.  If you had a
very extreme case (e.g. most individuals measured only once, with a few
measured more than once) it might not be _practical_ to try to estimate
both variances, even though they would still be theoretically
identifiable, but it sounds like you're not in that situation.
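
  One rough way to judge how far a dataset is from that extreme case is
to count how many subjects have more than one occasion and how many
occasions have more than one measurement. The sketch below uses a small
made-up data frame (`toy`, with invented values) standing in for the
real data; only the column names SUBJECT, OCCASION and Y come from the
model formula above.

```r
## Toy data standing in for the real data frame (values are invented)
toy <- data.frame(
  SUBJECT  = c("a", "a", "a", "a", "b", "b", "c"),
  OCCASION = c(1, 1, 2, 2, 1, 1, 1),
  Y        = c(5.1, 4.9, 6.0, 6.2, 3.3, 3.1, 4.4)
)

## Number of distinct occasions for each subject
occ_per_subj <- tapply(toy$OCCASION, toy$SUBJECT,
                       function(x) length(unique(x)))

## Number of measurements in each observed subject-by-occasion cell
obs_per_occ <- table(interaction(toy$SUBJECT, toy$OCCASION, drop = TRUE))

## Distribution of replication at each level
print(table(occ_per_subj))
print(table(obs_per_occ))

## Units that carry information for separating the variance components
n_subj_multi_occ <- sum(occ_per_subj > 1)  # subjects with > 1 occasion
n_occ_multi_obs  <- sum(obs_per_occ > 1)   # occasions with > 1 measurement
```

If both counts are a reasonable fraction of the totals, the data are
probably nowhere near the extreme case described above.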

  As always, someone else more informed may come along and correct
this answer ...  The best way to reassure yourself in this case is to
simulate some data with known variance structure, knock out a number
of observations to make it resemble your example, and see whether you
still recover approximately correct answers.
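
  A minimal sketch of such a simulation, assuming lme4 is installed.
The "true" standard deviations, the intercept of 10, and the uniform
draws for the number of occasions and measurements are all arbitrary
choices for illustration, not values taken from the real data; only the
number of subjects and the ranges (1 to 4 occasions, 1 to 5
measurements) follow the description above.

```r
library(lme4)

set.seed(101)

## Assumed "true" values, chosen only for illustration
n_subj   <- 105   # subjects, as in the full dataset
max_occ  <- 4     # at most 4 occasions per subject
max_rep  <- 5     # at most 5 measurements per occasion
sd_subj  <- 2.0   # among-subject SD
sd_occ   <- 1.0   # among-occasion (within-subject) SD
sd_resid <- 0.5   # residual (within-occasion) SD

## Start from a fully balanced nested design
full <- expand.grid(REP      = seq_len(max_rep),
                    OCCASION = seq_len(max_occ),
                    SUBJECT  = seq_len(n_subj))
occ_id <- as.integer(interaction(full$SUBJECT, full$OCCASION, drop = TRUE))

## Simulate the response from known variance components
b_subj <- rnorm(n_subj, sd = sd_subj)
b_occ  <- rnorm(max(occ_id), sd = sd_occ)
full$Y <- 10 + b_subj[full$SUBJECT] + b_occ[occ_id] +
          rnorm(nrow(full), sd = sd_resid)

## Knock out observations: 1-4 occasions per subject and
## 1-5 measurements per occasion (uniform draws, an assumption)
n_occ <- sample(seq_len(max_occ), n_subj, replace = TRUE)
n_rep <- sample(seq_len(max_rep), max(occ_id), replace = TRUE)
keep  <- full$OCCASION <= n_occ[full$SUBJECT] & full$REP <= n_rep[occ_id]
unbal <- full[keep, ]
unbal$SUBJECT  <- factor(unbal$SUBJECT)
unbal$OCCASION <- factor(unbal$OCCASION)

## Fit the same model as in the original question
fit <- lmer(Y ~ 1 + (1 | SUBJECT/OCCASION), data = unbal)

## Compare estimated SDs with the values used to simulate
vc    <- as.data.frame(VarCorr(fit))
grp   <- as.character(vc$grp)
truth <- c("OCCASION:SUBJECT" = sd_occ, SUBJECT = sd_subj,
           Residual = sd_resid)
result <- data.frame(group = grp, true_sd = unname(truth[grp]),
                     est_sd = round(vc$sdcor, 2))
print(result)
```

Repeating this over many seeds, and making the knock-out pattern match
the real replication counts more closely, would give a better idea of
how variable the estimates are.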

  Ben Bolker


