[R-sig-ME] advice on grouping structure - many levels but few individuals per level
Andrew Robinson
A.Robinson at ms.unimelb.edu.au
Thu Apr 10 00:14:44 CEST 2008
Doug,
On Wed, Apr 09, 2008 at 08:47:27AM -0500, Douglas Bates wrote:
> On Wed, Apr 9, 2008 at 5:29 AM, Martin Matejus <mmatejus at googlemail.com> wrote:
> > Dear lmer's
>
> > I was hoping to get a little advice about specifying a grouping structure
> > with many levels but few individuals (sometimes one) per level. I have had a
> > look through the posting archives but could not find a similar question.
> > Many apologies in advance if I have missed any.
>
> > The context of the question is as follows:
>
> > I would like to model fitness of juvenile birds (a simple weight-based
> > metric) with a number of explanatory variables including: when they were
> > laid (as a Julian day - egglayed), number of nestlings in nest (nestlings)
> > and whether they are male or female (sex). Each bird obviously originates
> > from a nest with some birds originating from the same nest (siblings). As
> > there is the potential for the fitness of siblings to be similar (either due
> > to genetic or environmental effects) I would like to include nest as a
> > random effect to reflect this potential grouping structure. For example
>
> > model <- lmer(fitness ~ egglayed + nestlings + sex + (1 | nest))
>
> > I have many nests (175) but about half of them contain only 1 individual.
>
> > My question is: does it make sense to include nest as a random effect given
> > that many nests only contain one individual? I know this probably reflects a
> > rather deep misunderstanding regarding mixed effects models on my part but I
> > would have thought that it would be impossible to estimate a within-nest
> > variance with only one individual, and that this would make my between-nest
> > variance estimates meaningless.
>
> That's not a problem as long as you recognize that you will get almost
> no new information from the groups that have only one observation. In
> other words you will get almost the same parameter estimates from the
> complete data set as you would get from the data after eliminating
> those nests with only one individual. If you wrote out all of the
> error terms for each observation you would see that for those nests
> with only one observation you have two confounded error terms.
>
> I have seen this effect when fitting models to the 'star' data set in
> the mlmRev package. Because these are longitudinal data, groups are
> indexed by individuals (students, in this case) and the number of
> observations per group is the number of times the student takes a
> test. Many students have only one observation. For most models you
> can remove those students or keep them in without affecting the
> parameter estimates noticeably.
Do you mean removing all of those single-observation students at once, or one at a time?
Presumably that also depends on the multivariate distribution of the
observations.
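
For reference, the confounding of the two error terms can be written out
explicitly. This is only a sketch, assuming the usual random-intercept
model with independent normal errors; the symbols below are my own
notation and are not taken from the earlier messages. For bird j in
nest i:

\[
y_{ij} = x_{ij}^\top \beta + b_i + \varepsilon_{ij}, \qquad
b_i \sim N(0, \sigma_b^2), \quad
\varepsilon_{ij} \sim N(0, \sigma^2).
\]

For a nest with a single bird (n_i = 1) the two random terms appear only
as a sum:

\[
y_{i1} - x_{i1}^\top \beta = b_i + \varepsilon_{i1}
\sim N(0, \sigma_b^2 + \sigma^2),
\]

so that nest carries information about the fixed effects and the total
variance, but none about how the total splits between nest and residual.
The split is identified by nests with two or more birds, where for
j \neq k

\[
\operatorname{Cov}(y_{ij}, y_{ik}) = \sigma_b^2 .
\]
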
Andrew
> > Many, many thanks for your advice in advance.
> > Best wishes
> > Martin
> >
> > _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Andrew Robinson
Department of Mathematics and Statistics Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/