[R-sig-ME] A glmm prediction problem

Wed Jan 30 21:36:18 CET 2008

On Jan 30, 2008 2:02 PM, David Hinds <David_Hinds at perlegen.com> wrote:
> I'm using lmer with a generalized linear (binomial) mixed model with
> nested random effects, like:

>         y ~ (1 | a / b / c)

> There are no fixed effects.

I would check that.  The model will include a constant term in the
fixed effects.

I don't think it is possible in the current formulation to fit a model
without any fixed effects.  There is nothing in the theory or
computational methods that would preclude that, but I would be hard pressed
to think of a situation where it would be sensible to do so.  The
distribution of the random effects assumes a mean of zero for all the
random effects terms.  If you don't have any fixed effects terms then
you are assuming that the population mean probability of success is
exactly 0.5, which seems like a pretty strong assumption.
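
The point about the intercept can be checked numerically. What follows
is only a sketch in Python (standard library only), not anything taken
from lmer itself: the function names, the random-effect standard
deviation of 1.5 and the intercept of -2 are all assumptions chosen for
illustration. It collapses the nested terms into a single zero-mean
Gaussian effect on the logit scale and averages the inverse link over it.

```python
import math
import random


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def marginal_mean(intercept, sd, n=200_000, seed=1):
    """Monte Carlo estimate of E[inv_logit(intercept + b)] for b ~ N(0, sd^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += inv_logit(intercept + rng.gauss(0.0, sd))
    return total / n


# Hypothetical values, for illustration only.
sd_random_effect = 1.5

# With no intercept, a unit whose random effects are all zero has p = 0.5.
p_typical = inv_logit(0.0)

# Averaging over the zero-mean random effect still gives about 0.5,
# because the logistic curve is symmetric about zero.
p_no_intercept = marginal_mean(0.0, sd_random_effect)

# An intercept (assumed to be -2 here) moves the population mean away from 0.5.
p_with_intercept = marginal_mean(-2.0, sd_random_effect)

print(p_typical, p_no_intercept, p_with_intercept)
```

Whatever the variance of the random effects, the first two values stay
at one half; only a constant term in the fixed effects can shift them.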

> After fitting the model, I would like to make
> predictions for a new set of y values: specifically, I want to predict
> BLUPs for the random effects, and I would like to compute likelihoods
> for sets of y values under the fitted model.

Are the new y values to be at some set of observed levels for the a, b
and c factors?  For example, suppose that factor a is the field,
factor b is the plant selected from the field, factor c is the seed
pod selected from the plant and the observational unit is the seed
within the seed pod.  Do you want to predict for another seed from
that particular seed pod or for a seed in another, as yet unobserved,
seed pod from that plant or for a seed from a seed pod from another,
as yet unobserved, plant from one of the fields you observed or ...

Essentially what will happen is that you will need to use the marginal
variance for units that are as yet unobserved and the conditional
variance for the units that have been observed when determining the
variance of the linear predictor.  Because you have a generalized
linear mixed model you need to convert the variance of the linear
predictor to an associated interval on the mean response through the
inverse link function.  Somewhere along the line what would be the
"residual variance" in a model for a Gaussian family would need to be
incorporated for the binomial family.  I'm not sure exactly how that
would be done.  Suffice it to say that formulating the prediction
interval would not be trivial.
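
To make the first part of that concrete, here is a rough sketch in
Python (standard library only) of how the variance of the linear
predictor might be assembled and then mapped through the inverse logit
link. Every number and name in it is hypothetical, not output from a
fitted model. It also makes strong simplifying assumptions: the terms
are treated as uncorrelated, the uncertainty in the intercept and in the
variance estimates is ignored, and the binomial sampling variability
mentioned above is not incorporated at all. The result is therefore an
interval on the mean response for a unit, not a full prediction interval.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def eta_interval(intercept, terms, z=1.96):
    """Interval on the linear predictor.

    terms is a list of (predicted_effect, variance) pairs, one per
    grouping level. Variances are simply summed, which assumes the
    terms are uncorrelated (a simplification).
    """
    eta = intercept + sum(effect for effect, _ in terms)
    var = sum(variance for _, variance in terms)
    half_width = z * math.sqrt(var)
    return eta - half_width, eta, eta + half_width


# Hypothetical fitted quantities, for illustration only.
intercept = -1.0
marginal_var = {"a": 0.5, "b": 0.3, "c": 0.2}   # variance components
observed = {"a": (0.4, 0.05), "b": (-0.2, 0.10)}  # (conditional mode, conditional variance)

# Case 1: a new seed pod (c) on an observed plant (b) in an observed field (a).
# Observed levels use their conditional mode and conditional variance;
# the unobserved level is predicted as 0 with its marginal variance.
partly_observed = [observed["a"], observed["b"], (0.0, marginal_var["c"])]
lo1, mid1, hi1 = eta_interval(intercept, partly_observed)

# Case 2: a new field, plant and pod; every level uses its marginal variance.
all_new = [(0.0, marginal_var[k]) for k in ("a", "b", "c")]
lo2, mid2, hi2 = eta_interval(intercept, all_new)

# Map both intervals to the probability scale through the inverse link.
p_interval_1 = tuple(inv_logit(x) for x in (lo1, mid1, hi1))
p_interval_2 = tuple(inv_logit(x) for x in (lo2, mid2, hi2))

print(p_interval_1)
print(p_interval_2)
```

Because the inverse link is monotone, the endpoints can be transformed
directly, but the resulting interval on the probability scale is no
longer symmetric about its center.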

> I don't see a completely straightforward way of doing this since it
> isn't the usual sort of prediction problem.  Is this even a sensible
> thing to do?
>
> -- David Hinds