[R-sig-ME] [R] lmer - BLUP prediction intervals

Daniel Caro dcarov at gmail.com
Wed Feb 6 00:54:15 CET 2013


Dear all

I have not been able to follow the discussion, but I would like to know
whether it makes sense to calculate prediction intervals like this:

var(fixed effect + random effect) = var(fixed effect) + var(random effect)
    + 2*cov(fixed effect, random effect), with the covariance taken to be zero,

and to base the prediction intervals on that. Does this make sense?

All the best,
Daniel
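
[For concreteness, here is a rough sketch of that calculation in lme4, for
the model quoted further down in the thread. It is only a sketch: the
'englisho.data' set is not available here, and whether the zero-covariance
assumption is defensible is exactly what the rest of the thread debates.

library(lme4)

m1  <- lmer(Difference ~ 1 + (1 | Examiner) + (1 | Item),
            data = englisho.data)
re1 <- ranef(m1, postVar = TRUE)   # 'condVar = TRUE' in newer lme4 versions

## variance of the estimated fixed intercept
v.fix <- vcov(m1)[1, 1]
## conditional variances of the examiner-level random effects
v.ran <- attr(re1$Examiner, "postVar")[1, 1, ]

## individual intercepts with approximate 95% intervals,
## ASSUMING zero covariance between the two pieces
est <- fixef(m1)[1] + re1$Examiner[, 1]
se  <- sqrt(v.fix + v.ran)
cbind(lower = est - 1.96 * se, estimate = est, upper = est + 1.96 * se)
]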

On Tue, Feb 5, 2013 at 8:54 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
> On Tue, Feb 5, 2013 at 2:14 PM, Andrew Robinson
> <A.Robinson at ms.unimelb.edu.au> wrote:
>
>> I'd have thought that the joint correlation matrix would be of the
>> estimates of the fixed effects and the random effects, rather than the
>> things themselves.
>>
>
> Well, it may be because I have turned into a grumpy old man, but I get
> picky about terminology: the random effects are not parameters - they are
> unobserved random variables.  They don't have "estimates" in the sense of
> parameter estimates.  The quantities returned by the ranef function are
> the conditional means (in the general case, the conditional modes; for a
> linear mixed model the two coincide) of the random effects given the
> observed data, evaluated with the parameters at their estimated values.
> From the Bayesian point of view none of this is problematic, because
> everything is a random variable, but otherwise I struggle with the
> interpretation of how these can be considered jointly.  If you want to
> consider the distribution of the random effects you need to have known
> values of the parameters.
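
[In lme4 terms (for a model like m1 in the sketch above), the distinction
reads:

fixef(m1)   # estimates of parameters: the fixed effects
ranef(m1)   # conditional means/modes of the random effects given the
            # data, with the parameters fixed at their estimated values
]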
>
>
>> The estimates are statistical quantities, with specified distributions,
>> under the model.  The model posits these different roles (parameter, random
>> variable) for the quantities that are the targets of the estimates, but the
>> estimates are just estimates, and as such, they have a correlation
>> structure under the model, and that correlation structure can be estimated.
>>
>> An imperfect analogy from least-squares regression is the correlation
>> structure of residual estimates, induced by the model.  We say that the
> errors are independent, but the model creates a (modest) correlation
> structure that can be measured, again, conditional on the model.
>>
>
> Well, the residuals are random variables and we can show that at the least
> squares estimates of the parameters they will have a known Gaussian
> distribution which, it turns out, doesn't depend on the values of the
> coefficients.  But those are the easy cases.  In the linear mixed model we
> still have a Gaussian distribution and a linear predictor but that is for
> the conditional distribution of the response given the random effects.  For
> the complete model things get much messier.
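
[A toy illustration of that point about residuals (invented data, nothing
from the thread): the least-squares residuals are e = (I - H) y with hat
matrix H = X (X'X)^{-1} X', so their covariance sigma^2 (I - H) depends on
the model matrix but not on the coefficients.

set.seed(1)
X <- cbind(1, rnorm(10))               # model matrix with intercept
H <- X %*% solve(crossprod(X), t(X))   # hat matrix
y <- X %*% c(2, 3) + rnorm(10)         # response, with beta = (2, 3)
e <- (diag(10) - H) %*% y              # residuals, e = (I - H) y
max(abs(e - resid(lm(y ~ X[, 2]))))    # agrees with lm(): ~ 1e-15
]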
>
> I'm not making these points just to be difficult.  I have spent a lot of
> time thinking about these models and trying to come up with a coherent way
> of describing them.  Along the way I have come to the conclusion that the
> way these models are often described is, well, wrong.  And those
> descriptions include some that I have written.  For example, you often see
> the model described as the linear predictor for an observation plus a
> "noise" term, epsilon, and the statement that the distribution of the
> random effects is independent of the distribution of the noise term.  I now
> view the linear predictor as a part of the conditional distribution of the
> response given the random effects so it wouldn't make sense to talk about
> these distributions being independent.  The biggest pitfall in transferring
> your thinking from a linear model to any other kind (GLM, LMM, GLMM) is the
> fact that we can make sense of a Gaussian distribution minus its mean, so we
> write the linear model in the "signal plus noise" form as Y = X\beta +
> \epsilon where Y is an n-dimensional random variable, X is the n by p model
> matrix, \beta is the p-dimensional vector of coefficients and \epsilon is
> an n-dimensional Gaussian with mean zero.  That doesn't work in the other
> cases, despite the heroic attempts of many people to write things in that
> way.
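
[One way to see the distinction is to write down how such a model would be
simulated; here a hypothetical Poisson GLMM with one grouping factor. The
random effects are drawn first, and the response is then drawn from its
conditional distribution given them - there is no additive epsilon term to
split off.

set.seed(1)
ngrp <- 5; nper <- 4
b   <- rnorm(ngrp, mean = 0, sd = 0.5)   # random effects: B ~ N(0, sigma_b^2)
g   <- rep(seq_len(ngrp), each = nper)   # grouping factor
eta <- 1 + b[g]                          # linear predictor (beta_0 = 1)
y   <- rpois(ngrp * nper, exp(eta))      # Y | B = b ~ Poisson(exp(eta))
]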
>
> Here endeth the sermon.
>
>
>> On Wed, Feb 6, 2013 at 6:34 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
>>
>>> It is possible to create the correlation matrix of the fixed effects and
>>> the random effects jointly using the results from lmer but I have
>>> difficulty deciding what this would represent statistically.  If you
>>> adopt a Bayesian approach then everything works because the fixed effects
>>> and the random effects are both random variables.  In the classical
>>> statistical approach (i.e. sampling theory), however, I haven't been able
>>> to make sense of the relationship between a model parameter and an
>>> unobserved random variable.  They exist in different worlds.  You can
>>> talk about the probability model for the random effects and describe the
>>> conditional distribution of the random effects given the observed data
>>> and assumed values of the parameters.  Remember that in probability land
>>> the parameters are known.  You can also talk about the distribution of
>>> the estimators of the parameters given the observed data.  But I am at a
>>> loss to describe a combination of parameter estimators and a conditional
>>> distribution given the observed data with known parameters.
>>>
>>> I appreciate that people want to know these quantities but without a
>>> Bayesian model I can't think of what they are.
>>>
>>>
>>> On Tue, Feb 5, 2013 at 12:49 PM, Doran, Harold <HDoran at air.org> wrote:
>>>
>>> > Ben,
>>> >
>>> > I'm not sure the predictions of the conditional modes and the estimates
>>> > of the fixed effects are orthogonal. I don't think it's possible to spit
>>> > out that covariance from lmer(), though I could be wrong.
>>> >
>>> > My own software for fitting mixed models uses Henderson's method. Under
>>> > this framework, I output the matrix giving the covariance between the
>>> > fixed effects and the BLUPs.
>>> > The easiest way to see this is to go to this wiki page and look at the
>>> > eqn under "Estimation".
>>> >
>>> > The matrix in the upper right block (X'R^{-1}Z) is the one giving those
>>> > covariances.
>>> >
>>> > http://en.wikipedia.org/wiki/Mixed_model
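
[For reference, Henderson's mixed-model equations solve

  [ X'R^{-1}X   X'R^{-1}Z          ] [ beta-hat ]   [ X'R^{-1}y ]
  [ Z'R^{-1}X   Z'R^{-1}Z + G^{-1} ] [ u-hat    ] = [ Z'R^{-1}y ]

and the inverse of the coefficient matrix carries the joint (co)variances
of beta-hat and the prediction errors u-hat - u. A toy sketch with invented
X, Z, G and R (not Harold's software, just the textbook construction):

set.seed(1)
n <- 6; p <- 2; q <- 3
X  <- cbind(1, rnorm(n))                              # fixed-effects design
Z  <- model.matrix(~ 0 + factor(rep(1:q, each = 2)))  # random-effects design
G  <- diag(0.5, q); R <- diag(1, n)                   # variance components
y  <- rnorm(n)
Ri <- solve(R)
C  <- rbind(cbind(t(X) %*% Ri %*% X, t(X) %*% Ri %*% Z),
            cbind(t(Z) %*% Ri %*% X, t(Z) %*% Ri %*% Z + solve(G)))
sol <- solve(C, rbind(t(X) %*% Ri %*% y, t(Z) %*% Ri %*% y))  # beta-hat, u-hat
V   <- solve(C)              # joint (co)variance matrix
V[1:p, (p + 1):(p + q)]      # block coupling beta-hat with the BLUPs
]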
>>> >
>>> > > -----Original Message-----
>>> > > From: r-help-bounces at r-project.org
>>> > > [mailto:r-help-bounces at r-project.org]
>>> > > On Behalf Of Ben Bolker
>>> > > Sent: Tuesday, February 05, 2013 1:31 PM
>>> > > To: r-help at stat.math.ethz.ch
>>> > > Subject: Re: [R] lmer - BLUP prediction intervals
>>> > >
>>> > > Daniel Caro <dcarov <at> gmail.com> writes:
>>> > >
>>> > > >
>>> > > > Dear all
>>> > > >
>>> > > > I have a model that looks like this:
>>> > > >
>>> > > > m1 <- lmer(Difference ~ 1 + (1 | Examiner) + (1 | Item),
>>> > > >            data = englisho.data)
>>> > > >
>>> > > > I know it is not possible to estimate the random effects themselves,
>>> > > > but one can obtain their conditional modes (the BLUPs) with
>>> > > >
>>> > > > re1 <- ranef(m1, postVar = TRUE)
>>> > > >
>>> > > > and then dotplot(re1) for the examiner and item levels gives me a
>>> > > > nice prediction interval. But I would like to have the prediction
>>> > > > interval for the individual intercepts, not for the conditional
>>> > > > modes of the random effects: that is, for the fixed effect (overall
>>> > > > estimated intercept) + the conditional mode of the random effect
>>> > > > (at the examiner or item level). Does this make sense? And if so,
>>> > > > how would I calculate it? I'd like to do the same thing to obtain
>>> > > > prediction intervals for individual growth rates in longitudinal
>>> > > > models (i.e., overall growth rate + random effect).
>>> > >
>>> > >   I think this belongs on the r-sig-mixed-models at r-project.org
>>> > > list. Could you please re-post it there?  (I would redirect it myself
>>> > > but am reading via gmane ...)  For a start, I would probably assume
>>> > > independence of the uncertainty in the conditional modes and in the
>>> > > overall slope parameter, and compute the overall variance by adding
>>> > > the variances ... ?  (Not sure that's right.)
>>> > >
>>> > >   Ben Bolker
>>> > >
>>
>> --
>> Andrew Robinson
>> Director (A/g), ACERA
>> Senior Lecturer in Applied Statistics
>> Department of Mathematics and Statistics
>> University of Melbourne, VIC 3010 Australia
>> Tel: +61-3-8344-6410    Fax: +61-3-8344-4599
>> Email: a.robinson at ms.unimelb.edu.au
>> Website: http://www.ms.unimelb.edu.au
>>
>> FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
>> SPuR: http://www.ms.unimelb.edu.au/spuRs/