# [R-sig-ME] [R] lmer - BLUP prediction intervals

Daniel Caro dcarov at gmail.com
Fri Feb 8 16:32:17 CET 2013

Dear Jarrod

Thank you so much and sorry for the confusion. I recalculated Cinv
with the original specification

m1 <- lmer(Difference ~ 1+  (1|Examiner) + (1|Item), data=englisho.data)

and it is now symmetrical. I will assume the following ordering
[overall intercept, uexaminer1...uexaminer22, uitem1...uitem24] and
will use attr(ranef(m1, postVar=T)[["Examiner"]], "postVar")[1,1,] for
the variances of random effects and Cinv for the covariances with
fixed effects.

All the best,
Daniel

On Fri, Feb 8, 2013 at 3:18 PM, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
> Dear Daniel,
>
> Cinv should be symmetrical (although it might differ slightly because of
> rounding errors when inverting). The row/columns should be  [overall
> intercept,
> ,uitem1...uitem24, uexaminer1...uexaminer22] for your m1 model in this
> posting. However, in your original posting you fitted item/examiner in a
> different order:
>
>
> m1 <- lmer(Difference ~ 1+  (1|Examiner) + (1|Item), data=englisho.data)
>
> and it was for this model for which my Cinv code would work.
>
> I thought
>
>
> attr(ranef(m1, postVar=T)[["Examiner"]], "postVar")[1,1,]
>
> would be equal to diag(Cinv[1+1:ne]) (from the original specification:
> Examiner before Item) but actually it seems closer to something like this:
>
>  C = t(W)%*%W/Vr+bdiag(Diagonal(nb)*0,Diagonal(ne)/Ve,Diagonal(ni)/Vi)
>
>  Cinv = solve(C[-c(1:nb),-c(1:nb)])
>
>  diag(Cinv)[1:ne]
>
> I don't think this is a good thing - it assumes the fixed effects, as well
> as the variance parameters, are known without error.
>
> Cheers,
>
> Jarrod
>
>
>
>
>
>
>
>
>
>
> Quoting Daniel Caro <dcarov at gmail.com> on Fri, 8 Feb 2013 14:32:53 +0000:
>
>> Dear Jarrod
>>
>> Thank you for the practical advice and thank you all for the
>>
>> I was able to calculate Cinv. In my model,
>>
>> m1 <- lmer(Difference ~ 1+ (1|Item) + (1|Examiner), data=englisho.data)
>>
>> this matrix is of size 47x47, which is equal to the number of Items
>> (ni=24) + number of Examiners (ne=22) + number of fixed effects
>> (nb=1). Or p (1)+q (46) in the lme4 manual. Cinv is not symmetric and
>> I am wondering how to interpret the position of covariances in the
>> matrix. I thought the row/columns would be: [overall intercept,
>> uitem1...uitem24, uexaminer1...uexaminer22]. But it seems not? And is
>> the diagonal for the random effects for the examiners, for example,
>> comparable with the results of
>>
>> attr(ranef(m1, postVar=T)[["Examiner"]], "postVar")[1,1,]
>>
>> ?
>>
>> Again, thank you for your help.
>>
>> All the best,
>> Daniel
>>
>> On Thu, Feb 7, 2013 at 11:01 PM, Jarrod Hadfield <j.hadfield at ed.ac.uk>
>> wrote:
>>>
>>> Dear Daniel,
>>>
>>> To give a practical, rather than philosophical, solution to your problem
>>> you
>>> could perhaps try this....
>>>
>>> W<-cBind(X,Z)
>>>
>>> Cinv =
>>> solve(t(W)%*%W/Vr+bdiag(Diagonal(nb)*0,Diagonal(ne)/Ve,Diagonal(ni)/Vi))
>>>
>>>
>>> where Vr is the residual variance, nb is the number of fixed effects, ne
>>> is
>>> the number of examiners and Ve the examiner variance estimate and ni is
>>> the
>>> number of items and Vi the item variance estimate.
>>>
>>> Cinv is the (co)variance matrix of the fixed and random effects (stacked
>>> on
>>> top of each other).  If you take the sub-matrix of Cinv that pertains to
>>> the
>>> effects in your prediction, then summing the elements of that sub-matrix
>>> will give you something that is close to the prediction variance. In
>>> general
>>> the sub-matrix will not be diagonal (i.e. there are covariances between
>>> the
>>> random effects and the things we are *calling* fixed effects). Vr, Ve and
>>> Vn
>>> are assumed known. If you want to include their uncertainty in the
>>> prediction interval you will probably have to wear the B-badge more
>>> visibly
>>> on your sleeve I'm afraid.
>>>
>>> Cheers,
>>>
>>> Jarrod
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Quoting "Doran, Harold" <HDoran at air.org> on Thu, 7 Feb 2013 10:02:49
>>> -0500:
>>>
>>>> Doug:
>>>>
>>>> Sorry if this is too simple a question. Call \hat{\beta}} the estimate
>>>> of
>>>> a fixed parameter and \hat{b_i}  the BLUP for unit i.
>>>>
>>>> If the following exists
>>>>
>>>> \hat{\theta} = \hat{\beta}} + \hat{b_i}
>>>>
>>>> Then, does var(\hat{\theta}) also exist?
>>>>
>>>>
>>>> From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas
>>>> Bates
>>>> Sent: Thursday, February 07, 2013 9:57 AM
>>>> To: Doran, Harold
>>>> Cc: Andrew Robinson; dcarov at gmail.com; r-sig-mixed-models at r-project.org;
>>>> Ben Bolker
>>>> Subject: Re: [R-sig-ME] [R] lmer - BLUP prediction intervals
>>>>
>>>> On Wed, Feb 6, 2013 at 9:36 AM, Doran, Harold
>>>> <HDoran at air.org<mailto:HDoran at air.org>> wrote:
>>>> Andrew
>>>>
>>>> Ignoring the important theoretical question for just a moment on whether
>>>> it is sensible to do this, there is a covariance term between the BLUPs
>>>> and
>>>> the fixed effects.
>>>>
>>>> But the picky mathematician in me can't understand in what distribution
>>>> this covariance occurs.  It makes sense in the Bayesian formulation but
>>>> not
>>>> in a classical (sampling theory) formulation.  The distribution of the
>>>> estimator of the fixed effects for known values of the parameters is a
>>>> multivariate normal that depends on \beta, \sigma^2 and \Sigma, the
>>>> model
>>>> matrices X and Z being known.  The random variable B doesn't enter into
>>>> it.
>>>> (One way of writing this variance-covariance of this multivariate normal
>>>> is
>>>> X'V^{-1}X where V is that matrix that involves Z and Sigma^{-1} - I have
>>>> forgotten the exact form).
>>>>
>>>> I know these considerations sound like needless theoretical niceties but
>>>> to me they're not. I have to be able to formulate the theoretical basis
>>>> before I can make sense of the computational results and, after 20 years
>>>> or
>>>> so, I'm still having trouble making sense of this.
>>>>
>>>>
>>>> If that term exists under the model, but it is ignored for purposes of
>>>> variance estimation, what would be the reason for doing so?
>>>>
>>>>> -----Original Message-----
>>>>> From:
>>>>>
>>>>> r-sig-mixed-models-bounces at r-project.org<mailto:r-sig-mixed-models-bounces at r-project.org>
>>>>> [mailto:r-sig-mixed-<mailto:r-sig-mixed->
>>>>> models-bounces at r-project.org<mailto:models-bounces at r-project.org>] On
>>>>> Behalf Of Andrew Robinson
>>>>> Sent: Tuesday, February 05, 2013 9:15 PM
>>>>> To: dcarov at gmail.com<mailto:dcarov at gmail.com>
>>>>> Cc:
>>>>>
>>>>> r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>;
>>>>> Ben Bolker
>>>>> Subject: Re: [R-sig-ME] [R] lmer - BLUP prediction intervals
>>>>>
>>>>> I think that it is a reasonable way to proceed just so long as you
>>>>> interpret the
>>>>> intervals guardedly and document your assumptions carefully.
>>>>>
>>>>> Cheers
>>>>>
>>>>> Andrew
>>>>>
>>>>> On Wednesday, February 6, 2013, Daniel Caro wrote:
>>>>>
>>>>> > Dear all
>>>>> >
>>>>> > I have not been able to follow the discussion. But I would like to
>>>>> > know if it makes sense to calculate prediction intervals like this:
>>>>> >
>>>>> > var(fixed effect+random effect)= var(fixed effect) + var(random
>>>>> > effect) + 0 (i.e., the cov is zero)
>>>>> >
>>>>> > and based on this create the prediction intervals. Does this make
>>>>> > sense?
>>>>> >
>>>>> > All the best,
>>>>> > Daniel
>>>>> >
>>>>> > On Tue, Feb 5, 2013 at 8:54 PM, Douglas Bates
>>>>> > <bates at stat.wisc.edu<mailto:bates at stat.wisc.edu>>
>>>>>
>>>>> wrote:
>>>>> > > On Tue, Feb 5, 2013 at 2:14 PM, Andrew Robinson <
>>>>> > > A.Robinson at ms.unimelb.edu.au<mailto:A.Robinson at ms.unimelb.edu.au>>
>>>>> > > wrote:
>>>>> > >
>>>>> > >> I'd have thought that the joint correlation matrix would be of the
>>>>> > >> estimates of the fixed effects and the random effects, rather than
>>>>> > >> the things themselves.
>>>>> > >>
>>>>> > >
>>>>> > > Well, it may be because I have turned into a grumpy old man but I
>>>>> > > get
>>>>> > picky
>>>>> > > about terminology and the random effects are not parameters - they
>>>>> > > are unobserved random variables.  They don't have "estimates" in
>>>>> > > the
>>>>> > > sense of parameter estimates.  The quantities returned by the ranef
>>>>> > > function are
>>>>> > the
>>>>> > > conditional means (in the case of a linear mixed model, conditional
>>>>> > > modes in general) of the random effects given the observed data
>>>>> > > evaluated with the parameters at their estimated values. In the
>>>>> > > Bayesian point of view none of this is problematic because they're
>>>>> > > all random variables but otherwise I struggle with the
>>>>> > > interpretation of how these can be
>>>>> > considered
>>>>> > > jointly.   If you want to consider the distribution of the random
>>>>> > > effects
>>>>> > > you need to have known values of the parameters.
>>>>> > >
>>>>> > >
>>>>> > >> The estimates are statistical quantities, with specified
>>>>> > >> distributions, under the model.  The model posits these different
>>>>> > >> roles (parameter,
>>>>> > random
>>>>> > >> variable) for the quantities that are the targets of the
>>>>> > >> estimates,
>>>>> > >> but
>>>>> > the
>>>>> > >> estimates are just estimates, and as such, they have a correlation
>>>>> > >> structure under the model, and that correlation structure can be
>>>>> > estimated.
>>>>> > >>
>>>>> > >> An imperfect analogy from least-squares regression is the
>>>>> > >> correlation structure of residual estimates, induced by the model.
>>>>> > >> We say that the errors are independent, but the model creates a
>>>>> > >> (modest) correlation structure than can be measured, again,
>>>>> > >> conditional
>>>>> on the model.
>>>>> > >>
>>>>> > >
>>>>> > > Well the residuals are random variables and we can show that at the
>>>>> > > least squares estimates of the parameters they will have a known
>>>>> > > Gaussian distribution which, it turns out, doesn't depend on the
>>>>> > > values of the coefficients.  But those are the easy cases.  In the
>>>>> > > linear mixed model
>>>>> > we
>>>>> > > still have a Gaussian distribution and a linear predictor but that
>>>>> > > is for the conditional distribution of the response given the
>>>>> > > random
>>>>> effects.
>>>>> >  For
>>>>> > > the complete model things get much messier.
>>>>> > >
>>>>> > > I'm not making these points just to be difficult.  I have spent a
>>>>> > > lot of time thinking about these models and trying to come up with
>>>>> > > a
>>>>> > > coherent
>>>>> > way
>>>>> > > of describing them.  Along the way I have come to the conclusion
>>>>> > > that the way these models are often described is, well, wrong.  And
>>>>> > > those descriptions include some that I have written.  For example,
>>>>> > > you often
>>>>> > see
>>>>> > > the model described as the linear predictor for an observation plus
>>>>> > > a "noise" term, epsilon, and the statement that the distribution of
>>>>> > > the random effects is independent of the distribution of the noise
>>>>> > > term.  I
>>>>> > now
>>>>> > > view the linear predictor as a part of the conditional distribution
>>>>> > > of
>>>>> > the
>>>>> > > response given the random effects so it wouldn't make sense to talk
>>>>> > > about these distributions being independent.  The biggest pitfall
>>>>> > > in
>>>>> > transferring
>>>>> > > your thinking from a linear model to any other kind (GLM, LMM,
>>>>> > > GLMM)
>>>>> > > is
>>>>> > the
>>>>> > > fact that we can make sense of a Gaussian distribution minus its
>>>>> > > mean so
>>>>> > we
>>>>> > > write the linear model in the "signal plus noise" form as Y =
>>>>> > > X\beta
>>>>> > > + \epsilon where Y is an n-dimensional random variable, X is the n
>>>>> > > by p
>>>>> > model
>>>>> > > matrix, \beta is the p-dimensional vector of coefficients and
>>>>> > > \epsilon is an n-dimensional Gaussian with mean zero.  That doesn't
>>>>> > > work in the other cases, despite the heroic attempts of many people
>>>>> > > to write things in that way.
>>>>> > >
>>>>> > > Here endeth the sermon.
>>>>> > >
>>>>> > >
>>>>> > >> On Wed, Feb 6, 2013 at 6:34 AM, Douglas Bates
>>>>> > >> <bates at stat.wisc.edu<mailto:bates at stat.wisc.edu>>
>>>>>
>>>>> > wrote:
>>>>> > >>
>>>>> > >>> It is possible to create the correlation matrix of the fixed
>>>>> > >>> effects
>>>>> > and
>>>>> > >>> the random effects jointly using the results from lmer but I have
>>>>> > >>> difficulty deciding what this would represent statistically.  If
>>>>> > >>> you
>>>>> > >>> a B
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Andrew Robinson
>>>>> Director (A/g), ACERA
>>>>> Senior Lecturer in Applied Statistics                      Tel:
>>>>> +61-3-8344-6410<tel:%2B61-3-8344-6410>
>>>>> Department of Mathematics and Statistics            Fax: +61-3-8344
>>>>> 4599<tel:%2B61-3-8344%204599>
>>>>>
>>>>> University of Melbourne, VIC 3010 Australia
>>>>> Email:
>>>>> a.robinson at ms.unimelb.edu.au<mailto:a.robinson at ms.unimelb.edu.au>
>>>>> Website:
>>>>>
>>>>> http://www.ms.unimelb.edu.au
>>>>>
>>>>> FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
>>>>> SPuR: http://www.ms.unimelb.edu.au/spuRs/
>>>>>
>>>>>       [[alternative HTML version deleted]]
>>>>>
>>>>> _______________________________________________
>>>>>
>>>>> R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org>
>>>>> mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>>
>>>> R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org>
>>>> mailing list
>>>>
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>
>>>>
>>>>         [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> R-sig-mixed-models at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>
>>>>
>>>
>>>
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>>
>>>
>>
>>
>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>