[R-sig-ME] [R] lmer - BLUP prediction intervals

Jarrod Hadfield j.hadfield at ed.ac.uk
Fri Feb 8 16:18:30 CET 2013


Dear Daniel,

Cinv should be symmetrical (although it might differ slightly because  
of rounding errors when inverting). The row/columns should be   
[overall intercept,
,uitem1...uitem24, uexaminer1...uexaminer22] for your m1 model in this  
posting. However, in your original posting you fitted item/examiner in  
a different order:

m1 <- lmer(Difference ~ 1+  (1|Examiner) + (1|Item), data=englisho.data)

and it was for this model for which my Cinv code would work.

I thought

attr(ranef(m1, postVar=T)[["Examiner"]], "postVar")[1,1,]

would be equal to diag(Cinv[1+1:ne]) (from the original specification:  
Examiner before Item) but actually it seems closer to something like  
this:

  C = t(W)%*%W/Vr+bdiag(Diagonal(nb)*0,Diagonal(ne)/Ve,Diagonal(ni)/Vi)

  Cinv = solve(C[-c(1:nb),-c(1:nb)])

  diag(Cinv)[1:ne]

I don't think this is a good thing - it assumes the fixed effects, as  
well as the variance parameters, are known without error.

Cheers,

Jarrod









Quoting Daniel Caro <dcarov at gmail.com> on Fri, 8 Feb 2013 14:32:53 +0000:

> Dear Jarrod
>
> Thank you for the practical advice and thank you all for the
> interesting comments.
>
> I was able to calculate Cinv. In my model,
>
> m1 <- lmer(Difference ~ 1+ (1|Item) + (1|Examiner), data=englisho.data)
>
> this matrix is of size 47x47, which is equal to the number of Items
> (ni=24) + number of Examiners (ne=22) + number of fixed effects
> (nb=1). Or p (1)+q (46) in the lme4 manual. Cinv is not symmetric and
> I am wondering how to interpret the position of covariances in the
> matrix. I thought the row/columns would be: [overall intercept,
> uitem1...uitem24, uexaminer1...uexaminer22]. But it seems not? And is
> the diagonal for the random effects for the examiners, for example,
> comparable with the results of
>
> attr(ranef(m1, postVar=T)[["Examiner"]], "postVar")[1,1,]
>
> ?
>
> Again, thank you for your help.
>
> All the best,
> Daniel
>
> On Thu, Feb 7, 2013 at 11:01 PM, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
>> Dear Daniel,
>>
>> To give a practical, rather than philosophical, solution to your problem you
>> could perhaps try this....
>>
>> W<-cBind(X,Z)
>>
>> Cinv =
>> solve(t(W)%*%W/Vr+bdiag(Diagonal(nb)*0,Diagonal(ne)/Ve,Diagonal(ni)/Vi))
>>
>>
>> where Vr is the residual variance, nb is the number of fixed effects, ne is
>> the number of examiners and Ve the examiner variance estimate and ni is the
>> number of items and Vi the item variance estimate.
>>
>> Cinv is the (co)variance matrix of the fixed and random effects (stacked on
>> top of each other).  If you take the sub-matrix of Cinv that pertains to the
>> effects in your prediction, then summing the elements of that sub-matrix
>> will give you something that is close to the prediction variance. In general
>> the sub-matrix will not be diagonal (i.e. there are covariances between the
>> random effects and the things we are *calling* fixed effects). Vr, Ve and Vn
>> are assumed known. If you want to include their uncertainty in the
>> prediction interval you will probably have to wear the B-badge more visibly
>> on your sleeve I'm afraid.
>>
>> Cheers,
>>
>> Jarrod
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Quoting "Doran, Harold" <HDoran at air.org> on Thu, 7 Feb 2013 10:02:49 -0500:
>>
>>> Doug:
>>>
>>> Sorry if this is too simple a question. Call \hat{\beta}} the estimate of
>>> a fixed parameter and \hat{b_i}  the BLUP for unit i.
>>>
>>> If the following exists
>>>
>>> \hat{\theta} = \hat{\beta}} + \hat{b_i}
>>>
>>> Then, does var(\hat{\theta}) also exist?
>>>
>>>
>>> From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas
>>> Bates
>>> Sent: Thursday, February 07, 2013 9:57 AM
>>> To: Doran, Harold
>>> Cc: Andrew Robinson; dcarov at gmail.com; r-sig-mixed-models at r-project.org;
>>> Ben Bolker
>>> Subject: Re: [R-sig-ME] [R] lmer - BLUP prediction intervals
>>>
>>> On Wed, Feb 6, 2013 at 9:36 AM, Doran, Harold
>>> <HDoran at air.org<mailto:HDoran at air.org>> wrote:
>>> Andrew
>>>
>>> Ignoring the important theoretical question for just a moment on whether
>>> it is sensible to do this, there is a covariance term between the BLUPs and
>>> the fixed effects.
>>>
>>> But the picky mathematician in me can't understand in what distribution
>>> this covariance occurs.  It makes sense in the Bayesian formulation but not
>>> in a classical (sampling theory) formulation.  The distribution of the
>>> estimator of the fixed effects for known values of the parameters is a
>>> multivariate normal that depends on \beta, \sigma^2 and \Sigma, the model
>>> matrices X and Z being known.  The random variable B doesn't enter into it.
>>> (One way of writing this variance-covariance of this multivariate normal is
>>> X'V^{-1}X where V is that matrix that involves Z and Sigma^{-1} - I have
>>> forgotten the exact form).
>>>
>>> I know these considerations sound like needless theoretical niceties but
>>> to me they're not. I have to be able to formulate the theoretical basis
>>> before I can make sense of the computational results and, after 20 years or
>>> so, I'm still having trouble making sense of this.
>>>
>>>
>>> If that term exists under the model, but it is ignored for purposes of
>>> variance estimation, what would be the reason for doing so?
>>>
>>>> -----Original Message-----
>>>> From:
>>>> r-sig-mixed-models-bounces at r-project.org<mailto:r-sig-mixed-models-bounces at r-project.org>
>>>> [mailto:r-sig-mixed-<mailto:r-sig-mixed->
>>>> models-bounces at r-project.org<mailto:models-bounces at r-project.org>] On
>>>> Behalf Of Andrew Robinson
>>>> Sent: Tuesday, February 05, 2013 9:15 PM
>>>> To: dcarov at gmail.com<mailto:dcarov at gmail.com>
>>>> Cc:
>>>> r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>;
>>>> Ben Bolker
>>>> Subject: Re: [R-sig-ME] [R] lmer - BLUP prediction intervals
>>>>
>>>> I think that it is a reasonable way to proceed just so long as you
>>>> interpret the
>>>> intervals guardedly and document your assumptions carefully.
>>>>
>>>> Cheers
>>>>
>>>> Andrew
>>>>
>>>> On Wednesday, February 6, 2013, Daniel Caro wrote:
>>>>
>>>> > Dear all
>>>> >
>>>> > I have not been able to follow the discussion. But I would like to
>>>> > know if it makes sense to calculate prediction intervals like this:
>>>> >
>>>> > var(fixed effect+random effect)= var(fixed effect) + var(random
>>>> > effect) + 0 (i.e., the cov is zero)
>>>> >
>>>> > and based on this create the prediction intervals. Does this make
>>>> > sense?
>>>> >
>>>> > All the best,
>>>> > Daniel
>>>> >
>>>> > On Tue, Feb 5, 2013 at 8:54 PM, Douglas Bates
>>>> > <bates at stat.wisc.edu<mailto:bates at stat.wisc.edu>>
>>>>
>>>> wrote:
>>>> > > On Tue, Feb 5, 2013 at 2:14 PM, Andrew Robinson <
>>>> > > A.Robinson at ms.unimelb.edu.au<mailto:A.Robinson at ms.unimelb.edu.au>>
>>>> > > wrote:
>>>> > >
>>>> > >> I'd have thought that the joint correlation matrix would be of the
>>>> > >> estimates of the fixed effects and the random effects, rather than
>>>> > >> the things themselves.
>>>> > >>
>>>> > >
>>>> > > Well, it may be because I have turned into a grumpy old man but I
>>>> > > get
>>>> > picky
>>>> > > about terminology and the random effects are not parameters - they
>>>> > > are unobserved random variables.  They don't have "estimates" in the
>>>> > > sense of parameter estimates.  The quantities returned by the ranef
>>>> > > function are
>>>> > the
>>>> > > conditional means (in the case of a linear mixed model, conditional
>>>> > > modes in general) of the random effects given the observed data
>>>> > > evaluated with the parameters at their estimated values. In the
>>>> > > Bayesian point of view none of this is problematic because they're
>>>> > > all random variables but otherwise I struggle with the
>>>> > > interpretation of how these can be
>>>> > considered
>>>> > > jointly.   If you want to consider the distribution of the random
>>>> > > effects
>>>> > > you need to have known values of the parameters.
>>>> > >
>>>> > >
>>>> > >> The estimates are statistical quantities, with specified
>>>> > >> distributions, under the model.  The model posits these different
>>>> > >> roles (parameter,
>>>> > random
>>>> > >> variable) for the quantities that are the targets of the estimates,
>>>> > >> but
>>>> > the
>>>> > >> estimates are just estimates, and as such, they have a correlation
>>>> > >> structure under the model, and that correlation structure can be
>>>> > estimated.
>>>> > >>
>>>> > >> An imperfect analogy from least-squares regression is the
>>>> > >> correlation structure of residual estimates, induced by the model.
>>>> > >> We say that the errors are independent, but the model creates a
>>>> > >> (modest) correlation structure than can be measured, again,
>>>> > >> conditional
>>>> on the model.
>>>> > >>
>>>> > >
>>>> > > Well the residuals are random variables and we can show that at the
>>>> > > least squares estimates of the parameters they will have a known
>>>> > > Gaussian distribution which, it turns out, doesn't depend on the
>>>> > > values of the coefficients.  But those are the easy cases.  In the
>>>> > > linear mixed model
>>>> > we
>>>> > > still have a Gaussian distribution and a linear predictor but that
>>>> > > is for the conditional distribution of the response given the random
>>>> effects.
>>>> >  For
>>>> > > the complete model things get much messier.
>>>> > >
>>>> > > I'm not making these points just to be difficult.  I have spent a
>>>> > > lot of time thinking about these models and trying to come up with a
>>>> > > coherent
>>>> > way
>>>> > > of describing them.  Along the way I have come to the conclusion
>>>> > > that the way these models are often described is, well, wrong.  And
>>>> > > those descriptions include some that I have written.  For example,
>>>> > > you often
>>>> > see
>>>> > > the model described as the linear predictor for an observation plus
>>>> > > a "noise" term, epsilon, and the statement that the distribution of
>>>> > > the random effects is independent of the distribution of the noise
>>>> > > term.  I
>>>> > now
>>>> > > view the linear predictor as a part of the conditional distribution
>>>> > > of
>>>> > the
>>>> > > response given the random effects so it wouldn't make sense to talk
>>>> > > about these distributions being independent.  The biggest pitfall in
>>>> > transferring
>>>> > > your thinking from a linear model to any other kind (GLM, LMM, GLMM)
>>>> > > is
>>>> > the
>>>> > > fact that we can make sense of a Gaussian distribution minus its
>>>> > > mean so
>>>> > we
>>>> > > write the linear model in the "signal plus noise" form as Y = X\beta
>>>> > > + \epsilon where Y is an n-dimensional random variable, X is the n
>>>> > > by p
>>>> > model
>>>> > > matrix, \beta is the p-dimensional vector of coefficients and
>>>> > > \epsilon is an n-dimensional Gaussian with mean zero.  That doesn't
>>>> > > work in the other cases, despite the heroic attempts of many people
>>>> > > to write things in that way.
>>>> > >
>>>> > > Here endeth the sermon.
>>>> > >
>>>> > >
>>>> > >> On Wed, Feb 6, 2013 at 6:34 AM, Douglas Bates
>>>> > >> <bates at stat.wisc.edu<mailto:bates at stat.wisc.edu>>
>>>>
>>>> > wrote:
>>>> > >>
>>>> > >>> It is possible to create the correlation matrix of the fixed
>>>> > >>> effects
>>>> > and
>>>> > >>> the random effects jointly using the results from lmer but I have
>>>> > >>> difficulty deciding what this would represent statistically.  If
>>>> > >>> you
>>>> > adopt
>>>> > >>> a B
>>>>
>>>>
>>>>
>>>> --
>>>> Andrew Robinson
>>>> Director (A/g), ACERA
>>>> Senior Lecturer in Applied Statistics                      Tel:
>>>> +61-3-8344-6410<tel:%2B61-3-8344-6410>
>>>> Department of Mathematics and Statistics            Fax: +61-3-8344
>>>> 4599<tel:%2B61-3-8344%204599>
>>>>
>>>> University of Melbourne, VIC 3010 Australia
>>>> Email: a.robinson at ms.unimelb.edu.au<mailto:a.robinson at ms.unimelb.edu.au>
>>>> Website:
>>>>
>>>> http://www.ms.unimelb.edu.au
>>>>
>>>> FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
>>>> SPuR: http://www.ms.unimelb.edu.au/spuRs/
>>>>
>>>>       [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org>
>>>> mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org>
>>> mailing list
>>>
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>>
>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>>
>
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



More information about the R-sig-mixed-models mailing list