[R-sig-ME] [R] lmer - BLUP prediction intervals

Jarrod Hadfield j.hadfield at ed.ac.uk
Fri Feb 8 00:01:57 CET 2013


Dear Daniel,

To give a practical, rather than philosophical, solution to your  
problem you could perhaps try this....

W<-cBind(X,Z)

Cinv =  
solve(t(W)%*%W/Vr+bdiag(Diagonal(nb)*0,Diagonal(ne)/Ve,Diagonal(ni)/Vi))


where Vr is the residual variance, nb is the number of fixed effects,  
ne is the number of examiners and Ve the examiner variance estimate  
and ni is the number of items and Vi the item variance estimate.

Cinv is the (co)variance matrix of the fixed and random effects  
(stacked on top of each other).  If you take the sub-matrix of Cinv  
that pertains to the effects in your prediction, then summing the  
elements of that sub-matrix will give you something that is close to  
the prediction variance. In general the sub-matrix will not be  
diagonal (i.e. there are covariances between the random effects and  
the things we are *calling* fixed effects). Vr, Ve and Vn are assumed  
known. If you want to include their uncertainty in the prediction  
interval you will probably have to wear the B-badge more visibly on  
your sleeve I'm afraid.

Cheers,

Jarrod

















Quoting "Doran, Harold" <HDoran at air.org> on Thu, 7 Feb 2013 10:02:49 -0500:

> Doug:
>
> Sorry if this is too simple a question. Call \hat{\beta}} the  
> estimate of a fixed parameter and \hat{b_i}  the BLUP for unit i.
>
> If the following exists
>
> \hat{\theta} = \hat{\beta}} + \hat{b_i}
>
> Then, does var(\hat{\theta}) also exist?
>
>
> From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
> Sent: Thursday, February 07, 2013 9:57 AM
> To: Doran, Harold
> Cc: Andrew Robinson; dcarov at gmail.com;  
> r-sig-mixed-models at r-project.org; Ben Bolker
> Subject: Re: [R-sig-ME] [R] lmer - BLUP prediction intervals
>
> On Wed, Feb 6, 2013 at 9:36 AM, Doran, Harold  
> <HDoran at air.org<mailto:HDoran at air.org>> wrote:
> Andrew
>
> Ignoring the important theoretical question for just a moment on  
> whether it is sensible to do this, there is a covariance term  
> between the BLUPs and the fixed effects.
>
> But the picky mathematician in me can't understand in what  
> distribution this covariance occurs.  It makes sense in the Bayesian  
> formulation but not in a classical (sampling theory) formulation.   
> The distribution of the estimator of the fixed effects for known  
> values of the parameters is a multivariate normal that depends on  
> \beta, \sigma^2 and \Sigma, the model matrices X and Z being known.   
> The random variable B doesn't enter into it.  (One way of writing  
> this variance-covariance of this multivariate normal is X'V^{-1}X  
> where V is that matrix that involves Z and Sigma^{-1} - I have  
> forgotten the exact form).
>
> I know these considerations sound like needless theoretical niceties  
> but to me they're not. I have to be able to formulate the  
> theoretical basis before I can make sense of the computational  
> results and, after 20 years or so, I'm still having trouble making  
> sense of this.
>
>
> If that term exists under the model, but it is ignored for purposes  
> of variance estimation, what would be the reason for doing so?
>
>> -----Original Message-----
>> From:  
>> r-sig-mixed-models-bounces at r-project.org<mailto:r-sig-mixed-models-bounces at r-project.org>  
>> [mailto:r-sig-mixed-<mailto:r-sig-mixed->
>> models-bounces at r-project.org<mailto:models-bounces at r-project.org>]  
>> On Behalf Of Andrew Robinson
>> Sent: Tuesday, February 05, 2013 9:15 PM
>> To: dcarov at gmail.com<mailto:dcarov at gmail.com>
>> Cc:  
>> r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>;  
>> Ben Bolker
>> Subject: Re: [R-sig-ME] [R] lmer - BLUP prediction intervals
>>
>> I think that it is a reasonable way to proceed just so long as you  
>> interpret the
>> intervals guardedly and document your assumptions carefully.
>>
>> Cheers
>>
>> Andrew
>>
>> On Wednesday, February 6, 2013, Daniel Caro wrote:
>>
>> > Dear all
>> >
>> > I have not been able to follow the discussion. But I would like to
>> > know if it makes sense to calculate prediction intervals like this:
>> >
>> > var(fixed effect+random effect)= var(fixed effect) + var(random
>> > effect) + 0 (i.e., the cov is zero)
>> >
>> > and based on this create the prediction intervals. Does this make sense?
>> >
>> > All the best,
>> > Daniel
>> >
>> > On Tue, Feb 5, 2013 at 8:54 PM, Douglas Bates  
>> <bates at stat.wisc.edu<mailto:bates at stat.wisc.edu>>
>> wrote:
>> > > On Tue, Feb 5, 2013 at 2:14 PM, Andrew Robinson <
>> > >  
>> A.Robinson at ms.unimelb.edu.au<mailto:A.Robinson at ms.unimelb.edu.au>>  
>> wrote:
>> > >
>> > >> I'd have thought that the joint correlation matrix would be of the
>> > >> estimates of the fixed effects and the random effects, rather than
>> > >> the things themselves.
>> > >>
>> > >
>> > > Well, it may be because I have turned into a grumpy old man but I
>> > > get
>> > picky
>> > > about terminology and the random effects are not parameters - they
>> > > are unobserved random variables.  They don't have "estimates" in the
>> > > sense of parameter estimates.  The quantities returned by the ranef
>> > > function are
>> > the
>> > > conditional means (in the case of a linear mixed model, conditional
>> > > modes in general) of the random effects given the observed data
>> > > evaluated with the parameters at their estimated values. In the
>> > > Bayesian point of view none of this is problematic because they're
>> > > all random variables but otherwise I struggle with the
>> > > interpretation of how these can be
>> > considered
>> > > jointly.   If you want to consider the distribution of the  
>> random effects
>> > > you need to have known values of the parameters.
>> > >
>> > >
>> > >> The estimates are statistical quantities, with specified
>> > >> distributions, under the model.  The model posits these different
>> > >> roles (parameter,
>> > random
>> > >> variable) for the quantities that are the targets of the estimates,
>> > >> but
>> > the
>> > >> estimates are just estimates, and as such, they have a correlation
>> > >> structure under the model, and that correlation structure can be
>> > estimated.
>> > >>
>> > >> An imperfect analogy from least-squares regression is the
>> > >> correlation structure of residual estimates, induced by the model.
>> > >> We say that the errors are independent, but the model creates a
>> > >> (modest) correlation structure than can be measured, again, conditional
>> on the model.
>> > >>
>> > >
>> > > Well the residuals are random variables and we can show that at the
>> > > least squares estimates of the parameters they will have a known
>> > > Gaussian distribution which, it turns out, doesn't depend on the
>> > > values of the coefficients.  But those are the easy cases.  In the
>> > > linear mixed model
>> > we
>> > > still have a Gaussian distribution and a linear predictor but that
>> > > is for the conditional distribution of the response given the random
>> effects.
>> >  For
>> > > the complete model things get much messier.
>> > >
>> > > I'm not making these points just to be difficult.  I have spent a
>> > > lot of time thinking about these models and trying to come up with a
>> > > coherent
>> > way
>> > > of describing them.  Along the way I have come to the conclusion
>> > > that the way these models are often described is, well, wrong.  And
>> > > those descriptions include some that I have written.  For example,
>> > > you often
>> > see
>> > > the model described as the linear predictor for an observation plus
>> > > a "noise" term, epsilon, and the statement that the distribution of
>> > > the random effects is independent of the distribution of the noise
>> > > term.  I
>> > now
>> > > view the linear predictor as a part of the conditional distribution
>> > > of
>> > the
>> > > response given the random effects so it wouldn't make sense to talk
>> > > about these distributions being independent.  The biggest pitfall in
>> > transferring
>> > > your thinking from a linear model to any other kind (GLM, LMM, GLMM)
>> > > is
>> > the
>> > > fact that we can make sense of a Gaussian distribution minus its
>> > > mean so
>> > we
>> > > write the linear model in the "signal plus noise" form as Y = X\beta
>> > > + \epsilon where Y is an n-dimensional random variable, X is the n
>> > > by p
>> > model
>> > > matrix, \beta is the p-dimensional vector of coefficients and
>> > > \epsilon is an n-dimensional Gaussian with mean zero.  That doesn't
>> > > work in the other cases, despite the heroic attempts of many people
>> > > to write things in that way.
>> > >
>> > > Here endeth the sermon.
>> > >
>> > >
>> > >> On Wed, Feb 6, 2013 at 6:34 AM, Douglas Bates  
>> <bates at stat.wisc.edu<mailto:bates at stat.wisc.edu>>
>> > wrote:
>> > >>
>> > >>> It is possible to create the correlation matrix of the fixed
>> > >>> effects
>> > and
>> > >>> the random effects jointly using the results from lmer but I have
>> > >>> difficulty deciding what this would represent statistically.  If
>> > >>> you
>> > adopt
>> > >>> a B
>>
>>
>>
>> --
>> Andrew Robinson
>> Director (A/g), ACERA
>> Senior Lecturer in Applied Statistics                      Tel:
>> +61-3-8344-6410<tel:%2B61-3-8344-6410>
>> Department of Mathematics and Statistics            Fax: +61-3-8344  
>> 4599<tel:%2B61-3-8344%204599>
>> University of Melbourne, VIC 3010 Australia
>> Email:  
>> a.robinson at ms.unimelb.edu.au<mailto:a.robinson at ms.unimelb.edu.au>    
>>  Website:
>> http://www.ms.unimelb.edu.au
>>
>> FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
>> SPuR: http://www.ms.unimelb.edu.au/spuRs/
>>
>>       [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org>  
>> mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> _______________________________________________
> R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org>  
> mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>
> 	[[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



More information about the R-sig-mixed-models mailing list