# [R-sig-ME] [R] coef se in lme

Douglas Bates bates at stat.wisc.edu
Thu Oct 18 17:49:47 CEST 2007

On 10/18/07, Doran, Harold <HDoran at air.org> wrote:
> I'm particularly interested in this, so I'll bite. Let me start with
> something very basic and build up later as I think more about this.

> Assume we have a model such as

> Y_ij = \mu + a_i + e_ij \quad a_i \sim N(0, s^2), e_ij \sim N(0, r^2)

> In this basic model, we also assume the variance components are
> orthogonal, denoted e_ij \bot a_i

> Now, from ranef() I get the estimate of a for all i (what Doug calls the
> conditional modes) and, also using the postVar argument in this
> extractor function I get variances of these conditional modes.

> Now, vcov() would return the variance for the fixed parameter \mu.

> Given that we can find the variances for each term in the model, and
> that they are assumed orthogonal, why is it that we can't simply take
> the variance of a linear combination?

Because the conditional distribution of the random effects (given the
data) depends on the fixed effects.  If you change \mu you will get
different conditonal modes for the a_i and you must take this into
account when considering the variation in the combination you show
below.

> For example, if I want to find:

> \hat{y}_ij = \mu + a_i
> Then the variance of this would be simply

> Var(\hat{y}_ij) = var(\mu) + var(a_i)

That would apply if \hat{\mu} and a_i were independent, but they aren't.

For this example (and thank you for suggesting a simple example for
discussion), the calculation I suggested would be conditional on the
ratio s/r or, equivalently, the ratio s^2/r^2.  I refer to this as the
"relative variance" of the random effects or, in more general cases,
the relative variance-covariance of the random effects.

If we condition on both variance components, s^2 and r^2, the joint
distribution of the random effects and the estimators of the fixed
effects is multivariate normal.  The mean of this distribution is the
solution to a penalized linear least squares problem.  Exactly the
same calculations as we would use for a linear least squares problem
will give us the joint distribution of the estimators of the fixed
effects and the random effects marginalized over r^2 but conditional
on the ratio s^2/r^2.

The "marginalized over r^2" of that last sentence is what leads us to
t distributions instead of z distributions and, after suitable
normalization, F distributions instead of chi-squared distributions.
What I haven't quite got a handle on yet is whether it is as important
to consider the effects of uncertainty in our estimate of s^2/r^2 as
it is to consider the effects of uncertainty in our estimate of s^2.
For fixed effects models we know that in some circumstances it is very
important to consider the uncertainty in our estimate of r^2.  This is
what Gosset did in formulating the T distribution and what Fisher
generalized into the analysis of variance.

Can we reasonably assume that variability in our estimate of r^2 is
important but variability in our estimate of s^2/r^2 is not?  If so
then we can apply the usual results from least squares to the
penalized least squares problem (conditional on the ratio r^2/s^2) and
we're done.  We need the model matrices for the fixed and the random
effects and the penalty matrix and we do need to be careful of how the
computation is done, especially in cases such as you consider where
the number of random effects can get into the millions, but the theory
is straightforward.  However, I don't think it is that cut and dried
in all cases.

It is likely that, in your world, conditioning on an estimate of
s^2/r^2 would not be a stretch.  In fact, you could condition on an
estimate of r^2 while you were at it because you have a huge number of
degrees of freedom in your estimate of the residual variance.  The
differences between a T distribution with 10 million degrees of
freedom and a standard normal distribution are not worth worrying

So I think that in your world it would be reasonable to use the
penalized least squares representation and calculate the results from
that.  You mentioned above the "posterior variances" (which I would
call conditional variances if I were to do it over again) of the
random effects.  These come from the penalized least squares
representation but conditional on the fixed effects.  The get the
joint distribution of the random effects and the estimators of the
fixed effects it is only necessary to extend that calculation a bit.
>From the joint distribution we can get the variance of the linear
combination you want to consider.

For smaller data sets it may be worthwhile going the Markov chain
Monte Carlo route because that takes into account the variation in all
the parameter estimates and in the random effects.  However, Steven
Novick recently sent me mail with a reproducible example about
difficulties with the Markov chain Monte Carlo samples drawn from some
models.  I'll send a slightly expanded version of that report to the
list for comment later today.

> And in this case there is no covariance given e_ij \bot a_i, right?
>
> Even if this scenario is too simplistic, I assume I can use a taylor
> series expansion to derive something that gives me an estimate of the
> variance of \hat{y}_ij in closed form that is pretty close, no?
>
> OK, I'm ready for my lashing.
>
> Harold
>
>
> > -----Original Message-----
> > From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf
> > Of Douglas Bates
> > Sent: Wednesday, October 17, 2007 4:05 PM
> > To: Doran, Harold
> > Cc: Irene Mantzouni; r-help at stat.math.ethz.ch; R-SIG-Mixed-Models
> > Subject: Re: [R] coef se in lme
> >
> > On 10/15/07, Doran, Harold <HDoran at air.org> wrote:
> > > ?vcov
> >
> > The vcov method returns the estimated variance-covariance
> > matrix of the fixed-effects only.  I think Irene's question
> > is about the combination of the fixed-effects parameters and
> > the BLUPs of the random effects that is returned by the coef
> > method applied to an lmer object.  (You may recall that you
> > were the person who requested such a method in lme4 like the
> > coef method in nlme :-)
> >
> > On the face of it this quantity should be easy to define and
> > evaluate but in fact it is not easy to do so because these
> > are combinations of model parameters (the fixed effects) and
> > unobserved random variables (the random effects).  It gets a
> > bit tricky trying to decide what the variance of this
> > combination would be.  I think there is a sensible
> > definition, or at least a computationally reasonable
> > definition, but there are still a few slippery points in the argument.
> >
> > Lately I have taken to referring to the "estimates" of the
> > random effects, what are sometimes called the BLUPs or Best
> > Linear Unbiased Predictors, as the "conditional modes" of the
> > random effects.  That is, they are the values that maximize
> > the density of the random effects given the observed data and
> > the values of the model parameters.  For a linear mixed model
> > the conditional distribution of the random effects is
> > multivariate normal so the conditional modes are also the
> > conditional means.  Also, we can evaluate the conditional
> > variance-covariance matrix of the random effects up to a scale factor.
> >
> > The next part is where things get a bit hazy for me but I
> > think it makes sense to consider the joint distribution of
> > the estimator of the fixed-effects parameters and the random
> > effects conditional on the data and, possibly, on the
> > variance components.  Conditional on the relative
> > variance-covariance of the random effects (i.e. the matrix
> > that occurs as the penalty term in the penalized least
> > squares representation of the model) the joint distribution
> > of the fixed-effects estimators and the random effects is
> > multivariate normal with mean and variance-covariance matrix
> > determined from the mixed-model equations.
> >
> > This big (p+q by p+q, where p is the dimension of the fixed
> > effects and q is the dimension of the random effects)
> > variance-covariance matrix could be evaluated and, from that,
> > the variance of any linear combination of components.
> > However, I have my doubts about whether it is the most
> > sensible answer to evaluate.  Conditioning on the relative
> > variance-covariance matrix of the random effects is cheating,
> > in a way.  It would be like saying we have a known variance,
> > $\sigma^2$ when, in fact, we are using an estimate.  The fact
> > that we don't know $\sigma^2$ is what gives rise to the t
> > distributions and F distributions in linear models and we are
> > all trained to pay careful attention to the number of degrees
> > of freedom in that estimate and how it affects our ideas of
> > the precision of the estimates of other model parameters.
> > For mixed models, though, many practioners are quite
> > comfortable conditioning on the value of some of the variance
> > components but not others.  It could turn out that
> > conditioning on the relative variance-covariance of the
> > random effects is not a big deal but I don't know.  I haven't
> > examined it in detail and I don't know of others who have.
> >
> > Another approach entirely is to use Markov chain Monte Carlo
> > to examine the joint distribution of the parameters (in the Bayesian
> > sense) and the random effects.  If you save the fixed effects
> > and the random effects from the MCMC chain then you can
> > evaluate the linear combination of interest throughout the
> > chain and get an empirical distribution of the quantities
> > returned by coef.
> >
> > This is probably an unsatisfactory answer for Irene who may
> > have wanted something quick and simple.  Unfortunately, I
> > don't think there is a quick, simple answer here.
> >
> > I suggest we move this discussion to the R-SIG-Mixed-Models
> > list which I am cc:ing on this reply.
> >
> > > -----Original Message-----
> > > From: r-help-bounces at r-project.org on behalf of Irene Mantzouni
> > > Sent: Mon 10/15/2007 3:20 PM
> > > To: r-help at stat.math.ethz.ch
> > > Subject: [R] coef se in lme
> > >
> > > Hi all!
> > >
> > > How is it possible to estimate  standard errors for coef
> > obtained from lme?
> > > Is there sth like se.coef() for lmer or what is the
> > anaytical solution?
> > >
> > > Thank you!
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> > >
> > >         [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
>