[R-sig-ME] getVarCov for lmer()
Douglas Bates
bates at stat.wisc.edu
Sun Feb 3 00:47:35 CET 2008
Thanks for including the code to generate the example, David, and I
apologize for grumbling at you. Sometimes my reaction to a feature
request for the lme4 package is "Oh God, I am going to have to think
about how to do this in the general case".
Once I thought about it the answer was comparatively straightforward,
although I can imagine circumstances in which one would choke even a
powerful computer trying to construct it.
The problem with trying to construct this in the general case is that
you can't make sense of Z_i. However, you can construct tr(Z) %*%
Sigma %*% Z which, in this case will be a block diagonal matrix.
I need to explain a couple of relationships in the slots from a fitted
model. As I mentioned previously, the Zt slot is a sparse
representation of the transpose of Z. Let n be the number of
observations and q be the total number of random effects. There are
(virtual) q by q matrices S and T where S is diagonal with
non-negative diagonal elements and T is unit lower triangular that
define the q by q variance-covariance matrix Sigma of the
q-dimensional random effects vector b.
Sigma = sigma^2 * T %*% S %*% S %*% t(T)
The A slot is
A = t(T) %*% S %*% t(Z)
So the matrix you want is
sigma^2 * (crossprod(fm at A) + I)
The matrix crossprod(fm at A) will be stored as a sparse symmetric
matrix. There may be a slick way of adding an identity to it but I
can't recall offhand how to do so (Martin would know better than I but
I think he is about to go on vacation somewhere where he doesn't have
email access). Then you need to multiply by s^2 to get the estimated
marginal covariance matrix.
Before doing that you should check that it really is block diagonal.
Deepayan Sarkar wrote an image method for sparse matrices that helps
in visualization of the pattern. Try
library(lme4)
fm1.no.cov <- lmer(dv ~ time.num * drug + (0+Pind|Patient.new) +
(0+Dind|Patient.new), dat.new)
image(crossprod(fm1.no.cov at A))
image(drop0(crossprod(fm1.no.cov at A))
as(crossprod(fm1.no.cov at A)[1:8,1:8], "matrix")
Remember that you need to add 1 to the diagonal elements then multiply
the matrix by s^2.
This method works for this example. In general the matrix
crossprod(fm at A) has the potential of being a sparse matrix with a huge
number of nonzeros.
On Feb 1, 2008 1:55 PM, David Afshartous <dafshartous at med.miami.edu> wrote:
>
> RE what I'd like to calculate, I'd like to calculate the marginal covariance
> matrix, i.e., tr(Z_i) Psi Z_I + sigma^2 I, in the notation of your book.
>
> I've done this for a series of models estimated with lme() using
> getVarCov(model, type = "marginal"), but for the last model there is an
> error message and I wanted to see if this is also the case for lmer().
>
> Here is the problem on simulated data called dat.new (formed below;
> longitudinal data from a crossover design; 20 patients, 4 measurements each,
> both for drug and placebo, thus n=160 observations total):
>
> # basic random intercept model, marginal covariance matrix is compound
> # symmetric as expected:
>
> > fm <- lme( dv ~ time.num*drug, random = ~ 1 | Patient.new, data=dat.new )
> > getVarCov(fm, type = "marginal")
> Patient.new 1
> Marginal variance covariance matrix
> 1 2 3 4 5 6 7 8
> 1 18.8570 8.5199 8.5199 8.5199 8.5199 8.5199 8.5199 8.5199
> 2 8.5199 18.8570 8.5199 8.5199 8.5199 8.5199 8.5199 8.5199
> 3 8.5199 8.5199 18.8570 8.5199 8.5199 8.5199 8.5199 8.5199
> 4 8.5199 8.5199 8.5199 18.8570 8.5199 8.5199 8.5199 8.5199
> 5 8.5199 8.5199 8.5199 8.5199 18.8570 8.5199 8.5199 8.5199
> 6 8.5199 8.5199 8.5199 8.5199 8.5199 18.8570 8.5199 8.5199
> 7 8.5199 8.5199 8.5199 8.5199 8.5199 8.5199 18.8570 8.5199
> 8 8.5199 8.5199 8.5199 8.5199 8.5199 8.5199 8.5199 18.8570
> Standard Deviations: 4.3424 4.3424 4.3424 4.3424 4.3424 4.3424 4.3424
> 4.3424
>
>
> ## now I make random effect variance stratified by treatment.
> ## as expected, the 8x8 matrix has the upper left 4x4 block
> ## and the lower right 4x4 block represent the compound symmetric
> ## covariance matrix for the different treatment conditions (recall
> ## that the same patient will experience both since it's a crossover).
> ## whereas the upper right and lower left 4x4 blocks represent
> ## covariance of measurements across treatment condition, e.g.,
> ## the covariance of measurement i on treatment to measurement j on control.
> ## such a covariance exists since although we have stratified the intercept
> ## random effect by treatment, a covariance of these random variables is
> specified by default ( check via summary(fm1) ).
>
> > fm1 <- lme( dv ~ time.num*drug, random = ~ 0 + Pind + Dind | Patient.new,
> data=dat.new )
> > getVarCov(fm1, type = "marginal")
> Patient.new 1
> Marginal variance covariance matrix
> 1 2 3 4 5 6 7 8
> 1 31.3030 29.3380 29.3380 29.3380 2.0209 2.0209 2.0209 2.0209
> 2 29.3380 31.3030 29.3380 29.3380 2.0209 2.0209 2.0209 2.0209
> 3 29.3380 29.3380 31.3030 29.3380 2.0209 2.0209 2.0209 2.0209
> 4 29.3380 29.3380 29.3380 31.3030 2.0209 2.0209 2.0209 2.0209
> 5 2.0209 2.0209 2.0209 2.0209 6.8509 4.8858 4.8858 4.8858
> 6 2.0209 2.0209 2.0209 2.0209 4.8858 6.8509 4.8858 4.8858
> 7 2.0209 2.0209 2.0209 2.0209 4.8858 4.8858 6.8509 4.8858
> 8 2.0209 2.0209 2.0209 2.0209 4.8858 4.8858 4.8858 6.8509
> Standard Deviations: 5.5949 5.5949 5.5949 5.5949 2.6174 2.6174 2.6174
> 2.6174
>
> ## now, presumably the off-diagonal 4x4 blocks should go to zero if the
> ## covariance between the intercept random effects for treatment/control can
> ## be constrained to 0. This can be done via:
>
> > fm1.no.cov <- lme( dv ~ time.num*drug, random = list(~ 0 + Pind |
> Patient.new, ~ 0 + Dind | Patient.new), data=dat.new )
>
> # But the following error message arises when trying to get the marginal
> # covariance:
>
> > getVarCov(fm1.no.cov, type = "marginal")
> Error in getVarCov.lme(fm1.no.cov, type = "marginal") :
> Not implemented for multiple levels of nesting
>
> Does there exist a different model statement for fm1.no.cov such that this
> problem doesn't arise? In any event, this is why I sought getVarCov in
> lmer().
>
> ##################################
> ## code to generate simulated data above:
>
> set.seed(500)
> n.timepoints <- 4
> n.subj.per.tx <- 20
> sd.d <- 5;
> sd.p <- 2;
> sd.res <- 1.3
> drug <- factor(rep(c("D", "P"), each = n.timepoints, times =
> n.subj.per.tx))
> drug.baseline <- rep( c(0,5), each=n.timepoints, times=n.subj.per.tx )
> Patient.baseline <- rep( rnorm( n.subj.per.tx*2, sd=c(sd.d, sd.p) ),
> each=n.timepoints )
> time.baseline <- rep(1:n.timepoints,n.subj.per.tx*2)*as.numeric(drug=="D")
> dv <- rnorm( n.subj.per.tx*n.timepoints*2,
> mean=time.baseline+Patient.baseline+drug.baseline, sd=sd.res )
>
> dat.new <- data.frame(drug, dv)
> dat.new$Dind <- as.numeric(dat.new$drug == "D")
> dat.new$Pind <- as.numeric(dat.new$drug == "P")
> dat.new$time.num = rep(1:n.timepoints, n.subj.per.tx*2)
> dat.new$Patient.new = rep(1:20, each=8)
>
>
>
> On 2/1/08 1:29 PM, "Douglas Bates" <bates at stat.wisc.edu> wrote:
>
> > On Feb 1, 2008 11:15 AM, David Afshartous <dafshartous at med.miami.edu> wrote:
> >
> >> All,
> >
> >> I've fitted some models using lme() and now I want to check some things by
> >> fitting them in lmer() (latest version loaded from CRAN). Under the listed
> >> methods for lmer() I don't seem to find functionality similar to that of
> >> getVarCov() to obtain the variance-covariance matrix of the fitted model.
> >> Does such functionality exist?
> >
> > I would need to go back and check what getVarCov produces. Can you
> > tell me what you want to be able to calculate?
> >
> > Be aware that it is not always possible to reproduce all the
> > capabilities of the nlme package in lme4 because nlme does not handle
> > (except through egregious kludges) crossed or partially crossed random
> > effects specifications. Some of the quantities that can be calculated
> > for models with a single level of random effects or with multiple
> > nested levels of random effects cannot even be defined for models with
> > crossed random effects.
> >
> >> PS - running R v2.6.0.
> >
> >> PPS - if it must be computed manually, how does one access information
> >> stored in slots (e.g., Z', the transpose of the model matrix for the random
> >> effects)?
> >
> > If the fitted model is fm then Z' is available as
> >
> > fm at Zt
> >
> > It is stored as a compressed, column-oriented sparse matrix of class
> > "dgCMatrix", defined in the Matrix package. Linear algebra operations
> > with such objects should magically do the right thing. If you are
> > working with very large models, of course, you will find that you need
> > to be very careful about exactly what operations are done with Z. In
> > many books on mixed models there are formulas for many calculations
> > based on a generalized least squares representation of the model.
> > Don't even consider using such formulas except for trivial examples.
> >
> > I've seen far too many papers where the authors write down an n by n
> > matrix like V = sigma^2 I + Z Sigma Z' and then proceed to blithely
> > describe calculations involving the inverse of V. I usually stop
> > reading at that point because I know that the author has never tried
> > to perform the calculation on a non-trivial example. When n is in the
> > hundreds of thousands or in the millions you do not even construct n
> > by n matrices, let alone try to decompose them. And anyone with
> > experience in numerical linear algebra knows that you never, ever,
> > ever expect to evaluate a formula that involves the inverse of a large
> > matrix.
> >
> > This is all to say that you could write down an expression involving
> > solve() applied to some matrix of size n by n derived from Z but you
> > will be disappointed in the speed.
>
>
More information about the R-sig-mixed-models
mailing list