[R-sig-ME] Options for prediction from lmer

Douglas Bates bates at stat.wisc.edu
Thu Mar 11 23:52:34 CET 2010

On Thu, Mar 11, 2010 at 1:17 PM, David Hsu <dhsu2 at uw.edu> wrote:
> Dear list (and especially the keepers of nlme / lme4),

> To sharpen my previous post on this topic, I'd like to get predictions
> and estimated errors of the predictions out of a model with correlated
> random coefficients. In lmer() language, the model is written as:

> y ~ x1 + x2 + (1 + x3 + x4 | group)

> Q1:  Can I get predictions and associated errors with lme()?  If I use
> lme(), I can get the predictions using predict(), but how do I get
> errors on the predictions?

First, you define them.  You are combining parameters and random
variables in your predictions and I'm not sure what it would mean to
assess the variability of those predictions.  A Bayesian may be able
to make sense of it but I can't get my head around it.

> Q2:  Can I get predictions and associated errors with lmer()?  If I
> use lmer(), I thought I could get the predictions and errors by
> simulation (this is empirical Bayes, I think) by using mcmcsamp().  It
> works for a single random coefficient, but it doesn't seem to work for
> a model with multiple correlated random coefficients, giving me the
> following error:

> Error in .local(object, n, verbose, ...) :
>   Code for non-trivial theta_T not yet written

As it says, the code has not yet been written.  You may want to
consider Jarrod's MCMCglmm package instead.

> As far as I know, this seems to be because mcmcsamp() doesn't yet
> update the covariance parameter.
> Any corrections, suggestions or comments would be much appreciated.
> David
> On Wed, Mar 10, 2010 at 11:17 AM, David Hsu <dhsu2 at uw.edu> wrote:
>> Dear all,
>> I know that the subject of a predict() function for lme4 / lmer -- or lack thereof -- has been kicked around before (I googled "lmer predict").  After reading all of the previous e-mails, I'd like to get feedback on the various ways that I've thought up to actually do prediction from lmer().  Most of the ideas are briefly sketched out below; any feedback or comments would be appreciated.
>> David
>> Problem Statement:
>> After estimating my model from a given specification, I'd like to get both (a) predictions from the model and the original covariates, and (b) predictions from the same model applied to new covariates.  In pseudocode:
>> Step 1:  Original covariates "origdat", columns "x1" ... "x4", groups indicated by "group"
>> Step 2:  Estimate model, get object "M1".  My models are relatively simple, something like:  M1 <- lmer(y ~ x1 + x2 + (x3 + x4 | group), data=origdat)
>> Step 3:  New covariates "newdat", with the same column names as "origdat" but different data (the changed covariates represent a counterfactual scenario).
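[Editor's sketch of Steps 1-3 as runnable toy code. This assumes a current version of lme4, in which predict() accepts a newdata argument (it did not at the time of this thread); the data are simulated stand-ins, not the poster's.]

```r
## Toy, self-contained version of Steps 1-3 with simulated data.
library(lme4)
set.seed(1)
ng <- 15; m <- 30; n <- ng * m
origdat <- data.frame(group = factor(rep(seq_len(ng), each = m)),
                      x1 = rnorm(n), x2 = rnorm(n),
                      x3 = rnorm(n), x4 = rnorm(n))
## True group-level intercepts and slopes (sd 1, 0.5, 0.5).
u <- matrix(rnorm(3 * ng, sd = c(1, 0.5, 0.5)), ncol = 3, byrow = TRUE)
g <- as.integer(origdat$group)
origdat$y <- with(origdat,
  2 + 0.5 * x1 - x2 + u[g, 1] + u[g, 2] * x3 + u[g, 3] * x4 +
  rnorm(n, sd = 0.3))

## Step 2: fit the poster's specification.
M1 <- lmer(y ~ x1 + x2 + (x3 + x4 | group), data = origdat)

## Step 3: counterfactual covariates, then predict (uses the fitted BLUPs).
newdat <- origdat
newdat$x1 <- newdat$x1 + 1
new.pred <- predict(M1, newdata = newdat)
```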
>> After thinking about it, it seems like my options are:
>> [Option #1]  Write out the BLUPs and do simulations manually.  From the M1 object I could extract the fixed effects, random effects, variance-covariance matrix, and residual sd, and so could simulate by hand.  In pseudocode (X and Z are the fixed- and random-effects design matrices, and mvrnorm is from MASS):
>> orig.pred <- X %*% fixef(M1) + Z %*% mvrnorm(1, ranef(M1)[[group]], VarCorr(M1)) + rnorm(n, 0, sigma)
>> Then apply the same parameters to the new data:
>> new.pred <- X.new %*% fixef(M1) + Z.new %*% mvrnorm(1, ranef(M1)[[group]], VarCorr(M1)) + rnorm(n, 0, sigma)
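[Editor's sketch of the Option #1 simulation for a single group, in base R only. The numbers here are toy stand-ins: in a real fit beta would come from fixef(M1), Sigma from VarCorr(M1), and sigma from sigma(M1).]

```r
## Base-R sketch of Option #1 for one group (toy values throughout;
## beta, Sigma, sigma stand in for fixef(M1), VarCorr(M1), sigma(M1)).
set.seed(1)
n <- 8
X <- cbind(1, x1 = rnorm(n), x2 = rnorm(n))   # fixed-effects design
Z <- cbind(1, x3 = rnorm(n), x4 = rnorm(n))   # random-effects design
beta  <- c(2, 0.5, -1)                        # stand-in for fixef(M1)
Sigma <- diag(c(1, 0.25, 0.25))               # stand-in for VarCorr(M1)
sigma <- 0.3                                  # stand-in for the residual sd
## One draw of the group's random effects, b ~ N(0, Sigma), via the
## Cholesky factor so no extra packages are needed:
b <- drop(t(chol(Sigma)) %*% rnorm(3))
## Conditional mean plus residual noise gives one simulated prediction.
orig.pred <- drop(X %*% beta + Z %*% b) + rnorm(n, 0, sigma)
```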
>> Pros:  Probably fast.
>> Cons:  Really messy, since I'm fitting a number of different models; unless I write a really clean function, this is perhaps beyond my programming skills.
>> [Option #2]  Think of some creative use of refit() and simulate() -- or other functions -- for lmer.  The documentation for refit in lme4.pdf is pretty tempting, since it says that "it is common to use the same model specification and covariate data on many simulated responses.  That is, the only thing that changes between model fits is the response vector."  However, in this case I want to keep the model specification but change the covariate data, and then obtain the response vector.  I'm not sure whether this is possible using lmer().
>> Cons: obviously, I don't know how quite to do this yet.
>> [Option #3]  Just use lme() from the nlme package.
>> Pros: lme() will estimate this relatively simple model, and then I can simply use predict() to get predictions from the fit to the original data, and predict(M1, newdata=newdat) to get predictions for the new data.
>> Cons: lme() is slower than lmer(), and my model already takes about 2 hours to run.
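[Editor's sketch of Option #3 with lme() from nlme, on simulated toy data; a single random slope is used here just to keep the toy fit stable. The level argument of predict.lme selects population-level (0) versus group-level (1) predictions.]

```r
## Sketch of Option #3 using nlme (ships with R); toy simulated data.
library(nlme)
set.seed(2)
ng <- 12; m <- 25; n <- ng * m
dat <- data.frame(group = factor(rep(seq_len(ng), each = m)),
                  x1 = rnorm(n), x3 = rnorm(n))
u0 <- rnorm(ng, sd = 1)      # true group intercepts
u1 <- rnorm(ng, sd = 0.5)    # true group slopes on x3
g <- as.integer(dat$group)
dat$y <- 2 + 0.5 * dat$x1 + u0[g] + u1[g] * dat$x3 + rnorm(n, sd = 0.3)

M1 <- lme(y ~ x1, random = ~ x3 | group, data = dat)

newdat <- dat[1:5, ]
newdat$x1 <- newdat$x1 + 1                        # counterfactual shift
p0 <- predict(M1, newdata = newdat, level = 0)    # population level
p1 <- predict(M1, newdata = newdat, level = 1)    # group level (BLUPs)
```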
>> --
>> David Hsu
>> PhD Candidate, Urban Planning
>> University of Washington
>> (e-mail) dhsu2 at uw.edu
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
