[R] fixed portion of lme4 model in newdata

Jacob Wegelin jacobwegelin at fastmail.fm
Thu Apr 15 19:34:39 CEST 2010


Suppose we have

 	•       An lmer{lme4} model, MyModel, computed on dataframe SomeDATA;

 	•       Dataframe NEWDATA, which contains the variables used in computing MyModel, but different values for these variables.

In NEWDATA I would like to compute (in an automated, quick, easy, non-error-prone manner) the fitted values of the fixed portion of the model. Also I want to compute the standard errors for these fits.

One use of this would be to display graphically the fitted portion of an lmer model so as to see the "effects" of selected variables. For instance, such a display would be useful for judging the *practical* significance of variables that have tested as *statistically* significant.

To this end, I would like to obtain (in an automated manner) the "design matrix" of the model for the NEWDATA, say NEWDATA_designMatrix.  Then the fits would be

NEWDATA_designMatrix %*% cbind(fixef(MyModel))

and the variance of the fits

NEWDATA_designMatrix %*% vcov(MyModel)  %*% t(NEWDATA_designMatrix)

Are not model.frame() and model.matrix() the way to obtain NEWDATA_designMatrix?  Below is an unsuccessful attempt to use these functions.

Can anyone explain how to obtain the desired result?

#	Simulate unbalanced longitudinal data which include a factor (categorical predictor).
set.seed(5)
obsPerID<-5
N_ID<-10
SomeDATA<-data.frame(
 	ID=rep( 1:N_ID, each=obsPerID)
 	, X=rnorm( obsPerID * N_ID )
 	, categ=factor(sample(LETTERS[1:3], size=obsPerID * N_ID, replace=T))
 	)
SomeDATA$Y <- SomeDATA$X + SomeDATA$ID + rnorm( obsPerID * N_ID ) / 2 + ifelse(SomeDATA$categ=="C", 5, 0)
SomeDATA<-SomeDATA[order(SomeDATA$ID, SomeDATA$X),]
print(summary(SomeDATA))
library(lattice)
print(
xyplot( Y ~ X | categ, groups=ID, data=SomeDATA, type="l"
 	, layout=c(3,1)
 	)
)
#	I would like to easily slap the fixed portion of the fits of a regression model on top of this plot. 
library(lme4)
MyModel<-lmer( Y ~ X + categ + (1|ID), data=SomeDATA)
print(MyModel)

#	Create an orderly NEWDATA. 
N_NEWDATA<-7
NEWDATA<-data.frame(
 	X=seq(-2, 2, length=N_NEWDATA)
 	, ID=2
 	, categ=factor(
 		sample(c("A", "C"), replace=T, size=N_NEWDATA)
 		, levels=levels(SomeDATA$categ))
 	, Y=0
)
NEWDATA$nuisance<-sample(1:3, replace=T, size=nrow(NEWDATA))
print(NEWDATA)
NEWDATA_frame<-model.frame( formula=formula(MyModel), data=NEWDATA)
print(NEWDATA_frame)
#	The following generates an error:
NEWDATA_matrix<-model.matrix( object=formula(MyModel), data=NEWDATA_frame)
#	The following does not conform to NEWDATA but instead reverts to SomeDATA:
NEWDATA_matrix<-model.matrix( object=MyModel, data=NEWDATA_frame)
print(NEWDATA_matrix)

Thanks for any insights

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032

APPENDIX

> sessionInfo()
R version 2.10.1 (2009-12-14) 
i386-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] lme4_0.999375-32   Matrix_0.999375-33 lattice_0.17-26

loaded via a namespace (and not attached):
[1] grid_2.10.1  tools_2.10.1

END APPENDIX


More information about the R-help mailing list