[R] fixed portion of lme4 model in newdata
Jacob Wegelin
jacobwegelin at fastmail.fm
Thu Apr 15 19:34:39 CEST 2010
Suppose we have
• An lmer{lme4} model, MyModel, computed on dataframe SomeDATA;
• Dataframe NEWDATA, which contains the variables used in computing MyModel, but different values for these variables.
In NEWDATA I would like to compute (in an automated, quick, easy, non-error-prone manner) the fitted values of the fixed portion of the model. Also I want to compute the standard errors for these fits.
One use of this would be to display graphically the fitted portion of an lmer model so as to see the "effects" of selected variables. For instance, such a display would be useful for judging the *practical* significance of variables that have tested as *statistically* significant.
To this end, I would like to obtain (in an automated manner) the "design matrix" of the model for the NEWDATA, say NEWDATA_designMatrix. Then the fits would be
NEWDATA_designMatrix %*% cbind(fixef(MyModel))
and the variance of the fits
NEWDATA_designMatrix %*% vcov(MyModel) %*% t(NEWDATA_designMatrix)
Are not model.frame() and model.matrix() the way to obtain NEWDATA_designMatrix? Below is an unsuccessful attempt to use these functions.
Can anyone explain how to obtain the desired result?
# Simulate unbalanced longitudinal data which include a factor (categorical predictor).
set.seed(5)
obsPerID<-5
N_ID<-10
SomeDATA<-data.frame(
ID=rep( 1:N_ID, each=obsPerID)
, X=rnorm( obsPerID * N_ID )
, categ=factor(sample(LETTERS[1:3], size=obsPerID * N_ID, replace=T))
)
SomeDATA$Y <- SomeDATA$X + SomeDATA$ID + rnorm( obsPerID * N_ID ) / 2 + ifelse(SomeDATA$categ=="C", 5, 0)
SomeDATA<-SomeDATA[order(SomeDATA$ID, SomeDATA$X),]
print(summary(SomeDATA))
library(lattice)
print(
xyplot( Y ~ X | categ, groups=ID, data=SomeDATA, type="l"
, layout=c(3,1)
)
)
# I would like to easily slap the fixed portion of the fits of a regression model on top of this plot.
library(lme4)
MyModel<-lmer( Y ~ X + categ + (1|ID), data=SomeDATA)
print(MyModel)
# Create an orderly NEWDATA.
N_NEWDATA<-7
NEWDATA<-data.frame(
X=seq(-2, 2, length=N_NEWDATA)
, ID=2
, categ=factor(
sample(c("A", "C"), replace=T, size=N_NEWDATA)
, levels=levels(SomeDATA$categ))
, Y=0
)
NEWDATA$nuisance<-sample(1:3, replace=T, size=nrow(NEWDATA))
print(NEWDATA)
NEWDATA_frame<-model.frame( formula=formula(MyModel), data=NEWDATA)
print(NEWDATA_frame)
# The following generates an error:
NEWDATA_matrix<-model.matrix( object=formula(MyModel), data=NEWDATA_frame)
# The following does not conform to NEWDATA but instead reverts to SomeDATA:
NEWDATA_matrix<-model.matrix( object=MyModel, data=NEWDATA_frame)
print(NEWDATA_matrix)
Thanks for any insights
Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
APPENDIX
> sessionInfo()
R version 2.10.1 (2009-12-14)
i386-apple-darwin9.8.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lme4_0.999375-32 Matrix_0.999375-33 lattice_0.17-26
loaded via a namespace (and not attached):
[1] grid_2.10.1 tools_2.10.1
END APPENDIX
More information about the R-help
mailing list