[R] User error in calling predict/model.frame
Russell Pierce
rpier001 at ucr.edu
Sat Jan 29 02:59:26 CET 2011
Ista & r-help list,
I guess I left out the most important part of any question, my reason
for doing this. I am interested in the predicted y value at the mean,
1 SD above the mean, and 1 SD below the mean for each predictor.
Since I conducted my analysis with my IVs in the scale of Z (i.e. my
formula for lm.obj was out ~ scale(xxA)*scale(xxB)), I expected to be
able to define my predictors in newdata in terms of Z. It seems like
predict should be the right function to achieve these aims.
I made several (major) errors in my initial example. Although the
error that R produces is the same, those mistakes may have added
considerably to the confusion. I apologize. A more concise example of
code that (I think) should work, but doesn't, is:
set.seed(10)
dat <- data.frame(xxA = rnorm(20,10), xxB = rnorm(20,20))
dat$out <- with(dat,xxA+xxB+xxA*xxB+rnorm(20,20))
lm.res.scale <- lm(out ~ scale(xxA)*scale(xxB),data=dat)
my.data <- lm.res.scale$model #load the data from the lm object
newdata <- expand.grid(X1=c(-1,0,1),X2=c(-1,0,1))
names(newdata) <- c("scale(xxA)","scale(xxB)")
newdata$Y <- predict(lm.res.scale,newdata)
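A possible reason the renamed columns are not picked up, sketched under the assumption that predict() rebuilds the model frame from the stored terms (see ?makepredictcall): the formula refers to the raw variables xxA and xxB, so those are the names looked up in newdata, and the terms object records the centering and scaling constants from the fit. The names needed and pv below are introduced here only for illustration.

```r
# Sketch: what predict() looks for when scale() sits inside the formula.
set.seed(10)
dat <- data.frame(xxA = rnorm(20, 10), xxB = rnorm(20, 20))
dat$out <- with(dat, xxA + xxB + xxA * xxB + rnorm(20, 20))
lm.res.scale <- lm(out ~ scale(xxA) * scale(xxB), data = dat)

# Variables the right-hand side refers to: the raw names,
# not the model-frame column labels such as "scale(xxA)".
needed <- all.vars(delete.response(terms(lm.res.scale)))
print(needed)

# The terms object keeps the calls used to rebuild the predictors
# for newdata, with center= and scale= from the fit filled in.
pv <- attr(terms(lm.res.scale), "predvars")
print(pv)

# The stored constants match the sample mean and SD of xxA.
print(c(center = pv[[3]]$center, mean = mean(dat$xxA)))
print(c(scale = pv[[3]]$scale, sd = sd(dat$xxA)))
```

If that reading is right, a newdata column literally named "scale(xxA)" is ignored, and any xxA supplied in newdata is treated as a raw value and rescaled with the constants from the original fit.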
Using your solution:
names(newdata) <- names(dat)[1:2]
As you say, your solution is doing something other than:
coef(lm.res.scale)[1] +
  coef(lm.res.scale)[2]*newdata[,1] +
  coef(lm.res.scale)[3]*newdata[,2] +
  coef(lm.res.scale)[4]*newdata[,1]*newdata[,2]
I really want a solution that, in one step, will provide values like
the above code. I think that is exactly what predict() should do (now
that I have fixed newdata so that it doesn't include the sd() scaling
factors). I think my example code should be equivalent to:
dat <- data.frame(xxA = rnorm(20,10), xxB = rnorm(20,20))
dat$out <- with(dat,xxA+xxB+xxA*xxB+rnorm(20,20))
#rescaling outside of lm
X1 <- with(dat,as.vector(scale(xxA)))
X2 <- with(dat,as.vector(scale(xxB)))
y <- with(dat,out)
lm.res.correct <- lm(y~X1*X2)
my.data <- lm.res.correct$model #load the data from the lm object
newdata <- expand.grid(X1=c(-1,0,1),X2=c(-1,0,1))
#No need to rename newdata as it matches my lm object already
newdata$Y <- predict(lm.res.correct,newdata)
Notably, adjusting my formula to include as.vector() does not solve
the problem with my attempt to use predict() directly with newdata.
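A minimal sketch of one way to get the one-step result, assuming predict() re-applies scale() to newdata with the center and SD stored at fit time (which is how I read ?makepredictcall): express the mean and +/- 1 SD points in raw units, and let predict() do the standardizing. The names z, zgrid, by.hand, and lm.res.pre are mine, introduced only for this illustration.

```r
set.seed(10)
dat <- data.frame(xxA = rnorm(20, 10), xxB = rnorm(20, 20))
dat$out <- with(dat, xxA + xxB + xxA * xxB + rnorm(20, 20))
lm.res.scale <- lm(out ~ scale(xxA) * scale(xxB), data = dat)

# Target points on the Z scale: -1 SD, mean, +1 SD for each predictor.
z <- c(-1, 0, 1)
zgrid <- expand.grid(zA = z, zB = z)

# The same points expressed in raw units, under the raw variable names.
newdata <- data.frame(
  xxA = mean(dat$xxA) + zgrid$zA * sd(dat$xxA),
  xxB = mean(dat$xxB) + zgrid$zB * sd(dat$xxB)
)
pred <- predict(lm.res.scale, newdata)

# Hand computation from the coefficients, using the Z values directly.
b <- unname(coef(lm.res.scale))
by.hand <- b[1] + b[2] * zgrid$zA + b[3] * zgrid$zB +
  b[4] * zgrid$zA * zgrid$zB

# Rescaling outside of lm, then predicting at the Z values.
dat$X1 <- as.vector(scale(dat$xxA))
dat$X2 <- as.vector(scale(dat$xxB))
lm.res.pre <- lm(out ~ X1 * X2, data = dat)
pred.pre <- predict(lm.res.pre, data.frame(X1 = zgrid$zA, X2 = zgrid$zB))

# The three columns of predictions should agree row by row.
print(cbind(zgrid, pred, by.hand, pred.pre))
```

Under the same assumption, passing -1, 0, 1 under the names xxA and xxB would have those values treated as raw scores and standardized again, which would explain why that approach gives something other than the hand computation.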
Thanks again,
Russell Pierce