[R] User error in calling predict/model.frame

David Winsemius dwinsemius at comcast.net
Sat Jan 29 22:05:33 CET 2011


On Jan 29, 2011, at 3:49 PM, Joshua Wiley wrote:

> On Sat, Jan 29, 2011 at 12:31 PM, David Winsemius
> <dwinsemius at comcast.net> wrote:
>> Huh?. With the same model and data, they should be the same:
>
> I must be missing something, because that is what I would have
> expected too, but it is not what I get (at least when I run the code
> as shown below).
>
>>> lm.mod2 <- lm(out ~ scale(xxA)*scale(xxB), data=dat)
>>> newdata2 <- expand.grid(xxA=c(-1,0,1),xxB=c(-1,0,1))
>>> newdata2$preds <- predict(lm.mod, newdata2)
>                                                   ^^^^^ is that  
> lm.mod2?

UnnHHH . lm.mod2 was the model from Zahn's posting.

lm.mod <- lm(out ~ I(scale(xxA))*I(scale(xxB)), data=dat)

So I guess you need the "I" to get the correct results with the scaled  
numbers, but this "works" with the unscaled numbers, although the  
numbers (meanA, meanB) are somewhat different, which deserves  
investigation. The pair (meanA, meanB) gives the same result but the  
others don't seem to get to the same numbers. Which is correct?

 >   meanA<-mean(dat$xxA); meanB<-mean(dat$xxB); sdA<-sd(dat$xxA);  
sdB<-sd(dat$xxB)
 > newdata2 <- expand.grid(xxA=c(meanA-sdA,meanA,meanA 
+sdA),xxB=c(meanB-sdB,meanB,meanB+sdB))
 > newdata2$preds <- predict(lm.mod2, newdata2)
 > newdata2
        xxA      xxB    preds
1  9.304820 18.75866 222.1326
2  9.905043 18.75866 234.0992
3 10.505266 18.75866 246.0659
4  9.304820 19.73195 232.7828
5  9.905043 19.73195 244.8696
6 10.505266 19.73195 256.9565
7  9.304820 20.70524 243.4330
8  9.905043 20.70524 255.6400
9 10.505266 20.70524 267.8471


>>> newdata2
>> xxA xxB    preds
>> 1  -1  -1 218.6366
>> 2   0  -1 232.4330
>> 3   1  -1 246.2295
>> 4  -1   0 230.9129
>> 5   0   0 244.8696
>> 6   1   0 258.8263
>> 7  -1   1 243.1892
>> 8   0   1 257.3062
>> 9   1   1 271.4232
>>
>>> Predict(rms.res,xxA=c(-1,0,1),xxB=c(-1,0,1), conf.int=FALSE)
>> xxA xxB     yhat
>> 1  -1  -1 218.6366
>> 2   0  -1 232.4330
>> 3   1  -1 246.2295
>> 4  -1   0 230.9129
>> 5   0   0 244.8696
>> 6   1   0 258.8263
>> 7  -1   1 243.1892
>> 8   0   1 257.3062
>> 9   1   1 271.4232
>
> This is copied directly from my console in a clean session.  I get
> different values for the predicted values from predict(glm or lm) and
> "yhat" from Predict(Glm).
>
>
>> sessionInfo()
> R version 2.12.1 (2010-12-16)
> Platform: i486-pc-linux-gnu (32-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
> [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>> #################################
>> require(rms)
> Loading required package: rms
> Loading required package: Hmisc
> Loading required package: survival
> Loading required package: splines
>
> Attaching package: 'Hmisc'
>
> The following object(s) are masked from 'package:survival':
>
>   untangle.specials
>
> The following object(s) are masked from 'package:base':
>
>   format.pval, round.POSIXt, trunc.POSIXt, units
>
>
> Attaching package: 'rms'
>
> The following object(s) are masked from 'package:survival':
>
>   Surv
>
>> set.seed(10)
>> dat <- data.frame(xxA = rnorm(20, 10), xxB = rnorm(20, 20))
>> dat$out <- with(dat, xxA+xxB+xxA*xxB+rnorm(20,20))
>>
>> rms.res <- Glm(out ~ scale(xxA) * scale(xxB), data=dat)
>> glm.mod <- glm(out ~ scale(xxA) * scale(xxB), data=dat)
>> lm.mod2 <- lm(out ~ scale(xxA)*scale(xxB), data=dat)
>>
>> newdata2 <- expand.grid(xxA = -1:1, xxB = -1:1)
>> newdata2$preds1 <- predict(glm.mod, newdata2)
>> newdata2$preds2 <- predict(lm.mod2, newdata2)
>> newdata2
> xxA xxB   preds1   preds2
> 1  -1  -1 143.3433 143.3433
> 2   0  -1 131.6110 131.6110
> 3   1  -1 119.8787 119.8787
> 4  -1   0 137.1149 137.1149
> 5   0   0 126.9708 126.9708
> 6   1   0 116.8267 116.8267
> 7  -1   1 130.8865 130.8865
> 8   0   1 122.3306 122.3306
> 9   1   1 113.7747 113.7747
>>
>> Predict(rms.res, xxA= -1:1, xxB= -1:1, conf.int=FALSE)
> xxA xxB     yhat
> 1  -1  -1 212.2324
> 2   0  -1 229.6472
> 3   1  -1 247.0620
> 4  -1   0 221.7795
> 5   0   0 240.6413
> 6   1   0 259.5030
> 7  -1   1 231.3266
> 8   0   1 251.6353
> 9   1   1 271.9441
>
> Response variable (y): X * Beta
>> ##################################

David Winsemius, MD
West Hartford, CT



More information about the R-help mailing list