[R] User error in calling predict/model.frame
Joshua Wiley
jwiley.psych at gmail.com
Sat Jan 29 22:23:38 CET 2011
On Sat, Jan 29, 2011 at 1:05 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Jan 29, 2011, at 3:49 PM, Joshua Wiley wrote:
>
>> On Sat, Jan 29, 2011 at 12:31 PM, David Winsemius
>> <dwinsemius at comcast.net> wrote:
>>>
>>> Huh?. With the same model and data, they should be the same:
>>
>> I must be missing something, because that is what I would have
>> expected too, but it is not what I get (at least when I run the code
>> as shown below).
>>
>>>> lm.mod2 <- lm(out ~ scale(xxA)*scale(xxB), data=dat)
>>>> newdata2 <- expand.grid(xxA=c(-1,0,1),xxB=c(-1,0,1))
>>>> newdata2$preds <- predict(lm.mod, newdata2)
>>
>> ^^^^^ is that lm.mod2?
>
> UnnHHH . lm.mod2 was the model from Zahn's posting.
>
> lm.mod <- lm(out ~ I(scale(xxA))*I(scale(xxB)), data=dat)
>
> So I guess you need the "I" to get the correct results with the scaled
> numbers, but this "works" with the unscaled numbers, although the numbers
> (meanA, meanB) are somewhat different, which deserves investigation. The
> pair (meanA, meanB) gives the same result but the others don't seem to get
> to the same numbers. Which is correct?
>
>> meanA<-mean(dat$xxA); meanB<-mean(dat$xxB); sdA<-sd(dat$xxA);
>> sdB<-sd(dat$xxB)
>> newdata2 <-
>> expand.grid(xxA=c(meanA-sdA,meanA,meanA+sdA),xxB=c(meanB-sdB,meanB,meanB+sdB))
>> newdata2$preds <- predict(lm.mod2, newdata2)
>> newdata2
> xxA xxB preds
> 1 9.304820 18.75866 222.1326
> 2 9.905043 18.75866 234.0992
> 3 10.505266 18.75866 246.0659
> 4 9.304820 19.73195 232.7828
> 5 9.905043 19.73195 244.8696
> 6 10.505266 19.73195 256.9565
> 7 9.304820 20.70524 243.4330
> 8 9.905043 20.70524 255.6400
> 9 10.505266 20.70524 267.8471
>
I may be completely barking up the wrong tree, but could it have
something to do with the interaction term being created from scaled
vs. unscaled terms?
> xxAB <- dat$xxA * dat$xxB
> meanAB <- meanA * meanB
> sdAB <- sqrt(sdA^2 + sdB^2 + meanA^2 * sdB^2 + meanB^2 * sdA^2)
> (xxAB - meanAB)/sdAB
[1] 0.17672311 -0.93203381 -1.39257245 -1.31620813 0.08964000 0.71663046
[7] -1.21984785 -0.39528381 -1.40027554 0.06304763 0.57713584 1.30481682
[13] 0.76070783 1.73067640 0.49481401 -0.22607119 -0.40356164 -0.70588375
[19] 1.34382743 0.65508728
> (sxxAB <- with(dat, scale(xxA) * scale(xxB))[1:20,])
[1] 0.01838999 0.25899229 -0.15337178 1.07536361 -0.26520706 0.25119036
[7] -0.11295999 0.05205466 -1.49676116 -0.14367368 -1.86921225 0.80860167
[13] -0.44816896 1.44122535 -0.73361472 -0.14930187 -1.46192081 0.19784309
[19] 0.62089372 0.0821291515
>
>>>> newdata2
>>>
>>> xxA xxB preds
>>> 1 -1 -1 218.6366
>>> 2 0 -1 232.4330
>>> 3 1 -1 246.2295
>>> 4 -1 0 230.9129
>>> 5 0 0 244.8696
>>> 6 1 0 258.8263
>>> 7 -1 1 243.1892
>>> 8 0 1 257.3062
>>> 9 1 1 271.4232
>>>
>>>> Predict(rms.res,xxA=c(-1,0,1),xxB=c(-1,0,1), conf.int=FALSE)
>>>
>>> xxA xxB yhat
>>> 1 -1 -1 218.6366
>>> 2 0 -1 232.4330
>>> 3 1 -1 246.2295
>>> 4 -1 0 230.9129
>>> 5 0 0 244.8696
>>> 6 1 0 258.8263
>>> 7 -1 1 243.1892
>>> 8 0 1 257.3062
>>> 9 1 1 271.4232
>>
>> This is copied directly from my console in a clean session. I get
>> different values for the predicted values from predict(glm or lm) and
>> "yhat" from Predict(Glm).
>>
>>
>>> sessionInfo()
>>
>> R version 2.12.1 (2010-12-16)
>> Platform: i486-pc-linux-gnu (32-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> #################################
>>> require(rms)
>>
>> Loading required package: rms
>> Loading required package: Hmisc
>> Loading required package: survival
>> Loading required package: splines
>>
>> Attaching package: 'Hmisc'
>>
>> The following object(s) are masked from 'package:survival':
>>
>> untangle.specials
>>
>> The following object(s) are masked from 'package:base':
>>
>> format.pval, round.POSIXt, trunc.POSIXt, units
>>
>>
>> Attaching package: 'rms'
>>
>> The following object(s) are masked from 'package:survival':
>>
>> Surv
>>
>>> set.seed(10)
>>> dat <- data.frame(xxA = rnorm(20, 10), xxB = rnorm(20, 20))
>>> dat$out <- with(dat, xxA+xxB+xxA*xxB+rnorm(20,20))
>>>
>>> rms.res <- Glm(out ~ scale(xxA) * scale(xxB), data=dat)
>>> glm.mod <- glm(out ~ scale(xxA) * scale(xxB), data=dat)
>>> lm.mod2 <- lm(out ~ scale(xxA)*scale(xxB), data=dat)
>>>
>>> newdata2 <- expand.grid(xxA = -1:1, xxB = -1:1)
>>> newdata2$preds1 <- predict(glm.mod, newdata2)
>>> newdata2$preds2 <- predict(lm.mod2, newdata2)
>>> newdata2
>>
>> xxA xxB preds1 preds2
>> 1 -1 -1 143.3433 143.3433
>> 2 0 -1 131.6110 131.6110
>> 3 1 -1 119.8787 119.8787
>> 4 -1 0 137.1149 137.1149
>> 5 0 0 126.9708 126.9708
>> 6 1 0 116.8267 116.8267
>> 7 -1 1 130.8865 130.8865
>> 8 0 1 122.3306 122.3306
>> 9 1 1 113.7747 113.7747
>>>
>>> Predict(rms.res, xxA= -1:1, xxB= -1:1, conf.int=FALSE)
>>
>> xxA xxB yhat
>> 1 -1 -1 212.2324
>> 2 0 -1 229.6472
>> 3 1 -1 247.0620
>> 4 -1 0 221.7795
>> 5 0 0 240.6413
>> 6 1 0 259.5030
>> 7 -1 1 231.3266
>> 8 0 1 251.6353
>> 9 1 1 271.9441
>>
>> Response variable (y): X * Beta
>>>
>>> ##################################
>
> David Winsemius, MD
> West Hartford, CT
>
>
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
More information about the R-help
mailing list