[R] predict.lm() question
Duncan Murdoch
murdoch at stats.uwo.ca
Tue Apr 8 00:08:37 CEST 2008
On 07/04/2008 5:57 PM, Chip Barnaby wrote:
> Dear R-people ...
>
> I'm a new user. I can't get predict.lm() to produce predictions for
> new independent data. There are some messages in archived help about
> this problem, but I still don't see my error after reviewing
> those. I understand that the new independent data must have the same
> name(s) as used when the model was made.
>
> In the example below, predict.lm produces the predictions for the
> original (model input) data plus a warning message. What I want is
> predictions for alternative data (in data frame DX in the example).
>
> Thanks,
> Chip Barnaby
>
> > D<-data.frame( X=seq(1:10))
> > D$Y<-D$X+rnorm( 10)
> > D
>     X          Y
> 1   1  0.3811634
> 2   2  1.8770049
> 3   3  3.5253376
> 4   4  3.1851957
> 5   5  3.8088813
> 6   6  5.7333074
> 7   7  7.4896623
> 8   8  7.9394056
> 9   9  8.6683570
> 10 10 10.7480675
> > lm<-lm( D$Y~D$X)
> > summary( lm)
>
> Call:
> lm(formula = D$Y ~ D$X)
>
> Residuals:
>      Min       1Q   Median       3Q      Max
> -0.98812 -0.36354 -0.09808  0.48154  0.88288
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept) -0.58935    0.41680  -1.414    0.195
> D$X          1.07727    0.06717  16.037 2.29e-07 ***
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 0.6101 on 8 degrees of freedom
> Multiple R-Squared: 0.9698, Adjusted R-squared: 0.9661
> F-statistic: 257.2 on 1 and 8 DF, p-value: 2.293e-07
>
> > DX<-data.frame( X=seq( 5.5, 11.5))
> > DX
> X
> 1 5.5
> 2 6.5
> 3 7.5
> 4 8.5
> 5 9.5
> 6 10.5
> 7 11.5
> > predict.lm( lm, DX)
>          1          2          3          4          5          6          7
>  0.4879174  1.5651887  2.6424600  3.7197313  4.7970026  5.8742739  6.9515453
>          8          9         10
>  8.0288166  9.1060879 10.1833592
> Warning message:
> 'newdata' had 7 rows but variable(s) found have 10 rows
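Your formula refers to D explicitly (D$Y ~ D$X), so predict.lm will never
look at DX: it evaluates D$X again, gets the original 10 values, and warns
that newdata has a different number of rows. You need to do the fit with
bare variable names and a data argument:
fit <- lm( Y~X, data=D)
Then predict( fit, newdata=DX) will take X from DX.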
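
For reference, here is a minimal sketch of the whole corrected session, reusing D, DX and fit from this thread. The set.seed() call and the names bad and pred are additions for illustration only and are not part of the original post; the numbers will differ from those quoted above because the random draws differ.

```r
# Rebuild the data from the question (set.seed added only for reproducibility)
set.seed(1)
D <- data.frame(X = 1:10)
D$Y <- D$X + rnorm(10)

# The original fit: the term is literally "D$X", so it is tied to D
bad <- lm(D$Y ~ D$X)
print(attr(terms(bad), "term.labels"))

# The suggested fit: bare names, looked up in whatever data frame is supplied
fit <- lm(Y ~ X, data = D)

# New predictor values; the column must be named X to match the formula
DX <- data.frame(X = seq(5.5, 11.5))

# predict() dispatches to predict.lm and evaluates X inside DX
pred <- predict(fit, newdata = DX)
print(pred)
```

With the fit done this way, pred has one value per row of DX and no warning about mismatched row counts should appear.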
Duncan Murdoch