[R] Newb Prediction Question using stepAIC and predict(), is R wrong?

BSanders adam at mycostech.com
Thu Feb 10 05:48:55 CET 2011


I'm using stepAIC to fit a model.  Then I'm trying to use that model to
predict future happenings.

My first few variables are labeled as their column. (Is this a problem?)
The dataframe that I use to build the model is the same as the data I'm
using to predict with.

Here is a portion of what is happening.


This is the value it is predicting: [1] 9.482975

Summary of the model
Call:
lm(formula = reservesub$paid ~ reservesub[, 3 + i] + reservesub$grads[, 
    i] + reservesub$Sun + reservesub$Fri + reservesub$Sat)

Residuals:
    Min      1Q  Median      3Q     Max 
-15.447  -4.993  -1.090   3.910  27.454 

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)    
(Intercept)            5.71370    1.46449   3.902 0.000149 ***
reservesub[, 3 + i]    1.00868    0.01643  61.391  < 2e-16 ***
reservesub$grads[, i]  0.44649    0.12131   3.681 0.000333 ***
reservesub$Sun         8.63606    1.95100   4.426 1.93e-05 ***
reservesub$Fri         3.76928    2.00079   1.884 0.061682 .  
reservesub$Sat         4.03103    2.12754   1.895 0.060225 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 7.842 on 138 degrees of freedom
  (131 observations deleted due to missingness)
Multiple R-squared: 0.9794,     Adjusted R-squared: 0.9787 
F-statistic:  1312 on 5 and 138 DF,  p-value: < 2.2e-16 


Here is the data that is being fed into
predicted[p] = predict(stepsaicguess[[p]], newdata = reservesubpred[p,]):

           V1  V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19
276 10/3/2010 155 84 76 68 64 63 53 42  42  42  42  38  38  38  35  31  31  NA
    paid Mon Tue Wed Thu Fri Sat Sun
276   84   0   0   0   0   0   0   1
    grads.1 grads.2 grads.3 grads.4 grads.5 grads.6 grads.7
276       8       4       1      10      11       0       0
    grads.8 grads.9 grads.10 grads.11 grads.12 grads.13 grads.14
276       0       4        0        0        3        4        0


In this case, i = 1, so by my calculation the predicted value should be
5.7137 + 1.00868*84 + 0.44649*8 + 8.636*1 + 3.769*0 + 4.03*0 = 102.65
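
The hand calculation can be checked with a few lines of R. This is only a sketch of the arithmetic: the coefficients are the rounded values typed in the post, and the labels "lag" and "grads" are hypothetical names standing in for reservesub[, 3 + i] and reservesub$grads[, i].

```r
# Coefficients as rounded in the hand calculation (not the full-precision fit).
# "lag" and "grads" are made-up labels for the two indexed predictors.
coefs <- c(intercept = 5.7137, lag = 1.00868, grads = 0.44649,
           Sun = 8.636, Fri = 3.769, Sat = 4.03)

# Predictor values used in the hand calculation for row 276.
x <- c(intercept = 1, lag = 84, grads = 8, Sun = 1, Fri = 0, Sat = 0)

# Linear predictor: sum of coefficient * value, matched by position.
hand_prediction <- sum(coefs * x)
print(hand_prediction)
```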

But R is giving me 9.482975 as the predicted value. (Interestingly, that
is 5.7137 + 3.769*1, i.e. Intercept + Fri.)
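
One possible explanation, offered as an assumption rather than something the post confirms: when a formula names its terms as reservesub$Sun or reservesub[, 3 + i], predict() looks for variables with exactly those names in newdata, does not find them, and falls back to the original data frame. It then returns one fitted value per training row instead of one prediction for the new row. The sketch below reproduces that behaviour on toy data; train, new_row, x and y are hypothetical names.

```r
set.seed(1)

# Toy training data and a single new observation (all names are made up).
train <- data.frame(x = 1:20)
train$y <- 2 + 3 * train$x + rnorm(20)
new_row <- data.frame(x = 100)

# Formula that hard-codes the data frame: the predictor term is "train$x".
fit_dollar <- lm(train$y ~ train$x)

# newdata has no variable called "train$x", so predict() evaluates the term
# against 'train' itself and warns that the row counts do not match.
pred_dollar <- suppressWarnings(predict(fit_dollar, newdata = new_row))
length(pred_dollar)   # 20: one value per training row, newdata was not used

# Formula with bare column names plus data=: newdata is used as intended.
fit_named <- lm(y ~ x, data = train)
pred_named <- predict(fit_named, newdata = new_row)
length(pred_named)    # 1: a single prediction for the new row
```

If this is what is happening here, assigning the result to predicted[p] would keep only the first of those fitted values, which could explain a number unrelated to row 276.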

Another question: if I were to include interactions in this model, would I
have to create those interaction variables in my prediction data frame
myself, or would R 'know' what to do?
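
As a minimal sketch of the interaction case, assuming the model is fitted with bare column names and a data= argument: predict() rebuilds the interaction term from the main-effect columns in newdata, so no extra column is needed. The data frame toy and the variables a, b and y are hypothetical.

```r
set.seed(2)

# Toy data with a true a:b interaction (all names are made up).
toy <- data.frame(a = rnorm(30), b = rnorm(30))
toy$y <- 1 + 2 * toy$a - toy$b + 0.5 * toy$a * toy$b + rnorm(30, sd = 0.1)

# a * b expands to a + b + a:b in the model formula.
fit_int <- lm(y ~ a * b, data = toy)

# newdata only carries the main-effect columns; there is no "a:b" column.
new_row <- data.frame(a = 1, b = 2)
pred <- predict(fit_int, newdata = new_row)

# Manual check: build the interaction by hand from the fitted coefficients.
cf <- coef(fit_int)
manual <- cf["(Intercept)"] + cf["a"] * 1 + cf["b"] * 2 + cf["a:b"] * 1 * 2
all.equal(unname(pred), unname(manual))
```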

Thanks in advance for your expert assistance.