[R] Predicting response from fitted linear model with incomplete new sample data
Chris Wilkinson
kinsham at verizon.net
Wed Dec 18 19:18:08 CET 2013
I would like to predict a new response from a fitted linear model where the
new data is a single case with one predictor value missing. The help page for
predict() does not make clear to me whether this is possible.
Omitting the missing variable from newdata and setting it to NA both fail, but
with different errors; see the example code below.
> y <- runif(50)
> x1 <- rnorm(50)
> x2 <- rnorm(50)
> dat <- data.frame(y, x1, x2)
> mod <- lm(y~.,data=dat)
> summary(mod)

Call:
lm(formula = y ~ ., data = dat)

Residuals:
     Min       1Q   Median       3Q      Max
-0.50467 -0.28997  0.01457  0.27970  0.47791

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.50098    0.04577  10.945  1.6e-14 ***
x1          -0.01762    0.04172  -0.422    0.675
x2          -0.02753    0.04920  -0.560    0.578
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3177 on 47 degrees of freedom
Multiple R-squared:  0.009301,  Adjusted R-squared:  -0.03286
F-statistic: 0.2206 on 2 and 47 DF,  p-value: 0.8028

> predict(mod, newdata=data.frame(x1=0.1, x2=0.3)) #OK as expected
1
0.4909624
> predict(mod, newdata=data.frame(x1=0.1)) # x2 missing
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
  variable lengths differ (found for 'x2')
In addition: Warning message:
'newdata' had 1 row but variables found have 50 rows
> predict(mod, newdata=data.frame(x1=0.1, x2=NA)) #x2=NA
Error: variable 'x2' was fitted with type "numeric" but type "logical" was supplied
>
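
For reference, a minimal sketch of what the two errors above appear to come
down to, and of two possible ways to get a number out anyway. A bare NA is a
logical value in R, so data.frame(x1=0.1, x2=NA) creates a logical column;
NA_real_ is the numeric missing value. When x2 is absent from newdata, the
50-row x2 still sitting in the workspace is picked up instead, which would
explain the "variables found have 50 rows" warning. The names mod_x1, p_na,
p_reduced and p_imputed are made up for this illustration, set.seed(1) is only
there to make the sketch repeatable (the numbers will not match the output
above), and refitting without x2 or plugging in mean(x2) are assumptions about
what might be acceptable, not something the predict() help prescribes.

```r
set.seed(1)                      # arbitrary seed, for repeatability only
y  <- runif(50)
x1 <- rnorm(50)
x2 <- rnorm(50)
dat <- data.frame(y, x1, x2)
mod <- lm(y ~ ., data = dat)

# Typed missing value: the x2 column is numeric, so the type check passes.
# predict.lm() defaults to na.action = na.pass, so the prediction is NA.
p_na <- predict(mod, newdata = data.frame(x1 = 0.1, x2 = NA_real_))

# Possible workaround 1 (assumption): refit using only the predictor that
# is available for the new case, then predict from the reduced model.
mod_x1 <- lm(y ~ x1, data = dat)
p_reduced <- predict(mod_x1, newdata = data.frame(x1 = 0.1))

# Possible workaround 2 (assumption): fill in the missing predictor with
# its training mean and predict from the full model.
p_imputed <- predict(mod, newdata = data.frame(x1 = 0.1, x2 = mean(dat$x2)))

print(p_na)
print(p_reduced)
print(p_imputed)
```

With the typed NA the call runs without error but returns NA, so it only
changes the failure mode; the other two calls return a numeric prediction, at
the cost of either dropping x2 from the model or guessing its value.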
Thanks
Chris