[R] predict: remove columns with new levels automatically
Andreas Wittmann
andreas_wittmann at gmx.de
Tue Nov 24 20:24:23 CET 2009
Dear R-users,
in the following thread
http://tolstoy.newcastle.edu.au/R/help/03b/3322.html
the problem of how to remove rows for predict that contain levels which are
not in the model is discussed.
Now I am trying to do this the other way round: I want to remove the columns
(variables) from the model that will later cause problems in prediction
because of new levels.
## example:
set.seed(0)
x <- rnorm(9)
y <- x + rnorm(9)
training <- data.frame(x=x, y=y, z=c(rep("A", 3), rep("B", 3), rep("C", 3)))
x0 <- rnorm(1)  ## avoid assigning to t, which would mask base::t()
test <- data.frame(x=x0, y=x0+rnorm(1), z="D")
lm1 <- lm(x ~ ., data=training)
## prediction does not work because the variable z has the new level "D"
predict(lm1, test)
## solution: the variable z is removed from the model
## the prediction happens without using the information of variable z
lm2 <- lm(x ~ y, data=training)
predict(lm2, test)
How can I automatically detect such variables and refit the model without them?
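One possible direction, as a minimal sketch only: a fitted lm object stores the factor levels seen during fitting in its xlevels component, so these can be compared with the values in the new data, and the model can be refitted with update() without the offending variables. The helper name new_level_vars and the variable x0 are hypothetical and chosen just for this illustration. The sketch assumes the predictors enter the formula as plain variables (no interactions or transformations of the factor).

```r
## Hypothetical helper (not part of base R): returns the names of the
## factor/character predictors in 'newdata' that contain levels which
## were not present when 'model' was fitted.
new_level_vars <- function(model, newdata) {
  xlev <- model$xlevels  ## levels of each factor predictor seen in fitting
  vars <- intersect(names(xlev), names(newdata))
  bad <- vapply(vars, function(v) {
    vals <- unique(as.character(newdata[[v]]))
    any(!is.na(vals) & !(vals %in% xlev[[v]]))
  }, logical(1))
  vars[bad]
}

## same data as in the example above
set.seed(0)
x <- rnorm(9)
y <- x + rnorm(9)
training <- data.frame(x = x, y = y, z = rep(c("A", "B", "C"), each = 3))
x0 <- rnorm(1)
test <- data.frame(x = x0, y = x0 + rnorm(1), z = "D")

lm1 <- lm(x ~ ., data = training)
drop_vars <- new_level_vars(lm1, test)  ## "z" in this example

## refit without the problematic variables, then predict
if (length(drop_vars) > 0) {
  new_formula <- as.formula(paste(". ~ . -", paste(drop_vars, collapse = " - ")))
  lm2 <- update(lm1, new_formula)
} else {
  lm2 <- lm1
}
pred <- predict(lm2, test)
```

With the data above this drops z, so lm2 is the same model as lm(x ~ y, data=training). Note that the check depends on the test data, so the refit has to be repeated whenever the new data change.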
Thanks
Andreas