[R] error in predict glm (new levels cause problems)

K. Steinmann Katharina.Steinmann at stud.unibas.ch
Mon Aug 15 15:39:54 CEST 2005


Dear R-helpers,

I am trying to fit GLMs to data that follow a negative binomial distribution.
For this I use the MASS library and the commands:
model_1 = glm.nb(response ~ y1 + y2 + ...+ yi, data = data.frame)
and
predict(model_1, newdata = data.frame)


So far, I think everything should be ok.

But when I fit a glm to a subset of the data, I get an error message as soon
as I try to predict values from the new model. The problem seems to be that
one of the predictors yi (a categorical factor) has fewer levels in the subset
than in the original data set.
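
For illustration, a minimal sketch that reproduces this kind of situation.
The data are simulated, and the names d, sub and model_2 as well as the
levels BS and ZH are made up for the example; only kanton and GE are taken
from the error message quoted further down.

```r
library(MASS)  # provides glm.nb() and rnegbin()

set.seed(1)
# Hypothetical data: one numeric predictor and one categorical factor
d <- data.frame(
  y1     = runif(300),
  kanton = factor(rep(c("BS", "GE", "ZH"), each = 100))
)
# Simulated negative binomial counts (theta = 2 is an arbitrary choice)
d$response <- rnegbin(300, mu = exp(1 + 0.5 * d$y1), theta = 2)

# Subset without any GE rows, then fit the model on the subset only
sub     <- subset(d, kanton != "GE")
model_2 <- glm.nb(response ~ y1 + kanton, data = sub)

# Levels the fitted model knows about: GE has been dropped
model_2$xlevels$kanton

# Predicting for data that still contain GE rows stops with an error
res <- tryCatch(predict(model_2, newdata = d), error = function(e) e)
inherits(res, "error")
```

The exact wording of the error may differ between R versions.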

Searching CRAN, I found a related hint that the line "mf$drop.unused.levels
<- TRUE" in the glm (or glm.nb) function could cause the problem.

Therefore I changed the line to "mf$drop.unused.levels <- FALSE".
Indeed the error message disappears, and when I compare the predictions of
model_1 with those of the model fitted to the full data set with the changed
glm.nb function, I get the same predicted numbers.
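
What this flag controls can be seen directly on model.frame(), which glm
and glm.nb call to build the model frame. The following is only a sketch with
a made-up data frame (toy, with a factor f that has an unused level "c"); it
does not show the modified glm.nb itself.

```r
# Hypothetical data: the factor f declares level "c" but never uses it
toy <- data.frame(
  y = c(1, 3, 2, 5),
  f = factor(c("a", "b", "a", "b"), levels = c("a", "b", "c"))
)

# With drop.unused.levels = TRUE the unused level disappears
mf_drop <- model.frame(y ~ f, data = toy, drop.unused.levels = TRUE)
levels(mf_drop$f)   # "a" "b"

# With drop.unused.levels = FALSE the unused level is kept
mf_keep <- model.frame(y ~ f, data = toy, drop.unused.levels = FALSE)
levels(mf_keep$f)   # "a" "b" "c"
```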

However, changing the glm.nb function was more of an intuitive action, and
since I still consider myself a beginner in R, I don't feel comfortable with it.

So my questions:
1. Is there an easier way to solve my problem?
2. Do I seriously affect the glm.nb function by changing the line mentioned
above?


Thank you for your help,
Katharina

PS: I am working with R 2.0.0
PPS: Concrete error message:
"Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
object$xlevels) :
        factor I(kanton) has new level(s) GE"
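
The message refers to a comparison between the levels found in newdata and
the levels stored in object$xlevels. A rough sketch of that comparison, done
by hand, is given below; xlevels and newdata are hypothetical stand-ins, and
the levels BS and ZH are invented for the example.

```r
# Hypothetical stand-in for object$xlevels of a model fitted on the subset
xlevels <- list(kanton = c("BS", "ZH"))

# Hypothetical new data that still contain the level GE
newdata <- data.frame(
  y1     = c(0.1, 0.5, 0.9),
  kanton = factor(c("BS", "GE", "ZH"))
)

# Levels present in newdata that the model has never seen
present <- unique(as.character(newdata$kanton))
new_lev <- setdiff(present, xlevels$kanton)
new_lev   # "GE"

# Rows for which a prediction would be possible
ok         <- newdata$kanton %in% xlevels$kanton
newdata_ok <- newdata[ok, , drop = FALSE]
nrow(newdata_ok)   # 2
```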




--
K. Steinmann
Botanisches Institut
Universität Basel
CH-4056 Basel
Switzerland
Tel  0041 61 267 35 02
E-mail: Katharina.Steinmann at stud.unibas.ch



