[R] Predictions on training set shorter than training set
wdunlap at tibco.com
Thu Apr 23 22:42:01 CEST 2015
Are there missing values in your data? If so, try adding
na.action = na.exclude
to your original call to glm or lm. It is like the default
na.omit, except that predict(), residuals(), fitted(), etc.
fill in NA for the rows that were omitted (because they
contained missing values), so the results line up with the
rows of the original data.
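Here is a small sketch of the difference. The data frame below is made
up for illustration (it is not the poster's data), and the object names
are hypothetical; only glm(), predict(), na.omit and na.exclude are
real.

```r
# Hypothetical toy data: 10 rows, 2 of which have a missing predictor.
train <- data.frame(
  y = rep(c(0, 1), 5),
  x = c(0.3, 1.2, NA, -0.8, 1.5, 1.9, 0.1, NA, -1.1, 0.7)
)

# na.omit (the usual default): incomplete rows are dropped, so the
# predictions are shorter than the data.
fit_omit <- glm(y ~ x, data = train, family = binomial,
                na.action = na.omit)
pred_omit <- predict(fit_omit, type = "response")
length(pred_omit)            # 8, not 10

# na.exclude: same fitted model, but predictions are padded with NA
# in the positions of the dropped rows.
fit_excl <- glm(y ~ x, data = train, family = binomial,
                na.action = na.exclude)
pred_excl <- predict(fit_excl, type = "response")
length(pred_excl) == nrow(train)   # TRUE
dropped <- unname(which(is.na(pred_excl)))
dropped                      # rows 3 and 8
```

The coefficients of the two fits are identical; only the shape of what
predict() and friends return changes.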
You can also set
options(na.action = "na.exclude")
to make it the default na.action in lm() and similar functions.
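As a rough illustration of the global option, the sketch below uses the
built-in cars data with a few values set to NA by hand (that
modification is an assumption made only for the example). It also
restores the previous setting afterwards, since the option affects
every later model fit in the session.

```r
# Built-in 'cars' data with the first three speeds made missing
# (an artificial change, for illustration only).
cars_na <- transform(cars, speed = replace(speed, 1:3, NA))

# Set the session-wide default and remember the old value.
old <- options(na.action = "na.exclude")

fit <- lm(dist ~ speed, data = cars_na)   # no na.action argument needed
res <- residuals(fit)
length(res) == nrow(cars_na)              # TRUE: padded with NA
n_missing <- sum(is.na(res))              # 3

# Restore whatever na.action was in effect before.
options(old)
```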
On Thu, Apr 23, 2015 at 10:23 AM, Mark Drummond <mark at markdrummond.ca> wrote:
> Hi all,
> Given a simple logistic regression on a training data set using glm,
> the number of predicted values is less than the number of observations
> in the training set:
> > fit.train.pred <- predict(fit, type = "response")
> > nrow(train)
> [1] 62660
> > length(fit.train.pred)
> [1] 58152
> As a relative newcomer, I've run lots of simple glm, CART etc. models
> but this is the first time I have seen this happen.
> Is this a common issue and is there a fix? An option to predict() perhaps?
> Cheers, Mark
> Mark Drummond
> mark at markdrummond.ca
> When I get sad, I stop being sad and be Awesome instead. TRUE STORY.