[R] hurdle model - count and response predictions

John Wilson jhwilson.nb at gmail.com
Fri Feb 16 19:54:55 CET 2018


Hello,

I'm using pscl to run a hurdle model. Everything works great until I get to
the point of making predictions. All of my "count" predictions are lower
than my actual data, and lower than the "response" predictions, similar to
the issue described here
(https://stat.ethz.ch/pipermail/r-help/2012-August/320426.html) and here
(https://stackoverflow.com/questions/48794622/hurdle-model-prediction-count-vs-response).

Since the issue is the same (and not resolved), I'll just use the example
from the second link:

library("pscl")
data("RecreationDemand", package = "AER")

## model
m <- hurdle(trips ~ quality | ski, data = RecreationDemand, dist = "negbin")
nd <- data.frame(quality = 0:5, ski = "no")
predict(m, newdata = nd, type = "count")
predict(m, newdata = nd, type = "response")
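
To make the comparison easier to reproduce, the prediction types can be put
side by side. This is only a minimal sketch: it assumes, from my reading of
?predict.hurdle, that type = "zero" returns the ratio of non-zero
probabilities and that the "response" prediction is the product of the
"count" and "zero" predictions. The column name count_x_zero is just
illustrative.

```r
library("pscl")
data("RecreationDemand", package = "AER")

## same model and new data as above
m  <- hurdle(trips ~ quality | ski, data = RecreationDemand, dist = "negbin")
nd <- data.frame(quality = 0:5, ski = "no")

## collect the three prediction types in one table
p <- data.frame(
  count    = predict(m, newdata = nd, type = "count"),
  zero     = predict(m, newdata = nd, type = "zero"),
  response = predict(m, newdata = nd, type = "response")
)

## assumption being checked: response = count * zero
p$count_x_zero <- p$count * p$zero
p
```

If the assumption holds, the response and count_x_zero columns should agree
up to numerical precision.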

The presence/absence (zero hurdle) part of the model gives estimates
identical to those of a logistic regression fitted to the same data.
However, I expected the negbin (count) part of the hurdle model to give
estimates identical to a separate glm.nb model fitted to the positive
counts only. Instead, I get completely different values:

library(MASS)
m.nb <- glm.nb(trips ~ quality,
               data = RecreationDemand[RecreationDemand$trips > 0, ])
predict(m, newdata = nd, type = "count") ## hurdle
predict(m.nb, newdata = nd, type = "response") ## positive counts only
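
The same difference can be inspected at the coefficient level rather than
through predictions. A minimal sketch, assuming that coef() on a hurdle
object accepts model = "count" and that both fits store their dispersion
estimate in $theta. The pscl documentation describes the count component of
hurdle() as a truncated count model, whereas glm.nb() fits an untruncated
negative binomial, which may be relevant when reading the output. The object
names cmp and pos are only illustrative.

```r
library("pscl")
library("MASS")
data("RecreationDemand", package = "AER")

## hurdle model on all data, glm.nb on the positive counts only
m    <- hurdle(trips ~ quality | ski, data = RecreationDemand, dist = "negbin")
pos  <- RecreationDemand[RecreationDemand$trips > 0, ]
m.nb <- glm.nb(trips ~ quality, data = pos)

## count-part coefficients of the hurdle model next to the glm.nb ones
cmp <- cbind(hurdle = coef(m, model = "count"), glm.nb = coef(m.nb))
cmp

## dispersion (theta) estimates of the two fits
c(hurdle = unname(m$theta), glm.nb = m.nb$theta)
```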

Any help would be appreciated.
