[R] Lambert (1992) simulation
Achim Zeileis
Achim.Zeileis at uibk.ac.at
Sun May 6 16:31:51 CEST 2012
On Sat, 5 May 2012, Christopher Desjardins wrote:
> Hi,
> I am a little confused at the output from predict() for a zeroinfl object.
>
> Here's my confusion:
>
> ## From the pscl package
> fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "negbin")
>
>
> ## The raw zero-inflated overdispersed data
> > table(bioChemists$art)
>
>   0   1   2   3   4   5   6   7   8   9  10  11  12  16  19
> 275 246 178  84  67  27  17  12   1   2   1   1   2   1   1
>
> ## The default output from predict. It looks like it is doing a horrible
> job. Does it really predict 7 zeros?
No, see also this R-help post on "Zero-inflated regression models:
predicting no 0s":
https://stat.ethz.ch/pipermail/r-help/2011-June/279765.html
The predicted _mean_ of a negative binomial distribution is not the most
likely outcome (i.e., the _mode_) of the distribution, so tabulating
rounded means does not tell you how many zeros the model predicts. The
post above presents some hands-on examples.
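A minimal base-R sketch of the mean-versus-mode point. The parameter
values (mu, theta, phi) are hypothetical and chosen only for
illustration; they are not estimates from the bioChemists model.

```r
## Hypothetical zero-inflated negative binomial parameters (illustration only)
mu    <- 2     # mean of the negative binomial count component
theta <- 2.5   # negative binomial dispersion (size) parameter
phi   <- 0.1   # probability of a structural (excess) zero

y <- 0:10

## ZINB probabilities: count component scaled by (1 - phi),
## plus the extra point mass phi at zero
p <- (1 - phi) * dnbinom(y, size = theta, mu = mu)
p[1] <- p[1] + phi

ey     <- (1 - phi) * mu      # overall mean E(Y)
y_mode <- y[which.max(p)]     # single most likely outcome

round(ey)        # the rounded mean is 2, never 0
y_mode           # the mode is 0
round(p[1], 3)   # probability of observing a zero
```

With these values the rounded mean is 2, although zero is the single
most likely outcome, so a table of rounded means contains no zeros.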
> > table(round(predict(fm_zinb2)) )
>
>   0   1   2   3   4   5   6  10
>   7 354 487  45  12   6   3   1
>
> ## The output from predict using "count"
> > table(round(predict(fm_zinb2,type="count")))
>
>   1   2   3   4   5   6  10
> 312 536  45  12   6   3   1
>
> ## The output from predict using "zero", but here it predicts 24
> "structural" zeros?
> > table(round(predict(fm_zinb2,type="zero")))
>
>   0   1
> 891  24
>
>
> So my question is how do I interpret these different outputs from the
> zeroinfl object? What are the differences? The help page just left me
> confused. I would expect that table(round(predict(fm_zinb2))) would be E(Y)
> and would most accurately track table(bioChemists$art) but I am wrong. How
> can I find the E(Y) that would most closely track the raw data?
>
> Thanks,
> Chris
>
>
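As a rough sketch of how the prediction types relate, assuming the pscl
package is installed and that predict() for zeroinfl objects behaves as
described on its help page: "count" is the mean of the count component,
"zero" is the probability of a structural zero, "response" is the
overall mean, and "prob" gives the predicted probability of each count
for every observation. Summing the "prob" matrix over observations
yields expected frequencies, which are the quantities to compare with
table(bioChemists$art).

```r
## Assumes the pscl package (providing zeroinfl() and bioChemists) is installed
library("pscl")
data("bioChemists", package = "pscl")
fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "negbin")

mu  <- predict(fm_zinb2, type = "count")     # mean of the count component
phi <- predict(fm_zinb2, type = "zero")      # probability of a structural zero
ey  <- predict(fm_zinb2, type = "response")  # overall mean E(Y)

## The overall mean should agree with (1 - phi) * mu up to numerical tolerance
all.equal(unname(ey), unname((1 - phi) * mu))

## Expected frequencies: column sums of the n x (max(y) + 1) probability matrix
p        <- predict(fm_zinb2, type = "prob")
expected <- colSums(p)
observed <- table(factor(bioChemists$art, levels = 0:max(bioChemists$art)))

round(cbind(observed = as.vector(observed), expected = expected), 1)
```

Under this reading, round(predict(fm_zinb2, type = "zero")) only counts
the observations whose structural-zero probability exceeds 0.5, which is
not the same as the expected number of zeros.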