[R] Problem with zero-inflated negative binomial model in sediment river dynamics
Achim Zeileis
Achim.Zeileis at uibk.ac.at
Wed Aug 14 12:07:34 CEST 2013
On Tue, 13 Aug 2013, Cade, Brian wrote:
> Lauria: For historical reasons the logistic regression (binomial with
> logit link) model portion of a zero-inflated count model is usually
> structured to predict the probability of the 0 counts rather than the
> nonzero (>=1) counts so the coefficients will be the negative of what you
> expect based on the count model portion (as in your output). It is simple
> to interpret the probability of the logistic regression portion as the
> probability of the nonzero counts by just taking the negative of the
> coefficient estimates provided for the probability of the zero counts.
This is a common misinterpretation but not quite correct.
The zero-inflation model is a mixture model of two components: (1) a count
component (Poisson, NB, ...), and (2) a zero mass component (i.e., zero
with probability 1). Hence, the observed zeros in the data can come from
both sources: either they are "random" zeros from component (1) or
"excess" zeros from component (2).
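The binomial zero-inflation part of the model predicts the probability
that a given observation belongs to component (2), i.e., the probability
of an "excess zero". But this is _not_ the probability of observing a zero
in the data, which is larger than the excess zero probability because
"random" zeros from component (1) contribute as well. Flipping the sign of
the coefficients therefore gives the probability of belonging to the count
component, not the probability of a non-zero count.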
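
To make the distinction concrete, here is a small sketch in base R that
plugs the coefficients from the quoted zeroinfl() output into the mixture
formula P(Y = 0) = pi + (1 - pi) * f_NB(0). The discharge value of 8 is an
assumption picked purely for illustration, and the variable names are mine,
not part of pscl.

```r
# Coefficients copied from the zeroinfl() summary quoted in this thread.
count_coef <- c(intercept = 2.557066, discharge = 0.064698)
zero_coef  <- c(intercept = 13.01011, discharge = -1.64293)
theta      <- 0.4604

# Hypothetical discharge value, chosen only for illustration.
discharge <- 8

# Mean of the count component (log link).
mu <- exp(count_coef[["intercept"]] + count_coef[["discharge"]] * discharge)

# Probability of an excess zero, i.e. of belonging to component (2).
pi_excess <- plogis(zero_coef[["intercept"]] + zero_coef[["discharge"]] * discharge)

# Probability of a "random" zero from the negative binomial component (1).
p_random_zero <- dnbinom(0, size = theta, mu = mu)

# Probability of observing a zero: excess zeros plus random zeros.
p_zero <- pi_excess + (1 - pi_excess) * p_random_zero

# Probability of a non-zero count; note this is not 1 - pi_excess.
p_nonzero <- (1 - pi_excess) * (1 - p_random_zero)

print(c(excess_zero = pi_excess, any_zero = p_zero, nonzero = p_nonzero))
```

Whatever discharge value is used, any_zero is at least as large as
excess_zero, and nonzero is smaller than 1 - excess_zero.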
If you want a model that first models zero vs. non-zero and second the
non-zero counts, use the hurdle model. This has exactly the interpretation
you describe above.
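
A minimal sketch of that alternative, on simulated data rather than the
sediment data (all numbers and the helper rtrunc_nb() below are made up for
illustration). With the default binomial zero hurdle, the zero part of a
hurdle model is a logistic regression for y > 0 vs. y = 0, so it can be
reproduced with glm(); the pscl call is only run if the package is
installed.

```r
set.seed(1)
n <- 500
discharge <- runif(n, 0, 10)  # simulated covariate, not real discharge data

# Hypothetical helper: zero-truncated negative binomial draws by resampling zeros.
rtrunc_nb <- function(mu, size) {
  y <- rnbinom(length(mu), size = size, mu = mu)
  while (any(zero <- y == 0)) {
    y[zero] <- rnbinom(sum(zero), size = size, mu = mu[zero])
  }
  y
}

# Hurdle process: first zero vs. non-zero, then a positive count if non-zero.
crossed <- rbinom(n, 1, plogis(-2 + 0.5 * discharge))
counts  <- crossed * rtrunc_nb(exp(1 + 0.05 * discharge), size = 0.5)
dat     <- data.frame(plate1 = counts, discharge = discharge)

# Zero hurdle part in base R: probability of a non-zero count.
zero_part <- glm(I(plate1 > 0) ~ discharge, data = dat, family = binomial)
print(coef(zero_part))

# Same model via pscl, if available; the "zero" coefficients should agree.
if (requireNamespace("pscl", quietly = TRUE)) {
  fit <- pscl::hurdle(plate1 ~ discharge, data = dat, dist = "negbin")
  print(cbind(glm = coef(zero_part), hurdle = coef(fit, model = "zero")))
}
```

Here a positive discharge coefficient in the zero part means that non-zero
counts become more likely as discharge increases.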
Best,
Z
> Brian
>
> Brian S. Cade, PhD
>
> U. S. Geological Survey
> Fort Collins Science Center
> 2150 Centre Ave., Bldg. C
> Fort Collins, CO 80526-8818
>
> email: cadeb at usgs.gov <brian_cade at usgs.gov>
> tel: 970 226-9326
>
>
>
> On Tue, Aug 13, 2013 at 9:06 AM, Lauria, Valentina <
> valentina.lauria at nuigalway.ie> wrote:
>
>> Dear All,
>>
>> I am running a negative binomial model in R using the package pscl in order
>> to estimate bed sediment movements versus river discharge. Currently we
>> have deployed 4 different plates to test if a combination of more than one
>> plate would better describe the sediment movements when the river discharge
>> changes over time.
>>
>> My data are positively skewed and zero-inflated. I did run both
>> zero-inflated Poisson and zero-inflated negative binomial regression and
>> compared them using the VUONG test which showed that the negative binomial
>> works better than a simple zero-inflated Poisson.
>>
>> My models look like:
>>
>>
>> 1) plate1 ~ river discharge
>> 2) (plate 1 + plate 2) ~ river discharge
>> 3) (plate 1 + plate 2 + plate 3) ~ river discharge
>> 4) (plate 1 + plate 2 + plate 3 + plate 4) ~ river discharge
>>
>>
>> My main problem, as I am new to these types of models, is that I get a
>> different sign for the coefficient of discharge in the output of the
>> zero-inflated negative binomial model (please see below). What does this
>> mean? Also how could I compare the different models (1-4) i.e. what tells
>> me which is performing best? Thank you very much in advance for any
>> comments and suggestions!!
>>
>> Kind Regards,
>> Valentina
>>
>>
>> Call:
>> zeroinfl(formula = plate1 ~ discharge, data = datafit_plates,
>>     dist = "negbin", EM = TRUE)
>>
>> Pearson residuals:
>>     Min      1Q  Median      3Q     Max
>> -0.6770 -0.3564 -0.2101 -0.0814 12.3421
>>
>> Count model coefficients (negbin with log link):
>>              Estimate Std. Error z value Pr(>|z|)
>> (Intercept)  2.557066   0.036593   69.88   <2e-16 ***
>> discharge    0.064698   0.001983   32.63   <2e-16 ***
>> Log(theta)  -0.775736   0.012451  -62.30   <2e-16 ***
>>
>> Zero-inflation model coefficients (binomial with logit link):
>>             Estimate Std. Error z value Pr(>|z|)
>> (Intercept) 13.01011    0.22602   57.56   <2e-16 ***
>> discharge   -1.64293    0.03092  -53.14   <2e-16 ***
>>
>> Theta = 0.4604
>> Number of iterations in BFGS optimization: 1
>> Log-likelihood: -6.933e+04 on 5 Df
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>