[R] Is there dpois equivalent for zero-inflated Poisson?
Martin Maechler
maechler at stat.math.ethz.ch
Wed Mar 23 16:27:33 CET 2016
>>>>> Thierry Onkelinx <thierry.onkelinx at inbo.be>
>>>>> on Tue, 22 Mar 2016 13:58:09 +0100 writes:
> dpois(0, lambda) == e^(-lambda)
> The wikipedia formula is
> ifelse(x == 0, zero + dpois(0, lambda) * (1-zero), dpois(x, lambda) *
> (1-zero))
> or
> ifelse(x == 0, zero + dpois(x, lambda) * (1-zero), dpois(x, lambda) *
> (1-zero))
> so we can move the dpois() out of the ifelse()
> ifelse(x == 0, zero, 0) + dpois(x, lambda) * (1-zero)
Nice! Thank you, Thierry.
Even nicer for symmetry reasons (and much faster) is *not* to use ifelse(),
so we'd get
dzipois <- function(x, lambda, zero)
(x == 0) * zero + dpois(x, lambda) * (1-zero)
However, numerically correctly adding the 'log = FALSE'
argument which all good "density" functions have in R --
((and which you *should* use as log = TRUE for the
log-likelihood if you want to become more professional))
is a bit tricky.
The x == 0 case at least needs care for large lambda and/or
small 'zero' (= pi).
All this is related on how to accurately compute f(L) = log(1 - exp(- L))
which I call log1mexp(L) in my still-not-submitted paper
available as one of the Rmpfr vignettes at
https://cloud.r-project.org/web/packages/Rmpfr/vignettes/log1mexp-note.pdf
BTW: There are at least 3-4 R packages which deal with zero-inflated
poisson in some ways, notably the recommended 'mgcv'
package which comes bundled with every R.
Martin Maechler,
ETH ZUrich
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
> Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
> 2016-03-22 13:50 GMT+01:00 Matti Viljamaa <mviljamaa at kapsi.fi>:
>> And why is the first term of ifelse(x == 0, zero, 0) + dpois(x, lambda) /
>> (1 - zero)
>>
>> ifelse(x == 0, zero, 0)
>>
>> rather than something corresponding to
>>
>> zero+(1-zero)e^{-lambda}
>>
>> https://en.wikipedia.org/wiki/Zero-inflated_model#Zero-inflated_Poisson
>>
>> On 22 Mar 2016, at 14:25, Matti Viljamaa <mviljamaa at kapsi.fi> wrote:
>>
>> Could you clarify what are the parameters and why it’s formulated that way?
>>
>> -Matti
>>
>> On 22 Mar 2016, at 14:17, Thierry Onkelinx <thierry.onkelinx at inbo.be>
>> wrote:
>>
>> Dear Matti,
>>
>> What about this?
>>
>> dzeroinflpois <- function(x, lambda, zero){
>> ifelse(x == 0, zero, 0) + dpois(x, lambda) / (1 - zero)
>> }
>> plot(x, dzeroinflpois(x, lambda = 10, zero = 0.2), type = "l")
>>
>>
>>
>> ir. Thierry Onkelinx
>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
>> Forest
>> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
>> Kliniekstraat 25
>> 1070 Anderlecht
>> Belgium
>>
>> To call in the statistician after the experiment is done may be no more
>> than asking him to perform a post-mortem examination: he may be able to say
>> what the experiment died of. ~ Sir Ronald Aylmer Fisher
>> The plural of anecdote is not data. ~ Roger Brinner
>> The combination of some data and an aching desire for an answer does not
>> ensure that a reasonable answer can be extracted from a given body of data.
>> ~ John Tukey
>>
>> 2016-03-22 13:04 GMT+01:00 Matti Viljamaa <mviljamaa at kapsi.fi>:
>>
>>> I’m doing some optimisation that I first did with normal Poisson (only
>>> parameter theta was estimated), but now I’m doing the same with a
>>> zero-inflated Poisson model which
>>> gives me two estimated parameters theta and p (p is also pi in some
>>> notation).
>>>
>>> My question is, is there something equivalent to dpois that would use
>>> both of the parameters (or is the p parameter possibly unnecessary)?
>>>
>>> I’m calculating the “fit” of the Poisson model
>>>
>>> i.e. like
>>>
>>> x = c(0,1,2,3,4,5,6)
>>> y = c(3062,587,284,103,33,4,2)
>>> fit1 <- sum(y)*dpois(x, est_theta)
>>>
>>> and then comparing fit1 to the real observations.
>>> [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> <http://www.r-project.org/posting-guide.html>
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>>
> [[alternative HTML version deleted]]
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list