[R] faster computation of cumulative multinomial distribution
Theo Borm
theo_rstt at borm.org
Fri Mar 30 16:10:17 CEST 2007
Dear list members,
I have a series of /unequal/ probabilities [p1,p2,...,pk], describing
mutually exclusive events, and a "remainder" class with a probability
p0=1-p1-p2-....-pk, and need to calculate, for a given number of trials
t>=k, the combined probability that each of the classes 1...k contains
at least 1 "event" (the remainder class may be empty).
To me this reaks of a sum of multinomial distributions, and indeed, I
can readily calculate the correct answer for small figures t,k using a
small R program.
However, in my typical experiment, k ranges from ~20-60 and t from
~40-100, and having to calculate these for about 6e9 experiments, a
quick calculation on the back of a napkin shows me that I will not be
able to complete these calculations before the expected end of the universe.
I already figured out that in this particular case I may use reciprocal
probabilities, and if I do this I get an equation with "only" 2^k terms,
which would shorten my computations to a few decades.
Isn't there a faster (numerical approximation?) way to do this?
R has dbinom /and/ pbinom functions, but unfortunately only a dmultinom
and no pmultinom function... perhaps because there is no (known) faster way?
with kind regards,
Theo Borm
More information about the R-help
mailing list