[R] faster computation of cumulative multinomial distribution

Theo Borm theo_rstt at borm.org
Fri Mar 30 16:10:17 CEST 2007


Dear list members,

I have a series of /unequal/ probabilities [p1,p2,...,pk], describing
mutually exclusive events, and a "remainder" class with a probability
p0=1-p1-p2-....-pk, and need to calculate, for a given number of trials
t>=k, the combined probability that each of the classes 1...k contains
at least 1 "event" (the remainder class may be empty).

To me this reaks of a sum of multinomial distributions, and indeed, I
can readily calculate the correct answer for small figures t,k using a
small R program.

However, in my typical experiment, k ranges from ~20-60 and t from
~40-100, and having to calculate these for about 6e9 experiments, a
quick calculation on the back of a napkin shows me that I will not be
able to complete these calculations before the expected end of the universe.

I already figured out that in this particular case I may use reciprocal
probabilities, and if I do this I get an equation with "only" 2^k terms,
which would shorten my computations to a few decades.

Isn't there a faster (numerical approximation?) way to do this?

R has dbinom /and/ pbinom functions, but unfortunately only a dmultinom
and no pmultinom function... perhaps because there is no (known) faster way?

with kind regards,

Theo Borm



More information about the R-help mailing list