[R] Sum of Bernoullis with varying probabilities

Deepayan Sarkar deepayan.sarkar at gmail.com
Fri Oct 6 23:44:13 CEST 2006


On 10/6/06, Ted Harding <Ted.Harding at nessie.mcc.ac.uk> wrote:
> Many thanks for your comments, Deepayan; and I liked your
> recursive solution! Fun indeed.
>
> Just a comment (below) on one of your comments (the rest
> snipped).
>
> On 06-Oct-06 Deepayan Sarkar wrote:
> > On 10/6/06, Ted Harding <Ted.Harding at nessie.mcc.ac.uk> wrote:
> >> Hi Folks,
> >>
> >> Given a series of n independent Bernoulli trials with
> >> outcomes Yi (i=1...n) and Prob[Yi = 1] = Pi, I want
> >>
> >>   P = Prob[sum(Yi) = r]  (r = 0,1,...,n)
> >>
> >> I can certainly find a way to do it:
> >>
> >> Let p be the vector c(P1,P2,...,Pn).
> >> The cases r=0 and r=n are trivial (and also are exceptions
> >> for the following routine).
> >>
> >> For a given value of r in (1:(n-1)),
> >>
> >>   library(combinat)
> >>   Set <- (1:n)
> >>   Subsets <- combn(Set,r)
> >>   P <- 0
> >>   for(i in (1:dim(Subsets)[2])){
> >>     ix <- numeric(n)
> >>     ix[Subsets[,i]] <- 1
> >>     P <- P + prod((p^ix) * ((1-p)^(1-ix)))
> >>   }
> >
> > Don't you want to multiply this with factorial(r) * factorial(n-r)? I
> > assume combn() gives all combinations and not all permutations, in
> > which case you are only counting
> >
> > prod((p^ix) * ((1-p)^(1-ix)))
> >
> > once instead of all the different ways in which ix could be permuted
> > without changing it.
>
> You had me worried for a moment -- the interplay between perms
> and combs is always a head-spinner -- but since I'd verified
> already that I get sum(P)=1 when I do this over all values of r,
> I had to conclude that your comment must be incorrect.
>
> In fact, if you consider the event "r out of the n gave Y=1",
> this can be decomposed into disjoint sub-events
>
>   subset {1,2,3,...,r} gave Y=1, the rest 0
>   subset {1,2,3,...,(r-1),(r=1)} gave Y=1, the rest 0.
>   ....
>
> and so on over all choose(n,r) subsets, and the probability of
> "r out of n" is the sum of the probabilities of the sub-events.
> For any one of these (say the first above), the probability is
>
>   P(Y1=1 & Y2=1 & ... & Yr=1 & Y[r+1]=0 & ... & Y[n]=0)
>
> which is simply the product of their probabilities. No further
> permutation is involved.

You are right. The key point I was missing is that the number of
distinct _permutations_ of r 1's and (n-r) 0's is choose(n, r) (the
places to put the 1's). Sorry for the red herring.

Deepayan



More information about the R-help mailing list