[R] Sum of Bernoullis with varying probabilities

Deepayan Sarkar deepayan.sarkar at gmail.com
Fri Oct 6 22:46:38 CEST 2006


On 10/6/06, Ted Harding <Ted.Harding at nessie.mcc.ac.uk> wrote:
> Hi Folks,
>
> Given a series of n independent Bernoulli trials with
> outcomes Yi (i=1...n) and Prob[Yi = 1] = Pi, I want
>
>   P = Prob[sum(Yi) = r]  (r = 0,1,...,n)
>
> I can certainly find a way to do it:
>
> Let p be the vector c(P1,P2,...,Pn).
> The cases r=0 and r=n are trivial (and also are exceptions
> for the following routine).
>
> For a given value of r in (1:(n-1)),
>
>   library(combinat)
>   Set <- (1:n)
>   Subsets <- combn(Set,r)
>   P <- 0
>   for(i in (1:dim(Subsets)[2])){
>     ix <- numeric(n)
>     ix[Subsets[,i]] <- 1
>     P <- P + prod((p^ix) * ((1-p)^(1-ix)))
>   }

Don't you want to multiply this with factorial(r) * factorial(n-r)? I
assume combn() gives all combinations and not all permutations, in
which case you are only counting

prod((p^ix) * ((1-p)^(1-ix)))

once instead of all the different ways in which ix could be permuted
without changing it.

> OK, some abbreviation of the above can be achieved with
> the 'apply' function (and I've spelt out the details for
> clarity). But I feel the basis of the approach (i.e. 'combn')
> is crude (also tending to cause problems if n is large);
> and I'm wondering if there is a nicer way.

I don't see how. Someone has to evaluate the n-choose-r distinct terms
to be added, and your code seems to be doing that as directly as
possible. Just for fun, here's a cute but very inefficient
implementation using recursion (even n=8 is very slow):

u <-
    function(n, r, p)
{
    if (r == 0) return(prod(1-p))
    if (r == n) return(prod(p))
    ans <- 0
    for (i in 1:n)
    {
        ans <-
            ans +
                p[i] * u(n-1, r-1, p[-i]) +
                    (1 - p[i]) * u(n-1, r, p[-i])
    }
    ans / n
}

-Deepayan

>
> And also wondering if someone has implemented it!
>
> Statutory Declaration: I have been on to R Site Search.
>
> Best wishes to all,
> Ted.



More information about the R-help mailing list