[R] Sum of Bernoullis with varying probabilities
Deepayan Sarkar
deepayan.sarkar at gmail.com
Fri Oct 6 22:46:38 CEST 2006
On 10/6/06, Ted Harding <Ted.Harding at nessie.mcc.ac.uk> wrote:
> Hi Folks,
>
> Given a series of n independent Bernoulli trials with
> outcomes Yi (i=1...n) and Prob[Yi = 1] = Pi, I want
>
> P = Prob[sum(Yi) = r] (r = 0,1,...,n)
>
> I can certainly find a way to do it:
>
> Let p be the vector c(P1,P2,...,Pn).
> The cases r=0 and r=n are trivial (and also are exceptions
> for the following routine).
>
> For a given value of r in (1:(n-1)),
>
> library(combinat)
> Set <- (1:n)
> Subsets <- combn(Set,r)
> P <- 0
> for(i in (1:dim(Subsets)[2])){
> ix <- numeric(n)
> ix[Subsets[,i]] <- 1
> P <- P + prod((p^ix) * ((1-p)^(1-ix)))
> }
Don't you want to multiply this with factorial(r) * factorial(n-r)? I
assume combn() gives all combinations and not all permutations, in
which case you are only counting
prod((p^ix) * ((1-p)^(1-ix)))
once instead of all the different ways in which ix could be permuted
without changing it.
> OK, some abbreviation of the above can be achieved with
> the 'apply' function (and I've spelt out the details for
> clarity). But I feel the basis of the approach (i.e. 'combn')
> is crude (also tending to cause problems if n is large);
> and I'm wondering if there is a nicer way.
I don't see how. Someone has to evaluate the n-choose-r distinct terms
to be added, and your code seems to be doing that as directly as
possible. Just for fun, here's a cute but very inefficient
implementation using recursion (even n=8 is very slow):
u <-
function(n, r, p)
{
if (r == 0) return(prod(1-p))
if (r == n) return(prod(p))
ans <- 0
for (i in 1:n)
{
ans <-
ans +
p[i] * u(n-1, r-1, p[-i]) +
(1 - p[i]) * u(n-1, r, p[-i])
}
ans / n
}
-Deepayan
>
> And also wondering if someone has implemented it!
>
> Statutory Declaration: I have been on to R Site Search.
>
> Best wishes to all,
> Ted.
More information about the R-help
mailing list