[R] Product of 1 - probabilities

Thu May 21 17:23:19 CEST 2009

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Mark Bilton
> Sent: Thursday, May 21, 2009 7:15 AM
> To: r-help at r-project.org
> Subject: [R] Product of 1 - probabilities
> 
> 
> I am having a slight problem with probabilities.
> 
> To calculate the final probability of an event p(F), we can 
> take the product of the chance that each independent event 
> that makes p(F) will NOT occur.
> So...
> p(F) = 1- ( (1-p(A)) * (1-p(B)) * (1-p(C))...(1-p(x)) )
> 
> If the chance of an event within the product occurring 
> remains the same, we can therefore raise this probability to 
> a power of the number of times that event occurs.
> e.g. rolling a dice p(A) = 1/6 of getting a 1...
> p(F) = 1 - (1- (1/6))^z 
> p(F) = 1 - (1-p(A))^z tells us the probabiltity of rolling a 
> 1 'at least once' in z number of rolls.
> 
> So then to R...
> 
> if p(A) = 0.01; z = 4; p(F) = 0.039
> 
> obviously p(F) > p(A)
> 
> however the problem arises when we use very small numbers 
> e.g. p(B) = 1 * 10^-30
> R understands this value
> However when you have 1-p(B) you get something very close to 
> 1 as you expect...but R seems to think it is 1.
> So when I calculate p(F) = 1 - (1-p(B))^z = 1 to the power 
> anything equals 1 so p(F) = 0 and not just close to zero BUT zero.
> It doesn't matter therefore if z = 1*10^1000, the answer is 
> still zero !!
> 
> Obviously therefore now p(F) < p(B)
> 
> Is there any solution to my problem, e.g.
> - is it a problem with the sum (-) ? ie could I change the 
> number of bits the number understands (however it seems 
> strange that it can hold it as a value close to 0 but not close to 1)
> -or should I maybe use a function to calculate the exact answer ?
> -or something else
> 
> Any help greatly appreciated
> Mark

R uses double precision arithmetic, which gives you 52 binary
digits of precision (about 17 decimal digits), so 1.0-10.0^-30==1.0
even though 10.0^-30 is greater than 0.0 (you need to get down
to 10^-323 to make it 0.0 because there are 11 binary digits for
the exponent used to scale the value).

To work around this you can work on a log scale.  The R functions
log1p(x) and expm1(x) compute the much better approximations
to the real functions log(1+x) and exp(x)-1, respectively, than do
the R expressions log(1+x) and exp(x)-1 when x is very close to 0.
E.g., 
    > pB <- 1e-30 # 1*10^-30
    > exp(100 * log1p(-pB))
    [1] 1
    > exp(100 * log1p(-pB))-1
    [1] 0
    > expm1(100 * log1p(-pB))
    [1] -1e-28

You have another problem: you want z=10^1000, which is infinite
in double precision arithmetic.  You can take logs again to deal
with this or do some more analysis.

Also, if you are computing your basic pB and pF from R's p<dist>
family of functions, they all have arguments to let you get the upper
tail probabilities and output on a log scale.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com