# [R] significance in difference of proportions

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Thu Nov 27 18:43:00 CET 2003

```
On 27-Nov-03 Arne.Muller at aventis.com wrote:
> I've 2 samples A (111 items) and B (10 items) drawn from the same
> unknown population. Within A I find 9 "positives" and in B 0
> positives. I'd like to know if the 2 samples A and B are different,
> ie is there a way to find out whether the number of "positives" is
> significantly different in A and B?

Pretty obviously not, just from looking at the numbers:

9 out of 111 -> p = P(positive) approx = 1/10

P(0 out of 10 when p = 1/10) is not unlikely (in fact = 0.35).

However, a Fisher exact test will give you a respectable P-value:

> library(ctest)
> ?fisher.test
> fisher.test(matrix(c(102,9,10,0),nrow=2))
[...]
p-value = 1
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.000000 6.088391
> fisher.test(matrix(c(102,9,9,1),nrow=2))
p-value = 0.5926
> fisher.test(matrix(c(102,9,8,2),nrow=2))
p-value = 0.2257
> fisher.test(matrix(c(102,9,7,3),nrow=2))
p-value = 0.0605
> fisher.test(matrix(c(102,9,6,4),nrow=2))
p-value = 0.01202

So the 95% CI for the odds ratio is (0,6.1). For identical
probabilities of "+" the odds ratio is 1.0, which is well within the CI.
And, keeping the numbers for the larger sample fixed for
simplicity, you have to go quite a way with the smaller one to get
a result significant at 5%:

(102,9):(7,3) -> P = 0.06
(102,9):(6,4) -> P = 0.01

and, to have 80% power (0.8 probability of this event), the
probability of "+" in the second sample would have to be as
high as 0.41.

So with samples of this size you would only be able to detect
rather large differences between the true proportions in the
two cases!

Best wishes,
Ted.


```
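
The figures quoted in the message can be re-run in one go. The following is a minimal sketch, not part of the original post: the variable names `p_zero`, `k` and `p_values` are made up for illustration, and it assumes a current R installation, where `fisher.test` is available from the default `stats` package so that `library(ctest)` is not needed. It keeps sample A fixed at 102 negatives and 9 positives, as the message does, and varies the number of positives in sample B from 0 to 4.

```r
# Chance of seeing no positives in 10 draws if the true rate were 1/10
# (this is simply 0.9^10)
p_zero <- dbinom(0, size = 10, prob = 0.1)

# Two-sided Fisher exact p-values with sample A held at 102 negatives
# and 9 positives, and k positives (k = 0..4) among the 10 items of B.
# matrix() fills column by column, so column 1 is sample A and
# column 2 is sample B, matching the calls in the message.
k <- 0:4
p_values <- sapply(k, function(pos) {
  tab <- matrix(c(102, 9, 10 - pos, pos), nrow = 2)
  fisher.test(tab)$p.value
})
names(p_values) <- paste0("k=", k)

print(round(p_zero, 4))
print(signif(p_values, 4))
```

The printed p-values should line up with the ones listed in the message, with the first value below 0.05 appearing at 4 positives out of 10.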