[R] stats question

Thu Jul 31 14:48:51 CEST 2008

At the risk of oversimplifying the study design, this sounds like a two 
sample comparison of proportions, in which case power.prop.test() would 
be the function of interest.  This could also be done via Monte Carlo 
simulation, which would not be difficult to implement.

Note that I am also presuming that we are not in fact talking about the 
*rate* of events in the two groups, as there was no mention of the time 
to the events being a consideration.

I am going to make the further assumption that the phrase "significant 
difference" as used here means a target chi-square test p value of 
<=0.05, where the null is that the two proportions are equal and the 
alternative is that they are not equal.

That being the case, then:

 > power.prop.test(p1 = .06, p2 = .10, power = 0.8)

      Two-sample comparison of proportions power calculation

               n = 720.9169
              p1 = 0.06
              p2 = 0.1
       sig.level = 0.05
           power = 0.8
     alternative = two.sided

  NOTE: n is number in *each* group

Rounding up, that is 721 subjects in each group, or 1,442 in total.

See ?power.prop.test for more information.
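
For the Monte Carlo route mentioned at the top, a minimal sketch might
look like the following. The helper name sim_power(), the 2000
replicates, the seed, and the use of prop.test() without continuity
correction are all my own choices for illustration, not part of the
original question. It should land close to the 0.8 target when
n = 721 per group.

```r
# Monte Carlo estimate of power for a two-sample comparison of proportions.
# Assumptions (illustrative only): equal group sizes, two-sided test at
# sig_level, chi-square test without continuity correction.
sim_power <- function(n, p1, p2, n_sim = 2000, sig_level = 0.05) {
  # Simulate the number of events in each group, one draw per replicate
  x1 <- rbinom(n_sim, size = n, prob = p1)
  x2 <- rbinom(n_sim, size = n, prob = p2)

  # p value of the two-sample test of equal proportions for each replicate
  p_values <- mapply(function(a, b) {
    prop.test(c(a, b), c(n, n), correct = FALSE)$p.value
  }, x1, x2)

  # Power = share of replicates that reject the null
  # (an undefined p value is counted as a non-rejection)
  mean(!is.na(p_values) & p_values <= sig_level)
}

set.seed(1)
est <- sim_power(n = 721, p1 = 0.06, p2 = 0.10)
est
```

Changing n, p1 or p2 in the call shows how sensitive the power is to
the assumed 10% incidence in population B.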

If my assumptions are not correct, please post back with further 
information about the study design.

HTH,

Marc Schwartz

on 07/31/2008 02:11 AM Moshe Olshansky wrote:
> Hello Jason,
> 
> You are not specific enough. What do you mean by "significant
> difference"? Let's assume that indeed the incidence in A is 6% and in
> B is 10% and we are looking for Na and Nb such that with probability
> of at least 80% the mean of Nb sample from B will be at least, say,
> 0.03 (=3%) above the mean of Na sample from A.
> The solution is not unique. If Mb is the mean of the sample from B
> and Ma is the one from A, using the Normal approximation we get that
> Mb is approximately normal with mean 0.10 and variance 0.1*0.9/Nb and
> Ma is approximately normal with mean 0.06 and variance 0.06*0.94/Na,
> so Mb - Ma is approximately normal with mean 0.04 and variance
> 0.09/Nb + 0.0564/Na. So let V be the maximal variance for which the
> probability that a normal rv with mean 0.04 and variance V is above
> 0.03 equals 0.80 (finding such V is straightforward). Then you must
> choose Na and Nb which satisfy 0.09/Nb + 0.0564/Na <= V. One such
> choice is Nb = 2*0.09/V, Na = 2*0.0564/V.
> 
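
[Inline note, not part of Moshe's message: the calculation he describes
could be sketched in R roughly as below. The variable names are mine,
and the 0.03 threshold is his illustrative choice. Note that this
criterion is not the same thing as the power of a hypothesis test, so
the resulting sample sizes are not comparable to power.prop.test().]

```r
# Quantities taken from the quoted message
delta     <- 0.10 - 0.06   # mean of Mb - Ma
threshold <- 0.03          # required minimum observed difference
var_b     <- 0.10 * 0.90   # 0.09, numerator of the variance of Mb
var_a     <- 0.06 * 0.94   # 0.0564, numerator of the variance of Ma

# P(N(delta, V) > threshold) = 0.80
#   <=> (threshold - delta) / sqrt(V) = qnorm(0.20)
V <- ((threshold - delta) / qnorm(0.20))^2

# The suggested split: each group contributes V/2 to the total variance
Nb <- ceiling(2 * var_b / V)
Na <- ceiling(2 * var_a / V)

# Check that the probability condition holds at this V
check <- pnorm(threshold, mean = delta, sd = sqrt(V), lower.tail = FALSE)

c(V = V, Na = Na, Nb = Nb, check = check)
```
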
> As I said, this solution is only approximate and probably not
> optimal, so see what other people say.
> 
> Regards,
> 
> Moshe.
> 
> 
> --- On Thu, 31/7/08, Iasonas Lamprianou <lamprianou at yahoo.com> wrote:
> 
>> From: Iasonas Lamprianou <lamprianou at yahoo.com>
>> Subject: [R] stats question
>> To: r-help at r-project.org
>> Received: Thursday, 31 July, 2008, 2:46 PM
>> Dear friends, 
>> I am not sure that this is the right place to ask,  but
>> please feel free to suggest an alternative discussion
>> group.
>> My question is that I want to do a comparative study in
>> order to compare the rate of incidence in two populations. I
>> know that a pilot study was conducted a few weeks ago and
>> found 8/140 (around 6%) incidence in population A.
>> Population B was not sampled. Assuming this is (about) the
>> right proportion in the Population A what is the sample
>> size I need for population A and B in the main study, in
>> order to have power of 80% to identify
>> significant differences? I would expect the incidence in
>> population B to be around 10% compared to the 6% of the
>> Population A.
>> Any suggestions? How can I do this in R?
>>  
>> Jason
>>