[R] Power calculation for survival analysis

Marc Schwartz marc_schwartz at me.com
Wed Sep 21 21:14:01 CEST 2011

On Sep 21, 2011, at 12:37 PM, Duke wrote:

> Thanks for your response, Marc.  HG and LG are high-grade/low-grade tumors. 
> The data has not been collected yet, but will be soon.  It's all archived
> data that will be pulled from computer records.  The IRB wants some mention
> of power or sample size, but doing it for this scenario has been a bit of a
> head scratcher for me.
> If it's not really feasible to do a power analysis for this scenario, I can
> work to explain why to the IRB.
> D 

Hi Derek,

My guess is that the IRB wants to have some CYA in terms of the justification for the study. In a design such as this, safety is not the typical concern, since the patients have already been treated and nothing that you are going to do will affect that. More likely, the concerns are privacy (e.g. HIPAA) and ethics: you will be accessing the patients' medical records, and the IRB wants a reasonable level of assurance that you will be able to offer some scientific value at the end of the day as a consequence of that access.

I don't know the particulars of your IRB, so it may be of value to approach others at Duke who have experience in dealing with them in the setting of a retrospective chart review. You may be able to get a sense for what they are open to in terms of justification and where they may or may not be amenable to a discussion of the pros/cons of this particular approach.

It is not uncommon, in my experience, to simply indicate that n = 500 is a "convenience sample", based upon some assessment of time/budget limitations and some attempt to assess the number of patients with some common set of characteristics that are likely to be available within a reasonable time frame. In that setting, power as a discrete quantity is not quoted and you don't have an explicit hypothesis to be tested. You "get what you get" and within the limitations of the study design, can offer some insight into the differences in the two groups. I have seen the same approach even with prospective, non-randomized designs.

All of that being said, you can use Frank Harrell's cpower() function in the Hmisc package, if they put a gun to your head. The function itself is not overly complex to use. You just need to be aware of your assumptions and how they can impact the resultant power calculation.
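
As a rough sketch of what such a call might look like: every number below is a made-up placeholder and not something from your data, and treating the HG group as the "control" arm and the LG group as the "intervention" arm is my own assumption for illustration, since cpower() is set up for a two-arm comparison under exponential survival. Here I assume 40% 5-year mortality in HG, a 25% relative reduction in LG, 3 years over which patients entered, at least 2 years of follow-up, and equal group sizes.

```r
library(Hmisc)

# All inputs are hypothetical placeholders; substitute values you can defend.
pw <- cpower(tref    = 5,     # time point (years) at which mortality is specified
             n       = 500,   # total sample size, split equally by default
             mc      = 0.40,  # assumed 5-year mortality in the HG ("control") group, as a fraction
             r       = 25,    # assumed % reduction in mortality for the LG group
             accrual = 3,     # assumed accrual period (years)
             tmin    = 2,     # assumed minimum follow-up (years)
             pr      = FALSE) # suppress the printed summary
pw
```

Re-running this over a range of values for mc and r would show how sensitive the resulting power is to those assumptions, which is really the point to get across to the IRB.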
