[R] Stratified Bootstrap question

Tim Hesterberg timh at insightful.com
Sat Apr 2 01:17:05 CEST 2005


Qian wrote:
>I talked with my advisor yesterday about how to do bootstrapping for my
>scenario: random clinic + random subject within clinic. She suggested that
>only clinic are independent units, so I can only resample clinic. But I
>think that since subjects are also independent within clinic, shall I
>resample subjects within clinic, which means I have two-stage resampling?
>Which one do you think makes sense?

This is a tough issue; I don't have a complete answer.  I'd
appreciate input from other r-help readers.

If you randomly select clinics, then randomly select patients within
the clinics:
  (1) by bootstrapping just clinics, you capture both sources of
  variation -- the between-subject variation is incorporated in the
  results for each clinic.
  
  (2) by bootstrapping clinics, then subjects within clinics, you
  end up double-counting the between-subject variation.
That argues for resampling just clinics.
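
As a rough illustration of scheme (1), here is a minimal Python sketch
that resamples whole clinics with replacement and recomputes the overall
patient mean each time.  The clinic names, the measurements, the choice
of statistic, and the function name cluster_bootstrap_means are all made
up for the example; they are not from Qian's data.

```python
import random
import statistics

def cluster_bootstrap_means(clinics, n_boot=2000, seed=0):
    """Resample whole clinics with replacement and return the bootstrap
    replicates of the pooled patient mean."""
    rng = random.Random(seed)
    names = list(clinics)
    reps = []
    for _ in range(n_boot):
        # Draw as many clinics as were observed, with replacement.
        drawn = rng.choices(names, k=len(names))
        # Each drawn clinic brings along all of its own patients unchanged,
        # so within-clinic variation is carried inside each clinic's data.
        pooled = [y for name in drawn for y in clinics[name]]
        reps.append(statistics.fmean(pooled))
    return reps

# Hypothetical data: one list of patient measurements per clinic.
clinics = {
    "A": [5.1, 4.8, 5.6, 5.0],
    "B": [6.2, 6.0, 6.5],
    "C": [4.1, 4.4, 3.9, 4.3, 4.0],
    "D": [5.5, 5.9, 5.2],
}

reps = cluster_bootstrap_means(clinics)
se_clinics_only = statistics.stdev(reps)  # bootstrap standard error
print(se_clinics_only)
```

With only four clinics the resulting standard error is itself quite
noisy, so treat the number as an illustration of the mechanics only.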

By analogy, if you have multiple subjects, and multiple measurements
per subject, you should just resample subjects.

However, I'm not comfortable with this if you have a small number of
clinics and relatively large numbers of patients in each clinic, and
you think that the between-clinic variation should be small.  Then it
seems better to resample both clinics and patients.
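
For comparison, resampling both clinics and patients could be sketched
as follows.  Again this is only a sketch under assumptions: the data,
the statistic (the pooled mean), and the function name
two_stage_bootstrap_means are hypothetical.

```python
import random
import statistics

def two_stage_bootstrap_means(clinics, n_boot=2000, seed=0):
    """Resample clinics with replacement, then resample patients with
    replacement inside each drawn clinic; return replicate means."""
    rng = random.Random(seed)
    names = list(clinics)
    reps = []
    for _ in range(n_boot):
        pooled = []
        # Stage 1: draw clinics with replacement.
        for name in rng.choices(names, k=len(names)):
            patients = clinics[name]
            # Stage 2: draw that clinic's patients with replacement,
            # keeping the clinic's original sample size.
            pooled.extend(rng.choices(patients, k=len(patients)))
        reps.append(statistics.fmean(pooled))
    return reps

# Hypothetical data: one list of patient measurements per clinic.
clinics = {
    "A": [5.1, 4.8, 5.6, 5.0],
    "B": [6.2, 6.0, 6.5],
    "C": [4.1, 4.4, 3.9, 4.3, 4.0],
    "D": [5.5, 5.9, 5.2],
}

reps = two_stage_bootstrap_means(clinics)
se_two_stage = statistics.stdev(reps)  # bootstrap standard error
print(se_two_stage)
```

A clinic that is drawn twice gets two independent patient resamples
here; that is one possible design choice, not the only one.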

I'm leery about resampling just clinics if there are only a small
number of clinics.  Bootstrapping isn't particularly effective for
small samples -- it is subject to skewness, and it underestimates
variances (its advantages over classical methods really show up with
medium-size samples).
There are remedies for the underestimated variance; see
	Hesterberg, Tim C. (2004), "Unbiasing the Bootstrap-Bootknife Sampling
	vs. Smoothing", Proceedings of the Section on Statistics and the
	Environment, American Statistical Association, 2924-2930
	www.insightful.com/Hesterberg/articles/JSM04-bootknife.pdf
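
The size of the variance underestimate is easy to see for a sample
mean.  The ideal (infinite-resample) bootstrap variance of the mean
uses divisor n where the usual estimate uses n-1, so it is too small
by a factor of (n-1)/n.  A minimal Python sketch, with made-up data and
hypothetical function names:

```python
import statistics

def ideal_bootstrap_var_of_mean(x):
    """Variance of the mean under resampling from the empirical
    distribution: (1/n) * sum((x - xbar)^2) / n."""
    return statistics.pvariance(x) / len(x)

def classical_var_of_mean(x):
    """Usual unbiased estimate: s^2 / n, with divisor n-1 inside s^2."""
    return statistics.variance(x) / len(x)

x = [4.2, 5.1, 3.8, 6.0, 5.5]  # hypothetical sample, n = 5

ratio = ideal_bootstrap_var_of_mean(x) / classical_var_of_mean(x)
print(ratio)  # equals (n-1)/n, i.e. 0.8 here up to rounding
```

With 5 clinics the bootstrap variance is about 20% too small; with 50
it is only 2% too small, which is why the problem fades for medium-size
samples.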

Tim Hesterberg

========================================================
| Tim Hesterberg       Research Scientist              |
| timh at insightful.com  Insightful Corp.                |
| (206)802-2319        1700 Westlake Ave. N, Suite 500 |
| (206)283-8691 (fax)  Seattle, WA 98109-3044, U.S.A.  |
|                      www.insightful.com/Hesterberg   |
========================================================
Download the S+Resample library from www.insightful.com/downloads/libraries



