[R] Stratified Bootstrap question
Tim Hesterberg
timh at insightful.com
Sat Apr 2 01:17:05 CEST 2005
Qian wrote:
>I talked with my advisor yesterday about how to do bootstrapping for my
>scenario: random clinic + random subject within clinic. She suggested that
>only clinic are independent units, so I can only resample clinic. But I
>think that since subjects are also independent within clinic, shall I
>resample subjects within clinic, which means I have two-stage resampling?
>Which one do you think makes sense?
This is a tough issue; I don't have a complete answer. I'd
appreciate input from other r-help readers.
If you randomly select clinics, then randomly select patients within
the clinics:
(1) by bootstrapping just clinics, you capture both sources of
variation -- the between-subject variation is incorporated in the
results for each clinic.
(2) by bootstrapping clinics, then subjects within clinics, you
end up double-counting the between-subject variation.
That argues for resampling just clinics.
By analogy, if you have multiple subjects, and multiple measurements
per subject, you should just resample subjects.
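As a minimal sketch of option (1), the following resamples whole clinics with replacement and keeps each drawn clinic's subjects intact. The data layout (a dict of clinic label to subject outcomes), the function name, and the example numbers are all made up for illustration, and the statistic is simply the pooled subject mean.

```python
import random
import statistics

def cluster_bootstrap_means(data, n_boot=2000, seed=0):
    """Resample whole clinics with replacement; subjects stay with their clinic.

    data: dict mapping a clinic label to a list of subject-level outcomes
    (a hypothetical layout, not taken from the original question).
    Returns a list of bootstrap replicates of the pooled subject mean.
    """
    rng = random.Random(seed)
    clinics = list(data)
    replicates = []
    for _ in range(n_boot):
        # Draw as many clinics as were observed, with replacement.
        drawn = rng.choices(clinics, k=len(clinics))
        # Every subject of a drawn clinic comes along unchanged.
        pooled = [y for c in drawn for y in data[c]]
        replicates.append(statistics.fmean(pooled))
    return replicates

# Made-up example data: 4 clinics with a few subjects each.
example = {
    "A": [5.1, 4.8, 5.6],
    "B": [6.2, 6.0, 5.9, 6.4],
    "C": [4.9, 5.2],
    "D": [5.8, 6.1, 5.5],
}
reps = cluster_bootstrap_means(example)
se_clinic_only = statistics.stdev(reps)  # bootstrap standard error
```

Because a clinic's subjects are never reshuffled, the between-subject variation enters only through the differences among the clinic-level results.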
However, I'm not comfortable with this if you have a small number of
clinics and relatively large numbers of patients in each clinic, and
you believe the between-clinic variation is small. Then it
seems better to resample both clinics and patients.
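A minimal sketch of resampling both levels, assuming the same hypothetical layout of a dict mapping clinic labels to lists of subject outcomes: clinics are drawn with replacement, and then subjects are drawn with replacement within each drawn clinic. The function name and the data are illustrative only.

```python
import random
import statistics

def two_stage_bootstrap_means(data, n_boot=2000, seed=0):
    """Resample clinics, then resample subjects within each drawn clinic.

    data: dict mapping a clinic label to a list of subject-level outcomes
    (hypothetical layout). Returns replicates of the pooled subject mean.
    """
    rng = random.Random(seed)
    clinics = list(data)
    replicates = []
    for _ in range(n_boot):
        # Stage 1: draw clinics with replacement.
        drawn = rng.choices(clinics, k=len(clinics))
        pooled = []
        for c in drawn:
            subjects = data[c]
            # Stage 2: draw that clinic's subjects with replacement,
            # keeping the clinic's original sample size.
            pooled.extend(rng.choices(subjects, k=len(subjects)))
        replicates.append(statistics.fmean(pooled))
    return replicates

# Made-up example data: 4 clinics with a few subjects each.
example = {
    "A": [5.1, 4.8, 5.6],
    "B": [6.2, 6.0, 5.9, 6.4],
    "C": [4.9, 5.2],
    "D": [5.8, 6.1, 5.5],
}
reps_two_stage = two_stage_bootstrap_means(example)
se_two_stage = statistics.stdev(reps_two_stage)
```

A clinic drawn twice gets two independent subject resamples here, which is one of several possible conventions for the second stage.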
I'm leery about resampling just clinics if there are a small number
of clinics. Bootstrapping isn't particularly effective for small
samples -- it is subject to skewness in small samples, and it
underestimates variances (its advantages over classical methods
really show up with medium-size samples).
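For the simplest case, the sample mean, the size of the underestimate can be computed exactly: the ideal bootstrap variance of the mean uses divisor n where the usual unbiased estimate uses n - 1, so it is too small by a factor of (n - 1)/n. The sketch below checks this on made-up numbers; the function name and data are illustrative, and other statistics will not follow this exact factor.

```python
import statistics

def bootstrap_vs_classical_variance(x):
    """Compare two estimates of the variance of the sample mean of x."""
    n = len(x)
    # Ideal (infinitely many resamples) bootstrap: the empirical
    # distribution has variance with divisor n.
    boot_var = statistics.pvariance(x) / n
    # Usual unbiased estimate: sample variance with divisor n - 1.
    classical_var = statistics.variance(x) / n
    return boot_var, classical_var

x = [5.3, 6.1, 4.9, 5.8, 5.5]  # made-up values, e.g. five clinic means
boot_var, classical_var = bootstrap_vs_classical_variance(x)
ratio = boot_var / classical_var  # equals (n - 1) / n, here 4/5
```

With 5 clinics the bootstrap variance is 20% too small; with 50 it is only 2% too small, which is why the problem matters mainly for small samples.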
There are remedies for the underestimated variance; see:
Hesterberg, Tim C. (2004), "Unbiasing the Bootstrap-Bootknife Sampling
vs. Smoothing", Proceedings of the Section on Statistics and the
Environment, American Statistical Association, 2924-2930
www.insightful.com/Hesterberg/articles/JSM04-bootknife.pdf
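A rough sketch of bootknife sampling as commonly described: leave out one observation at random, then draw a bootstrap sample of size n from the remaining n - 1. This is an illustration under that assumption, not code from the paper; the function name and data are made up, and the paper should be consulted for details and for the smoothing alternative.

```python
import random
import statistics

def bootknife_means(x, n_boot=2000, seed=0):
    """Bootknife replicates of the sample mean of x (needs len(x) >= 2)."""
    rng = random.Random(seed)
    n = len(x)
    replicates = []
    for _ in range(n_boot):
        # Jackknife step: omit one observation chosen at random.
        omit = rng.randrange(n)
        kept = x[:omit] + x[omit + 1:]
        # Bootstrap step: draw n values with replacement from the n - 1 kept.
        sample = rng.choices(kept, k=n)
        replicates.append(statistics.fmean(sample))
    return replicates

x = [5.3, 6.1, 4.9, 5.8, 5.5]  # made-up values, e.g. five clinic means
reps_bootknife = bootknife_means(x)
se_bootknife = statistics.stdev(reps_bootknife)
```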
Tim Hesterberg
========================================================
| Tim Hesterberg Research Scientist |
| timh at insightful.com Insightful Corp. |
| (206)802-2319 1700 Westlake Ave. N, Suite 500 |
| (206)283-8691 (fax) Seattle, WA 98109-3044, U.S.A. |
| www.insightful.com/Hesterberg |
========================================================
Download the S+Resample library from www.insightful.com/downloads/libraries