[R] Bootstrap inference for the sample median?

Emmanuel Charpentier charpent at bacbuc.dyndns.org
Sun Aug 30 18:24:08 CEST 2009


On Sunday 30 August 2009 at 18:43 +0530, Ajay Shah wrote:
> Folks,
> 
> I have this code fragment:
> 
>   library(boot)
>   set.seed(1001)
>   x <- c(0.79211363702017, 0.940536712079832, 0.859757602692931, 0.82529998629531, 
>          0.973451006822, 0.92378802164835, 0.996679563355802,
>          0.943347739494445, 0.992873542980045, 0.870624707845108, 0.935917364493788)
>   range(x)
>   # from 0.79 to 0.996
> 
>   e <- function(x,d) {
>     median(x[d])
>   }
> 
>   b <- boot(x, e, R=1000)
>   boot.ci(b)
> 
> The 95% confidence interval, as seen with `Normal' and `Basic'
> calculations, has an upper bound of 1.0028 and 1.0121.
> 
> How is this possible? If I sample with replacement from values which
> are all lower than 1, then any sample median of these bootstrap
> samples should be lower than 1. The upper cutoff of the 95% confidence
> interval should also be below 1.

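Nope. The "Normal" interval fits a normal approximation to the bootstrap
distribution of your statistic, and the "Basic" interval reflects the
bootstrap quantiles around the observed median, so neither is constrained
to the range of the data. Both work well only when that distribution is
close to normal (or at least symmetric), and the median of such a small
sample is quite far from that (hint: it is a discrete distribution with
only a few possible values, at most as many values as there are
observations in the sample. Exercise: derive the distribution of
median(x)...).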
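A minimal sketch of why these two intervals can overshoot the data, using
plain sample() instead of boot(). The formulas are the usual textbook forms
of the normal, basic and percentile bootstrap intervals; boot.ci() may
differ in details such as quantile interpolation, so the numbers are
illustrative and not a reproduction of its output. The names tstar,
ci_normal, ci_basic and ci_percentile are mine.

```r
set.seed(1001)
x <- c(0.79211363702017, 0.940536712079832, 0.859757602692931, 0.82529998629531,
       0.973451006822, 0.92378802164835, 0.996679563355802,
       0.943347739494445, 0.992873542980045, 0.870624707845108, 0.935917364493788)

t0    <- median(x)                                            # observed statistic
tstar <- replicate(1000, median(sample(x, replace = TRUE)))   # bootstrap replicates

# Normal interval: bias-corrected estimate +/- z * bootstrap standard error.
# Nothing in this formula knows about the range of the data.
bias      <- mean(tstar) - t0
se        <- sd(tstar)
ci_normal <- (t0 - bias) + c(-1, 1) * qnorm(0.975) * se

# Basic interval: reflect the bootstrap quantiles around the observed statistic.
# A low bootstrap quantile becomes a high upper bound, possibly above max(x).
q        <- quantile(tstar, c(0.025, 0.975), names = FALSE)
ci_basic <- c(2 * t0 - q[2], 2 * t0 - q[1])

# Percentile interval: the bootstrap quantiles themselves, so it stays
# inside the range of the bootstrap replicates (and hence of the data).
ci_percentile <- q

print(rbind(normal = ci_normal, basic = ci_basic, percentile = ci_percentile))
print(max(x))
```

With a left-skewed bootstrap distribution, as here, the reflection in the
basic interval turns the long left tail into a long right tail, which is
where the upper bound above 1 comes from.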

To convince yourself, look at the histogram of the bootstrap
distribution of median(x). Contrast with the bootstrap distribution of
mean(x). Meditate. Conclude...
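One way to do this comparison, sketched with plain sample() instead of
boot(). The exact probabilities are my own working of the exercise and
assume an odd sample size with distinct values, as in the data posted; the
names med_star, mean_star, cdf and pmf are mine.

```r
set.seed(1001)
x <- c(0.79211363702017, 0.940536712079832, 0.859757602692931, 0.82529998629531,
       0.973451006822, 0.92378802164835, 0.996679563355802,
       0.943347739494445, 0.992873542980045, 0.870624707845108, 0.935917364493788)
n <- length(x)
R <- 1000

med_star  <- replicate(R, median(sample(x, replace = TRUE)))
mean_star <- replicate(R, mean(sample(x, replace = TRUE)))

# The bootstrap median can only be one of the observed values (n is odd),
# while the bootstrap mean takes many distinct values.
print(length(unique(med_star)))
print(length(unique(mean_star)))

# Exact bootstrap distribution of the median for odd n:
# the resample median is <= x_(k) iff at least (n + 1) / 2 of the n draws
# are <= x_(k), and each draw is <= x_(k) with probability k / n.
m   <- (n + 1) / 2
cdf <- pbinom(m - 1, n, (1:n) / n, lower.tail = FALSE)
pmf <- diff(c(0, cdf))

# Compare exact probabilities with the simulated frequencies.
sim <- as.vector(table(factor(med_star, levels = sort(x)))) / R
print(round(cbind(value = sort(x), exact = pmf, simulated = sim), 3))

# Side-by-side histograms: a few spikes versus a smooth bell-like shape.
op <- par(mfrow = c(1, 2))
hist(med_star,  breaks = 30, main = "Bootstrap medians", xlab = "median(x*)")
hist(mean_star, breaks = 30, main = "Bootstrap means",   xlab = "mean(x*)")
par(op)
```

The table shows the probability mass sitting on a handful of sample values,
which is why a normal approximation to the bootstrap median is a poor fit
here.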

HTH,

					Emmanuel Charpentier



