[R] Interpreting Results of Bootstrapping

Liaw, Andy andy_liaw at merck.com
Sun Jul 11 03:36:14 CEST 2004


Have you actually looked at plot(x1, x2)?  That ought to be quite
enlightening.

You have one data point:

    x1     x2 
25.240  6.744 

that's way out in the upper right.  Every bootstrap sample that includes that
point will give a high correlation, and every bootstrap sample that does not
include it will give a low (near zero) correlation.  That is where the two
peaks come from.  Now, the probability that a given point is included in a
bootstrap sample of size n is 1 - (1 - 1/n)^n, which is about 63.6% for your
n = 51 (and approaches 1 - 1/e, about 63.2%, for large n).  You can easily see
that in your results:

> mean(b$t>.5)
[1] 0.6385
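
If you want to check this reasoning more directly, something along the
following lines might help.  It is only a sketch on made-up data (50 noise
points plus one extreme point, not the data from the original post): it
computes the inclusion probability and then uses boot.array() to split the
replicates by whether the extreme point was drawn.  The names p_in, freq and
has_outlier are mine, purely for illustration.

```r
library(boot)

set.seed(1)

# Made-up data (NOT the poster's): 50 unrelated points plus one extreme point
# stored in the last row.
n  <- 51
x1 <- c(rnorm(n - 1), 25)
x2 <- c(rnorm(n - 1), 7)
d  <- cbind(x1, x2)

# Chance that a given row appears at least once in a resample of size n.
p_in <- 1 - (1 - 1/n)^n    # about 0.636 for n = 51

# Same statistic as in the original post.
my.correl <- function(d, i) cor(d[i, 1], d[i, 2])
b <- boot(d, my.correl, R = 2000)

# boot.array() returns, for each replicate, how often each row was drawn.
freq <- boot.array(b)
has_outlier <- freq[, n] > 0

# Fraction of replicates containing the extreme point; should be near p_in.
mean(has_outlier)

# Median correlation in each group: near zero without the point, high with it.
tapply(b$t[, 1], has_outlier, median)
```

With the real data you would replace the last column index n by the row that
holds the extreme point (row 23 in the posted vectors), and
hist(b$t[has_outlier]) versus hist(b$t[!has_outlier]) should then show one
peak each.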


Andy

> From: Y C Tao
> 
> I tried to bootstrap the correlation between two
> variables x1 and x2. The resulting distribution has
> two distinct peaks. How should I interpret it?
> 
> The original code is attached.
> 
> Y. C. Tao
> 
> ----------------
> 
> library(boot);
>  
> my.correl<-function(d, i) cor(d[i,1], d[i,2])
>  
> x1<-c(-2.612,-0.7859,-0.5229,-1.246,1.647,1.647,0.1811,-0.07097,
>   0.8711,0.4323,0.1721,2.143,4.33,0.5002,0.4015,-0.5225,2.538,
>   0.07959,-0.6645,4.521,-1.371,0.3327,25.24,-0.5417,2.094,0.6064,
>   -0.4476,-0.5891,-0.08879,-0.9487,-2.459e-05,-0.03887,0.2116,
>   -0.0625,1.555,0.2069,-0.2142,-0.807,-0.6499,2.384,-0.02063,1.179,
>   -0.0003586,-1.408,0.6928,0.689,0.1854,0.4351,0.5663,0.07171,
>   -0.07004);
>  
> x2<-c(0.08742,0.2555,-0.00337,0.03995,-1.208,-1.208,-0.001374,
>   -1.282,1.341,-0.9069,-0.2011,1.557,0.4517,-0.4376,0.4747,0.04965,
>   -0.1668,-0.6811,-0.7011,-1.457,0.04652,-1.117,6.744,-1.332,
>   0.1327,-0.1479,-2.303,0.1235,0.5916,0.05018,-0.7811,0.5869,
>   -0.02608,0.9594,-0.1392,0.4089,0.1468,-1.507,-0.6882,-0.1781,
>   0.5434,-0.4957,0.02557,-1.406,-0.5053,-0.7345,-1.314,0.3178,
>   -0.2108,0.4186,-0.03347);
>  
> b<-boot(cbind(x1, x2), my.correl, 2000)
> hist(b$t, breaks=50)
>



