[R] a Bootstrap understanding problem

Mon Jan 21 14:23:43 CET 2002

In article <ifado.list.r.help/Pine.LNX.4.31.0201211014160.15847-100000 at gannet.stats>,
Prof Brian Ripley  <ripley at stats.ox.ac.uk> wrote:
>On Mon, 21 Jan 2002, Wilhelm B. Kloke wrote:
>
>It certainly does not conform.  The `bootstrap' package (its original S
>name was bootstrap.funs) is old and I suggest should not now be used, but
>it does have a function for BCa which you could find by looking in its
>INDEX.  The example is even

Which, BTW, gave results close to those I was hoping for.

>
>     # For example, find bca limits for
>     # the correlation coefficient from a set of 15 data pairs:
>
>but the bootstrap set is tiny (see below).

As the data set is really tiny, I can give it here:
      V4    V5
1  -0.02 -0.07
2   0.04  0.02
3  -0.02  0.04
4   0.08 -0.02
5  -0.01  0.04
6   0.08  0.07
7   0.03  0.04
8   0.08  0.01
9   0.03  0.03
10 -0.12 -0.03
11  0.06  0.04
12 -0.21 -0.08
13  0.00 -0.01

My boot application gives:
: > mehnert.boot
: 
: ORDINARY NONPARAMETRIC BOOTSTRAP
: 
: 
: Call:
: boot(data = mehnert, statistic = function(x, i) {
:     cor(x[i, 1], x[i, 2])
: }, R = 1000)
: 
: 
: Bootstrap Statistics :
:      original      bias    std. error
: t1* 0.6623205 -0.03803166   0.2197617
: > 
and
: > boot.ci(mehnert.boot)
: BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
: Based on 1000 bootstrap replicates
: 
: CALL : 
: boot.ci(boot.out = mehnert.boot)
: 
: Intervals : 
: Level      Normal              Basic         
: 95%   ( 0.2696,  1.1311 )   ( 0.4220,  1.2838 )  
: 
: Level     Percentile            BCa          
: 95%   ( 0.0408,  0.9027 )   ( 0.0322,  0.8962 )  
: Calculations and Intervals on Original Scale
: Warning message: 
: Bootstrap variances needed for studentized intervals in: boot.ci(mehnert.boot) 

My question arose because in Mehnert's writeup I found a
BCa CI from 0.16 to 0.93 at the 5% level, which would give somewhat more
confidence that the correlation is positive.
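
For anyone who wants to rerun this, here is a minimal sketch of the
analysis above. It assumes the 13 pairs as printed (two decimals) are the
complete data. The names cor.stat and mehnert.ci and the seed are my own
additions for illustration, and the interval limits will differ somewhat
from the ones shown because the resampling is random.

```r
library(boot)  # recommended package that ships with R

# The 13 data pairs listed above
mehnert <- data.frame(
  V4 = c(-0.02, 0.04, -0.02, 0.08, -0.01, 0.08, 0.03, 0.08, 0.03,
         -0.12, 0.06, -0.21, 0.00),
  V5 = c(-0.07, 0.02, 0.04, -0.02, 0.04, 0.07, 0.04, 0.01, 0.03,
         -0.03, 0.04, -0.08, -0.01)
)

# Statistic: correlation of the two columns over the resampled rows i
cor.stat <- function(x, i) cor(x[i, 1], x[i, 2])

set.seed(1)  # arbitrary seed (an assumption, only for reproducibility)
mehnert.boot <- boot(data = mehnert, statistic = cor.stat, R = 1000)

# Ask only for interval types that need no variance estimate; this
# avoids the warning about studentized intervals seen above.
mehnert.ci <- boot.ci(mehnert.boot, conf = 0.95,
                      type = c("norm", "basic", "perc", "bca"))
print(mehnert.ci)
```

Naming the types explicitly is just a convenience; the default call
computes the same four intervals and then warns that the fifth
(studentized) one is not available.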

>No, and in e.g. the MASS examples they give similar results.

Indeed. I saw that.

>BCa needs large, often very large (tens of thousands), bootstrap sets.
>Are you sure your colleague used a large enough set?  A quick bit of

We cannot easily collect more observations. These data come from 13
subjects. For the bootstrap simulation we used 1000 replicates, both
in the original study and in my replication trial.
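
One rough way to judge whether 1000 replicates are enough is to repeat the
whole bootstrap several times and see how far the BCa limits move. The
sketch below does that; the helper bca.limits, the 20 repetitions and the
seed are arbitrary choices of mine, not anything from the original study.

```r
library(boot)

mehnert <- data.frame(
  V4 = c(-0.02, 0.04, -0.02, 0.08, -0.01, 0.08, 0.03, 0.08, 0.03,
         -0.12, 0.06, -0.21, 0.00),
  V5 = c(-0.07, 0.02, 0.04, -0.02, 0.04, 0.07, 0.04, 0.01, 0.03,
         -0.03, 0.04, -0.08, -0.01)
)
cor.stat <- function(x, i) cor(x[i, 1], x[i, 2])

# Hypothetical helper: lower and upper 95% BCa limits from one
# bootstrap run with R replicates
bca.limits <- function(R) {
  b <- boot(mehnert, cor.stat, R = R)
  boot.ci(b, conf = 0.95, type = "bca")$bca[4:5]
}

set.seed(2)  # arbitrary seed
runs <- t(replicate(20, bca.limits(1000)))
colnames(runs) <- c("lower", "upper")

# Spread of each limit over the 20 independent runs
print(apply(runs, 2, range))
print(apply(runs, 2, sd))
```

If the limits wander a lot between runs, the same comparison can be
repeated with a larger R to see whether they settle down.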

>replication suggests that the BCa limits are very variable for your
>problem. I find BCa pretty unreliable, and for correlations using Fisher's
>tanh transformation is normally enough to make all sensible confidence
>interval procedures agree for all practical purposes.
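
For reference, a sketch of what the tanh transformation looks like in
practice. The first part is the textbook large-sample interval on the
z = atanh(r) scale, using only the estimate and sample size reported above.
The second part passes the transformation to boot.ci through its h, hdot
and hinv arguments, so that the normal and basic intervals are computed on
the z scale and mapped back. The variable names and the seed are my own
choices, and this is an illustration, not necessarily the exact
calculation meant in the quoted advice.

```r
library(boot)

# Part 1: classical Fisher interval from r and n alone
r <- 0.6623205                 # sample correlation reported above
n <- 13                        # number of data pairs
z  <- atanh(r)                 # Fisher's z
se <- 1 / sqrt(n - 3)          # large-sample standard error of z
fisher.ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)
print(fisher.ci)

# Part 2: bootstrap intervals computed on the transformed scale
mehnert <- data.frame(
  V4 = c(-0.02, 0.04, -0.02, 0.08, -0.01, 0.08, 0.03, 0.08, 0.03,
         -0.12, 0.06, -0.21, 0.00),
  V5 = c(-0.07, 0.02, 0.04, -0.02, 0.04, 0.07, 0.04, 0.01, 0.03,
         -0.03, 0.04, -0.08, -0.01)
)
cor.stat <- function(x, i) cor(x[i, 1], x[i, 2])

set.seed(3)  # arbitrary seed
b <- boot(mehnert, cor.stat, R = 1000)
z.ci <- boot.ci(b, conf = 0.95, type = c("norm", "basic"),
                h = atanh,                        # transform
                hdot = function(t) 1 / (1 - t^2), # its derivative
                hinv = tanh)                      # back-transform
print(z.ci)
```

Because the limits are mapped back with tanh, they cannot fall outside
(-1, 1), unlike the untransformed Normal and Basic limits above, whose
upper ends exceed 1.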
>
>Finally, what useful conclusions can be drawn from a confidence interval
>for the correlation of 13 data pairs?

Of course, this is not a bad question. But aren't bootstrap methods
designed for application to problematic datasets?
-- 
Dipl.-Math. Wilhelm Bernhard Kloke
Institut fuer Arbeitsphysiologie an der Universitaet Dortmund
Ardeystrasse 67, D-44139 Dortmund, Tel. 0231-1084-257