[R] boot() versus loop, and statistics option

Sun Feb 6 12:43:38 CET 2011

Hello R users

I am quite new to bootstrapping. Given some data x,
----
R: set.seed(1234)
R: x <- runif(300)
----
I want to bootstrap simple statistics: the mean and the 0.025 and
0.975 quantiles. Currently, I run a loop
----
R: res <- as.data.frame(matrix(ncol = 3, dimnames = list(NULL,
...    c("M", "Lo", "Hi"))))
R: for (i in 1:100) {
...    y <- x[sample(seq_along(x), length(x), replace = TRUE)]
...    res[i, ] <- c(mean(y), quantile(y, c(0.025, 0.975)))
...}
----
and then apply mean()
----
R: apply(res, 2, mean)
          M         Lo         Hi
0.49377715 0.03089873 0.98120235
----
to get the statistics of interest.
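
For comparison, the same resampling can be written without filling a data frame row by row. This is only a minimal sketch using base R's replicate(); the helper name `one_rep` is illustrative and not part of the original code, and the random draws are not guaranteed to match the loop above.

```r
set.seed(1234)
x <- runif(300)

# One bootstrap replicate: resample x with replacement, then summarise
one_rep <- function(x) {
  y <- sample(x, replace = TRUE)
  c(mean(y), quantile(y, c(0.025, 0.975)))
}

# replicate() returns a 3 x 100 matrix; transpose to one row per replicate
res <- t(replicate(100, one_rep(x)))
colnames(res) <- c("M", "Lo", "Hi")

colMeans(res)  # average of each statistic over the 100 replicates
```

Working on a matrix avoids the repeated row assignment into a data frame, which tends to be slow for larger numbers of replicates.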

I found the package 'boot' with the function of the same name. I 
tried to replicate my tiny simulation using this code:
----
R: library(boot)
R: myfun <- function(x) {
...    return(c(mean(x), quantile(x, c(0.025, 0.975))))
...}
R: boot(x, myfun, 100, sim = "parametric")
PARAMETRIC BOOTSTRAP

Call:
boot(data = x, statistic = myfun, R = 100, sim = "parametric")

Bootstrap Statistics :
       original  bias    std. error
t1* 0.48925194       0           0
t2* 0.02806586       0           0
t3* 0.98335435       0           0
----
The outcome looks "quite" similar to what my loop returned, so
that would be fine. Yet, there are three things I don't understand:

(1) I have to use the option 'sim="parametric"'. If I don't use
this option, the function supplied via the 'statistic' argument
requires a second argument which, according to '?boot', "will be
a vector of indices, frequencies or weights which define the
bootstrap sample." What is that? Or is my simulation simply
parametric? Why?

(2) What are the advantages and/or disadvantages of 'boot()' over 
my loop?

(3) Can I in principle use 'boot()' to return all of the 100 
different data vectors used in the loop, or does 'boot()' by 
default return already-calculated statistics?
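
A minimal sketch of how the three points might be explored, assuming the default nonparametric setting (sim = "ordinary") is what the loop corresponds to. The names `myfun2`, `idx` and `y1` are illustrative and not from the code above. One relevant detail: with sim = "parametric" and no 'ran.gen' supplied, the default 'ran.gen' of boot() returns the data unchanged, so every replicate equals the original statistic, which would explain the zero bias and standard error in the output above.

```r
library(boot)

set.seed(1234)
x <- runif(300)

# (1) With the default sim = "ordinary", boot() calls statistic(data, indices);
#     indexing the data with `idx` yields one bootstrap sample.
myfun2 <- function(data, idx) {
  y <- data[idx]
  c(mean(y), quantile(y, c(0.025, 0.975)))
}
b <- boot(x, myfun2, R = 100)

b$t0           # statistics computed on the original data
colMeans(b$t)  # averages over the 100 replicates, comparable to the loop

# (2) One convenience over a hand-written loop: helpers such as boot.ci()
ci <- boot.ci(b, type = "perc", index = 1)  # percentile interval for the mean

# (3) boot() stores the replicate statistics (b$t), not the resampled data,
#     but boot.array() regenerates the resampling indices from the seed
#     saved in `b`.
idx <- boot.array(b, indices = TRUE)  # 100 x 300 matrix of indices
y1 <- x[idx[1, ]]                     # data vector used for the first replicate
```

The resampled vectors are thus recoverable row by row from `idx`, while the object returned by boot() itself only carries the computed statistics.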

Thanks for hints and help, *S*

-- 
Sascha Vieweg, saschaview at gmail.com