[R] boot() versus loop, and statistics option
Sascha Vieweg
saschaview at gmail.com
Sun Feb 6 12:43:38 CET 2011
Hello R users
I am quite new to bootstrapping. Now, having some data x,
----
R: set.seed(1234)
R: x <- runif(300)
----
I want to bootstrap simple statistics, mean and quantiles (.025,
.975). Currently, I run a loop
----
R: res <- as.data.frame(matrix(ncol = 3, dimnames = list(NULL,
... c("M", "Lo", "Hi"))))
R: for (i in 1:100) {
... y <- x[sample(1:length(x), length(x), replace = TRUE)]
... res[i, ] <- c(mean(y), quantile(y, c(0.025, 0.975)))
...}
----
and then apply mean()
----
R: apply(res, 2, mean)
         M         Lo         Hi
0.49377715 0.03089873 0.98120235
----
to get the estimates of interest.
I found the package 'boot' with the function of the same name. I
tried to replicate my tiny simulation using this code:
----
R: library(boot)
R: myfun <- function(x) {
... return(c(mean(x), quantile(x, c(0.025, 0.975))))
...}
R: boot(x, myfun, 100, sim = "parametric")
PARAMETRIC BOOTSTRAP
Call:
boot(data = x, statistic = myfun, R = 100, sim = "parametric")
Bootstrap Statistics :
      original  bias    std. error
t1* 0.48925194     0           0
t2* 0.02806586     0           0
t3* 0.98335435     0           0
----
The outcome looks "quite" similar to what my loop returned, so
that would be fine. Yet there are three things I don't understand:
(1) I have to use the option 'sim="parametric"'. If I don't use
this option the function (provided via the statistic option)
requires a second argument, which -- according to '?boot' "will be
a vector of indices, frequencies or weights which define the
bootstrap sample." What is that? Or is my simulation simply
parametric? Why?
(2) What are the advantages and/or disadvantages of 'boot()' over
my loop?
(3) Can I in principle use 'boot()' to return all of the 100
different data vectors used in the loop, or does 'boot()' by
default return already-calculated statistics?
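For reference, here is a minimal sketch of the index-based form that points (1) and (3) ask about. It assumes the defaults sim = "ordinary" and stype = "i", under which boot() passes a vector of resampled indices as the second argument. The names myfun2, b, idx and y1 are made up for illustration. As a side note, with sim = "parametric" and no 'ran.gen' supplied, the default generator hands back the data unchanged, which would explain the zero bias and standard error above.

```r
library(boot)

set.seed(1234)
x <- runif(300)

# Hypothetical statistic: takes the data and a vector of indices.
# With sim = "ordinary" and stype = "i" (the defaults), boot() calls it
# as statistic(data, i), where i indexes one resample drawn with replacement.
myfun2 <- function(data, i) {
  y <- data[i]
  c(mean(y), quantile(y, c(0.025, 0.975)))
}

b <- boot(x, myfun2, R = 100)

dim(b$t)        # one row of statistics per resample, one column per statistic
colMeans(b$t)   # comparable to apply(res, 2, mean) from the loop

# boot() stores the statistics, not the resampled vectors, but the
# indices can be regenerated afterwards with boot.array().
idx <- boot.array(b, indices = TRUE)   # R x length(x) matrix of indices
y1 <- x[idx[1, ]]                      # first resampled data vector
```

If this reading is right, b$t plays the role of 'res' in the loop, and x[idx[k, ]] should give back the k-th resampled vector.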
Thanks for hints and help, *S*
--
Sascha Vieweg, saschaview at gmail.com