[R] boot() versus loop, and statistics option
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sun Feb 6 14:09:40 CET 2011
Package boot is support software for a book: have you consulted it?
It answers all your questions, and has copious examples.
On Sun, 6 Feb 2011, Sascha Vieweg wrote:
> Hello R users
>
> I am quite new to bootstrapping. Now, having some data x,
> ----
> R: set.seed(1234)
> R: x <- runif(300)
> ----
> I want to bootstrap simple statistics, mean and quantiles (.025, .975).
> Currently, I run a loop
> ----
> R: res <- as.data.frame(matrix(ncol = 3, dimnames = list(NULL,
> ... c("M", "Lo", "Hi"))))
> R: for (i in 1:100) {
> ... y <- x[sample(1:length(x), length(x), replace = TRUE)]
> ... res[i, ] <- c(mean(y), quantile(y, c(0.025, 0.975)))
> ...}
> ----
> and then apply mean()
> ----
> R: apply(res, 2, mean)
>          M         Lo         Hi
> 0.49377715 0.03089873 0.98120235
> ----
> to get the indices of interest.
>
> I found the package 'boot' with the function of the same name. I tried to
> replicate my tiny simulation using this code:
> ----
> R: library(boot)
> R: myfun <- function(x) {
> ... return(c(mean(x), quantile(x, c(0.025, 0.975))))
> ...}
> R: boot(x, myfun, 100, sim = "parametric")
> PARAMETRIC BOOTSTRAP
>
> Call:
> boot(data = x, statistic = myfun, R = 100, sim = "parametric")
>
> Bootstrap Statistics :
>       original  bias    std. error
> t1* 0.48925194       0           0
> t2* 0.02806586       0           0
> t3* 0.98335435       0           0
> ----
> The outcome looks "quite" similar to what my loop returned, so that would be
> fine. Yet, there are three things I don't understand:
>
> (1) I have to use the option 'sim="parametric"'. If I don't use this option
> the function (provided via the statistic option) requires a second argument,
> which -- according to '?boot' "will be a vector of indices, frequencies or
> weights which define the bootstrap sample." What is that? Or is my simulation
> simply parametric? Why?
>
> (2) What are the advantages and/or disadvantages of 'boot()' over my loop?
>
> (3) Can I in principle use 'boot()' to return all of the 100 different data
> vectors used in the loop, or does 'boot()' by default return
> already-calculated statistics?
>
> Thanks for hints and help, *S*
>
>
> --
> Sascha Vieweg, saschaview at gmail.com
>
>
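For readers of the archive, here is a minimal sketch of how questions (1) and (3) might be approached. It is an illustration under stated assumptions, not the book's worked example. The names myfun2, b, idx and resamples are made up for this sketch. With the default sim = "ordinary", boot() calls the statistic with the data and a vector of indices, and the function is expected to apply the indices itself. With sim = "parametric" and no ran.gen supplied, the default generator returns the data unchanged, which would explain the zero bias and standard error in the output quoted above.

```r
library(boot)

set.seed(1234)
x <- runif(300)

# Hypothetical statistic for the nonparametric bootstrap: the second
# argument holds the indices of the observations in the current resample.
myfun2 <- function(data, indices) {
  y <- data[indices]
  c(mean(y), quantile(y, c(0.025, 0.975)))
}

# sim = "ordinary" is the default, so no sim argument is needed.
b <- boot(x, myfun2, R = 100)

b$t0           # the three statistics on the original data
colMeans(b$t)  # average of each statistic over the 100 replicates

# Question (3): boot() stores the replicated statistics in b$t, not the
# resampled data, but boot.array() can reconstruct the indices used.
idx <- boot.array(b, indices = TRUE)           # 100 x 300 index matrix
resamples <- matrix(x[idx], nrow = nrow(idx))  # row r is the r-th resample
```

Each row of resamples should then be one of the bootstrap data vectors, so rowMeans(resamples) can be compared against b$t[, 1] as a check.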
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595