[R] Bootstrapping issues
PIKAL Petr
petr.pikal at precheza.cz
Mon Nov 12 10:21:41 CET 2012
Hi
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Clive Nicholas
> Sent: Monday, November 12, 2012 8:06 AM
> To: r-help at r-project.org
> Subject: [R] Bootstrapping issues
>
> sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: i686-pc-linux-gnu (32-bit)
>
> locale:
> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8
> [4] LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
> [7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
> [10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] boot_1.3-7
>
> loaded via a namespace (and not attached):
> [1] tools_2.15.2
>
>
> Hello. I have a very straightforward question. Here's some simulated
> data
> (N=500)
>
> test <- data.frame(A=rnorm(500,mean=2.72,sd=5.36),
>   B=sample(c(12,20,24,28,32),size=500,
>            prob=c(0.333,0.026,0.026,0.436,0.179),replace=TRUE),
>   C=sample(c(0,1),size=500,replace=TRUE),
>   D=sample(c(0,1),size=500,replace=TRUE))
>
>
> head(test)
>           A  B C D
> 1  1.181804 28 1 0
> 2 -5.602307 12 1 1
> 3  2.925090 24 1 1
> 4  3.437408 28 1 0
> 5 -6.503531 32 0 0
> 6 11.013888 12 1 1
>
>
> which I then bootstrap using
>
> library(boot)
>
> bs <- function(formula, data, indices) {
>   test <- data[indices,]
>   fit <- lm(formula, data=test)
>   return(coef(fit))
> }
>
>
> The following works
>
> results <- boot(data=test, statistic=bs, R=1000, A~B+C+D+C*D)
>
Actually, it does not work either:
> results <- boot(data=test, statistic=bs, R=1000, A~B+C+D+C*D)
Error in data[indices, ] : incorrect number of dimensions
>
I am not sure, but I suspect that your bs function expects an indices vector and that what it actually receives is somehow not in accordance with your data.
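One possible explanation, offered as a sketch rather than a confirmed diagnosis: according to the boot documentation, the statistic is called with the data as its first argument and the index vector as its second, and any further named arguments are passed on through `...`. If the formula is not passed by name, the first formal of bs (`formula`) would receive the data frame and `data` would receive the index vector, which is not two-dimensional. The example below imitates that calling pattern directly, without boot; the small data frame `df` and the vector `idx` are made up for illustration.

```r
# Same shape as the bs function in the original post
bs <- function(formula, data, indices) {
  d <- data[indices, ]
  coef(lm(formula, data = d))
}

# Made-up illustration data (not from the original post)
df  <- data.frame(y = c(1.2, 2.3, 2.9, 4.1, 5.2), x = 1:5)
idx <- seq_len(nrow(df))

# Positional call, data first and indices second:
# 'formula' receives df, 'data' receives idx, 'indices' is left missing,
# so data[indices, ] tries to subset a plain vector with two subscripts.
msg <- tryCatch(bs(df, idx), error = function(e) conditionMessage(e))
print(msg)

# With the formula supplied by name, the two positional arguments
# fall into 'data' and 'indices' as intended.
ok <- bs(df, idx, formula = y ~ x)
print(ok)
```

If this is indeed what happens, the error would not depend on how the D variable is simulated.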
Regards
Petr
>
> results
>
>
> But when I then amend the dataset by changing the D variable to
> simulate fixed proportions
>
> D=sample(c(0,1),size=500,prob=c(0.564,0.436),replace=TRUE)
>
>
> head(test)
>             A  B C D
> 1  5.73771963 28 0 1
> 2 -0.19040750 12 1 0
> 3  2.22515982 12 0 1
> 4 -0.02905223 32 1 0
> 5  4.68314112 28 0 1
> 6  5.10711732 12 1 0
>
>
> the same bootstrapping routine chokes with an error
>
> results <- boot(data=test, statistic=bs, R=1000, A~B+C+C*D)
> Error in data[indices, ] : incorrect number of dimensions
>
>
> despite the fact that the B variable also has simulated fixed
> proportions and yet the original code ran without any errors. I have
> two general observations to make about this:
>
> (1) this does not make sense; and
> (2) I don't understand this.
>
> How best to make these two observations go away and run the code to my
> satisfaction?
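A minimal sketch of one way the call might be made to run, assuming the problem is argument matching rather than the data: boot() passes extra named arguments through `...` to the statistic, so the formula is supplied here as `formula=`. The `set.seed()` call and the internal name `d` are additions for this illustration only; the simulated data follow the post, with the amended D variable.

```r
library(boot)

set.seed(1)  # assumption: added only so the sketch is reproducible

# Simulated data as in the post, using the amended D variable
test <- data.frame(
  A = rnorm(500, mean = 2.72, sd = 5.36),
  B = sample(c(12, 20, 24, 28, 32), size = 500,
             prob = c(0.333, 0.026, 0.026, 0.436, 0.179), replace = TRUE),
  C = sample(c(0, 1), size = 500, replace = TRUE),
  D = sample(c(0, 1), size = 500, prob = c(0.564, 0.436), replace = TRUE)
)

# Statistic: refit the model on the resampled rows, return the coefficients
bs <- function(formula, data, indices) {
  d <- data[indices, ]          # 'd' avoids shadowing the global 'test'
  fit <- lm(formula, data = d)
  coef(fit)
}

# The formula is passed by name so that it reaches bs() via '...'
results <- boot(data = test, statistic = bs, R = 1000,
                formula = A ~ B + C + D + C * D)
print(results)
```

Naming the argument leaves the data and the index vector to fill `data` and `indices` in bs, whichever way D is simulated.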
>
> Many thanks.
>
> --
> Clive Nicholas (clivenicholas.posterous.com)
>
> [Please DO NOT mail me personally here, but at
> <clivenicholas at hotmail.com>.
> Please respond to contributions I make in a list thread here. Thanks!]
>
> "My colleagues in the social sciences talk a great deal about
> methodology.
> I prefer to call it style." -- Freeman J. Dyson
>