[R-sig-eco] Bootstrapping with pseudo-replicates
Johannes Radinger
JRadinger at gmx.at
Thu Nov 17 16:02:36 CET 2011
Hello Dixon,
As there is no predefined function that does the resampling and regression I want, I tried to write my own code. So far I have code that runs in a for-loop. One problem remains: I don't know how to collect the results vector from each loop step into a data frame or list.
Maybe someone can help me. So far I have the following:
library(plyr)
y <- c(1,5,6,2,5,10) # response
x <- c(2,12,8,1,16,17) # predictor
group <- factor(c(1,2,2,3,4,4)) # group
df <- data.frame(y,x,group)
R = 50 # the number of replicates
out = numeric(R) # storage for the results
for (i in 1:R) {
  # draw one random row (pseudo-replicate) per group
  subsample <- ddply(df, .(group), function(x){
    x[sample(nrow(x), 1), ]})
  model <- lm(y~x,data=subsample)
  # problem: out[i] can hold only one number, not this whole vector
  out[i] <- c(coef(model), #vector of coefficients
    summary(model)$coefficients[-1,4], #p-values for all except Intercept
    pf(summary(model)$fstatistic[1], summary(model)$fstatistic[2],
       summary(model)$fstatistic[3], lower.tail = FALSE), #overall p-value
    summary(model)$r.squared)
}
The problem is the object out. It must be a data frame or a list in which all the resulting out[i] vectors are collected, so that I can easily calculate the mean/variance of the individual regression parameters etc.
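One possible way to collect the results, as a minimal sketch rather than a definitive solution: preallocate a matrix with one row per resample and one named column per statistic, fill row i in each loop step, and convert to a data frame at the end. The column names (intercept, slope, p_slope, p_model, r_squared) and the seed are illustrative assumptions. The sketch uses base R only, with split() and sample.int() standing in for the ddply() call.

```r
# Sketch: one row of regression summaries per resample, stored in a matrix.
# Same toy data as above; column names and seed are illustrative assumptions.
set.seed(1)  # arbitrary seed, only so the run is reproducible

y <- c(1, 5, 6, 2, 5, 10)             # response
x <- c(2, 12, 8, 1, 16, 17)           # predictor
group <- factor(c(1, 2, 2, 3, 4, 4))  # group
df <- data.frame(y, x, group)

R <- 50  # number of resamples
stat_names <- c("intercept", "slope", "p_slope", "p_model", "r_squared")
out <- matrix(NA_real_, nrow = R, ncol = length(stat_names),
              dimnames = list(NULL, stat_names))

for (i in seq_len(R)) {
  # pick one row index per group; sample.int() avoids the pitfall that
  # sample(x, 1) treats a single number x as the range 1:x
  rows <- vapply(split(seq_len(nrow(df)), df$group),
                 function(idx) idx[sample.int(length(idx), 1)],
                 integer(1))
  s <- summary(lm(y ~ x, data = df[rows, ]))
  f <- s$fstatistic
  out[i, ] <- c(s$coefficients[, 1],                       # intercept, slope
                s$coefficients[-1, 4],                     # p-value of slope
                pf(f[1], f[2], f[3], lower.tail = FALSE),  # overall p-value
                s$r.squared)
}

out <- as.data.frame(out)
colMeans(out)     # mean of each statistic across resamples
sapply(out, var)  # variance of each statistic across resamples
```

A matrix is used here because every loop step returns the same five numbers; a list filled with out[[i]] and combined afterwards with do.call(rbind, out) would work as well.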
Thank you very much!
Johannes
-------- Original Message --------
> Date: Wed, 16 Nov 2011 08:25:11 -0600
> From: "Dixon, Philip M [STAT]" <pdixon at iastate.edu>
> To: "r-sig-ecology at r-project.org" <r-sig-ecology at r-project.org>
> Subject: [R-sig-eco] Bootstrapping with pseudo-replicates
> Johannes,
>
> A very good question to ask, but you can't use a bootstrap, or boot(), to
> investigate it.
>
> You can define strata and then bootstrap observations within strata, but
> all bootstrap data sets will have the same structure as the original data.
> That's the point of the bootstrap. In your example, you have observations
> from 4 sites, 1 obs from site 1, 2 from site 2, 1 from site 3, and 2 from
> site 4. Every stratified bootstrap sample will have 1 from site 1, 2 from
> site 2, 1 from site 3 and 2 from site 4.
>
> I believe you have to construct your own code, probably along the lines of
> defining a vector for one obs per site, then for each site: extracting the
> set of pseudoreplicates for one site, using sample() to grab one value
> from that set, then storing in the vector.
>
> Best wishes,
> Philip Dixon
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
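The distinction drawn in the quoted message, between a stratified bootstrap and drawing one observation per site, can be sketched in base R as follows. This is only an illustration under stated assumptions: the variable names (site, rows_by_site, strat_rows, one_rows) and the seed are made up for the example, and the stratified step is written by hand rather than with boot().

```r
# Sketch: stratified bootstrap versus one observation per site.
# Variable names and seed are illustrative assumptions.
set.seed(1)

site <- factor(c(1, 2, 2, 3, 4, 4))           # site of each observation
rows_by_site <- split(seq_along(site), site)  # row indices grouped by site

# Stratified bootstrap: resample with replacement within each site,
# keeping every site's original number of observations.
strat_rows <- unlist(lapply(rows_by_site, function(idx) {
  idx[sample.int(length(idx), length(idx), replace = TRUE)]
}))
table(site[strat_rows])  # counts per site are always 1, 2, 1, 2

# One observation per site: draw a single row from each site's set
# of pseudoreplicates and store it in a vector.
one_rows <- vapply(rows_by_site,
                   function(idx) idx[sample.int(length(idx), 1)],
                   integer(1))
table(site[one_rows])    # counts per site are always 1, 1, 1, 1
```

Only the second scheme changes the structure of the data set, which is why it has to be coded by hand.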