[R-sig-eco] Bootstrapping with pseudo-replicates

Johannes Radinger JRadinger at gmx.at
Thu Nov 17 16:02:36 CET 2011


Hello Dixon,

As there is no predefined function that does the resampling and regression I want, I tried to write my own code. So far I have code that runs in a for-loop. One problem remains: I don't know how to collect the results vector from each loop iteration into a data frame, list, or similar.

Maybe someone can help me. So far I have the following:

library(plyr)

y <- c(1,5,6,2,5,10) # response
x <- c(2,12,8,1,16,17) # predictor
group <- factor(c(1,2,2,3,4,4)) # group
df <- data.frame(y,x,group)



R <- 50            # the number of replicates
out <- numeric(R)  # storage for the results
for (i in 1:R) {
  subsample <- ddply(df, .(group), function(d) d[sample(nrow(d), 1), ])
  model <- lm(y ~ x, data = subsample)
  out[i] <- c(coef(model),                         # vector of coefficients
              summary(model)$coefficients[-1, 4],  # p-values for all except the intercept
              pf(summary(model)$fstatistic[1], summary(model)$fstatistic[2],
                 summary(model)$fstatistic[3], lower.tail = FALSE),  # overall p-value
              summary(model)$r.squared)
}


The problem is the object out. It should be a data frame or a list that collects the result vector from each iteration, so that I can easily calculate the mean and variance of each regression parameter, etc.
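
One possible way to collect the results, given only as a sketch: store each iteration's vector as one row of a matrix and convert it to a data frame after the loop. The column names (intercept, slope, p_slope, p_model, r_squared), the base-R sampling via split() and sample.int() in place of ddply(), and the set.seed() call are assumptions made for illustration and are not part of the code above.

```r
# Example data from the post
y <- c(1, 5, 6, 2, 5, 10)             # response
x <- c(2, 12, 8, 1, 16, 17)           # predictor
group <- factor(c(1, 2, 2, 3, 4, 4))  # group
df <- data.frame(y, x, group)

set.seed(1)  # assumption: only here to make the sketch reproducible
R <- 50      # the number of replicates

# One row per replicate, one named column per statistic (names are illustrative)
stat_names <- c("intercept", "slope", "p_slope", "p_model", "r_squared")
out <- matrix(NA_real_, nrow = R, ncol = length(stat_names),
              dimnames = list(NULL, stat_names))

# Row indices of df, grouped by the levels of 'group'
rows_by_group <- split(seq_len(nrow(df)), df$group)

for (i in seq_len(R)) {
  # Draw one row index per group. Indexing with sample.int() avoids the
  # pitfall that sample(n, 1) samples from 1:n when given a single number.
  pick <- vapply(rows_by_group,
                 function(idx) idx[sample.int(length(idx), 1)],
                 integer(1))
  s <- summary(lm(y ~ x, data = df[pick, ]))
  f <- s$fstatistic
  out[i, ] <- c(s$coefficients[, 1],                       # intercept and slope
                s$coefficients[2, 4],                      # p-value of the slope
                pf(f[1], f[2], f[3], lower.tail = FALSE),  # overall p-value
                s$r.squared)                               # R-squared
}

out <- as.data.frame(out)
colMeans(out)     # mean of each statistic over the replicates
sapply(out, var)  # variance of each statistic over the replicates
```

Because every statistic ends up in its own column, summaries such as quantile(out$slope, c(0.025, 0.975)) work on a single column as well.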


Thank you very much!
Johannes

-------- Original Message --------
> Date: Wed, 16 Nov 2011 08:25:11 -0600
> From: "Dixon, Philip M [STAT]" <pdixon at iastate.edu>
> To: "r-sig-ecology at r-project.org" <r-sig-ecology at r-project.org>
> Subject: [R-sig-eco] Bootstrapping with pseudo-replicates

> Johannes,
> 
> A very good question to ask, but you can't use a bootstrap, or boot(), to
> investigate it.  
> 
> You can define strata and then bootstrap observations within strata, but
> all bootstrap data sets will have the same structure as the original data. 
> That's the point of the bootstrap.  In your example, you have observations
> from 4 sites, 1 obs from site 1, 2 from site 2, 1 from site 3, and 2 from
> site 4.  Every stratified bootstrap sample will have 1 from site 1, 2 from
> site 2, 1 from site 3 and 2 from site 4.
> 
> I believe you have to construct your own code, probably along the lines of
> defining a vector for one obs per site, then for each site: extracting the
> set of pseudoreplicates for one site, using sample() to grab one value
> from that set, then storing in the  vector.
> 
> Best wishes,
> Philip Dixon
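
The two resampling schemes contrasted in the quoted message can be sketched in base R as follows. This is only an illustration under assumptions: the variable names (site, rows, strat, chosen) are hypothetical, the site layout is the 1/2/1/2 example from the message, and sample.int() on positions is used instead of a direct sample() call so that sites with a single observation are handled safely.

```r
# Site membership from the example: 1 obs at site 1, 2 at site 2,
# 1 at site 3 and 2 at site 4
site <- factor(c(1, 2, 2, 3, 4, 4))
rows <- seq_along(site)
set.seed(42)  # assumption: only here to make the sketch reproducible

# (a) Stratified bootstrap: resample with replacement within each site.
#     Each site contributes as many rows as it had originally.
strat <- unlist(lapply(split(rows, site), function(idx) {
  idx[sample.int(length(idx), length(idx), replace = TRUE)]
}))
table(site[strat])   # counts per site stay 1, 2, 1, 2

# (b) One observation per site: for each site, extract its set of
#     pseudo-replicates, grab one at random, and store it in a vector.
chosen <- integer(nlevels(site))
for (k in seq_len(nlevels(site))) {
  candidates <- rows[site == levels(site)[k]]
  chosen[k] <- candidates[sample.int(length(candidates), 1)]
}
table(site[chosen])  # exactly one row per site
```

The vector chosen holds row numbers, so a data frame with one row per site would be obtained by indexing the original data with it.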
