[R-sig-eco] Bootstrapping with pseudo-replicates

Johannes Radinger JRadinger at gmx.at
Mon Nov 21 11:19:38 CET 2011


Hi,

Thank you, Lars. Your matrix approach is the one I'll use now.
To get the percentage variation of, e.g., a regression parameter estimate
(the range of the estimates as a percentage of their mean), I calculated:

coef.mean <- apply(out,2,mean,na.rm=TRUE) #Calc mean of matrix columns/reg. param.
coef.max <- apply(out,2,max,na.rm=TRUE) #Calc max of matrix columns/reg. param.
coef.min <- apply(out,2,min,na.rm=TRUE) #Calc min of matrix columns/reg. param.

var.coef <- (coef.max-coef.min)/(coef.mean/100)
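
As a small worked example of the calculation above, here is a minimal sketch on a made-up result matrix. The values in 'out' are invented for illustration only, and 'cv.coef' is an extra name that is not part of the thread. It also shows the coefficient of variation as an alternative spread measure, since a range-based percentage is driven by the two most extreme replicates and becomes unstable when the mean is close to zero.

```r
# Toy stand-in for the result matrix 'out' (values are made up):
# one row per replicate, one column per regression parameter.
out <- matrix(c(1.0, 0.50,
                2.0, 0.40,
                3.0, 0.60,
                NA,  NA),      # a replicate without results stays NA
              ncol = 2, byrow = TRUE,
              dimnames = list(NULL, c("intercept", "slope")))

coef.mean <- apply(out, 2, mean, na.rm = TRUE)
coef.max  <- apply(out, 2, max,  na.rm = TRUE)
coef.min  <- apply(out, 2, min,  na.rm = TRUE)

# Range of the estimates expressed as a percentage of their mean
var.coef <- (coef.max - coef.min) / (coef.mean / 100)

# Alternative (assumption, not from the thread): coefficient of
# variation in percent, based on sd instead of the range
cv.coef <- 100 * apply(out, 2, sd, na.rm = TRUE) / abs(coef.mean)
```

For the toy matrix, the intercept column has mean 2 and range 2, so its range is 100% of the mean; the slope column has mean 0.5 and range 0.2, giving 40%.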


/Johannes

-------- Original Message --------
> Date: Fri, 18 Nov 2011 09:08:52 +0100
> From: Lars Westerberg <lawes at ifm.liu.se>
> To: r-sig-ecology at r-project.org
> Subject: Re: [R-sig-eco] Bootstrapping with pseudo-replicates

> I mainly use two ways to collect results in a for-loop. Either by 
> defining an empty result variable:
> out <- NA  #is ugly but makes things really easy:
> out[1] <- 1 #even out[3] <- 1 works
> 
> It is better, and faster, to allocate memory for a result variable using 
> e.g. 'array' or 'matrix'. In the example below, you have to match the 
> number of columns with length of the result vector:
> out <- matrix(NA, nrow=R, ncol=4) #define out before for-loop
> #for(i in 1:R){
> #...
> out[i,] <- c(coef(model),... #store results
> #...
> #}
> apply(out,2,mean,na.rm=TRUE) #Calc mean of matrix columns/reg. param.
> apply(out,2,sd,na.rm=TRUE) #Calc sd of matrix columns/reg. param.
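
Put together with the example data from the original question, the preallocated-matrix approach might look like the following minimal sketch. Some choices here are assumptions rather than part of the thread: the one-row-per-group draw is done in base R instead of ddply() so that no package is needed, the seed is fixed only for reproducibility, and the column names are invented labels. The result vector in the question has five elements (intercept, slope, slope p-value, overall p-value, R-squared), so the matrix gets five columns.

```r
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible

y     <- c(1, 5, 6, 2, 5, 10)          # response
x     <- c(2, 12, 8, 1, 16, 17)        # predictor
group <- factor(c(1, 2, 2, 3, 4, 4))   # group
df    <- data.frame(y, x, group)

R   <- 50                               # number of replicates
out <- matrix(NA_real_, nrow = R, ncol = 5,
              dimnames = list(NULL, c("intercept", "slope", "p.slope",
                                      "p.model", "r.squared")))

for (i in seq_len(R)) {
  # draw one row index per group; indexing via sample.int() is safe
  # even when a group has a single row
  rows <- vapply(split(seq_len(nrow(df)), df$group),
                 function(idx) idx[sample.int(length(idx), 1)],
                 integer(1))
  s <- summary(lm(y ~ x, data = df[rows, ]))
  f <- s$fstatistic
  out[i, ] <- c(s$coefficients[, 1],                        # coefficients
                s$coefficients[-1, 4],                      # p-values except intercept
                pf(f[1], f[2], f[3], lower.tail = FALSE),   # overall p-value
                s$r.squared)
}

apply(out, 2, mean, na.rm = TRUE)  # mean of each regression parameter
apply(out, 2, sd,   na.rm = TRUE)  # sd of each regression parameter
```

With a single predictor the slope p-value and the overall p-value coincide, so those two columns carry the same information in this particular example.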
> 
> HTH
> /Lars
> 
> On 2011-11-17 16:02, Johannes Radinger wrote:
> > Hello Dixon,
> >
> > As there is no predefined function that does the resampling and
> > regression I want, I tried to write my own code. So far I have code
> > that runs in a for-loop. There is still a problem: I don't know how to
> > collect the result vector from each loop step into a data frame, list,
> > etc.
> >
> > Maybe someone can help me. So far I have the following:
> >
> > library(plyr)
> >
> > y<- c(1,5,6,2,5,10) # response
> > x<- c(2,12,8,1,16,17) # predictor
> > group<- factor(c(1,2,2,3,4,4)) # group
> > df<- data.frame(y,x,group)
> >
> >
> >
> > R = 50                                      # the number of replicates
> > out = numeric(R)                          	# storage for the results
> > for (i in 1:R) {
> > 	subsample<- ddply(df, .(group), function(x){
> > 	x[sample(nrow(x), 1), ]})
> > 	model<- lm(y~x,data=subsample)
> > 	out[i]<- c(coef(model),				#vector of coefficients
> > 	summary(model)$coefficients[-1,4],		#p-values for all except Intercept
> > 	pf(summary(model)$fstatistic[1], summary(model)$fstatistic[2],
> > 	summary(model)$fstatistic[3], lower.tail = FALSE),		#overall p-value
> > 	summary(model)$r.squared)
> > 	}
> >
> >
> > The problem is the object out. This must be a data frame or a list in
> > which all the resulting out[i] vectors are collected. I want it in a
> > form that lets me easily calculate the mean/variance of the individual
> > regression parameters, etc.
> >
> >
> > Thank you very much!
> > Johannes
> >
> > -------- Original Message --------
> >> Date: Wed, 16 Nov 2011 08:25:11 -0600
> >> From: "Dixon, Philip M [STAT]"<pdixon at iastate.edu>
> >> To: "r-sig-ecology at r-project.org"<r-sig-ecology at r-project.org>
> >> Subject: [R-sig-eco] Bootstrapping with pseudo-replicates
> >
> >> Johannes,
> >>
> >> A very good question to ask, but you can't use a bootstrap, or boot(), to
> >> investigate it.
> >>
> >> You can define strata and then bootstrap observations within strata, but
> >> all bootstrap data sets will have the same structure as the original data.
> >> That's the point of the bootstrap.  In your example, you have observations
> >> from 4 sites, 1 obs from site 1, 2 from site 2, 1 from site 3, and 2 from
> >> site 4.  Every stratified bootstrap sample will have 1 from site 1, 2 from
> >> site 2, 1 from site 3 and 2 from site 4.
> >>
> >> I believe you have to construct your own code, probably along the lines of
> >> defining a vector for one obs per site, then for each site: extracting the
> >> set of pseudoreplicates for one site, using sample() to grab one value
> >> from that set, then storing in the vector.
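
The contrast described above can be sketched in base R with the example data from the thread. This is a hand-written illustration under stated assumptions, not boot() itself: the stratified resample is imitated by drawing with replacement within each site, 'draw' is a hypothetical helper, and the seed is fixed only for reproducibility. The helper indexes through sample.int() because sample(x, 1) treats a single number x as 1:x, which would give wrong rows for sites with only one observation.

```r
set.seed(42)  # assumption: fixed seed for reproducibility only

df <- data.frame(y = c(1, 5, 6, 2, 5, 10),
                 x = c(2, 12, 8, 1, 16, 17),
                 group = factor(c(1, 2, 2, 3, 4, 4)))

# Row indices belonging to each site
idx.by.site <- split(seq_len(nrow(df)), df$group)

# Hypothetical helper: draw 'size' entries from idx by position, which
# avoids the sample(x, 1) pitfall for length-one vectors
draw <- function(idx, size, replace = FALSE) {
  idx[sample.int(length(idx), size, replace = replace)]
}

# Stratified resample: with replacement within each site, keeping every
# site's original number of observations
strat.rows <- unlist(lapply(idx.by.site,
                            function(idx) draw(idx, length(idx), replace = TRUE)))

# One observation per site: each site contributes exactly one row
one.rows <- unlist(lapply(idx.by.site, function(idx) draw(idx, 1)))

table(df$group[strat.rows])  # counts per site: 1 2 1 2, as in the original data
table(df$group[one.rows])    # counts per site: 1 1 1 1
```

The first table always reproduces the 1, 2, 1, 2 structure of the original data, whereas the second always contains one observation per site, which is the structure the custom resampling is meant to produce.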
> >>
> >> Best wishes,
> >> Philip Dixon
> >>
> >> _______________________________________________
> >> R-sig-ecology mailing list
> >> R-sig-ecology at r-project.org
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> >
