[R-sig-eco] "average" regression - bootstrap?

Pedro Lima Pequeno pacolipe at gmail.com
Mon Aug 29 16:56:38 CEST 2011


Hi Johannes,

I think your approach is reasonable. As you pointed out, however,
generating parameter distributions the way you did is not strictly the
same as bootstrapping. The bootstrap mimics repeated sampling from
one original, target population by assuming that the available sample
is representative of that population and resampling it with
replacement. Hence, it is a way of indirectly gathering information
about the original population. It is used only because, in practice,
resampling the population of interest directly is often impossible.
In your case, you are directly simulating the target population (from
a uniform distribution with known limits), and thus bootstrapping is
not needed. However, because you directly simulate the target
population and its samples, your results will mainly reflect properties
of this abstract, infinite population. If you are also interested in a
more "real world" setting, you could first simulate a large but finite
population and then sample it repeatedly. In addition, you could take a
single random sample from this finite population and apply the
bootstrap to it, as people would usually be able to do with their own
data. Then, you could compare the two sets of results.
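
A minimal sketch of that comparison in base R. Everything specific here
is an assumption made for illustration and is not taken from your data:
the interval limits Xa and Xb, the linear relation used to build Y, the
population size N, the sample size n_samp and the helper slope(). The
bootstrap is a plain hand-written case resampling, not the boot package.

```r
set.seed(1)

# Hypothetical finite population (all values are illustrative assumptions)
N  <- 10000
Xa <- runif(N, 0, 5)           # lower limit for each unit
Xb <- Xa + runif(N, 1, 3)      # upper limit for each unit
X  <- runif(N, Xa, Xb)         # one realised X per unit
Y  <- 2 + 0.5 * X + rnorm(N)   # assumed linear relation plus noise
pop <- data.frame(X = X, Y = Y)

n_samp <- 50     # size of each sample
n_rep  <- 1000   # number of repetitions

# Hypothetical helper: fitted slope of Y on X for a data frame
slope <- function(d) unname(coef(lm(Y ~ X, data = d))[2])

# (a) Repeated sampling from the finite population
slopes_pop <- replicate(n_rep, slope(pop[sample(N, n_samp), ]))

# (b) One single sample, then bootstrap it:
#     resample its rows with replacement and refit each time
one <- pop[sample(N, n_samp), ]
slopes_boot <- replicate(
  n_rep,
  slope(one[sample(n_samp, n_samp, replace = TRUE), ])
)

# Compare the spread of the two slope distributions
c(sd_pop = sd(slopes_pop), sd_boot = sd(slopes_boot))
```

If the single sample is reasonably representative, the two spreads
should be of similar size; how close they are will depend on the sample
drawn.
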
It is also useful to check the shape of the resulting distributions
before choosing appropriate measures to summarize them. For instance,
the R2 sampling distribution is likely to be skewed, so the mean will
be pulled toward the tail values; the median could be more
representative of the central tendency of the distribution in this
case.
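
One way to do that check, as a rough sketch: the data below (n, Xa, Xb
and Y) are made up for illustration and only stand in for yours. The
loop keeps just R2 from each run instead of the whole model object, then
looks at the histogram before comparing mean, median and quantiles.

```r
set.seed(2)

# Illustrative data (assumed, not from the thread)
n  <- 30
Xa <- runif(n, 0, 5)                       # lower limits
Xb <- Xa + 2                               # upper limits
Y  <- 1 + 0.2 * (Xa + Xb) / 2 + rnorm(n)   # fixed response

# Redraw X within [Xa, Xb] on each run, as in the original loop,
# and store only the R-squared of each fit
r2 <- replicate(1000, {
  X <- runif(n, Xa, Xb)
  summary(lm(Y ~ X))$r.squared
})

# Look at the shape first, then choose the summary
hist(r2, breaks = 30, main = "R-squared over runs", xlab = "R-squared")
c(mean = mean(r2), median = median(r2))
quantile(r2, c(0.025, 0.5, 0.975))
```

If the histogram is clearly asymmetric, reporting the median together
with an interval from the quantiles may describe the runs better than
the mean alone.
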

Regards

2011/8/29, Johannes Radinger <JRadinger at gmx.at>:
> Hello,
>
> I've kind of a tricky statistical problem. First of all: I want to do a
> standard linear regression. Therefore my model is:
>
> X <- function()runif(length(Xa), Xa, Xb)
> model <- lm(Y~X())
>
> so X is a function drawing a random number between Xa and Xb (that is
> necessary in my case). What I did so far is:
>
> example1 <- list()
> n=1000
> for(i in 1:n) {
> 	model <- lm(Y~X())
> 	example1[[paste("run",i,sep="")]] <- model
> 	}
>
> So I ran the regression 1000 times and created a list with the regression
> parameters for each run.
>
> How can I analyse these results now? I can get nice mean values for p,
> R-squared etc. but is that the right way?
>
> So I thought, maybe a bootstrap approach can help in this case. Instead of
> doing the "manual" repeated regression I can use bootstrap. But does the
> boot-function allow to use the "runif"-function for the X variable, so that
> each bootstrap run a new number is drawn? If it is the case it'd be nice
> because then I can get summarized results, a thing that I want. On the other
> hand, I don't necessarily need the subsampling of bootstrap. So in my case
> the subsample=all cases. Does that make sense?
>
> Hopefully you can give me some inputs
>
> best regards
> Johannes
>
>
> --
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>


-- 
Pedro A. C. Lima Pequeno
Programa de Pós-graduação em Ecologia
Instituto Nacional de Pesquisas da Amazônia
Manaus, AM, Brasil


