[R] Resampling to find Confidence intervals
Ben Ward
benjamin.ward at bathspa.org
Tue Jan 4 21:44:50 CET 2011
You mentioned the boot package. I've just stumbled across a package
called simpleboot, with a function lm.boot. Would this be suitable? It
says I can sample cases from the original dataset, as well as from the
residuals of a model. I don't understand all of the options, but I assume
the defaults might be suitable for what I'm doing?
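
To make the two options concrete, here is a rough base-R sketch of what "sampling cases" and "sampling residuals" mean for a linear model. It does not call lm.boot at all; the data are simulated (not the ecoli dataset), and the names case_coefs and res_coefs, the seed and the sample sizes are placeholders chosen for illustration. As far as I can tell from the simpleboot documentation, lm.boot switches between the two schemes with its rows argument, but that is worth checking against ?lm.boot.

```r
set.seed(1)                           # arbitrary seed, for reproducibility only
d <- data.frame(x = rnorm(30))        # simulated data, not the ecoli dataset
d$y <- 3 * d$x + rnorm(30, sd = 0.5)
fit <- lm(y ~ x, data = d)
R <- 500                              # number of bootstrap replicates

# Case resampling: draw whole rows with replacement and refit the model
case_coefs <- t(replicate(R, {
  idx <- sample(nrow(d), replace = TRUE)
  coef(lm(y ~ x, data = d[idx, ]))
}))

# Residual resampling: keep x fixed, add resampled residuals to the fitted values
res_coefs <- t(replicate(R, {
  d_star <- d
  d_star$y <- fitted(fit) + sample(resid(fit), replace = TRUE)
  coef(lm(y ~ x, data = d_star))
}))

# Bootstrap standard errors of intercept and slope under each scheme
apply(case_coefs, 2, sd)
apply(res_coefs, 2, sd)
```

Case resampling makes fewer assumptions about the model; residual resampling treats the x values as fixed and assumes the residuals are exchangeable.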
On 04/01/2011 17:56, Ben Ward wrote:
> Ok, I'll check that I understand:
> It uses sample to resample d once, drawing 10 rows, because the rnorm
> has 10 values, with replacement (I assume that's the TRUE part).
> Then a for loop repeats this resampling, in the loop's case 1000
> times. Each time it fits an lm to get the coefficients and stores them
> in d1.coef. I'm guessing that allboot, which is NULL at the start of
> the loop, collects the d1.coef values via rbind, as I think that
> without it the d1.coef from the previous cycle would be overwritten on
> every pass round the loop?
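
The accumulation described above can also be written without growing allboot by rbind on every pass. The following is only a sketch under stated assumptions: it reuses the toy data from Dieter's example further down the thread, the seed is arbitrary, and allboot2 is a name made up here. It shows a preallocated matrix, the replicate() alternative, and percentile intervals taken with quantile().

```r
set.seed(42)                           # arbitrary seed
d <- data.frame(x = rnorm(10))
d$y <- d$x * 3 + rnorm(10, 0.01)       # same toy data as in the quoted example

nsamples <- 1000

# Preallocate one row per bootstrap sample, one column per coefficient
allboot <- matrix(NA_real_, nrow = nsamples, ncol = 2,
                  dimnames = list(NULL, c("(Intercept)", "x")))
for (i in seq_len(nsamples)) {
  d1 <- d[sample(nrow(d), replace = TRUE), ]
  allboot[i, ] <- coef(lm(y ~ x, data = d1))   # fill row i instead of rbind
}

# The same idea with replicate(), which collects the results itself
allboot2 <- t(replicate(nsamples, {
  d1 <- d[sample(nrow(d), replace = TRUE), ]
  coef(lm(y ~ x, data = d1))
}))

apply(allboot, 2, sd)                                  # bootstrap standard errors
ci <- apply(allboot, 2, quantile, probs = c(0.025, 0.975))
ci                                                     # percentile 95% intervals
```

Either way, each row of the result holds the coefficients from one resample, which is exactly what is lost if d1.coef is simply overwritten.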
>
> On 04/01/2011 16:24, Dieter Menne wrote:
>
> Axolotl9250 wrote:
>
>>> ...
>>> resampled_ecoli = sample(ecoli, 500, replace=T)
>>> coefs = (coef(lm(MIC. ~ 1 + Challenge + Cleaner + Replicate,
>>> data=resampled_ecoli)))
>>> sd(coefs)
>>>
>>> ...
>>>
>> Below is a simplified, self-contained version of your code, with some
>> changes.
>>
>> Dieter
>>
>> # resample
>> d = data.frame(x=rnorm(10))
>> d$y = d$x*3+rnorm(10,0.01)
>>
>> # if you do this, you only get ONE bootstrap sample
>> d1 = d[sample(1:nrow(d),10,TRUE),]
>> d1.coef = coef(lm(y~x,data=d1))
>> d1.coef
>> # No error below, but the result is wrong: this is the sd across the
>> # two coefficients ((Intercept) and slope), not a bootstrap sd!
>> sd(d1.coef)
>>
>> # We have to do this over and over
>> # Check ?replicate for a more R-ish approach....
>> nsamples = 1000
>> allboot = NULL
>> for (i in 1:nsamples) {
>> d1 = d[sample(1:nrow(d),10,TRUE),]
>> d1.coef = coef(lm(y~x,data=d1))
>> allboot = rbind(allboot,d1.coef) # Not very efficient, preallocate!
>> }
>> head(allboot) # display the first few of the nsamples rows
>> apply(allboot,2,mean) # Compute mean
>> apply(allboot,2,sd) # compute sd
>> # After you are sure you understood the above, you might try package boot.
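
For reference, the same case bootstrap with the boot package might look roughly like the sketch below. It uses the same toy data as above; coef_stat is a helper name invented here, and the seed and number of replicates are arbitrary. boot() expects a statistic that takes the data and a vector of resampled row indices, and boot.ci() can then give a percentile interval for one coefficient at a time.

```r
library(boot)   # 'boot' is a recommended package shipped with standard R installs

set.seed(123)                          # arbitrary seed
d <- data.frame(x = rnorm(10))
d$y <- d$x * 3 + rnorm(10, 0.01)

# boot() calls this with the data and the resampled row indices
coef_stat <- function(data, indices) {
  coef(lm(y ~ x, data = data[indices, ]))
}

b <- boot(data = d, statistic = coef_stat, R = 1000)
b                                      # original estimates, bias and std. error

# Percentile 95% interval for the slope (index = 2 picks the second coefficient)
ci <- boot.ci(b, conf = 0.95, type = "perc", index = 2)
ci
```

The matrix b$t plays the role of allboot in the loop version, with one row per resample, so apply(b$t, 2, sd) should give figures comparable to the hand-written loop.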
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>