[R] bootstrapping in regression
Tom Backer Johnsen
backer at psych.uib.no
Thu Jan 29 22:56:52 CET 2009
Tom Backer Johnsen wrote:
> Thomas Mang wrote:
>> Hi,
>>
>> Please forgive me if my questions sound somewhat 'stupid' to the
>> trained and experienced statisticians among you. I am also not sure
>> whether I have used all the terms correctly; if not, corrections are welcome.
>>
>> I have asked myself the following question regarding bootstrapping in
>> regression:
>> Say for whatever reason one does not want to take the p-values for
>> regression coefficients from the established test statistics
>> distributions (t-distr for individual coefficients, F-values for
>> whole-model-comparisons), but instead apply a more robust approach by
>> bootstrapping.
>>
>> In the simple linear regression case, one possibility is to randomly
>> permute the Y values relative to the X values, estimate the model, and
>> take the beta1 coefficient. Doing this many times yields a null
>> distribution for beta1. Finally, compare the beta1 for the observed
>> data against this null distribution.
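[Editor's note: a minimal sketch of the permutation scheme described above, on simulated data; the variable names and sample size are illustrative, not from the thread.]

```r
## Permutation null distribution for beta1 in simple linear regression:
## shuffle y relative to x, refit, and collect the slope each time.
set.seed(1)
n <- 50
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)              # simulated data, for illustration only
obs <- coef(lm(y ~ x))["x"]          # observed slope (beta1)

B <- 999
perm <- replicate(B, coef(lm(sample(y) ~ x))["x"])

## two-sided permutation p-value (+1 counts the observed statistic itself)
p <- (sum(abs(perm) >= abs(obs)) + 1) / (B + 1)
p
```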
>
> There is a very basic difference between bootstrapping and random
> permutations. What you are suggesting is to shuffle values between
> cases or rows in the data frame. That amounts to a variant of a
> permutation test, not a bootstrap.
>
> What you do in a bootstrap test is different: you regard your sample as
> a population and then sample from that population (with replacement),
> normally by drawing a large number of random samples of the same size
> as the original sample and doing the computation of interest for each
> sample.
>
> In other words, with bootstrapping, the pattern of values within each
> case or row is unchanged, and you sample complete cases or rows. With a
> permutation test you keep the original sample of cases or rows, but
> shuffle the observations on the same variable between cases or rows.
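[Editor's note: a small sketch of the case (row) resampling Tom describes, on simulated data; names and sizes are illustrative.]

```r
## Case-resampling bootstrap: each bootstrap sample keeps x/y pairs intact,
## drawing complete rows with replacement.
set.seed(1)
d <- data.frame(x = rnorm(50))
d$y <- 0.5 * d$x + rnorm(50)         # simulated data, for illustration only

B <- 999
boot_slopes <- replicate(B, {
  idx <- sample(nrow(d), replace = TRUE)   # whole rows, with replacement
  coef(lm(y ~ x, data = d[idx, ]))["x"]
})

quantile(boot_slopes, c(0.025, 0.975))     # rough percentile interval for beta1
```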
>
> Have a look at the 'boot' package.
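[Editor's note: the same case-resampling idea expressed with the 'boot' package, which ships with R; the data are simulated and the function name `slope` is illustrative.]

```r
library(boot)

set.seed(1)
d <- data.frame(x = rnorm(50))
d$y <- 0.5 * d$x + rnorm(50)

## statistic(data, indices): refit the model on the resampled rows
slope <- function(data, idx) coef(lm(y ~ x, data = data[idx, ]))["x"]

b <- boot(d, slope, R = 999)
boot.ci(b, type = "perc")      # percentile bootstrap CI for the slope
```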
>
> Tom
>>
>> What I now wonder is what the situation looks like in the multiple
>> regression case. Assume there are two predictors, X1 and X2. Is it
>> then possible to do the same, but rearrange the values of only one
>> predictor (the one of interest) at a time? Say I again want to
>> test beta1. Is it then valid to randomly rearrange the X1 data many
>> times (keeping Y and X2 as observed), fit the model, take the beta1
>> coefficient, and finally compare the beta1 of the observed data
>> against the distribution of these beta1s?
>> For X2, do the same: randomly rearrange X2 each time while keeping
>> Y and X1 as observed, and so on.
>> Is this valid?
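[Editor's note: a sketch of exactly the scheme Thomas asks about, on simulated data. One caveat worth hedging: permuting X1 against both Y and X2 also destroys any X1/X2 association, which is one reason residual-permutation schemes such as Freedman-Lane are often preferred in multiple regression; the code below only illustrates the question as posed.]

```r
## Permute one predictor (X1) while keeping Y and X2 as observed.
set.seed(1)
n  <- 100
x2 <- rnorm(n)
x1 <- 0.4 * x2 + rnorm(n)            # correlated predictors, for illustration
y  <- 0.5 * x1 + 0.3 * x2 + rnorm(n)

obs <- coef(lm(y ~ x1 + x2))["x1"]   # observed beta1

## second coefficient is the slope on the permuted X1
perm <- replicate(999, coef(lm(y ~ sample(x1) + x2))[2])
p <- (sum(abs(perm) >= abs(obs)) + 1) / (999 + 1)
p
```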
>>
>> Second, if this is valid for the 'normal', fixed-effects-only
>> regression, is it also valid to derive null distributions for the
>> regression coefficients of the fixed effects in a mixed model this
>> way? Or does the quite different parameter-estimation procedure
>> rule out this approach (rule out in the sense of producing a bogus
>> outcome)?
>>
>> Thanks, Thomas
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
--
+----------------------------------------------------------------+
| Tom Backer Johnsen, Psychometrics Unit, Faculty of Psychology |
| University of Bergen, Christies gt. 12, N-5015 Bergen, NORWAY |
| Tel : +47-5558-9185 Fax : +47-5558-9879 |
| Email : backer at psych.uib.no URL : http://www.galton.uib.no/ |
+----------------------------------------------------------------+