[R-sig-ME] bootstrapping coefficient p-values under null hypothesis in case-resampling multiple linear mixed-effects regression
Robert LaBudde
alethephant at verizon.net
Sun Jan 28 04:20:26 CET 2018
If your null hypothesis is that variable X has a coefficient of zero in
the model, would sampling under the null hypothesis not be done by
case-resampling every variable except X, and then resampling X
separately from its own set of values?
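
Something along these lines, as a rough sketch only (the response y,
predictors x and z, grouping factor subj, and data frame dat are
placeholders; for a mixed model you would also need to decide whether
rows or whole clusters are the resampling unit):

library(lme4)

fit_obs <- lmer(y ~ x + z + (1 | subj), data = dat, REML = FALSE)
t_obs   <- fixef(fit_obs)["x"]

R <- 999
n <- nrow(dat)
t_null <- numeric(R)
for (r in seq_len(R)) {
  d   <- dat[sample(n, replace = TRUE), ]   # case-resample everything except x
  d$x <- sample(dat$x, n, replace = TRUE)   # resample x independently of the rest
  t_null[r] <- fixef(lmer(y ~ x + z + (1 | subj), data = d, REML = FALSE))["x"]
}

p <- (sum(abs(t_null) >= abs(t_obs)) + 1) / (R + 1)  # two-sided empirical p-value
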
It would appear wise to just do case resampling and construct a
confidence interval for the coefficient from the bootstrap. I avoid
statistical testing as completely as possible.
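
For example, a cluster-level case bootstrap with a percentile interval,
again only a sketch with the same placeholder names:

library(lme4)

fit  <- lmer(y ~ x + z + (1 | subj), data = dat, REML = FALSE)
ids  <- unique(dat$subj)
R    <- 999
beta <- matrix(NA_real_, R, length(fixef(fit)),
               dimnames = list(NULL, names(fixef(fit))))

for (r in seq_len(R)) {
  boot_ids <- sample(ids, replace = TRUE)   # resample whole subjects (clusters)
  d <- do.call(rbind, lapply(seq_along(boot_ids), function(i) {
    di <- dat[dat$subj == boot_ids[i], , drop = FALSE]
    di$subj <- i                            # relabel so repeated subjects stay distinct
    di
  }))
  beta[r, ] <- fixef(lmer(y ~ x + z + (1 | subj), data = d, REML = FALSE))
}

quantile(beta[, "x"], c(0.025, 0.975))      # 95% percentile interval for x
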
On 1/27/2018 5:19 PM, Aleksander Główka wrote:
> Dear mixed-effects community,
>
> I am fitting a multiple linear mixed-effects regression model in lme4.
> The residual fit departs from linearity enough to warrant not assuming
> residual homoscedasticity. One way to model regression without
> explicitly making this assumption is to use case-resampling regression
> (Davison & Hinkley 1997), an application of the bootstrap (Efron &
> Tibshirani 1993).
>
> In case-resampling regression, rather than assuming a normal
> distribution for the T-statistic, we estimate the distribution of T
> empirically. We mimic sampling from the original population by
> treating the original sample as if it were the population: for each
> bootstrap sample of size n we randomly select n cases with
> replacement from the original sample, refit the regression to obtain
> bootstrap estimates, and repeat this procedure R times.
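>
> As a rough sketch of this loop for an ordinary linear model (the data
> frame dat and formula y ~ x + z are just placeholders):
>
> R <- 999
> n <- nrow(dat)
> boot_beta <- replicate(R, {
>   d <- dat[sample(n, replace = TRUE), ]   # draw n cases with replacement
>   coef(lm(y ~ x + z, data = d))           # refit and keep the estimates
> })
> quantile(boot_beta["x", ], c(0.025, 0.975))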
>
> Having applied this procedure, I am trying to calculate empirical
> p-values for my regression coefficients. As in parametric regression,
> I want to conduct the two-tailed hypothesis test of significance for
> slope with test statistic T under the null hypothesis H0: β_1 = 0.
> Since we are treating the original sample as the population, the
> observed value t of T comes from the original sample. For each of
> β̂_0, β̂_1, …, β̂_p we calculate the p-value as follows:
>
> (1) p = min( #{T* ≥ t} / R , #{T* ≤ t} / R )
>
> where T* denotes a bootstrap replicate of T and #{·} counts the
> replicates satisfying the condition.
>
> Davison and Hinkley take t = β̂_1, so that, in practice,
>
> (2) p = min( (#{β̂*_1 ≥ β̂_1} + 1) / (R + 1) , (#{β̂*_1 ≤ β̂_1} + 1) / (R + 1) )
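>
> With boot_beta1 a vector of the R bootstrap slope estimates and
> beta1_hat the estimate from the original sample (both names are
> placeholders), (2) is simply:
>
> p <- min((sum(boot_beta1 >= beta1_hat) + 1) / (R + 1),
>          (sum(boot_beta1 <= beta1_hat) + 1) / (R + 1))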
>
> The major problem here is that the bootstrap samples were not sampled
> under the null hypothesis, so in (1) and (2) we are evaluating the
> alternative hypothesis rather than the null. Efron & Tibshirani (1993)
> indeed caution that all hypothesis testing must be performed by
> sampling under the null. This is relatively simple for, say, testing
> the difference between two means, where the null is H0: μ_1 = μ_2 and
> which requires only a simple transformation of the data prior to
> resampling.
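>
> For instance, for two samples z and y (placeholder vectors), that
> transformation recenters each sample at the pooled mean before
> resampling, so the resampled data satisfy the null (a simplified
> sketch using the raw mean difference as the statistic):
>
> xbar   <- mean(c(z, y))
> z0     <- z - mean(z) + xbar   # both samples now share the null mean
> y0     <- y - mean(y) + xbar
> t_obs  <- mean(z) - mean(y)
> R      <- 999
> t_null <- replicate(R, mean(sample(z0, replace = TRUE)) -
>                        mean(sample(y0, replace = TRUE)))
> p <- (sum(abs(t_null) >= abs(t_obs)) + 1) / (R + 1)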
>
> So my question here is: how do I perform significance testing under
> the null hypothesis in case-resampling regression? As far as I could
> see, neither Davison & Hinkley (1997) nor Efron & Tibshirani (1993)
> seem to mention how to sample under the null. Is there some adjustment
> that I can introduce before (to the data) or after case-resampling (to
> the least-squares formula) in a way that is easily implementable in R
> and lme4? Any ideas and/or algorithms would be greatly appreciated.
>
> N.B. With all due respect, please don’t advise me to fit a GLM instead
> or to talk directly with Rob Tibshirani.
>
> Thank you,
>
> Aleksander Glowka
> PhD Candidate in Linguistics
> Stanford University
>
> Works cited:
>
> Davison, A. C. and Hinkley, D. V. (1997). Bootstrap Methods and Their
> Application. Cambridge: Cambridge University Press.
>
> Efron, B. and Tibshirani, R. J. (1993). An Introduction to the
> Bootstrap. New York: Chapman & Hall.
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Robert A LaBudde, BS, MS, PhD, ChDipl ACAFS
President, Least Cost Formulations, Ltd     URL: lcfltd.com
824 Timberlake Dr                           Tel: 757-467-0954
Virginia Beach, VA 23464                    Fax: 757-467-2947