[R-sig-ME] bootstrapping coefficient p-values under null hypothesis in case-resampling multiple linear mixed-effects regression

Ben Bolker bbolker at gmail.com
Sun Jan 28 00:02:07 CET 2018


  I don't know. But I have the following suggestions/questions:

- the closest/simplest analogue for what you want to do would seem to be
a *permutation test*: that is, rather than sampling your data with
replacement, sample the predictor variables _without_ replacement, to
simulate the null hypothesis (independence between predictors and responses)
- alternatively you could try robust methods, e.g. the robustlmm package
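
To make the permutation idea concrete, here is a minimal sketch in plain Python (standard library only). It is not lme4 code: it assumes a single predictor and an ordinary least-squares slope as the test statistic, and the names ols_slope and permutation_p_value are hypothetical helpers made up for this illustration. The predictor is shuffled (sampled without replacement) to break any association with the response, the slope is re-estimated each time, and the p-value is the share of permuted slopes at least as extreme as the observed one.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y on x (one predictor, with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def permutation_p_value(x, y, n_perm=999, seed=1):
    """Two-sided permutation p-value for the slope.

    Null hypothesis simulated here: x and y are independent.
    """
    rng = random.Random(seed)
    observed = ols_slope(x, y)
    x_perm = list(x)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(x_perm)  # permuting = sampling without replacement
        if abs(ols_slope(x_perm, y)) >= abs(observed):
            extreme += 1
    # The +1 terms count the observed arrangement as one permutation.
    return (extreme + 1) / (n_perm + 1)

# Toy data (made up for illustration): y depends strongly on x.
x = [float(i) for i in range(20)]
noise = random.Random(0)
y_related = [2.0 * xi + noise.gauss(0, 1) for xi in x]
p_related = permutation_p_value(x, y_related)
```

For a mixed model one would presumably replace ols_slope with a refit of the full model and take the fixed-effect estimate of interest; how best to permute when observations are grouped is a separate question that this sketch does not address.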

On 18-01-27 05:19 PM, Aleksander Główka wrote:
> Dear mixed-effects community,
> 
> I am fitting a multiple linear mixed-effects regression model in lme4.
> The residual fit is near-linear, enough to warrant not assuming residual
> homoscedasticity. One way to model regression without explicitly making
> this assumption is to use case-resampling regression (Davison & Hinkley
> 1997), an application of the bootstrap (Efron & Tibshirani 1993).
> 
> In case-resampling regression, rather than assuming a normal
> distribution for the T-statistic, we estimate the distribution of T
> empirically. We mimic sampling from the original population by treating
> the original sample as if it were the population: for each bootstrap
> sample of size n we randomly select n cases with replacement from the
> original sample and then fit the regression to obtain estimates,
> repeating this procedure R times.
> 
> Having applied this procedure, I am trying to calculate empirical
> p-values for my regression coefficients. As in parametric regression, I
> want to conduct the two-tailed hypothesis test of significance for slope
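> with test statistic T under the null hypothesis H0: \hat{\beta}_1 = 0.
> Since we are treating the original sample as the population, our T = t
> is the observed value from the original sample. For each coefficient
> \hat{\beta}_j, j = 0, 1, ..., p, we calculate the p-value as follows:
> 
> (1) p = \min( \#\{T^* \ge t\} / R, \#\{T^* \le t\} / R )
> 
> where \#\{.\} counts the bootstrap replicates satisfying the condition.
> Davison and Hinkley take t = \hat{\beta}_1,
> 
> so that, in practice,
> 
> (2) p = \min( (\#\{\hat{\beta}^*_1 \ge \hat{\beta}_1\} + 1) / (R + 1),
>               (\#\{\hat{\beta}^*_1 \le \hat{\beta}_1\} + 1) / (R + 1) )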
> 
> The major problem here is that the bootstrap samples were not sampled
> under the null hypothesis, so in (1) and (2) we are evaluating the
> alternative hypothesis rather than the null. Efron & Tibshirani (1993)
> indeed caution that all hypothesis testing must be performed by sampling
> under the null. This is relatively simple for, say, testing the
> difference between two means, where the null is H0: \mu_1 = \mu_2, and
> which requires only a simple transformation of the data prior to sampling.
> 
> So my question here is: how do I perform significance testing under the
> null hypothesis in case-resampling regression? As far as I could see,
> neither Davison & Hinkley (1997) nor Efron & Tibshirani (1993) seem to
> mention how to sample under the null. Is there some adjustment that I
> can introduce before (to the data) or after case-resampling (to the
> least-squares formula) in a way that is easily implementable in R and
> lme4? Any ideas and/or algorithms would be greatly appreciated.
> 
> N.B. With all due respect, please don’t advise me to fit a GLM instead
> or to talk directly with Rob Tibshirani.
> 
> Thank you,
> 
> Aleksander Glowka
> PhD Candidate in Linguistics
> Stanford University
> 
> Works cited:
> 
> Davison, A. C. and Hinkley, D. V. (1997). Bootstrap Methods and their
> Application. Cambridge, England: Cambridge University Press.
> 
> Efron, B. and Tibshirani, R. J. (1993). An Introduction to the Bootstrap.
> New York: Chapman & Hall.
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


