[R-sig-ME] binomial fixed-effect p-values by simulation

Ben Bolker bolker at ufl.edu
Sat Aug 23 02:17:19 CEST 2008


 
  I don't quite understand your question about the t-distribution -- do you
mean naively using the t-statistic to test the null hypothesis?  A few issues
here: (1) this is a Wald test, and hence not generally as accurate as model
comparison; (2) assuming the scale parameter is fixed at 1, I *think* one
should technically do a Z-test rather than a t-test here -- but I'm not sure.
I have a vague memory of a reference saying that the Z-test is actually better
than the t-test in this case (besides being the theoretically justified
choice).  (3) If you do quasi-likelihood, then you should do the t-test, but
then you have to decide on appropriate degrees of freedom (oh joy).  Arguably
the null-model simulations could give quite a bit of insight into the _real_
shape of the distribution of (est)/(s.e.) under the null hypothesis -- i.e.,
is it t?  With how many df?
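
  Concretely, I mean something along these lines (an untested sketch, and slow,
since each replicate is a full refit; it reuses 'observed', 'null.intercept',
and 'null.subject.sd' from Daniel's code below):

null.z <- replicate(200, {
    sim <- observed
    ## re-simulate the response under the fitted null model
    sim$response <- rbinom(nrow(sim), 1,
        rep(plogis(rnorm(20, null.intercept, null.subject.sd)), each = 100))
    ## Wald ratio est/s.e. for the gender effect in the refitted model
    cc <- summary(lmer(response ~ gender + (1|subject), sim, binomial))@coefs
    cc["genderM", "Estimate"]/cc["genderM", "Std. Error"]
})
qqnorm(null.z); qqline(null.z)   ## roughly N(0,1), or heavier-tailed (t-ish)?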

  Ben

Daniel Ezra Johnson wrote:
> Yes, it would be very helpful if you could send your simulation function.
>
> I'm encouraged to know that I'm on the right track with this, at least
> in theory.
> For empirical p-values I can't see most people being satisfied with,
> say, 100 replications and the level of error that would bring. So the
> slowness is a problem.
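>
> (Roughly, the Monte Carlo standard error of an empirical p-value is
> sqrt(p*(1-p)/N), so for example
>
> mc.se <- function(p, N) sqrt(p * (1 - p) / N)   ## binomial s.e. of an empirical p-value
> mc.se(0.03, c(100, 1000))                       ## ~0.017 at N = 100 vs. ~0.005 at N = 1000
>
> i.e. at 100 replications the error is on the same order as the p-value
> itself.)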
>
> When would someone use the distribution of the t-statistic, versus the way I
> did it, comparing fixed-effect sizes between the simulations and the
> data? Or are those two essentially the same thing, statistically?
>
> Thanks,
> D
>
> On Sat, Aug 23, 2008 at 12:40 AM, Ben Bolker <bolker at ufl.edu> wrote:
>   
>>  I just wrote a "simulate" method for glmer objects, to extend the
>> version that's already in lme4 for lmer objects.  Given a nested pair
>> of models m0, m1, the simulation distribution of t-statistics would
>> be
>>
>> replicate(1000,get.t.statistic(refit(m1,simulate(m0))))
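>>
>> (get.t.statistic isn't anything in lme4, by the way -- it's just whatever
>> extracts est/s.e. from each refit; for the models below, something like
>>
>> get.t.statistic <- function(fit) {
>>   cc <- summary(fit)@coefs                ## coefficient table of the fit
>>   cc[2, "Estimate"]/cc[2, "Std. Error"]   ## Wald ratio for the 2nd coefficient
>> }
>>
>> assuming the coefficient of interest is the second one.)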
>>
>> See simulate.lme in nlme, which also has a plot method
>> that displays the nominal and empirical p-values for REML
>> and ML estimates.
>>  The bad news is that it's essentially doing the same thing you
>> discuss below -- and therefore it's not any faster (I was able
>> to do about 10 replicates/minute ...)
>>
>>  If you're interested, let me know and I can send my
>> "glmersim.R"
>>
>>  Ben Bolker
>>
>> Daniel Ezra Johnson wrote:
>>     
>>> Consider the common situation of modeling multiple responses from
>>> multiple subjects, where we want to know whether a fixed-effect
>>> between-subjects variable -- for example, gender -- has a significant
>>> effect.
>>>
>>> In my field (quantitative sociolinguistics) the response is usually
>>> binomial, and for some 30 years the usual procedure has been to carry
>>> out logistic regression, setting the subject variable aside, and
>>> assessing fixed effect significance with a likelihood-ratio test. When
>>> between-subject variation is high, this produces exceptionally
>>> anti-conservative p-values (see old.p.value below).
>>>
>>> I am writing an article and some wrapper functions designed to
>>> introduce the use of mixed models to this audience. With subject
>>> included as a random effect, the p-values for fixed between-subject
>>> effects are much more conservative, even using the likelihood-ratio
>>> test (see lrt.p.value below). In other words, this is an improvement.
>>>
>>> But as has been discussed many times on this list, likelihood-ratio
>>> tests are still anti-conservative for fixed effect significance. I
>>> believe the extent of anti-conservatism depends on the number of
>>> observations relative to the number of groups, the balance of the
>>> data, and probably other factors too.
>>>
>>> I have seen the MCMC method discussed, but I think it is not yet
>>> available for binomial data with lmer() -- is that correct? In addition,
>>> that method, which operates in the parameter space, seems to be
>>> testing a slightly different null hypothesis than the usual
>>> frequentist one, though I may be misunderstanding this.
>>>
>>> Would it be appropriate for me to carry out a simulation of the type
>>> given below, to assess fixed effect significance?
>>>
>>> The code starts by generating a dataset which happens to have a
>>> potentially significant gender effect, according to the LRT. (In fact
>>> the data is randomly generated and has no "real" gender effect, but
>>> assume it was real data and we did not know that.)
>>>
>>> A null model (no fixed effect) and an alternative model (fixed gender
>>> effect) are fit to the data by lmer(), using a random effect for
>>> subject in both cases.
>>>
>>> It then takes the two parameters from the null model -- intercept and
>>> subject s.d. -- and randomly generates new data. The simulated data is
>>> fit with the alternative-style model and the proportion of times the
>>> fixed effect magnitude exceeds that from the 'observed' data is the
>>> p-value.
>>>
>>> Does this procedure make sense? It would generalize to linear models
>>> easily. Making it work for unbalanced data and when there are other
>>> fixed effects in the model would not be that hard either, I think.
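>>>
>>> (For instance -- just a sketch, using the names from the code below -- one
>>> could re-simulate only the response on top of the observed data frame, so
>>> that any imbalance or additional predictors carry over unchanged:
>>>
>>> sim.gender.effect <- function() {
>>>   ## new subject effects drawn from the null model's estimates
>>>   b <- rnorm(nlevels(observed$subject), null.intercept, null.subject.sd)
>>>   sim <- observed
>>>   sim$response <- rbinom(nrow(sim), 1, plogis(b[sim$subject]))
>>>   fixef(lmer(response ~ gender + (1|subject), sim, binomial))["genderM"]
>>> }
>>>
>>> with the p-value then computed exactly as in the loop below.)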
>>>
>>> To my mind, this is a direct implementation of the frequentist p-value
>>> concept via simulation. Am I missing something important?
>>>
>>> Of course, especially with binomial data this simulation is quite slow
>>> (compared, e.g., to the unimplemented MCMC method). The code below, with
>>> 1000 simulations of a 2000-observation dataset, took 15 minutes to run
>>> on a MacBook Pro. (The result was a p-value of 0.027, compared to
>>> 0.015 from the LRT, 0.009 from the Wald statistic... and 1.1e-56 from
>>> the fixed-effects model!)
>>>
>>> It's important to me to be able to give p-values that are
>>> accurate/conservative and that I understand. So if anyone can give me
>>> guidance on whether this simulation procedure makes sense, it would be
>>> much appreciated.
>>>
>>> And if there's a faster way to do something like this for binomial
>>> data, please let me know.
>>>
>>> Thanks,
>>> Daniel
>>>
>>> library(lme4)
>>> library(boot)
>>> set.seed(10)
>>>
>>> observed <- data.frame(subject = rep(letters[1:20],each=100),
>>>        gender = rep(c("M","F"),each=1000),
>>>        response = rbinom(2000,1,rep(plogis(rnorm(20,0,3)),each=100)))
>>>
>>> old <- glm(response~gender,binomial,observed)
>>> old.p.value <- pchisq(old$null.deviance-old$deviance,df=1,lower.tail=F)
>>> # 1.083e-56
>>>
>>> null <- lmer(response ~ (1|subject),observed,binomial)
>>> fixed <- lmer(response ~ gender + (1|subject),observed,binomial)
>>>
>>> wald.p.value <-  summary(fixed)@coefs["genderM","Pr(>|z|)"] # 0.00894
>>> lrt.p.value <- anova(null,fixed)[,"Pr(>Chisq)"][2] # 0.0152
>>>
>>> null.intercept <- fixef(null) # -0.0914
>>> null.subject.sd <- attr(VarCorr(null)$subject,"stddev") # 2.276
>>> gender.observed <- fixef(fixed)[2] # -2.319
>>> gender.simulated <- vector()
>>>
>>> for (i in 1:1000) {
>>>   ## new data from the null model: intercept and subject s.d. only
>>>   simulated <- data.frame(subject = rep(letters[1:20],each=100),
>>>        gender = rep(c("M","F"),each=1000),
>>>        response = rbinom(2000,1,
>>>          rep(plogis(rnorm(20,null.intercept,null.subject.sd)),each=100)))
>>>   fixed.sim <- lmer(response~gender+(1|subject),simulated,binomial)
>>>   gender.simulated[i] <- fixef(fixed.sim)[2]
>>> }
>>>
>>> # minutes later...
>>>
>>> simulation.p.value <- mean(abs(gender.simulated)>=abs(gender.observed)) # 0.027
>>>

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: glmersim.R
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20080822/925dc8fe/attachment.pl>

