[R-sig-ME] What to do when a subset of binomial data has only positive outcomes

Fri Jul 30 22:26:49 CEST 2010

  The bleeding-edge version of lme4, lme4a, has some profiling
capabilities that might (?) be useful. A Bayesian approach (which you
would probably have to roll yourself; glmmBUGS exists but did not seem
flexible enough to deal with crossed random effects the last time I
looked) would also be useful for stabilizing this kind of estimation
problem (i.e., assigning a prior would allow the posterior probability
density to be not quite completely concentrated at prob=0 or prob=1
...)  For inspiration you might also try looking in the GLM literature
under the keyword 'complete separation'.

On Fri, Jul 30, 2010 at 10:28 AM,  <Timothy_Handley at nps.gov> wrote:
>
> I had this problem with glm. My solution was to create a likelihood
> function, and use the optim or optimize function to find an appropriate
> confidence interval. More specifically, I found the likelihood of the data
> for the best-estimate parameters, then found the range of parameters for
> which a likelihood-ratio-test (best-estimate params vs. current params) had
> a p-value>.025. The range is then the 95% confidence interval for those
> parameters. This is essentially replicating the functionality of glm by
> hand, as it seemed unable to deal with data which is all positive or all
> negative.
>
> While I understand how to do this with glm, I don't understand enough about
> fixed effects models to do the same for lmer. If you were willing to treat
> everything as a fixed effect and switch to glm, then you could e-mail me,
> and I could offer some more specific advice on how to do this.
>
> If anyone else has a better solution, I'd be very interested to hear it.
>
> Tim Handley
> Fire Effects Monitor
> Santa Monica Mountains National Recreation Area
> 401 W. Hillcrest Dr.
> Thousand Oaks, CA 91360
> 805-370-2300 x2412
>
> Date: Thu, 29 Jul 2010 17:57:35 -0700
> From: Xiao He <praguewatermelon at gmail.com>
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] What to do when a subset of binomial data has only
>             positive outcomes
> Message-ID:
>             <AANLkTi=zskcerd9fuELVQZ833mYPRkCddYpRUdVvQq7- at mail.gmail.com>
> Content-Type: text/plain
>
> Dear R users and experts,
>
> I tried to fit a model as shown below:
>
> data.lmer<-lmer(word1~compoundType*nativelanguage + (1|subject) + (1|word),
> data=data, family="binomial").
>
> the factor 'compoundType' has three levels: blue, green, red.
> the factor nativelanguage has two levels: english, other
>
> I obtained the following results:
> ************************************************
> Fixed effects:
>                              Estimate Std. Error z value Pr(>|z|)
> (Intercept)                    -1.0190     0.5777  -1.764   0.0778 .
> nativelanguageother            -0.6862     0.4075  -1.684   0.0922 .
> typegreen                      19.7882   728.0627   0.027   0.9783
> typered                         3.6985     0.8359   4.425 9.65e-06 ***
> nativelanguageother:typegreen -16.1915   728.0624  -0.022   0.9823
> nativelanguageother:typered    -2.8106     0.5639  -4.984 6.23e-07 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> ************************************************
>
> As you can see, the standard errors and estimates for "typegreen" as well
> as
> "nativelanguageother:typegreen" are enormous. I examined my data and
> realized that the responses obtained for the level "green" of compoundType
> were all positive. But the responses of "green" were clearly significantly
> different from the base level "blue" as green was 100% positive responses,
> whereas blue had only 30% of positive responses.
>
> I wonder if there is any way to conduct statistical analyses to show that
> green is significantly different from blue (and potentially red) because
> obviously I can't just say they are different :)...
>
> Thank you in advance for your help!
>
>
>
> Best,
>
> Xiao He
> Graduate student
> Department of Linguistics
> University of Southern California
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>