[R-sig-ME] Zero cells in contrast matrix problem

Ben Bolker bbolker at gmail.com
Thu May 28 02:28:26 CEST 2015

And for what it's worth, you can use a Bayesian (penalized) approach
while keeping the mixed model: either with the blme package (a thin
Bayesian wrapper around lme4) or with the MCMCglmm package; see
http://ms.mcmaster.ca/~bolker/R/misc/foxchapter/bolker_chap.html for
an example (search for "complete separation").
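
A minimal sketch of the blme route, on simulated data rather than the
original poster's. The data frame d, the variables part, cond and
correct, and the prior variance of 9 (sd = 3 on the logit scale) are all
illustrative assumptions; the prior is placed on all 8 fixed effects,
intercept included. The fit only runs if blme is installed.

```r
set.seed(42)

# Hypothetical stand-in data: 24 participants, an 8-level condition
# factor, two binary responses per participant and condition.
lev    <- c("ref", LETTERS[1:7])
n_part <- 24
d <- expand.grid(part = factor(seq_len(n_part)),
                 cond = factor(lev, levels = lev),
                 rep  = 1:2)
u   <- rnorm(n_part, sd = 0.5)                 # participant intercepts
eta <- qlogis(0.15) + u[as.integer(d$part)]    # linear predictor
d$correct <- rbinom(nrow(d), 1, plogis(eta))
d$correct[d$cond == "B"] <- 0L                 # force the zero cell in level B

if (requireNamespace("blme", quietly = TRUE)) {
  library(blme)
  # Same model form as the glmer call, plus a weakly informative
  # normal prior on the fixed effects (8 coefficients, variance 9 each).
  fit <- bglmer(correct ~ cond + (1 | part), data = d, family = binomial,
                fixef.prior = normal(cov = diag(9, 8)))
  print(round(coef(summary(fit)), 3))
}
```

With the prior in place the estimate for level B should be finite and
of moderate size, with a usable standard error; how strongly it is
pulled toward zero depends on the prior variance chosen.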

On Wed, May 27, 2015 at 5:21 PM, Viechtbauer Wolfgang (STAT)
<wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
> You may need to consider using an 'exact', Bayesian, or penalized likelihood approach (along the lines proposed by Firth).
> Maybe a place to start: http://sas-and-r.blogspot.nl/2010/11/example-815-firth-logistic-regression.html
> Best,
> Wolfgang
>> -----Original Message-----
>> From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Francesco Romano
>> Sent: Wednesday, May 27, 2015 23:00
>> To: r-sig-mixed-models at r-project.org
>> Subject: [R-sig-ME] Zero cells in contrast matrix problem
>> After giving up on a glmer for my data, I remembered a post by Roger Levy
>> suggesting the use of a non-mixed-effects glm when one of the cells in
>> a matrix is zero.
>> To put this into perspective:
>> > trial<-glmer(Correct ~ Syntax.Semantics + (1 | Part.name), data = trialglm, family = binomial)
>> Warning messages:
>> 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
>>   Model failed to converge with max|grad| = 0.053657 (tol = 0.001, component 4)
>> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
>>   Model is nearly unidentifiable: large eigenvalue ratio
>>  - Rescale variables?
>> My data has a binary outcome, correct or incorrect, a fixed effect
>> predictor factor with 8 levels, and a random effect for participants. I
>> believe the problem R is encountering is with one level of the factor
>> (let us call it level B) which has no counts (no, I won't try to post the
>> table from the paper with the counts because I know it will get garbled up!).
>> I attempt a glm with the same data:
>> > trial<-glm(Correct ~ Syntax.Semantics, data = trialglm, family =
>> binomial)
>> > anova(trial)
>> Analysis of Deviance Table
>> Model: binomial, link: logit
>> Response: Correct
>> Terms added sequentially (first to last)
>>                  Df Deviance Resid. Df Resid. Dev
>> NULL                               384     289.63
>> Syntax.Semantics  7   34.651       377     254.97
>> > summary(trial)
>> Call:
>> glm(formula = Correct ~ Syntax.Semantics, family = binomial,
>>     data = trialglm)
>> Deviance Residuals:
>>      Min        1Q    Median        3Q       Max
>> -0.79480  -0.62569  -0.34474  -0.00013   2.52113
>> Coefficients:
>>                    Estimate Std. Error z value Pr(>|z|)
>> (Intercept)         -1.6917     0.4113  -4.113 3.91e-05 ***
>> Syntax.Semantics A   0.7013     0.5241   1.338   0.1809
>> Syntax.Semantics B -16.8744   904.5273  -0.019   0.9851
>> Syntax.Semantics C  -1.1015     0.7231  -1.523   0.1277
>> Syntax.Semantics D   0.1602     0.5667   0.283   0.7774
>> Syntax.Semantics E  -0.8733     0.7267  -1.202   0.2295
>> Syntax.Semantics F  -1.4438     0.8312  -1.737   0.0824 .
>> Syntax.Semantics G   0.4630     0.5262   0.880   0.3789
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> (Dispersion parameter for binomial family taken to be 1)
>>     Null deviance: 289.63  on 384  degrees of freedom
>> Residual deviance: 254.98  on 377  degrees of freedom
>> AIC: 270.98
>> Number of Fisher Scoring iterations: 17
>> The comparison I'm interested in is between level B and the reference
>> level, but it cannot be estimated, as shown by the ridiculously high
>> estimate and SE value.
>> Any suggestions on how to get a decent beta, SE, z, and p? It's the only
>> comparison missing in the table for the levels I need so I think it would
>> be a bit unacademic of me to close this deal saying 'the difference could
>> not be estimated due to zero count'.
>> And by the way I have seen this comparison being generated using other
>> stats.
>> Thanks in advance,
>> Frank
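
The symptom in the quoted glm output (a huge estimate and standard
error for level B) can be reproduced on simulated data with a zero
cell. The sketch below uses hypothetical names (d, cond, correct) and
made-up success probabilities. Like the quoted glm it ignores the
participant effect. It also shows one check that does not rely on the
Wald standard error: a likelihood ratio test of B against the reference
level, obtained by collapsing B into the reference. This is an
illustrative alternative, not something proposed in the thread.

```r
set.seed(1)

# Hypothetical data: 8-level factor, 48 trials per level,
# level "B" has probability 0 of a correct response (the zero cell).
lev <- c("ref", LETTERS[1:7])
d   <- data.frame(cond = factor(rep(lev, each = 48), levels = lev))
p   <- c(ref = 0.16, A = 0.27, B = 0, C = 0.06,
         D = 0.18, E = 0.07, F = 0.04, G = 0.23)
d$correct <- rbinom(nrow(d), 1, p[as.character(d$cond)])

# Full model: expect a warning about fitted probabilities of 0 or 1,
# and a very large estimate and SE for condB (separation).
full <- glm(correct ~ cond, data = d, family = binomial)
print(coef(summary(full))["condB", ])

# Null model for the B-vs-reference contrast: merge B into "ref",
# so the two levels are forced to share one probability.
d$cond0 <- d$cond
levels(d$cond0)[levels(d$cond0) == "B"] <- "ref"
null <- glm(correct ~ cond0, data = d, family = binomial)

# 1-df likelihood ratio test of "B differs from ref".
lrt <- anova(null, full, test = "Chisq")
print(lrt)
```

The likelihood ratio test gives a p-value for the contrast but no
finite point estimate or SE; for those, a penalized or Bayesian fit as
discussed above is still needed.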
