[R-sig-ME] Zero cells in contrast matrix problem

Wed May 27 23:00:11 CEST 2015

After giving up on a glmer for my data, I remembered a post by Roger Levy
suggesting a non-mixed-effects glm when one of the cells in the matrix is
zero.

To put this into perspective:

> trial<-glmer(Correct ~ Syntax.Semantics + (1 | Part.name), data = trialglm, family = binomial)

Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge with max|grad| = 0.053657 (tol = 0.001, component 4)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?

My data have a binary outcome (correct or incorrect), a fixed-effect
predictor factor with 8 levels, and a random effect for participants. I
believe the problem R is encountering is with one level of the factor (let
us call it level B), which has a zero count in one of the outcome cells
(no, I won't try to post the table from the paper with the counts, because
I know it will get garbled).
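
A zero cell of this kind can be confirmed by cross-tabulating the factor
against the outcome. The following is only a minimal sketch: the data frame
`sim`, its level names, and its counts are invented for illustration and are
not the study's data.

```r
# Invented stand-in data (an assumption, not the real counts):
# reference level 6/40 correct, level A 12/40, level B 0/40.
sim <- data.frame(
  Syntax.Semantics = factor(rep(c("ref", "A", "B"), each = 40),
                            levels = c("ref", "A", "B")),
  Correct = c(rep(c(1, 0), times = c(6, 34)),
              rep(c(1, 0), times = c(12, 28)),
              rep(0, 40))
)

# Cross-tabulate predictor level by outcome; a 0 in any cell flags trouble.
counts <- xtabs(~ Syntax.Semantics + Correct, data = sim)
print(counts)

# Names of the levels that have at least one empty outcome cell.
zero_levels <- rownames(counts)[apply(counts == 0, 1, any)]
print(zero_levels)
```

With real data, `sim` would be replaced by the actual data frame (here
`trialglm`).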

I attempt a glm with the same data:

> trial<-glm(Correct ~ Syntax.Semantics, data = trialglm, family = binomial)
> anova(trial)
Analysis of Deviance Table

Model: binomial, link: logit

Response: Correct

Terms added sequentially (first to last)

                 Df Deviance Resid. Df Resid. Dev
NULL                               384     289.63
Syntax.Semantics  7   34.651       377     254.97
> summary(trial)

Call:
glm(formula = Correct ~ Syntax.Semantics, family = binomial,
    data = trialglm)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-0.79480  -0.62569  -0.34474  -0.00013   2.52113

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)         -1.6917     0.4113  -4.113 3.91e-05 ***
Syntax.Semantics A   0.7013     0.5241   1.338   0.1809
Syntax.Semantics B -16.8744   904.5273  -0.019   0.9851
Syntax.Semantics C  -1.1015     0.7231  -1.523   0.1277
Syntax.Semantics D   0.1602     0.5667   0.283   0.7774
Syntax.Semantics E  -0.8733     0.7267  -1.202   0.2295
Syntax.Semantics F  -1.4438     0.8312  -1.737   0.0824 .
Syntax.Semantics G   0.4630     0.5262   0.880   0.3789
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 289.63  on 384  degrees of freedom
Residual deviance: 254.98  on 377  degrees of freedom
AIC: 270.98

Number of Fisher Scoring iterations: 17

The comparison I'm interested in is between level B and the reference
level, but it cannot be estimated, as shown by the implausibly large
estimate (-16.87) and standard error (904.53).
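
This pattern is what a zero cell typically produces: the maximum-likelihood
estimate for that level drifts towards minus infinity, so the Wald z and p
are uninformative. The sketch below, using invented counts rather than the
study's, reproduces the symptom and shows that a likelihood-ratio comparison
of the two levels still gives a finite statistic. It ignores the participant
random effect, so it is a rough check at best, and it does not supply a
usable beta or SE.

```r
# Invented counts (an assumption): reference 6/40 correct, level B 0/40.
sim <- data.frame(
  Syntax.Semantics = factor(rep(c("ref", "B"), each = 40),
                            levels = c("ref", "B")),
  Correct = c(rep(c(1, 0), times = c(6, 34)), rep(0, 40))
)

# Wald test: very large negative estimate and standard error for level B.
fit1 <- glm(Correct ~ Syntax.Semantics, data = sim, family = binomial)
print(coef(summary(fit1)))

# Likelihood-ratio test against the intercept-only model: the deviance
# difference stays finite even though the coefficient does not settle.
fit0 <- glm(Correct ~ 1, data = sim, family = binomial)
lrt <- anova(fit0, fit1, test = "Chisq")
print(lrt)
```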

Any suggestions on how to get a decent beta, SE, z, and p? It's the only
comparison missing in the table for the levels I need so I think it would
be a bit unacademic of me to close this deal saying 'the difference could
not be estimated due to zero count'.

And by the way, I have seen this comparison being generated using other
statistics packages.

Thanks in advance,

Frank
