[R-sig-ME] Zero cells in contrast matrix problem

Ben Bolker bbolker at gmail.com
Wed Oct 28 21:44:55 CET 2015


  Please keep r-sig-mixed-models in the cc: list ... thanks.

  I think you might have misrepresented the numbers of levels per
category slightly -- you said "Syntax, a factor with two levels, and
Animacy, a factor with four levels. In addition a new between-subjects
factor called Group with two levels (native vs non-native speaker) has
to be added".  If this is really the case, I don't see how you could get
22 fixed-effect variables (you should get a max of 2*4*2 = 16).  Maybe one of
these factors really has three levels, which would lead to 24 columns minus
2 collinear ones = 22?

  If the collinearity is too high for your comfort, you could try
strengthening the prior a bit (i.e. make the variance of the prior
smaller).
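
  For concreteness, a minimal sketch of what a tighter prior might look like,
using the object and variable names that appear elsewhere in this thread
(revana, Prof.group.2, Part.name, Item); shrinking the prior variance from 9
(sd = 3) to 4 (sd = 2) pulls the collinear coefficients more strongly toward
zero:

  library(blme)   # blme attaches lme4, which supplies glmerControl()
  m_tight <- bglmer(Correct ~ Syntax * Animacy * Prof.group.2 +
                      (1 | Part.name) + (1 | Item),
                    data = revana, family = binomial,
                    fixef.prior = normal(cov = diag(4, 22)),  # variance 4, sd 2
                    control = glmerControl(optCtrl = list(maxfun = 1e5)))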

On 15-10-28 01:56 PM, Francesco Romano wrote:
> Well, this used to be a single variable but the reviewers asked for a split.
> Ben, it turns out the magic number was... 22.
> Now I've managed to obtain a model with random intercepts,
> despite a few warnings being flagged up by R:
> 
> bglmer(Correct ~ Syntax*Animacy*Prof.group.2 + (1|Part.name) + (1|Item),
>        data = revana, family = binomial,
>        fixef.prior = normal(cov = diag(9, 22)),
>        control = glmerControl(optCtrl = list(maxfun = 100000)))
> 
> but if you call up the full correlation table (you'll need vcov(name of
> model)) you'll see a good number of highly correlated fixed effects.
> Would you suggest I drop this structure for something simpler?
> Anything more complex takes ages to finish; in fact I've quit all of them.
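
  A quick way to scan those correlations from the fitted object (a sketch;
"fit" is a placeholder, so replace it with whatever name you assigned the
bglmer call to):

  cc <- cov2cor(as.matrix(vcov(fit)))   # correlation matrix of the fixed effects
  cc[upper.tri(cc, diag = TRUE)] <- NA  # keep each pair only once
  which(abs(cc) > 0.9, arr.ind = TRUE)  # flag the worryingly high pairs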
> 
> I attach the data in csv. Thanks again for any help you might have to offer.
> 
> 
> On Tue, Oct 27, 2015 at 5:06 PM, Ben Bolker <bbolker at gmail.com> wrote:
> 
>> I think this isn't as much of a problem as you think.  lme4
>> automatically drops non-estimable columns in the model matrix.  The
>> problem is just getting it to work right with blme.  (But yes,
>> collapsing this into a one-way design is one way of proceeding ...)
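
  A sketch of one way to size the prior to the columns lme4 actually keeps,
which is what the working diag(9, 22) call earlier in this thread suggests
blme needs (names as used in this thread; counting the rank this way is an
assumption worth checking against the "dropping N columns" message):

  X  <- model.matrix(~ Syntax * Animacy * Prof.group.2, data = revana)
  np <- qr(X)$rank   # columns retained after the collinear ones are dropped
  fit <- bglmer(Correct ~ Syntax * Animacy * Prof.group.2 +
                  (1 | Part.name) + (1 | Item),
                data = revana, family = binomial,
                fixef.prior = normal(cov = diag(9, np)))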
>>
>> On Tue, Oct 27, 2015 at 12:01 PM, Thompson,Paul
>> <Paul.Thompson at sanfordhealth.org> wrote:
>>> This is a problem in SAS as well. When you specify a contrast in a
>>> 2-factor design, things get screwed up with missing cells. In fact, the
>>> contrasts require a matrix multiplication, and if a cell isn't there, the
>>> matrix multiplication does not work correctly. They are not estimable, in
>>> SAS terminology (which is standard statistical terminology as well).
>>>
>>> One approach is to convert the 2-factor design to a one-factor design and
>>> do everything with contrasts: main effects, interactions, etc. A little
>>> more work, but since you are specifying the contrasts (and presuming that
>>> you do so correctly), you will get estimable contrasts.
>>>
>>> Marginal effects must take the missing cells into account correctly.
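
  A minimal sketch of that one-factor ("cell means") collapse, using the
variable names from this thread; "cell" is a new, hypothetical column name:

  revana$cell <- with(revana, interaction(Syntax, Animacy, Prof.group.2,
                                          drop = TRUE))  # drop empty combinations
  fit1 <- bglmer(Correct ~ 0 + cell + (1 | Part.name) + (1 | Item),
                 data = revana, family = binomial,
                 fixef.prior = normal(cov = diag(9, nlevels(revana$cell))))
  # planned comparisons are then explicit contrasts among the cell means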
>>>
>>> -----Original Message-----
>>> From: R-sig-mixed-models [mailto:
>> r-sig-mixed-models-bounces at r-project.org] On Behalf Of Francesco Romano
>>> Sent: Monday, October 26, 2015 6:18 AM
>>> To: Ben Bolker; r-sig-mixed-models at r-project.org
>>> Subject: Re: [R-sig-ME] Zero cells in contrast matrix problem
>>>
>>> For some reason the silly bugger didn't paste the full command:
>>>
>>> > revanaA <- bglmer(Correct ~ Syntax*Animacy*Prof.group.2 +
>>> +     (1|Part.name) + (1|Item), data = revana, family = binomial,
>>> +     fixef.prior = normal(cov = diag(9, 16)))
>>> fixed-effect model matrix is rank deficient so dropping 2 columns / coefficients
>>> Error in normal(cov = cov, common.scale = FALSE) :
>>>   normal prior covariance of improper length
>>>
>>> To give more info on this, it is the Animacy factor that is causing
>>> separation because two levels of it have zero counts in some cases.
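
  A quick cross-tabulation (a sketch, with the variable names used in this
thread) shows exactly which Animacy cells are empty:

  with(revana, table(Animacy, Correct, Prof.group.2))  # zero cells drive the separation
  with(revana, table(Syntax, Animacy, Prof.group.2))   # empty cells in the design itself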
>>>
>>> On Mon, Oct 26, 2015 at 12:13 PM, Ben Bolker <bbolker at gmail.com> wrote:
>>>
>>>> Well, that's a separate problem (and not necessarily a "problem").   R
>>>> is telling you that you have 16 separate combinations of the factors,
>>>> but only 14 unique combinations represented in your data set, so it
>>>> can only estimate 14 parameters.  Unless there is a weird interaction
>>>> with blme I don't know about, this should still give you reasonable
>>>> answers.
>>>>
>>>> On Mon, Oct 26, 2015 at 7:10 AM, Francesco Romano
>>>> <francescobryanromano at gmail.com> wrote:
>>>>> Many thanks Ben,
>>>>>
>>>>> but I tried that already:
>>>>>
>>>>> > revanaA <- bglmer(Correct ~ Syntax*Animacy*Prof.group.2 +
>>>>> +     (1|Part.name) + (1|Item), data = revana, family = binomial,
>>>>> +     fixef.prior = normal(cov = diag(9, 16)))
>>>>> fixed-effect model matrix is rank deficient so dropping 2 columns / coefficients
>>>>> Error in normal(cov = cov, common.scale = FALSE) :
>>>>>   normal prior covariance of improper length
>>>>>
>>>>> On Mon, Oct 26, 2015 at 12:06 PM, Ben Bolker <bbolker at gmail.com>
>> wrote:
>>>>>>
>>>>>> On 15-10-26 06:56 AM, Francesco Romano wrote:
>>>>>>> I wonder if anyone can help with the separation problem
>>>>>>> originally solved by Ben Bolker (see thread).
>>>>>>> The model and fitting I used previously was
>>>>>>>
>>>>>>> trial <- bglmer(Correct ~ Syntax.Semantics, data = trialglm,
>>>>>>>                 family = binomial, fixef.prior = normal(cov = diag(9, 4)))
>>>>>>>
>>>>>>> which now has to change because the Syntax.Semantics factor needs
>>>>>>> to be split into separate within-subjects variables: Syntax, a
>>>>>>> factor with two levels, and Animacy, a factor with four levels.
>>>>>>> In addition, a new between-subjects factor called Group with two
>>>>>>> levels (native vs non-native speaker) has to be added, which
>>>>>>> gives the following model, fit by bglmer:
>>>>>>>
>>>>>>> trial <- bglmer(Correct ~ Syntax*Animacy*Group +
>>>>>>>                 (1|Part.name) + (1|Item), data = trialglm,
>>>>>>>                 family = binomial, fixef.prior = normal(cov = diag(???)))
>>>>>>>
>>>>>>> What values should I use for the cov=diag portion in order to
>>>>>>> continue attempting convergence of a model that includes the
>>>>>>> random effects?
>>>>>>
>>>>>>   In general a reasonable form is normal(cov = diag(v, np)), where
>>>>>> v is the prior variance (generally something reasonably
>>>>>> large/non-informative; 9 (= a std dev of 3) is probably an OK default)
>>>>>> and np is the number of fixed-effect parameters.  You can figure
>>>>>> this out via
>>>>>>
>>>>>> ncol(model.matrix(~ Syntax*Animacy*Group, data = trialglm))
>>>>>>
>>>>>> or multiply 2*4*2 to get 16 ...
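
  Putting that together, a sketch of the call with a 16-column prior (names
as given in the thread); if lme4 then reports dropping rank-deficient columns,
the prior dimension appears to need to match the number of columns actually
retained, as discussed above:

  trial <- bglmer(Correct ~ Syntax * Animacy * Group +
                    (1 | Part.name) + (1 | Item),
                  data = trialglm, family = binomial,
                  fixef.prior = normal(cov = diag(9, 16)))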
>>>>>>
>>>>>>>
>>>>>>> R returns the following error because I don't know how to
>>>>>>> establish the parameters when more than one fixed effect is
>>>>>>> involved:
>>>>>>>
>>>>>>> Error in normal(cov = cov, common.scale = FALSE) :
>>>>>>>   normal prior covariance of improper length
>>>>>>>
>>>>>>> Many thanks in advance for any help!
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, May 28, 2015 at 10:46 PM, Ben Bolker <bbolker at gmail.com>
>>>> wrote:
>>>>>>>
>>>>>>>>   I don't see your data -- I see a little tiny subset, but
>>>>>>>> that's not really enough for a reproducible example.
>>>>>>>>
>>>>>>>> This is the example given in the URL I sent:
>>>>>>>>
>>>>>>>> cmod_blme_L2 <- bglmer(predation~ttt+(1|block),data=newdat,
>>>>>>>>                        family=binomial,
>>>>>>>>                        fixef.prior = normal(cov = diag(9,4)))
>>>>>>>>
>>>>>>>> trial<-bglmer(Correct ~ Syntax.Semantics+(1|Part.name),
>>>>>>>>           data =trialglm,
>>>>>>>>          family = binomial,
>>>>>>>>         fixef.prior = normal(cov=diag(9,8)))
>>>>>>>>
>>>>>>>> The last line specifies an 8x8 matrix (because you have 8
>>>>>>>> fixed-effect parameters) with a value of 9 on the diagonal, meaning
>>>>>>>> the priors for the fixed effects are independent and each is Normal
>>>>>>>> with an sd of sqrt(9) = 3.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, May 28, 2015 at 3:25 PM, Francesco Romano
>>>>>>>> <francescobryanromano at gmail.com> wrote:
>>>>>>>>> Yes but this seems a bit above my head without your help. The
>>>>>>>>> data are in the three variables at the bottom of my email but I
>>>>>>>>> forgot to mention the random participant effect (n = 17). Thanks!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thursday 28 May 2015, Ben Bolker <bbolker at gmail.com> wrote:
>>>>>>>>>>
>>>>>>> On 15-05-28 06:55 AM, Francesco Romano wrote:
>>>>>>>>>>>> Many thanks to both.
>>>>>>>>>>>>
>>>>>>>>>>>> The approaches you suggest (and others online) help one deal
>>>>>>>>>>>> with the separation problem, but they don't offer any specific
>>>>>>>>>>>> advice on how to get a valid p-value when comparing two levels
>>>>>>>>>>>> of the model vexed by separation.
>>>>>>>>>>>>
>>>>>>>>>>>> Ben, here's the output of the bglmer, which by the way would
>>>>>>>>>>>> be ideal since it allows me to retain the random effect, so
>>>>>>>>>>>> that all my pairwise comparisons are conducted using mixed
>>>>>>>>>>>> effects.
>>>>>>>>>>>>
>>>>>>>>>>>> > trial <- bglmer(Correct ~ Syntax.Semantics + (1|Part.name),
>>>>>>>>>>>> +                 data = trialglm, family = binomial)
>>>>>>>>>>>> Warning message:
>>>>>>>>>>>> package ‘blme’ was built under R version 3.1.2
>>>>>>>>>>>> > summary(trial)
>>>>>>>>>>>> Cov prior  : Part.name ~ wishart(df = 3.5, scale = Inf,
>>>>>>>>>>>>              posterior.scale = cov, common.scale = TRUE)
>>>>>>>>>>>> Prior dev  : 1.4371
>>>>>>>>>>>>
>>>>>>>>>>>> Generalized linear mixed model fit by maximum likelihood
>>>>>>>>>>>>   (Laplace Approximation) ['bglmerMod']
>>>>>>>>>>>>  Family: binomial  ( logit )
>>>>>>>>>>>> Formula: Correct ~ Syntax.Semantics + (1 | Part.name)
>>>>>>>>>>>>    Data: trialglm
>>>>>>>>>>>>
>>>>>>>>>>>>      AIC      BIC   logLik deviance df.resid
>>>>>>>>>>>>    269.9    305.5   -126.0    251.9      376
>>>>>>>>>>>>
>>>>>>>>>>>> Scaled residuals:
>>>>>>>>>>>>     Min      1Q  Median      3Q     Max
>>>>>>>>>>>> -0.9828 -0.4281 -0.2445 -0.0002  5.7872
>>>>>>>>>>>>
>>>>>>>>>>>> Random effects:
>>>>>>>>>>>>  Groups    Name        Variance Std.Dev.
>>>>>>>>>>>>  Part.name (Intercept) 0.3836   0.6194
>>>>>>>>>>>> Number of obs: 385, groups: Part.name, 16
>>>>>>>>>>>>
>>>>>>>>>>>> Fixed effects:
>>>>>>>>>>>>                     Estimate Std. Error z value Pr(>|z|)
>>>>>>>>>>>> (Intercept)          -1.8671     0.4538  -4.114 3.89e-05 ***
>>>>>>>>>>>> Syntax.Semantics A    0.8121     0.5397   1.505   0.1324
>>>>>>>>>>>> Syntax.Semantics B  -16.4391  1195.5031  -0.014   0.9890
>>>>>>>>>>>> Syntax.Semantics C   -1.1323     0.7462  -1.517   0.1292
>>>>>>>>>>>> Syntax.Semantics D    0.1789     0.5853   0.306   0.7598
>>>>>>>>>>>> Syntax.Semantics E   -0.8071     0.7500  -1.076   0.2819
>>>>>>>>>>>> Syntax.Semantics F   -1.5051     0.8575  -1.755   0.0792 .
>>>>>>>>>>>> Syntax.Semantics G    0.4395     0.5417   0.811   0.4171
>>>>>>>>>>>> ---
>>>>>>>>>>>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately the separation problem is still there. Should
>>>>>>>>>>>> I be constraining the parameter somehow? How would I do
>>>>>>>>>>>> that? The data is below.
>>>>>>>
>>>>>>>   Did you read the section in the URL I suggested?  Just using
>>>>>>> bglmer isn't enough; you also have to set a prior on the fixed
>>>>>>> effects.
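
  In other words (a minimal sketch, repeating the diag(9, 8) prior from the
example quoted further down the thread), the call needs an explicit
fixef.prior:

  trial <- bglmer(Correct ~ Syntax.Semantics + (1 | Part.name),
                  data = trialglm, family = binomial,
                  fixef.prior = normal(cov = diag(9, 8)))  # 8 coefficients, sd = 3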
>>>>>>>
>>>>>>>   Your data don't seem to be attached (note that the mailing list
>>>>>>> strips most non-ASCII file types).
>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In passing I also tried brglm, which solves the separation
>>>>>>>>>>>> problem but tells me the comparison is significant, which I
>>>>>>>>>>>> don't believe one bit (see the data below). I am pretty sure
>>>>>>>>>>>> about this because when I releveled and looked at the
>>>>>>>>>>>> comparisons I was able to compute using glmer, these turned
>>>>>>>>>>>> out to be non-significant, even though glmer had told me they
>>>>>>>>>>>> were:
>>>>>>>>>>>>
>>>>>>>>>>>> > trial <- brglm(Correct ~ Syntax.Semantics, data = trialglm,
>>>>>>>>>>>> +                family = binomial)
>>>>>>>>>>>> Warning messages:
>>>>>>>>>>>> 1: package ‘elrm’ was built under R version 3.1.2
>>>>>>>>>>>> 2: package ‘coda’ was built under R version 3.1.3
>>>>>>>>>>>> > summary(trial)
>>>>>>>>>>>>
>>>>>>>>>>>> Call:
>>>>>>>>>>>> brglm(formula = Correct ~ Syntax.Semantics, family = binomial,
>>>>>>>>>>>>     data = trialglm)
>>>>>>>>>>>>
>>>>>>>>>>>> Coefficients:
>>>>>>>>>>>>                    Estimate Std. Error z value Pr(>|z|)
>>>>>>>>>>>> (Intercept)         -1.6358     0.4035  -4.053 5.05e-05 ***
>>>>>>>>>>>> Syntax.Semantics A   0.6689     0.5169   1.294   0.1957
>>>>>>>>>>>> Syntax.Semantics B  -3.0182     1.4902  -2.025   0.0428 *
>>>>>>>>>>>> Syntax.Semantics C  -1.0135     0.6889  -1.471   0.1413
>>>>>>>>>>>> Syntax.Semantics D   0.1515     0.5571   0.272   0.7857
>>>>>>>>>>>> Syntax.Semantics E  -0.7878     0.6937  -1.136   0.2561
>>>>>>>>>>>> Syntax.Semantics F  -1.2874     0.7702  -1.672   0.0946 .
>>>>>>>>>>>> Syntax.Semantics G   0.4358     0.5186   0.840   0.4007
>>>>>>>>>>>> ---
>>>>>>>>>>>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>>>>>>>>>>>
>>>>>>>>>>>> (Dispersion parameter for binomial family taken to be 1)
>>>>>>>>>>>>
>>>>>>>>>>>>     Null deviance: 262.51  on 384  degrees of freedom
>>>>>>>>>>>> Residual deviance: 256.22  on 377  degrees of freedom
>>>>>>>>>>>> Penalized deviance: 245.5554
>>>>>>>>>>>> AIC:  272.22
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> MCMCglmm is too complex for me.
>>>>>>>>>>>>
>>>>>>>>>>>> Wolfgang, I tried the penalized likelihood method (logistf
>>>>>>>>>>>> function) but output is hard to read:
>>>>>>>>>>>>
>>>>>>>>>>>> > trial <- logistf(Correct ~ Syntax.Semantics, data = trialglm,
>>>>>>>>>>>> +                  family = binomial)
>>>>>>>>>>>> Warning messages:
>>>>>>>>>>>> 1: package ‘logistf’ was built under R version 3.1.2
>>>>>>>>>>>> 2: package ‘mice’ was built under R version 3.1.2
>>>>>>>>>>>> > summary(trial)
>>>>>>>>>>>> logistf(formula = Correct ~ Syntax.Semantics, data = trialglm,
>>>>>>>>>>>>     family = binomial)
>>>>>>>>>>>>
>>>>>>>>>>>> Model fitted by Penalized ML
>>>>>>>>>>>> Confidence intervals and p-values by Profile Likelihood
>>>>>>>>>>>>
>>>>>>>>>>>>                           coef  se(coef) lower 0.95 upper 0.95     Chisq            p
>>>>>>>>>>>> (Intercept)          3.2094017 0.7724482  2.9648747  3.5127830  0.000000 1.000000e+00
>>>>>>>>>>>> Syntax.Semantics A   4.1767737 6.3254344  0.4224696 12.0673987 64.224452 1.110223e-15
>>>>>>>>>>>> Syntax.Semantics B  -1.0583602 0.8959376 -1.3963977 -0.7625216  0.000000 1.000000e+00
>>>>>>>>>>>> Syntax.Semantics C  -0.7299070 0.9308193 -1.0765598 -0.4180076  0.000000 1.000000e+00
>>>>>>>>>>>> Syntax.Semantics D   0.2314740 1.1563731 -0.1704535  0.6479908  1.156512 2.821901e-01
>>>>>>>>>>>> Syntax.Semantics E  -0.6476907 0.9771824 -1.0076740 -0.3164066  0.000000 1.000000e+00
>>>>>>>>>>>> Syntax.Semantics F  -0.8271499 0.9305931 -1.1743834 -0.5160799  0.000000 1.000000e+00
>>>>>>>>>>>> Syntax.Semantics G   0.9909046 1.3787175  0.5457741  1.5353981  0.000000 1.000000e+00
>>>>>>>>>>>>
>>>>>>>>>>>> Likelihood ratio test=121.9841 on 7 df, p=0, n=385
>>>>>>>>>>>> Wald test = 5.334321 on 7 df, p = 0.6192356
>>>>>>>>>>>>
>>>>>>>>>>>> In particular, what is this model telling me? That Z (my ref
>>>>>>>>>>>> level) and B are significantly different?
>>>>>>>>>>>>
>>>>>>>>>>>> I'm happy to try the elrm function with exact logistic
>>>>>>>>>>>> regression but I am not capable of programming it. Besides,
>>>>>>>>>>>> would it give me valid estimates for the comparison between
>>>>>>>>>>>> the Z and B levels? The data frame should look like this:
>>>>>>>>>>>>
>>>>>>>>>>>> Outcome variable (Correct, incorrect); Predictor variable
>>>>>>>>>>>> (A, B, C, D, E, F, G, Z); Counts (E: 38,3; B: 51,0; Z: 37,7;
>>>>>>>>>>>> G: 40,12; D: 36,8; C: 45,3; A: 34,13; F: 65,22).
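
  A sketch of that aggregated data frame in R (the counts are copied from the
pairs above; which number in each pair is "correct" vs "incorrect" should be
checked against the raw data before using it):

  counts <- data.frame(
    Syntax.Semantics = c("E", "B", "Z", "G", "D", "C", "A", "F"),
    n1 = c(38, 51, 37, 40, 36, 45, 34, 65),
    n2 = c( 3,  0,  7, 12,  8,  3, 13, 22)
  )
  # a two-column response cbind(successes, failures) ~ Syntax.Semantics can
  # then be fit with glm(family = binomial) on the aggregated counts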
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you! F.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, May 28, 2015 at 2:28 AM, Ben Bolker
>>>>>>>>>>>> <bbolker at gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> And for what it's worth, you can do this in conjunction with
>>>>>>>>>>>>> lme4 by using the blme package instead (a thin Bayesian wrapper
>>>>>>>>>>>>> around lme4), or via the MCMCglmm package; see
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://ms.mcmaster.ca/~bolker/R/misc/foxchapter/bolker_chap.html
>>>>>>>>>>>>>
>>>>>>>>>>>>> for an example (search for "complete separation").
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, May 27, 2015 at 5:21 PM, Viechtbauer Wolfgang
>>>>>>>>>>>>> (STAT) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
>>>>>>>>>>>>>> You may need to consider using an 'exact', Bayesian, or
>>>>>>>>>>>>>> penalized
>>>>>>>>>>>>> likelihood approach (along the lines proposed by Firth).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maybe a place to start:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://sas-and-r.blogspot.nl/2010/11/example-815-firth-logistic-regression.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Wolfgang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>> From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org]
>>>>>>>>>>>>>>> On Behalf Of Francesco Romano
>>>>>>>>>>>>>>> Sent: Wednesday, May 27, 2015 23:00
>>>>>>>>>>>>>>> To: r-sig-mixed-models at r-project.org
>>>>>>>>>>>>>>> Subject: [R-sig-ME] Zero cells in contrast matrix problem
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> After giving up on a glmer for my data, I remembered a
>>>>>>>>>>>>>>> post by Roger Levy suggesting to try a non-mixed-effects
>>>>>>>>>>>>>>> glm when one of the cells in a matrix is zero.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To put this into perspective:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> > trial <- glmer(Correct ~ Syntax.Semantics + (1 | Part.name),
>>>>>>>>>>>>>>> +                data = trialglm, family = binomial)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Warning messages:
>>>>>>>>>>>>>>> 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
>>>>>>>>>>>>>>>   Model failed to converge with max|grad| = 0.053657 (tol = 0.001, component 4)
>>>>>>>>>>>>>>> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
>>>>>>>>>>>>>>>   Model is nearly unidentifiable: large eigenvalue ratio - Rescale variables?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My data has a binary outcome, correct or incorrect, a
>>>>>>>>>>>>>>> fixed effect predictor factor with 8 levels, and a random
>>>>>>>>>>>>>>> effect for participants. I believe the problem R is
>>>>>>>>>>>>>>> encountering is with one level of the factor (let us call
>>>>>>>>>>>>>>> it level B) which has no counts (no, I won't try to post
>>>>>>>>>>>>>>> the table from the paper with the counts because I know it
>>>>>>>>>>>>>>> will get garbled up!).
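
  Rather than posting the table, a one-line cross-tabulation (a sketch, with
the names used in this thread) will show the empty cell for level B directly:

  with(trialglm, table(Syntax.Semantics, Correct))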
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I attempt a glm with the same data:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> > trial <- glm(Correct ~ Syntax.Semantics, data = trialglm,
>>>>>>>>>>>>>>> +              family = binomial)
>>>>>>>>>>>>>>> > anova(trial)
>>>>>>>>>>>>>>> Analysis of Deviance Table
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Model: binomial, link: logit
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Response: Correct
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Terms added sequentially (first to last)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                  Df Deviance Resid. Df Resid. Dev
>>>>>>>>>>>>>>> NULL                                384     289.63
>>>>>>>>>>>>>>> Syntax.Semantics  7   34.651       377     254.97
>>>>>>>>>>>>>>>> summary(trial)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Call:
>>>>>>>>>>>>>>> glm(formula = Correct ~ Syntax.Semantics, family = binomial,
>>>>>>>>>>>>>>>     data = trialglm)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Deviance Residuals:
>>>>>>>>>>>>>>>      Min        1Q    Median        3Q       Max
>>>>>>>>>>>>>>> -0.79480  -0.62569  -0.34474  -0.00013   2.52113
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Coefficients:
>>>>>>>>>>>>>>>                    Estimate Std. Error z value Pr(>|z|)
>>>>>>>>>>>>>>> (Intercept)         -1.6917     0.4113  -4.113 3.91e-05 ***
>>>>>>>>>>>>>>> Syntax.Semantics A   0.7013     0.5241   1.338   0.1809
>>>>>>>>>>>>>>> Syntax.Semantics B -16.8744   904.5273  -0.019   0.9851
>>>>>>>>>>>>>>> Syntax.Semantics C  -1.1015     0.7231  -1.523   0.1277
>>>>>>>>>>>>>>> Syntax.Semantics D   0.1602     0.5667   0.283   0.7774
>>>>>>>>>>>>>>> Syntax.Semantics E  -0.8733     0.7267  -1.202   0.2295
>>>>>>>>>>>>>>> Syntax.Semantics F  -1.4438     0.8312  -1.737   0.0824 .
>>>>>>>>>>>>>>> Syntax.Semantics G   0.4630     0.5262   0.880   0.3789
>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (Dispersion parameter for binomial family taken to be 1)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     Null deviance: 289.63  on 384  degrees of freedom
>>>>>>>>>>>>>>> Residual deviance: 254.98  on 377  degrees of freedom
>>>>>>>>>>>>>>> AIC: 270.98
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Number of Fisher Scoring iterations: 17
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The comparison I'm interested in is between level B and
>>>>>>>>>>>>>>> the reference level but it cannot be estimated as shown
>>>>>>>>>>>>>>> by the ridiculously high estimate and SE value.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any suggestions on how to get a decent beta, SE, z, and p?
>>>>>>>>>>>>>>> It's the only comparison missing in the table for the levels
>>>>>>>>>>>>>>> I need, so I think it would be a bit unacademic of me to
>>>>>>>>>>>>>>> close this deal saying 'the difference could not be
>>>>>>>>>>>>>>> estimated due to zero count'.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And by the way I have seen this comparison being
>>>>>>>>>>>>>>> generated using other stats.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Frank
>>>>>>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Sent from Gmail for IPhone
>>>>>
>>>>> --
>>>>> Frank Romano Ph.D.
>>>>>
>>>>> Tel. +39 3911639149
>>>>>
>>>>> LinkedIn
>>>>> https://it.linkedin.com/pub/francesco-bryan-romano/33/1/162
>>>>>
>>>>> Academia.edu
>>>>> https://sheffield.academia.edu/FrancescoRomano