[R-sig-ME] Replicating type III anova tests for glmer/GLMM

Tue Feb 23 21:10:08 CET 2016

Dear Francesco,

For a 1-df test, the Wald chi-square is just Z^2, but the chi-square is more general. When a term in the model has more than 1 df, there is more than one beta (hat) and one SE (and covariances) for the coefficients in the term. If you want to see the individual coefficient estimates, then summary(mod) will show you each coefficient estimate, the SE for each estimate, Z, and p. Why one would want to look at the individual effect-coded coefficients and tests in this context escapes me. 

Best,
 John

> -----Original Message-----
> From: Francesco Romano [mailto:francescobryanromano at gmail.com]
> Sent: February 23, 2016 2:34 PM
> To: Fox, John <jfox at mcmaster.ca>
> Cc: Emmanuel Curis <emmanuel.curis at parisdescartes.fr>; r-sig-mixed-
> models at r-project.org
> Subject: Re: [R-sig-ME] Replicating type III anova tests for glmer/GLMM
> 
> John,
> 
> I tried the Anova() function in the car package implemented with contr.sum()
> but it doesn't produce beta, SE, z, and p.
> To be more precise, R requires that either the F or Chi sq statistic be used.
> The model I used was termed "mod", here is the error:
> 
> 
> > Anova(mod, type=c("III"),
> +     test.statistic=c("LR"))
> Error in match.arg(test.statistic) : 'arg' should be one of “Chisq”, “F”
> 
> Chi square produces the following output:
> 
> > Anova(mod, type=c("III"),
> +     test.statistic=c("Chisq"))
> Analysis of Deviance Table (Type III Wald chisquare tests)
> 
> Response: Correct
>                               Chisq Df Pr(>Chisq)
> (Intercept)                 67.7409  1  < 2.2e-16 ***
> Syntax                       0.2856  1   0.593083
> Animacy                      6.2575  1   0.012367 *
> Prof.group.2                 2.9888  2   0.224379
> Syntax:Animacy               0.0970  1   0.755521
> Syntax:Prof.group.2          9.3054  2   0.009536 **
> Animacy:Prof.group.2         4.7633  2   0.092399 .
> Syntax:Animacy:Prof.group.2  1.3704  2   0.503997
> 
> So I still don't know how Raffrey et al. reported beta, SE, z, and p for a main
> effect of factor with 4 levels.
> If reviewers ask me to do this, I will argue that reporting chi square tests with
> corresponding p-values is a more accurate way of reporting main effects and
> interactions.
> 
> 
> If I haven't abused enough of your time, it would be beneficial to understand
> which of the two
> methods suggested by Henrik I should adopt. I attach my data.
> 
> The predictors of interest are Syntax (2 levels), Animacy (2 levels),
> Prof.group.2 (3 levels),
> and the outcome 'correct', while the random effects are 'Part.name' and
> 'Item'. The best model fit is a
> bglmer with glmerControl(optimizer = "bobyqa") and nAGQ=1
> 
> > summary(recallmodel4bisB3)
> Cov prior  : Part.name ~ wishart(df = 3.5, scale = Inf, posterior.scale = cov,
> common.scale = TRUE)
>            : Item ~ wishart(df = 3.5, scale = Inf, posterior.scale = cov,
> common.scale = TRUE)
> Prior dev  : 1.3565
> 
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) ['bglmerMod']
>  Family: binomial  ( logit )
> Formula: Correct ~ Syntax * Animacy * Prof.group.2 + (1 | Part.name) +      (1
> | Item)
>    Data: recall
> Control: glmerControl(optimizer = "bobyqa")
> 
>      AIC      BIC   logLik deviance df.resid
>    313.3    372.9   -142.6    285.3      509
> 
> Scaled residuals:
>     Min      1Q  Median      3Q     Max
> -1.3517 -0.2926 -0.1802 -0.1137  9.3666
> 
> Random effects:
>  Groups    Name        Variance Std.Dev.
>  Part.name (Intercept) 0.8046   0.8970
>  Item      (Intercept) 0.5031   0.7093
> Number of obs: 523, groups:  Part.name, 42; Item, 16
> 
> Fixed effects:
>                                        Estimate Std. Error z value Pr(>|z|)
> (Intercept)                             -0.8960     0.6317  -1.418 0.156071
> Syntaxs                                 -2.0713     0.9447  -2.193 0.028342 *
> Animacy+AN -AN                          -3.0539     1.2548  -2.434 0.014941 *
> Prof.group.2int                         -2.5594     0.9473  -2.702 0.006898 **
> Prof.group.2ns                          -1.8673     0.7634  -2.446 0.014442 *
> Syntaxs:Animacy+AN -AN                   1.8642     1.8202   1.024 0.305750
> Syntaxs:Prof.group.2int                  4.1704     1.1676   3.572 0.000355 ***
> Syntaxs:Prof.group.2ns                   2.4244     1.0483   2.313 0.020736 *
> Animacy+AN -AN:Prof.group.2int           3.0067     1.5528   1.936 0.052824 .
> Animacy+AN -AN:Prof.group.2ns            1.3245     1.6071   0.824 0.409848
> Syntaxs:Animacy+AN -AN:Prof.group.2int  -2.2056     2.0550  -1.073 0.283162
> Syntaxs:Animacy+AN -AN:Prof.group.2ns   -2.3249     2.3108  -1.006 0.314360
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> 
> Henrik's first method via afex:mixed leads to:
> 
> > m4<-mixed(recallmodel4bisB3, data = recall, family = binomial, method =
> "LRT")
> Formula (the first argument) converted to formula.
> Fitting 8 (g)lmer() models:
> (8 warnings omitted)
> 
> > anova(m4)
> Mixed Model Anova Table (Type 3 tests)
> 
> Model: Correct ~ Syntax * Animacy * Prof.group.2 + (1 | Part.name) +
> Model:     (1 | Item)
> Data: recall
> Df full model: 14
>                             Df   Chisq Chi Df Pr(>Chisq)
> Syntax                      13  5.5659      1   0.018313 *
> Animacy                     13  8.4710      1   0.003609 **
> Prof.group.2                12 10.5099      2   0.005222 **
> Syntax:Animacy              13  0.9832      1   0.321400
> Syntax:Prof.group.2         12 15.8094      2   0.000369 ***
> Animacy:Prof.group.2        12  3.9188      2   0.140945
> Syntax:Animacy:Prof.group.2 12  1.2240      2   0.542272
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> The result is a main effect of Syntax, Animacy, Prof.group.2, and interaction
> between Syntax and Prof.Group.2. The summary(m4) is perfectly
> interpretable.
> 
> Henrik's second method yields:
> 
> *set contrasts*
> >Syntax01 <- as.factor(1*(recall$Syntax=="of") + 2*(recall$Syntax=="''s"))
> 
> >Animacy01 <- as.factor(1*(recall$Animacy=="-AN +AN") +
> 2*(recall$Animacy=="+AN -AN"))
> 
> >Group012 <- as.factor(1*(recall$Prof.group.2=="adv") +
> 2*(recall$Prof.group.2=="int") + 3*(recall$Prof.group.2=="ns"))
> >contrasts(Syntax01) <- contr.sum
> 
> >contrasts(Animacy01) <- contr.sum
> 
> >contrasts(Group012) <- contr.sum
> 
> 
> *try second method*
> > m5 <- bglmer (Correct~Syntax01 * Animacy01 * Group012 + (1 | Part.name)
> +      (1 | Item), data = recall, control = glmerControl(optimizer = "bobyqa"),
> nAGQ=1, family=binomial, expand_re= T)
> Warning message:
> extra argument(s) ‘expand_re’ disregarded
> > car::Anova(m5, type = 3)
> Analysis of Deviance Table (Type III Wald chisquare tests)
> 
> Response: Correct
>                               Chisq Df Pr(>Chisq)
> (Intercept)                 67.7409  1  < 2.2e-16 ***
> Syntax01                     0.2856  1   0.593083
> Animacy01                    6.2575  1   0.012367 *
> Group012                     2.9888  2   0.224379
> Syntax01:Animacy01           0.0970  1   0.755521
> Syntax01:Group012            9.3054  2   0.009536 **
> Animacy01:Group012           4.7633  2   0.092399 .
> Syntax01:Animacy01:Group012  1.3704  2   0.503997
> 
> The result this time is a main effect of what was Animacy and interaction
> between what was Syntax and Prof.Group.2 ?!
> 
> The summary(m5) is perfectly interpretable.
> 
> 
> 
> On Tue, Feb 23, 2016 at 6:17 PM, Fox, John <jfox at mcmaster.ca
> <mailto:jfox at mcmaster.ca> > wrote:
> 
> 
> 	Dear Emmanuel,
> 
> 	First, the relevant linear hypothesis is for several coefficients
> simultaneously -- for example, all 3 coefficients for the contrasts
> representing a 4-level factor -- not for a single contrast. Although it's true
> that any linear combination of parameters that are 0 is 0, the converse isn't
> true. Second, for a GLMM, we really should be talking about type-III tests not
> type-III sums of squares.
> 
> 	Type-III tests are dependent on coding in the full-rank
> parametrization of linear (and similar) models used in R, to make the tests
> correspond to reasonable hypotheses. The invariance of type-II tests with
> respect to coding is attractive, but shouldn't distract from the fundamental
> issues, which are the hypotheses that are tested and the power of the tests.
> 
> 	Best,
> 	 John
> 
> 	> -----Original Message-----
> 	> From: Emmanuel Curis [mailto:emmanuel.curis at parisdescartes.fr
> <mailto:emmanuel.curis at parisdescartes.fr> ]
> 	> Sent: February 23, 2016 11:50 AM
> 	> To: Fox, John <jfox at mcmaster.ca <mailto:jfox at mcmaster.ca> >
> 	> Cc: Francesco Romano <francescobryanromano at gmail.com
> <mailto:francescobryanromano at gmail.com> >; r-sig-mixed-
> 	> models at r-project.org <mailto:models at r-project.org>
> 	> Subject: Re: [R-sig-ME] Replicating type III anova tests for
> glmer/GLMM
> 	>
> 
> 	> Dear Pr Fox,
> 	>
> 	> Thanks for your precision. But to summarize this test of, let's say 3
> 	> parameters to 0 for a 4-levels factor, by a single value with its SE, as
> 	> mentionned in Francesco's mail, the linear combination of these
> parameters
> 	> that is practically tested by this sum of square is needed, isn't it ?
> 	>
> 	> I mean, if really the parameters are all 0, whatever linear
> combination could
> 	> do the job, but type III sum of square just tests one of all possible
> linear
> 	> combinations, right?
> 	>
> 	> By the way, I was always very annoyed by the fact that Type III sum
> of
> 	> squares are so dependent on coding, but that's another debate...
> 	>
> 	> Best regards,
> 	>
> 	> On Tue, Feb 23, 2016 at 04:15:02PM +0000, Fox, John wrote:
> 	> < Dear Emmanuel,
> 	> <
> 	> < With proper contrast coding (i.e., a coding that's orthogonal in the
> *basis*
> 	> of the design, such as provided by contr.sum() ), a "type-III" test is
> just a test
> 	> that the corresponding parameters are 0. The models in question
> are
> 	> generalized linear (mixed) models and so sums of squares aren't
> really
> 	> involved, but one could do the corresponding Wald (like
> car::Anova) or LR
> 	> test. The Wald test is what you'd get with multcomp:glht or
> 	> car:linearHypothesis. BTW, I don't think that it would be hard for
> car::Anova
> 	> to be extended to provide LR tests in this case.
> 	> <
> 	> < Best,
> 	> <  John
> 	>
> 	> --
> 	>                                 Emmanuel CURIS
> 	>                                 emmanuel.curis at parisdescartes.fr
> <mailto:emmanuel.curis at parisdescartes.fr>
> 	>
> 	> Page WWW: http://emmanuel.curis.online.fr/index.html
> 
> 
> 
> 
> 
> --
> 
> Frank Romano Ph.D.
> 
> Tel. +39 3911639149
> 
> 
> LinkedIn
> https://it.linkedin.com/pub/francesco-bryan-romano/33/1/162
> 
> 
> Academia.edu
> https://sheffield.academia.edu/FrancescoRomano