[R-sig-ME] Anova()'s Wald chisquare test for predictors with df > 1 in a logistic GLMM

Thu Apr 18 18:18:45 CEST 2019

> I'm trying to figure out how the Wald chi-square test is performed for a
> nominal predictor with more than two levels in the context of a logistic
> GLMM.

Keep in mind that Wald tests and LRTs for GLMMs come with health warnings, e.g. 
https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#tests-of-effects-i.e.testing-that-several-parameters-are-simultaneously-zero

> But how are they calculated for a multi-level nominal predictor?
> 
> My (un)educated guess is that for each level of the nominal variable, the
> z-score is squared and the sum of these squares compared to the right-tail
> probability of the chi-squared distribution with DF equal to the number of
> levels of the predictor minus one. And indeed this square-the-z approach
> seems to correctly reproduce the results of Anova() for predictors with a
> single degree of freedom. But I can't make it reproduce the results of
> Anova() for predictors with more than one level. Hence my question: how is
> the test statistic calculated?

When the null hypothesis is that more than one fixed effect is zero (as with a factor with >2 levels), the covariances of the fixed effects have to be be taken into account. The Wald statistic is

    t(R %*% b) %*% solve(R %*% vc %*% t(R)) %*% (R %*% b)

where R is a restriction matrix that picks out the contrasts you're interested in, b is the fixed effect estimates vector, and vc is the estimated covariance matrix of the fixed effects (and we’re assuming that as is usual the null hypothesis values of b are all zero). In the case where you’re testing a multilevel categorical variable, R %*% b is simply the vector of estimates for the k-1 non-reference levels and R %*% vc %*% t(R) is the k-1 X k-1 block of the covariance matrix for those k-1 fixed effects. If the covariances were all zero (e.g. by design — but I don’t think they can be for multilevel factors), the Wald statistic formula above would reduce to your sum of squares of z method.

Below is an binomial glmer example with a Wald test function I found on the web years ago. It gives the same result as car::Anova but it’s easier to see the inner workings.

Best wishes,
Paul

# wald z-test function 
# (adapted from Stijn Ruiter's function http://stijnruiter.nl/blog/?p=309 - broken link)

Wald<-
  function(object, # lme4 or glmmTMB fit
           R, # restrictions matrix. null hypothesis: R %*% fixef(object) = q
           q = NULL, # null values. by default q is a vector of zeroes
           gmmTMB.model = "cond")
  {
    require(lme4)
    if (!is.matrix(R)) stop("Restrictions must be a matrix")
    if(is.null(q)) q <- rep(0, nrow(R))
    b <- fixef(object)
    if(class(b) == "fixef.glmmTMB") b <- b[[gmmTMB.model]]
    vc <- vcov(object)
    if("vcov.glmmTMB" %in% class(vc)) vc <- vc[[gmmTMB.model]]
    w <- t(R %*% b - q) %*% solve(R %*% vc %*% t(R)) %*% (R %*% b - q)
    pw <- pchisq(w[1], length(q), lower.tail = FALSE)
    cat("*************\n* Wald Test *\n*************\n")
    cat("lme4 fixed effects:\n")
    print(fixef(object))
    cat("\nRestrictions:\n")
    print(R)
    cat("\nq = ",q)
    cat("\nChi-square:", round(w[1],3), " df = ", length(q))
    cat("\nProb x>chisq:", round(pw, 5), "\n")
    return(pw)
  }

# first example from ?glmer
library(lme4)
library(car)
(gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
              data = cbpp, family = binomial))

R <- cbind(0, diag(3))
# ...but ideally you want to generate the restriction matrix automatically

Wald(gm1, R = R)
anova(gm1, update(gm1, ~ . - period))
Anova(gm1)