[R-sig-ME] different overdispersion parameter for binomial GLMM in lme4, glmmADMB and glmmTMB

Muldoon, Ariel Ariel.Muldoon at oregonstate.edu
Fri Mar 16 17:31:59 CET 2018

Hi, Thomas-

After doing a little poking around, my impression is that the glmmTMB model may not be returning the correct values for the residuals for a binomial model.

I fit the model from the "Examples" in both glmmTMB (development version) and lme4, using the two different coding approaches that are often used for fitting binomial models (one with a matrix response and one with a proportion plus number of trials as weights).

The two lme4 models return identical results, so there wasn't much reason to pursue them both.  The second model (tmbm2) is like the model you are fitting.


tmbm1 = glmmTMB(cbind(incidence, size - incidence) ~ period + (1 | herd),
                  data = cbpp, family = binomial)

tmbm2 = glmmTMB(incidence/size ~ period + (1 | herd),
                  data = cbpp, weights = size, family = binomial)

tmbm3 = glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                  data = cbpp, family = binomial)

tmbm4 = glmer(incidence/size ~ period + (1 | herd),
                data = cbpp, family = binomial, weights = size)

Drilling down for reasons for residuals to be different, I found that the second model returns different fitted values than the others.  


I can't tell you why this is and I can't be certain the values from the other models are correct, but if I manually calculate the fitted values for the second model the results match the values from the other three models.  (You'll see slight differences between the lme4 and glmmTMB models due to slight differences in estimates of the conditional modes of the random effects.)

# Manual calculation of fitted values
plogis( (getME(tmbm2, "X") %*% fixef(tmbm2)$cond) + 
                 times = c(4L, 3L, 4L, 4L, 4L, 4L, 4L, 1L, 4L, 4L, 4L, 4L, 4L, 4L, 4L)) )

A good approach then might be to use the matrix-style response variable in glmmTMB.  However, the residuals.glmmTMB function then uses the matrix response when calculating residuals instead of a proportion response.

# Calculate response residuals for matrix response
model.response(tmbm1$frame) - fitted(tmbm1)
# Here is what these should be, which matches lme4
model.response(tmbm1$frame)[,1]/cbpp$size - fitted(tmbm1)
getME(tmbm4, "y") - fitted(tmbm4)

The last difference I found is in the calculation of the pearson residuals.  I believe glmmTMB uses mu*(1-mu) to calculate the variance that is used in the divisor (I base this on family(tmbm1)$variance).  However, I think lme4 uses the variance of a proportion, mu*(1-mu)/m (where m is the binomial sample size/number of trials). 

residuals(tmbm4, type = "pearson")
( getME(tmbm4, "y") - fitted(tmbm4) )/sqrt( fitted(tmbm3)*(1-fitted(tmbm3))/cbpp$size )

I'm not sure any of this helps you decide how to move forward, but I definitely wouldn't dismiss the overdispersion you are seeing in your lme4 model yet.


-----Original Message-----
From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Thomas Merkling
Sent: Thursday, March 15, 2018 6:51 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] different overdispersion parameter for binomial GLMM in lme4, glmmADMB and glmmTMB

Hi all,

I'm trying to model proportion data using a binomial GLMM. I first started with lme4 and saw that my data were overdispersed.

LME4Inter <- glmer(PropInter ~ Manip + Observer  +(1|Date), family=binomial, weights = NbAg, data = senior, control=glmerControl(optimizer="bobyqa",optCtrl = list(maxfun=100000)))

Using the dispersion_glmer (provided by Ben Bolker, see below for code), I got a dispersion parameter of 1.99 (as expected the outcome is the same with glmmADMB).

I then tried to add an observation level random effect, and the dispersion parameter dropped to 0.24. With such a small dispersion parameter, my model becomes overly conservative and I'll probably not be able to detect any (true) effect (if I understand correctly). Would you stick to this model anyways?

I started investigating beta-binomial models to see if they could better model overdispersion (and not be too conservative) and wanted to use glmmTMB to do that.

I first tried to refit the binomial GLMM with glmmTMB and calculate the dispersion parameter using the dispfun function (also provided by Ben Bolker, see below for code), and got a very different response: the dispersion parameter was 0.514. If this is correct, I wouldn't have to worry about overdispersion.

TMBInter <- glmmTMB(PropInter ~ Manip + Observer  +(1|Date), family=binomial, weights = NbAg, data = senior)

The estimates and SEs were identical between the 3 packages, so I am fitting correctly the exact same models. However, the residuals between glmer/glmmADMB and glmmTMB models were quite different. Is that expected?
I tried to run Poisson GLMMs with the 2 packages and got identical dispersion parameters, so the dispfun function works well for glmmTMB too.

Which dispersion parameter value should I trust for the binomial GLMM? 
Is there something specific about binomial GLMMs in glmmTMB?

Thanks in advance for your help!

Thomas Merkling

dispersion_glmer <- function(model) {
   ## number of variance parameters in
   ##   an n-by-n variance-covariance matrix
   vpars <- function(m) {
   model.df <- sum(sapply(VarCorr(model),vpars))+length(fixef(model))
   rdf <- nrow(model.frame(model))-model.df
   rp <- residuals(model,type="pearson")
   Pearson.chisq <- sum(rp^2)
   prat <- Pearson.chisq/rdf
   pval <- pchisq(Pearson.chisq, df=rdf, lower.tail=FALSE)

dispfun <- function(m) {
   r <- residuals(m,type="pearson")
   n <- df.residual(m)
   dsq <- sum(r^2)

R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

More information about the R-sig-mixed-models mailing list