[R-sig-ME] Binomial model R^2 and ICC calculation with lme4
Maurits Van Zinnicq Bergmann
mauritsvzb at gmail.com
Thu Oct 23 12:41:39 CEST 2014
Dear list,
Im trying to calculate the marginal and conditional R^2 for a binomial GLMM using the lme4 package. Some googling a while ago brought me to the site of Jonathan Lefcheck (http://jonlefcheck.net/2013/03/13/r2-for-linear-mixed-effects-models/) which offers exactly this. However, more recently I became aware of a recent paper by Dr. Paul Johnson (http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12225/abstract) that explains the calculation of R^2 for random slope models and points to the r.squaredGLMM function in the MuMIn package.
Unfortunately I experience some inconsistencies and problems when comparing R^2 values of three different GLMMs 1) random intercept mixed model, 2) mixed model with random covariance between intercept and slope, and 3) a mixed model without random covariance between intercept and slope, using the 2 methods:.
1) a comparison of marginal and conditional R^2 estimates between Lefchecks method and the MuMIn package showed consistent marginal R^2 estimates for the first and the second GLMM, whereas the marginal R^2 values were lower using Lefchecks method. The third mixed model showed both different conditional and marginal R^2. Could anyone explain why I obtain different R^2 estimates when comparing the 2 methods,
2) When I calculate R^2 values using the code from the supporting information with Dr. Johnsons paper (slightly modified to reflect a binomial GLMM instead of poison LMM) I obtain different values when compared to the values obtained from the MuMIn package. If I calculated the variance components correctly, they should return identical R^2 values. While I dont have to bother with the mistake(s) I made in the code to obtain R^2 values (I can just use the package instead), I would like to know how to correctly calculate the variance components needed to calculate the ICC over the whole sample of all three models.
Thank you for any help.
Maurits van Zinnicq
# random intercept model
m1 <- glmer(cbind(df$Valid.detections, df$Missed.detections) ~ distance +
Habitat + Replicate + transmitter.depth + receiver.depth +
wind.speed + wtc + Transmitter + (1 | Unit) +
(1 | SUR.ID) + distance:Transmitter +
distance:Habitat + distance:transmitter.depth + distance:receiver.depth +
distance:wind.speed, data = df, family = binomial(link=logit),control=glmerControl(calc.derivs=F))
# random slope model
m2 <- glmer(cbind(df$Valid.detections, df$Missed.detections) ~ distance +
Habitat + Replicate + transmitter.depth + receiver.depth +
wind.speed + wtc + Transmitter + (1 | Unit) +
(distance | SUR.ID) + distance:Transmitter +
distance:Habitat + distance:transmitter.depth + distance:receiver.depth +
distance:wind.speed, data = df, family = binomial(link=logit),control=glmerControl(calc.derivs=F))
# random slope model without random covariance between slope/intercept
m3 <- glmer(cbind(df$Valid.detections, df$Missed.detections) ~ distance +
Habitat + Replicate + transmitter.depth + receiver.depth +
wind.speed + wtc + Transmitter + (1 | Unit) +
(1 | SUR.ID) + (0 + distance | SUR.ID) + distance:Transmitter +
distance:Habitat + distance:transmitter.depth + distance:receiver.depth +
distance:wind.speed, data = df, family = binomial(link=logit),control=glmerControl(calc.derivs=F))
# adjusted code from Paul Johnson to reflect binomial GLMM
# Extract the variance components require for the R-squared statistics,
# starting with the better-fitting random slopes model.
# First we need the design matrix (X), the number of observations (n)
# and the fixed effect estimates (Beta)
X <- model.matrix(m1)
n <- nrow(X)
Beta <- fixef(m1)
# First the fixed effects variance (eqn 27 of Nakagawa & Schielzeth):
Sf <- var(X %*% Beta)
# Second, the list of covariance matrices of the random effects.
# Here the list has only a single element because there is only
# one level of random effects.
(Sigma.list <- VarCorr(m1))
# Use equation 11 in the paper to estimate the random effects variance
# component.
# Using sapply ensures that the same code will work when there are multiple
# random effects (i.e. where length(Sigma.list) > 1)
Sl <-
sum(
sapply(Sigma.list,
function(Sigma)
{
Z <-X[,rownames(Sigma)]
sum(diag(Z %*% Sigma %*% t(Z)))/n
}))
# As this model is an binomial GLMM, the additive dispersion variance, Se, is
# usually fixed to 1 for computational reasons, as additive disperson is not identifiable (Nakagawa & Schielzeth 2013)
Se <- 1
# Finally, the distribution-specific variance, Sd, is pi^2/3 for binomial GLMMs:
Sd <- pi^2/3
# Use eqns 29 and 30 from Nakagawa & Schielzeth to estimate marginal and
# conditional R-squared:
total.var <- Sf + Sl + Se + Sd
(Rsq.m <- Sf / total.var)
(Rsq.c <- (Sf + Sl) / total.var)
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list