[R-sig-ME] GLMER vs. MCMCglmm - difference in model results - Poisson model with offset term

Bernd Lenzner bernd.lenzner at univie.ac.at
Thu May 23 16:50:23 CEST 2019


Dear everyone,

I have a question regarding the comparison of Generalized Linear Mixed 
Models (GLMM - using the lme4 - package) and Markov-Chain Monte-Carlo 
Generalized Mixed Models (MCMCglmm - using the MCMCglmm-package).

I am running mixed models using both packages. While the estimates 
provided by each approach are relatively comparable, the significance 
levels differ strongly. My main questions are the following:

1. How can the divergence in results be explained, and why do both 
methods give different results even though the model structure is the 
same? (Note: diagnostic plots for both models look fine.)

2. I am aware that there is strong debate about the validity of p-values 
within GLMMs. Is there similar concern regarding MCMCglmm?

3. What would be a sensible interpretation of both model outputs, 
especially with respect to PRED2 and the interaction term?


Below I provide a short description and example output:

I want to test the effect of three continuous predictors (PRED2, PRED3, 
PRED4) on a proportional response (RESP). PRED2 and PRED3 enter the 
model as an interaction effect (PRED2*PRED3). The response is modeled 
with a Poisson error structure and a log link, so the denominator of 
the proportion is included as an offset term (offset(log(PRED1))).
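
For context, the offset logic can be sketched with simulated data (all 
names and numbers below are hypothetical, just to illustrate the 
construction):

## Sketch of the offset logic with simulated data (hypothetical example):
## log(E[RESP]) = log(PRED1) + b0 + b1*PRED2, i.e. the model describes the
## rate RESP/PRED1 and the coefficient of log(PRED1) is fixed at 1.
set.seed(1)
n     <- 200
PRED1 <- rpois(n, 50) + 1                        # denominator counts
PRED2 <- rnorm(n)
RESP  <- rpois(n, PRED1 * exp(-3 + 0.1 * PRED2)) # simulate the rate model
coef(glm(RESP ~ offset(log(PRED1)) + PRED2, family = poisson))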


MODEL1 -  GLMM using the lme4-package:


MOD1 <- glmer(RESP ~ offset(log(PRED1)) + PRED2 * PRED3 + PRED4 + 
(1|RANEF), family = "poisson", data = data, 
control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=100000)))

summary(MOD1)

Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: poisson  ( log )
Formula: RESP ~ offset(log(PRED1)) + PRED2 * PRED3 + PRED4 + (1 | RANEF)
   Data: data
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e+05))

     AIC      BIC   logLik deviance df.resid 
  2451.3   2470.9  -1219.6   2439.3      190 

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-10.8175  -1.2477  -0.1913   1.1076  11.0389 

Random effects:
 Groups Name        Variance Std.Dev.
 RANEF  (Intercept) 0.1519   0.3898  
Number of obs: 196, groups:  Order, 56

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -3.29896    0.06250 -52.784  < 2e-16 ***
PRED2        0.09089    0.01850   4.912 9.00e-07 ***
PRED3        0.57262    0.02375  24.114  < 2e-16 ***
PRED4        0.54726    0.01808  30.275  < 2e-16 ***
PRED2:PRED3  0.12190    0.01821   6.693 2.19e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) PRED2  PRED3  PRED4 
PRED2       -0.142                     
PRED3       -0.056  0.102              
PRED4       -0.042  0.135 -0.155       
PRED2:PRED3  0.001  0.017 -0.678  0.273


MODEL2 -  MCMCglmm using the MCMCglmm-package:

Now I build the same model as an MCMCglmm using weakly informative 
priors. The offset term was implemented following the suggestion by 
Jarrod Hadfield in this post 
(https://stat.ethz.ch/pipermail/r-sig-mixed-models/2017q3/026016.html). 
See the model formula and summary below:


prior <- list(B = list(V = diag(6)*10, mu = c(0,1,0,0,0,0)),
              G = list(G1 = list(V = 1, nu = 0.002)),
              R = list(V = 1, nu = 0.002))
## shrink the prior variance for log(PRED1) so its coefficient is
## effectively fixed at its prior mean of 1, i.e. it acts as an offset
prior$B$V[2,2] <- 1e-9
prior


PGLMM.MOD1 <- MCMCglmm(RESP ~ log(PRED1) + PRED2 * PRED3 + PRED4,
                       random = ~ RANEF,
                       family = "poisson",
                       prior = prior,
                       scale = TRUE,
                       pr = TRUE,
                       data = data,
                       nitt = 500000, burnin = 20000, thin = 200)


summary(PGLMM.MOD1)

 Iterations = 20001:499801
 Thinning interval = 200
 Sample size = 2400 

 DIC: 1164.591 

 G-structure:  ~Order

      post.mean  l-95% CI u-95% CI eff.samp
RANEF   0.02264 0.0002387   0.0762     2400

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units    0.2788   0.1946   0.3705     2400

 Location effects: RESP ~ log(PRED1) + PRED2 * PRED3 + PRED4 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)  -3.40280 -3.50958 -3.28826     2400 <4e-04 ***
log(PRED1)    1.00000  0.99994  1.00006     2257 <4e-04 ***
PRED2         0.08369 -0.01995  0.18120     2400  0.113    
PRED3         0.63058  0.52164  0.73889     2400 <4e-04 ***
PRED4         0.60488  0.50565  0.69963     2400 <4e-04 ***
PRED2:PRED3   0.06882 -0.02363  0.17414     2258  0.167    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
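
As an aside, the pMCMC values and credible intervals can also be 
recovered directly from the stored posterior samples; a sketch, 
assuming the fit above (coda is loaded together with MCMCglmm):

## Sketch: inspect the posterior for PRED2 directly from the samples
library(coda)                              # for HPDinterval()
post <- PGLMM.MOD1$Sol[, "PRED2"]          # posterior draws for PRED2
mean(post)                                 # posterior mean
HPDinterval(as.mcmc(post))                 # 95% HPD credible interval
2 * min(mean(post > 0), mean(post < 0))    # pMCMC-style two-sided tail prob.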


As you can see from the two summaries, the estimates are fairly 
similar, with some divergence especially for PRED2 and the interaction 
term. The p-values, however, differ dramatically: in the GLMM (lme4) 
PRED2 and the interaction are highly significant, whereas in the 
MCMCglmm they are not.
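
One candidate explanation (an assumption on my part, not verified with 
these data): MCMCglmm always includes a residual ("units") variance, 
i.e. additive overdispersion, here estimated at about 0.28, whereas the 
lme4 Poisson model has no such term; the scaled residuals of MOD1 
(roughly -11 to 11) do hint at overdispersion, which would deflate the 
lme4 standard errors. One way to check this would be to add an 
observation-level random effect (OLRE) to the lme4 model, sketched 
below (OBS is a hypothetical column created for this purpose):

## Sketch: observation-level random effect as additive overdispersion,
## making the lme4 model structurally closer to the MCMCglmm fit
library(lme4)
data$OBS <- factor(seq_len(nrow(data)))  # one factor level per observation
MOD1.olre <- glmer(RESP ~ offset(log(PRED1)) + PRED2 * PRED3 + PRED4 +
                     (1 | RANEF) + (1 | OBS),
                   family = poisson, data = data,
                   control = glmerControl(optimizer = "bobyqa",
                                          optCtrl = list(maxfun = 100000)))
summary(MOD1.olre)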

My questions now are:

1. How can the divergence in results be explained, and why do both 
methods give different results even though the model structure is the 
same? (Note: diagnostic plots for both models look fine.)

2. I am aware that there is strong debate about the validity of p-values 
within GLMMs. Is there similar concern regarding MCMCglmm?

3. What would be a sensible interpretation of both model outputs, 
especially with respect to PRED2 and the interaction term?


Any insight and help on that matter would be highly appreciated.

Best regards,

Bernd


-- 
Bernd Lenzner PhD

Researcher
University of Vienna
Department of Botany and Biodiversity Research
Division of Conservation Biology, Vegetation and Landscape Ecology
Rennweg 14
A-1030 Vienna



