[R-sig-ME] GLMM post-hoc analysis with bootstrapping

Mon Oct 28 09:33:27 CET 2024

Hello all
I hope you are all doing well!
I was also hoping someone could perhaps help me out with the following:
For an ecological study, we�re interested in comparing the diversity of different organism groups between one land use type (let�s call it �type 1�) and three other land use types.
Our model is: Diversity ~ landuse + (1|location)
Currently, I am using the emmeans-package to conduct the post-hoc analysis and thereby obtain

  1.  Estimated marginal means of the diversity per land use type and their 95% confidence intervals, transformed to the response scale (with a bias adjustment for the random effects)
  2.  Contrasts between the emmeans (more specifically ratios, in the response scale) and the 95%CI thereof, with a Dunnett-like adjustment for multiple comparisons
  3.  P-values for the pairwise comparisons between the estimated mean of land use type 1 and the three other land use types, tests performed on the log-scale, with a Dunnet-like adjustment for multiple comparisons

However, as in the emmeans package Wald-intervals and -tests are used, I was hoping I could shift to an approach that uses bootstrap intervals which would be a bit more reliable for my small sample sizes.
Since I don�t manage to figure out how to code this to obtain the same output as above (a-c), I was hoping someone could help me with this.
Below is a minimal reproducible example for how I currently approach the post-hoc analysis.
It is not the best example given the singularity and really low random effect variance, but I hope this doesn�t bother for this question. Otherwise I could provide another dataset.
Thank you very much in advance for any help!
Lieke

# example data (D1 stands for the Shannon diversity)
example_df <- data.frame(
  code = c(1:88),
  landuse = c(rep("type1", 28), rep("type2", 24),
              rep("type3", 24), rep("type4", 12)),
  location = c(rep(letters[c(1,2,2:13)], each = 2),
               rep(letters[c(1,3:13)], each = 2),
               rep(letters[c(1:6,8:13)], each = 2),
               rep(letters[c(5,6,7,9,10,12)], each = 2)),
  D1 = c(2.59, 2.23, 1, 1, 1, 2.46, 2.94, 1, 2.94, 1.71,
          3.92, 2.94, 0, 3.78, 1, 2, 1.3, 1, 0, 0, 0, 3.75,
          0, 1, 0, 2, 2.38, 2, 1.51, 2.38, 1.89, 0, 2.97,
          1.18, 3.26, 5.86, 3.59, 1, 4.63, 2.94, 0, 0, 2.7,
          3.26, 2, 3.79, 0, 1, 4, 0, 2.8, 2, 0, 0, 1.7, 1,
          2, 0, 1, 0, 0, 0, 0, 0, 0, 2, 1, 0.98, 0, 0, 0, 1, 0,
          0, 1.38, 2, 0, 0, 0, 1.05, 0, 0, 1, 1, 1.16, 0, 2, 1)
)

# tweedie model
model <- glmmTMB(D1 ~ landuse + (1|location),
                 family = tweedie,
                 data = example_df)

# get "sigma"/extra dispersion necessary for bias adjustment (for random effects) when transforming estimates and intervals from log to response scale
extra_disp <- sqrt(sum(insight::get_variance(model)[["var.intercept"]]))
### singularity --> can't compute random effect variances reliably, but hope this is OK for the example
# a) get estimated marginal means and their 95%CI (Wald here) at the response scale
(model.emm <- emmeans(model, ~ landuse, type = "response",
                       bias.adjust = TRUE, sigma = extra_disp))
# b) get contrasts and their 95%CI at the response scale, with a Dunnet-like adjustment for multiple comparisons
(model.ratios <- confint(contrast(emmeans(model, ~ landuse),
                                     type = "response",
                                     bias.adjust = TRUE, sigma = extra_disp,
                                     "trt.vs.ctrl", ref = 1, reverse = TRUE)))

# c) get p-values for contrasts (tests performed on the log scale)
contrast(emmeans(model, ~ landuse), "trt.vs.ctrl", ref = 1, reverse = TRUE)

	[[alternative HTML version deleted]]