[R-sig-ME] glmmTMB and post hoc testing

Mon Jul 9 17:51:04 CEST 2018

On 2018-07-08 08:03 PM, Smith, Desmond wrote:
> Dear All,
> 
> I have previously tested a mixed model using lmer. However, since the
> dependent variable is count data, which is over-dispersed, it seems
> that I should use a negative binomial generalized linear mixed model
> (GLMM), such as glmmTMB.

  That is certainly one possibility.
http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#overdispersion
offers a variety of other alternatives.

  I haven't had a chance to look into the details below yet, but you
could try updating to a development branch of `glmmTMB` that has more
integrated emmeans support:

install_github("glmmTMB/glmmTMB/glmmTMB using effects")

(you would need compilation tools etc. installed).  After that check the
vignette - it may not be built/installed properly, but you can find it here:

https://github.com/glmmTMB/glmmTMB/blob/effects/glmmTMB/vignettes/model_evaluation.rmd

> 
> The full data frame is:
> 
> 
> dim(example)
> 
> [1] 115   7
> 
> 
> Here are the first six rows of the full 115 row data frame:
> 
> 
> head(example) fixed1 fixed2 random nested_random counts id
> log_total_counts 1         0      0      1             1    643  1
> 12.89582 1001      0      0      2             6    585  1
> 13.67509 2001      0      0      3            11    846  1
> 13.94209 3001      0      0      4            16    755  1
> 13.93056 4001      0      0      5            21   1428  1
> 13.65672 5001      0      0      6            26   1566  1
> 13.64421
> 
> str(example) 'data.frame': 115 obs. of  7 variables: $ fixed1
> : num  0 0 0 0 0 0 1 1 1 1 ... $ fixed2          : num  0 0 0 0 0 0 0
> 0 0 0 ... $ random          : num  1 2 3 4 5 6 1 2 3 4 ... $
> nested_random   : num  1 6 11 16 21 26 2 7 12 17 ... $ counts
> : int  643 585 846 755 1428 1566 309 719 1542 1446 ... $ id
> : int  1 1 1 1 1 1 1 1 1 1 ... $ log_total_counts: num  12.9 13.7
> 13.9 13.9 13.7 ...
> 
> There are six values of fixed1, plus a zero:
> 
> unique(example$fixed1)
> 
> [1] 0 1 2 3 4 6
> 
> 
> 
> The mean of the six values of fixed1 is
> 
> mean(c(1,2,3,4,6)) [1] 3.2
> 
> There are four values of fixed2, including zero:
> 
> unique(example$fixed2)
> 
> [1]  0 25 75  8
> 
> The mean of the four values of fixed1 is 
> mean(unique(example$fixed2)) [1] 27
> 
> 
> Here is the lmer model I used:
> 
> library(lme4)
> 
> m1 <- lmer(counts ~ fixed1*fixed2 + (1|random/nested_random) +
> offset(log_total_counts), data = example, verbose=FALSE)
> 
> Followed by the use of glht for post-hoc analysis:
> 
> library(multcomp)
> 
> glht_fixed1 <- glht(m1, linfct = c( "fixed1 == 0", "fixed1 +
> 8*fixed1:fixed2 == 0", "fixed1 + 25*fixed1:fixed2 == 0", "fixed1 +
> 75*fixed1:fixed2 == 0", "fixed1 + (27)*fixed1:fixed2 == 0"))
> 
> glht_fixed2 <- glht(m1, linfct = c( "fixed2 + 1*fixed1:fixed2 == 0", 
> "fixed2 + 2*fixed1:fixed2 == 0", "fixed2 + 3*fixed1:fixed2 == 0", 
> "fixed2 + 4*fixed1:fixed2 == 0", "fixed2 + 6*fixed1:fixed2 == 0", 
> "fixed2 + (3.2)*fixed1:fixed2 == 0"))
> 
> glht_omni <- glht(m1)
> 
> Here is the corresponding negative binomial glmmTMB model:
> 
> library(glmmTMB)
> 
> m2 <- glmmTMB(counts ~ fixed1*fixed2 + (1|random/nested_random) +
> offset(log_total_counts), data = example, verbose=FALSE,
> family="nbinom2")
> 
> According to this suggestion by Ben Bolker
> (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2017q3/025813.html),
> the best approach to post hoc testing with glmmTMB is to use lsmeans
> (or its more recent equivalent, emmeans).
> 
> After running
> 
> source(system.file("other_methods","lsmeans_methods.R",package="glmmTMB"))
>
>  I can use lsmeans on the glmmTMB object. For example,
> 
> as.glht(emmeans(m1,~(week + 27*week:conc)))
> 
> General Linear Hypotheses
> 
> Linear Hypotheses: Estimate 3.11304347826087, 21 == 0   -8.813
> 
> But this does not seem correct. What I cannot figure out, for the
> life of me, is how to correctly carry over the use of linfct to the
> lsmeans scenario on the glmmTMB. I have read all the manuals and
> vignettes until I am blue in face (or it feels that way, at least),
> but I am still at a loss. In my defense (culpability?) I am a
> statistical tyro, so many apologies if I am asking a question with
> very obvious answers here.
> 
> The glht software and post hoc testing works easily with the glmmADMB
> package, but the glmmADMB software is 10x slower than glmmTMB. I need
> to perform multiple runs of this analysis, each with 300,000 examples
> of the mixed model, so speed is essential.
> 
> Many thanks for your suggestions or help!
> 
> Cheers,
> 
> Des
> 
> 
> 
> .
> 
> ________________________________
> 
> UCLA HEALTH SCIENCES IMPORTANT WARNING: This email (and any
> attachments) is only intended for the use of the person or entity to
> which it is addressed, and may contain information that is privileged
> and confidential. You, the recipient, are obligated to maintain it in
> a safe, secure and confidential manner. Unauthorized redisclosure or
> failure to maintain confidentiality may subject you to federal and
> state penalties. If you are not the intended recipient, please
> immediately notify us by return email, and delete this message from
> your computer.
> 
> [[alternative HTML version deleted]]
> 
> _______________________________________________ 
> R-sig-mixed-models using r-project.org mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>