[R-sig-ME] Truncated Negative Binomial Model Unexpected Marginal Means
Ben Bolker
bbo|ker @end|ng |rom gm@||@com
Tue Feb 15 17:06:58 CET 2022
A quick test suggests that emmeans is predicting the response based on
the mean of the *un*truncated distribution (I don't remember and/or
haven't looked into all of the guts of emmeans). Don't know if Russ
Lenth (emmeans maintainer) is reading ...
n <- 1000
dd <- data.frame(f = factor(rep(1:2, each = n)))
gb <- log(c(2,4))
set.seed(101)
dd <- transform(dd, y = rnbinom(2*n, mu = exp(gb[f]), size = 2))
dd2 <- subset(dd, y > 0)
## un-truncated means
aggregate(y ~ f, data = dd, FUN = mean)
## f y
## 1 1 2.047
## 2 2 3.917
## truncated means
aggregate(y ~ f, data = dd2, FUN = mean)
## f y
## 1 1 2.781250
## 2 2 4.446084
library(glmmTMB)
library(emmeans)
m1 <- glmmTMB(y ~ f, family = truncated_nbinom2, data = dd2)
## doesn't match exactly but close to untruncated means
emmeans(m1, ~ f, type = "response")
## f response SE df lower.CL upper.CL
## 1 2.15 0.0891 1614 1.99 2.34
## 2 3.98 0.1262 1614 3.74 4.23
## matches means exactly
m2 <- glmmTMB(y ~ f, family = nbinom2, data = dd)
emmeans(m2, ~ f, type = "response")
## f response SE df lower.CL upper.CL
## 1 2.05 0.0651 1997 1.92 2.18
## 2 3.92 0.1094 1997 3.71 4.14
On 2/15/22 10:04 AM, Alex Waldman wrote:
> Dear All,
>
> Hope all is well! This may be a naïve question but I am running a hurdle negative binomial model to look at the differences in counts of differing types in different locations. My major interest is the conditional model (ie when counts are above 0).
>
> I run the following code:
>
> model<-glmmTMB(Count ~ Location*Type + (1 | ID), zi=~Location*Type + (1|ID), data=data, family="truncated_nbinom1",control=glmmTMBControl(optimizer=optim, optArgs=list(method="BFGS")))
>
> var.corr <-VarCorr(model)
>
> Conditional model:
> Groups Name Std.Dev.
> ID (Intercept) 0.37105
>
> Zero-inflation model:
> Groups Name Std.Dev.
> ID (Intercept) 1.3207
>
> emmeans <- emmeans(model, ~ Location*Type, type="response", sigma=0.37105, bias.adjust=TRUE)
>
> Location Type response SE df lower.CL upper.CL
> 0 0 1.117 0.277 631 0.687 1.82
> 1 0 0.940 0.251 631 0.556 1.59
> 2 0 0.893 0.266 631 0.498 1.60
> 0 1 1.325 0.254 631 0.909 1.93
> 1 1 1.090 0.248 631 0.698 1.70
> 2 1 1.452 0.300 631 0.967 2.18
>
> Confidence level used: 0.95
> Intervals are back-transformed from the log scale
> Bias adjustment applied based on sigma = 0.37105
>
> However, I’m not sure why the estimated means and confidence intervals will include values below 1 in the conditional model as I anticipated these values would represent the average number of non-zero counts? Is there something I may be doing wrong or not understanding?
>
> Thanks in advance for your help!
>
> Warm Regards,
> Alex
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
More information about the R-sig-mixed-models
mailing list