[R-sig-ME] Truncated Negative Binomial Model Unexpected Marginal Means
Alex Waldman
@|ex@w@|dm@n @end|ng |rom @jc@ox@@c@uk
Tue Feb 15 20:20:02 CET 2022
Thanks, Ben!
I've copied a response from Russ:
Support for emmeans for glmmTMB models is in the glmmTMB package. However, I believe those estimates are based on the linear predictor for the conditional model only. That is, these are estimates of the mean of the fitted negative binomial distribution before it is truncated. Perhaps Ben can confirm this.
I think estimated means for hurdle and zero-inflated models are very difficult to interpret without taking into account both parts of the model; and, unfortunately, it is a somewhat complicated matter to do so that is not currently implemented. Seems worthwhile to try, though. One would have to include all the factors that affect both models; then from the truncated part, estimate its mean M and its probability of zero PT0. And from the ZI part, estimate its probability of zero PZ0. Then the estimated mean of the response is M * (1 - PZ0) / (1 - PT0), unless I made some kind of stupid mistake. The tricky part is estimating the SE of this result.
In addition, here is some example data attached. I ran the following code:
data<-read.csv("Example.csv",header=T)
as.factor(data$ID)->data$ID
as.factor(data$Location)->data$Location
as.factor(data$Type)->data$Type
model<-glmmTMB(Count ~ LocationType + (1 | ID), zi=~LocationType + (1|ID), data=data, family="truncated_nbinom1",control=glmmTMBControl(optimizer=optim, optArgs=list(method="BFGS")))
var.corr <-VarCorr(model)
Conditional model:
Groups Name Std.Dev.
ID (Intercept) 0.37105
Zero-inflation model:
Groups Name Std.Dev.
ID (Intercept) 1.3207
emmeans <- emmeans(model, ~ Location*Type, type="response", sigma=0.37105, bias.adjust=TRUE)
Location Type response SE df lower.CL upper.CL
0 0 1.117 0.277 631 0.687 1.82
1 0 0.940 0.251 631 0.556 1.59
2 0 0.893 0.266 631 0.498 1.60
0 1 1.325 0.254 631 0.909 1.93
1 1 1.090 0.248 631 0.698 1.70
2 1 1.452 0.300 631 0.967 2.18
Confidence level used: 0.95
Intervals are back-transformed from the log scale
Bias adjustment applied based on sigma = 0.37105
Interestingly, I tabulated the mean for location0/type1 including 0s and without 0s. Including 0s=1.05 and without 0s=1.88. The emmeans output seems to be in the middle of those two.
Does anyone have any workaround suggestions on how to tabulate the means of the conditional model above a threshold (ie above 0)? This page seemed to be relevant: https://www.theanalysisfactor.com/getting-accurate-predicted-counts-when-there-are-no-zeros-in-the-data/ but I wasn't sure what R package would fit my needs.
Thanks for your help!
Warm Regards,
Alex
On 2/15/22, 4:08 PM, "R-sig-mixed-models on behalf of Ben Bolker" <r-sig-mixed-models-bounces using r-project.org on behalf of bbolker using gmail.com> wrote:
A quick test suggests that emmeans is predicting the response based on
the mean of the *un*truncated distribution (I don't remember and/or
haven't looked into all of the guts of emmeans). Don't know if Russ
Lenth (emmeans maintainer) is reading ...
n <- 1000
dd <- data.frame(f = factor(rep(1:2, each = n)))
gb <- log(c(2,4))
set.seed(101)
dd <- transform(dd, y = rnbinom(2*n, mu = exp(gb[f]), size = 2))
dd2 <- subset(dd, y > 0)
## un-truncated means
aggregate(y ~ f, data = dd, FUN = mean)
## f y
## 1 1 2.047
## 2 2 3.917
## truncated means
aggregate(y ~ f, data = dd2, FUN = mean)
## f y
## 1 1 2.781250
## 2 2 4.446084
library(glmmTMB)
library(emmeans)
m1 <- glmmTMB(y ~ f, family = truncated_nbinom2, data = dd2)
## doesn't match exactly but close to untruncated means
emmeans(m1, ~ f, type = "response")
## f response SE df lower.CL upper.CL
## 1 2.15 0.0891 1614 1.99 2.34
## 2 3.98 0.1262 1614 3.74 4.23
## matches means exactly
m2 <- glmmTMB(y ~ f, family = nbinom2, data = dd)
emmeans(m2, ~ f, type = "response")
## f response SE df lower.CL upper.CL
## 1 2.05 0.0651 1997 1.92 2.18
## 2 3.92 0.1094 1997 3.71 4.14
On 2/15/22 10:04 AM, Alex Waldman wrote:
> Dear All,
>
> Hope all is well! This may be a naïve question but I am running a hurdle negative binomial model to look at the differences in counts of differing types in different locations. My major interest is the conditional model (ie when counts are above 0).
>
> I run the following code:
>
> model<-glmmTMB(Count ~ Location*Type + (1 | ID), zi=~Location*Type + (1|ID), data=data, family="truncated_nbinom1",control=glmmTMBControl(optimizer=optim, optArgs=list(method="BFGS")))
>
> var.corr <-VarCorr(model)
>
> Conditional model:
> Groups Name Std.Dev.
> ID (Intercept) 0.37105
>
> Zero-inflation model:
> Groups Name Std.Dev.
> ID (Intercept) 1.3207
>
> emmeans <- emmeans(model, ~ Location*Type, type="response", sigma=0.37105, bias.adjust=TRUE)
>
> Location Type response SE df lower.CL upper.CL
> 0 0 1.117 0.277 631 0.687 1.82
> 1 0 0.940 0.251 631 0.556 1.59
> 2 0 0.893 0.266 631 0.498 1.60
> 0 1 1.325 0.254 631 0.909 1.93
> 1 1 1.090 0.248 631 0.698 1.70
> 2 1 1.452 0.300 631 0.967 2.18
>
> Confidence level used: 0.95
> Intervals are back-transformed from the log scale
> Bias adjustment applied based on sigma = 0.37105
>
> However, I’m not sure why the estimated means and confidence intervals will include values below 1 in the conditional model as I anticipated these values would represent the average number of non-zero counts? Is there something I may be doing wrong or not understanding?
>
> Thanks in advance for your help!
>
> Warm Regards,
> Alex
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
_______________________________________________
R-sig-mixed-models using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
More information about the R-sig-mixed-models
mailing list