[R-sig-ME] Query about fitting glmmTMB zero-inflated negative binomial model
Ben Bolker
bbo|ker @end|ng |rom gm@||@com
Thu Jun 24 17:52:48 CEST 2021
A few quick comments:

Since you're not using any random effects, you could also try this in
the pscl package (although I think it offers only ZIP and ZINB2, not
ZINB1 as you're fitting below).

If your conditional and z-i models are the same, I think you can use
ziformula = ~ . as shorthand.
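
A minimal sketch of that shorthand, assuming glmmTMB is installed. The data frame dd and its columns (y, x1, x2) are simulated stand-ins, not the variables from the original post:

```r
library(glmmTMB)

# Simulated stand-in data (hypothetical; not the poster's data set)
set.seed(101)
dd <- data.frame(x1 = rnorm(500), x2 = rnorm(500))
# Negative binomial counts with extra (structural) zeros mixed in
dd$y <- rnbinom(500, mu = exp(0.5 + 0.3 * dd$x1), size = 1) *
  rbinom(500, size = 1, prob = 0.7)

# Zero-inflation formula written out in full
fit_long <- glmmTMB(y ~ x1 + x2, ziformula = ~ x1 + x2,
                    family = nbinom1, data = dd)

# Same model, with "." reusing the right-hand side of the conditional formula
fit_short <- glmmTMB(y ~ x1 + x2, ziformula = ~ .,
                     family = nbinom1, data = dd)

# Compare the zero-inflation coefficients of the two fits
all.equal(fixef(fit_long)$zi, fixef(fit_short)$zi)
```

The shorthand only saves typing; it should not change the fitted model.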

A new version of glmmTMB (1.1.1) is on CRAN now, although the
Windows/MacOS binaries aren't built yet (should be in another day or
so); that will, I think, resolve the complex-eigenvalue warning you're
getting below. There's a new (experimental) diagnose() function as well,
which tries to give more precise guidance.
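
A rough sketch of calling diagnose() on a fitted model, assuming glmmTMB >= 1.1.1 is installed. It uses the Salamanders data set that ships with glmmTMB rather than the data from the original post, and the model formula is only an illustration:

```r
library(glmmTMB)

# diagnose() was added in glmmTMB 1.1.1
stopifnot(packageVersion("glmmTMB") >= "1.1.1")

# Illustrative zero-inflated negative binomial fit without random effects,
# using the Salamanders example data bundled with the package
fit <- glmmTMB(count ~ mined, ziformula = ~ mined,
               family = nbinom2, data = Salamanders)

# Prints a report on possible problems (e.g. extreme parameter values,
# large gradients, a non-positive-definite Hessian)
diagnose(fit)
```

Since the function is experimental, its messages are best read as hints about where to look, not as a verdict on the model.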

You don't need to create dummy variables by hand; R should do that for
you automatically given the factors.
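
A small base-R sketch of what happens behind the scenes. The factor name party and its levels are hypothetical, standing in for whatever categories the hand-made dummy_L_all and dummy_C_all columns encode:

```r
# Hypothetical three-level factor (not taken from the original data)
dd <- data.frame(party = factor(c("other", "L", "C", "L", "other", "C")))

# Choose the reference (baseline) level explicitly
dd$party <- relevel(dd$party, ref = "other")

# model.matrix() is what modelling functions use internally to expand
# a factor into indicator (dummy) columns, one per non-reference level
X <- model.matrix(~ party, data = dd)
colnames(X)
```

Putting the factor itself in the model formula therefore gives one coefficient per non-reference level, with no manual recoding.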

I can't easily see what the difference is between your two models
here (I guess they're using different data sets?)

On 6/24/21 11:40 AM, William Silver wrote:
> Hi there,
>
> I've been referred to post my problem on this mailing list, so apologies in advance if I'm not following best practice.
>
> Essentially, I'm attempting to fit two zero-inflated negative binomial models. My two dependent variables are mode of accessing a news website (i.e., through social media, or through search engine) and I have raw counts for each individual (N = 900). Zero-inflated models are most theoretically plausible because some people do not use social media or search engines to access news. The distribution is extremely right skewed with an excess of zeros, and the right tail extends far out. On the advice of my professor, I truncated the models to exclude the 10-20 participants with extremely high counts, so that outliers would not be driving the model. Initially, I was met with NAs in the model summary and followed Ben Bolker's advice here<https://stackoverflow.com/questions/62239351/why-am-i-getting-nas-in-the-model-summary-output-zero-inflated-glmm-with-glmmtm>. I looked for combinations of categories that were zero, and the categories I've found I have either dropped or combined. This fixed the NA issue initially when I ran the model with a different optimiser, following Ben Bolker's advice here<https://stackoverflow.com/questions/62478569/understanding-and-fixing-false-convergence-errors-in-glmmtmb-lme4>, however this still seems to be a problem depending on which variables I include. I've pasted the model output below in which the standard errors look much more normal, however when running analysis of residuals, there are some significant problems.
>
> The problem is that I'm now met with a number of warnings about model convergence which I've been unable to resolve. The first was "Error in e_complex_check(eigs$values) : detected complex eigenvalues of covariance matrix (max(abs(Im))=3.48036e-15: try se=FALSE?", which disappears when including se=FALSE in the command. I'm then met with "Warning: In fitTMB(TMBStruc) : Model convergence problem. See vignette('troubleshooting')". I've consulted the vignette<https://cran.r-project.org/web/packages/glmmTMB/vignettes/troubleshooting.html> it directs to, but am fairly new to statistics and find this all fairly advanced. I've tried applying the diagnose_vcov function, but am unsure how to interpret the results. Can I trust these models, given these warnings? Are there any recommendations someone might have for addressing the warnings if they are a significant issue?
>
> These errors are only occurring in the ZINB models (and this is the case for both glmmTMB and pscl packages). The ZI poisson models on the same variables run fine, but suffer from overdispersion (which I hoped to solve by using the ZINB technique, because it is better adapted for dealing with this). Ideally, I'd use the ZINB models over the ZIP models as they are better suited, but am wary of using them given these seemingly unresolvable warnings. I'd really appreciate any guidance anyone could give.
>
> Thank you so much in advance,
>
> Best wishes,
>
> William
>
> Output:
>
> zinb.am.se.test = glmmTMB(total_se_refer ~ diet_rob + polinterest + poleffic + age + trstmedia_sm + dummy_L_all + dummy_C_all + female_dum + trstmedia_sm + trstmedia_public, ziformula = ~ diet_rob + polinterest + poleffic + age + trstmedia_sm + dummy_L_all + dummy_C_all + female_dum + trstmedia_sm + trstmedia_public, data=df.trunc.se , family = nbinom1, se = FALSE, control = glmmTMBControl(optimizer = optim, optArgs = list(method="CG")))
> Warning message:
> In fitTMB(TMBStruc) :
> Model convergence problem; . See vignette('troubleshooting')
>> summary(zinb.am.se.test)
> Family: nbinom1 ( log )
> Formula: total_se_refer ~ diet_rob + polinterest + poleffic + age + trstmedia_sm +
> dummy_L_all + dummy_C_all + female_dum + trstmedia_sm + trstmedia_public
> Zero inflation:
> ~diet_rob + polinterest + poleffic + age + trstmedia_sm + dummy_L_all +
> dummy_C_all + female_dum + trstmedia_sm + trstmedia_public
> Data: df.trunc.se
>
>      AIC      BIC   logLik deviance df.resid
>   8316.0   8417.3  -4137.0   8274.0      899
>
>
> Dispersion parameter for nbinom1 family (): 2.87
>
> Conditional model:
>                   Estimate Std. Error z value Pr(>|z|)
> (Intercept)       0.096117   0.168652   0.570 0.568738
> diet_rob         -0.007591   0.138051  -0.055 0.956151
> polinterest       0.211511   0.030390   6.960  3.4e-12 ***
> poleffic          0.089956   0.009867   9.117  < 2e-16 ***
> age               0.001146   0.001934   0.592 0.553555
> trstmedia_sm      0.090851   0.026701   3.402 0.000668 ***
> dummy_L_all       0.088489   0.069593   1.272 0.203539
> dummy_C_all      -0.035509   0.062070  -0.572 0.567264
> female_dum        0.049242   0.053296   0.924 0.355518
> trstmedia_public  0.165294   0.027917   5.921  3.2e-09 ***
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Zero-inflation model:
>                   Estimate Std. Error z value Pr(>|z|)
> (Intercept)      -0.018224   0.425006  -0.043   0.9658
> diet_rob          0.005089   0.404882   0.013   0.9900
> polinterest      -0.046104   0.087863  -0.525   0.5998
> poleffic         -0.062502   0.028639  -2.182   0.0291 *
> age              -0.001789   0.005584  -0.320   0.7487
> trstmedia_sm     -0.042205   0.085556  -0.493   0.6218
> dummy_L_all      -0.005172   0.209608  -0.025   0.9803
> dummy_C_all      -0.005723   0.183298  -0.031   0.9751
> female_dum       -0.007491   0.165891  -0.045   0.9640
> trstmedia_public -0.049461   0.082624  -0.599   0.5494
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> Warning messages:
> 1: In vcov.glmmTMB(object, include_mapped = TRUE) :
> Calculating sdreport. Use se=TRUE in glmmTMB to avoid repetitive calculation of sdreport
> 2: In vcov.glmmTMB(object) :
> Calculating sdreport. Use se=TRUE in glmmTMB to avoid repetitive calculation of sdreport
>
>> summary(zinb.am.sm.test)
> Family: nbinom1 ( log )
> Formula: total_sm_refer ~ diet_rob + polinterest + poleffic + age + trstmedia_sm +
> dummy_L_all + dummy_C_all + female_dum + trstmedia_sm + trstmedia_public
> Zero inflation:
> ~diet_rob + polinterest + poleffic + age + trstmedia_sm + dummy_L_all +
> dummy_C_all + female_dum + trstmedia_sm + trstmedia_public
> Data: df.trunc.sm
>
>      AIC      BIC   logLik deviance df.resid
>   5259.0   5360.3  -2608.5   5217.0      902
>
>
> Dispersion parameter for nbinom1 family (): 10.6
>
> Conditional model:
>                  Estimate Std. Error z value Pr(>|z|)
> (Intercept)       0.10901    0.68090   0.160   0.8728
> diet_rob          0.08175    0.65938   0.124   0.9013
> polinterest       0.15325    0.11435   1.340   0.1802
> poleffic          0.03901    0.13561   0.288   0.7736
> age               0.01047    0.01093   0.958   0.3379
> trstmedia_sm      0.24108    0.13696   1.760   0.0784 .
> dummy_L_all       0.16702    0.89975   0.186   0.8527
> dummy_C_all       0.01118    1.46559   0.008   0.9939
> female_dum        0.12071    0.65590   0.184   0.8540
> trstmedia_public -0.01846    0.32520  -0.057   0.9547
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Zero-inflation model:
>                    Estimate Std. Error z value Pr(>|z|)
> (Intercept)      -0.0301001  1.3829502  -0.022    0.983
> diet_rob         -0.0233937  2.3407518  -0.010    0.992
> polinterest      -0.0897813  0.3597496  -0.250    0.803
> poleffic         -0.0455312  0.3529030  -0.129    0.897
> age              -0.0003903  0.0204699  -0.019    0.985
> trstmedia_sm     -0.0884604  0.1374919  -0.643    0.520
> dummy_L_all      -0.0205767  2.2994896  -0.009    0.993
> dummy_C_all      -0.0167199  3.9414142  -0.004    0.997
> female_dum       -0.0339406  1.8302418  -0.018    0.985
> trstmedia_public -0.0408538  0.6763691  -0.060    0.952
> Warning messages:
> 1: In vcov.glmmTMB(object, include_mapped = TRUE) :
> Calculating sdreport. Use se=TRUE in glmmTMB to avoid repetitive calculation of sdreport
> 2: In vcov.glmmTMB(object) :
> Calculating sdreport. Use se=TRUE in glmmTMB to avoid repetitive calculation of sdreport
>