[R-sig-ME] Multilevel zero-inflated models: model selection and model assumption checks

Phillip Alday ph||||p@@|d@y @end|ng |rom mp|@n|
Mon May 25 01:15:44 CEST 2020

I won't address everything, but will give a few short pointers

On 28/4/20 1:58 am, Kate R wrote:
> Thank you in advance for reading my long-winded email! I appreciate any
> help and guidance.
> I am working on research that involves frequency of events (summed during
> an hour) and duration of events (calculated a seconds during an hour). Both
> data types have 0s and are positively skewed. The observations (n =
> 14,000+) are collected hourly for up to 2 weeks for 500 participants. A
> simple version of the model is: outcome ~ explanatory +
> (1|participant/date).
> For the frequency data, I have fit:
>    - (Hurdle) Poisson / Negative Binomial (the var > mean)
>    - (Hurdle) Gamma (log link) (assuming okay to treat frequency data as
>    continuous)
> The AIC is best for the hurdle negative binomial model. I understand that
> for AIC comparisons, the Poisson models do not get an extra scale
> parameter, while the other models. However, the difference in AICs are in
> the hundreds, so I imagine the small difference in k parameters would not
> change interpretation, as long as the log-likelihood functions are
> similar...
> (Q1) Does the above sound reasonable?
> For the duration data, I have fit:
>    - (Hurdle) Gamma (log link)
>    - Zero-Inflated Beta (after transforming duration by SECONDS/3600. I
>    decided to try Beta because the hourly data is bounded at 3600 seconds and
>    I was not sure if an upper bound affects the gamma distribution).
> (Q2) Does an upper bound mean make it inappropriate to fit a Gamma
> distribution?

For this type of thing, my advise is always "plot the model against the
data!". You can use an effects-type plot overlaid with the data or you
can plot both the fitted and observed values against the predictors (in
addition to the classical fitted vs. observed plot). Which models
capture which aspects of your data? Which models break down in
unacceptable ways? As George Box said, "all models are wrong, but some
models are useful." Which models are most useful (i.e. wrong in ways you
can accept and right in ways you need)?

> N.B. If fitting regular gamma/beta, I first transformed data to remove 0s.
> For both the regular and zero-inflated beta, I shrunk the 1s using EITHER
> the algorithm here: https://www.ncbi.nlm.nih.gov/pubmed/16594767 OR the
> inverse hyperbolic sine transformation (IHST)). I had thought about using
> ZOIB, but I did not want to use BRMS or GAMLSS (since I am a beginner).
> Because I transformed the data in different ways for the different models,
> I understand that I cannot compare the AICs. Therefore, I thought about
> using cross fold validation. However, the package I was recommended to use
> (cvms https://github.com/LudvigOlsen/cvms) says it supports lmer/glmer, but
> doesn't appear to support glmmTMB, which is how I fit the above models.
> I thought a potential solution might be to fit two separates models
> (binomial for 0/1 data and count/continuous for positive data, as suggested
> in this post: assessing glmmTMB hurdle model fit using DHARMa scaled
> residual plot
> <https://stats.stackexchange.com/questions/400147/assessing-glmmtmb-hurdle-model-fit-using-dharma-scaled-residual-plot>).
> In this way, I could fit lmer/glmer, and perform cross fold validation
> using cvms on the positive data. But I'm not sure if this makes sense (to
> partition out the 0 data, and only use the positive data to validate the
> models)? I suppose I could simulate data to check predictive accuracy on
> the glmmTMB models? Is this a reasonable method of comparing the models?

This is in some sense exactly what a hurdle model does. :)

> (Q3) How would you recommend comparing the models with different
> distributions and family transformations?

With difficulty if you want classical tests (they aren't nested so you
can't do likelihood-ratio tests and I find even AIC/BIC less than ideal
here because the different link functions means that you're not quite
comparing the same aspects of the data). You could use stratified
cross-validation to get about predictive power, but this can be
challenging depending on the exact nesting/crossing structure of your
data (sorry, I didn't have time to read and think about your particular
data in detail).  Otherwise, I recommend the graphical comparison I
mentioned above.

Hope that helps,

> Additionally, I considered fitting a Tweedie distribution to both the
> frequency and duration. However, I was having trouble finding guidance on
> how to interpret the beta coefficients.
> (Q4) How does one interpret Tweedie beta coefficients? Do you exponentiate
> and discuss as rates?
> Finally, I have used the performance package to check the model
> assumptions. The results can be found at this link:
> <
> https://docs.google.com/document/d/1s6qtSAvw297F_cvblAq7Gh2fWgFmz9I5Nxi1K2gjScI/edit?usp=sharing
> Pic 1 and 2 are both for the Hurdle Gamma on Duration, but have different
> explanatory variables. The pics are pretty similar to the results for the
> other models/distributions. I have been reading conflicting advice as to
> whether the residuals for GLM models need to be normally distributed, etc.
> I understand for large datasets, non-normality may be a non-issue even for
> linear regression. We are not concerned with predicting new data, but
> explaining the data that we have. Therefore, given the size of my sample
> and the distributions. how concerned should I be about the QQ plots and
> homogeneity of variance plots below?
> Thank you very much!
> 	[[alternative HTML version deleted]]
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

More information about the R-sig-mixed-models mailing list