[R-sig-ME] zero-truncated negative binomial distribution
Ben Bolker
bbolker at gmail.com
Fri Nov 3 23:04:36 CET 2017
I assume you have multiple observations per individual (an
observation-level random effect wouldn't make sense with a model like
the [truncated] negative binomial, which includes an estimated
dispersion parameter)?
How big is your data set overall? What is summary(count) for your data
(e.g. min/max, 10% and 90% quantiles, mean, std dev)? (The marginal
distribution is less important than the conditional distribution, but
getting information about the conditional distribution is more difficult.)
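
For concreteness, the marginal summary asked about above could be computed along these lines. This is only a sketch: the data are simulated stand-ins (the real `mydata` is not available here), and the name `count_summary` is made up for illustration.

```r
# Simulated stand-in for the poster's data: positive, long-tailed counts.
# (Hypothetical values; adding 1 just guarantees there are no zeros.)
set.seed(1)
mydata <- data.frame(count = rnbinom(500, size = 0.5, mu = 200) + 1)

# Marginal summary of the response: range, 10%/90% quantiles, mean, std dev
count_summary <- with(mydata, c(
  min  = min(count),
  q10  = unname(quantile(count, 0.10)),
  q90  = unname(quantile(count, 0.90)),
  max  = max(count),
  mean = mean(count),
  sd   = sd(count)
))
print(count_summary)
```

A large gap between the 90% quantile and the maximum is one quick indication of how long the tail is.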
Transforming data and fitting with a linear model is always a
reasonable alternative if you can find a transformation that makes the
(conditional) distributions approximately normal (and homoscedastic).
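
A rough sketch of the transform-and-fit route, under stated assumptions: the data are simulated, only one of the original predictors (waterdepth) is used, and the model is fitted with nlme::lme (nlme ships with standard R installations) rather than glmmTMB, purely to keep the example self-contained. The object names `sim`, `m_log` and `res` are made up.

```r
library(nlme)  # recommended package, included with standard R installs

# Simulated stand-in data: 20 individuals, 25 observations each
set.seed(2)
n_ind <- 20
n_obs <- 25
sim <- data.frame(
  individual = factor(rep(seq_len(n_ind), each = n_obs)),
  waterdepth = runif(n_ind * n_obs, 0, 10)
)
ind_eff <- rnorm(n_ind, sd = 0.5)  # per-individual random intercepts

# Positive, right-skewed counts (>= 1 by construction, so log() is safe)
sim$count <- ceiling(exp(3 + 0.2 * sim$waterdepth +
                           ind_eff[as.integer(sim$individual)] +
                           rnorm(nrow(sim), sd = 1)))

# Linear mixed model on the log scale with a random intercept per individual
m_log <- lme(log(count) ~ waterdepth, random = ~ 1 | individual, data = sim)

# Check the conditional distribution: normality and constant variance
res <- resid(m_log, type = "normalized")
qqnorm(res); qqline(res)   # points near the line suggest approximate normality
plot(fitted(m_log), res)   # no funnel shape suggests homoscedasticity
```

If the Q-Q plot of the transformed-scale residuals still shows a heavy upper tail, a stronger transformation may be needed, or the linear-model route may not be adequate.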
What is your evidence of "a hard time"? Warning/error messages?
How important is the zero-truncation? Do you have a lot of small
counts (1,2,3) in addition to your extremely large values?
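
One rough way to gauge this, sketched with made-up numbers: look at the share of small counts in the data, and at the probability an untruncated negative binomial would put on zero. The mean and dispersion below are hypothetical and use dnbinom()'s size/mu parameterization, which is not the same as glmmTMB's nbinom1 parameterization.

```r
# Simulated stand-in for the observed counts (no zeros by construction)
set.seed(3)
count <- rnbinom(500, size = 0.5, mu = 200) + 1

# Share of observations that are small counts (1, 2 or 3)
prop_small <- mean(count <= 3)

# Probability of a zero under an *untruncated* negative binomial with
# hypothetical mean 200 and size 0.5; if this is tiny, truncation
# changes the likelihood very little
p_zero <- dnbinom(0, size = 0.5, mu = 200)

print(c(prop_small = prop_small, p_zero = p_zero))
```

If both numbers are close to zero, an ordinary (untruncated) count distribution would probably give nearly the same fit.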
Other, more heavy-tailed distributions do exist (e.g.
https://en.wikipedia.org/wiki/Beta_negative_binomial_distribution ) but
are not yet implemented in glmmTMB (and we'd have to implement both the BNB
and its zero-truncated version). I think they'd likely be overkill.
On 17-11-03 05:07 PM, Alice Domalik wrote:
> Hi all,
>
> I am fitting mixed effects models using the package glmmTMB to investigate habitat use.
> My data do not contain any zeros, so I have considered the zero-truncated Poisson and the zero-truncated negative binomial.
> Of these two distributions, the zt negative binomial was better, so I tried fitting my model:
>
> m1 <- glmmTMB(count ~ waterdepth + temperature + chl.conc + (1 | individual), family = list(family = "truncated_nbinom1", link = "log"), data = mydata)
>
> However, it is clear that the model is having a hard time fitting my very high response values (the distribution of my response variable has a very long tail).
> The QQplot also shows the high 'count' values being above the QQline.
>
> What are my options for improving model fit? Are there any distributions that might be better? Is it permissible to transform my response variable (e.g. sqrt or log)?
>
> Any suggestions are greatly appreciated.
>
> -Ally
>
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>