[R-sig-ME] zero-truncated negative binomial distribution
Highland Statistics Ltd
highstat at highstat.com
Sat Nov 4 10:09:14 CET 2017
Ally,
Did you not record zeros, or is it impossible to observe zeros in
your data? If zeros can occur in theory but you happened not to observe
any, then you had better stick to an ordinary distribution.
If zeros cannot occur in theory (e.g. the number of eggs in a bird's
nest, which is always > 0), then a zero-truncated distribution is the
better option. But if your data are relatively far from 0, the
probability of a zero is negligible anyway, so you could decide to stick
to an ordinary (e.g. NB) distribution.
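To illustrate why truncation matters less when the counts are far from 0, here is a small base-R sketch that computes the probability of a zero under an ordinary NB for increasing means. The values of mu and theta are made up for the example and are not taken from Ally's data.

```r
# Probability of observing a zero under an ordinary NB distribution
# with var(Y) = mu + mu^2 / theta. All numbers below are illustrative.
theta <- 2
mu    <- c(1, 5, 20, 100)

p0 <- dnbinom(0, size = theta, mu = mu)

# The larger the mean, the smaller P(Y = 0), and the less the ordinary
# and the zero-truncated NB differ.
print(data.frame(mu = mu, p_zero = signif(p0, 3)))
```

If P(Y = 0) is already tiny at the fitted means, the zero-truncated and the ordinary NB give nearly the same likelihood.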
If you have very high values for your response variable, and a
covariate cannot explain them, then you could also consider NB-P models.
In such a model you use:
E(Y) = mu
var(Y) = mu + mu^p / theta
and p is estimated. (In an ordinary NB, p = 2.)
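The variance function above can be written as a small R helper to see how p changes the mean-variance relationship. This is only an illustrative sketch: nbp_var is a hypothetical name (not a function from any package), and the values of mu and theta are made up.

```r
# Variance of an NB-P distribution: var(Y) = mu + mu^p / theta.
# nbp_var is a hypothetical helper, not part of any package.
nbp_var <- function(mu, theta, p) {
  mu + mu^p / theta
}

mu    <- c(1, 10, 100, 1000)   # illustrative means
theta <- 2                     # illustrative dispersion parameter

# p = 1 gives a variance that is linear in mu (the NB1 form),
# p = 2 gives the ordinary quadratic NB variance,
# p > 2 lets the variance grow faster than quadratically.
v <- data.frame(mu  = mu,
                p1  = nbp_var(mu, theta, p = 1),
                p2  = nbp_var(mu, theta, p = 2),
                p25 = nbp_var(mu, theta, p = 2.5))
print(v)
```

For large mu the three columns diverge quickly, which is why estimating p can help when a few very large counts dominate the fit.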
Apologies for self-citing here, but we apply them in Chapter 5 of our
Beginner's Guide to GAMM with R (2014). Unfortunately, this does mean
that you have to use MCMC.
In addition to looking at QQ-plots, I suggest that you simulate data
from your model and check whether it produces values (especially large
values) similar to those in your observed data.
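A minimal base-R sketch of such a simulation check, assuming a zero-truncated NB with quadratic variance. Everything here is a stand-in: rztnbinom, mu_hat and theta_hat are hypothetical names, and the "observed" data are themselves simulated so that the example runs on its own. With a real fit you would plug in the observed counts, the fitted means and the estimated dispersion; for a glmmTMB fit, its simulate() method should serve the same purpose.

```r
set.seed(1)

# Draw from a zero-truncated NB (mu and theta are the mean and size of the
# untruncated NB) by inverting the CDF on the interval (P(Y = 0), 1).
rztnbinom <- function(n, mu, theta) {
  p0 <- dnbinom(0, size = theta, mu = mu)
  u  <- runif(n, min = p0, max = 1)
  pmax(qnbinom(u, size = theta, mu = mu), 1)  # guard against rounding near p0
}

# Stand-ins for the observed counts and the fitted values of a real model.
n         <- 500
mu_hat    <- rep(20, n)   # hypothetical fitted NB means
theta_hat <- 1.5          # hypothetical fitted dispersion
y_obs     <- rztnbinom(n, mu_hat, theta_hat)

# Simulate data sets from the "fitted" model and record statistics that
# describe the upper tail (the maximum and the 99% quantile).
n_sim   <- 1000
sim_max <- numeric(n_sim)
sim_q99 <- numeric(n_sim)
for (i in seq_len(n_sim)) {
  y_sim      <- rztnbinom(n, mu_hat, theta_hat)
  sim_max[i] <- max(y_sim)
  sim_q99[i] <- quantile(y_sim, 0.99)
}

# Proportion of simulated data sets whose statistic is at least as large as
# the observed one; values near 0 mean the model cannot reproduce the tail.
p_max <- mean(sim_max >= max(y_obs))
p_q99 <- mean(sim_q99 >= quantile(y_obs, 0.99))
print(c(max = p_max, q99 = p_q99))
```

With real data, a value of p_max close to 0 would indicate that the model rarely generates counts as large as the largest ones observed.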
Kind regards,
Alain
Hi all,
I am fitting mixed effects models using the package glmmTMB to
investigate habitat use.
My data do not contain any zeros, so I have considered the
zero-truncated Poisson and the zero-truncated negative binomial.
Of these two distributions, the zt negative binomial was better, so I
tried fitting my model:
library(glmmTMB)
m1 <- glmmTMB(count ~ waterdepth + temperature + chl.conc + (1 | individual),
              family = truncated_nbinom1(link = "log"), data = mydata)
However, it is clear that the model is having a hard time fitting my
very high response values (the distribution of my response variable has
a very long tail).
The QQ-plot also shows the high 'count' values lying above the QQ-line.
What are my options for improving model fit? Are there any distributions
that might be better? Is it permissible to transform my response
variable (e.g. sqrt or log)?
Any suggestions are greatly appreciated.
-Ally
--
Dr. Alain F. Zuur
Highland Statistics Ltd.
9 St Clair Wynd
AB41 6DZ Newburgh, UK
Email: highstat at highstat.com
URL: www.highstat.com
And:
NIOZ Royal Netherlands Institute for Sea Research,
Department of Coastal Systems, and Utrecht University,
P.O. Box 59, 1790 AB Den Burg,
Texel, The Netherlands
Author of:
1. Beginner's Guide to Spatial, Temporal and Spatial-Temporal Ecological Data Analysis with R-INLA. (2017).
2. Beginner's Guide to Zero-Inflated Models with R (2016).
3. Beginner's Guide to Data Exploration and Visualisation with R (2015).
4. Beginner's Guide to GAMM with R (2014).
5. Beginner's Guide to GLM and GLMM with R (2013).
6. Beginner's Guide to GAM with R (2012).
7. Zero Inflated Models and GLMM with R (2012).
8. A Beginner's Guide to R (2009).
9. Mixed effects models and extensions in ecology with R (2009).
10. Analysing Ecological Data (2007).