Hello,
I've explored my data some more and have an update to my earlier message (quoted below). I
think my model for the non-zero counts is underfitting because the
zero-truncated negative binomial distribution is not a great representation
of the data. My count data for y>0 are as follows:
> summary(as.factor(dat.nz$Count))
  1   2   3   4   5   6   7   8  10  11  12
172  39  18  14   4   8   2   1   1   1   1
The zero-truncated negative binomial family does not seem to accommodate
this many ones followed by such a steep drop-off at higher counts. Thus, I suppose I
shouldn't use this family of models for the 2nd part of the hurdle model
that I'm trying to fit. But what are the alternatives? I have thought about
quantile regression and am searching for non-parametric methods, but I am
not sure if it is acceptable to split up the model in this way.
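
To make the lack-of-fit claim above easier to check, here is a rough sketch in base R. It fits an intercept-only zero-truncated negative binomial to the frequency table by maximum likelihood and compares observed with expected frequencies. It is only a marginal check under simplifying assumptions: it ignores all covariates and the random effects for plot and year, and it uses the NB2 parameterisation from dnbinom() rather than glmmADMB's "truncnbinom1" (with a single mean and no covariates the two parameterisations describe the same family). The names vals, freq, ztnb_logpmf and nll are made up for this illustration, and the optimiser bounds are arbitrary.

```r
# Observed frequencies of the non-zero counts (copied from the summary above)
vals <- c(1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12)
freq <- c(172, 39, 18, 14, 4, 8, 2, 1, 1, 1, 1)

# Log-pmf of a zero-truncated negative binomial: log P(Y = y | Y > 0)
ztnb_logpmf <- function(y, mu, size) {
  dnbinom(y, mu = mu, size = size, log = TRUE) -
    pnbinom(0, mu = mu, size = size, lower.tail = FALSE, log.p = TRUE)
}

# Negative log-likelihood; parameters are on the log scale to keep them positive
nll <- function(par) {
  -sum(freq * ztnb_logpmf(vals, mu = exp(par[1]), size = exp(par[2])))
}

# Box constraints (an arbitrary choice) keep the search away from numerical underflow
fit <- optim(c(0, 0), nll, method = "L-BFGS-B",
             lower = c(-10, -10), upper = c(10, 10))
mu_hat   <- exp(fit$par[1])
size_hat <- exp(fit$par[2])

# Expected frequencies under the fitted truncated distribution
expected <- sum(freq) * exp(ztnb_logpmf(vals, mu_hat, size_hat))
print(round(data.frame(count = vals, observed = freq, expected = expected), 1))

# Mean of the truncated distribution, mu / P(Y > 0); unlike mu itself, it cannot be below 1
trunc_mean <- mu_hat / pnbinom(0, mu = mu_hat, size = size_hat, lower.tail = FALSE)
print(c(mu = mu_hat, size = size_hat, truncated_mean = trunc_mean))
```

If the expected column tracks the observed one reasonably well, the marginal shape alone is probably not the problem. If the estimates end up on the optimiser bounds, that would suggest the data are pushing toward a limiting case of the family, and the numbers should be read with caution.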
I welcome any thoughts or suggestions.
Thank you,
Shawn
On Tue, Jun 24, 2014 at 4:37 PM, Shawn O'Neil wrote:
> Dear all,
>
> I am fitting some count data to environmental covariates. These data are
> overdispersed and possibly zero-inflated (85% zeros). The objective of this
> research is to assess relative influence of various habitat characteristics
> on the spatial distribution of female ducks during the non-breeding season.
> The response variable is the number of marked females located during a
> survey. The predictors include some measured habitat variables and some
> measures associated with each survey session, e.g. weather conditions and
> median group sizes for the marked individuals. We need a mixed model
> because the data are longitudinal in nature (repeated measures at the same
> plot locations over time). Currently, I am fitting models that include
> crossed random effects for plot and year. I think that a hurdle model (or
> two-part, or zero-altered) is a good approach for us because we only have
> group information for the birds if >= 1 bird was observed. So we can model
> the presence/absence of marked birds as a separate process if we use a
> hurdle model.
>
> My results are okay with the first (binomial) part of the model (0s vs.
> 1s), but I am having some difficulty with the second part. I fit models to
> the non-zero data using the truncated negative binomial distribution in
> glmmADMB, which fits the data better than the truncated Poisson. The model
> formula is, for example
>
> m7.count <- glmmadmb(Count ~ depth + view + HighWind + Group + LowWater +
>                        (1|Plot) + (1|YEAR),
>                      data = dat.nz, family = "truncnbinom1")
>
> The model output seems okay to me, but I have run some residual
> diagnostics using "res<-m7.count$residuals[,1]" and identified several
> issues with the results which I am not sure how to correct. First, the mean
> and median of the residuals are > 0, and their distribution is rather peaked
> and not very "normal." I also get many fitted values between 0 and 1, which seems odd
> because the counts are all >=1. So the model is generally fitting lower
> values than expected.
>
> > quantile(res)
>         0%        25%        50%        75%       100%
> -5.2136210 -0.1655928  0.2069162  0.6925399  8.4033597
>
> > fits <- fitted(m7.count)
> > quantile(fits)
>        0%       25%       50%       75%      100%
> 0.2961099 0.7031512 1.0379357 1.4491914 9.3689599
>
> I am not sure if the above is a serious problem or not. A bigger problem
> seems to be increasing variance. Residual variance increases with counts.
> We don't have data to explain the increasing variance trend so I am
> wondering if there is a way to account for this in the models. Given that,
> my questions are:
>
> - Do I need to account for this variance structure, and is there a way to
> do it in glmmADMB (similar to varWeights in 'nlme')?
> - Is there a better way to handle these data, using another package or
> another model? From what I can find, I think I might be limited to
> glmmADMB, or perhaps MCMCglmm.
> - Other suggestions?
>
> Thank you and please let me know if I should provide more information, as
> I was trying to limit the length of the original message.
>
> regards,
> Shawn
>
> --
> Shawn O'Neil
> PhD Student, Forest Science
> Dept. of Forest Resources and Environmental Science
> Michigan Technological University
>
>