[R-sig-ME] Zero alteration and mixed models?

Sat May 10 21:26:30 CEST 2014

Kirsty Gurney <kegurney at ...> writes:

> 
> Good afternoon;
> 
> A non-R specific question about mixed models, but I am sending this email
> with the hope that others in this group may be familiar with zero-altered
> (i.e. zero-inflated or hurdle) models in a mixed model context.

  I'll take a stab at this, with the usual caveat that the advice
is worth whatever you paid for it :-)

> My current project is focused on understanding changes in wetland biota
> related to environmental change, and I'm using zero-altered negative
> binomial models (implemented in SAS) to evaluate changes in abundance of
> invertebrates in these habitats.  I have had good success implementing
> models that accurately reflect my data structure and that include predictor
> variables of interest, but I do have a question or two outstanding.
> 
> Specifically, I am curious about the logit (zero) part of the ZINB mixed
> models.  If parameters for this part of the model are estimated imprecisely
> and thereby uninformative, should they be removed from the model?
> 
> Prior to model construction, I plotted the proportion of zeroes in the
> dataset as a function of several variables, and these plots suggested that
> wetland class had an important influence on whether or not excess zeroes
> were observed.  However, none of the parameter estimates for the wetland
> class variable are predicted with any precision (nor is the intercept for
> this logit model) in the model that includes them.
> 
> If anyone on this list is willing / able to provide any insights or
> suggestions as to the mathematical interpretation for the inflation
> probability portion of these models, I would be most grateful.
> 
> Thank you in advance.
> 
> kbg

  I think the answer to this question depends on your general purpose
in modeling, and your general philosophy of model selection.  In other
words, the answer is the same as for the more general question of
whether you should drop or simplify any model terms that don't appear
to be doing anything useful.

  If you are doing confirmatory hypothesis testing, then you definitely
shouldn't drop them.

  If you are primarily interested in prediction, it might be
reasonable to try to do some form of model selection, which will
generally increase the bias and decrease the variance (with due
attention to the effect of model uncertainty and model selection on
the confidence intervals/uncertainty of the estimates and predictions).
What you're describing above is essentially a crude form of backward
stepwise model selection (or at least the first step ...)
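
  To make the bias/variance point concrete, here is a minimal
simulation sketch (in Python, standard library only).  Everything in
it is an assumption made up for illustration: two hypothetical wetland
classes with true zero probabilities of 0.30 and 0.40, 15 wetlands per
class, and raw proportions standing in for a fitted logit model.  It
ignores the count part of the model and any random effects.  The
"full" estimator keeps the class term; the "reduced" one drops it and
pools the classes.

```python
import random
import statistics


def simulate(p_a=0.30, p_b=0.40, n_per_class=15, n_reps=2000, seed=1):
    """Compare two estimators of the zero probability in class A.

    full    : proportion of zeros observed in class A only
              (keeps the class term)
    reduced : pooled proportion of zeros over classes A and B
              (drops the class term)

    All default values are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    full, reduced = [], []
    for _ in range(n_reps):
        # Draw the number of zero observations in each class.
        zeros_a = sum(rng.random() < p_a for _ in range(n_per_class))
        zeros_b = sum(rng.random() < p_b for _ in range(n_per_class))
        full.append(zeros_a / n_per_class)
        reduced.append((zeros_a + zeros_b) / (2 * n_per_class))

    def summary(estimates):
        # Bias and variance across replicates, relative to the true p_a.
        bias = statistics.fmean(estimates) - p_a
        variance = statistics.pvariance(estimates)
        return {"bias": bias, "variance": variance,
                "mse": bias ** 2 + variance}

    return {"full": summary(full), "reduced": summary(reduced)}


if __name__ == "__main__":
    for name, stats in simulate().items():
        print(name, {k: round(v, 4) for k, v in stats.items()})
```

  With these made-up settings the pooled estimator should show the
larger bias and the smaller variance.  Which of the two has the lower
mean squared error depends on how far apart the classes really are and
on how much data you have, which is why the answer depends on what you
want from the model.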

  Ben Bolker