[R-sig-ME] single argument anova for GLMMs (really, glmer, or dispersion?)

Sat Dec 13 21:39:30 CET 2008

On Sat, Dec 13, 2008 at 12:46 PM, Murray Jorgensen
<maj at stats.waikato.ac.nz> wrote:
> I thought I might note that zero-inflated count data and negative binomial
> data can both be seen as cases where the response variable follows a mixture
> distribution. In the ZIP case a mixture of a constant [ Poisson(0) or
> Poisson(tiny) with another Poisson], in the negative binomial case a gamma
> mixture of Poissons [which might be approximated by a finite mixture].

I was thinking a bit more about your suggestion of mixtures as a way
of incorporating overdispersion.  It is quite a reasonable suggestion
but I am afraid I don't know enough about methods of estimating the
parameters in a mixture model to decide if it is feasible to put such
models in the framework I plan to use.  My "bottom line" is that I
want to be able to determine the conditional modes of the random
effects given the data and parameter values by solving a penalized
iteratively reweighted least squares problem. If mixture models, or
even restricted forms of mixture models like the ZIP model, can be
expressed in that form then it is just a question of deciding how the
model can be specified and how the specification can be translated
into such a problem.  (This process is not trivial.  It is a lot
easier to write down a model than it is to decide how to define the
arguments and defaults for specifying such a model as an R function.)
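
For concreteness, the penalized problem referred to above can be written
roughly as follows.  This is a sketch in assumed notation, not taken from
the lme4 sources: u is a vector of unit-variance ("spherical") random
effects, \Lambda_\theta is a covariance factor depending on \theta, g is
the link function, d_i is the deviance contribution of observation i, and
the family has unit scale parameter (Poisson or binomial).

```latex
\tilde{u}(\theta,\beta)
  = \arg\min_{u}\Bigl[\,\sum_{i=1}^{n} d_i\bigl(y_i,\mu_i(u)\bigr)
      + \lVert u\rVert^{2}\Bigr],
\qquad
\mu(u) = g^{-1}\bigl(X\beta + Z\Lambda_\theta u\bigr)
```

PIRLS solves this by repeatedly replacing the deviance term with a
weighted least squares approximation at the current value of u.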

My guess is that models like ZIP can't be expressed that way, so it
would be necessary to condition on the mixture components, estimate
the conditional modes of the random effects and the conditional
estimates of the parameters, then iterate.
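
The "condition on the components, estimate, then iterate" scheme is
essentially an EM iteration.  A minimal sketch, assuming an
intercept-only ZIP model with no random effects (so the conditional fit
is a closed-form weighted mean rather than a PIRLS step), written in
Python only to keep it self-contained; the function name zip_em is
hypothetical:

```python
import math


def zip_em(y, n_iter=500, tol=1e-12):
    """EM for an intercept-only zero-inflated Poisson model (illustration only).

    Returns (pi, lam): pi is the probability of a structural zero and
    lam is the mean of the Poisson component.
    """
    n = len(y)
    n_zero = sum(1 for v in y if v == 0)
    pi = 0.5 * n_zero / n            # crude starting values
    lam = max(sum(y) / n, 1e-8)
    for _ in range(n_iter):
        # E-step: probability that an observed zero is a structural zero.
        p0 = pi / (pi + (1.0 - pi) * math.exp(-lam))
        z = [p0 if v == 0 else 0.0 for v in y]
        # M-step: conditional on the expected component labels, the
        # Poisson part is an ordinary weighted Poisson fit.
        new_pi = sum(z) / n
        w = [1.0 - zi for zi in z]
        new_lam = sum(wi * v for wi, v in zip(w, y)) / sum(w)
        converged = abs(new_pi - pi) + abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam
```

With random effects in the model, the weighted fit in the M-step is
where a PIRLS solve for the conditional modes would go, using the
weights w as prior weights.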

One of the basic changes in the allcoef branch of the lme4 code is the
way that the "outer" optimization is performed (PIRLS is the "inner"
optimization in the Laplace or adaptive Gauss-Hermite approximation;
optimization of the profiled deviance with respect to \theta is the
outer optimization).  In the current lme4 this is done internally in C
code and hence is somewhat inaccessible to other programmers.  In the
allcoef branch this is done at the R level by calling nlminb.  In that
branch, setting doFit = FALSE in a call to lmer/glmer/nlmer returns an
environment that is suitable for defining the optimization problem, in
that it has methods getPars, getBounds and setPars.  The last of these
sets new values of the parameters and returns the objective function
evaluated at the new parameters.  Allowing access to this
environment is intended to be the hook that others can use to set up a
model that is almost what they want so they can then mold the
optimization process to fit the model that they do want.
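
To illustrate why three such methods are enough to drive an outer
optimization, here is a minimal sketch in Python.  Everything in it is
a stand-in: ToyProblem is a hypothetical object whose method names only
mirror getPars, getBounds and setPars, its objective is an arbitrary
quadratic rather than a profiled deviance, and a simple compass search
takes the place of nlminb.

```python
import math


class ToyProblem:
    """Hypothetical stand-in for the environment returned with doFit = FALSE."""

    def __init__(self):
        self._pars = [1.0, 0.0]

    def get_pars(self):
        return list(self._pars)

    def get_bounds(self):
        # The first parameter plays the role of a variance-type parameter
        # (bounded below by zero); the second is unconstrained.
        return [(0.0, math.inf), (-math.inf, math.inf)]

    def set_pars(self, pars):
        # Install the new values and return the objective evaluated at them.
        self._pars = list(pars)
        theta, beta = self._pars
        return (theta + 0.5) ** 2 + (beta - 2.0) ** 2 + 3.0


def compass_search(problem, step=1.0, tol=1e-8, max_iter=10000):
    """Box-constrained, derivative-free minimizer that uses only
    get_pars, get_bounds and set_pars."""
    x = problem.get_pars()
    bounds = problem.get_bounds()
    fx = problem.set_pars(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            for delta in (step, -step):
                trial = list(x)
                trial[i] = min(max(trial[i] + delta, lo), hi)  # clamp to the box
                f_trial = problem.set_pars(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step *= 0.5  # shrink the pattern when no direction helps
    problem.set_pars(x)  # leave the problem at the best point found
    return x, fx
```

The optimizer never sees the model itself, only the parameter vector,
the bounds and the value returned by set_pars, which is what makes it
possible to swap in a different optimizer or a modified objective.  In
this toy example the unconstrained minimum has a negative first
parameter, so the search ends with that parameter on its lower bound.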