[R-sig-ME] r-sig-mixed-models answer

Sun Jan 29 02:09:40 CET 2012

> Date: Thu, 26 Jan 2012 13:16:12 -0600
> From: Shawn McCracken <smccracken at ...>
> To: r-sig-mixed-models at ...
> Subject: [R-sig-ME] GLMM distribution family model comparison using
>  Poisson w/observation level random effect
> Message-ID: <CAE+9gVGEtv6coOG1Un=pWVKWyd2s=cZQGZv3SeCgMvvENVjbWA at ...>
> Content-Type: text/plain
> 
> Dear R mixed model users,
> 
> I have been using package glmmADMB to run a full model and then
> reduced model versions with the following distribution families:
> poisson, zero-inflated poisson, poisson w/observation level random
> effect, negative binomial, zero-inflated negative binomial, negative
> binomial type 1, and zero-inflated negative binomial type 1. The
> models using poisson w/observation level random effect give the best
> fit according to AIC values but I am wondering if this is a fair
> comparison since it has an additional random variable (observation)?

  If you look, I believe you will find that the Poisson with
observation-level random effect is counted as having the same number of
parameters as the negative binomial.  Writing N for the number of
parameters in the plain Poisson model, the counts *should* be:

  Poisson           N
  ZIP              N+1
  Poisson w/ obs   N+1
  NB               N+1
  ZINB             N+2
  NB1              N+1
  ZINB1            N+2

I just tried running this with a relatively trivial example (intercept +
1 continuous covariate + 1 intercept-only random effect, so N=3),
and glmmADMB appears to agree with what I thought it should do:

  poiss     ZIP LNPoiss      NB    ZINB     NB1   ZINB1
      3       4       4       4       5       4       5

The particular example I ran had a true NB2 distribution with 500
observations, 10 blocks, intercept=1, slope=2, RE variance=1, and
overdispersion parameter=1.2; this was the AIC table:

> AICtab(mlist)
          dAIC df
NB         0.0  4
ZINB       0.9  5
LNPoiss   43.4  4
ZINB1    104.6  5
NB1      115.4  4
ZIP     1854.7  4
poiss   4730.9  3
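
  For anyone who wants to try a comparison like this, here is a rough
sketch of how it could be set up.  It is not the exact script behind the
table above: the seed, the covariate distribution, the balanced block
layout, and reading "overdispersion parameter" as the `size` argument of
rnbinom() are all assumptions, so the numbers will differ.  The fitting
step only runs if glmmADMB and bbmle are installed.

```r
## Simulate NB2 counts with a block-level random intercept.
## Assumptions (not from the original run): seed, uniform covariate,
## balanced blocks, overdispersion parameter = rnbinom's 'size'.
set.seed(101)
nobs   <- 500
nblock <- 10
d <- data.frame(x     = runif(nobs),
                block = factor(rep(seq_len(nblock), length.out = nobs)),
                obs   = factor(seq_len(nobs)))  # one level per observation
b   <- rnorm(nblock, mean = 0, sd = 1)          # RE variance = 1
eta <- 1 + 2 * d$x + b[as.integer(d$block)]     # intercept = 1, slope = 2
d$y <- rnbinom(nobs, mu = exp(eta), size = 1.2) # NB2 response

## Fit the seven families and compare parameter counts and AIC.
if (requireNamespace("glmmADMB", quietly = TRUE) &&
    requireNamespace("bbmle", quietly = TRUE)) {
  library(glmmADMB)
  library(bbmle)
  mlist <- list(
    poiss   = glmmadmb(y ~ x + (1 | block), data = d, family = "poisson"),
    ZIP     = glmmadmb(y ~ x + (1 | block), data = d, family = "poisson",
                       zeroInflation = TRUE),
    ## lognormal-Poisson: Poisson plus an observation-level random effect
    LNPoiss = glmmadmb(y ~ x + (1 | block) + (1 | obs), data = d,
                       family = "poisson"),
    NB      = glmmadmb(y ~ x + (1 | block), data = d, family = "nbinom"),
    ZINB    = glmmadmb(y ~ x + (1 | block), data = d, family = "nbinom",
                       zeroInflation = TRUE),
    NB1     = glmmadmb(y ~ x + (1 | block), data = d, family = "nbinom1"),
    ZINB1   = glmmadmb(y ~ x + (1 | block), data = d, family = "nbinom1",
                       zeroInflation = TRUE))
  ## number of parameters each fit is charged for
  print(sapply(mlist, function(m) attr(logLik(m), "df")))
  print(AICtab(mlist))
}
```

The observation-level random effect is just a grouping factor with one
level per row, which is why it costs a single variance parameter in the
df count.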

  This is somewhat comforting.  The only thing I find surprising here is
that LNPoiss (= lognormal Poisson = Poisson with observation-level
random effect) is so much worse, since it has exactly the same
mean-variance relationship as NB2.  Everything else is about as I expected.
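
  To spell out the mean-variance claim: conditional on the fixed effects
and the block effect, both models give a variance that is quadratic in
the mean.  The symbols below (mu for the mean, k for the NB2
overdispersion parameter, sigma^2 for the variance of the
observation-level effect, eta for the linear predictor) are my notation,
not anything glmmADMB reports under those names.

```latex
\begin{align*}
\text{NB2:}\quad
  \operatorname{Var}(Y) &= \mu + \frac{\mu^2}{k} \\[4pt]
\text{LN-Poisson:}\quad
  Y \mid \varepsilon &\sim \operatorname{Poisson}\!\left(e^{\eta+\varepsilon}\right),
  \qquad \varepsilon \sim N(0,\sigma^2) \\
  \mu = \operatorname{E}(Y) &= e^{\eta+\sigma^2/2} \\
  \operatorname{Var}(Y)
    &= \operatorname{E}\!\left[\operatorname{Var}(Y\mid\varepsilon)\right]
     + \operatorname{Var}\!\left[\operatorname{E}(Y\mid\varepsilon)\right] \\
    &= \mu + \mu^2\left(e^{\sigma^2}-1\right)
\end{align*}
```

So 1/k in NB2 plays the same role as exp(sigma^2)-1 in the lognormal
Poisson; with k=1.2 the matching sigma^2 would be log(1+1/1.2), about
0.61.  The two models still differ in the shape of the mixing
distribution (Gamma vs. lognormal), which is presumably where the AIC
difference comes from.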

  As Mollie hints (in her message below), there are two issues with
applying AIC to mixed models (one of which also applies to ZI models):
(1) boundary effects and (2) counting the number of parameters.  There's
more on this at http://glmm.wikidot.com/faq as well as in the paper
Mollie cites.

=========================
Since no one has responded yet, I'll take a stab at this.

You are correct that it's not quite fair, but it's not straightforward.
I'm guessing that negative binomial is the runner-up because it is also
based on a Poisson distribution with a mean that comes from a
right-skewed distribution (Gamma distribution). The difference is that
by including an observation-level random effect, you have somewhere
between 1 and nobs-1 parameters. I believe this gives more flexibility
than the 1 extra parameter of the negative binomial model,
and ideally the model would be penalized for this, but estimating
degrees of freedom in mixed models is not straightforward (see Box 3 of
Bolker et al. 2009, doi:10.1016/j.tree.2008.10.008). By running a glmmadmb
example I see that the df only counts 1 parameter for the random effect
(its standard deviation). Approximations of df are controversial.

Mollie