[R-sig-ME] glmer and overdispersed Poisson models

Tue Sep 16 20:42:59 CEST 2008

On Tue, Sep 16, 2008 at 12:44 AM,  <David.Ramsey at dse.vic.gov.au> wrote:
> Hi All,  I have recently upgraded my version of lme4 and redid an old
> analysis.  The data are counts and are overdispersed and an offset is
> included.  I originally fitted a mixed model in lmer() using
> quasi-likelihood

> lme4 version 0.99875-9   Matrix version 0.999375-4

>> fit1= lmer(total ~ direction + time + flow + offset(log(effort)) +
> (1|site),family=quasipoisson(link=log),data=data,method="ML")
>> summary(fit1)
> Generalized linear mixed model fit using Laplace
> Formula: total ~ direction + time + flow + offset(log(effort)) + (1 |
> site)
>   Data: data
>  Family: quasipoisson(log link)
>    AIC    BIC  logLik deviance
>  222622 222637 -111305   222610
> Random effects:
>  Groups   Name        Variance Std.Dev.
>  site     (Intercept) 3930.7   62.695
>  Residual             3544.0   59.531
> number of obs: 80, groups: site, 7

> Fixed effects:
>             Estimate Std. Error t value
> (Intercept)   1.46436   23.70079   0.062
> directionout -0.22065    0.27874  -0.792
> timenight    -1.16554    0.28429  -4.100
> flows         0.01255    0.43394   0.029
> flowf         0.40867    0.55300   0.739

> Correlation of Fixed Effects:
>            (Intr) drctnt tmnght flows
> directionot -0.005
> timenight   -0.008  0.086
> flows       -0.011 -0.066  0.072
> flowf       -0.011 -0.048  0.041  0.687

> So far so good.  We have an estimate of the scale of 59.5 (ouch - yes
> pretty bad overdispersion!).

Yes.  It raises questions of whether it is a good idea to use the
results at all.

> I redid this analysis recently with the latest version of lme4

> lme4 version 0.999375-26: Matrix version 0.999375-14

>> fit2= glmer(total ~ direction + time + flow + offset(log(effort)) +
> (1|site),family=quasipoisson(link=log),data=data,REML=F)
>> summary(fit2)
> Generalized linear mixed model fit by the Laplace approximation
> Formula: total ~ direction + time + flow + offset(log(effort)) + (1 |
> site)
>   Data: data
>    AIC    BIC  logLik deviance
>  222624 222641 -111305   222610
> Random effects:
>  Groups   Name        Variance Std.Dev.
>  site     (Intercept) 4019367  2004.8
>  Residual             3487254  1867.4
> Number of obs: 80, groups: site, 7

That much of a difference cannot be due to changes in the optimization
method.  There must be a different formula being employed.  I suspect
that I am not taking the weights into account properly when
calculating the scale factor in one or the other implementation and
probably that the mistake is in the more recent implementation.

I have difficulty in following the calculations for the quasipoisson
and quasibinomial families because they are not really probability
distributions and you end up estimating a parameter that "isn't
there", in some sense.

If someone can suggest what the formula for the scale factor should
be, I can make a modification but I don't really have time right now
to research it or derive the theory.

> Fixed effects:
>              Estimate Std. Error  t value
> (Intercept)    1.44981  757.88854  0.00191
> directionout  -0.22063    8.74381 -0.02523
> timenight     -1.16552    8.91772 -0.13070
> flows          0.01266   13.61242  0.00093
> flowf          0.40877   17.34707  0.02356
>
> Correlation of Fixed Effects:
>            (Intr) drctnt tmnght flows
> directionot -0.005
> timenight   -0.008  0.086
> flows       -0.011 -0.066  0.072
> flowf       -0.011 -0.048  0.041  0.687
>
> Yikes!! the estimate of the site random effect and scale are now orders of
> magnitude larger.
> Log-likelihood and deviance is the same as are the estimates of fixed
> effects and correlations.
> Obviously the SE of the fixed effects are now a bit larger... This
> complicates
> my inferences from the previous analysis, to say the least.
>
> I guess there has been some changes in the way the scale is estimated in
> the latest version of lmer (glmer).
> If I refit these models without the estimation of the scale parameter (and
> close my eyes..)
>
> lme4 version 0.99875-9   Matrix version 0.999375-4
>
>>fit3<- lmer(total ~ direction + time + flow + offset(log(effort)) +
> (1|site),family=poisson(link=log),data=data,method="ML")
>> summary(fit3)
> Generalized linear mixed model fit using Laplace
> Formula: total ~ direction + time + flow + offset(log(effort)) + (1 |
> site)
>   Data: data
>  Family: poisson(log link)
>    AIC    BIC  logLik deviance
>  222622 222637 -111305   222610
> Random effects:
>  Groups Name        Variance Std.Dev.
>  site   (Intercept) 1.1131   1.0550
> number of obs: 80, groups: site, 7
>
> Estimated scale (compare to  1 )  59.53134
>
> Fixed effects:
>              Estimate Std. Error z value Pr(>|z|)
> (Intercept)   1.464490   0.398841    3.67 0.000241 ***
> directionout -0.220652   0.004682  -47.12  < 2e-16 ***
> timenight    -1.165554   0.004775 -244.07  < 2e-16 ***
> flows         0.012577   0.007289    1.73 0.084456 .
> flowf         0.408721   0.009289   44.00  < 2e-16 ***
>
>
> lme4 version 0.999375-26: Matrix version 0.999375-14
>
>> fit4= glmer(total ~ direction + time + flow + offset(log(effort)) +
> (1|site),family=poisson(link=log),data=data,REML=F)
>> summary(fit6)
> Generalized linear mixed model fit by the Laplace approximation
> Formula: total ~ direction + time + flow + offset(log(effort)) + (1 |
> site)
>   Data: data
>    AIC    BIC  logLik deviance
>  222622 222637 -111305   222610
> Random effects:
>  Groups Name        Variance Std.Dev.
>  site   (Intercept) 1.1526   1.0736
> Number of obs: 80, groups: site, 7
>
> Fixed effects:
>              Estimate Std. Error z value Pr(>|z|)
> (Intercept)   1.449810   0.405848    3.57 0.000354 ***
> directionout -0.220630   0.004682  -47.12  < 2e-16 ***
> timenight    -1.165520   0.004775 -244.07  < 2e-16 ***
> flows         0.012655   0.007289    1.74 0.082547 .
> flowf         0.408774   0.009289   44.00  < 2e-16 ***
>
> The results from the two different versions now agree(to within a few
> decimal places).
> I notice the latest version of glmer() now does not output the scale in
> the summary.
>
> I guess my question is "which of these is correct ?" I routinely have to
> deal with
> overdispersed count data and would appreciate any advice on conducting
> these
> sorts of analyses in lme4.
>
> Thanks in advance
> Dave
>
>
> Notice:
> This email and any attachments may contain information t...{{dropped:14}}
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>