[R-sig-ME] glmer overdispersion correction, family = binomial

David Duffy davidD at qimr.edu.au
Sat Mar 5 00:00:53 CET 2011


On Fri, 4 Mar 2011, Colin Wahl wrote:

>
> The observation level effect accounts for a large portion of the variance.
> Should I interpret this as meaning the observation level effect is improving
> the model fit substantially? If so, would I expect the standard errors to
> decrease?
>
> The reason I ask is that I noticed the standard errors for the fixed effects
> in the model with the observation level effect are consistently decreased by
> ~1 or 2 percent for all fixed effects, resulting in slightly more
> significant p values. This seems counter to your last point that this
> "correction" should inflate the SEs by about the right amount.
>
> Is it still possible that I have an over fitted model? If so, how can I
> determine if that is the case?
>

I presume John M will say something cogent, but I am wondering more:

1) What are your key scientific questions?  Are you more interested in 
hypothesis testing on the fixed effects, where your conclusions would seem 
to be the same under either model, or do you want predictions for 
modelling populations?

2) Why are you including stream:rip as a random effect, with 
three observations per level? Or are there more than 72 observations? 
I certainly can't justify why I think it is OK to have 12 random effects 
for stream, plus 72 random effects for subject, but get nervous when you 
also want another 24, but I do ;)  I think it is because I am happy to do 
this when there are strong mechanistic reasons as in genetic models, 
but less so when one is adding it here, where the fixed effect of rip was 
less than stellar.

3) For further model criticism, re "overfitting", I would not rely on 
simple summaries such as those you are giving us, but would eyeball the 
predicted values for every level of wsh and stream, and see if they look 
sensible.
On the face of it, the loglikelihood for the observation level effect 
model below is much better than for the second model.  The only way I have 
tripped up in the past is when the model is completely inappropriate for 
the data, so that likelihood ratios of that type are completely dependent 
on a few extreme observations.

4) If the dataset is not any larger than 72 observations, then maybe you 
should be looking at Bayesian approaches, where you could inject prior 
knowledge about the area.  I presume other people have analysed similar 
data...
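
To put a number on the loglikelihood comparison in point 3, here is a rough
sketch in base R that uses only the logLik and AIC values printed in the two
summaries quoted below. The halved p-value is the usual approximation for
testing a single variance component on the boundary of its parameter space;
treat it as a guide and not as an exact test, and note the variable names
are mine.

```r
# Log-likelihoods as printed in the two glmer summaries quoted in this thread
loglik_with_obs    <- -131.2  # model including (1 | obs)
loglik_without_obs <- -367.2  # model without (1 | obs)

# Likelihood-ratio statistic for adding the one extra variance component
lrt <- 2 * (loglik_with_obs - loglik_without_obs)

# Naive chi-square(1) p-value; conservative, because the null value
# (variance = 0) lies on the boundary of the parameter space
p_naive <- pchisq(lrt, df = 1, lower.tail = FALSE)

# Common approximation: 50:50 mixture of chi-square(0) and chi-square(1),
# which amounts to halving the naive p-value
p_boundary <- p_naive / 2

# AIC difference (without obs minus with obs), from the printed AIC values
aic_diff <- 754.3 - 284.4

print(c(LRT = lrt, AIC_difference = aic_diff))
print(c(p_naive = p_naive, p_boundary = p_boundary))
```

With the fitted model objects to hand (say m_obs and m_noobs, names assumed
here), anova(m_noobs, m_obs) should report the same statistic. None of this
guards against the failure mode described in point 3, where a few extreme
observations drive the whole likelihood ratio.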



> Here are the glmer outputs for the model with and without the obs level
> effect.
>
> The model with the observation level effect:
> Generalized linear mixed model fit by the Laplace approximation
> Formula: E ~ wsh * rip + (1 | stream) + (1 | stream:rip) + (1 | obs)
>   Data: ept
>   AIC   BIC logLik deviance
> 284.4 309.5 -131.2    262.4
> Random effects:
> Groups     Name        Variance Std.Dev.
> obs        (Intercept) 0.30186  0.54942
> stream:rip (Intercept) 0.40229  0.63427
> stream     (Intercept) 0.12788  0.35760
> Number of obs: 72, groups: obs, 72; stream:rip, 24; stream, 12
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -4.2906     0.4935  -8.694  < 2e-16 ***
> wshd         -2.0557     0.7601  -2.705  0.00684 **
> wshf          3.3575     0.6339   5.297 1.18e-07 ***
> wshg          3.3923     0.7486   4.531 5.86e-06 ***
> ripN          0.1425     0.6323   0.225  0.82165
> wshd:ripN     0.3708     0.9682   0.383  0.70170
> wshf:ripN    -0.8665     0.8087  -1.071  0.28400
> wshg:ripN    -3.1530     0.9601  -3.284  0.00102 **
>
> Model without the Observation level effect:
> Generalized linear mixed model fit by the Laplace approximation
> Formula: E ~ wsh * rip + (1 | stream) + (1 | stream:rip)
>   Data: ept
>   AIC BIC logLik deviance
> 754.3 777 -367.2    734.3
> Random effects:
> Groups     Name        Variance Std.Dev.
> stream:rip (Intercept) 0.48908  0.69934
> stream     (Intercept) 0.18187  0.42647
> Number of obs: 72, groups: stream:rip, 24; stream, 12
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept) -4.28529    0.50575  -8.473  < 2e-16 ***
> wshd        -2.06605    0.77357  -2.671  0.00757 **
> wshf         3.36248    0.65118   5.164 2.42e-07 ***
> wshg         3.30175    0.76962   4.290 1.79e-05 ***
> ripN         0.07063    0.61930   0.114  0.90920
> wshd:ripN    0.60510    0.94778   0.638  0.52319
> wshf:ripN   -0.80043    0.79416  -1.008  0.31350
> wshg:ripN   -2.78964    0.94336  -2.957  0.00311 **
>
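
As a quick check on the "SEs decreased by ~1 or 2 percent" observation, the
standard errors can be compared term by term. The sketch below is base R
only, with the values typed in from the two fixed-effects tables quoted
above; the variable names are mine.

```r
# Fixed-effect terms, in the order they appear in both quoted summaries
terms <- c("(Intercept)", "wshd", "wshf", "wshg",
           "ripN", "wshd:ripN", "wshf:ripN", "wshg:ripN")

# Std. Error column, model WITH the observation level effect
se_with_obs <- c(0.4935, 0.7601, 0.6339, 0.7486,
                 0.6323, 0.9682, 0.8087, 0.9601)

# Std. Error column, model WITHOUT the observation level effect
se_without_obs <- c(0.50575, 0.77357, 0.65118, 0.76962,
                    0.61930, 0.94778, 0.79416, 0.94336)

# Percentage change in SE when the observation level effect is added
# (negative = SE got smaller, positive = SE got larger)
pct_change <- 100 * (se_with_obs / se_without_obs - 1)
names(pct_change) <- terms

print(round(pct_change, 2))
```

On the figures as quoted, the intercept and the three wsh main effects have
slightly smaller SEs with the observation level effect, while ripN and the
three wsh:rip interactions have slightly larger ones, so the shift is not in
one direction for all terms. All of the changes are under 3 percent.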

-- 
| David Duffy (MBBS PhD)                                         ,-_|\
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101  /     *
| Epidemiology Unit, Queensland Institute of Medical Research   \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v



