[R-sig-ME] glmer overdispersion correction, family = binomial
David Duffy
davidD at qimr.edu.au
Sat Mar 5 00:00:53 CET 2011
On Fri, 4 Mar 2011, Colin Wahl wrote:
>
> The observation level effect accounts for a large portion of the variance.
> Should I interpret this as meaning the observation level effect is improving
> the model fit substantially? If so, would I expect the standard errors to
> decrease?
>
> The reason I ask is that I noticed the standard errors for the fixed effects
> in the model with the observation level effect are consistently decreased by
> ~1 or 2 percent for all fixed effects, resulting in slightly more
> significant p values. This seems counter to your last point that this
> "correction" should inflate the SEs by about the right amount.
>
> Is it still possible that I have an over fitted model? If so, how can I
> determine if that is the case?
>
I presume John M will say something cogent, but I am wondering more:
1) What are your key scientific questions? Are you more interested in
hypothesis testing on the fixed effects, where your conclusions would seem
to be the same under either model, or do you want predictions for
modelling populations?
2) Why are you including stream:rip as a random effect, with
three observations per level? Or are there more than 72 observations?
I certainly can't justify why I think it is OK to have 12 random effects
for stream, plus 72 random effects for subject, but get nervous when you
also want another 24, but I do ;) I think it is because I am happy to do
this when there are strong mechanistic reasons as in genetic models,
but less so when one is adding it here, where the fixed effect of rip was
less than stellar.
3) For further model criticism, re "overfitting", I would not rely on
simple summaries such as those you are giving us, but would eyeball the
predicted values for every level of wsh and stream and see whether they
look sensible. On the face of it, the loglikelihood for the observation
level effect model below (-131.2) is much better than for the second model
(-367.2). The only way I have tripped up in the past is when the model is
completely inappropriate for the data, so that likelihood ratios of that
type are completely dependent on a few extreme observations.
4) If the dataset is not any larger than 72 observations, then maybe you
should be looking at Bayesian approaches, where you could inject prior
knowledge about the area. I presume other people have analysed similar
data...
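
For what it is worth, the likelihood comparison in point 3 can be redone by hand from the quoted summaries. A minimal base-R sketch, using only the logLik and AIC values reported in the output below; the variable names are mine, and the 1-df chi-square reference is only a rough guide, since a variance component tested at zero sits on the boundary of the parameter space, which makes the naive p-value conservative:

```r
# Log-likelihoods and AICs as reported in the two quoted glmer summaries.
ll_obs    <- -131.2   # model with (1 | obs)
ll_noobs  <- -367.2   # model without (1 | obs)
aic_obs   <- 284.4
aic_noobs <- 754.3

# Likelihood ratio statistic; the models differ by one variance parameter.
lr <- 2 * (ll_obs - ll_noobs)

# Naive chi-square(1) p-value (conservative for a boundary test).
p_naive <- pchisq(lr, df = 1, lower.tail = FALSE)

# AIC difference in favour of the observation-level model.
delta_aic <- aic_noobs - aic_obs

cat("LR statistic:  ", lr, "\n")
cat("naive p-value: ", format(p_naive, digits = 3), "\n")
cat("AIC difference:", delta_aic, "\n")
```

A statistic of this size says the observation level effect is absorbing a lot of extra-binomial variation, but it does not by itself rule out that a handful of extreme observations are driving it, which is the caveat in point 3.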
> Here are the glmer outputs for the model with and without the obs level
> effect.
>
> The model with the observation level effect:
> Generalized linear mixed model fit by the Laplace approximation
> Formula: E ~ wsh * rip + (1 | stream) + (1 | stream:rip) + (1 | obs)
> Data: ept
>    AIC   BIC logLik deviance
>  284.4 309.5 -131.2    262.4
> Random effects:
>  Groups     Name        Variance Std.Dev.
>  *obs       (Intercept) 0.30186  0.54942*
>  stream:rip (Intercept) 0.40229  0.63427
>  stream     (Intercept) 0.12788  0.35760
> Number of obs: 72, groups: obs, 72; stream:rip, 24; stream, 12
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -4.2906   *0.4935*  -8.694  < 2e-16 ***
> wshd         -2.0557   *0.7601*  -2.705  0.00684 **
> wshf          3.3575   *0.6339*   5.297 1.18e-07 ***
> wshg          3.3923   *0.7486*   4.531 5.86e-06 ***
> ripN          0.1425   *0.6323*   0.225  0.82165
> wshd:ripN     0.3708   *0.9682*   0.383  0.70170
> wshf:ripN    -0.8665   *0.8087*  -1.071  0.28400
> wshg:ripN    -3.1530   *0.9601*  -3.284  0.00102 **
>
> Model without the Observation level effect:
> Generalized linear mixed model fit by the Laplace approximation
> Formula: E ~ wsh * rip + (1 | stream) + (1 | stream:rip)
> Data: ept
>    AIC   BIC logLik deviance
>  754.3   777 -367.2    734.3
> Random effects:
>  Groups     Name        Variance Std.Dev.
>  stream:rip (Intercept) 0.48908  0.69934
>  stream     (Intercept) 0.18187  0.42647
> Number of obs: 72, groups: stream:rip, 24; stream, 12
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept) -4.28529  *0.50575*  -8.473  < 2e-16 ***
> wshd        -2.06605  *0.77357*  -2.671  0.00757 **
> wshf         3.36248  *0.65118*   5.164 2.42e-07 ***
> wshg         3.30175  *0.76962*   4.290 1.79e-05 ***
> ripN         0.07063  *0.61930*   0.114  0.90920
> wshd:ripN    0.60510  *0.94778*   0.638  0.52319
> wshf:ripN   -0.80043  *0.79416*  -1.008  0.31350
> wshg:ripN   -2.78964  *0.94336*  -2.957  0.00311 **
>
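
The standard errors and the fixed-effect predictions can also be read straight off the two quoted tables. A rough base-R sketch, with the estimates typed in from the output above; it assumes the default logit link, ignores the random effects entirely (so these are predictions for a typical stream, not for each stream as suggested in point 3), and uses "ref" as a placeholder label for the unnamed baseline levels of wsh and rip:

```r
# Fixed-effect estimates and standard errors copied from the two quoted fits.
terms <- c("(Intercept)", "wshd", "wshf", "wshg",
           "ripN", "wshd:ripN", "wshf:ripN", "wshg:ripN")
est_obs   <- c(-4.2906, -2.0557, 3.3575, 3.3923,
               0.1425, 0.3708, -0.8665, -3.1530)
se_obs    <- c(0.4935, 0.7601, 0.6339, 0.7486,
               0.6323, 0.9682, 0.8087, 0.9601)
est_noobs <- c(-4.28529, -2.06605, 3.36248, 3.30175,
               0.07063, 0.60510, -0.80043, -2.78964)
se_noobs  <- c(0.50575, 0.77357, 0.65118, 0.76962,
               0.61930, 0.94778, 0.79416, 0.94336)

# Ratio > 1 means the SE is larger in the model with (1 | obs).
se_ratio <- setNames(se_obs / se_noobs, terms)
print(round(se_ratio, 3))

# Predicted probability for each wsh x rip cell, random effects set to zero.
# "ref" is a placeholder for the baseline level of each factor (assumption).
cell_prob <- function(b) {
  names(b) <- terms
  wsh <- c("ref", "d", "f", "g")
  rip <- c("ref", "N")
  out <- matrix(NA_real_, nrow = 4, ncol = 2,
                dimnames = list(wsh = wsh, rip = rip))
  for (w in wsh) {
    for (r in rip) {
      eta <- b[["(Intercept)"]]
      if (w != "ref") eta <- eta + b[[paste0("wsh", w)]]
      if (r == "N")   eta <- eta + b[["ripN"]]
      if (w != "ref" && r == "N") eta <- eta + b[[paste0("wsh", w, ":ripN")]]
      out[w, r] <- plogis(eta)  # inverse logit
    }
  }
  out
}

p_obs   <- cell_prob(est_obs)
p_noobs <- cell_prob(est_noobs)
print(round(p_obs, 4))
print(round(p_noobs, 4))
```

On these numbers the SE ratios fall below 1 for the intercept and the wsh main effects and above 1 for ripN and the wsh:rip interactions, so the change is not uniformly in one direction.
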
--
| David Duffy (MBBS PhD) ,-_|\
| email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / *
| Epidemiology Unit, Queensland Institute of Medical Research \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v