[R-sig-ME] lmer models-confusing results - more information!
Jarrod Hadfield
j.hadfield at ed.ac.uk
Thu Dec 3 11:19:31 CET 2009
Dear Gwyneth,
Since you're not getting any answers, I'll give it a go, at the risk
of being wrong.
The likelihood for a non-Gaussian GLMM cannot be obtained in closed
form and needs to be approximated. Often the approximation is good,
but in some cases it can be bad, particularly with binary data when
the incidence is extreme (very low or very high) and/or there is little
replication within factor levels. In extreme cases the parameter
estimates +/- 2 SEs do not even include the "true" values.
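As a quick check (just a sketch, using the variable names from the
code you quote below), you can tabulate the response within each
factor level; levels where every observation is 0 (or 1) are the ones
likely to cause trouble:

## proportion of successes within each year and each group:
## values of exactly 0 or 1 flag levels with no variation in the response
with(hornbill, tapply(br.su, yr, mean))
with(hornbill, tapply(br.su, grp.id, mean))

## counts of failures and successes by year
with(hornbill, table(yr, br.su))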
From your fixed-effect summary it appears that reproductive successes
within some factor levels are all zero. If so, this may well be what
is causing the problem, and treating year as a random effect may help.
MCMC solutions are probably more robust for these types of data
because they use approximations that become more exact the longer you
run the analysis.
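Something along these lines might be worth trying (untested, and only
a sketch using the variable names from your code below; glmer() is the
lme4 function for GLMMs, and the MCMCglmm prior shown is just one
common choice for binary data, not the only one):

library(lme4)
## year moved from the fixed to the random part of the model
model.yr <- glmer(br.su ~ factor(art.n) + grp.sz + rain + veg + wood +
                    (1 | yr) + (1 | grp.id),
                  data = hornbill, family = binomial)

library(MCMCglmm)
## the same model fitted by MCMC; with binary data the residual
## variance is not identified, so it is fixed at 1 in the prior
prior <- list(R = list(V = 1, fix = 1),
              G = list(G1 = list(V = 1, nu = 0.002),
                       G2 = list(V = 1, nu = 0.002)))
model.mcmc <- MCMCglmm(br.su ~ factor(art.n) + grp.sz + rain + veg + wood,
                       random = ~ yr + grp.id,
                       family = "categorical",
                       prior  = prior,
                       data   = hornbill,
                       nitt = 130000, burnin = 30000, thin = 100)
summary(model.mcmc)

Longer chains give a better approximation, so check the effective
sample sizes reported in the summary before trusting the posterior
intervals.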
With regard to an earlier email, over-dispersed binary data cannot
occur, because the mean determines the variance completely. This does
not mean that the probability of success is constant (after
conditioning on the model); it just means that any heterogeneity
cannot be observed and therefore cannot be estimated. In short, you
don't need to worry about it.
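(A small illustration, assuming nothing beyond base R: for a Bernoulli
variable with success probability p the variance is forced to be
p*(1 - p), so once the mean is known there is no separate dispersion
parameter left to estimate.)

p <- 0.3
y <- rbinom(1e5, size = 1, prob = p)
var(y)       # close to 0.21
p * (1 - p)  # 0.21 -- the variance is fully determined by the mean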
Cheers,
Jarrod
On 3 Dec 2009, at 06:33, Gwyneth Wilson wrote:
>
> I have been running lmer models in R, looking at what affects
> reproductive success in Ground Hornbills (a South African bird). My
> response variable is breeding success, which is binomial (0-1), and my
> random effect is group ID. My explanatory variables include rainfall,
> vegetation, group size, year, nests, and proportion of open woodland.
>
> I have run numerous models successfully, but I am confused about
> the outputs. When I run my first model with all my variables
> (all additive) I get a low AIC value with only a few of the
> variables being significant. When I take out the variables that are
> not significant, my AIC becomes higher but I have more
> significant variables! When I keep taking out the non-significant
> variables, I am left with a model that has nests, open woodland, and
> group size as extremely significant, BUT the AIC is high! Why
> is my AIC value increasing when I have fewer variables that are all
> significant and seem to best explain my data? Do I look at
> only the AIC when choosing the 'best' model, or do I look at only the
> p-values, or both? The model with the lowest AIC at the moment has
> the most variables, and most of them are not significant.
>
> Please help. Any suggestions would be great!!
>
>
>
> Here is some more information and some of my outputs:
>
>
>
> The first model has all my variables included, and I get a low AIC
> with only grp.sz and wood being significant:
>
>
>
> model1 <- lmer(br.su ~ factor(art.n) + factor(yr) + grp.sz + rain + veg +
>                wood + (1 | grp.id), data = hornbill, family = binomial)
>> summary(model1)
> Generalized linear mixed model fit by the Laplace approximation
> Formula: br.su ~ factor(art.n) + factor(yr) + grp.sz + rain + veg +
> wood + (1 | grp.id)
> Data: hornbill
> AIC BIC logLik deviance
> 138.5 182.3 -55.26 110.5
> Random effects:
> Groups Name Variance Std.Dev.
> grp.id (Intercept) 1.2913 1.1364
> Number of obs: 169, groups: grp.id, 23
>
> Fixed effects:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -3.930736 3.672337 -1.070 0.2845
> factor(art.n)1 1.462829 0.903328 1.619 0.1054
> factor(yr)2002 -2.592315 1.764551 -1.469 0.1418
> factor(yr)2003 -3.169365 1.759981 -1.801 0.0717 .
> factor(yr)2004 0.702210 1.341524 0.523 0.6007
> factor(yr)2005 -2.264257 1.722130 -1.315 0.1886
> factor(yr)2006 2.129728 1.270996 1.676 0.0938 .
> factor(yr)2007 -0.579961 1.390345 -0.417 0.6766
> factor(yr)2008 -1.062933 1.640774 -0.648 0.5171
> grp.sz 1.882616 0.368317 5.111 3.2e-07 ***
> rain -0.005896 0.003561 -1.656 0.0977 .
> veg -1.993443 1.948738 -1.023 0.3063
> wood 6.832543 3.050573 2.240 0.0251 *
>
>
> Then I carry on and remove the variables I think are not having an
> influence on breeding success, like year, vegetation, and rain.
> And I get this:
>
> model3 <- lmer(br.su ~ factor(art.n) + grp.sz + wood + (1 | grp.id),
>                data = hornbill, family = binomial)
>> summary(model3)
> Generalized linear mixed model fit by the Laplace approximation
> Formula: br.su ~ factor(art.n) + grp.sz + wood + (1 | grp.id)
> Data: hornbill
> AIC BIC logLik deviance
> 143.8 159.4 -66.88 133.8
> Random effects:
> Groups Name Variance Std.Dev.
> grp.id (Intercept) 0.75607 0.86953
> Number of obs: 169, groups: grp.id, 23
>
> Fixed effects:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -8.6619 1.3528 -6.403 1.52e-10 ***
> factor(art.n)1 1.5337 0.6420 2.389 0.0169 *
> grp.sz 1.6631 0.2968 5.604 2.09e-08 ***
> wood 3.2177 1.5793 2.037 0.0416 *
>
> So all the variables are significant but the AIC value is higher!
>
> I thought that with fewer variables, all of them significant
> (which means they are influencing breeding success), the AIC
> should be lower - so why is my AIC higher in this model?
> Do I only look at the AIC values and ignore the p-values, or only
> look at the p-values?
>
> Thanks!!
>
>
>