[R-sig-ME] Multi-level qualitative (fixed-effects) factors
Peter Francis
peterfrancis at me.com
Mon Aug 2 18:51:09 CEST 2010
Dear List,
For the analysis of my GLMM I am using AIC values rather than stepwise regression to simplify the model. I have developed a set of candidate models and am working through them now. I know a priori that some interactions are important, and I have already removed all the factors I consider unimportant.
I have many multi-level factors, e.g. habit: aquatic, terrestrial, epiphyte, etc.
I ran the model with habit as a factor:
> model111 <-lmer(threatornot~1+(1|a/b) + habit, family=binomial)
> Generalized linear mixed model fit by the Laplace approximation
> Formula: threatornot ~ 1 + (1 | order/family) + habit
>    AIC  BIC logLik deviance
>   1406 1436 -696.9     1394
> Random effects:
>  Groups       Name        Variance   Std.Dev.
>  family:order (Intercept) 6.9892e-01 8.3602e-01
>  order        (Intercept) 4.2292e-14 2.0565e-07
> Number of obs: 1116, groups: family:order, 43; order, 9
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept) -0.04803    0.19174  -0.250  0.80219
> habit2       1.10627    0.41607   2.659  0.00784 **
> habit3       0.92578    0.78141   1.185  0.23611
> habit4       0.14383    0.38477   0.374  0.70856
---
This model had an AIC of 1406.
I then re-ran the model with only aquatic and got a lower AIC value, which I guess is to be expected, as aquatic is highly significant and aquatic species are more prone to threat (my response).
> > model112 <-lmer(threatornot~1+(1|a/b) + aquatic, family=binomial)
> > model112
> Generalized linear mixed model fit by the Laplace approximation
> Formula: threatornot ~ 1 + (1 | order/family) + aquatic
>    AIC  BIC logLik deviance
>   1395 1415 -693.4     1387
> Random effects:
>  Groups       Name        Variance Std.Dev.
>  family:order (Intercept) 0.60007  0.77464
>  order        (Intercept) 0.00000  0.00000
> Number of obs: 1116, groups: family:order, 43; order, 9
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)   0.1572     0.1827   0.860 0.389613
> aquatic      -0.6683     0.1737  -3.847 0.000119 ***
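As a rough check, the two AIC values quoted above can be reproduced from the reported log-likelihoods using AIC = -2*logLik + 2*k. This is only a sketch: the parameter counts k are an assumption read off the printed output (fixed-effect coefficients plus the two random-intercept variances), and `aic_from_loglik` is a hypothetical helper, not an lme4 function.

```r
# Hypothetical helper: AIC from a log-likelihood and a parameter count.
aic_from_loglik <- function(loglik, k) -2 * loglik + 2 * k

# Log-likelihoods copied from the two printed summaries.
# k is ASSUMED to be: number of fixed-effect coefficients + 2 variance terms.
aic_habit   <- aic_from_loglik(-696.9, k = 4 + 2)  # model111: intercept + 3 habit contrasts
aic_aquatic <- aic_from_loglik(-693.4, k = 2 + 2)  # model112: intercept + aquatic

# Difference in AIC between the two candidate models.
delta_aic <- aic_habit - aic_aquatic

print(round(c(habit = aic_habit, aquatic = aic_aquatic, delta = delta_aic), 1))
```

Under these assumptions the gap of about 11 AIC units is made up of 4 units of parameter penalty and 7 units of log-likelihood. The smaller model also has the higher log-likelihood, which would not normally happen if aquatic were just one level of habit fitted to the same data, so the coding of the two variables may be worth double-checking.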
My question is this: when I developed the candidate models I thought using multi-level factors would be OK and that I would be able to tease out the individual levels. If I split the factors into their levels from the beginning, I am left with a huge number of candidate models. This would not be a problem in stepwise regression, as I could just remove the habit with the least significant p-value.
If I remove the habits I "feel" are unimportant from the beginning, I feel I would be limiting the model too much.
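To make the trade-off concrete, here is a minimal base-R sketch of splitting a factor into per-level indicators and counting the resulting subsets. The `habit` vector is made-up toy data: three level names follow the examples earlier in this post, and "other" is a placeholder for the unnamed fourth level.

```r
# Toy data only: "other" is a placeholder level, not from the real data set.
habit <- factor(c("aquatic", "terrestrial", "epiphyte", "other",
                  "aquatic", "terrestrial"))

# One 0/1 indicator column per level (no intercept, so every level gets a column).
indicators <- model.matrix(~ habit - 1)
colnames(indicators) <- levels(habit)

# A single-level indicator, analogous to the aquatic term in model112.
aquatic <- as.integer(habit == "aquatic")

# If each level's indicator may be included or excluded independently,
# a factor with L levels gives 2^L possible subsets of indicators
# (some of them are redundant when the model also has an intercept).
n_levels  <- nlevels(habit)
n_subsets <- 2^n_levels

print(indicators)
print(n_subsets)
```

With several such factors the counts multiply, which is the growth in candidate models described above.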
I hope this makes sense!
Has anyone else had this problem, or can anyone see a workaround?
Thanks
Peter