[R-sig-ME] Multi-level qualitative (fixed-effects) factors
Peter Francis
peterfrancis at me.com
Tue Aug 3 08:48:41 CEST 2010
Hi David and Ben, thanks for your help -
I was worried this would make little sense!
I have set out my candidate models
A+B+C+D
B+C+D
A+C+D
etc etc
And am running through them in lmer. Factor A for instance is Habit, which takes 3 forms - aquatic, terrestrial or epiphyte.
When i run the model with A as a factor i get the breakdown of the individual levels habitat 1, habitat 2 and habitat 3 and a corresponding AIC score. However if i just run it with habitat 3 - aquatic - i get a lower AIC score, therefore the model fits the data better?
I am unsure how to, without splitting my factors into their constituent levels at the beginning - A1+A2+A3 + B1 + B2 etc, arrive at the model with the lowest AIC?
Thanks
Peter
On 3 Aug 2010, at 00:16, David Duffy wrote:
On Mon, 2 Aug 2010, Peter Francis wrote:
> I have many multi level factors i.e habit - aquatic, terrestrial, epiphyte etc
>
> I ran the model with habit as a factor
>
>> model111 <-lmer(threatornot~1+(1|a/b) + habit, family=binomial)
>
>> Generalized linear mixed model fit by the Laplace approximation
>> Formula: threatornot ~ 1 + (1 | order/family) + habit
>> AIC BIC logLik deviance
>> 1406 1436 -696.9 1394
>> Random effects:
>> Groups Name Variance Std.Dev.
>> family:order (Intercept) 6.9892e-01 8.3602e-01
>> order (Intercept) 4.2292e-14 2.0565e-07
>> Number of obs: 1116, groups: family:order, 43; order, 9
>>
>> Fixed effects:
>> Estimate Std. Error z value Pr(>|z|)
>> (Intercept) -0.04803 0.19174 -0.250 0.80219
>> habit2 1.10627 0.41607 2.659 0.00784 **
>> habit3 0.92578 0.78141 1.185 0.23611
>> habit4 0.14383 0.38477 0.374 0.70856
>
> ---
> Which had a AIC of 1406
>
> I then re-ran the model with only aquatic and got a lower AIC value - which i guess is to be expected as aquatic is highly significant and aquatic species are more prone to threat ( my response).
>
>
>>> model112 <-lmer(threatornot~1+(1|a/b) + aquatic, family=binomial)
>>> model112
>> Generalized linear mixed model fit by the Laplace approximation
>> Formula: threatornot ~ 1 + (1 | order/family) + aquatic
>> AIC BIC logLik deviance
>> 1395 1415 -693.4 1387
>> Random effects:
>> Groups Name Variance Std.Dev.
>> family:order (Intercept) 0.60007 0.77464
>> order (Intercept) 0.00000 0.00000
>> Number of obs: 1116, groups: family:order, 43; order, 9
>>
>> Fixed effects:
>> Estimate Std. Error z value Pr(>|z|)
>> (Intercept) 0.1572 0.1827 0.860 0.389613
>> aquatic -0.6683 0.1737 -3.847 0.000119 ***
>
> My question is - when i developed the candidate models i thought using multilevel factors would be OK and i would be able to tease out the individual levels. If i split the factors into levels from the beginning then i am left with a huge amount of candidate models? This would not be a problem in stepwise regression as i could just remove the habit with the least significant P Value.
>
> If i remove habits i "feel" are unimportant from the beginning i feel i would be limiting the model too much.
>
> I hope this makes sense!
I don't understand at all, I'm afraid. Is aquatic the same as habit=2, or something? If so, there is something funny about the model fits.
If family and order are "nuisance" variables, then a stepwise approach is quite reasonable (if you are someone who thinks stepwise regression is reasonable, of course ;)).
Just 2c, David Duffy.
--
| David Duffy (MBBS PhD) ,-_|\
| email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / *
| Epidemiology Unit, Queensland Institute of Medical Research \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v
More information about the R-sig-mixed-models
mailing list