[R-sig-ME] assessing fixed factor significance depending on reference levels ?

Claudia Monica Campos claudia.monica.campos at gmail.com
Sun Mar 13 23:50:41 CET 2011

```Thanks Ben for the prompt reply, much appreciated!

On Sun, Mar 13, 2011 at 9:53 PM, Ben Bolker <bbolker at gmail.com> wrote:
> On 11-03-13 03:44 PM, Claudia Monica Campos wrote:
>> Dear list,
>>
>> I'm trying to fit a GLMM to assess whether some category of species
>> (native, mammal, bird, etc.)
>> from the total named by each student can be explained by differences
>> in the place of residence
>> (urban or rural), gender and/or age.
>>
>> m1= lmer(cbind (n_a_bird,n_animals-n_a_bird) ~ sex*place+age+ (1 |
>>   where:
>>     sex has two levels ('f' and 'm')
>>     place has two levels ('r' and 'u')
>>     age is numerical (from 7 to 18)
>>
>> As you can see from below, a\$place fixed effect could be an
>> explanatory variable,
>> but it may be significant (its p-value) depending on a\$sex ref level:
>>
>> a\$sex<-relevel(a\$sex,'f');a\$place<-relevel(a\$place,'r')
>> m1_fr= lmer(cbind (n_a_bird,n_animals-n_a_bird) ~ place*sex+age+ (1 |
>> a\$sex<-relevel(a\$sex,'m');a\$place<-relevel(a\$place,'r')
>> m1_mr= lmer(cbind (n_a_bird,n_animals-n_a_bird) ~ place*sex+age+ (1 |
>>
>> summary(m1_fr) ### ref levels: sex:'f' , place:'r'
>>   Generalized linear mixed model fit by the Laplace approximation
>>   Formula: cbind(n_a_bird, n_animals - n_a_bird) ~ place * sex + age +
>>      Data: a
>>     AIC  BIC logLik deviance
>>    2125 2163  -1055     2111
>>   Random effects:
>>    Groups       Name        Variance   Std.Dev.
>>    school       (Intercept) 1.6971e-02 1.3027e-01
>>   Number of obs: 1746, groups: grade:school, 51; school, 42
>>
>>   Fixed effects:
>>               Estimate Std. Error z value Pr(>|z|)
>>   (Intercept) -1.87093    0.15225 -12.289  < 2e-16 ***
>>   placeu       0.03351    0.07922   0.423 0.672331       <================
>>   sexm         0.16097    0.07476   2.153 0.031299 *
>>   age          0.02716    0.01099   2.472 0.013437 *
>>   placeu:sexm -0.32108    0.09478  -3.388 0.000705 ***
>>   ---
>>   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>>   Correlation of Fixed Effects:
>>               (Intr) placeu sexm   age
>>   placeu      -0.518
>>   sexm        -0.253  0.443
>>   age         -0.918  0.237  0.028
>>   placeu:sexm  0.160 -0.537 -0.788  0.021
>>
>> summary(m1_mr) ### ref levels: sex:'m', place: 'r'
>>   Generalized linear mixed model fit by the Laplace approximation
>>   Formula: cbind(n_a_bird, n_animals - n_a_bird) ~ place * sex + age +
>>      Data: a
>>     AIC  BIC logLik deviance
>>    2125 2163  -1055     2111
>>   Random effects:
>>    Groups       Name        Variance   Std.Dev.
>>    school       (Intercept) 1.6971e-02 1.3027e-01
>>   Number of obs: 1746, groups: grade:school, 51; school, 42
>>
>>   Fixed effects:
>>               Estimate Std. Error z value Pr(>|z|)
>>   (Intercept) -1.70995    0.15171 -11.271  < 2e-16 ***
>>   placeu      -0.28754    0.08484  -3.389 0.000701 ***   <================
>>   sexf        -0.16097    0.07476  -2.153 0.031298 *
>>   age          0.02716    0.01099   2.472 0.013438 *
>>   placeu:sexf  0.32102    0.09478   3.387 0.000706 ***
>>   ---
>>   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>>   Correlation of Fixed Effects:
>>               (Intr) placeu sexf   age
>>   placeu      -0.536
>>   sexf        -0.239  0.467
>>   age         -0.908  0.245 -0.028
>>   placeu:sexf  0.228 -0.616 -0.788 -0.021
>>
>> Doing a LRT, by removing a\$place seems to show it's indeed significant:
>>> m0=lmer(cbind (n_a_bird,n_animals-n_a_bird) ~ sex+age+ (1 | school/grade), data=a,family=binomial, REML=F)
>>> m1= lmer(cbind (n_a_bird,n_animals-n_a_bird) ~ place*sex+age+ (1 | school/grade), data=a,family=binomial, REML=F)
>>> anova(m0,m1)
>> Data: a
>> Models:
>> m0: cbind(n_a_bird, n_animals - n_a_bird) ~ sex + age + (1 | school/grade)
>> m1: cbind(n_a_bird, n_animals - n_a_bird) ~ place * sex + age + (1 |
>>    Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
>> m0  5 2134.7 2162.0 -1062.3
>> m1  7 2124.8 2163.1 -1055.4 13.808      2   0.001004 **
>> ---
>>
>> How should I proceed with the model selection ?
>>
>> To properly understand if a\$place alone or its interaction with a\$sex
>> is significant,
>> do I need to fit the model with different relevel-ing (in a
>> combinatorial way ) ?
>> Ie: if you see above "placeu", you'll find that shows significant for sex='m' as
>> reference but not for sex='f'.
>>
>

Studies show different results about the effect of place of residence on the
people's ecological knowledge (EK). The importance of this effect
could be higher
in developing countries, such as Argentina. Also social factors are
good predictors
of EK, like gender, and age.
My goal is to assess the influence of place of residence
(principally), gender and age
on EK about species by children from rural and urban schools of Mendoza.

>  The fact that there is a highly significant interaction means that the
> effect of 'urban' differs depending whether you consider males or
> females (and similarly for the effect of sex).
>
>  You can interpret the models as follows:
>
> fr: for females, urban is not sig. diff. from rural
>    for rural, male is sig. diff. from female
>
> mr: for males, urban is sig. diff. from rural
>    for rural, female is sig. diff. from male (same estimate as in model fr)
>
>   There is a school of thought (strongly represented in the R
> community) that says that interpreting main effects in the presence of
> interactions, especially significant interactions, just doesn't make
> sense: see Venables' "Exegeses on linear models" (google it).
>   A more traditional school of thought would look at marginal effects
> via the dreaded "type III sums of squares", i.e. looking at the
> *average* effect of rural vs urban across males and females and vice
> versa.  Again, it is arguable whether you want to do this or not,
> but setting sum contrasts makes it possible.

Thanks for the references. I will read them.

>   I would say that the most sensible thing to do, if you really want to
> test the significance of the main effect of place, is to subset your
> data and do the test separately for males and females.
> lmer(cbind (n_a_bird,n_animals-n_a_bird) ~ place+age+
> (1 | school/grade), data=a,subset=(sex=="male"), family=binomial)
>
>  and similarly for sex=="female"
>

I agree that this can be a reasonable approach, considering that the
effect of place of residence is the most relevant factor in my study.

> <http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html#Why-does-the-output-from-anova_0028_0029-depend-on-the-order-of-factors-in-the-model_003f>
>
>
>
>

Regards,

--

Claudia M. Campos