[R-sig-ME] Binary response ordering

Wed Aug 4 15:15:15 CEST 2010

On Wed, Aug 4, 2010 at 4:54 AM, John Haart <another83 at me.com> wrote:
> Dear List,
>
> I have a quick question regarding the setup of my data for analysis with a glmm.  I hope this is the appropriate list, i apologise if it is not.
>
> I have a response variable, TRUE or FALSE. I have coded this as 0 = False and 1 = TRUE in excel.
>
> I have 3 categorical factors with C,D and E
>
> I then read in the data frame and run the model as follows-
>
> lmer(trueorfalse~1+(1|A/B) + C + D+ E ,family=binomial)
>
> And this is the output
>
> Generalized linear mixed model fit by the Laplace approximation
> Formula: threatornot ~ 1 + (1 | A/B) + C + D+  E ,family=binomial)
>  AIC  BIC logLik deviance
>  1410 1450 -696.8     1394
> Random effects:
>  Groups       Name        Variance   Std.Dev.
>  family:order (Intercept) 6.7869e-01 8.2382e-01
>  order        (Intercept) 7.8204e-11 8.8433e-06
> Number of obs: 1116, groups: A:B, 43; B, 9

Apparently you altered the output at some point because the factors
that were named A and B ended up as order and family in the random
effects description.

> Fixed effects:
>            Estimate Std. Error z value Pr(>|z|)
> (Intercept)  0.11281    0.42232   0.267   0.7894
> C1   -0.02414    0.19964  -0.121   0.9038
> D2  -0.16482    0.38602  -0.427   0.6694
> E2       0.95381    0.54316   1.756   0.0791 .
> E3      0.75733    0.87275   0.868   0.3855
> E4       0.03044    0.47328   0.064   0.9487
>
> What i am unsure about is the inference, if a term is significant does this relate to TRUE or FALSE?

In this case it relates to the probability of a TRUE response.  Because
P(TRUE) is simply 1 - P(FALSE), the only change if you reversed the
coding would be to flip the signs of the coefficients; the standard
errors and p-values would stay the same.  The simple way to verify
which outcome is being modelled is to fit

glm(threatornot ~ 1, family = binomial)

and check the value of the coefficient.  It should be
log(pHat/(1-pHat)) where pHat is the proportion of TRUE responses.
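
A minimal sketch of that check, using simulated data rather than the
poster's (the vector y and the seed are made up for illustration).
Note that family = binomial is needed for the intercept to be on the
log-odds scale:

```r
# Simulated 0/1 response; a hypothetical stand-in for threatornot
set.seed(1)
y <- rbinom(200, size = 1, prob = 0.3)

# Intercept-only logistic regression
fit <- glm(y ~ 1, family = binomial)

pHat <- mean(y)               # observed proportion of 1 (TRUE) responses
coef(fit)[["(Intercept)"]]    # fitted intercept on the log-odds scale
log(pHat / (1 - pHat))        # should agree up to convergence tolerance
```

If the two numbers agree, the model describes the probability of the
outcome coded as 1.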

> I.E E2 has a p value of 0.079, does this 0.079 relate to the probability of it resulting in a true or false response? Does it matter how i code the input i.e FALSE = 1, TRUE =2 for instance?

If there are two levels in the response then the model is fit for the
probability of the second level versus the first.  You can remove the
ambiguity by converting the response to a factor with the levels
specified explicitly.
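
As a rough illustration, assuming simulated data and a plain glm() fit
without random effects (x, y and the resp_* names are hypothetical),
setting the level order explicitly shows which outcome is modelled and
that swapping the order only flips the signs:

```r
# Simulated data; x and y are hypothetical stand-ins
set.seed(2)
x <- rnorm(300)
y <- rbinom(300, size = 1, prob = plogis(-0.5 + 0.8 * x))

# Explicit level order: the second level is the one whose probability is modelled
labels     <- ifelse(y == 1, "TRUE", "FALSE")
resp_true  <- factor(labels, levels = c("FALSE", "TRUE"))
resp_false <- factor(labels, levels = c("TRUE", "FALSE"))

fit_true  <- glm(resp_true  ~ x, family = binomial)  # models P(TRUE)
fit_false <- glm(resp_false ~ x, family = binomial)  # models P(FALSE)

# Same magnitudes, opposite signs
cbind(P_TRUE = coef(fit_true), P_FALSE = coef(fit_false))
```

The same idea should carry over to the mixed model: build the factor
with the level order you want before fitting.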

The bigger issue is that you shouldn't pay too much attention to a
particular coefficient related to the levels of a factor like E
because the coefficients are defined with respect to the contrasts in
effect at the time the model was fit.  Without knowing the contrasts
being used and without prior knowledge that a particular contrast was
important, those coefficients are not important by themselves.  It is
the cumulative effect of the variability amongst the levels of the
factor that is important.
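
One way to assess a factor as a whole is to compare the fits with and
without it.  The sketch below uses simulated data and glm() without
random effects to stay self-contained (the data and the names full,
reduced and full_sum are assumptions for illustration); it also
refits with different contrasts to show that the overall fit does not
change even though the individual coefficients do:

```r
# Simulated data; C and E here are made-up factors, not the poster's
set.seed(3)
n <- 400
E <- factor(sample(1:4, n, replace = TRUE))
C <- factor(sample(0:1, n, replace = TRUE))
y <- rbinom(n, size = 1, prob = plogis(0.2 + 0.6 * (E == "2")))

full    <- glm(y ~ C + E, family = binomial)
reduced <- glm(y ~ C,     family = binomial)

# Likelihood-ratio test of all levels of E at once
anova(reduced, full, test = "Chisq")

# Same model under sum-to-zero contrasts: coefficients differ, deviance does not
full_sum <- glm(y ~ C + E, family = binomial,
                contrasts = list(E = "contr.sum"))
c(treatment = deviance(full), sum = deviance(full_sum))
```

For mixed models fit with lme4, the analogous comparison is anova()
on the two fitted models.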

> Maybe i am reading the output wrong?
>
> Thanks
>
> John