[R-sig-ME] Interpretation of lmer output in R

Mon Feb 21 21:06:48 CET 2011

On Mon, Feb 21, 2011 at 8:26 AM, Julia Sommerfeld
<Julia.Sommerfeld at utas.edu.au> wrote:
> Dear Douglas and list member,
>
> Thank you heaps for your answers. The interpretation of the summary output
> (lmer) is becoming much clearer now. I have to admit I had a slightly (not
> to say HUGE) different idea of the summary output.
>
> But a few questions still remain...
>
> I have tried out the suggestions with the following results:
>
> 1. I tested if "Sex" is an important factor in the model:
>
> fm<-lmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
> fm1<-lmer(SameSite~BreedSuc1+(1|Bird), family="binomial")
> anova(fm1,fm)
>
> Data:
> Models:
> fm1: SameSite ~ BreedSuc1 + (1 | Bird)
> fm: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
>     Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
> fm1  3 75.518 81.485 -34.759
> fm   4 77.379 85.335 -34.690 0.1387      1     0.7096

So I would conclude that even though sex was recorded it turns out
that it is not a significant predictor of the probability of site
fidelity and I would omit that term from the model.  As Ben described,
he would not be in favor of this approach because it verges on "data
snooping".  One can argue either way and, in this case it wouldn't
make much difference in the final conclusion whether or not the Sex
term is included.

> 2. Since Sex is not "important",  I compared fm1 with fm2 to test if
> BreedSuc1 is an important factor:
>
> fm2<-lmer(SameSite ~ 1 + (1|Bird), family="binomial")
> anova(fm2,fm1)
>
> Data:
> Models:
> fm2: SameSite ~ 1 + (1 | Bird)
> fm1: SameSite ~ BreedSuc1 + (1 | Bird)
>         Df    AIC        BIC      logLik      Chisq     Chi Df
>  Pr(>Chisq)
> fm2  2     77.617     81.595     -36.808
> fm1  3     75.518      81.485     -34.759     4.0991          1
>  0.04291 *

So if I needed to quote a p-value for the BreedSuc factor this is what
I would quote.

> BUT: Ben (thanks for the input) disagreed here:
> "I'm afraid I have to disagree with Doug here. This kind of model reduction,
> while seemingly sensible (and not as abusive as large-scale, automated
> stepwise regression), is a mild form of data snooping. Don't do it ..."
>
> If I don't do it, what could be an alternative? Simply rely on the output of
> fm1 without looking at fm2?

The alternative is to fit a model of the form

fm2a <- lmer(SameSite ~ 1 + Sex + (1|Bird), family="binomial")

and compare it to the original model, fm, using

anova(fm2a, fm)

The general idea of testing whether BreedSuc makes a significant
contribution to predicting the probability of site fidelity is to fit
a model with the term and then fit the model without the term and
compare the quality of the fits.  To me the most sensible way to
compare the quality of the fits is to consider the likelihood ratio.
The model with the term will always do better than the one without the
term - the question is, "Is it significantly better?".  One way to
answer that question is to convert the likelihood ratio test (LRT)
statistic to a probability or p-value using the result that, under the
null hypothesis (that the term does not make a significant
contribution) the LRT statistic has a chi-squared distribution with 1
degree of freedom.  One can set up other criteria; for example AIC
penalizes each parameter as 2 units on the deviance scale (negative
twice the log-likelihood).  BIC is a bit more complicated in that the
number of units of penalty per parameter on the deviance scale depends
on the number of observations in the data set.

I would claim that the LRT statistic is always a good way of
evaluating the difference in the quality of fit for two models - it is
how you convert it to a p-value that is not clear when you have small
sample sizes.

The difference between what I would advise and what Ben would advise
regarding the LRT is what the null and alternative models are.  I
would remove the Sex term from both.  He would retain the Sex term in
both.  This will result in slightly different conclusions.

This, by the way, emphasizes the point that a test statistic and its
corresponding p-value is not a property of the BreedSuc term.  It
results from comparing the quality of fits of two models - one with
the term and one without.  When we quote t- or z-statistics, and
p-values, in a coefficients table we are providing a summary of many
different types of tests simultaneously.  Unfortunately the conclusion
that is often drawn from the table is that the p-value is a property
of the term itself, which is wrong.

> 3.  From the output above, I conclude that BreedSuc1 has an effect on
> SameSite:
>
> summary(fm1)
>
> Generalized linear mixed model fit by the Laplace approximation
> Formula: SameSite ~ BreedSuc1 + (1 | Bird)
>    AIC       BIC         logLik     deviance
>  75.52     81.48     -34.76    69.52
> Random effects:
>  Groups Name            Variance     Std.Dev.
>  Bird   (Intercept)         0.12332      0.35117
> Number of obs: 54, groups: Bird, 46
>
> Fixed effects:
>                     Estimate     Std. Error     z value     Pr(>|z|)
> (Intercept)      -0.2135       0.3794          -0.563        0.5736
> BreedSuc11    1.1831         0.5921           1.998        0.0457 *
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Correlation of Fixed Effects:
>            (Intr)
> BreedSuc11 -0.638
>
>
> Now, again the interpretation of the Fixed effects:
>
> BreedSuc1 has a significant effect on SameSite.
>
> #site-fidelity (SameSite=0)  of "bird" (bird(!), since I droppped "Sex"?)
> who was unsuccessful in breeding in the previous season (BreedSuc1=0),
> corresponds to a probability of 44%:
> plogis(-0.2135)
> [1] 0.4468268

Perhaps I misunderstood your original posting.  I thought that
SameSite=0 meant that the bird did not return to the same nest site.
That is, site fidelity corresponds to SameSite = 1.

In any case the probabilities constructed as you have done are the
probabilities for SameSite = 1.

> Further, site-fidelity (SameSite=0)  of "bird" who was successful in
> breeding in the previous season (BreedSuc1=1), corresponds to a probability
> of 72%:
> plogis(-0.2135 + 1.1831)
> [1] 0.7250398
>
>
> Is my interpretation correct?

> So this output is always based on SameSite=0?

> (SameSite=0 means that birds did NOT change nest site, i.e. they show high
> site-fidelity). But what if SameSite=1 would mean high site-fidelity (no
> change of nests?)?

Then the 0's and 1's would be reversed for that response variable and
your probabilities would be the complement (i.e. 1-p instead of p) of
those calculated above.

> 4. if I don't drop the term "Sex":
>
> summary(fm)
>
> Generalized linear mixed model fit by the Laplace approximation
> Formula: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
>    AIC   BIC logLik deviance
>  77.38 85.34 -34.69    69.38
> Random effects:
>  Groups Name        Variance     Std.Dev.
>  Bird   (Intercept)     0.14080       0.37524
> Number of obs: 54, groups: Bird, 46
>
> Fixed effects:
>                     Estimate         Std. Error     z value     Pr(>|z|)
> (Intercept)     -0.3294         0.4890         -0.674     0.5006
> BreedSuc11   1.1988          0.5957        2.012      0.0442 *
> Sex    M             0.2215           0.5877           0.377      0.7062
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Correlation of Fixed Effects:
>            (Intr)             BrdS11
> BreedSuc11     -0.536
> SexM           -0.628      0.065
>
>
> site fidelity for a female bird (i.e. SexM is 0) who was unsuccessful in
> breeding the previous season (i.e.
> BreedSuc1 is 0) is -0.3294, corresponding to a probability of about 42%
>> plogis(-0.3294)
> [1] 0.4183866
>
> The log-odds of site fidelity for a female bird who was successful in
> breeding is -0.3294 + 1.1988, corresponding to a probability of about 70%
>> plogis(-0.3294 + 1.1988)
> [1] 0.7046208
>
>
> 5. The z-value: Sorry, I still have some trouble with this value....
>
> In model fm (without "Sex") the z-value of BreedSuc1 corresponds to 1.998.
>
> In model fm1("Sex" included) the z-value corresponds to 2.012.
>
> Nearly the same value in both models. But what can someone conclude from: p<
> 0.05, z=1.998 ??? Because this is what many people write in their result
> section (I was told to do so the same....).
>
> "You should see a LRT test statistic close to, but not exactly the same as,
> the square of the z value when you compare the models with and without that
> term.
> This is the sense in which the z-value is an approximation".
>
> I don't really understand how the z-value can be seen as an approximation?
> Am I missing some background knowledge here?

I may have been too terse in my explanations.  As mentioned above, I
would claim that the LRT statistic is a reasonable way to compare the
fits of two models, because it is based upon fitting the model with
and without the term of interest.  In the case of a linear model
without random effects it is not necessary to fit the model without
the term just to discover what the LRT statistic would be.  You can
tell from the model fit with the term what the maximum value for the
likelihood of any sub-model will be.

In the case of a linear mixed model or a generalized linear mixed
model you can't decide on the basis of the one model fit what the
likelihood for the other will be.  You can approximate but you you
don't get an exact value.  When it took a very long time to do a model
fit we just used the approximation.  Now that these fits can be done
much more quickly, it makes sense to fit both with and without.

The nature of the approximation is to take the parameter estimate for
BreedSuc and divide it by its approximate standard error.  We call
this the z-statistic because, when everything is working properly,
this should have a distribution close to a standard normal, which we
often write as Z.  The LRT statistic is the difference in the deviance
of the model without and the model with the term.  To me, that is the
quantity of interest and the fact that it should be approximately the
square of the z-statistic is helpful in making rough decisions but I
still want to calculate the difference in the deviance before making a
final decision.

In the summary of fm1 the z-statistic is 1.998 whereas the LRT
statistic comparing fm2 to fm1 is 4.0991.  The square of the
z-statistic will be close to, but not exactly the same as, the LRT
statistic.

> Again, thanks heaps for your help!!!!

You're welcome.  Thanks for the question.

> 2011/2/19 Douglas Bates <bates at stat.wisc.edu>
>>
>> Thank you for your questions and for transferring the discussion to
>> the R-SIG-Mixed-Models mailing list, as we had discussed.  I have also
>> copied the mailing list for a class on mixed-effects models that I am
>> teaching.
>>
>> I particularly appreciate your desire to learn about the model instead
>> of just quoting a p-value.  I often lament to my classes that
>> statisticians have been far too successful in propagating the idea of
>> p-values, to the extent that some researchers believe that is all that
>> is needed to learn about an analysis.
>>
>> On Sat, Feb 19, 2011 at 3:05 AM, Julia Sommerfeld
>> <Julia.Sommerfeld at utas.edu.au> wrote:
>> > Dear Douglas and list members,
>> >
>> > Apologies in advance if you might consider my questions as too simple to
>> > be asking the godfather of lme4 for an answer...thus, please feel free
>> > to
>> > ignore my email or to forward it to someone else.
>> >
>> > I'm a PhD student (Australia/Germany) working on tropical seabirds. As
>> > many of my PhD-collegues, I'm having some difficulties with the analysis
>> > of my data using lmer (family=binomial). While some say: What do you
>> > care
>> >
>> > about all the other values as long as you've got a p-value... I do
>> > believe
>> > that it is essential to understand WHAT I'm doing here and WHAT all
>> > these
>> > numbers/values mean.
>> >
>> > I've read the Chapters (lme4 Book Chapters) and publications about the
>> > use
>> > of lmer and searched the forums - but I don't find a satisfying answer.
>> > And I have the feeling that 1. the statistic lecture at my university
>> > was
>> > a joke (sad to say this) 2. that I need a huge statistical/mathematical
>> > background to fully understand GLMMs.
>> >
>> >
>> > One of the question I would like to answer is:
>> > Does the previous breeding success influences nest site fidelity?
>> >
>> > I have binomial data:
>> > SameSite=1 means birds use the same site
>> >
>> > SameSite=0 means birds change nest site
>> >
>> > BreedSuc1=1 Birds were successful in previous breeding season
>> > BreedSuc1=0 Birds were not successful "     "        "
>> >
>> > Sex= male, female
>> > Bird= Bird ID
>> >
>> > This is my model:
>>
>> > fm<-lmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
>>
>> > where Bird is my random factor (same birds were sampled more than once)
>>
>> One thing to note is that there are 46 different birds in the 54
>> observations.  Most birds will have just one observation so a random
>> effect for bird may not be necessary.
>>
>> > summary(fm)
>> >
>> > Generalized linear mixed model fit by the Laplace approximation
>> >
>> > Formula: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
>> >   AIC   BIC logLik deviance
>> >  77.38 85.34 -34.69    69.38
>> > Random effects:
>> >  Groups Name        Variance Std.Dev.
>> >  Bird   (Intercept) 0.14080  0.37524
>> > Number of obs: 54, groups: Bird, 46
>> >
>> > Fixed effects:
>> >            Estimate Std. Error z value Pr(>|z|)
>> > (Intercept)  -0.3294     0.4890  -0.674   0.5006
>> > BreedSuc11    1.1988     0.5957   2.012   0.0442 *
>> > SexM          0.2215     0.5877   0.377   0.7062
>>
>> this suggests that sex is not an important factor in the model.  The
>> (Intercept) term is close to zero, relative to its standard error, but
>> we would retain it in the model as explained below.
>>
>> > ---
>> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> >
>> > Correlation of Fixed Effects:
>> >           (Intr) BrdS11
>> > BreedSuc11 -0.536
>> > SexM       -0.628  0.065
>> >
>> >
>> > From this summary output I do understand that the Breeding Success has a
>> > significant effect on nest-site fidelity (p<0.05).
>>
>> Yes, but ... this p-value should be used as a guide only.  As
>> described below a p-value must be viewed in context.  It is not a
>> property of the Breeding Success factor; it comes from a comparison of
>> two models and we should bear in mind that these models are before
>> interpreting this number.
>>
>> The interpretation of a p-value for a particular coefficient is that
>> it is an approximation to the p-value we would get from comparing the
>> model that has been fit to the mode fit without this particular
>> coefficient.  In this case the coefficient corresponds to one of the
>> terms in the model and I would advocate performing a likelihood ratio
>> test comparing the two models
>>
>> fm <- glmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
>> fm0 <- glmer(SameSite~Sex+(1|Bird), family="binomial")  # the null
>> hypothesis model
>> anova(fm0, fm)
>>
>> Even though the function is called anova it will, in this case,
>> perform a likelihood ratio test (LRT).  It also prints the values of
>> AIC and BIC if you prefer to compare models according to one of those
>> criteria but I prefer using the likelihood ratio for nested models.
>>
>> However, before doing that comparison you should ask yourself whether
>> you want to compare models that have the, apparently unnecessary term
>> for Sex in them.  The way I would approach the model building is first
>> to reduce the model to
>>
>> fm1 <- lmer(SameSite~BreedSuc1+(1|Bird), family="binomial")
>>
>> You could then compare
>>
>> anova(fm1, fm)
>>
>> which I presume will give a large p-value for the LRT, so we prefer
>> the simpler model, fm1.  After that, I would compare
>>
>> fm2 <- lmer(SameSite ~ 1 + (1|Bird), family="binomial")
>> anova(fm2, fm1)
>>
>> to see if the BreedSuc1 factor is an important predictor in its own right.
>>
>> Note that we don't drop the implicit "(Intercept)" term, even though
>> it has a high p-value in the coefficient table.  The reason is that
>> the interpretation of the (Intercept) coefficient depends on the
>> coding of BreedSuc1.
>>
>> In model fm, the log-odds of site fidelity for a female bird (i.e.
>> SexM is 0) who was unsuccessful in breeding the previous season (i.e.
>> BreedSuc1 is 0) is -0.3294, corresponding to a probability of about
>> 42%
>>
>> > plogis(-0.3294)
>> [1] 0.4183866
>>
>> The log-odds of site fidelity for a female bird who was successful in
>> breeding is -0.3294 + 1.1988, corresponding to a probability of about
>> 70%
>>
>> > plogis(-0.3294 + 1.1988)
>> [1] 0.7046208
>>
>> If you had reversed the meaning of BreedSuc to BreedFail, where 0
>> indicates no failure at breeding and 1 indicates failure, then the
>> coefficient would change sign (i.e. the coefficient for BreedFail
>> would be -1.1988) and the intercept would change to
>>
>> > -0.3294 + 1.1988
>> [1] 0.8694
>>
>> because the reference level would now be a female bird who was
>> successful in breeding.
>>
>> Because the interpretation of the intercept depends upon the coding of
>> other factors, we retain it in the model whenever other terms are
>> retained.
>>
>>
>>
>> > But what else can I conclude from this model?
>> >
>> > Questions:
>> >
>> > 1.Random effects: What does the Random Effect table - the Variance, Std.
>> > Dev. and Intercept - tells me: Is there a random effect that my model
>> > has
>> > to account for?
>>
>> First I would remove the apparently unnecessary Sex term then,
>> ideally, I would check by comparing the fit of the reduced model to
>> that of a GLM without the random effect for Bird.  Unfortunately, I
>> don't think the definition of deviance for a glm fit is compatible
>> with that for a model fit by glmer.  This is something we will need to
>> fix.  For the time being I would instead examine the "caterpillar
>> plot" obtained with
>>
>> dotplot(ranef(fm1, postVar=TRUE))
>>
>> which represent the 95% prediction intervals for each of the birds.
>> If these all overlap zero comfortably I would conclude that the random
>> effect is not needed an fit a glm without a random effect for bird.
>> > Random effects:
>> >  Groups Name        Variance Std.Dev.
>> >  Bird   (Intercept) 0.14080  0.37524
>> > Number of obs: 54, groups: Bird, 46
>>
>> That estimated standard deviation is fairly large. We would expect a
>> range of contributions on the log-odds scale of about +/- 2 sd which,
>> at this point of the logistic curve corresponds to considerable
>> variability in predicted probabilities for birds with the same
>> characteristics.
>>
>> > 2. Fixed Effects: Again the Intercept? Not sure if I understand the
>> > meaning of it (sorry, explanation in Chapter I also doesn't help much)
>>
>> Actually in this model it is a bit different from the models described
>> in chapter 1.  I hope the explanation above makes sense.  Think of it
>> as the log-odds of site fidelity for a bird in the "reference group"
>> where reference group means that all the other variables are a the
>> zero level.
>>
>> > Fixed effects:
>> >            Estimate Std. Error z value Pr(>|z|)
>> > (Intercept)  -0.3294     0.4890  -0.674   0.5006
>> > BreedSuc11    1.1988     0.5957   2.012   0.0442 *
>> >
>> > SexM          0.2215     0.5877   0.377   0.7062
>> >
>> > 3. Meaning of the z-value? Why shall I mention it in te result section?
>>
>> I would regard the z-value as an approximation.  The quantity of
>> interest is the likelihood ratio test statistic which has a
>> chi-squared distribution under the null hypothesis (i.e. the term can
>> be deleted from the model without getting a significantly worse fit).
>> It happens that this would be a chi-squared distribution with 1 degree
>> of freedom, which corresponds to the square of a standard normal
>> distribution.  You should see a LRT test statistic close to, but not
>> exactly the same as, the square of the z value when you compare the
>> models with and without that term.  This is the sense in which the
>> z-value is an approximation.  To me the LRT statistic is more reliable
>> because it is based upon actually refitting the model.
>>
>> > 4. Estimate and Std. Error of the fixed effects? How can I tell from
>> > these
>> > values WHAT kind of effect (positiv, negativ?) these parameter have on
>> > nest-site fidelity? Do birds that were successful during the previous
>> > breeding success show a higher nest-site fidelity? Remember, I have
>> > binomial data...
>>
>> That is described above.  If you want the estimate of the site
>> fidelity for bird with certain characteristics you evaluate the
>> corresponding combination of coefficients and apply plogis to the
>> result.
>> > I would highly appreciate your feedback and/or suggestions of
>> >
>> > papers/chapters I could read for a better understanding of the output.
>> >
>> > Best regards,
>> >
>> >
>> > Julia
>>
>> I hope this helps.
>
>
>